DMS (Document Management System)
Protokol DMS is a built-in integration for storing, managing, and converting files in your Protokol project. It includes a template document engine, file conversion utilities, and a full media library.
Overview
- File storage — upload, download, and organize files in project libraries
- Document templates — create, version, and render documents using Go text/template
- PDF generation — convert HTML to PDF with template data, fill PDF forms
- Data conversion — bidirectional conversion between JSON, CSV, and Excel
- Text extraction — extract text from PDF, DOCX, XLSX, images, and 40+ formats
- OCR extraction — extract structured data from documents using reusable OCR models
- EXIF extraction — extract metadata from image files
- Workflow events — react to file upload, download, and deletion events
Uploading Files
Upload via Base64
curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/upload \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"path": "/documents/reports",
"is_public": false,
"files": [{
"name": "report.pdf",
"content_type": "application/pdf",
"data": "JVBERi0xLjQK..."
}]
}'
Upload Fields
| Field | Type | Required | Description |
|---|---|---|---|
| files | File[] | Yes | Array of files to upload (multipart) |
| uploadDir | string | Yes | Target directory path |
| public | boolean | No | Make files publicly accessible |
| replace | boolean | No | Replace existing files with same name |
| expiresAt | string | No | Expiration timestamp (ISO 8601) |
| metadata | Record<string, string> | No | Key-value metadata pairs |
Base64 Upload Fields
| Field | Type | Required | Description |
|---|---|---|---|
| path | string | Yes | Target directory path |
| is_public | boolean | No | Make files publicly accessible |
| replace | boolean | No | Replace existing files with same name |
| files | array | Yes | Array of { name, content_type, data } objects |
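The JSON body from the curl request above can also be assembled in code. The following is a minimal sketch, not part of the SDK: `toBase64` and `buildBase64Upload` are hypothetical helpers, and the option names `isPublic` and `replace` are chosen here for illustration only.

```javascript
// Hypothetical helper: encode raw bytes as base64.
// btoa is a global in browsers and in Node 16+.
function toBase64(bytes) {
  let binary = ""
  for (const byte of bytes) binary += String.fromCharCode(byte)
  return btoa(binary)
}

// Hypothetical helper: build the body described under "Base64 Upload Fields".
function buildBase64Upload(path, files, options = {}) {
  return {
    path,
    is_public: options.isPublic ?? false,
    replace: options.replace ?? false,
    files: files.map((file) => ({
      name: file.name,
      content_type: file.contentType,
      data: toBase64(file.bytes),
    })),
  }
}

// "%PDF-1.4\n" is only the first bytes of a PDF; a real upload passes the whole file.
const header = new TextEncoder().encode("%PDF-1.4\n")
const body = buildBase64Upload("/documents/reports", [
  { name: "report.pdf", contentType: "application/pdf", bytes: header },
])
console.log(body.files[0].data) // "JVBERi0xLjQK"
```

Assuming `dms.uploadBase64` accepts this body unchanged, the upload itself would be `await dms.uploadBase64(body)`.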
Listing and Downloading Files
import { DMS } from "@ptkl/sdk/beta"
const dms = new DMS()
// List files
const files = await dms.list({ path: "/invoices" })
// Download a file (returns Blob)
const content = await dms.download("invoices/2024/invoice-001.pdf")
// Get file with encoding (e.g. "base64")
const data = await dms.getMedia("invoices/2024/invoice-001.pdf", "base64")
# List files
curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/list \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{ "path": "/invoices" }'
# Download a file
curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/download \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{ "key": "invoices/2024/invoice-001.pdf" }' \
--output invoice.pdf
Directories
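The SDK reference at the end of this page lists `dms.dirs(path)`, `dms.createDir(path)`, and `dms.deleteDir(path)`. A minimal sketch of how they might be combined follows; `createYearDirs` is a hypothetical helper, `dms` is assumed to be a `DMS` instance, and the response shape of `dms.dirs` is not documented here.

```javascript
// Sketch: `dms` is assumed to be a DMS instance (new DMS()).
// Creates one subdirectory per name under `root`, then lists `root`.
async function createYearDirs(dms, root, names) {
  for (const name of names) {
    await dms.createDir(`${root}/${name}`)
  }
  return dms.dirs(root) // response shape is an assumption left to the caller
}
```

A call such as `await createYearDirs(dms, "/invoices", ["2023", "2024"])` would create two directories; `dms.deleteDir(path)` is the counterpart for removing one.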
Libraries
HTML to PDF
Convert HTML to PDF using the Chrome rendering engine. Supports Go text/template
syntax in the HTML — pass a data map to inject values before conversion.
import { DMS } from "@ptkl/sdk/beta"
const dms = new DMS()
// Simple HTML to PDF (inline HTML)
const pdf = await dms.html2pdf({
input_html: "<h1>Invoice #123</h1><p>Total: $500</p>",
output_pdf: true,
})
// HTML template with data injection
const pdf2 = await dms.html2pdf({
input_html: "<h1>{{.Title}}</h1><p>Amount: {{.Amount}} RSD</p>",
data: { Title: "Invoice #456", Amount: "15000" },
output_type: "base64",
})
// From a file in the media library, save result back
const pdf3 = await dms.html2pdf({
input_path: "templates/invoice.html",
data: { CustomerName: "Acme Corp", InvoiceNumber: "INV-001" },
output_path: "generated/invoices",
output_file_name: "INV-001",
replace: true,
})
curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/pdf/html2pdf \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{
"input_html": "<h1>{{.Title}}</h1><p>Amount: {{.Amount}}</p>",
"data": { "Title": "Invoice", "Amount": "15000 RSD" },
"output_pdf": true
}'
HTML to PDF Fields
| Field | Type | Required | Description |
|---|---|---|---|
| input_html | string | No | Raw HTML string (provide this or input_path) |
| input_path | string | No | Path to an HTML file in the media library |
| output_path | string | No | Save generated PDF to this library path |
| output_pdf | boolean | No | Return PDF as file download |
| output_type | string | No | "pdf", "base64", or "file" |
| output_file_name | string | No | Name for the generated file (without extension) |
| replace | boolean | No | Replace existing file at output path |
| data | Record<string, any> | No | Template data for Go text/template rendering |
Template Syntax
HTML templates use Go text/template syntax, not Mustache.
Use {{.FieldName}} to inject values, {{range .Items}} for loops,
and {{if .Condition}} for conditionals. See the
Go text/template documentation for the full syntax.
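To illustrate loops and conditionals, the sketch below builds an html2pdf request whose template uses `{{range}}` and `{{if}}`/`{{else}}`. `buildInvoiceRequest` and the data keys `Number`, `Items`, and `Paid` are hypothetical names chosen for this example.

```javascript
// Go text/template source: range iterates over .Items, if/else branches on .Paid.
const invoiceTemplate = [
  "<h1>Invoice {{.Number}}</h1>",
  "<ul>",
  "{{range .Items}}<li>{{.Name}} x {{.Qty}}</li>{{end}}",
  "</ul>",
  "{{if .Paid}}<p>Paid</p>{{else}}<p>Payment due</p>{{end}}",
].join("\n")

// Hypothetical helper: build a request using fields from "HTML to PDF Fields".
function buildInvoiceRequest(number, items, paid) {
  return {
    input_html: invoiceTemplate,
    data: { Number: number, Items: items, Paid: paid },
    output_type: "base64",
  }
}

const request = buildInvoiceRequest("INV-001", [{ Name: "Widget A", Qty: 10 }], false)
console.log(JSON.stringify(request.data))
```

Inside `{{range .Items}}` the dot refers to the current item, so `{{.Name}}` reads the item's `Name` key rather than a top-level one. The request would then be passed to `dms.html2pdf(request)`.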
PDF to HTML
PDF to HTML Fields
| Field | Type | Required | Description |
|---|---|---|---|
| input_path | string | No | Path to a PDF file in the media library |
| input_pdf | string | No | Raw PDF content |
| output_path | string | No | Save converted HTML to this library path |
| output_file_name | string | No | Name for the generated file |
| replace | boolean | No | Replace existing file at output path |
Fill PDF Form
Fill a PDF form template with typed field values.
import { DMS } from "@ptkl/sdk/beta"
const dms = new DMS()
const filled = await dms.fillPdf({
input_path: "templates/application-form.pdf",
output_path: "generated/forms",
output_file_name: "filled-form",
form_data: {
full_name: "John Doe",
date: "2024-03-15",
total: "1500.00",
},
})
// Or with typed form structure
const filled2 = await dms.fillPdf({
input_path: "templates/application-form.pdf",
output_pdf: true,
forms: [{
textfield: [
{ name: "full_name", value: "John Doe" },
{ name: "address", value: "123 Main St" },
],
checkbox: [
{ name: "agree_terms", value: true },
],
datefield: [
{ name: "sign_date", value: "2024-03-15" },
],
}],
})
Fill PDF Fields
| Field | Type | Required | Description |
|---|---|---|---|
| input_path | string | No | Path to PDF form template in library |
| output_path | string | No | Save filled PDF to this library path |
| output_pdf | boolean | No | Return PDF as file download |
| output_type | string | No | "pdf", "base64", or "file" |
| output_file_name | string | No | Name for the generated file |
| replace | boolean | No | Replace existing file at output path |
| form_data | Record<string, any> | No | Simple key-value field data |
| forms | PDFFormStructure[] | No | Typed form structure (textfield, checkbox, datefield, etc.) |
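When values arrive as a flat object, they can be sorted into the typed `forms` structure before calling `fillPdf`. The sketch below is a hypothetical helper, not an SDK function, and its classification rule is an assumption: booleans become checkboxes, `YYYY-MM-DD` strings become date fields, and everything else becomes a text field.

```javascript
// Hypothetical helper: sort flat values into the typed groups used by `forms`.
function toPdfForms(values) {
  const form = { textfield: [], checkbox: [], datefield: [] }
  for (const [name, value] of Object.entries(values)) {
    if (typeof value === "boolean") {
      form.checkbox.push({ name, value })
    } else if (typeof value === "string" && /^\d{4}-\d{2}-\d{2}$/.test(value)) {
      form.datefield.push({ name, value })
    } else {
      form.textfield.push({ name, value: String(value) })
    }
  }
  return [form]
}

const forms = toPdfForms({
  full_name: "John Doe",
  agree_terms: true,
  sign_date: "2024-03-15",
})
console.log(JSON.stringify(forms))
```

The result can be passed as the `forms` field, for example `dms.fillPdf({ input_path, output_pdf: true, forms })`.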
EXIF Data Extraction
Text Extraction
Extract text content from any document format — PDF, DOCX, XLSX, PPTX, ODT, RTF, EPUB, CSV, images, and 40+ other types. Provide either a file path from the DMS library or base64-encoded file data.
import { DMS } from "@ptkl/sdk/beta"
const dms = new DMS()
// From a DMS file path
const result = await dms.extractText({
input_path: "uploads/contract.pdf",
})
console.log(result.data.text) // extracted text content
console.log(result.data.char_count) // character count
// From base64-encoded data
const result2 = await dms.extractText({
file_data: "data:application/pdf;base64,JVBERi0xLjQK...",
file_name: "invoice.pdf",
})
# From a DMS file path
curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/extract-text \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{ "input_path": "uploads/contract.pdf" }'
# From base64
curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/extract-text \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{
"file_data": "data:application/pdf;base64,JVBERi0xLjQK...",
"file_name": "invoice.pdf"
}'
Extract Text Fields
| Field | Type | Required | Description |
|---|---|---|---|
| input_path | string | No | File path in the DMS library (provide this or file_data) |
| file_data | string | No | Base64-encoded file content with data URI prefix |
| file_name | string | No | Original file name with extension; helps detect the format when using file_data |
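The `file_data` value combines a data URI prefix with the base64 payload. A minimal sketch of producing it from raw bytes follows; `toDataUri` is a hypothetical helper, not part of the SDK.

```javascript
// Hypothetical helper: wrap raw bytes in a data URI, the format used by `file_data`.
// btoa is a global in browsers and in Node 16+.
function toDataUri(bytes, contentType) {
  let binary = ""
  for (const byte of bytes) binary += String.fromCharCode(byte)
  return `data:${contentType};base64,${btoa(binary)}`
}

const bytes = new TextEncoder().encode("%PDF-1.4\n") // stand-in for a full file
const payload = {
  file_data: toDataUri(bytes, "application/pdf"),
  file_name: "invoice.pdf",
}
console.log(payload.file_data) // "data:application/pdf;base64,JVBERi0xLjQK"
```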
Extract Text Response
| Field | Type | Description |
|---|---|---|
| text | string | Extracted text content |
| char_count | number | Number of characters in the extracted text |
Supported Formats
Text extraction supports 40+ formats including PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, RTF, EPUB, CSV, TSV, HTML, XML, Markdown, plain text, and image files (via OCR). See the full list in the API response headers.
OCR Extraction
Extract structured data from documents using pre-defined OCR models. Models are created in the DMS UI by drawing labeled regions on sample documents — each region maps to a field you want to extract (name, date of birth, ID number, etc.).
Once a model is saved, use the SDK or API to run extraction on new documents and get structured key-value results.
Paid Plan Required
OCR extraction requires the paid DMS plan. Projects on the free plan receive
a 403 response.
import { DMS } from "@ptkl/sdk/beta"
const dms = new DMS()
// Extract from a DMS file
const result = await dms.ocrExtract({
model_uuid: "your-ocr-model-uuid",
image_key: "uploads/id-card-scan.jpg",
document_boundary: {
x: 0.05,
y: 0.08,
width: 0.9,
height: 0.85,
},
})
console.log(result.data.results)
// { name: "John Smith", date_of_birth: "01.01.1990", id_number: "123456789" }
// Extract from base64 image
const result2 = await dms.ocrExtract({
model_uuid: "your-ocr-model-uuid",
image: "data:image/jpeg;base64,/9j/4AAQ...",
})
curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/ocr/extract \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{
"model_uuid": "your-ocr-model-uuid",
"image_key": "uploads/id-card-scan.jpg",
"document_boundary": {
"x": 0.05,
"y": 0.08,
"width": 0.9,
"height": 0.85
}
}'
OCR Extract Fields
| Field | Type | Required | Description |
|---|---|---|---|
| model_uuid | string | Yes | UUID of the OCR model to use |
| image_key | string | No | File path in the DMS library (provide this or image) |
| image | string | No | Base64-encoded image with data URI prefix |
| document_boundary | object | No | Document boundary rectangle as fractional coordinates (0.0–1.0). If omitted, the entire image is treated as the document. |
Document Boundary
The document_boundary object marks the edges of the document within the image.
Region coordinates in the OCR model are relative to this boundary, making
extraction resilient to different scan positions and scales.
| Field | Type | Description |
|---|---|---|
| x | number | Left edge (0.0–1.0 of image width) |
| y | number | Top edge (0.0–1.0 of image height) |
| width | number | Width (0.0–1.0 of image width) |
| height | number | Height (0.0–1.0 of image height) |
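If the document edges are known in pixels (for example from a cropping UI), they need to be converted to fractions of the image size. The sketch below shows one way to do that; `toDocumentBoundary` and the `rect` property names (`left`, `top`, `width`, `height`) are hypothetical.

```javascript
// Hypothetical helper: convert a pixel rectangle into the fractional
// document_boundary object described above.
function toDocumentBoundary(rect, imageWidth, imageHeight) {
  const boundary = {
    x: rect.left / imageWidth,
    y: rect.top / imageHeight,
    width: rect.width / imageWidth,
    height: rect.height / imageHeight,
  }
  const inRange = (value) => value >= 0 && value <= 1
  if (
    !Object.values(boundary).every(inRange) ||
    boundary.x + boundary.width > 1 ||
    boundary.y + boundary.height > 1
  ) {
    throw new RangeError("document boundary must lie inside the image")
  }
  return boundary
}

// A 1000x800 px document located at (500, 100) inside a 2000x1000 px scan.
console.log(toDocumentBoundary({ left: 500, top: 100, width: 1000, height: 800 }, 2000, 1000))
// { x: 0.25, y: 0.1, width: 0.5, height: 0.8 }
```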
OCR Extract Response
| Field | Type | Description |
|---|---|---|
| model_uuid | string | UUID of the model used |
| model_name | string | Name of the model used |
| results | Record<string, string> | Extracted text keyed by region label |
| errors | string[] | Per-region errors, if any |
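Because a response can carry both `results` and per-region `errors`, it is worth checking both before trusting the extraction. A minimal sketch follows; `reviewOcrResult` is a hypothetical helper and the sample response values are made up for illustration.

```javascript
// Hypothetical helper: report which expected labels are missing or empty,
// alongside any per-region errors from the response.
function reviewOcrResult(data, requiredLabels) {
  const results = data.results ?? {}
  const missing = requiredLabels.filter((label) => !(results[label] ?? "").trim())
  const errors = data.errors ?? []
  return { ok: missing.length === 0 && errors.length === 0, missing, errors }
}

// Sample response body (illustrative values only).
const review = reviewOcrResult(
  {
    model_uuid: "your-ocr-model-uuid",
    model_name: "ID card",
    results: { name: "John Smith", id_number: "" },
    errors: [],
  },
  ["name", "id_number"],
)
console.log(review) // { ok: false, missing: [ 'id_number' ], errors: [] }
```

With the SDK examples above, the first argument would be `result.data`.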
Data Conversion
Convert data bidirectionally between JSON, CSV, and Excel formats.
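As a rough sketch of a JSON to CSV export: the SDK reference lists `dms.jsonToCsv(data, options?)`, but its exact input and response shapes are not documented on this page, so the code below only assumes it accepts an array of flat objects. `exportOrdersAsCsv` and the order fields are hypothetical.

```javascript
// Sketch: `dms` is assumed to be a DMS instance (new DMS()).
async function exportOrdersAsCsv(dms, orders) {
  // Assumption: flat rows convert most predictably, so nested
  // customer objects are reduced to a single column first.
  const rows = orders.map((order) => ({
    id: order.id,
    customer: order.customer.name,
    total: order.total,
  }))
  return dms.jsonToCsv(rows)
}
```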
JSON to CSV
JSON to Excel
CSV to JSON
CSV to Excel
Excel to JSON
Excel to CSV
Generic Conversion
Data Validation
Data Info
Document Templates
Create, manage, and version structured documents. Document template content uses
Go text/template syntax for dynamic rendering.
Template Engine
Document templates use Go text/template, not Mustache or Handlebars.
Use {{.FieldName}} to reference data fields. Dot access, range loops,
conditionals, and pipes are all supported.
Create a Document
Create Document Fields
| Field | Type | Required | Description |
|---|---|---|---|
| path | string | Yes | Document path (directory + name) |
| content | string | No | Template content (Go text/template syntax) |
| variant | string | No | Variant name (defaults to "default") |
| sources | DocumentSource[] | No | Data source definitions for the template |
| fields | DocumentField[] | No | Field definitions with labels and mapping |
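A minimal sketch of creating a template with the fields above, assuming `dms` is a `DMS` instance and that `dms.createDocument` (listed in the SDK reference) takes them as one payload object. `createInvoiceTemplate` is a hypothetical helper; the template keys match the data used in the Generate Document example.

```javascript
// Sketch: `dms` is assumed to be a DMS instance (new DMS()).
async function createInvoiceTemplate(dms) {
  // Go text/template content; keys match the generateDocument data map.
  const content = [
    "<h1>Invoice {{.Number}} for {{.CompanyName}}</h1>",
    "{{range .Items}}<p>{{.Name}}: {{.Qty}} x {{.Price}}</p>{{end}}",
    "<p>Total: {{.Total}}</p>",
  ].join("\n")
  return dms.createDocument({
    path: "templates/invoice",
    content,
    variant: "default",
  })
}
```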
List Documents
Get Document
Update Document
Delete, Restore, Comment
import { DMS } from "@ptkl/sdk/beta"
const dms = new DMS()
// Delete
await dms.deleteDocument("templates/invoice")
// List deleted documents
const deleted = await dms.listDeletedDocuments()
// Restore
await dms.restoreDocument({ path: "templates/invoice" })
// Add comment
await dms.createDocumentComment({
path: "templates/invoice",
content: "Updated VAT calculation section",
variant: "default",
})
Generate Document
Render a stored document template with data. Fetches the template, applies
Go text/template rendering with the provided data map, and returns the
rendered content.
import { DMS } from "@ptkl/sdk/beta"
const dms = new DMS()
const result = await dms.generateDocument({
path: "templates/invoice",
variant: "default",
data: {
CompanyName: "Acme Corp",
Number: "INV-001",
Items: [
{ Name: "Widget A", Qty: 10, Price: "100.00" },
{ Name: "Widget B", Qty: 5, Price: "200.00" },
],
Total: "2000.00",
},
})
console.log(result.data.content) // rendered HTML string
curl -X POST https://lemon.protokol.io/luma/integrations/v2/dms/documents/document/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{
"path": "templates/invoice",
"variant": "default",
"data": {
"CompanyName": "Acme Corp",
"Number": "INV-001",
"Total": "2000.00"
}
}'
Generate Document Fields
| Field | Type | Required | Description |
|---|---|---|---|
| path | string | Yes | Path to the document template |
| variant | string | No | Variant to render (defaults to "default") |
| data | Record<string, any> | Yes | Data map passed to text/template execution |
Generate Document Response
| Field | Type | Description |
|---|---|---|
| status | boolean | Whether generation succeeded |
| content | string | The rendered document content |
Generate + PDF
To produce a PDF from a generated document, pass the rendered content
to html2pdf:
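A minimal sketch of that two-step flow, mirroring the platform function example later on this page. `generateInvoicePdf` is a hypothetical wrapper, and `dms` is assumed to be a `DMS` instance.

```javascript
// Sketch: `dms` is assumed to be a DMS instance (new DMS()).
async function generateInvoicePdf(dms, data) {
  // Step 1: render the stored template to HTML.
  const rendered = await dms.generateDocument({
    path: "templates/invoice",
    variant: "default",
    data,
  })
  // Step 2: convert the rendered HTML to PDF and save it in the library.
  return dms.html2pdf({
    input_html: rendered.data.content,
    output_path: "generated/invoices",
    output_file_name: data.Number,
    replace: true,
  })
}
```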
Workflow Integration
File Events
The DMS publishes lifecycle events that can trigger workflows:
| Event | When |
|---|---|
| integration.dms.file.uploaded | File uploaded to DMS |
| integration.dms.file.downloaded | File downloaded from DMS |
| integration.dms.file.removed | File deleted from DMS |
| integration.dms.dir.removed | Directory removed from DMS |
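The event names follow an `integration.dms.<target>.<action>` pattern, which makes them easy to route in code. The sketch below only parses the name; how a workflow receives the event and what payload it carries are not covered here, and `parseDmsEvent` is a hypothetical helper.

```javascript
// The four event names listed in the table above.
const DMS_EVENTS = new Set([
  "integration.dms.file.uploaded",
  "integration.dms.file.downloaded",
  "integration.dms.file.removed",
  "integration.dms.dir.removed",
])

// Hypothetical helper: split a known DMS event name into target and action.
// Returns null for any name that is not in the table.
function parseDmsEvent(eventName) {
  if (!DMS_EVENTS.has(eventName)) return null
  const [, , target, action] = eventName.split(".")
  return { target, action }
}

console.log(parseDmsEvent("integration.dms.file.uploaded")) // { target: 'file', action: 'uploaded' }
console.log(parseDmsEvent("integration.dms.dir.uploaded")) // null
```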
Permissions
| Permission | Description |
|---|---|
| use | Basic DMS access |
| read | Read and list files |
| library_edit | Manage library structure |
| delete | Delete files |
| upload | Upload files |
| create_document | Create template documents |
| delete_document | Delete template documents |
| read_documents | View template documents |
| edit_document | Edit template documents |
| utilities | Use conversion and extraction utilities |
Using in Platform Functions
module.exports = async function ({ $sdk, $input, response }) {
const { DMS } = $sdk.version("0.10")
const dms = new DMS()
// Generate a document from template with data
const rendered = await dms.generateDocument({
path: "templates/invoice",
data: {
CompanyName: $input.body.company,
Number: $input.body.invoice_number,
Amount: $input.body.amount,
},
})
// Convert to PDF and save
const pdf = await dms.html2pdf({
input_html: rendered.data.content,
output_path: "generated/invoices",
output_file_name: $input.body.invoice_number,
replace: true,
})
response.json({ success: true, pdf: pdf.data })
}
SDK Reference
import { DMS } from "@ptkl/sdk/beta"
// Or via the Integrations facade
import { Integrations } from "@ptkl/sdk/beta"
const integrations = new Integrations()
const dms = integrations.getDMS()
File Operations
| Method | Description |
|---|---|
| dms.upload(payload) | Upload files (multipart form data) |
| dms.uploadBase64(data) | Upload files via base64-encoded content |
| dms.list(data) | List files in a library |
| dms.download(key) | Download a file by key |
| dms.getMedia(key, encoding) | Get file with specific encoding |
| dms.delete(data) | Delete files |
| dms.getExifData(key) | Extract EXIF data from an image |
| dms.libraries() | List available libraries |
| dms.dirs(path) | List directories |
| dms.createDir(path) | Create a directory |
| dms.deleteDir(path) | Delete a directory |
PDF Operations
| Method | Description |
|---|---|
| dms.html2pdf(options) | Convert HTML to PDF (supports template data) |
| dms.pdf2html(options) | Convert PDF to HTML |
| dms.fillPdf(options) | Fill a PDF form template |
Data Conversion
| Method | Description |
|---|---|
| dms.convertData(data, params) | Generic format conversion |
| dms.jsonToCsv(data, options?) | Convert JSON to CSV |
| dms.jsonToExcel(data, options?) | Convert JSON to Excel |
| dms.csvToJson(csvData) | Convert CSV to JSON |
| dms.csvToExcel(csvData, options?) | Convert CSV to Excel |
| dms.excelToJson(data, options?) | Convert Excel to JSON |
| dms.excelToCsv(data, options?) | Convert Excel to CSV |
| dms.validateData(data, params) | Validate data format |
| dms.getDataInfo(data, params) | Get data structure info |
Text Extraction
| Method | Description |
|---|---|
| dms.extractText(payload) | Extract text from any document (PDF, DOCX, images, 40+ formats) |
OCR Extraction
| Method | Description |
|---|---|
| dms.ocrExtract(payload) | Extract structured data using a pre-defined OCR model (paid plan) |
Document Templates
| Method | Description |
|---|---|
| dms.createDocument(payload) | Create a template document |
| dms.getDocument(path) | Get a document by path (JSON) |
| dms.getDocumentRaw(path) | Get raw document content (.pdoc) |
| dms.updateDocument(payload) | Update a document variant |
| dms.deleteDocument(path) | Delete a document |
| dms.listDocuments(params?) | List documents |
| dms.listDeletedDocuments(params?) | List deleted documents |
| dms.restoreDocument(payload) | Restore a deleted document |
| dms.createDocumentComment(payload) | Add a comment to a document |
| dms.generateDocument(payload) | Render a template with data |