DMS (Document Management System)

Protokol DMS is a built-in integration for storing, managing, and converting files in your Protokol project. It includes a template document engine, file conversion utilities, and a full media library.


Overview

  • File storage — upload, download, and organize files in project libraries
  • Document templates — create, version, and render documents using Go text/template
  • PDF generation — convert HTML to PDF with template data, fill PDF forms
  • Data conversion — bidirectional conversion between JSON, CSV, and Excel
  • Text extraction — extract text from PDF, DOCX, XLSX, images, and 40+ formats
  • OCR extraction — extract structured data from documents using reusable OCR models
  • EXIF extraction — extract metadata from image files
  • Workflow events — react to file upload, download, and deletion events

Uploading Files

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()

// Multipart upload
const result = await dms.upload({
  files: [file1, file2],
  uploadDir: "/invoices/2024",
  public: true,
  replace: false,
  metadata: { category: "finance" },
})

curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@invoice.pdf" \
  -F "path=/invoices/2024" \
  -F "is_public=true"

Upload via Base64

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()
const result = await dms.uploadBase64({
  path: "/documents/reports",
  is_public: false,
  replace: true,
  files: [
    {
      name: "report.pdf",
      content_type: "application/pdf",
      data: "JVBERi0xLjQK...",
    },
  ],
})

curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/upload \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "path": "/documents/reports",
    "is_public": false,
    "files": [{
      "name": "report.pdf",
      "content_type": "application/pdf",
      "data": "JVBERi0xLjQK..."
    }]
  }'

Upload Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| files | File[] | Yes | Array of files to upload (multipart) |
| uploadDir | string | Yes | Target directory path |
| public | boolean | No | Make files publicly accessible |
| replace | boolean | No | Replace existing files with same name |
| expiresAt | string | No | Expiration timestamp (ISO 8601) |
| metadata | Record<string, string> | No | Key-value metadata pairs |
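The expiresAt field takes an ISO 8601 timestamp. One way to produce it is sketched below. The helper expiresInDays is hypothetical (it is not part of the SDK), and the assumption that the service accepts the UTC form produced by Date.prototype.toISOString() is not confirmed by this page.

```typescript
// Hypothetical helper, not part of @ptkl/sdk.
// Returns an ISO 8601 (UTC) timestamp `days` days after `from`.
function expiresInDays(days: number, from: Date = new Date()): string {
  if (!isFinite(days) || days <= 0) {
    throw new RangeError("days must be a positive number")
  }
  const msPerDay = 24 * 60 * 60 * 1000
  return new Date(from.getTime() + days * msPerDay).toISOString()
}

// Upload options using the field names from the table above.
// The `files` array is omitted because it depends on your runtime.
const uploadOptions = {
  uploadDir: "/invoices/2024",
  public: false,
  replace: false,
  expiresAt: expiresInDays(30),
  metadata: { category: "finance" },
}
```

The resulting object can be spread into the dms.upload() payload together with files.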

Base64 Upload Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| path | string | Yes | Target directory path |
| is_public | boolean | No | Make files publicly accessible |
| replace | boolean | No | Replace existing files with same name |
| files | array | Yes | Array of { name, content_type, data } objects |
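Each entry in files carries the file content as a base64 string. The sketch below builds one entry from raw bytes. toBase64File is a hypothetical helper, and it assumes data is plain base64 without a data URI prefix, as in the sample payload above. It uses the global btoa, which is available in browsers and recent Node.js versions.

```typescript
interface Base64File {
  name: string
  content_type: string
  data: string
}

// Hypothetical helper, not part of @ptkl/sdk.
// Encodes raw bytes as base64 and wraps them in the documented entry shape.
function toBase64File(name: string, contentType: string, bytes: Uint8Array): Base64File {
  let binary = ""
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i])
  }
  return { name, content_type: contentType, data: btoa(binary) }
}

// The first nine bytes of a PDF file: "%PDF-1.4\n"
const pdfHeader = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x34, 0x0a])
const entry = toBase64File("report.pdf", "application/pdf", pdfHeader)
// entry.data is "JVBERi0xLjQK"
```

The entry can then be passed as an element of files in dms.uploadBase64().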

Listing and Downloading Files

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()

// List files
const files = await dms.list({ path: "/invoices" })

// Download a file (returns Blob)
const content = await dms.download("invoices/2024/invoice-001.pdf")

// Get file with encoding (e.g. "base64")
const data = await dms.getMedia("invoices/2024/invoice-001.pdf", "base64")

# List files
curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/list \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "path": "/invoices" }'

# Download a file
curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/download \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "key": "invoices/2024/invoice-001.pdf" }' \
  --output invoice.pdf
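When the SDK is not available, the download request shown in the curl example can also be issued with fetch. The sketch below only builds the request; buildDownloadCall is a hypothetical helper, and the assumption that the response body is the raw file bytes is inferred from the use of --output in the curl example.

```typescript
const DMS_MEDIA_BASE = "https://lemon.protokol.io/luma/integrations/v3/dms/media"

interface RestCall {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

// Hypothetical helper: mirrors the curl download example above.
function buildDownloadCall(token: string, key: string): RestCall {
  return {
    url: `${DMS_MEDIA_BASE}/download`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ key }),
    },
  }
}

// Usage (not executed here, needs a valid token and network access):
//   const { url, init } = buildDownloadCall(token, "invoices/2024/invoice-001.pdf")
//   const res = await fetch(url, init)
//   const bytes = new Uint8Array(await res.arrayBuffer())
```

Keeping the request construction separate from the network call makes it easy to check headers and body in a unit test.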

Directories

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()

// List directories
const directories = await dms.dirs("/invoices")

// Create a directory
await dms.createDir("/invoices/2024")

// Delete a directory
await dms.deleteDir("/invoices/old")

Libraries

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()
const libs = await dms.libraries()

HTML to PDF

Convert HTML to PDF using the Chrome rendering engine. The HTML may contain Go text/template syntax; pass a data map to inject values before conversion.

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()

// Simple HTML to PDF (inline HTML)
const pdf = await dms.html2pdf({
  input_html: "<h1>Invoice #123</h1><p>Total: $500</p>",
  output_pdf: true,
})

// HTML template with data injection
const pdf2 = await dms.html2pdf({
  input_html: "<h1>{{.Title}}</h1><p>Amount: {{.Amount}} RSD</p>",
  data: { Title: "Invoice #456", Amount: "15000" },
  output_type: "base64",
})

// From a file in the media library, save result back
const pdf3 = await dms.html2pdf({
  input_path: "templates/invoice.html",
  data: { CustomerName: "Acme Corp", InvoiceNumber: "INV-001" },
  output_path: "generated/invoices",
  output_file_name: "INV-001",
  replace: true,
})

curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/pdf/html2pdf \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "input_html": "<h1>{{.Title}}</h1><p>Amount: {{.Amount}}</p>",
    "data": { "Title": "Invoice", "Amount": "15000 RSD" },
    "output_pdf": true
  }'

HTML to PDF Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| input_html | string | No | Raw HTML string (provide this or input_path) |
| input_path | string | No | Path to an HTML file in the media library |
| output_path | string | No | Save generated PDF to this library path |
| output_pdf | boolean | No | Return PDF as file download |
| output_type | string | No | "pdf", "base64", or "file" |
| output_file_name | string | No | Name for the generated file (without extension) |
| replace | boolean | No | Replace existing file at output path |
| data | Record<string, any> | No | Template data for Go text/template rendering |

Template Syntax

HTML templates use Go text/template syntax, not Mustache. Use {{.FieldName}} to inject values, {{range .Items}}...{{end}} for loops, and {{if .Condition}}...{{end}} for conditionals. See the Go text/template documentation for the full syntax.
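As an illustration of the loop and conditional syntax, the sketch below defines a template and a matching data map in the shape accepted by html2pdf. The field names in data (Number, Items, Paid, Total) are made up for this example. hasBalancedBlocks is a hypothetical helper and only a rough heuristic for catching a forgotten {{end}}; it is not a template parser.

```typescript
// Go text/template source. Inside {{range}}, "." refers to the current item.
const invoiceTemplate = `
<h1>Invoice {{.Number}}</h1>
<table>
  {{range .Items}}
  <tr><td>{{.Name}}</td><td>{{.Qty}}</td><td>{{.Price}}</td></tr>
  {{end}}
</table>
{{if .Paid}}<p>Paid in full</p>{{else}}<p>Amount due: {{.Total}} RSD</p>{{end}}
`

// Options in the shape documented for html2pdf.
const html2pdfOptions = {
  input_html: invoiceTemplate,
  data: {
    Number: "INV-001",
    Items: [
      { Name: "Widget A", Qty: 10, Price: "100.00" },
      { Name: "Widget B", Qty: 5, Price: "200.00" },
    ],
    Paid: false,
    Total: "2000.00",
  },
  output_pdf: true,
}

// Hypothetical pre-flight check: every block opener needs a matching {{end}}.
function hasBalancedBlocks(template: string): boolean {
  const opens = template.match(/\{\{-?\s*(range|if|with|define|block)\b/g) || []
  const ends = template.match(/\{\{-?\s*end\s*-?\}\}/g) || []
  return opens.length === ends.length
}
```

Running hasBalancedBlocks(invoiceTemplate) before calling dms.html2pdf(html2pdfOptions) catches the most common authoring mistake without a round trip to the service.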


PDF to HTML

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()
const html = await dms.pdf2html({
  input_path: "documents/report.pdf",
  output_path: "converted",
  output_file_name: "report",
})

PDF to HTML Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| input_path | string | No | Path to a PDF file in the media library |
| input_pdf | string | No | Raw PDF content |
| output_path | string | No | Save converted HTML to this library path |
| output_file_name | string | No | Name for the generated file |
| replace | boolean | No | Replace existing file at output path |

Fill PDF Form

Fill a PDF form template with typed field values.

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()
const filled = await dms.fillPdf({
  input_path: "templates/application-form.pdf",
  output_path: "generated/forms",
  output_file_name: "filled-form",
  form_data: {
    full_name: "John Doe",
    date: "2024-03-15",
    total: "1500.00",
  },
})

// Or with typed form structure
const filled2 = await dms.fillPdf({
  input_path: "templates/application-form.pdf",
  output_pdf: true,
  forms: [{
    textfield: [
      { name: "full_name", value: "John Doe" },
      { name: "address", value: "123 Main St" },
    ],
    checkbox: [
      { name: "agree_terms", value: true },
    ],
    datefield: [
      { name: "sign_date", value: "2024-03-15" },
    ],
  }],
})

curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/pdf/fill \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "input_path": "templates/application-form.pdf",
    "output_pdf": true,
    "form_data": {
      "full_name": "John Doe",
      "date": "2024-03-15"
    }
  }'

Fill PDF Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| input_path | string | No | Path to PDF form template in library |
| output_path | string | No | Save filled PDF to this library path |
| output_pdf | boolean | No | Return PDF as file download |
| output_type | string | No | "pdf", "base64", or "file" |
| output_file_name | string | No | Name for the generated file |
| replace | boolean | No | Replace existing file at output path |
| form_data | Record<string, any> | No | Simple key-value field data |
| forms | PDFFormStructure[] | No | Typed form structure (textfield, checkbox, datefield, etc.) |

EXIF Data Extraction

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()
const exif = await dms.getExifData("photos/img-001.jpg")

curl -X GET "https://lemon.protokol.io/luma/integrations/v3/dms/media/exif/photos/img-001.jpg" \
  -H "Authorization: Bearer $TOKEN"

Text Extraction

Extract text content from documents in many formats: PDF, DOCX, XLSX, PPTX, ODT, RTF, EPUB, CSV, images, and 40+ other types. Provide either a file path from the DMS library or base64-encoded file data.

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()

// From a DMS file path
const result = await dms.extractText({
  input_path: "uploads/contract.pdf",
})

console.log(result.data.text)       // extracted text content
console.log(result.data.char_count) // character count

// From base64-encoded data
const result2 = await dms.extractText({
  file_data: "data:application/pdf;base64,JVBERi0xLjQK...",
  file_name: "invoice.pdf",
})

# From a DMS file path
curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/extract-text \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{ "input_path": "uploads/contract.pdf" }'

# From base64
curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/extract-text \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "file_data": "data:application/pdf;base64,JVBERi0xLjQK...",
    "file_name": "invoice.pdf"
  }'

Extract Text Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| input_path | string | No | File path in the DMS library (provide this or file_data) |
| file_data | string | No | Base64-encoded file content with data URI prefix |
| file_name | string | No | Original file name with extension; helps detect format when using file_data |
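Unlike the base64 upload, file_data includes a data URI prefix. A minimal sketch of building that value from raw bytes follows. toDataUri is a hypothetical helper, not an SDK function, and it relies on the global btoa available in browsers and recent Node.js versions.

```typescript
// Hypothetical helper, not part of @ptkl/sdk.
// Produces "data:<content type>;base64,<payload>" from raw bytes.
function toDataUri(contentType: string, bytes: Uint8Array): string {
  let binary = ""
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i])
  }
  return `data:${contentType};base64,${btoa(binary)}`
}

// The first nine bytes of a PDF file: "%PDF-1.4\n"
const pdfStart = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x34, 0x0a])

// Payload in the shape documented for extractText.
const extractPayload = {
  file_data: toDataUri("application/pdf", pdfStart),
  file_name: "invoice.pdf",
}
// extractPayload.file_data is "data:application/pdf;base64,JVBERi0xLjQK"
```

Passing file_name alongside file_data gives the service the extension it uses for format detection.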

Extract Text Response

| Field | Type | Description |
| --- | --- | --- |
| text | string | Extracted text content |
| char_count | number | Number of characters in the extracted text |

Supported Formats

Text extraction supports 40+ formats including PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, RTF, EPUB, CSV, TSV, HTML, XML, Markdown, plain text, and image files (via OCR). See the full list in the API response headers.


OCR Extraction

Extract structured data from documents using pre-defined OCR models. Models are created in the DMS UI by drawing labeled regions on sample documents — each region maps to a field you want to extract (name, date of birth, ID number, etc.).

Once a model is saved, use the SDK or API to run extraction on new documents and get structured key-value results.

Paid Plan Required

OCR extraction requires the paid DMS plan. Projects on the free plan receive a 403 response.

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()

// Extract from a DMS file
const result = await dms.ocrExtract({
  model_uuid: "your-ocr-model-uuid",
  image_key: "uploads/id-card-scan.jpg",
  document_boundary: {
    x: 0.05,
    y: 0.08,
    width: 0.9,
    height: 0.85,
  },
})

console.log(result.data.results)
// { name: "John Smith", date_of_birth: "01.01.1990", id_number: "123456789" }

// Extract from base64 image
const result2 = await dms.ocrExtract({
  model_uuid: "your-ocr-model-uuid",
  image: "data:image/jpeg;base64,/9j/4AAQ...",
})

curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/ocr/extract \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "model_uuid": "your-ocr-model-uuid",
    "image_key": "uploads/id-card-scan.jpg",
    "document_boundary": {
      "x": 0.05,
      "y": 0.08,
      "width": 0.9,
      "height": 0.85
    }
  }'

OCR Extract Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model_uuid | string | Yes | UUID of the OCR model to use |
| image_key | string | No | File path in the DMS library (provide this or image) |
| image | string | No | Base64-encoded image with data URI prefix |
| document_boundary | object | No | Document boundary rectangle as fractional coordinates (0.0–1.0). If omitted, the entire image is treated as the document. |

Document Boundary

The document_boundary object marks the edges of the document within the image. Region coordinates in the OCR model are relative to this boundary, making extraction resilient to different scan positions and scales.

| Field | Type | Description |
| --- | --- | --- |
| x | number | Left edge (0.0–1.0 of image width) |
| y | number | Top edge (0.0–1.0 of image height) |
| width | number | Width (0.0–1.0 of image width) |
| height | number | Height (0.0–1.0 of image height) |
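If the document edges are known in pixels (for example from a cropping UI), they need to be converted to fractions of the image size. The sketch below shows one way to do that; toDocumentBoundary and the PixelRect type are hypothetical names introduced for illustration.

```typescript
interface PixelRect { x: number; y: number; width: number; height: number }
interface DocumentBoundary { x: number; y: number; width: number; height: number }

// Hypothetical helper, not part of @ptkl/sdk.
// Converts a pixel rectangle into fractions of the image width and height.
function toDocumentBoundary(rect: PixelRect, imageWidth: number, imageHeight: number): DocumentBoundary {
  if (imageWidth <= 0 || imageHeight <= 0) {
    throw new RangeError("image dimensions must be positive")
  }
  // Validate in pixel space to avoid floating-point edge cases.
  const inside =
    rect.x >= 0 && rect.y >= 0 && rect.width > 0 && rect.height > 0 &&
    rect.x + rect.width <= imageWidth && rect.y + rect.height <= imageHeight
  if (!inside) {
    throw new RangeError("rectangle must lie inside the image")
  }
  return {
    x: rect.x / imageWidth,
    y: rect.y / imageHeight,
    width: rect.width / imageWidth,
    height: rect.height / imageHeight,
  }
}

// A 2000x1000 px scan with the document starting at (100, 80).
const boundary = toDocumentBoundary({ x: 100, y: 80, width: 1800, height: 850 }, 2000, 1000)
// boundary is { x: 0.05, y: 0.08, width: 0.9, height: 0.85 }
```

The result can be passed directly as document_boundary in the ocrExtract payload.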

OCR Extract Response

| Field | Type | Description |
| --- | --- | --- |
| model_uuid | string | UUID of the model used |
| model_name | string | Name of the model used |
| results | Record<string, string> | Extracted text keyed by region label |
| errors | string[] | Per-region errors, if any |
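Because extraction can fail for individual regions, it is usually worth checking that every label you depend on came back with a value. The sketch below assumes the response shape from the table above, with errors possibly absent; missingLabels is a hypothetical helper.

```typescript
// Response shape taken from the table above; `errors` is assumed optional.
interface OcrExtractData {
  model_uuid: string
  model_name: string
  results: Record<string, string>
  errors?: string[]
}

// Hypothetical helper, not part of @ptkl/sdk.
// Returns the required labels that are absent or blank in `results`.
function missingLabels(data: OcrExtractData, required: string[]): string[] {
  return required.filter((label) => {
    const value = data.results[label]
    return value === undefined || value.trim() === ""
  })
}

// Usage with a response obtained from dms.ocrExtract (not executed here):
//   const missing = missingLabels(result.data, ["name", "date_of_birth", "id_number"])
//   if (missing.length > 0) { /* route the document to manual review */ }
```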

Data Conversion

Convert data bidirectionally between JSON, CSV, and Excel formats.

JSON to CSV

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()
const csv = await dms.jsonToCsv([
  { name: "Widget A", price: 10.99, quantity: 100 },
  { name: "Widget B", price: 25.50, quantity: 50 },
])

curl -X POST https://lemon.protokol.io/luma/integrations/v3/dms/media/convert/json-to-csv \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '[{"name":"Widget A","price":10.99},{"name":"Widget B","price":25.50}]'

JSON to Excel

const excel = await dms.jsonToExcel(data, { sheet_name: "Products" })

CSV to JSON

const json = await dms.csvToJson(csvString)

CSV to Excel

const excel = await dms.csvToExcel(csvString, { sheet_name: "Data" })

Excel to JSON

const json = await dms.excelToJson(excelBlob, { sheet_name: "Sheet1" })

Excel to CSV

const csv = await dms.excelToCsv(excelBlob)

Generic Conversion

const result = await dms.convertData(inputData, {
  from: "json",
  to: "csv",
  header_as_comment: true,
  field_order: ["name", "price", "quantity"],
})

curl -X POST "https://lemon.protokol.io/luma/integrations/v3/dms/media/convert/data?from=json&to=csv" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '[{"name":"Widget A","price":10.99}]'
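To make the role of field_order concrete, the sketch below performs a JSON-to-CSV conversion locally with a fixed column order. It illustrates the general technique with RFC 4180 style quoting and is not the service's implementation, so the actual output of convertData (quoting, line endings, number formatting) may differ. toCsv and quoteCsv are hypothetical names.

```typescript
type CsvRow = Record<string, string | number | boolean | null>

// Quote a value only when it contains a comma, quote, or line break.
function quoteCsv(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value
}

// Local illustration only: columns are emitted in `fieldOrder`,
// and fields missing from a row become empty cells.
function toCsv(rows: CsvRow[], fieldOrder: string[]): string {
  const lines = [fieldOrder.map(quoteCsv).join(",")]
  for (const row of rows) {
    const cells = fieldOrder.map((field) => {
      const value = row[field]
      return quoteCsv(value === null || value === undefined ? "" : String(value))
    })
    lines.push(cells.join(","))
  }
  return lines.join("\n")
}

const csvText = toCsv(
  [
    { name: "Widget A", price: 10.99, quantity: 100 },
    { name: 'Widget "B", large', price: 25.5, quantity: 50 },
  ],
  ["name", "price", "quantity"],
)
```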

Data Validation

const validation = await dms.validateData(inputData, { format: "json" })
// { valid: true, message: "Data is valid JSON format" }

Data Info

const info = await dms.getDataInfo(inputData, { format: "csv" })
// { format: "csv", size_bytes: 245, record_count: 10, field_count: 3, fields: [...] }

Document Templates

Create, manage, and version structured documents. Document template content uses Go text/template syntax for dynamic rendering.

Template Engine

Document templates use Go text/template, not Mustache or Handlebars. Use {{.FieldName}} to reference data fields. Dot access, range loops, conditionals, and pipes are all supported.

Create a Document

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()
const doc = await dms.createDocument({
  path: "templates/invoice",
  content: "<h1>{{.CompanyName}}</h1><p>Invoice #{{.Number}}</p>",
  variant: "default",
})

curl -X POST https://lemon.protokol.io/luma/integrations/v2/dms/documents/document \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "path": "templates/invoice",
    "content": "<h1>{{.CompanyName}}</h1><p>Invoice #{{.Number}}</p>",
    "variant": "default"
  }'

Create Document Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| path | string | Yes | Document path (directory + name) |
| content | string | No | Template content (Go text/template syntax) |
| variant | string | No | Variant name (defaults to "default") |
| sources | DocumentSource[] | No | Data source definitions for the template |
| fields | DocumentField[] | No | Field definitions with labels and mapping |

List Documents

const docs = await dms.listDocuments({ searchText: "invoice" })

curl -X GET "https://lemon.protokol.io/luma/integrations/v2/dms/documents/list?search=invoice" \
  -H "Authorization: Bearer $TOKEN"

Get Document

const doc = await dms.getDocument("templates/invoice")

// Raw binary format (.pdoc)
const raw = await dms.getDocumentRaw("templates/invoice")

Update Document

await dms.updateDocument({
  path: "templates/invoice",
  variant: "default",
  version: 1,
  content: "<h1>{{.CompanyName}}</h1><p>Updated Invoice #{{.Number}}</p>",
})

Delete, Restore, Comment

// Delete
await dms.deleteDocument("templates/invoice")

// List deleted documents
const deleted = await dms.listDeletedDocuments()

// Restore
await dms.restoreDocument({ path: "templates/invoice" })

// Add comment
await dms.createDocumentComment({
  path: "templates/invoice",
  content: "Updated VAT calculation section",
  variant: "default",
})

Generate Document

Render a stored document template with data. The call fetches the template, applies Go text/template rendering with the provided data map, and returns the rendered content.

import { DMS } from "@ptkl/sdk/beta"

const dms = new DMS()
const result = await dms.generateDocument({
  path: "templates/invoice",
  variant: "default",
  data: {
    CompanyName: "Acme Corp",
    Number: "INV-001",
    Items: [
      { Name: "Widget A", Qty: 10, Price: "100.00" },
      { Name: "Widget B", Qty: 5, Price: "200.00" },
    ],
    Total: "2000.00",
  },
})

console.log(result.data.content) // rendered HTML string

curl -X POST https://lemon.protokol.io/luma/integrations/v2/dms/documents/document/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "path": "templates/invoice",
    "variant": "default",
    "data": {
      "CompanyName": "Acme Corp",
      "Number": "INV-001",
      "Total": "2000.00"
    }
  }'

Generate Document Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| path | string | Yes | Path to the document template |
| variant | string | No | Variant to render (defaults to "default") |
| data | Record<string, any> | Yes | Data map passed to text/template execution |

Generate Document Response

| Field | Type | Description |
| --- | --- | --- |
| status | boolean | Whether generation succeeded |
| content | string | The rendered document content |

Generate + PDF

To produce a PDF from a generated document, pass the rendered content to html2pdf:

const rendered = await dms.generateDocument({
  path: "templates/invoice",
  data: { CompanyName: "Acme Corp", Number: "INV-001" },
})

const pdf = await dms.html2pdf({
  input_html: rendered.data.content,
  output_pdf: true,
})

Workflow Integration

File Events

The DMS publishes lifecycle events that can trigger workflows:

| Event | When |
| --- | --- |
| integration.dms.file.uploaded | File uploaded to DMS |
| integration.dms.file.downloaded | File downloaded from DMS |
| integration.dms.file.removed | File deleted from DMS |
| integration.dms.dir.removed | Directory removed from DMS |
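In code that reacts to these events, it can help to treat the event names as a closed set. The sketch below does that with the four names from the table. How a workflow actually receives the event name and payload is not described here, so parseDmsEvent is a hypothetical helper that works on the name alone.

```typescript
// The four event names listed in the table above.
const DMS_EVENTS = [
  "integration.dms.file.uploaded",
  "integration.dms.file.downloaded",
  "integration.dms.file.removed",
  "integration.dms.dir.removed",
] as const

type DmsEvent = typeof DMS_EVENTS[number]

function isDmsEvent(eventName: string): eventName is DmsEvent {
  return (DMS_EVENTS as readonly string[]).indexOf(eventName) !== -1
}

// Hypothetical helper: splits a known event name into resource and action.
// Returns null for names that are not in the documented list.
function parseDmsEvent(eventName: string): { resource: string; action: string } | null {
  if (!isDmsEvent(eventName)) {
    return null
  }
  const parts = eventName.split(".")
  return { resource: parts[2], action: parts[3] }
}
```

A handler can then switch on action after confirming that resource is "file" or "dir".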

Permissions

| Permission | Description |
| --- | --- |
| use | Basic DMS access |
| read | Read and list files |
| library_edit | Manage library structure |
| delete | Delete files |
| upload | Upload files |
| create_document | Create template documents |
| delete_document | Delete template documents |
| read_documents | View template documents |
| edit_document | Edit template documents |
| utilities | Use conversion and extraction utilities |

Using in Platform Functions

module.exports = async function ({ $sdk, $input, response }) {
  const { DMS } = $sdk.version("0.10")
  const dms = new DMS()

  // Generate a document from template with data
  const rendered = await dms.generateDocument({
    path: "templates/invoice",
    data: {
      CompanyName: $input.body.company,
      Number: $input.body.invoice_number,
      Amount: $input.body.amount,
    },
  })

  // Convert to PDF and save
  const pdf = await dms.html2pdf({
    input_html: rendered.data.content,
    output_path: "generated/invoices",
    output_file_name: $input.body.invoice_number,
    replace: true,
  })

  response.json({ success: true, pdf: pdf.data })
}

SDK Reference

import { DMS } from "@ptkl/sdk/beta"

// Or via the Integrations facade
import { Integrations } from "@ptkl/sdk/beta"
const integrations = new Integrations()
const dms = integrations.getDMS()

File Operations

| Method | Description |
| --- | --- |
| dms.upload(payload) | Upload files (multipart form data) |
| dms.uploadBase64(data) | Upload files via base64-encoded content |
| dms.list(data) | List files in a library |
| dms.download(key) | Download a file by key |
| dms.getMedia(key, encoding) | Get file with specific encoding |
| dms.delete(data) | Delete files |
| dms.getExifData(key) | Extract EXIF data from an image |
| dms.libraries() | List available libraries |
| dms.dirs(path) | List directories |
| dms.createDir(path) | Create a directory |
| dms.deleteDir(path) | Delete a directory |

PDF Operations

| Method | Description |
| --- | --- |
| dms.html2pdf(options) | Convert HTML to PDF (supports template data) |
| dms.pdf2html(options) | Convert PDF to HTML |
| dms.fillPdf(options) | Fill a PDF form template |

Data Conversion

| Method | Description |
| --- | --- |
| dms.convertData(data, params) | Generic format conversion |
| dms.jsonToCsv(data, options?) | Convert JSON to CSV |
| dms.jsonToExcel(data, options?) | Convert JSON to Excel |
| dms.csvToJson(csvData) | Convert CSV to JSON |
| dms.csvToExcel(csvData, options?) | Convert CSV to Excel |
| dms.excelToJson(data, options?) | Convert Excel to JSON |
| dms.excelToCsv(data, options?) | Convert Excel to CSV |
| dms.validateData(data, params) | Validate data format |
| dms.getDataInfo(data, params) | Get data structure info |

Text Extraction

| Method | Description |
| --- | --- |
| dms.extractText(payload) | Extract text from any document (PDF, DOCX, images, 40+ formats) |

OCR Extraction

| Method | Description |
| --- | --- |
| dms.ocrExtract(payload) | Extract structured data using a pre-defined OCR model (paid plan) |

Document Templates

| Method | Description |
| --- | --- |
| dms.createDocument(payload) | Create a template document |
| dms.getDocument(path) | Get a document by path (JSON) |
| dms.getDocumentRaw(path) | Get raw document content (.pdoc) |
| dms.updateDocument(payload) | Update a document variant |
| dms.deleteDocument(path) | Delete a document |
| dms.listDocuments(params?) | List documents |
| dms.listDeletedDocuments(params?) | List deleted documents |
| dms.restoreDocument(payload) | Restore a deleted document |
| dms.createDocumentComment(payload) | Add a comment to a document |
| dms.generateDocument(payload) | Render a template with data |