Quantra Documentation

Glossary

This glossary provides definitions for the key terms, acronyms, and concepts used throughout the Quantra documentation and platform. Terms are listed in alphabetical order.

AI Helper
A small node that adds an Artificial Intelligence (AI) capability to the tool or workbench it is attached to. Helpers are connected with a special edge type (a helper edge) and are used by tools such as Summary, NHS Summary, PII Detect, and the workbenches Q-ROT and Q-RoPA. Examples include the OpenAI helper, Anthropic Claude helper, Mistral helper, and Aleph Alpha helper.
AWS
Amazon Web Services. The public cloud platform whose Textract service is one of the reading modes available in HWT-OCR. Selecting the Textract mode requires AWS credentials configured by an administrator on the OCR microservice.
Canvas
The visual pipeline editor inside a Quantra project. The canvas is a two-dimensional workspace where users place nodes (datasources, tools, and workbenches) and connect them with edges to define the data flow and processing sequence of a pipeline. The canvas auto-saves; node positions are stored as x and y coordinates in the database.
Chain JSON
The JavaScript Object Notation (JSON) document the platform constructs when a pipeline run is initiated. The chain JSON describes the complete execution graph — the starting node, every node configuration, and every edge between them — and is stored on the GraphExecution model for auditability and replay. See the API Reference for the full schema.
CNO
Central Node Orchestrator. The execution engine that orchestrates pipeline runs. The CNO reads the chain JSON, performs a topological sort of the nodes, and executes each node in dependency order by communicating with the corresponding microservices over gRPC Remote Procedure Calls (gRPC). All microservice interactions from the platform go through the CNO; no direct microservice calls are made from the Django application.
Datasource
One of the three plugin kinds in Quantra. A datasource node brings information into a pipeline from a system outside Quantra — a network folder, a mailbox, a document library, a database, and so on. Datasource plugins are stored in /plugins/datasources/ and use "kind": "datasource" in their manifest.
DPO
Data Protection Officer. A role defined under the General Data Protection Regulation (GDPR) responsible for oversight of personal-data processing. The Q-RoPA workbench surfaces a dedicated administrator and DPO view for approving and signing off the Record of Processing Activities.
Edge
A directed connection between two nodes on the canvas. Edges define how data flows from one processing step to the next in a pipeline. Each edge has a source node, a target node, and an edge mode that determines the nature of the connection. Edges are stored as CanvasEdge records in the database.
Edge Mode
The type of connection represented by an edge. Quantra supports four edge modes:
  • Flow: standard data flow. The output of the source node is passed as input to the target node during pipeline execution. Nodes connected by flow edges are executed in topological order.
  • Interactive: the target node (typically a workbench) is paused for human interaction. The pipeline only continues once the reviewer has finished.
  • Reference: the target node is given a pointer to the source node's data rather than a copy of the data itself.
  • Helper: the source node is attached as a helper (typically an AI helper) to the target node. The helper does not appear as a separate stage in the pipeline.
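The four modes can be pictured as a small enumeration. The sketch below is illustrative only: the string values, the Edge class, and the helpers_for function are hypothetical names chosen for this example and are not taken from the CanvasEdge model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class EdgeMode(Enum):
    # String values are assumptions for this sketch, not the stored values.
    FLOW = "flow"
    INTERACTIVE = "interactive"
    REFERENCE = "reference"
    HELPER = "helper"


@dataclass(frozen=True)
class Edge:
    """Hypothetical stand-in for an edge: source node, target node, edge mode."""
    source: str
    target: str
    mode: EdgeMode


def helpers_for(target: str, edges: List[Edge]) -> List[str]:
    """Return the nodes attached to `target` through helper edges."""
    return [e.source for e in edges if e.target == target and e.mode is EdgeMode.HELPER]


edges = [
    Edge("folder", "summary", EdgeMode.FLOW),
    Edge("claude", "summary", EdgeMode.HELPER),
]
print(helpers_for("summary", edges))  # ['claude']
```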
eos
End of Stream. A boolean field in the standard Quantra frame format that indicates whether a frame is the last one in a streaming response. When eos is true, the receiver knows no more frames will follow and can finalise processing. Every microservice response stream must terminate with a frame where eos is true.
EXIF
Exchangeable Image File Format. A metadata standard embedded in photographs and video files (camera model, date taken, location, and so on). The Summary tool can extract EXIF metadata as part of its output.
FAISS
Facebook AI Similarity Search. An open-source library for efficient similarity search and clustering of dense vectors. Used by some Quantra microservices for vector-based document similarity searches such as duplicate detection and semantic matching.
Frame Format
The standard JSON structure used for all data exchange between the platform and its gRPC microservices. Every frame contains four top-level fields: oauth2 (Open Authorization context with a url field), data (an array of items, each with meta and content objects), tod (type of data: "object" or "table"), and eos (end of stream boolean).
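As an illustration, the four top-level fields can be checked with a few lines of Python. This is a minimal sketch, assuming frames are handled as plain dictionaries; validate_frame and the keys inside meta and content are hypothetical and do not come from the platform code.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ("oauth2", "data", "tod", "eos")
ALLOWED_TOD = ("object", "table")


def validate_frame(frame: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the frame looks well formed."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in frame]
    if problems:
        return problems
    if not isinstance(frame["oauth2"], dict) or "url" not in frame["oauth2"]:
        problems.append("oauth2 must be an object with a url field")
    if not isinstance(frame["data"], list):
        problems.append("data must be an array")
    else:
        for index, item in enumerate(frame["data"]):
            if not isinstance(item, dict) or not all(k in item for k in ("meta", "content")):
                problems.append(f"data[{index}] needs meta and content objects")
    if frame["tod"] not in ALLOWED_TOD:
        problems.append('tod must be "object" or "table"')
    if not isinstance(frame["eos"], bool):
        problems.append("eos must be a boolean")
    return problems


# Example frame; the meta and content keys are made up for illustration.
frame = {
    "oauth2": {"url": ""},
    "data": [{"meta": {"filename": "letter.pdf"}, "content": {"text": "Dear Sir"}}],
    "tod": "object",
    "eos": True,
}
print(validate_frame(frame))  # []
```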
GDPR
General Data Protection Regulation. The European Union regulation governing the processing of personal data. Several Quantra workbenches (Q-ROT, Q-SAR, Q-RoPA, Q-NHS-SAR) and the SAR Release tool are designed for GDPR-aligned workflows.
GPU
Graphics Processing Unit. The accelerator hardware recommended for running local Optical Character Recognition (OCR) reading modes such as PaddleOCR and TrOCR at production speed. GPU provisioning is an administrator responsibility.
gRPC
gRPC Remote Procedure Calls. A high-performance, open-source remote-procedure-call framework originally developed at Google. Quantra uses gRPC for all communication between the platform and its microservices. The protocol supports bidirectional streaming, which lets large documents be transferred efficiently and progress be reported in real time. All Quantra microservices implement the MicroService gRPC service with a single Call RPC.
GUID
Globally Unique Identifier. A 128-bit identifier in the form {XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}. Used by the M-Files datasource to identify which vault to read from.
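A short sketch of reading and writing the braced form with the Python standard library. The GUID below is made up for the example and does not identify a real vault; parse_guid and format_guid are hypothetical helper names.

```python
import uuid


def parse_guid(text: str) -> uuid.UUID:
    """Parse a GUID; uuid.UUID accepts the form with or without braces."""
    return uuid.UUID(text)


def format_guid(value: uuid.UUID) -> str:
    """Render a GUID in the braced, upper-case form."""
    return "{" + str(value).upper() + "}"


vault = parse_guid("{C840BE1B-5B47-4AC0-8EF7-835C166C8E24}")
print(format_guid(vault))     # {C840BE1B-5B47-4AC0-8EF7-835C166C8E24}
print(len(vault.bytes) * 8)   # 128
```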
HWT-OCR
Handwritten Text Optical Character Recognition. The Quantra tool that turns the text on a page (printed or handwritten) into machine-readable words, including the position of each word on the page. HWT-OCR supports three reading modes: Textract (cloud), PaddleOCR (local, printed), and TrOCR (local, handwriting).
MFA
Multi-Factor Authentication. A second sign-in factor in addition to the password. Quantra implements MFA using the Time-based One-Time Password (TOTP) algorithm via the django-otp library, with single-use recovery tokens. The platform-wide policy is set on the SystemSettings.mfa_mode field (enforced, optional, or disabled).
MRN
Medical Record Number. The local hospital identifier for a patient. The NHS Summary tool extracts MRN values alongside National Health Service (NHS) numbers; Q-NHS-SAR shows MRN in its default columns.
mTLS
Mutual Transport Layer Security. An extension of standard TLS in which both the client and the server present X.509 certificates and verify each other's identity during the handshake. In Quantra, mTLS is used for all gRPC communication between the platform and its microservices, ensuring that only authorised components can participate in the microservice mesh. Certificates live in /home/einar/quantra/ms/certs/.
NDJSON
Newline-Delimited JSON. A data format where each line of a file is a complete JSON object, separated by newline characters. Used in some Quantra streaming and batch-processing workflows so that each record can be parsed independently.
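The following sketch shows why the format suits record-by-record processing: each line is parsed on its own, so a reader never needs the whole file in memory. The function names and the sample records are illustrative only.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def write_ndjson(records: List[Dict[str, Any]]) -> str:
    """Serialise each record as one compact JSON object per line."""
    return "".join(json.dumps(record, separators=(",", ":")) + "\n" for record in records)


def read_ndjson(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Parse one record per line, skipping blank lines."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {number} is not valid JSON: {exc}") from exc


records = [{"id": 1, "name": "a.pdf"}, {"id": 2, "name": "b.pdf"}]
text = write_ndjson(records)
print(text, end="")
print(list(read_ndjson(text.splitlines())) == records)  # True
```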
NHS
National Health Service. The publicly funded healthcare system of the United Kingdom. Quantra has NHS-specific tools and workbenches (NHS Summary and Q-NHS-SAR) that follow NHS information-governance rules and recognise NHS-specific identifiers such as NHS numbers and Medical Record Numbers (MRN).
NINO
National Insurance Number. A United Kingdom government identifier used for tax and benefits. PII Detect can recognise NINO values among the country-specific identifier categories it scans for.
Node
A single processing step on the canvas. Each node represents an instance of a plugin (datasource, tool, or workbench) with a specific configuration. Nodes have a kind, a plugin identifier, a display label, x/y coordinates on the canvas, and a JSON configuration object. Nodes are stored as CanvasNode records and are connected by edges to form the pipeline graph.
OAuth
Open Authorization. The standard sign-in flow Quantra uses to connect to cloud services such as Box, Outlook, SharePoint, OneDrive, Gmail, and Google Drive. End users see an OAuth pop-up window when a node first needs access to a protected source. Administrators register the OAuth application in the relevant provider's developer console and may also configure shared group credentials for the platform.
OCR
Optical Character Recognition. The general technique of converting images of text into machine-readable text. In Quantra, OCR is exposed through the HWT-OCR tool, which supports printed and handwritten text and produces word-level positions used by downstream redaction tools.
PII
Personally Identifiable Information. Any data that could identify a specific individual — names, contact details, government identifiers, and so on. Quantra provides the PII Detect tool, the Q-Viewer highlighter, and the SAR Release packager to find, review, and apply redactions to PII.
Pipeline
A sequence of connected processing steps defined on a project's canvas. A pipeline starts at one or more datasource nodes, flows through tool nodes, and may end at workbench nodes for interactive review. When run, the pipeline's nodes are executed in topological order, with data flowing along the edges from source to target. The CNO orchestrates execution.
Plugin
A modular extension that adds functionality to the Quantra platform. Plugins come in three kinds: datasources, tools, and workbenches. Each plugin consists of a manifest file (plugin.json), a server handler (server.py), and optional templates and static assets. Plugins are discovered automatically by the plugin host at startup.
Plugin Host
The platform component responsible for discovering, validating, registering, and serving plugins. At startup, the plugin host scans the /plugins/ directory, reads each plugin's manifest, validates its required fields, registers Uniform Resource Locator (URL) routes for each declared page, and integrates the plugin into the Django URL routing system.
Plugin Manifest
The plugin.json file that declares a plugin's identity and configuration. The manifest contains the plugin's id (unique identifier), kind (datasource, tool, or workbench), name (display name), version (in the form MAJOR.MINOR.BUILD), pages (an array of route and template mappings), and optional meta (category and description).
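A manifest with these fields might be checked as sketched below. This is not the plugin host's validation code: validate_manifest, the example plugin, and the route and template key names inside pages are assumptions made for illustration, and treating every field except meta as required is inferred from the description above.

```python
import re
from typing import Any, Dict, List

VALID_KINDS = ("datasource", "tool", "workbench")
REQUIRED_FIELDS = ("id", "kind", "name", "version", "pages")
VERSION_PATTERN = re.compile(r"^\d+\.\d+\.\d+$")  # MAJOR.MINOR.BUILD


def validate_manifest(manifest: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the manifest looks valid."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in manifest]
    if "kind" in manifest and manifest["kind"] not in VALID_KINDS:
        problems.append("kind must be datasource, tool, or workbench")
    if "version" in manifest and not VERSION_PATTERN.match(str(manifest["version"])):
        problems.append("version must look like MAJOR.MINOR.BUILD")
    if "pages" in manifest and not isinstance(manifest["pages"], list):
        problems.append("pages must be an array")
    return problems


# Hypothetical manifest for an imaginary plugin.
manifest = {
    "id": "example-tool",
    "kind": "tool",
    "name": "Example Tool",
    "version": "1.0.3",
    "pages": [{"route": "config", "template": "config.html"}],
    "meta": {"category": "Examples", "description": "Illustrative only."},
}
print(validate_manifest(manifest))  # []
```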
Project
The top-level organisational unit in Quantra. A project contains a single canvas with nodes and edges that together define a document-processing pipeline. Projects are owned by a user and can be shared with other users at one of three permission levels: Read, Execute, or Write. Each project also keeps its own execution history, schedules, sharing settings, and audit log.
Protocol Buffers
Also known as protobuf. A language-neutral, platform-neutral serialisation format used by gRPC for defining service interfaces and message structures. In Quantra, the microservice.proto file defines the MicroService service and its request and response types.
Q-DACT
The redaction-viewer component used inside the Review tool and the Q-Viewer, Q-ROT, Q-SAR, Q-RoPA, and Q-NHS-SAR workbenches. Q-DACT renders Portable Document Format (PDF) files, images, and videos at a fixed scale of two image pixels per PDF point so that redaction boxes drawn on the image map back to PDF points precisely.
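The fixed scale makes the conversion a plain division by two. The sketch below covers only the scaling step; it assumes the box is given as x, y, width, and height in image pixels and deliberately ignores any axis flip between image and PDF coordinate systems, which the description above does not specify. The function name is hypothetical.

```python
from typing import Tuple

PIXELS_PER_POINT = 2  # fixed Q-DACT scale: two image pixels per PDF point


def pixel_box_to_points(x: float, y: float, width: float, height: float) -> Tuple[float, ...]:
    """Scale a redaction box from image pixels to PDF points."""
    return tuple(value / PIXELS_PER_POINT for value in (x, y, width, height))


print(pixel_box_to_points(100, 50, 200, 40))  # (50.0, 25.0, 100.0, 20.0)
```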
Q-NHS-SAR
The National Health Service variant of Q-SAR, with column defaults, metadata fields, and PII rules tuned for healthcare records. See Workbenches.
Q-RoPA
The workbench for maintaining the Record of Processing Activities (RoPA) required by GDPR Article 30. See Workbenches.
Q-ROT
The workbench for Redundant, Obsolete, Trivial (ROT) data analysis. Classifies a document collection using the Red, Amber, Green (RAG) traffic-light scheme and groups near-duplicates. See Workbenches.
Q-SAR
The workbench for fulfilling a Subject Access Request (SAR) under GDPR. Provides a searchable, filterable, three-panel viewer over the documents and records gathered for a request. See Workbenches.
Q-Viewer
The read-only document-viewer workbench, with PII highlighting and controlled bulk download. See Workbenches.
RAG
Red, Amber, Green. The traffic-light classification used by Q-ROT to summarise the health of every analysed document: Red items need action, Amber items need a closer look, and Green items are healthy.
Redaction
The process of removing or masking sensitive information in documents. Quantra's redaction workflow is built around PII Detect (which finds the regions to redact), the Review tool (where a person approves the proposed redactions), and SAR Release (which applies the approved redactions and packages the deliverable).
RoPA
Record of Processing Activities. The register of personal-data processing activities required by GDPR Article 30. The Q-RoPA workbench is the place to maintain it.
ROT
Redundant, Obsolete, Trivial. The classification that the Q-ROT workbench applies to a document collection: redundant items are duplicates or near-duplicates, obsolete items are older than the configured retention horizon, and trivial items are small system files.
SAR
Subject Access Request. A request made by an individual under GDPR for access to the personal data an organisation holds about them. Quantra provides the Q-SAR and Q-NHS-SAR workbenches, the Review tool, and the SAR Release tool to gather, review, redact, and deliver the response.
Service Endpoint
A database record in the ServiceEndpoint table that stores the connection details for a registered gRPC microservice. Each endpoint has a unique name, a category (datasource, tool, or AI), a host, and a port (typically in the 50050–50090 range). The platform uses these records to find and connect to microservices during pipeline execution.
SMB
Server Message Block. The network file-sharing protocol used by Windows file shares. The Network Drives datasource connects to file shares over SMB.
Streaming
The process of sending data as a continuous flow of frames rather than as a single complete message. In Quantra's gRPC protocol, requests and responses are both streamed (bidirectional streaming), so large documents can be sent in chunks, progress can be reported in real time, and microservices can begin processing before the entire input has arrived. Each stream terminates with a frame where eos is true.
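The chunking idea can be sketched with a Python generator that splits a list of items into frames and marks only the last one with eos. This is an illustration under assumptions: stream_frames, the chunk size, and the sample items are hypothetical, and the real transport is gRPC rather than an in-process generator.

```python
from typing import Any, Dict, Iterator, List


def stream_frames(items: List[Dict[str, Any]], chunk_size: int = 2,
                  oauth_url: str = "") -> Iterator[Dict[str, Any]]:
    """Yield frames of at most chunk_size items; only the final frame has eos true."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    if not items:
        # An empty stream still terminates with an eos frame.
        yield {"oauth2": {"url": oauth_url}, "data": [], "tod": "object", "eos": True}
        return
    for start in range(0, len(items), chunk_size):
        end = start + chunk_size
        yield {
            "oauth2": {"url": oauth_url},
            "data": items[start:end],
            "tod": "object",
            "eos": end >= len(items),
        }


items = [{"meta": {"page": n}, "content": {}} for n in range(1, 6)]
frames = list(stream_frames(items, chunk_size=2))
print([len(f["data"]) for f in frames])  # [2, 2, 1]
print([f["eos"] for f in frames])        # [False, False, True]
```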
TLS
Transport Layer Security. A cryptographic protocol that provides confidential, integrity-protected communication over a network. See also mTLS for the mutual variant Quantra uses between the platform and microservices.
TLSH
Trend Micro Locality Sensitive Hash. A fuzzy-hashing algorithm whose hash values stay similar for similar inputs. Quantra uses TLSH in the Hash tool and in Q-ROT to detect near-duplicate documents in large collections.
Tool
One of the three plugin kinds in Quantra. A tool node performs an automatic processing operation — text extraction, PII detection, redaction, summarisation, hashing, archiving, and so on. Tool plugins are stored in /plugins/tools/ and use "kind": "tool" in their manifest.
Topological Sort
An ordering of the nodes of a directed acyclic graph such that for every edge from node A to node B, node A appears before node B. When Quantra runs a pipeline, the CNO performs a topological sort of the canvas graph to determine the correct execution order, ensuring each node runs only after its upstream dependencies have produced their output.
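One common way to compute such an ordering is Kahn's algorithm, sketched below. This is a generic illustration, not the CNO's implementation; the node names are made up, and edges are given as (source, target) pairs.

```python
from collections import deque
from typing import Dict, List, Tuple


def topological_order(nodes: List[str], edges: List[Tuple[str, str]]) -> List[str]:
    """Return the nodes so that every edge's source precedes its target (Kahn's algorithm)."""
    indegree: Dict[str, int] = {node: 0 for node in nodes}
    downstream: Dict[str, List[str]] = {node: [] for node in nodes}
    for source, target in edges:
        downstream[source].append(target)
        indegree[target] += 1

    # Start with nodes that have no upstream dependencies.
    ready = deque(node for node in nodes if indegree[node] == 0)
    order: List[str] = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target in downstream[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)

    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle, so no topological order exists")
    return order


nodes = ["source", "ocr", "hash", "viewer"]
edges = [("source", "ocr"), ("source", "hash"), ("ocr", "viewer"), ("hash", "viewer")]
print(topological_order(nodes, edges))  # ['source', 'ocr', 'hash', 'viewer']
```

The cycle check matters because a topological order exists only for acyclic graphs; a canvas with a loop of edges has no valid execution order.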
TOTP
Time-based One-Time Password. The Multi-Factor Authentication (MFA) algorithm defined in RFC 6238. A shared secret combined with the current 30-second time window produces a one-time numeric code. Quantra uses TOTP via the django-otp library; users enrol by scanning a Quick Response (QR) code with an authenticator app.
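For readers who want to see the arithmetic, the sketch below follows RFC 6238 with HMAC-SHA1 using only the Python standard library. It is for illustration and is not how Quantra computes codes, which is handled by django-otp; the totp function name and its parameters are chosen for this example.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute a TOTP code per RFC 6238 (HMAC-SHA1 variant)."""
    counter = unix_time // step                      # index of the current time window
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                       # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# RFC 6238 Appendix B test vector: ASCII secret, time 59 s, 8 digits.
print(totp(b"12345678901234567890", 59, digits=8))  # 94287082
```

Two calls within the same 30-second window give the same code, which is why the server and the authenticator app agree without exchanging messages.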
TrOCR
Transformer-based Optical Character Recognition. A deep-learning OCR model that uses a transformer architecture, suited to handwritten text. TrOCR is one of the three reading modes available in HWT-OCR.
Workbench
One of the three plugin kinds in Quantra. A workbench node provides an interactive user interface for reviewing, annotating, redacting, or signing off processed data. Workbench plugins are stored in /plugins/workbenches/ and use "kind": "workbench" in their manifest.