A complete self-contained microsystem in a single Docker image. Your data never leaves your infrastructure. GDPR & global privacy frameworks, AI compliance-ready (EU AI Act · NIST AI RMF · ISO/IEC 42001), native multi-site federated cluster.
If you handle confidential data that you cannot legally or ethically hand over to the cloud, and you still want a powerful internal AI assistant, Cellule-PRO is for you.
Professional secrecy, sensitive client files, contract review, case-law search. Cloud = leak of privilege. Host it in-house.
Health data, patient records, clinical protocols. Health-data hosting regulations require self-hosting or sovereign certified hosting.
Intellectual property, technical drawings, supplier contracts. Federated cluster across your factories and offices, with no central server.
Classified data, mandatory state sovereignty. Air-gapped runtime guaranteed. Docker image delivered without any Internet dependency.
ChatGPT Enterprise, Claude for Work, Azure OpenAI: your data transits and is stored at a third party. Incompatible with privilege, health-data regulations, industrial IP.
One machine, one user, no RAG, no enterprise governance, no audit trail. Nice for tinkering, unusable in production.
Wiring vLLM + pgvector + OpenWebUI + Keycloak + monitoring = 6 months of integration. You're not an AI startup.
August 2026: audit, traceability and DPIA obligations. Today's cloud AI does not provide these guarantees. You need to take back control.
Private LLM chat + conversational RAG + document RAG + collaborative project spaces + OpenAI-compatible API + admin dashboard + GDPR employee portal. All in one shippable image.
Air-gapped runtime guaranteed. The image installs on your hardware. Full firewall isolation possible. No phone-home, no cloud licensing.
Ed25519 federation with no central pool. Headquarters + branches + GPU datacenter, all federated as a single cluster. Symmetric admin HA — any pool gives you the full view.
Self-service employee portal: access, rectification, erasure, portability. Append-only audit trail. Cascade anonymization. Ready for regulatory audit without consulting fees.
Every switchover, migration and alert is visible, traced, approvable. Configurable failsafe. No "the AI decided on its own". Your IT team stays in control.
Ed25519 private key generated air-gapped on your premises during a controlled ritual. Your cryptographic identity depends on no third party. Same model used by certificate authorities and bank cold wallets.
A simple mental model that surfaces everywhere: in the docs, the admin UI, the commercial diagrams. Everything flows from this.
Worker node (GPU/CPU)
or lightweight proxy. Interchangeable, stateless.
Docker orchestrator: routing, RAG, API, admin. Self-contained and stateful.
Mesh of N Ed25519-paired pools. No central pool. Symmetric admin HA.
GDPR audit, offline JWT licensing, incident workflow, failsafe.
Explore the 6 architecture diagrams → ▶ Open the control room (live)
Beyond private LLM chat and RAG, Cellule-PRO ships an orchestration layer designed for IT departments that want a microsystem still autonomous three years from now — not a POC that becomes unmanageable after six months.
Drop a model (drag-drop) on any pool: it replicates automatically across sites through signed Ed25519 federation. If a site goes down, another takes over with one click — multi-site RAID for your models, no central server, no third-party cloud.
A single API, several specialized models behind it. The pool routes each request to the right model: a Coder for code, a conversational model for chat, a reasoning model for long tasks. The user writes, the pool picks. Zero friction.
Your devs use opencode, Cursor, Continue.dev on
Monday, close their laptop, reopen Wednesday on another machine — the LLM still
remembers: the bug to fix, the architecture decisions, the project conventions.
End any prompt with [MEMORIZE: fact] to plant a fact, then it surfaces
automatically next time you ask about it. Per-user encrypted RAG, on-LAN, no
--continue needed. The only solution that does this without
sending your data to the cloud.
Your office PCs, workstations and GPU servers don't have the same horsepower — that's normal. On startup, each worker self-benchmarks and the pool assigns it the right model (2B, 4B, 9B, 30B). Your heterogeneous IT fleet onboards without manual intervention.
The OpenAI-compatible API lets your IT team use Cellule-PRO as a sub-agent in their own dev/admin workflow and internal procedure writing. Virtuous loop: your private LLM helps you maintain the system that runs it. Complete sovereignty, near-zero marginal cost.
Your employees never touch a terminal or a config file. They
open a URL, log in, and chat directly: saved conversations, RAG
document drag-and-drop, collaborative project workspaces, [MEMORIZE: fact]
mode. Project mode = an isolated workspace for a client matter,
a patient case, or an audit, with its own dedicated memory. All on your
LAN, zero heavy client to install.
Your devs prefer opencode / Continue.dev
/ Cursor? Your data scientists want their Python openai SDK? Some
teams already adopted OpenWebUI on Ollama and want to keep it?
The OpenAI-compatible API of Cellule-PRO accepts them all via
sk-cellule-* tokens. The built-in UI for those who want
simple, the API for those who want integration.
Cellule-PRO ships continuously. Here is what landed in production over the last 30 days, all directly available in the current image.
Bump the pool, the entire workforce updates itself on the next handshake — Windows ScheduledTask, Linux systemd, all under SYSTEM/root, no admin walks across the office. Verified end-to-end on heterogeneous fleets, including rollout of new flags and protocols. You ship, they catch up by themselves.
The pool detects each worker's real VRAM and caps
ctx_size at a safe value — no GPU saturation, no manual tuning.
Observed end-to-end: a saturated RTX 2060 went from 0.1 tok/s to 46 tok/s
once the cap kicked in (a 460× speedup), and an entry-level RTX A400 jumped from
4 to 18.9 tok/s. The DSI sets a single global toggle, the pool handles every model.
A second site joins your federation with one shell line — the image streams from the master over LAN (no DockerHub, no internet), the join token is auto-consumed, the federation handshake is signed Ed25519, and the catalog of GGUF models is replicated automatically. Total LAN sovereignty. No admin needed on the satellite.
Same one-liner installer for every employee desktop — the script auto-detects OS and architecture, pulls the bundled Python and the matching engine wheel (CUDA/Metal/CPU), then registers the worker as a native service (ScheduledTask on Windows, systemd on Linux, launchd plist on macOS). M1/M2/M3/M4 Apple Silicon supported with Metal acceleration.
Every operational tunable — VRAM cap, recall depth, queue size, KNN top-k, audit retention, satellite gossip, forwarding policy — is a labeled toggle with a business description and per-profile recommendation, in French and English. No hidden environment variables, no patched config files. The DSI pilots, the pool obeys.
The license validator, the federation cryptography, the
smart-routing engine, the MoE sharding logic, the RAG retrieval engine and the
anti-entropy loop are compiled to native shared objects inside
the delivered Docker image. A pentest engagement at the client site can't
trivially recover the Python source of the load-bearing IP from
docker save. Your competitive edge stays yours.
The wheels served via /pypi to employee desktops keep their
cross-platform Python source — only the pool runtime is hardened.
The pool's license verifier supports multiple
signing keys in parallel — built for decade-long deployments. When a
signing key needs to rotate (planned maintenance, regulatory updates, ceremony
renewal), a transitional image accepts both the old and the new key during a
grace period. Your deployment never goes dark for a key reason.
Same pattern used by certificate authorities for root CA rotation. Constant-time
token comparisons throughout the stack (hmac.compare_digest), no
admin-token timing leak under network probing.
| Criterion | ChatGPT Enterprise | Azure OpenAI | Ollama / LM Studio | Cellule-PRO |
|---|---|---|---|---|
| Data stays on your premises | No (OpenAI cloud) | Partial (EU Azure cloud) | Yes | Yes (air-gap runtime) |
| Built-in document RAG | Limited knowledge files | Build it yourself | No | Yes (pgvector + citations) |
| Collaborative project mode | Basic workspaces | No | No | Yes (Ed25519 membership) |
| Multi-site cluster | N/A | Build it yourself | No | Native (Ed25519 federation) |
| GDPR articles 15-22 | OpenAI DPA | Microsoft DPA | Build it yourself | Native (self-service) |
| Incident audit trail | Limited | Azure Monitor | No | Append-only DB |
| EU AI Act compliance | In progress | In progress | N/A | Ready by design |
| Multi-model smart routing | Single model | Single model | Script it yourself | Auto (Coder / Chat / Reasoning) |
| Multi-site replicated model catalog | N/A | Build it yourself | No | Yes (Ed25519 federation) |
| OpenAI-compatible API (opencode/Continue/OpenWebUI) | Proprietary API | Azure subset | Limited | Yes (sk-cellule-* tokens) |
| Hardened binary delivery (no readable source IP) | N/A (cloud) | N/A (cloud) | No (plain .py) | Yes (compiled .so) |
We frame it: your data, your employees, your regulatory constraints, your existing infrastructure. We confirm Cellule-PRO is the right fit (sometimes it isn't — we'll say so).
We define together the perimeter: number of pilot employees, data used, success criteria at day 60. Formal commitment only after cross-validation of the BETA.
Docker image + PostgreSQL + first-boot wizard. We guide remotely or on-site as you prefer. Your IT team follows step by step.
2-hour live tutorial: chat, document RAG, project mode, GDPR self-service. Your employees start using the system right away.
Email + video. You ask any question, we tune the config. Goal: your team is autonomous by day 40.
Review with your IT team: actual usage, employee satisfaction, perceived ROI. Continue in production or clean stop. No pressure.
For 5-20 employees: 1 server with a recent GPU (RTX 4080/4090 or A4000+) or a powerful AMD Ryzen AI CPU, 32-96 GB RAM, 500 GB SSD. For 50-200 employees: 2-3 servers. Validated precisely during the discovery call.
Qwen 3.5 by default (open source, multilingual, strong on office workloads). You can load any GGUF model: Mistral, Llama, Gemma, DeepSeek. No proprietary-model lock-in.
Fair question. The Docker image is on your hardware, the source code is accessible to you through a notarized escrow. You can keep operating it without us. The offline JWT Ed25519 license model depends on no remote server.
You export your data (GDPR article 20 native: standard JSON/ZIP export in one click). You stop the containers. Done.
Cellule-PRO is proprietary, distributed under commercial license with code
access via notarized escrow in case of vendor failure. A public sister project
(cellule.ai) remains under AGPLv3 for the community — both share a
common technical base but are distributed separately.
Email (48h SLA on business days), crisis video calls, signed image updates. Support tiers are scoped to your needs during the discovery call.
30-min discovery call to find out if Cellule-PRO meets your needs.
No pressure. If it isn't the right fit, we'll tell you — and point you elsewhere.