Who it's for

Built for regulated organizations

If you handle confidential data that you cannot legally or ethically hand over to the cloud, and you still want a powerful internal AI assistant, Cellule-PRO is for you.

privilege

Law firms

Professional secrecy, sensitive client files, contract review, case-law search. Cloud = leak of privilege. Host it in-house.

health data

Hospitals & healthcare

Health data, patient records, clinical protocols. Health-data hosting regulations require self-hosting or sovereign certified hosting.

IP

Multi-site industrial groups

Intellectual property, technical drawings, supplier contracts. Federated cluster across your factories and offices, with no central server.

classified

Defense & government

Classified data, mandatory state sovereignty. Air-gapped runtime guaranteed. Docker image delivered without any Internet dependency.

The problem

The problem you're living today

✕

Cloud AI = data leak

ChatGPT Enterprise, Claude for Work, Azure OpenAI: your data transits and is stored at a third party. Incompatible with privilege, health-data regulations, industrial IP.

✕

Ollama / LM Studio = a toy

One machine, one user, no RAG, no enterprise governance, no audit trail. Nice for tinkering, unusable in production.

✕

DIY stack = nightmare

Wiring vLLM + pgvector + OpenWebUI + Keycloak + monitoring = 6 months of integration. You're not an AI startup.

✕

EU AI Act is coming

August 2026: audit, traceability and DPIA obligations. Today's cloud AI does not provide these guarantees. You need to take back control.

The solution

Cellule-PRO: the complete sovereign microsystem

1 image

A single Docker image

Private LLM chat + conversational RAG + document RAG + collaborative project spaces + OpenAI-compatible API + admin dashboard + GDPR employee portal. All in one shippable image.

air-gap

No data leaves your network

Air-gapped runtime guaranteed. The image installs on your hardware. Full firewall isolation possible. No phone-home, no cloud licensing.

multi-site

Native multi-site cluster

Ed25519 federation with no central pool. Headquarters + branches + GPU datacenter, all federated as a single cluster. Symmetric admin HA — any pool gives you the full view.

art. 15-22

Native GDPR articles 15-22

Self-service employee portal: access, rectification, erasure, portability. Append-only audit trail. Cascade anonymization. Ready for regulatory audit without consulting fees.

admin

Admin pilots, zero magic

Every switchover, migration and alert is visible, traced, approvable. Configurable failsafe. No "the AI decided on its own". Your IT team stays in control.

Ed25519

Initialization ceremony

Ed25519 private key generated air-gapped on your premises during a controlled ritual. Your cryptographic identity depends on no third party. Same model used by certificate authorities and bank cold wallets.

Architecture

Four-layer architecture

A simple mental model that surfaces everywhere: in the docs, the admin UI, the commercial diagrams. Everything flows from this.

Atom

Worker node (GPU/CPU)
or lightweight proxy. Interchangeable, stateless.

Pool

Docker orchestrator: routing, RAG, API, admin. Self-contained and stateful.

Federation

Mesh of N Ed25519-paired pools. No central pool. Symmetric admin HA.

Governance

GDPR audit, offline JWT licensing, incident workflow, failsafe.

Explore the 6 architecture diagrams → ▶ Open the control room (live)

Differentiators

What you won't find anywhere else

Beyond private LLM chat and RAG, Cellule-PRO ships an orchestration layer designed for IT departments that want a microsystem still autonomous three years from now — not a POC that becomes unmanageable after six months.

RAID

Multi-site replicated model catalog

Drop a model (drag-drop) on any pool: it replicates automatically across sites through signed Ed25519 federation. If a site goes down, another takes over with one click — multi-site RAID for your models, no central server, no third-party cloud.

routing

Automatic smart routing

A single API, several specialized models behind it. The pool routes each request to the right model: a Coder for code, a conversational model for chat, a reasoning model for long tasks. The user writes, the pool picks. Zero friction.

memory

Infinite cross-session memory — your LLM remembers you

Your devs use opencode, Cursor, Continue.dev on Monday, close their laptop, reopen Wednesday on another machine — the LLM still remembers: the bug to fix, the architecture decisions, the project conventions. End any prompt with [MEMORIZE: fact] to plant a fact, then it surfaces automatically next time you ask about it. Per-user encrypted RAG, on-LAN, no --continue needed. All of it stays on your LAN — nothing is sent to the cloud.

auto-tier

Auto-tier workers on heterogeneous fleets

Your office PCs, workstations and GPU servers don't have the same horsepower — that's normal. On startup, each worker self-benchmarks and the pool assigns it the right model (2B, 4B, 9B, 30B). Your heterogeneous IT fleet onboards without manual intervention.

loop

Self-improvement: your LLM helps your IT team

The OpenAI-compatible API lets your IT team use Cellule-PRO as a sub-agent in their own dev/admin workflow and internal procedure writing. Virtuous loop: your private LLM helps you maintain the system that runs it. Complete sovereignty, near-zero marginal cost.

web UI

Built-in employee web UI — no OpenWebUI needed

Your employees never touch a terminal or a config file. They open a URL, log in, and chat directly: saved conversations, RAG document drag-and-drop, collaborative project workspaces, [MEMORIZE: fact] mode. Project mode = an isolated workspace for a client matter, a patient case, or an audit, with its own dedicated memory. All on your LAN, zero heavy client to install.

API

…and compatible with the tech tools they already use

Your devs prefer opencode / Continue.dev / Cursor? Your data scientists want their Python openai SDK? Some teams already adopted OpenWebUI on Ollama and want to keep it? The OpenAI-compatible API of Cellule-PRO accepts them all via sk-cellule-* tokens. The built-in UI for those who want simple, the API for those who want integration.

Continuous delivery

Latest releases

Cellule-PRO ships continuously. Here is what recently landed in production, all directly available in the current image.

zero-touch

Zero-touch worker upgrades

Bump the pool, the entire workforce updates itself on the next handshake — Windows ScheduledTask, Linux systemd, all under SYSTEM/root, no admin walks across the office. Verified end-to-end on heterogeneous fleets, including rollout of new flags and protocols. You ship, they catch up by themselves.

VRAM cap

Automatic VRAM-aware context cap

The pool detects each worker's real VRAM and caps ctx_size at a safe value — no GPU saturation, no manual tuning. Observed end-to-end: a saturated RTX 2060 went from 0.1 tok/s to 46 tok/s once the cap kicked in (a 460× speedup), and an entry-level RTX A400 jumped from 4 to 18.9 tok/s. The DSI sets a single global toggle, the pool handles every model.

satellite

Satellite pool in one command, zero wizard

A second site joins your federation with one shell line — the image streams from the master over LAN (no DockerHub, no internet), the join token is auto-consumed, the federation handshake is signed Ed25519, and the catalog of GGUF models is replicated automatically. Total LAN sovereignty. No admin needed on the satellite.

multi-OS

Workers on Windows, Linux, macOS Apple Silicon

Same one-liner installer for every employee desktop — the script auto-detects OS and architecture, pulls the bundled Python and the matching engine wheel (CUDA/Metal/CPU), then registers the worker as a native service (ScheduledTask on Windows, systemd on Linux, launchd plist on macOS). M1/M2/M3/M4 Apple Silicon supported with Metal acceleration.

flags

26 admin flags across 11 DSI categories

Every operational tunable — VRAM cap, recall depth, queue size, KNN top-k, audit retention, satellite gossip, forwarding policy — is a labeled toggle with a business description and per-profile recommendation, in French and English. No hidden environment variables, no patched config files. The DSI pilots, the pool obeys.

hardened

Hardened binary delivery — IP-critical modules compiled

The license validator, the federation cryptography, the smart-routing engine, the MoE sharding logic, the RAG retrieval engine and the anti-entropy loop are compiled to native shared objects inside the delivered Docker image. A pentest engagement at the client site can't trivially recover the Python source of the load-bearing IP from docker save. Your competitive edge stays yours. The wheels served via /pypi to employee desktops keep their cross-platform Python source — only the pool runtime is hardened.

rotation

Long-term cryptographic continuity

The pool's license verifier supports multiple signing keys in parallel — built for decade-long deployments. When a signing key needs to rotate (planned maintenance, regulatory updates, ceremony renewal), a transitional image accepts both the old and the new key during a grace period. Your deployment never goes dark for a key reason. Same pattern used by certificate authorities for root CA rotation. Constant-time token comparisons throughout the stack (hmac.compare_digest), no admin-token timing leak under network probing.

Compare

Cellule-PRO vs alternatives

Criterion	ChatGPT Enterprise	Azure OpenAI	Ollama / LM Studio	Cellule-PRO
Data stays on your premises	No (OpenAI cloud)	Partial (EU Azure cloud)	Yes	Yes (air-gap runtime)
Built-in document RAG	Limited knowledge files	Build it yourself	No	Yes (pgvector + citations)
Collaborative project mode	Basic workspaces	No	No	Yes (Ed25519 membership)
Multi-site cluster	N/A	Build it yourself	No	Native (Ed25519 federation)
GDPR articles 15-22	OpenAI DPA	Microsoft DPA	Build it yourself	Native (self-service)
Incident audit trail	Limited	Azure Monitor	No	Append-only DB
EU AI Act compliance	In progress	In progress	N/A	Ready by design
Multi-model smart routing	Single model	Single model	Script it yourself	Auto (Coder / Chat / Reasoning)
Multi-site replicated model catalog	N/A	Build it yourself	No	Yes (Ed25519 federation)
OpenAI-compatible API (opencode/Continue/OpenWebUI)	Proprietary API	Azure subset	Limited	Yes (sk-cellule-* tokens)
Hardened binary delivery (no readable source IP)	N/A (cloud)	N/A (cloud)	No (plain .py)	Yes (compiled .so)

Get started

How to get started?

      60-day BETA — zero commitment

      You provide a machine (server or VM). We deploy the Docker image, train your IT team,
      onboard 1-5 pilot employees. At day 60 you decide: continue in production, or stop —
      we destroy everything, your data was always yours.
    

1. Discovery call (30 min)

We frame it: your data, your employees, your regulatory constraints, your existing infrastructure. We confirm Cellule-PRO is the right fit (sometimes it isn't — we'll say so).

2. BETA scoping

We define together the perimeter: number of pilot employees, data used, success criteria at day 60. Formal commitment only after cross-validation of the BETA.

3. Installation (2-4h)

Docker image + PostgreSQL + first-boot wizard. We guide remotely or on-site as you prefer. Your IT team follows step by step.

4. Pilot training (2h)

2-hour live tutorial: chat, document RAG, project mode, GDPR self-service. Your employees start using the system right away.

5. 60-day support

Email + video. You ask any question, we tune the config. Goal: your team is autonomous by day 40.

6. Day-60 decision

Review with your IT team: actual usage, employee satisfaction, perceived ROI. Continue in production or clean stop. No pressure.

FAQ

Frequently asked questions

What hardware do we need?

For 5-20 employees: 1 server with a recent GPU (RTX 4080/4090 or A4000+) or a powerful AMD Ryzen AI CPU, 32-96 GB RAM, 500 GB SSD. For 50-200 employees: 2-3 servers. Validated precisely during the discovery call.

Which LLM is used?

Qwen 3.5 by default (open source, multilingual, strong on office workloads). You can load any GGUF model: Mistral, Llama, Gemma, DeepSeek. No proprietary-model lock-in.

What if you go away?

Fair question. The Docker image is on your hardware, the source code is accessible to you through a notarized escrow. You can keep operating it without us. The offline JWT Ed25519 license model depends on no remote server.

What if we want to stop?

You export your data (GDPR article 20 native: standard JSON/ZIP export in one click). You stop the containers. Done.

Is Cellule-PRO open source?

Cellule-PRO is proprietary, distributed under commercial license with code access via notarized escrow in case of vendor failure. A public sister project (cellule.ai) remains under AGPLv3 for the community — both share a common technical base but are distributed separately.

What support do you offer?

Email (48h SLA on business days), crisis video calls, signed image updates. Support tiers are scoped to your needs during the discovery call.

Ready to test?

30-min discovery call to find out if Cellule-PRO meets your needs.
No pressure. If it isn't the right fit, we'll tell you — and point you elsewhere.

Book a discovery call Explore architecture

Self-hosted LLM inference :for organizations that won't hand their data over to the cloud.