Applied AI
Local LLMs and the NDA problem
Why the default cloud-API answer is wrong for almost every operator I have worked with — and what is now possible on a laptop.
The default architecture for an “AI-enabled” engineering tool, as proposed by ninety percent of vendors, sends client documents to a cloud API. For most operators in the Middle East — and a lot of operators elsewhere — that architecture is unshippable. The legal review never finishes. The procurement loop never starts. The project moves on without the tool.
I have lost count of the number of digital-transformation pitches I have watched die at the point where someone in the room says: “And the documents go where?”
The architecture that ships is the opposite. The model lives on the engineer’s laptop. The document never leaves the laptop. The output never leaves the laptop. There is no API key, no egress, no telemetry, no vendor terms of service. The legal review is one paragraph long: this tool runs locally, no data is transmitted.
This used to be impossible. It is no longer impossible. A small open-weight model, quantised, fits on a current-generation laptop. For the narrow tasks that actually matter in EPC project work — structured extraction, classification, schema-fill, retrieval-grounded answers — that model is more than enough.
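The "fits on a laptop" claim is easy to sanity-check with arithmetic. A minimal sketch — the 1.2× overhead factor for the KV cache and runtime buffers is a loose rule of thumb I am assuming, not a measured figure:

```python
def quantised_footprint_gb(params_billion: float, bits_per_weight: float,
                           overhead: float = 1.2) -> float:
    """Rough resident-memory estimate for a quantised model.

    The overhead factor covers the KV cache and runtime buffers --
    an assumed rule of thumb, not a benchmark.
    """
    weight_bytes = params_billion * 1e9 * (bits_per_weight / 8)
    return weight_bytes * overhead / 1e9


# A 7B model at 4-bit quantisation lands around 4 GB --
# comfortably inside a 16 GB laptop.
print(f"{quantised_footprint_gb(7, 4):.1f} GB")
```

The same arithmetic explains the ceiling: a 70B model at the same quantisation wants more memory than most laptops have, which is one more reason the small-model constraint is not much of a constraint.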
What you give up is the bleeding edge. A frontier-class model in the cloud will write better prose, reason over longer chains, do more clever multi-step inference. None of that matters for the workflows I am building tools for. The job is reliable, structured output against a known schema. The job is the same transcription a human estimator already does by hand. A small local model does that as well as any frontier model — better, actually, because you can fine-tune it on your own data without exposing the data.
What you gain is everything else. Compliance is solved at the architecture level, not at the policy level. Cost is fixed, not metered. Latency is predictable. The tool works on a flight, in a remote site office, behind a corporate firewall that blocks the cloud. The engineer never has to ask permission to use it.
The pattern that works:
Start narrow. Pick one bottleneck — scope extraction, deliverable status parsing, change-order classification, risk-register categorisation. Build the tool for that one bottleneck. Ship it on the laptops that already exist. Resist the urge to expand into a “platform.”
Use a small model. Seven billion parameters is plenty. Anything bigger is showing off.
Treat the model as a typist, not an oracle. The model fills a schema. The engineer reads the result. The judgement does not move.
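The typist discipline is enforceable in code: the model's raw output is parsed and checked against the known schema, and anything that does not conform is rejected rather than trusted. A minimal stdlib sketch — the field names and the `parse_fill` helper are illustrative, not from any real project:

```python
import json

# Illustrative schema for one narrow task: scope extraction.
SCHEMA = {
    "discipline": str,
    "deliverable": str,
    "quantity": float,
}

def parse_fill(raw: str) -> dict:
    """Treat the model as a typist: accept only a complete,
    correctly typed fill of the known schema."""
    data = json.loads(raw)
    extra = set(data) - set(SCHEMA)
    missing = set(SCHEMA) - set(data)
    if extra or missing:
        raise ValueError(f"schema mismatch: extra={extra}, missing={missing}")
    for field, typ in SCHEMA.items():
        value = data[field]
        if typ is float and isinstance(value, int):
            value = float(value)  # JSON integers are fine where floats are expected
        if not isinstance(value, typ):
            raise ValueError(f"{field}: expected {typ.__name__}")
        data[field] = value
    return data
```

The check guarantees only that the output is well-formed against the schema, not that it is right — the engineer still reads the result, and the judgement still does not move.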
Audit everything. Every value the model emits gets a citation back to the source. Every output is validated against the schema. Every run is logged with the input hash, the model version, the schema version, and the output hash. When the legal team asks how this works, the answer is one paragraph and a log file.
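The audit record described above is a few lines of stdlib code per run. A sketch, assuming hypothetical version strings for the model and schema:

```python
import hashlib
import json

def sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def audit_record(input_text: str, output_text: str,
                 model_version: str, schema_version: str) -> str:
    """One JSON log line per run: enough to answer, later, exactly
    which model saw which document and produced which output."""
    return json.dumps({
        "input_sha256": sha256(input_text),
        "output_sha256": sha256(output_text),
        "model_version": model_version,    # e.g. a weights-file hash or tag
        "schema_version": schema_version,
    }, sort_keys=True)
```

Append each line to a local log file and the answer for the legal team really is one paragraph and a log file: hashes prove what went in and what came out, without the log itself containing any client content.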
There is a class of tooling here that almost no vendor is selling and almost every operator needs. The reason no vendor is selling it is structural: a local-only tool is a one-time license, not a SaaS subscription. The economics of the SaaS layer are exactly why the SaaS layer is the wrong default for this workload.
Build narrow, ship local, audit everything. That is the playbook. Anything else does not survive contact with a serious operator’s legal team.