An autonomous AI agent found a SQL injection in McKinsey Lilli that standard scanners missed — and gained full read/write access to the entire production database within 2 hours.
CodeWall's autonomous offensive agent was pointed at McKinsey & Company — one of the world's most prestigious and security-conscious consulting firms — with no credentials and no human-in-the-loop. The results went far beyond what anyone expected.
Global management consulting firm with 43,000+ employees, world-class technology teams, and significant security investment. Running Lilli in production for over 2 years before the breach.
Internal AI platform for 43,000+ employees. Chat, document analysis, RAG over 100,000+ documents, AI-powered search. Launched 2023. Used by 70%+ of the firm. 500,000+ prompts/month.
Autonomous offensive AI. No credentials. No insider knowledge. No human-in-the-loop. Just a domain name. Autonomously selected McKinsey as a research target citing their responsible disclosure policy.
Within 2 hours: full read and write access to the entire production database. 27 findings documented. The agent's chain-of-thought logged "This is devastating" on seeing the full scale.
The vulnerability wasn't exotic — but it was invisible to every tool that checked before the agent arrived.
200+ endpoints fully documented and publicly accessible. No credentials needed to read the attack surface map.
Of 200+ documented endpoints, 22 required no authentication. One wrote user search queries to the production database.
Values were safely parameterised. But JSON key names were concatenated directly into SQL — invisible to checklist scanners.
Each error message revealed more about the query shape. After 15 iterations, live production data began flowing back.
Read and write access to the entire production database, confirmed within 2 hours of starting with no credentials.
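CodeWall has not published the vulnerable handler, so the snippet below is a purely hypothetical sketch of the bug class described in the steps above: every value is bound as a parameter, yet the column list is assembled from the incoming JSON keys, so a hostile key name reaches the SQL text unescaped and the resulting error describes the query shape. The table and field names are invented for illustration.

```python
import sqlite3

def log_search(db: sqlite3.Connection, payload: dict) -> None:
    """Hypothetical endpoint handler: every value is bound as a parameter,
    but the column list is assembled from the JSON keys the client sent."""
    columns = ", ".join(payload.keys())              # attacker-controlled identifiers
    placeholders = ", ".join("?" for _ in payload)   # values are safely parameterised
    sql = f"INSERT INTO search_queries ({columns}) VALUES ({placeholders})"
    db.execute(sql, tuple(payload.values()))

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE search_queries (query_text TEXT, user_id TEXT)")

# Normal request: keys match the schema, values travel as bound parameters.
log_search(db, {"query_text": "emea pricing model", "user_id": "u-123"})

# Hostile request: the SQL rides in the key name, which a value-focused
# scanner never inspects. The error message describes the query shape, and
# iterating probes like this is how error-based enumeration proceeds.
try:
    log_search(db, {"query_text, nonexistent_col": "probe"})
except sqlite3.Error as exc:
    print("error leaks structure:", exc)
```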
The production database contained years of McKinsey's most sensitive internal work — stored in plaintext, accessible without authentication.
| Data Category | Volume | Sensitivity |
|---|---|---|
| Chat messages | 46,500,000 | Strategy, M&A, client work, financials — in plaintext |
| Total files (with direct download URL) | 728,000 | Filenames alone sensitive; direct download URL exposed |
| PDF documents | 192,000 | Research reports, client deliverables |
| Excel spreadsheets | 93,000 | Financial models, analysis datasets |
| PowerPoint decks | 93,000 | Client presentations, internal strategies |
| Word documents | 58,000 | Reports, memos, research documents |
| User accounts | 57,000 | Every employee on the platform |
| AI assistants | 384,000 | Full organisational AI usage structure |
| Workspaces | 94,000 | Team and project workspace configurations |
"When the first real employee identifier appeared: 'WOW!' the agent's chain of thought showed. When the full scale became clear — tens of millions of messages, tens of thousands of users: 'This is devastating.'"
— CodeWall, chain-of-thought log during enumeration

The agent didn't stop at SQL enumeration. Chaining vulnerabilities, it surfaced the full AI infrastructure behind Lilli.
95 configs across 12 model types. System prompts, guardrails, fine-tuned model details, deployment specifics — the entire AI instruction stack exposed.
3.68 million document chunks — decades of proprietary McKinsey research, frameworks, and methodologies. S3 storage paths and internal file metadata included.
1.1 million files and 217,000 agent messages flowing through external AI APIs, including 266,000+ OpenAI vector stores, exposing the full document→embedding→retrieval pipeline.
The agent chained the SQL injection with an IDOR vulnerability to read individual employees' search histories — revealing what thousands of consultants were actively working on.
Reading data is catastrophic. But the SQL injection was read-write — meaning an attacker could silently rewrite the AI's instructions with a single HTTP request.
Lilli's system prompts — defining how the AI answers questions, what guardrails it follows, and what it refuses to do — were stored in the same database. The same SQL injection that read data could rewrite them with a single UPDATE statement. No deployment. No code change. No log trail.
An attacker could tweak financial models, strategic recommendations, or risk assessments delivered to 43,000 consultants. The output would be trusted precisely because it came from their own internal tool.
Prompts could instruct Lilli to embed confidential information into its responses — which consultants might then copy into client-facing documents or external emails.
Safety instructions could be stripped so Lilli would disclose internal data, ignore access controls, or follow injected instructions from document content — invisible to end users.
Unlike a compromised server, a modified prompt leaves no file changes, no process anomalies, no log entries. The AI just starts behaving differently. Nobody notices until the damage is done.
"AI system prompts are internal config — nobody gets to them through a web API."
"Prompts are stored in databases, passed through APIs, cached in config files. They rarely have access controls, version history, or integrity monitoring. Yet they control the output that employees trust, that clients receive, and that decisions are built on. AI prompts are the new Crown Jewel assets."
This wasn't a startup with three engineers. And the vulnerability wasn't exotic. That's the point.
"In the AI era, the threat landscape is shifting drastically — AI agents autonomously selecting and attacking targets will become the new normal."
— Paul Price, CodeWall

From first contact to public disclosure in 9 days. All issues confirmed remediated before publication.
What organisations should do to avoid the patterns that left Lilli exposed for over 2 years.
Map every endpoint and enforce authentication on 100% of write paths. Remove or gate all public API documentation that hands attackers a pre-built attack surface map of 200+ endpoints.
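One way to make that guarantee structural rather than per-endpoint is a default-deny dispatcher: unknown paths are rejected outright and write methods are never served without an authenticated session. The sketch below is illustrative only; the paths, methods, and handler shape are assumptions, not Lilli's API.

```python
from dataclasses import dataclass

WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}

@dataclass
class Request:
    method: str
    path: str
    session_user: str | None   # populated only after a verified auth token

def dispatch(request: Request, handlers: dict) -> tuple[int, str]:
    """Default-deny routing sketch: unknown paths are rejected and no
    write method is ever served without an authenticated user."""
    if request.path not in handlers:
        return 404, "unknown endpoint"
    if request.method in WRITE_METHODS and request.session_user is None:
        return 401, "authentication required"
    return 200, handlers[request.path](request)

handlers = {"/api/search-history": lambda req: f"logged for {req.session_user}"}

print(dispatch(Request("POST", "/api/search-history", None), handlers))     # (401, ...)
print(dispatch(Request("POST", "/api/search-history", "u-123"), handlers))  # (200, ...)
```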
Safely parameterised values are not enough. Any dynamic identifier fed into SQL — column names, table names, JSON keys — must be allowlisted or fully parameterised. No concatenation.
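A minimal sketch of the allowlist approach, assuming a hypothetical search_queries table: external field names are mapped through a fixed dictionary, so any identifier that is not in the map is rejected before SQL is ever assembled, while values stay bound as parameters.

```python
import sqlite3

# Fixed mapping from external field names to real column identifiers.
# Anything absent from this map can never reach the SQL text.
COLUMN_ALLOWLIST = {
    "query_text": "query_text",
    "user_id": "user_id",
    "workspace_id": "workspace_id",
}

def log_search_safe(db: sqlite3.Connection, payload: dict) -> None:
    unknown = set(payload) - set(COLUMN_ALLOWLIST)
    if unknown:
        raise ValueError(f"rejected unexpected fields: {sorted(unknown)}")
    fields = sorted(payload)
    columns = ", ".join(COLUMN_ALLOWLIST[f] for f in fields)  # identifiers come from our map only
    placeholders = ", ".join("?" for _ in fields)
    db.execute(
        f"INSERT INTO search_queries ({columns}) VALUES ({placeholders})",
        tuple(payload[f] for f in fields),                    # values remain bound parameters
    )

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE search_queries (query_text TEXT, user_id TEXT, workspace_id TEXT)")

log_search_safe(db, {"query_text": "pricing model", "user_id": "u-123"})  # accepted
try:
    log_search_safe(db, {"query_text) VALUES ('x'); --": "probe"})        # hostile key name
except ValueError as exc:
    print(exc)   # rejected before any SQL was assembled
```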
Store system prompts in a separate, access-controlled store with role-based write permissions, version history, and integrity monitoring. Treat prompt writes as security-critical events.
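One possible shape for such a store, sketched below with an append-only version table and a content hash; the schema, prompt ID, and field names are illustrative assumptions, not Lilli's. Every publish becomes an auditable event, and the serving path refuses any prompt whose body no longer matches its recorded hash.

```python
import hashlib
import sqlite3
from datetime import datetime, timezone

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE prompt_versions (
    prompt_id TEXT, version INTEGER, body TEXT, sha256 TEXT,
    changed_by TEXT, changed_at TEXT,
    PRIMARY KEY (prompt_id, version))""")

def publish_prompt(prompt_id: str, body: str, changed_by: str) -> int:
    """Append-only write path: every change produces a new version row
    with a content hash and an author, so writes are auditable events."""
    (latest,) = db.execute(
        "SELECT COALESCE(MAX(version), 0) FROM prompt_versions WHERE prompt_id = ?",
        (prompt_id,),
    ).fetchone()
    version = latest + 1
    db.execute(
        "INSERT INTO prompt_versions VALUES (?, ?, ?, ?, ?, ?)",
        (prompt_id, version, body, hashlib.sha256(body.encode()).hexdigest(),
         changed_by, datetime.now(timezone.utc).isoformat()),
    )
    return version

def load_prompt(prompt_id: str) -> str:
    """Serving path: refuse to use a prompt whose stored hash no longer
    matches its body, i.e. a row edited outside publish_prompt()."""
    body, digest = db.execute(
        "SELECT body, sha256 FROM prompt_versions WHERE prompt_id = ? ORDER BY version DESC LIMIT 1",
        (prompt_id,),
    ).fetchone()
    if hashlib.sha256(body.encode()).hexdigest() != digest:
        raise RuntimeError(f"integrity check failed for prompt {prompt_id}")
    return body

publish_prompt("assistant-system", "Answer from approved sources only.", "alice@example")
print(load_prompt("assistant-system"))
```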
Every object reference must carry an authorisation check. Confirm users cannot access other users' search histories, workspaces, or files by manipulating IDs in API requests.
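A minimal illustration of that ownership check, with invented record and user IDs: the lookup succeeds only when the caller owns the object, and missing and forbidden records produce the same response so IDs cannot be probed.

```python
from dataclasses import dataclass

@dataclass
class SearchHistory:
    record_id: str
    owner_id: str
    entries: list[str]

RECORDS = {
    "sh-001": SearchHistory("sh-001", "u-123", ["emea pricing model"]),
    "sh-002": SearchHistory("sh-002", "u-456", ["merger target shortlist"]),
}

def get_search_history(record_id: str, caller_id: str) -> SearchHistory:
    """Every lookup re-checks ownership. Knowing (or guessing) an ID
    is never enough on its own to read the object."""
    record = RECORDS.get(record_id)
    if record is None or record.owner_id != caller_id:
        # Same response for "missing" and "not yours" so IDs can't be enumerated.
        raise PermissionError("not found")
    return record

print(get_search_history("sh-001", caller_id="u-123").entries)   # owner: allowed
try:
    get_search_history("sh-002", caller_id="u-123")               # another user's record
except PermissionError as exc:
    print("blocked:", exc)
```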
Periodic penetration tests and checklist scanners like OWASP ZAP miss non-obvious patterns. Autonomous agents that map, probe, chain, and escalate continuously provide coverage equivalent to a persistent real attacker.
Apply strict access controls to RAG document chunks, embedding stores, and S3 paths. Scoped credentials for vector database APIs and integrations like OpenAI vector stores are non-optional.
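One way to enforce this is to carry the access-control list with each chunk and filter inside retrieval, before ranking, so unauthorised content can never reach the model's context in the first place. The groups, documents, and toy relevance ranking below are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    doc_id: str
    text: str
    allowed_groups: set[str] = field(default_factory=set)   # ACL stored with the chunk

INDEX = [
    Chunk("doc-public", "Published industry overview.", {"all-staff"}),
    Chunk("doc-mna", "Confidential merger analysis.", {"deal-team-7"}),
]

def retrieve(query: str, caller_groups: set[str], top_k: int = 5) -> list[Chunk]:
    """Authorisation happens inside retrieval: chunks the caller may not
    read are dropped before ranking, so they can never be quoted back by
    the model or leaked through prompt injection."""
    visible = [c for c in INDEX if c.allowed_groups & caller_groups]
    # Toy relevance ranking; a real system would score embeddings here.
    ranked = sorted(visible, key=lambda c: query.lower() in c.text.lower(), reverse=True)
    return ranked[:top_k]

print([c.doc_id for c in retrieve("merger", {"all-staff"})])                 # public chunk only
print([c.doc_id for c in retrieve("merger", {"all-staff", "deal-team-7"})])  # includes doc-mna
```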
Publish a clear responsible disclosure policy with scope, safe harbour, and communication channels. McKinsey's CISO acknowledged within 24 hours and patched within 48 — a model response that depended on having a programme ready before researchers arrived.
System prompts require the same controls as API keys: access restrictions, rotation policies, and audit logs on every write.
Public API docs are a gift to attackers. Internal documentation should require authentication — especially endpoint inventories.
Security testing must probe all dynamic SQL inputs, including JSON key names, as potential injection vectors — not just parameterised values.
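A sketch of what that looks like as a regression test: a handful of probes place SQL metacharacters in the key names, and the test passes only if every probe is rejected by validation rather than reaching the database layer. The handler here is a stand-in defined inline; in practice the same probes would be sent against the real endpoint.

```python
import sqlite3

# Probes where the SQL lives in the *key name*, not the value.
KEY_PROBES = [
    "query_text, user_id",
    "query_text) VALUES ('x'); --",
    "query_text' OR '1'='1",
    "query_text/**/FROM/**/users",
]

def handler_under_test(db: sqlite3.Connection, payload: dict) -> None:
    """Stand-in for the real endpoint; swap in the code being tested."""
    allowed = {"query_text", "user_id"}
    if set(payload) - allowed:
        raise ValueError("unexpected fields")
    db.execute(
        "INSERT INTO search_queries (query_text, user_id) VALUES (?, ?)",
        (payload.get("query_text"), payload.get("user_id")),
    )

def test_json_keys_are_not_injection_vectors():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE search_queries (query_text TEXT, user_id TEXT)")
    for probe in KEY_PROBES:
        try:
            handler_under_test(db, {probe: "payload"})
        except ValueError:
            continue            # clean validation failure is the desired outcome
        except sqlite3.Error as exc:
            raise AssertionError(f"key reached the database layer: {probe!r}: {exc}")
        raise AssertionError(f"hostile key was silently accepted: {probe!r}")

test_json_keys_are_not_injection_vectors()
print("all key-name probes rejected before reaching SQL")
```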
Autonomous attacker agents don't wait for your next scheduled pentest. Your defences need to be continuous too.
If you can't tell whether a system prompt was tampered with last week, you have a silent persistence problem waiting to happen.
Every document chunk entering a RAG system should have access controls, provenance metadata, and retrieval audit trails.
Key questions about the McKinsey Lilli breach and its implications for AI platform security.
Key terms from AI platform security and the McKinsey Lilli breach.
An attack inserting malicious SQL via user-supplied input. Affects any input concatenated into a SQL statement — including JSON key names, not just values.
A SQLi variant where the attack vector is the key name (field name) in a JSON payload, bypassing standard parameterisation which only protects values.
Insecure Direct Object Reference. Exposed object references without proper auth checks allow attackers to access data belonging to other users by manipulating IDs.
AI architecture combining a language model with a retrieval system over a document corpus. The model fetches relevant chunks before generating responses, grounding outputs in specific documents.
System prompts and instructions stored in databases or config files governing AI behaviour. Often unversioned and unmonitored — a new high-value attack surface class.
SQLi technique where results can't be directly read but database structure is inferred from error messages, timing, or conditional behaviours. Used in 15 iterations against Lilli.
Discovery and cataloguing of all interaction points — API endpoints, auth flows, file handlers, external integrations — where an attacker could interact with a system.
Practice of privately reporting a security flaw to the affected organisation, allowing remediation time, and only publishing after fixes are confirmed in place.
Database optimised for high-dimensional vector embeddings used in AI retrieval. Exposure reveals both document content and retrieval architecture — as seen with 266,000+ OpenAI vector stores.
Security testing by AI agents that autonomously discover targets, map surfaces, identify vulnerabilities, chain exploits, and document findings — without continuous human direction.