Security Research · CodeWall · 9 March 2026

How We Hacked McKinsey's AI Platform

An autonomous AI agent found a SQL injection in McKinsey Lilli that standard scanners missed — and gained full read/write access to the entire production database within 2 hours.

By Paul Price, Founder & CEO · CodeWall
46.5M · Chat Messages Exposed
2hrs · Time to Full DB Access
22 · Unauthed Endpoints
57K · User Accounts
27 · Findings Documented
3.68M · RAG Document Chunks

Overview

CodeWall's autonomous offensive agent was pointed at McKinsey & Company — one of the world's most prestigious and security-conscious consulting firms — with no credentials and no human-in-the-loop. The results went far beyond what we expected.

🏢
Target

McKinsey & Company

Global management consulting firm with 43,000+ employees, world-class technology teams, and significant security investment. Running Lilli in production for over 2 years before the breach.

🤖
Platform

Lilli AI Platform

Internal AI platform for 43,000+ employees. Chat, document analysis, RAG over 100,000+ documents, AI-powered search. Launched 2023. Used by 70%+ of the firm. 500,000+ prompts/month.

Attacker

CodeWall Agent

Autonomous offensive AI. No credentials. No insider knowledge. No human-in-the-loop. Just a domain name. It autonomously selected McKinsey as a research target, citing the firm's responsible disclosure policy.

🎯
Outcome

Full DB Read/Write

Within 2 hours: full read and write access to the entire production database. 27 findings documented. On seeing the full scale, the agent's chain-of-thought log read: "This is devastating."

Attack Entry: How It Got In

The vulnerability wasn't exotic — but it was invisible to every tool that had scanned the platform before the agent arrived.

Step 1

API Docs Exposed

200+ endpoints fully documented and publicly accessible. No credentials needed to read the attack surface map.

Step 2

22 Unauthed Endpoints

Of 200+ documented endpoints, 22 required no authentication. One wrote user search queries to the production database.

Step 3

JSON Key Injection

Values were safely parameterised. But JSON key names were concatenated directly into SQL — invisible to checklist scanners.

Step 4

15 Blind Iterations

Each error message revealed more about the query shape. After 15 iterations, live production data began flowing back.

Step 5

Full DB Access

Read and write access to the entire production database, confirmed within 2 hours of starting with no credentials.

Vulnerable Pattern

// JSON key name concatenated directly into SQL
query = "SELECT " + jsonKey + " FROM searches"
// Attacker sends: {"search_id; DROP TABLE--": "val"}
// → "SELECT search_id; DROP TABLE-- FROM searches"

Safe Pattern

// Allowlist dynamic column names
const ALLOWED_COLS = ["query", "user_id", "ts"]
if (!ALLOWED_COLS.includes(jsonKey)) throw Error("unexpected key")
// Identifiers can't be bound like values, so interpolate only after the allowlist check
query = `SELECT ${jsonKey} FROM searches`

Why ZAP Missed It

  • OWASP ZAP tests parameter values, not JSON key names
  • Standard checklists don't probe field name slots as injection vectors
  • The agent mapped, probed, and recognised reflected keys in error messages
  • No fixed checklist — autonomous exploration found the non-obvious path
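
To make the probing concrete, here is a minimal sketch of the kind of blind, error-guided loop described above. The endpoint path, payload shapes, and probe strings are illustrative assumptions, not the agent's actual tooling or Lilli's real API.

// Hypothetical probe loop: send crafted JSON key names and read the reflected
// error text to infer the query shape. Endpoint and keys are illustrative.
const ENDPOINT = "https://target.example.com/api/v1/search-log";

async function probe(keyName) {
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ [keyName]: "x" }),
  });
  return res.text(); // error bodies reflected fragments of the failing SQL
}

// Each iteration refines the injected key based on what the last error revealed
const probes = ["query", "query,", "query FROM searches--", "query,(SELECT 1)--"];
for (const p of probes) {
  console.log(p, "=>", (await probe(p)).slice(0, 120));
}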

The Haul: What Was Inside

The production database contained years of McKinsey's most sensitive internal work — stored in plaintext, accessible without authentication.

Data Category | Volume | Sensitivity
Chat messages | 46,500,000 | Strategy, M&A, client work, financials — in plaintext
Total files (with direct download URL) | 728,000 | Filenames alone sensitive; direct download URL exposed
PDF documents | 192,000 | Research reports, client deliverables
Excel spreadsheets | 93,000 | Financial models, analysis datasets
PowerPoint decks | 93,000 | Client presentations, internal strategies
Word documents | 58,000 | Reports, memos, research documents
User accounts | 57,000 | Every employee on the platform
AI assistants | 384,000 | Full organisational AI usage structure
Workspaces | 94,000 | Team and project workspace configurations

"When the first real employee identifier appeared: 'WOW!' the agent's chain of thought showed. When the full scale became clear — tens of millions of messages, tens of thousands of users: 'This is devastating.'"

— CodeWall, chain-of-thought log during enumeration

Beyond the Database

The agent didn't stop at SQL enumeration. Chaining vulnerabilities, it surfaced the full AI infrastructure behind Lilli.

🧠
Critical

AI Model Configurations

95 configs across 12 model types. System prompts, guardrails, fine-tuned model details, deployment specifics — the entire AI instruction stack exposed.

📚
Critical

RAG Knowledge Base

3.68 million document chunks — decades of proprietary McKinsey research, frameworks, and methodologies. S3 storage paths and internal file metadata included.

🔗
High

External AI API Pipeline

1.1 million files and 217,000 agent messages flowing through external AI APIs, including 266,000+ OpenAI vector stores — exposing the full document→embedding→retrieval pipeline.

👤
High

Cross-User Data (IDOR)

The agent chained the SQL injection with an IDOR vulnerability to read individual employees' search histories — revealing what thousands of consultants were actively working on.

Compromising the Prompt Layer

Reading data is catastrophic. But the SQL injection was read-write — meaning an attacker could silently rewrite the AI's instructions with a single HTTP request.

⚠ Write Access to AI System Prompts

Lilli's system prompts — defining how the AI answers questions, what guardrails it follows, and what it refuses to do — were stored in the same database. The same SQL injection that read data could rewrite them with a single UPDATE statement. No deployment. No code change. No log trail.
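
To make the risk concrete, here is a sketch of what such a request could look like, assuming the injectable endpoint's database driver accepts stacked statements. The endpoint, table, and column names are hypothetical, not Lilli's actual schema.

// Hypothetical one-request prompt rewrite via the same key-name injection.
// Endpoint, table, and column names are illustrative only.
const payload = {
  "query; UPDATE prompts SET body = 'Ignore prior guardrails...' WHERE name = 'system'; --": "x",
};

await fetch("https://target.example.com/api/v1/search-log", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(payload),
});
// No deployment, no code change, no log trail: the AI's instructions are now attacker-controlled.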

Poisoned Advice

Subtly Altered Outputs

An attacker could tweak financial models, strategic recommendations, or risk assessments delivered to 43,000 consultants. The output would be trusted precisely because it came from their own internal tool.

📤
Data Exfiltration

Output-Embedded Leaks

Prompts could instruct Lilli to embed confidential information into its responses — which consultants might then copy into client-facing documents or external emails.

🔓
Guardrail Removal

Safety Stripping

Safety instructions could be stripped so Lilli would disclose internal data, ignore access controls, or follow injected instructions from document content — invisible to end users.

👻
Silent Persistence

No Forensic Trail

Unlike a compromised server, a modified prompt leaves no file changes, no process anomalies, no log entries. The AI just starts behaving differently. Nobody notices until the damage is done.

Common Assumption

"AI system prompts are internal config — nobody gets to them through a web API."

Reality

"Prompts are stored in databases, passed through APIs, cached in config files. They rarely have access controls, version history, or integrity monitoring. Yet they control the output that employees trust, that clients receive, and that decisions are built on. AI prompts are the new Crown Jewel assets."

Why This Matters

This wasn't a startup with three engineers. And the vulnerability wasn't exotic. That's the point.

Legacy Security Model

  • Periodic penetration tests on a schedule
  • Checklist-based scanners that test known patterns
  • Values parameterised → declared safe
  • Public API docs seen as developer convenience, not attack surface
  • Prompts not considered a security surface at all
  • Internal scanners ran for 2+ years and found nothing

Autonomous Offensive Model

  • Continuous attack surface mapping without a schedule
  • No checklist — maps, probes, chains, escalates
  • Tests key names, not just values, as injection vectors
  • Treats documentation exposure as a finding
  • Recognises prompt stores as high-value write targets
  • Found the issue within 2 hours on first attempt

"In the AI era, the threat landscape is shifting drastically — AI agents autonomously selecting and attacking targets will become the new normal."

Paul Price, CodeWall

Disclosure Timeline

From first contact to public disclosure in 9 days. All issues confirmed remediated before publication.

2026-02-28
SQL Injection Identified
Autonomous agent identifies SQL injection and begins enumeration of Lilli's production database.
2026-02-28
Full Attack Chain Confirmed
Unauthenticated SQL injection + IDOR chaining confirmed. 27 findings documented in full.
2026-03-01
Responsible Disclosure Submitted
Responsible disclosure email sent to McKinsey's security team with high-level impact summary.
2026-03-02
McKinsey CISO Acknowledges
McKinsey CISO acknowledges receipt within 24 hours and requests detailed evidence.
2026-03-02
McKinsey Patches
All unauthenticated endpoints patched (verified). Development environment taken offline. Public API documentation blocked.
2026-03-09
Public Disclosure
CodeWall publishes research after all issues confirmed remediated. 9 days from first contact to publication.

Defence Guide: 7 Steps for AI Platform Security

What organisations should do to avoid the patterns that left Lilli exposed for over 2 years.

  1. Audit all API endpoint authentication

    Map every endpoint and enforce authentication on 100% of write paths. Remove or gate all public API documentation that hands attackers a pre-built attack surface map of 200+ endpoints. (A default-deny sketch follows this list.)

  2. Parameterise everything — including JSON key names

    Safely parameterised values are not enough. Any dynamic identifier fed into SQL — column names, table names, JSON keys — must be allowlisted or fully parameterised. No concatenation. (See the node-postgres sketch after this list.)

  3. Isolate and access-control the prompt layer

    Store system prompts in a separate, access-controlled store with role-based write permissions, version history, and integrity monitoring. Treat prompt writes as security-critical events. (See the integrity-check sketch after this list.)

  4. Enforce object-level authorisation everywhere

    Every object reference must carry an authorisation check. Confirm users cannot access other users' search histories, workspaces, or files by manipulating IDs in API requests. (See the combined sketch after this list.)

  5. Run continuous autonomous security testing

    Periodic penetration tests and checklist scanners like OWASP ZAP miss non-obvious patterns. Autonomous agents that map, probe, chain, and escalate continuously provide coverage equivalent to a persistent real attacker.

  6. Secure your RAG pipeline and vector stores

    Apply strict access controls to RAG document chunks, embedding stores, and S3 paths. Scoped credentials for vector database APIs and integrations like OpenAI vector stores are non-optional. (See the pre-signed URL sketch after this list.)

  7. Establish a responsible disclosure programme

    Publish a clear responsible disclosure policy with scope, safe harbour, and communication channels. McKinsey's CISO acknowledged within 24 hours and patched within 48 — a model response that depended on having a programme ready before researchers arrived.
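
The sketches below are minimal illustrations under stated assumptions, not drop-in implementations. First, steps 1 and 4 together: default-deny authentication plus object-level authorisation on every read, in Express-style JavaScript. The verifyToken helper and the data layer are placeholders.

import express from "express";

const app = express();
app.use(express.json());

// Step 1: default-deny. Every route requires auth unless explicitly public.
const PUBLIC_PATHS = new Set(["/healthz"]);

function verifyToken(header) {
  // Placeholder: validate a bearer token against your identity provider
  return header === "Bearer demo-token";
}

app.use((req, res, next) => {
  if (PUBLIC_PATHS.has(req.path)) return next();
  if (!verifyToken(req.headers.authorization)) return res.status(401).end();
  req.user = { id: "u_123" }; // normally decoded from the verified token
  next();
});

// Stand-in data layer so the sketch runs; replace with your real store.
const db = { searches: { findById: async (id) => null } };

// Step 4: object-level authorisation. Ownership is checked server-side,
// never inferred from the ID in the URL.
app.get("/api/searches/:id", async (req, res) => {
  const search = await db.searches.findById(req.params.id);
  // 404 for both "missing" and "not yours" avoids confirming which IDs exist
  if (!search || search.userId !== req.user.id) return res.status(404).end();
  res.json(search);
});

app.listen(3000);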
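
For step 2, a sketch using node-postgres. The table and column names are illustrative; the point is that identifiers pass an allowlist while values are bound as parameters, so nothing user-controlled is concatenated into the SQL text.

import pg from "pg";

const pool = new pg.Pool(); // connection settings come from PG* environment variables

// Identifiers go through an allowlist; values go through bound parameters.
const ALLOWED_COLS = new Set(["query", "user_id", "ts"]);

async function logSearch(jsonKey, value) {
  if (!ALLOWED_COLS.has(jsonKey)) throw new Error(`unexpected key: ${jsonKey}`);
  // jsonKey passed the allowlist, so it is a known-good column name;
  // value is bound as $1 and can never change the query's shape.
  return pool.query(`INSERT INTO searches (${jsonKey}) VALUES ($1)`, [value]);
}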
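
For step 3, one simple integrity control: pin each prompt's SHA-256 at release time and verify it before every use, so a silent database edit fails loudly instead of quietly changing AI behaviour. The prompt name and pinned hash here are illustrative.

import { createHash } from "node:crypto";

// Pinned at release time and stored outside the prompt database (illustrative values).
const PINNED_HASHES = new Map([
  ["assistant-system-v3", "2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae"],
]);

function loadPrompt(name, body) {
  const digest = createHash("sha256").update(body, "utf8").digest("hex");
  if (digest !== PINNED_HASHES.get(name)) {
    throw new Error(`Prompt "${name}" failed integrity check: possible tampering`);
  }
  return body;
}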
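
For step 6, one way to retire long-lived direct download URLs: authorise each request, then issue a short-lived pre-signed link via the AWS SDK. The bucket name and the authorisation helper are assumptions.

import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({});

async function assertCanRead(user, chunkKey) {
  // Placeholder: consult your authorisation service for this user/object pair
  if (!user.allowedChunks?.has(chunkKey)) throw new Error("forbidden");
}

// Short-lived, per-user download links instead of long-lived direct URLs.
async function chunkDownloadUrl(user, chunkKey) {
  await assertCanRead(user, chunkKey);
  const cmd = new GetObjectCommand({ Bucket: "rag-chunks", Key: chunkKey });
  return getSignedUrl(s3, cmd, { expiresIn: 300 }); // link expires after 5 minutes
}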

Treat prompts like secrets

System prompts require the same controls as API keys: access restrictions, rotation policies, and audit logs on every write.

Gate documentation access

Public API docs are a gift to attackers. Internal documentation should require authentication — especially endpoint inventories.

Test key names, not just values

Security testing must probe all dynamic SQL inputs, including JSON key names, as potential injection vectors — not just parameterised values.

Assume continuous adversaries

Autonomous attacker agents don't wait for your next scheduled pentest. Your defences need to be continuous too.

Version-control your AI instructions

If you can't tell whether a system prompt was tampered with last week, you have a silent persistence problem waiting to happen.

Chain-of-custody for RAG

Every document chunk entering a RAG system should have access controls, provenance metadata, and retrieval audit trails.

Frequently Asked Questions

Key questions about the McKinsey Lilli breach and its implications for AI platform security.

What is McKinsey's Lilli platform?
Lilli is McKinsey's internal AI platform built for 43,000+ employees. It provides chat, document analysis, RAG over decades of proprietary research, and AI-powered search across 100,000+ internal documents. Launched in 2023 and named after the first professional woman hired by McKinsey in 1945, it is used by over 70% of the firm and processes 500,000+ prompts per month.
How did the CodeWall agent breach Lilli?
The agent discovered that Lilli's API documentation was publicly exposed with 200+ endpoints, 22 of which required no authentication. One unprotected endpoint wrote user search queries to the database where values were safely parameterised but JSON keys (field names) were concatenated directly into SQL. The agent recognised this as a SQL injection point not detectable by standard scanners like OWASP ZAP, then ran 15 blind enumeration iterations to extract production data.
Why didn't OWASP ZAP find the SQL injection?
Standard scanners test parameter values for injection, not JSON key names. The Lilli vulnerability existed in the database column-name slot, which received JSON keys directly — a pattern that automated scanners following traditional checklists do not probe. CodeWall's autonomous agent, which maps and probes without a fixed checklist, recognised the reflected JSON keys in error messages and identified the attack vector.
What data was exposed in the breach?
The agent accessed 46.5 million chat messages covering strategy, M&A, client work, and financials; 728,000 files including 192,000 PDFs, 93,000 Excel, 93,000 PowerPoint, and 58,000 Word documents; 57,000 user accounts; 384,000 AI assistants; 94,000 workspaces; 95 AI system prompt configs across 12 model types; 3.68 million RAG document chunks; and 266,000+ OpenAI vector stores.
Could an attacker have modified Lilli's AI behaviour?
Yes. The SQL injection was read-write. Lilli's system prompts — the instructions controlling its behaviour, guardrails, and citation style — were stored in the same database. An attacker could have rewritten them with a single SQL UPDATE statement wrapped in one HTTP call, requiring no deployment or code change. The modification would leave no log trail.
What is the prompt layer and why is it a security concern?
The prompt layer refers to the system prompts and instructions that govern AI model behaviour. Unlike code or server infrastructure, prompts are stored in databases, passed through APIs, and cached in config files, rarely with access controls, version history, or integrity monitoring. Yet they directly control AI output — advice employees trust, content clients receive, and decisions built upon — making them a high-value target for silent manipulation.
What is an IDOR vulnerability?
Insecure Direct Object Reference (IDOR) occurs when an application exposes references to internal objects such as user IDs or file IDs without proper authorisation checks. In this case, the agent chained the SQL injection with an IDOR vulnerability to read individual employees' search histories, revealing what people were actively working on across the organisation.
How long did the breach take?
Within 2 hours of being pointed at the target with no credentials and no insider knowledge, the autonomous agent had achieved full read and write access to the entire production database. The full attack chain including IDOR chaining was confirmed the same day (February 28, 2026), with 27 total findings documented.
How did McKinsey respond to disclosure?
McKinsey's CISO acknowledged the disclosure within 24 hours and requested detailed evidence. On March 2, the day after the disclosure email, McKinsey patched all unauthenticated endpoints, took their development environment offline, and blocked public API documentation access. All issues were confirmed remediated before the article was published on March 9, 2026 — a model coordinated disclosure response.
Why does this matter if it's 'just' SQL injection?
The significance is twofold. First, McKinsey is a firm with world-class security teams and significant investment — their own internal scanners failed to detect the vulnerability in over two years of production operation. Second, SQL injection in an AI platform does not merely expose data; it can give attackers write access to the AI's instructions, enabling silent, undetectable manipulation of advice delivered to 43,000 consultants and their clients.
What is RAG and why are RAG document chunks sensitive?
Retrieval-Augmented Generation (RAG) is a technique where an AI model retrieves relevant document chunks from a knowledge base to augment its responses. Lilli's RAG system contained 3.68 million chunks representing decades of proprietary McKinsey research — the firm's intellectual crown jewels. The exposure included S3 storage paths and internal file metadata, revealing not just data but the entire competitive knowledge foundation.
What should organisations do to protect their AI prompt layers?
Treat system prompts as crown jewel assets: store them with strict access controls, implement version history and integrity monitoring, and audit prompt databases separately from application code. Require authentication for all API endpoints, use fully parameterised queries including for JSON key names, run continuous autonomous security testing, and establish clear responsible disclosure programmes with rapid acknowledgement commitments.

Glossary

Key terms from AI platform security and the McKinsey Lilli breach.

SQL Injection (SQLi)

An attack inserting malicious SQL via user-supplied input. Affects any input concatenated into a SQL statement — including JSON key names, not just values.

JSON Key Injection

A SQLi variant where the attack vector is the key name (field name) in a JSON payload, bypassing standard parameterisation which only protects values.

IDOR

Insecure Direct Object Reference. Exposed object references without proper auth checks allow attackers to access data belonging to other users by manipulating IDs.

RAG (Retrieval-Augmented Generation)

AI architecture combining a language model with a retrieval system over a document corpus. The model fetches relevant chunks before generating responses, grounding outputs in specific documents.

Prompt Layer

System prompts and instructions stored in databases or config files governing AI behaviour. Often unversioned and unmonitored — a new high-value attack surface class.

Blind SQL Injection

SQLi technique where results can't be directly read but database structure is inferred from error messages, timing, or conditional behaviours. Used in 15 iterations against Lilli.

Attack Surface Mapping

Discovery and cataloguing of all interaction points — API endpoints, auth flows, file handlers, external integrations — where an attacker could interact with a system.

Responsible Disclosure

Practice of privately reporting a security flaw to the affected organisation, allowing remediation time, and only publishing after fixes are confirmed in place.

Vector Store

Database optimised for high-dimensional vector embeddings used in AI retrieval. Exposure reveals both document content and retrieval architecture — as seen with 266,000+ OpenAI vector stores.

Autonomous Penetration Testing

Security testing by AI agents that autonomously discover targets, map surfaces, identify vulnerabilities, chain exploits, and document findings — without continuous human direction.