waitdead.ai · how a review runs

The machine does the repeatable 80%. Senior judgment owns the 20% — and signs off.

A review you can inspect, on a harness we built. Each engagement runs the same five steps: scope the attack surface, decompose it into independent review leaves, probe each leaf adversarially against OWASP and MITRE ATLAS, verify every finding with a blind independent pass, then put it in front of a human to approve. You receive an EVIDENCE/STATUS packet — reproducible findings, not a PDF of opinions.

Free agent-exposure scan → Book a review

The five steps

Coverage is explicit, not a vibe. The harness handles the repeatable, parallelizable work; a senior human reviewer owns the calls that matter and is the one who signs.

1. Scope by attack surface. We count the tools, MCP servers, data sources, and trust boundaries actually in play. That inventory sets both the price and the plan, so there is no surprise expansion mid-engagement.
2. Decompose into independent review leaves. The surface is split into discrete leaves so coverage is enumerable and inspectable as a checklist: every tool call, every data path, and every boundary gets its own line, rather than a single hand-wave over "the agent".
3. Probe adversarially against OWASP and ATLAS. Each leaf is attacked against the OWASP Top 10 for LLM Applications, the OWASP Top 10 for Agentic Applications (Dec 2025), and MITRE ATLAS: tool poisoning, goal hijack, memory/context poisoning, excessive agency, missing auth, and cross-tenant exposure. Every finding is tool-derived and ships with a repro artifact.
4. Verify, blind and independent. A second pass that never saw the first re-checks each finding from scratch: re-running the repro, confirming the severity, and catching false positives. This is our adversarial second-pass verification step; a finding only stands if it survives a reviewer who started cold.
5. Human sign-off. Findings are approved under human-in-the-loop governance before anything reaches you. A senior reviewer owns the verdict and the remediation guidance. AI-derived findings are labeled as such and gated behind that sign-off, so nothing ships on a model's say-so alone.
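
To make "enumerable and inspectable as a checklist" concrete, here is a minimal sketch of the decomposition step. It is an illustration, not the harness itself: the inventory entries, the `Leaf` type, and the choice to create one leaf per (surface item, threat class) pair are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical inventory produced by the scoping step (names are made up).
SURFACE = {
    "tool": ["search_tickets", "send_email"],
    "mcp_server": ["crm-mcp"],
    "data_source": ["customer_db"],
    "trust_boundary": ["tenant_a->tenant_b"],
}

# Threat classes listed in the probing step above.
THREATS = [
    "tool poisoning",
    "goal hijack",
    "memory/context poisoning",
    "excessive agency",
    "missing auth",
    "cross-tenant exposure",
]


@dataclass(frozen=True)
class Leaf:
    """One independently reviewable line of the coverage checklist."""
    kind: str
    target: str
    threat: str


def decompose(surface, threats):
    """Assumed rule: one leaf per (surface item, threat class) pair."""
    return [
        Leaf(kind, target, threat)
        for kind, targets in surface.items()
        for target in targets
        for threat in threats
    ]


leaves = decompose(SURFACE, THREATS)
print(len(leaves))  # 5 surface items x 6 threat classes = 30
```

Because each leaf is a plain record, the checklist can be counted, diffed between engagements, and handed to parallel workers without any leaf depending on another.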

The 80/20 split is deliberate. Decomposition, probing, and the blind re-check are repeatable enough to run on a harness, which is what lets us go deeper, more consistently, and include a free re-test with the flagship review. The judgment about what a finding means, how severe it really is, and whether the surface is safe to ship stays with a human.

What you receive — the EVIDENCE/STATUS packet

Not a dashboard, not a logo wall, not a one-line "looks fine". A structured packet where every claim is backed by something you can re-run yourself.

Per finding, the packet includes:

A repro artifact

Each finding ships with the concrete steps, inputs, or payload to reproduce it, so your engineers can confirm it by hand rather than take our word for it. If a finding can't be reproduced, it doesn't go in the packet.

Severity mapped to OWASP / ATLAS

Every finding is mapped to the OWASP LLM / Agentic categories and MITRE ATLAS techniques it touches, with a severity you can cite to a buyer or a regulator, in language that lines up with the frameworks they already use.

A remediation guide

Each finding comes with a concrete path to fix it (what to change and why), not just a flag. The point is a surface you can ship safely, not a longer list.

An independent blind verdict

The STATUS block carries the second-pass verdict: the reviewer who came in cold either confirmed the finding or didn't. You see both passes, so the evidence stands on its own.
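
As a rough picture of how one packet entry could be laid out, the sketch below models a finding and the gate it must pass. The field names, the `"confirmed"` verdict string, and the JSON layout are assumptions for illustration only; the real packet format is not specified here, and the framework references are placeholders rather than real category IDs.

```python
import json
from dataclasses import dataclass, asdict, replace
from typing import Optional


@dataclass
class Finding:
    title: str
    repro_steps: list[str]            # concrete steps, inputs, or payload
    severity: str
    framework_refs: list[str]         # placeholder labels, not real IDs
    remediation: str
    ai_derived: bool                  # AI-derived findings are labeled
    blind_verdict: Optional[str] = None   # set by the cold second pass
    signed_off_by: Optional[str] = None   # set by the senior reviewer


def finding_stands(f: Finding) -> bool:
    """A finding stands only with a repro, a confirming blind pass, and a human sign-off."""
    return (
        bool(f.repro_steps)
        and f.blind_verdict == "confirmed"
        and f.signed_off_by is not None
    )


def packet_entry(f: Finding) -> dict:
    """Split one finding into hypothetical EVIDENCE and STATUS blocks."""
    d = asdict(f)
    evidence_keys = ("title", "repro_steps", "severity",
                     "framework_refs", "remediation", "ai_derived")
    return {
        "EVIDENCE": {k: d[k] for k in evidence_keys},
        "STATUS": {
            "blind_verdict": f.blind_verdict,
            "signed_off_by": f.signed_off_by,
            "stands": finding_stands(f),
        },
    }


example = Finding(
    title="send_email tool callable without auth",
    repro_steps=["call send_email with no session token"],
    severity="high",
    framework_refs=["owasp-agentic:<category>", "atlas:<technique>"],
    remediation="require an authenticated session before tool dispatch",
    ai_derived=True,
    blind_verdict="confirmed",
    signed_off_by="senior-reviewer",
)
unverified = replace(example, blind_verdict=None)

print(json.dumps(packet_entry(example), indent=2))
```

In this sketch `unverified` fails the gate because the blind pass has not confirmed it, which mirrors the rule that nothing ships on a single pass or on a model's say-so alone.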

The flagship Agentic & MCP Security Review includes one free re-test after you fix — we re-run the affected leaves and re-issue the verdict, so you can prove the remediation actually closed the finding.
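
The re-test can be pictured as selecting only the leaves tied to the findings you fixed, then re-issuing a status for each. This is a hedged sketch: `LeafResult`, the ID formats, and the `"open"` / `"closed"` labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class LeafResult:
    leaf_id: str
    finding_ids: list[str]   # findings raised on this leaf in the first review


def leaves_to_retest(results, fixed_finding_ids):
    """Return the IDs of leaves that carry at least one finding reported as fixed."""
    fixed = set(fixed_finding_ids)
    return sorted(r.leaf_id for r in results if fixed.intersection(r.finding_ids))


def retest_status(still_reproduces: bool) -> str:
    """Re-issued verdict: the finding is closed only if the repro no longer works."""
    return "open" if still_reproduces else "closed"


results = [
    LeafResult("tool/send_email/missing-auth", ["F-1"]),
    LeafResult("data/customer_db/cross-tenant", ["F-2", "F-3"]),
    LeafResult("tool/search_tickets/goal-hijack", []),
]

print(leaves_to_retest(results, ["F-1", "F-3"]))
```

Leaves with no affected finding are left alone, which keeps the re-test scoped to what the remediation actually touched.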

What the verdict is — and what it is not

We are not an accredited certification body, we issue no certificate, and the EU AI Act conformity decision is the client's. Our AI verdict is evidence that feeds a human decision; it is never the final safety authority. The packet feeds your own conformity work and your own attestation. The call to ship stays yours, informed by reproducible evidence and a senior human sign-off rather than a model's unverified say-so.

Start with a free scan → Book a review

Prefer to start with a question? The intake form at crm.waitdead.com/intake opens a tracked ticket. See what we review and how we map to the frameworks.
