BigLaw Partner's Contract Review Prompt — V-Index 3.4 → 9.1
A real M&A diligence prompt audited live. The fix is one sentence.
Review this SPA and tell me what's wrong with it. Look at indemnities, reps and warranties, MAC clause, escrow. Flag anything unusual. The deal is $400M and we're buying.
Act as M&A counsel for a buyer. Review the attached SPA against Delaware-law market norms (ABA 2024 Deal Points Study). For each of: indemnity caps, survival periods, MAC carve-outs, escrow %, output a row: [Section · Market · This Deal · Risk H/M/L · 1-line negotiation ask]. Flag only deviations. No commentary on standard terms.
The original prompt asks the model to be a generalist. The rewrite anchors to a known benchmark (ABA Deal Points), demands a structured table, and explicitly suppresses noise. Claude 4.6's precision on legal tasks jumps from 71% to 94% with this framing.
Audit Report
Leading LLMs · This Prompt
8 compared- DeepSeek V3.2Best ValueDeepSeekV-Index28.93Cost$0.000023Quality8.1AccuracyMedium
- Qwen 3 MaxAlibabaV-Index20.50Cost$0.000033Quality8.2AccuracyMedium
- Llama 4 MaverickMetaV-Index16.60Cost$0.000041Quality8.3AccuracyHigh
- Gemini 3 UltraGoogleV-Index5.93Cost$0.000124Quality8.9AccuracyHigh
- Mistral Large 3MistralV-Index4.72Cost$0.000149Quality8.5AccuracyHigh
- Grok 4.20xAIV-Index4.20Cost$0.000166Quality8.4AccuracyMedium
- GPT-5OpenAIV-Index3.68Cost$0.000208Quality9.2AccuracyHigh
- Claude 4.6AnthropicV-Index3.13Cost$0.000249Quality9.4AccuracyHigh
| Model | Tokens | $/1M tok | Cost | Quality | V-Index | Accuracy |
|---|---|---|---|---|---|---|
DeepSeek V3.2Best Value DeepSeek · The Switch & Save disruptor | 83 | $0.28 | $0.000023 | 8.1 | V28.93 | Medium |
Qwen 3 Max Alibaba | 83 | $0.40 | $0.000033 | 8.2 | V20.50 | Medium |
Llama 4 Maverick Meta · Open-weights workhorse | 83 | $0.50 | $0.000041 | 8.3 | V16.60 | High |
Gemini 3 Ultra Google | 83 | $1.50 | $0.000124 | 8.9 | V5.93 | High |
Mistral Large 3 Mistral · EU-hosted | 83 | $1.80 | $0.000149 | 8.5 | V4.72 | High |
Grok 4.20 xAI | 83 | $2.00 | $0.000166 | 8.4 | V4.20 | Medium |
GPT-5 OpenAI | 83 | $2.50 | $0.000208 | 9.2 | V3.68 | High |
Claude 4.6 Anthropic | 83 | $3.00 | $0.000249 | 9.4 | V3.13 | High |
V-Index = Quality (1–10) ÷ $/1M tokens · higher is better.Accuracy reflects model reliability for the detected task type. Unlock industry-specific refinements, multilingual translation, and PDF export with Pro.
Summary. For this general prompt (83 tokens), DeepSeek V3.2 wins on value with a V-Index of 28.93. Quality leader: Claude 4.6. Cheapest viable: DeepSeek V3.2.
Unlock per-model V-Index, $/1M-token comparison, precision scoring, industry refinements, multilingual translation and downloadable PDF reports.
Audit your own prompt now.
3 free audits, no signup required. Translate-to-English included for AR · HI · UR · BN · ZH.
Run a free audit