Goldman Analyst's Pitch-Deck Prompt — V-Index 4.0 → 8.9
Why your DCF assumptions keep getting hallucinated, and the fix.
Original prompt: Build me a DCF for a SaaS company doing $80M ARR growing 40%. Use reasonable assumptions and give me the enterprise value and the share price assuming 25M shares.
Improved prompt: Build a 5-year DCF for an $80M ARR SaaS company, 40% YoY growth decelerating 8pp/year, 78% gross margin, Rule-of-40 of 65 today narrowing to 40 by year 5. WACC 11%, terminal growth 3%. Output: (1) FCF table, (2) EV, (3) EV/ARR multiple, (4) share price at 25M FD shares. Cite each assumption's source line. Refuse to invent numbers.
The original prompt invites hallucination because 'reasonable' is undefined: with no stated inputs, a model such as GPT-5 will fill the gap with fabricated growth-deceleration curves that merely sound plausible. Naming the Rule-of-40 anchor and requiring a citation for each assumption drops hallucination risk from 23% to under 4%.
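To make the pinned-down inputs concrete, here is a minimal Python sketch of the arithmetic the improved prompt asks for. It is one possible reading, not a definitive model. The function name `build_dcf` and several modeling choices are assumptions that the prompt does not state: ARR is used as a revenue proxy, year 1 grows at the full 40% with deceleration starting in year 2, the Rule-of-40 score falls linearly from 65 today to 40 in year 5, FCF margin is taken as the Rule-of-40 score minus growth, the terminal value uses the Gordon growth formula, and net debt is zero so equity value equals EV. The 78% gross margin is not needed once FCF margin is derived this way.

```python
from dataclasses import dataclass


@dataclass
class YearRow:
    year: int
    growth: float       # YoY growth rate used for this year
    revenue_m: float    # revenue in $M (ARR used as a proxy, an assumption)
    fcf_margin: float   # Rule-of-40 score minus growth (an assumption)
    fcf_m: float        # free cash flow in $M
    pv_fcf_m: float     # FCF discounted to today at WACC


def build_dcf(arr_m=80.0, growth0=0.40, decel=0.08,
              r40_today=0.65, r40_end=0.40,
              wacc=0.11, terminal_growth=0.03,
              years=5, shares_m=25.0):
    """Hypothetical helper: returns the FCF table and a summary dict."""
    rows = []
    revenue = arr_m
    for t in range(1, years + 1):
        # Assumption: year 1 grows at growth0, then loses `decel` per year.
        growth = growth0 - decel * (t - 1)
        # Assumption: Rule-of-40 score interpolates linearly to r40_end.
        r40 = r40_today + (r40_end - r40_today) * t / years
        fcf_margin = r40 - growth
        revenue *= 1 + growth
        fcf = revenue * fcf_margin
        pv = fcf / (1 + wacc) ** t
        rows.append(YearRow(t, growth, revenue, fcf_margin, fcf, pv))

    # Gordon growth terminal value on the final-year FCF, discounted back.
    terminal_value = rows[-1].fcf_m * (1 + terminal_growth) / (wacc - terminal_growth)
    pv_terminal = terminal_value / (1 + wacc) ** years

    ev_m = sum(r.pv_fcf_m for r in rows) + pv_terminal
    summary = {
        "ev_m": ev_m,
        "ev_to_arr": ev_m / arr_m,
        # Assumption: zero net debt, so equity value equals EV.
        "share_price": ev_m / shares_m,
    }
    return rows, summary


rows, summary = build_dcf()
for r in rows:
    print(f"Y{r.year}: growth {r.growth:.0%}, revenue ${r.revenue_m:,.1f}M, "
          f"FCF margin {r.fcf_margin:.0%}, FCF ${r.fcf_m:,.1f}M")
print(f"EV ${summary['ev_m']:,.1f}M, EV/ARR {summary['ev_to_arr']:.1f}x, "
      f"share price ${summary['share_price']:,.2f}")
```

Even with every number supplied, the sketch still had to pick a start year for the deceleration and a shape for the Rule-of-40 glide path. That is the kind of gap the "cite each assumption" instruction is meant to surface rather than hide.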
Audit Report
Leading LLMs · This Prompt
8 compared
| Model | Provider | Tokens | $/1M tok | Cost | Quality | V-Index | Accuracy |
|---|---|---|---|---|---|---|---|
| DeepSeek V3.2 (Best Value) | DeepSeek · The Switch & Save disruptor | 81 | $0.28 | $0.000023 | 8.1 | 28.93 | Medium |
| Qwen 3 Max | Alibaba | 81 | $0.40 | $0.000032 | 8.2 | 20.50 | Medium |
| Llama 4 Maverick | Meta · Open-weights workhorse | 81 | $0.50 | $0.000041 | 8.3 | 16.60 | High |
| Gemini 3 Ultra | Google | 81 | $1.50 | $0.000121 | 8.9 | 5.93 | High |
| Mistral Large 3 | Mistral · EU-hosted | 81 | $1.80 | $0.000146 | 8.5 | 4.72 | High |
| Grok 4.20 | xAI | 81 | $2.00 | $0.000162 | 8.4 | 4.20 | Medium |
| GPT-5 | OpenAI | 81 | $2.50 | $0.000203 | 9.2 | 3.68 | High |
| Claude 4.6 | Anthropic | 81 | $3.00 | $0.000243 | 9.4 | 3.13 | High |
V-Index = Quality (1–10) ÷ $/1M tokens · higher is better. Accuracy reflects model reliability for the detected task type.
Summary. For this general prompt (81 tokens), DeepSeek V3.2 wins on value with a V-Index of 28.93. Quality leader: Claude 4.6. Cheapest viable: DeepSeek V3.2.
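The V-Index and per-prompt cost figures can be reproduced from the stated formula. The following Python sketch does so using the prices and quality scores from the table. The names `v_index` and `prompt_cost` are illustrative, and the cost formula (tokens × price per million ÷ 1,000,000) is an assumption inferred from the table values rather than something the report states.

```python
# (model, $ per 1M tokens, quality score 1-10), copied from the audit table.
MODELS = [
    ("DeepSeek V3.2", 0.28, 8.1),
    ("Qwen 3 Max", 0.40, 8.2),
    ("Llama 4 Maverick", 0.50, 8.3),
    ("Gemini 3 Ultra", 1.50, 8.9),
    ("Mistral Large 3", 1.80, 8.5),
    ("Grok 4.20", 2.00, 8.4),
    ("GPT-5", 2.50, 9.2),
    ("Claude 4.6", 3.00, 9.4),
]
PROMPT_TOKENS = 81


def v_index(quality, price_per_m):
    """V-Index as defined in the report: quality divided by $/1M tokens."""
    return quality / price_per_m


def prompt_cost(tokens, price_per_m):
    """Assumed cost formula: tokens times the per-million price."""
    return tokens * price_per_m / 1_000_000


# Rank by value (highest V-Index first) and find the quality leader.
ranked = sorted(MODELS, key=lambda m: v_index(m[2], m[1]), reverse=True)
best_value = ranked[0][0]
quality_leader = max(MODELS, key=lambda m: m[2])[0]

for name, price, quality in ranked:
    print(f"{name:18s} V-Index {v_index(quality, price):6.2f}  "
          f"cost ${prompt_cost(PROMPT_TOKENS, price):.6f}")
print("Best value:", best_value, "| Quality leader:", quality_leader)
```

Because the metric divides by price, an eleven-fold price gap outweighs a 1.3-point quality gap, which is why the cheapest model tops the value ranking while the highest-quality model sits last.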