Our Estimation Methodology

When restaurants don't publish nutritional data, we estimate it — and we've measured exactly how accurate those estimates are.

84 kcal: average calorie error (image only)
29%: calorie MAPE, beating every tested app
2.5g: protein error when calories are known
60: dishes blind-tested across 3 chains
Download Estimation Methodology (PDF)

We ran a controlled blind experiment: 60 real dishes from Pret A Manger, Itsu, and Farmer J, each with complete restaurant-reported nutritional data. We asked our AI to estimate those same values — then compared the outputs against the known truth. The result: our single-shot estimation pipeline outperforms every independently tested commercial nutrition app in the published literature.

Sourced data first. Always.

Health Freak isn't an estimation-only platform. Our data hierarchy is:

  1. Restaurant-published data — If a restaurant reports nutritional values, we use those. Always the most accurate source.
  2. AI-estimated data — For restaurants that publish nothing or only partial data, we use LLM-based estimation. Every estimated value is clearly labelled.
  3. Transparent provenance — Every data point carries a tag showing whether it was sourced or estimated.
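The hierarchy above can be sketched as a small data structure. This is only an illustration: the names `Provenance`, `NutrientValue`, and `resolve` are hypothetical and do not describe Health Freak's actual implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Provenance(Enum):
    SOURCED = "sourced"      # published by the restaurant
    ESTIMATED = "estimated"  # produced by the AI estimation step


@dataclass(frozen=True)
class NutrientValue:
    nutrient: str
    amount: float
    unit: str
    provenance: Provenance   # every data point carries its tag


def resolve(nutrient: str, unit: str,
            published: Optional[float],
            estimated: Optional[float]) -> Optional[NutrientValue]:
    """Prefer the restaurant-published value; fall back to the estimate."""
    if published is not None:
        return NutrientValue(nutrient, published, unit, Provenance.SOURCED)
    if estimated is not None:
        return NutrientValue(nutrient, estimated, unit, Provenance.ESTIMATED)
    return None  # nothing published and nothing estimated


# A published calorie value wins over an estimate for the same dish.
calories = resolve("calories", "kcal", published=520.0, estimated=498.0)
# Salt is often unpublished, so the estimate is used and labelled as such.
salt = resolve("salt", "g", published=None, estimated=1.8)
```

Keeping the tag on the value itself, rather than on the dish, allows a single dish to mix sourced calories with estimated salt, which matches the partial-data case described above.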

We estimate because the alternative — leaving users with no nutritional information at all — is worse. But we're honest about what's sourced and what's estimated.

How we tested accuracy

We selected 20 dishes from each of Pret A Manger, Itsu, and Farmer J — chosen because they represent different cuisine types and complexity levels, and all had complete nutritional data to test against. Each dish was run through four scenarios, giving the AI progressively more information:

Scenario | What the AI was given | What it had to estimate
S1: Image only | Photo + dish name | Everything
S2: + Description | S1 + menu description | Everything
S3: + Calories | S2 + reported calories | Macros, salt, sugar, sat fat
S4: + Macros | S3 + protein, fat, carbs, fibre | Salt, sugar, sat fat only
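One way to read the scenario table is as cumulative inputs, with the estimation targets being whatever has not been supplied. The sketch below encodes that reading. The field names are assumptions, and "Everything" is taken to mean the eight values named elsewhere on this page.

```python
# Assumed field names; "Everything" is taken to mean these eight values.
ALL_FIELDS = ["calories", "protein", "fat", "carbs", "fibre",
              "salt", "sugar", "sat_fat"]

# Inputs ADDED at each scenario; each scenario inherits all earlier ones.
ADDED_INPUTS = {
    "S1": ["image", "dish_name"],
    "S2": ["menu_description"],
    "S3": ["calories"],
    "S4": ["protein", "fat", "carbs", "fibre"],
}


def inputs_for(scenario: str) -> list[str]:
    """Cumulative inputs available to the model in the given scenario."""
    given: list[str] = []
    for name, added in ADDED_INPUTS.items():
        given.extend(added)
        if name == scenario:
            return given
    raise KeyError(scenario)


def targets_for(scenario: str) -> list[str]:
    """Nutritional fields the model still has to estimate."""
    given = set(inputs_for(scenario))
    return [field for field in ALL_FIELDS if field not in given]


for s in ADDED_INPUTS:
    print(s, "->", targets_for(s))
```

Under this reading, S1 and S2 share the same targets and differ only in their inputs, which is why comparing them isolates the effect of the menu description.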

All estimates were generated by GPT-4.1 (OpenAI). A second round using Google's Gemini 2.5 Flash is in progress.

Key findings

Calorie estimation: 84 kcal average error from a photo alone

For a typical restaurant dish of 300–700 kcal, that's an error of 12–28% — enough to reliably distinguish a light salad from a calorie-dense bowl.

Restaurant | MAE (kcal) | MAPE
Pret A Manger | 68 | 23%
Itsu | 63 | 27%
Farmer J | 121 | 36%
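For readers who want to check the arithmetic, here is a minimal sketch using the standard definitions of MAE (mean absolute error) and MAPE (mean absolute percentage error). This page does not spell out the formulas used, so treating them as the standard ones is an assumption. Because each chain contributed 20 dishes, the plain mean of the per-chain figures should match the headline numbers, up to rounding in the published values.

```python
def mae(actual, predicted):
    """Mean absolute error, in the same unit as the inputs."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, as a percentage of the true value."""
    return 100 * sum(abs(p - a) / a for a, p in zip(actual, predicted)) / len(actual)


# Per-chain (MAE in kcal, MAPE in %) as published in the table above.
per_chain = {
    "Pret A Manger": (68, 23),
    "Itsu": (63, 27),
    "Farmer J": (121, 36),
}

# Equal-weighted means are valid here because every chain has 20 dishes.
overall_mae = sum(m for m, _ in per_chain.values()) / len(per_chain)
overall_mape = sum(p for _, p in per_chain.values()) / len(per_chain)

# An 84 kcal error relative to a 700 kcal and a 300 kcal dish.
low_pct = round(100 * 84 / 700)
high_pct = round(100 * 84 / 300)

print(overall_mae, round(overall_mape), low_pct, high_pct)
```

The means come out at 84 kcal and roughly 29%, and the 84 kcal error corresponds to 12% of a 700 kcal dish and 28% of a 300 kcal dish, in line with the figures quoted above.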

Adding the menu description doesn't help

Providing the menu description (S1 → S2) produced no meaningful accuracy improvement. The AI's visual analysis of the image, combined with the dish name and restaurant context, already captures whatever nutritional signal the description provides.

Knowing calories unlocks accurate macro estimation

The single biggest accuracy improvement comes from providing restaurant-reported calories. Many UK chains are legally required to publish these.

Nutrient | Without calories (S1) | With calories (S3) | Improvement
Protein | 3.8g MAE | 2.5g MAE | 34% better
Fat | 6.7g MAE | 4.2g MAE | 37% better
Carbs | 10.5g MAE | 8.9g MAE | 15% better
Fibre | 1.9g MAE | 1.4g MAE | 26% better
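The improvement column appears to be the relative reduction in MAE between S1 and S3. The short check below assumes that definition, (before - after) / before, and reproduces the rounded percentages from the published MAE values.

```python
# (S1 MAE, S3 MAE) in grams, taken from the table above.
mae_s1_s3 = {
    "Protein": (3.8, 2.5),
    "Fat": (6.7, 4.2),
    "Carbs": (10.5, 8.9),
    "Fibre": (1.9, 1.4),
}


def improvement_pct(before: float, after: float) -> float:
    """Relative reduction in error, as a percentage of the starting error."""
    return 100 * (before - after) / before


improvements = {
    nutrient: round(improvement_pct(before, after))
    for nutrient, (before, after) in mae_s1_s3.items()
}
print(improvements)
```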

Salt, sugars, and saturated fat: an industry-wide challenge

These three fields remain difficult regardless of how much information the model has. Salt MAPE stays around 50% across all scenarios. This isn't a failure of our pipeline; it's a fundamental limitation of estimating from visual and textual cues alone. To our knowledge, no commercial nutrition app currently attempts to estimate salt from images.

How we compare to other systems

vs. Commercial nutrition apps

Most comprehensive independent benchmark: Yan et al. (2025), Nature Communications Medicine

System | Calorie MAE | Notes
Health Freak | 84 kcal | Image + dish name, GPT-4.1
DietAI24 (research) | 48 kcal | Multi-stage RAG, not commercially available
Foodvisor | 168 kcal | Commercial app
SnapCalorie | 169 kcal | Uses LIDAR depth sensors
ViT baseline | 199 kcal | Trained Vision Transformer
Calorie Mama | 277 kcal | Commercial app

Our calorie MAE of 84 kcal is approximately half that of the best commercial apps in this comparison (Foodvisor at 168 kcal, SnapCalorie at 169 kcal). We achieve this with a general-purpose LLM and a well-designed prompt, without custom-trained models, depth sensors, or proprietary food image datasets.

vs. Direct LLM benchmarks

Benchmark: Fridolfsson et al. (2025), Current Developments in Nutrition

System | Calorie MAPE
Health Freak | 29%
ChatGPT-4o | 36%
Claude 3.5 Sonnet | 36%
Gemini 1.5 Pro | 64–110%

vs. Human estimation

Validation studies using doubly-labelled water show that untrained humans underreport energy intake by 20–50%. Even trained nutrition professionals miss portion-based calorie estimates by approximately 41%. Our 29% MAPE from a standard photograph, without depth sensing, sits comfortably within this range.

What this means for you

We continue to expand our benchmark as new AI models emerge. A second model comparison (Google Gemini 2.5 Flash) is currently in progress, and we are investigating multi-stage estimation pipelines with food composition database integration.

References

Download the full white paper

Complete methodology, per-dish results, statistical tables, and full literature references.
