Our Estimation Methodology

When restaurants don't publish nutritional data, we estimate it — and we've measured exactly how accurate those estimates are.

84 kcal

Avg calorie error (image only)

29%

Calorie MAPE, lower than every LLM in the cited benchmark

2.5g

Protein error when calories known

60

Dishes blind-tested across 3 chains

Download Estimation Methodology (PDF)

We ran a controlled blind experiment on 60 real dishes from Pret A Manger, Itsu, and Farmer J, each with complete restaurant-reported nutritional data. We asked our AI to estimate those same values, then compared its outputs against the reported figures. The result: measured by calorie MAE, our single-shot estimation pipeline outperforms every commercial nutrition app in the independent benchmarks cited below.

Sourced data first. Always.

Health Freak isn't an estimation-only platform. Our data hierarchy is:

Restaurant-published data — If a restaurant reports nutritional values, we use those. Always the most accurate source.
AI-estimated data — For restaurants that publish nothing or only partial data, we use LLM-based estimation. Every estimated value is clearly labelled.
Transparent provenance — Every data point carries a tag showing whether it was sourced or estimated.

We estimate because the alternative — leaving users with no nutritional information at all — is worse. But we're honest about what's sourced and what's estimated.
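
The hierarchy above can be pictured as a small data structure. The following is a minimal sketch, not our production schema: the names Provenance, NutrientValue, and resolve are hypothetical, and the example numbers are made up purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Provenance(Enum):
    SOURCED = "sourced"      # published by the restaurant
    ESTIMATED = "estimated"  # produced by the AI estimation pipeline


@dataclass(frozen=True)
class NutrientValue:
    name: str
    amount: float
    unit: str
    provenance: Provenance   # every data point carries its tag


def resolve(name: str, unit: str,
            published: Optional[float],
            estimated: Optional[float]) -> Optional[NutrientValue]:
    """Prefer restaurant-published data; fall back to an estimate."""
    if published is not None:
        return NutrientValue(name, published, unit, Provenance.SOURCED)
    if estimated is not None:
        return NutrientValue(name, estimated, unit, Provenance.ESTIMATED)
    return None  # nothing known for this field


# Partial data (illustrative values): calories are published, salt is not.
calories = resolve("calories", "kcal", published=520.0, estimated=498.0)
salt = resolve("salt", "g", published=None, estimated=1.8)
print(calories.provenance.value, salt.provenance.value)
```

Under this sketch a published value always wins over an estimate for the same field, and the tag travels with the number so it can be shown wherever the value is displayed.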

How we tested accuracy

We selected 20 dishes from each of Pret A Manger, Itsu, and Farmer J — chosen because they represent different cuisine types and complexity levels, and all had complete nutritional data to test against. Each dish was run through four scenarios, giving the AI progressively more information:

Scenario	What the AI was given	What it had to estimate
S1: Image only	Photo + dish name	Everything
S2: + Description	S1 + menu description	Everything
S3: + Calories	S2 + reported calories	Macros, salt, sugar, sat fat
S4: + Macros	S3 + protein, fat, carbs, fibre	Salt, sugar, sat fat only

All estimates were generated by GPT-4.1 (OpenAI). A second round using Google's Gemini 2.5 Flash is in progress.
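
One way to read the scenario table is as a cumulative configuration: each scenario adds inputs to the previous one, and whatever nutritional field has not been given must be estimated. The sketch below encodes that reading; the field names and the scenario_spec helper are hypothetical, and it assumes that "Everything" means the eight fields named in the table.

```python
# Assumed full set of nutritional fields, taken from the table's wording.
ALL_FIELDS = ["calories", "protein", "fat", "carbs", "fibre",
              "salt", "sugar", "sat_fat"]

# Inputs added at each step; every scenario also includes all earlier inputs.
ADDED_INPUTS = {
    "S1": ["photo", "dish_name"],
    "S2": ["menu_description"],
    "S3": ["calories"],
    "S4": ["protein", "fat", "carbs", "fibre"],
}


def scenario_spec(scenario):
    """Return (inputs given to the model, fields it must estimate)."""
    given = []
    for name, added in ADDED_INPUTS.items():
        given.extend(added)
        if name == scenario:
            break
    else:
        raise KeyError(f"unknown scenario: {scenario}")
    to_estimate = [f for f in ALL_FIELDS if f not in given]
    return given, to_estimate


for s in ADDED_INPUTS:
    given, to_estimate = scenario_spec(s)
    print(s, "estimates:", ", ".join(to_estimate))
```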

Key findings

Calorie estimation: 84 kcal average error from a photo alone

For a typical restaurant dish of 300–700 kcal, that's an error of 12–28% — enough to reliably distinguish a light salad from a calorie-dense bowl.

Restaurant	MAE (kcal)	MAPE
Pret A Manger	68	23%
Itsu	63	27%
Farmer J	121	36%
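
For readers who want to check the arithmetic, here is a short sketch assuming the standard definitions of MAE (mean absolute error) and MAPE (mean absolute percentage error). The mae and mape helpers and the two-dish toy example are illustrative; the per-chain numbers are copied from the table above.

```python
def mae(actual, predicted):
    """Mean absolute error, in the same unit as the inputs."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, relative to the actual values."""
    return 100 * sum(abs(p - a) / a for a, p in zip(actual, predicted)) / len(actual)


# Toy example (made-up dishes): reported vs. estimated calories.
print(mae([400, 600], [450, 540]))   # errors of 50 and 60 kcal

# Per-chain results from the table. Each chain contributed 20 dishes,
# so the overall figure is the plain mean of the three chain values.
chain_mae = {"Pret A Manger": 68, "Itsu": 63, "Farmer J": 121}
chain_mape = {"Pret A Manger": 23, "Itsu": 27, "Farmer J": 36}

overall_mae = sum(chain_mae.values()) / len(chain_mae)      # 84.0 kcal
overall_mape = sum(chain_mape.values()) / len(chain_mape)   # rounds to 29%

# Relative size of an 84 kcal error at each end of the 300-700 kcal range.
low_pct = 100 * 84 / 700    # 12.0
high_pct = 100 * 84 / 300   # 28.0
print(overall_mae, round(overall_mape), low_pct, high_pct)
```

Because the table values are already rounded, averaging them reproduces the headline figures only to the nearest whole number.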

Adding the menu description doesn't help

Providing the menu description (S1 → S2) produced no meaningful accuracy improvement. The AI's visual analysis of the image, combined with the dish name and restaurant context, already captures whatever nutritional signal the description provides.

Knowing calories unlocks accurate macro estimation

The single biggest accuracy improvement comes from providing restaurant-reported calories. Many UK chains are legally required to publish these.

Nutrient	Without calories (S1)	With calories (S3)	Improvement
Protein	3.8g MAE	2.5g MAE	34% better
Fat	6.7g MAE	4.2g MAE	37% better
Carbs	10.5g MAE	8.9g MAE	15% better
Fibre	1.9g MAE	1.4g MAE	26% better
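
The "Improvement" column is the relative reduction in MAE between S1 and S3. A minimal sketch of that calculation, using the values from the table (the improvement_pct helper is an illustrative name):

```python
def improvement_pct(before, after):
    """Relative reduction in error, as a percentage of the starting error."""
    return 100 * (before - after) / before


# (MAE without calories, MAE with calories), in grams, from the table above.
mae_s1_s3 = {
    "Protein": (3.8, 2.5),
    "Fat": (6.7, 4.2),
    "Carbs": (10.5, 8.9),
    "Fibre": (1.9, 1.4),
}

for nutrient, (s1, s3) in mae_s1_s3.items():
    print(f"{nutrient}: {improvement_pct(s1, s3):.0f}% better")
```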

Salt, sugars, and saturated fat: an industry-wide challenge

These three fields remain difficult regardless of how much information the model has. Salt MAPE stays around 50% across all four scenarios. This isn't a failure of our pipeline: it's a fundamental limitation of estimating from visual and textual cues alone. To our knowledge, no commercial nutrition app currently attempts to estimate salt from images.

How we compare to other systems

vs. Commercial nutrition apps

Most comprehensive independent benchmark: Yan et al. (2025), Communications Medicine

System	Calorie MAE	Notes
Health Freak	84 kcal	Image + dish name, GPT-4.1
DietAI24 (research)	48 kcal	Multi-stage RAG, not commercially available
Foodvisor	168 kcal	Commercial app
SnapCalorie	169 kcal	Uses LiDAR depth sensors
ViT baseline	199 kcal	Trained Vision Transformer
Calorie Mama	277 kcal	Commercial app

Our calorie MAE is approximately half that of the best commercial apps (84 kcal vs. 168 kcal for Foodvisor). We achieved this using a general-purpose LLM with a well-designed prompt, without custom-trained models, depth sensors, or proprietary food image datasets.

vs. Direct LLM benchmarks

Benchmark: Fridolfsson et al. (2025), Current Developments in Nutrition

System	Calorie MAPE
Health Freak	29%
ChatGPT-4o	36%
Claude 3.5 Sonnet	36%
Gemini 1.5 Pro	64–110%

vs. Human estimation

Validation studies using doubly labelled water show that untrained humans underreport energy intake by 20–50%. Even trained nutrition professionals miss portion-based calorie estimates by approximately 41%. Our 29% MAPE, from a standard photograph without depth sensing, sits within the untrained range and below the professionals' figure.

What this means for you

When we show sourced data, it's direct from the restaurant — as accurate as the restaurant's own measurements.
When we show estimated data, it's clearly labelled, and our calorie estimates are more accurate than those of any commercial nutrition app in the published benchmarks we compared against.
For less reliable fields (particularly salt, sugars, saturated fat), our scoring algorithms degrade gracefully — weighting sourced data more heavily and applying wider confidence bands.
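
To make "wider confidence bands" concrete, here is a purely illustrative sketch. It is not our scoring algorithm: the function names, the fallback value, and the choice to use MAPE as a band half-width and as a down-weighting factor are all assumptions. The two error figures are the image-only calorie MAPE and the approximate salt MAPE quoted on this page.

```python
# Hypothetical relative-error table (fractions, not percentages).
ESTIMATE_REL_ERROR = {"calories": 0.29, "salt": 0.50}
DEFAULT_REL_ERROR = 0.50  # assumed fallback for fields without a measured figure


def confidence_band(field, value, sourced):
    """Return (low, high); sourced values get a zero-width band."""
    if sourced:
        return (value, value)
    rel = ESTIMATE_REL_ERROR.get(field, DEFAULT_REL_ERROR)
    return (value * (1 - rel), value * (1 + rel))


def score_weight(field, sourced):
    """Weight sourced data fully; down-weight estimates by their relative error."""
    if sourced:
        return 1.0
    return 1.0 - ESTIMATE_REL_ERROR.get(field, DEFAULT_REL_ERROR)


# An estimated 2.0 g of salt gets a wide band and half the weight.
print(confidence_band("salt", 2.0, sourced=False))
print(score_weight("salt", sourced=False), score_weight("salt", sourced=True))
```

The design idea is only that the less reliable a field's estimates are, the wider its band and the less it should influence a score.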

We continue to expand our benchmark as new AI models emerge. A second model comparison (Google Gemini 2.5 Flash) is currently in progress, and we are investigating multi-stage estimation pipelines with food composition database integration.

References

Yan, R. et al. (2025). "DietAI24 as a framework for comprehensive nutrition estimation using multimodal large language models." Communications Medicine, 5, 458.
Fridolfsson, J. et al. (2025). "Performance Evaluation of 3 Large Language Models for Nutritional Content Estimation from Food Images." Current Developments in Nutrition, 9(10), 107556.
Chotwanvirat, P. et al. (2024). "Advancements in Using AI for Dietary Assessment Based on Food Images: Scoping Review." Journal of Medical Internet Research, 26, e51432.
Li, X. et al. (2024). "Evaluating the Quality and Comparative Validity of Manual Food Logging and AI-Enabled Food Image Recognition in Apps for Nutrition Care." Nutrients, 16(15), 2573.
Azimi, I. et al. (2025). "Evaluation of LLMs accuracy and consistency in the registered dietitian exam through prompt engineering and knowledge retrieval." Scientific Reports, 15, 1506.

Download the full white paper

Complete methodology, per-dish results, statistical tables, and full literature references.

Download Estimation Methodology (PDF)