📊 Performance Benchmarks

MiroFlow achieves state-of-the-art performance across multiple agentic benchmarks, demonstrating its effectiveness in complex reasoning and tool-use tasks.


Performance on Future Prediction

Future X Benchmark Results

MiroFlow demonstrates exceptional performance in future prediction tasks.

[Figure: Future X performance results]


✨ Performance on Benchmarks

Comprehensive Benchmark Analysis

We evaluate MiroFlow on a series of benchmarks, including GAIA, HLE, BrowseComp, and xBench-DeepSearch.

[Figure: Comprehensive benchmark performance comparison]


Other Benchmark Results

Detailed Performance Comparison

Comprehensive comparison across multiple benchmark categories and competing frameworks.

Reasoning & Language Understanding

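| Model/Framework      | GAIA Val | HLE   | HLE-Text |
|----------------------|----------|-------|----------|
| MiroFlow             | 82.4%    | 27.2% | 29.5%    |
| OpenAI Deep Research | 67.4%    | 26.6% | -        |
| Gemini Deep Research | -        | 26.9% | -        |
| Kimi Researcher      | -        | -     | 26.9%    |
| WebSailor-72B        | 55.4%    | -     | -        |
| Manus                | 73.3%    | -     | -        |
| DeepSeek v3.1        | -        | -     | 29.8%    |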

Web Browsing & Search Tasks

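| Model/Framework      | BrowseComp-EN | BrowseComp-ZH | xBench-DeepSearch |
|----------------------|---------------|---------------|-------------------|
| MiroFlow             | 33.2%         | 47.1%         | 72.0%             |
| OpenAI Deep Research | 51.5%         | 42.9%         | -                 |
| Gemini Deep Research | -             | -             | 50+%              |
| Kimi Researcher      | -             | -             | 69.0%             |
| WebSailor-72B        | -             | 30.1%         | 55.0%             |
| DeepSeek v3.1        | -             | -             | 71.2%             |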
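
To make the per-benchmark comparison easier to read, the following is a minimal Python sketch that finds the top-scoring system on each benchmark and its margin over the runner-up. The scores are copied from the two tables above. The `SCORES` dictionary and the `leaders` helper are illustrative names, not part of MiroFlow. Unreported entries ("-") are omitted, and the "50+%" entry for Gemini Deep Research is left out because it is a lower bound rather than an exact score.

```python
# Scores (in percent) copied from the two comparison tables.
# Entries reported as "-" are omitted. Gemini Deep Research's "50+%" on
# xBench-DeepSearch is a lower bound, so it is excluded from the ranking.
SCORES = {
    "GAIA Val": {"MiroFlow": 82.4, "OpenAI Deep Research": 67.4,
                 "WebSailor-72B": 55.4, "Manus": 73.3},
    "HLE": {"MiroFlow": 27.2, "OpenAI Deep Research": 26.6,
            "Gemini Deep Research": 26.9},
    "HLE-Text": {"MiroFlow": 29.5, "Kimi Researcher": 26.9,
                 "DeepSeek v3.1": 29.8},
    "BrowseComp-EN": {"MiroFlow": 33.2, "OpenAI Deep Research": 51.5},
    "BrowseComp-ZH": {"MiroFlow": 47.1, "OpenAI Deep Research": 42.9,
                      "WebSailor-72B": 30.1},
    "xBench-DeepSearch": {"MiroFlow": 72.0, "Kimi Researcher": 69.0,
                          "WebSailor-72B": 55.0, "DeepSeek v3.1": 71.2},
}


def leaders(scores):
    """Return {benchmark: (top_system, top_score, margin_over_runner_up)}."""
    result = {}
    for bench, row in scores.items():
        # Sort systems by score, highest first.
        ranked = sorted(row.items(), key=lambda kv: kv[1], reverse=True)
        (top, top_score), (_, second_score) = ranked[0], ranked[1]
        result[bench] = (top, top_score, round(top_score - second_score, 1))
    return result


if __name__ == "__main__":
    for bench, (system, score, margin) in leaders(SCORES).items():
        print(f"{bench}: {system} {score}% (+{margin} pts over runner-up)")
```

Because many cells are unreported, each margin reflects only the systems with a published score on that benchmark, not a full head-to-head comparison.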

Documentation Info

Last Updated: September 2025 · Doc Contributor: Team @ MiroMind AI