Performance Benchmarks
MiroFlow achieves state-of-the-art results on several agentic benchmarks, including GAIA validation (82.4%) and xBench-DeepSearch (72.0%), demonstrating its effectiveness in complex reasoning and tool-use tasks.
Performance on Future Prediction
FutureX Benchmark Results
MiroFlow demonstrates exceptional performance in future prediction tasks.
Performance on Benchmarks
Comprehensive Benchmark Analysis
We evaluate MiroFlow on a series of benchmarks, including GAIA, HLE, BrowseComp, and xBench-DeepSearch.
Other Benchmark Results
Detailed Performance Comparison
The tables below compare MiroFlow with competing frameworks across several benchmark categories. A dash (-) marks a score that is not reported.
Reasoning & Language Understanding
Model/Framework | GAIA Val | HLE | HLE-Text |
---|---|---|---|
MiroFlow | 82.4% | 27.2% | 29.5% |
OpenAI Deep Research | 67.4% | 26.6% | - |
Gemini Deep Research | - | 26.9% | - |
Kimi Researcher | - | - | 26.9% |
WebSailor-72B | 55.4% | - | - |
Manus | 73.3% | - | - |
DeepSeek v3.1 | - | - | 29.8% |
Web Browsing & Search Tasks
Model/Framework | BrowseComp-EN | BrowseComp-ZH | xBench-DeepSearch |
---|---|---|---|
MiroFlow | 33.2% | 47.1% | 72.0% |
OpenAI Deep Research | 51.5% | 42.9% | - |
Gemini Deep Research | - | - | 50+% |
Kimi Researcher | - | - | 69.0% |
WebSailor-72B | - | 30.1% | 55.0% |
DeepSeek v3.1 | - | - | 71.2% |
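For readers who want to work with these numbers programmatically, the following is a minimal sketch of how the table above could be parsed and the top reported score per benchmark found. The helper names (`parse_score`, `parse_table`, `leaders`) are hypothetical and not part of MiroFlow. Treating `-` as "not reported" and `50+%` as a lower bound of 50.0 are assumptions of this sketch.

```python
from typing import Dict, Optional, Tuple

# Rows copied from the "Web Browsing & Search Tasks" table.
TABLE = """\
Model/Framework | BrowseComp-EN | BrowseComp-ZH | xBench-DeepSearch |
---|---|---|---|
MiroFlow | 33.2% | 47.1% | 72.0% |
OpenAI Deep Research | 51.5% | 42.9% | - |
Gemini Deep Research | - | - | 50+% |
Kimi Researcher | - | - | 69.0% |
WebSailor-72B | - | 30.1% | 55.0% |
DeepSeek v3.1 | - | - | 71.2% |
"""

Scores = Dict[str, Dict[str, Optional[float]]]


def parse_score(cell: str) -> Optional[float]:
    """Convert a table cell to a float, or None when the cell is '-'.

    Assumption: a value such as '50+%' is read as its lower bound (50.0).
    """
    cell = cell.strip()
    if cell == "-":
        return None
    return float(cell.rstrip("%").rstrip("+"))


def split_row(line: str) -> list:
    """Split one pipe-delimited row into stripped cells."""
    return [c.strip() for c in line.strip().strip("|").split("|")]


def parse_table(text: str) -> Scores:
    """Map each model name to a {benchmark: score} dictionary."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    benchmarks = split_row(lines[0])[1:]
    table: Scores = {}
    for line in lines[2:]:  # skip the header and the ---|--- separator
        cells = split_row(line)
        table[cells[0]] = {
            name: parse_score(cell) for name, cell in zip(benchmarks, cells[1:])
        }
    return table


def leaders(table: Scores) -> Dict[str, Tuple[str, float]]:
    """Return the (model, score) with the highest reported score per benchmark."""
    benchmarks = next(iter(table.values())).keys()
    result = {}
    for name in benchmarks:
        reported = {m: s[name] for m, s in table.items() if s[name] is not None}
        best = max(reported, key=reported.get)
        result[name] = (best, reported[best])
    return result


if __name__ == "__main__":
    for benchmark, (model, score) in leaders(parse_table(TABLE)).items():
        print(f"{benchmark}: {model} ({score:.1f}%)")
```

Because unreported cells are skipped rather than treated as zero, a model is only ranked on benchmarks where a score is listed for it.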
Documentation Info
Last Updated: September 2025 · Doc Contributor: Team @ MiroMind AI