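EvalRank: OpenClaw Model Decision Site (Chinese-language edition)
Built on PinchBench v1.0 evaluation data for AI coding agents, this site offers Chinese-language commentary, notes on whether each model can be called from within mainland China, and guidance on choosing Claw-like products.
Currently listed: 50 models and 2 Claw-like products.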
| # | Model | Vendor | Best% | Avg% | Open source |
|---|---|---|---|---|---|
| 1 | GPT-5.4 | OpenAI | 90.5% | 81.6% | ✗ |
| 2 | Qwen3.5-27B | Alibaba Cloud (Qwen) | 90% | 78.5% | ✓ |
| 3 | qwen3.5-397b-a17b | Alibaba Cloud (Qwen) | 89.1% | 80.4% | ✓ |
| 4 | Claude Sonnet 4.5 | Anthropic | 88.2% | 80.7% | ✗ |
| 5 | claude-sonnet-4.6 | Anthropic | 88% | 80% | ✗ |
| 6 | MiniMax M2.5 | MiniMax | 87.8% | 79.3% | ✓ |
| 7 | claude-opus-4.6 | Anthropic | 87.4% | 82.3% | ✗ |
| 8 | claude-opus-4.5 | Anthropic | 87.2% | 79.2% | ✗ |
| 9 | minimax-m2.7 | MiniMax | 87.1% | 81.8% | ✗ |
| 10 | gemini-3.1-pro-preview | Google | 86.7% | 75.9% | ✗ |
| 11 | glm-5-turbo | Z AI | 86.5% | 81.6% | ✗ |
| 12 | glm-5 | Z AI | 86.4% | 80.3% | ✓ |
| 13 | qwen3.5-plus-02-15 | Alibaba Cloud (Qwen) | 85.8% | 79.1% | ✗ |
| 14 | glm-4.5-air | Z AI | 85.7% | 77.3% | ✓ |
| 15 | nemotron-3-super-120b-a12b | NVIDIA | 85.6% | 77.3% | ✓ |
| 16 | mimo-v2-omni | Xiaomi | 85.6% | 81.8% | ✗ |
| 17 | qwen3.5-122b-a10b | Alibaba Cloud (Qwen) | 85.5% | 80.8% | ✓ |
| 18 | step-3.5-flash | StepFun | 85.3% | 76.1% | ✓ |
| 19 | gemini-3-flash-preview | Google | 85.2% | 74% | ✗ |
| 20 | kimi-k2.5 | Moonshot AI | 84.8% | 78.9% | ✓ |
| 21 | deepseek-v3.2 | DeepSeek | 84.3% | 69.4% | ✓ |
| 22 | minimax-m2.1 | MiniMax | 84.3% | 79.9% | ✓ |
| 23 | mimo-v2-pro | Xiaomi | 84% | 81% | ✗ |
| 24 | hunter-alpha | OpenRouter | 83.3% | 77.3% | ✗ |
| 25 | grok-4.1-fast | xAI | 82.4% | 71% | ✗ |
| 26 | devstral-2512 | Mistral AI | 82% | 74.9% | ✓ |
| 27 | claude-haiku-4.5 | Anthropic | 82% | 76.6% | ✗ |
| 28 | mimo-v2-flash | Xiaomi | 81.5% | 67.5% | ✗ |
| 29 | healer-alpha | OpenRouter | 80.8% | 77.3% | ✗ |
| 30 | claude-sonnet-4 | Anthropic | 80.5% | 80.5% | ✗ |
| 31 | qwen3-max-thinking | Alibaba Cloud (Qwen) | 80.3% | 71.8% | ✗ |
| 32 | gpt-5-mini | OpenAI | 80.3% | 68.9% | ✗ |
| 33 | qwen3-coder-next | Alibaba Cloud (Qwen) | 79.1% | 79.1% | ✓ |
| 34 | qwen3.5-35b-a3b | Alibaba Cloud (Qwen) | 78.4% | 71.7% | ✓ |
| 35 | mercury-2 | Inception | 78% | 70% | ✗ |
| 36 | trinity-large-preview:free | Arcee AI | 77.7% | 65.1% | ✓ |
| 37 | gpt-4o-mini | OpenAI | 75% | 62.7% | ✗ |
| 38 | nemotron-3-super-120b-a12b:free | NVIDIA | 75% | 69.6% | ✗ |
| 39 | trinity-large-preview | Arcee AI | 74.3% | 67% | ✗ |
| 40 | mistral-large-2512 | Mistral AI | 72.2% | 65.4% | ✓ |
| 41 | gemini-2.5-pro | Google | 71.9% | 65.3% | ✗ |
| 42 | deepseek-chat | DeepSeek | 71.7% | 63.4% | ✓ |
| 43 | gpt-4o | OpenAI | 71.1% | 54.2% | ✗ |
| 44 | gemini-3-pro-preview | Google | 70.7% | 67.7% | ✗ |
| 45 | gemini-2.5-flash | Google | 70.7% | 58.2% | ✗ |
| 46 | gpt-5-nano | OpenAI | 68.8% | 57.1% | ✗ |
| 47 | gpt-oss-20b | OpenAI | 66% | 48.8% | ✓ |
| 48 | gpt-oss-120b | OpenAI | 60.6% | 47.7% | ✓ |
| 49 | llama-4-maverick | Meta | 46.1% | 34.8% | ✓ |
| 50 | qwen-2.5-7b-instruct | Alibaba Cloud (Qwen) | 40.3% | 34.1% | ✓ |
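
The table is ordered by Best%, which rewards a model's single strongest run. If consistency matters more, the same rows can be re-ranked by Avg%. The sketch below does this for the first seven rows only, with values copied from the table. The function names `rank_by_avg` and `best_avg_gap` are illustrative and not part of EvalRank or PinchBench.

```python
# First seven rows of the leaderboard: (model, best_pct, avg_pct).
# Values are copied from the table; the tuple layout is an assumption
# made for this example, not a PinchBench data format.
ROWS = [
    ("GPT-5.4", 90.5, 81.6),
    ("Qwen3.5-27B", 90.0, 78.5),
    ("qwen3.5-397b-a17b", 89.1, 80.4),
    ("Claude Sonnet 4.5", 88.2, 80.7),
    ("claude-sonnet-4.6", 88.0, 80.0),
    ("MiniMax M2.5", 87.8, 79.3),
    ("claude-opus-4.6", 87.4, 82.3),
]


def rank_by_avg(rows):
    """Sort rows by Avg% descending; ties are broken by Best% descending."""
    return sorted(rows, key=lambda r: (r[2], r[1]), reverse=True)


def best_avg_gap(row):
    """Gap between the best run and the average, in percentage points."""
    _, best, avg = row
    return round(best - avg, 1)


if __name__ == "__main__":
    for row in rank_by_avg(ROWS):
        model, best, avg = row
        print(f"{model:20s} avg={avg:5.1f} best={best:5.1f} gap={best_avg_gap(row):4.1f}")
```

Among these seven rows, claude-opus-4.6 moves to the top when sorted by Avg%, and Qwen3.5-27B shows the widest gap between its best and average scores (11.5 points).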