EvalRank: OpenClaw Chinese-Language Model Decision Site

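Built on PinchBench v1.0 evaluation data for AI coding agents, this site provides Chinese-language analysis, notes on whether each model can be called from within mainland China, and recommendations for choosing among Claw-like products.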

Currently listed: 50 models and 2 Claw-like products.

| # | Model | Provider | Best% | Avg% | Open source |
|---|---|---|---|---|---|
| 1 | GPT-5.4 | OpenAI | 90.5% | 81.6% | |
| 2 | Qwen3.5-27B | Alibaba Cloud (Tongyi Qianwen) | 90% | 78.5% | |
| 3 | qwen3.5-397b-a17b | Alibaba Cloud (Tongyi Qianwen) | 89.1% | 80.4% | |
| 4 | Claude Sonnet 4.5 | Anthropic | 88.2% | 80.7% | |
| 5 | claude-sonnet-4.6 | Anthropic | 88% | 80% | |
| 6 | MiniMax M2.5 | MiniMax | 87.8% | 79.3% | |
| 7 | claude-opus-4.6 | Anthropic | 87.4% | 82.3% | |
| 8 | claude-opus-4.5 | Anthropic | 87.2% | 79.2% | |
| 9 | minimax-m2.7 | MiniMax | 87.1% | 81.8% | |
| 10 | gemini-3.1-pro-preview | Google | 86.7% | 75.9% | |
| 11 | glm-5-turbo | Z AI | 86.5% | 81.6% | |
| 12 | glm-5 | Z AI | 86.4% | 80.3% | |
| 13 | qwen3.5-plus-02-15 | Alibaba Cloud (Tongyi Qianwen) | 85.8% | 79.1% | |
| 14 | glm-4.5-air | Z AI | 85.7% | 77.3% | |
| 15 | nemotron-3-super-120b-a12b | NVIDIA | 85.6% | 77.3% | |
| 16 | mimo-v2-omni | Xiaomi | 85.6% | 81.8% | |
| 17 | qwen3.5-122b-a10b | Alibaba Cloud (Tongyi Qianwen) | 85.5% | 80.8% | |
| 18 | step-3.5-flash | StepFun | 85.3% | 76.1% | |
| 19 | gemini-3-flash-preview | Google | 85.2% | 74% | |
| 20 | kimi-k2.5 | Moonshot AI | 84.8% | 78.9% | |
| 21 | deepseek-v3.2 | DeepSeek | 84.3% | 69.4% | |
| 22 | minimax-m2.1 | MiniMax | 84.3% | 79.9% | |
| 23 | mimo-v2-pro | Xiaomi | 84% | 81% | |
| 24 | hunter-alpha | OpenRouter | 83.3% | 77.3% | |
| 25 | grok-4.1-fast | xAI | 82.4% | 71% | |
| 26 | devstral-2512 | Mistral AI | 82% | 74.9% | |
| 27 | claude-haiku-4.5 | Anthropic | 82% | 76.6% | |
| 28 | mimo-v2-flash | Xiaomi | 81.5% | 67.5% | |
| 29 | healer-alpha | OpenRouter | 80.8% | 77.3% | |
| 30 | claude-sonnet-4 | Anthropic | 80.5% | 80.5% | |
| 31 | qwen3-max-thinking | Alibaba Cloud (Tongyi Qianwen) | 80.3% | 71.8% | |
| 32 | gpt-5-mini | OpenAI | 80.3% | 68.9% | |
| 33 | qwen3-coder-next | Alibaba Cloud (Tongyi Qianwen) | 79.1% | 79.1% | |
| 34 | qwen3.5-35b-a3b | Alibaba Cloud (Tongyi Qianwen) | 78.4% | 71.7% | |
| 35 | mercury-2 | Inception | 78% | 70% | |
| 36 | trinity-large-preview:free | Arcee AI | 77.7% | 65.1% | |
| 37 | gpt-4o-mini | OpenAI | 75% | 62.7% | |
| 38 | nemotron-3-super-120b-a12b:free | NVIDIA | 75% | 69.6% | |
| 39 | trinity-large-preview | Arcee AI | 74.3% | 67% | |
| 40 | mistral-large-2512 | Mistral AI | 72.2% | 65.4% | |
| 41 | gemini-2.5-pro | Google | 71.9% | 65.3% | |
| 42 | deepseek-chat | DeepSeek | 71.7% | 63.4% | |
| 43 | gpt-4o | OpenAI | 71.1% | 54.2% | |
| 44 | gemini-3-pro-preview | Google | 70.7% | 67.7% | |
| 45 | gemini-2.5-flash | Google | 70.7% | 58.2% | |
| 46 | gpt-5-nano | OpenAI | 68.8% | 57.1% | |
| 47 | gpt-oss-20b | OpenAI | 66% | 48.8% | |
| 48 | gpt-oss-120b | OpenAI | 60.6% | 47.7% | |
| 49 | llama-4-maverick | Meta | 46.1% | 34.8% | |
| 50 | qwen-2.5-7b-instruct | Alibaba Cloud (Tongyi Qianwen) | 40.3% | 34.1% | |