Search: "Silicon,LLM" - LargitData Blog

rapid-mlx vs oMLX: MLX Inference Benchmark on Apple M5 Max (35B LLM)

LargitData
MLX,Apple Silicon,LLM benchmark,QubicX,on-premise AI,M5 Max,LargitData
May 10, 2026, 2:43 p.m.

Head-to-head MLX inference benchmark on Apple M5 Max (64 GB unified memory): rapid-mlx vs omlx vs dflash-mlx vs mlx-vlm running a 35B LLM — tokens/sec, time-to-first-token, memory use, and which framework to pick in 2026.

MLX Inference Framework Benchmark: Hands-On Comparison Running a 35B LLM on Apple Silicon M5 Max

LargitData
MLX,Apple Silicon,LLM benchmark,QubicX,on-premise AI,M5 Max,LargitData (大數軟體)
May 10, 2026, 2:43 p.m.

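A hands-on benchmark of four major MLX inference frameworks (rapid-mlx, omlx, dflash-mlx, and mlx-vlm) running a quantized 35B MoE model on an Apple M5 Max (64 GB unified memory). It compares decode speed, TTFT, and stability across seven context lengths from 64 to 32K tokens, and offers framework selection recommendations for enterprise on-premise AI. The raw benchmark data is open-sourced at ywchiu/mlx_benchmark_lab.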

LargitData: Enterprise Intelligence and Risk AI Platform

Blog: LargitData technical articles on AI and big data
