Search results for "Max,LargitData" - LargitData Blog

MLX Inference Benchmark: 4 Frameworks on Apple M5 Max with 35B LLM

LargitData
MLX, Apple Silicon, LLM benchmark, QubicX, on-premise AI, M5 Max, LargitData
May 10, 2026, 2:43 p.m.

A benchmark of four MLX inference frameworks (rapid-mlx, omlx, dflash-mlx, mlx-vlm) on an Apple M5 Max with 64 GB of unified memory, running a 35B quantized MoE model across seven context lengths (64 to 32K tokens). It reports decode speed, time to first token (TTFT), and stability, and closes with a framework selection guide for enterprise on-premise AI. Source data: ywchiu/mlx_benchmark_lab.

