MLX LM 0.20.1 has speed comparable to llama.cpp with flash attention (old.reddit.com)
1 point by tosh 2 hours ago






