I tested Muon vs MuonClip vs Muon+AdamW for fine-tuning LLMs
Just published a blog post on that. Read it here: https://huggingface.co/blog/KingNish/optimizer-part1