randomicity

AI & ML interests

None yet

Recent Activity

upvoted an article 7 days ago

The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator

upvoted a collection about 1 month ago

Nordic datasets with Fineweb-edu predictions

reacted to onekq's post with 👍 about 1 month ago

Instead of an architectural upgrade, each major model drop nowadays perfects a regional innovation. What Kimi brought into the spotlight this time is quantization-aware training (QAT). I wrote an article to explain it and why it matters to reasoning models. https://huggingface.co/blog/onekq/qat-bonsai If you are interested in this kind of post, I will introduce the Muon optimizers, another technology behind Kimi's success.


Organizations

None yet
