Tung-Lin Wu

tunglinwu · tunglinwood

AI & ML interests

None yet

Recent Activity

new activity 7 days ago

ResembleAI/chatterbox-turbo:Supported languages

liked a model 7 days ago

ResembleAI/chatterbox-turbo

upvoted an article 29 days ago

Continuous batching from first principles

Organizations

None yet

upvoted an article 29 days ago

Article

Continuous batching from first principles

Nov 25, 2025 • 296

upvoted a collection 5 months ago

DeepSeek-V3.1

4 items • Updated Nov 27, 2025 • 257

upvoted a collection 8 months ago

Qwen3

84 items • Updated 6 days ago • 1.54k

upvoted a collection 9 months ago

GLM-4-0414

GLM-4-0414 series model • 8 items • Updated Jun 30, 2025 • 133

upvoted 2 papers 9 months ago

OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models

Paper • 2502.01061 • Published Feb 3, 2025 • 222

Training Sparse Mixture Of Experts Text Embedding Models

Paper • 2502.07972 • Published Feb 11, 2025 • 9

upvoted a collection 9 months ago

Qwen2.5-Omni

End-to-End Omni (text, audio, image, video, and natural speech interaction) model based Qwen2.5 • 7 items • Updated 6 days ago • 160

upvoted a paper 10 months ago

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 138

upvoted an article 10 months ago

Article

Training and Finetuning Embedding Models with Sentence Transformers v3

May 28, 2024 • 262

upvoted 2 papers 10 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4, 2025 • 253

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment

Paper • 2502.16894 • Published Feb 24, 2025 • 32

upvoted a paper 11 months ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22, 2025 • 433

upvoted 3 articles 11 months ago

Article

What is test-time compute and how to scale it?

Feb 6, 2025 • 110

Article

Mixture of Experts Explained

Dec 11, 2023 • 1.02k

Article

Open-source DeepResearch – Freeing our search agents

Feb 4, 2025 • 1.31k

upvoted a collection about 1 year ago

Llama 3.3

This collection hosts the transformers and original repos of the Llama 3.3 • 1 item • Updated Dec 6, 2024 • 191

upvoted a paper about 1 year ago

HelpSteer2-Preference: Complementing Ratings with Preferences

Paper • 2410.01257 • Published Oct 2, 2024 • 24

upvoted a collection about 1 year ago

Emu3

Emu3: Next-Token Prediction is All You Need • 7 items • Updated Feb 13, 2025 • 79

upvoted a paper over 1 year ago

NVLM: Open Frontier-Class Multimodal LLMs

Paper • 2409.11402 • Published Sep 17, 2024 • 74

upvoted a collection over 1 year ago

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 649