Models

75,523

Active filters: reinforcement-learning

ByteDance/UniVR-34B-Planning

Image-Text-to-Text • Updated 3 days ago • 15 • 11

Adilbai/stock-trading-rl-agent

Reinforcement Learning • Updated Jan 8 • 125 • 167

OpenMOSS-Team/MOSS-Transcribe-preview-2B

Automatic Speech Recognition • 2B • Updated 23 days ago • 2.33k • 37

Danau5tin/ai-trains-ai-trainer

Text Generation • Updated 7 days ago • 12 • 4

MooreThreads/MusaCoder-27B

Reinforcement Learning • 3.05M • Updated Jun 10 • 232 • 59

Tesleum/Shirdel-Finance-E4B

Reinforcement Learning • 8B • Updated 18 days ago • 445 • 5

mradermacher/Tifa-Deepsex-14b-CoT-GGUF

Reinforcement Learning • 15B • Updated Jul 31, 2025 • 732 • 26

PrimeIntellect/INTELLECT-3

Text Generation • 107B • Updated Nov 27, 2025 • 2.93k • 216

zai-org/GLM-TTS

Text-to-Speech • Updated Jan 12 • 1.03k • 344

nvidia/GEAR-SONIC

Reinforcement Learning • Updated Jun 17 • 58

Jinyang23/Seed-AlfWorld-3B

Text Generation • 3B • Updated 2 days ago • 426 • 2

sb3/dqn-MountainCar-v0

Reinforcement Learning • Updated Oct 11, 2022 • 20 • 2

LLParallax/sf_finetuning_forgetting_human_monk

Reinforcement Learning • Updated Apr 7, 2024 • 1

tensorblock/DeepSeek-R1-Medical-COT-GGUF

Reinforcement Learning • 8B • Updated Jan 27 • 71 • 3

bartowski/THU-KEG_LongWriter-Zero-32B-GGUF

Text Generation • 33B • Updated Jun 26, 2025 • 1.05k • 5

shiviktech/Trident

Text Generation • 4B • Updated Jan 7 • 5

nvidia/finite-difference-flow-optimization

Text-to-Image • Updated Mar 16 • 2

nvidia/NitroGen

Reinforcement Learning • Updated Feb 5 • 556

Maincode/Maincoder-1B

Text Generation • 1B • Updated May 13 • 288 • 95

OpenDataArena/ODA-Fin-RL-8B

Reinforcement Learning • 8B • Updated Mar 10 • 23 • 4

SolarSys2026/EnergyTrading

Reinforcement Learning • Updated Feb 8 • 2

Tzafon/Northstar-CUA-Fast

Image-Text-to-Text • 5B • Updated Apr 2 • 173 • 6

Aion2/llama3.2-3b-grpo-v1

Text Generation • Updated 4 days ago • 1

simplex-ai-inc/LiteResearcher-4B

Text Generation • 4B • Updated Apr 22 • 86 • 6

bue0912/ToolOmni-Qwen3-4B

Text Generation • 4B • Updated Apr 16 • 11 • 4

InternScience/Agents-K1

Text Generation • 4B • Updated Jun 12 • 856 • 29

Kerimhan1/kerimhanipip

Reinforcement Learning • Updated May 8 • 1

twnlp/ChineseErrorCorrector4-4B

Text Generation • 4B • Updated Jun 7 • 383 • 7

harryhsing/OmniAgent-RL-7B

Video-Text-to-Text • 9B • Updated Jun 18 • 59 • 1

sanju-1007/rl_course_vizdoom_health_gathering_supreme

Reinforcement Learning • Updated Jun 12 • 1