ValueFX9507/Tifa-DeepsexV2-7b-MGRPO-GGUF-Q8 • Reinforcement Learning • 8B • Updated Mar 28 • 2.36k • 188
mradermacher/VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb-i1-GGUF • Reinforcement Learning • 8B • Updated Jul 11 • 752 • 4
JonusNattapong/Reinforcement-Learning-for-Gold-Trading-Model • Reinforcement Learning • Updated 1 day ago • 20 • 1