From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space
Paper • 2604.14142 • Published • 26
PLUME: Latent Reasoning Based Universal Multimodal Embedding