Abhranil Chandra PRO

abhranil14

AI & ML interests

Reinforcement Learning, Deep Unsupervised Learning, NLP and Bayesian Deep Learning

Recent Activity

updated a model 17 days ago

abhranil14/L8B_GSM8k_H_7473_batch256_lr1e-6_epoch10_linear

published a model 17 days ago

abhranil14/L8B_GSM8k_H_7473_batch256_lr1e-6_epoch10_linear

updated a model 17 days ago

abhranil14/L8B_GSM8k_H_Para_7473_batch256_lr1e-6_epoch10_linear

Organizations

Collections 8

Papers 5

Spaces 1

First Agent Template

Find the current local time in any timezone

Models 82

Datasets 5

abhranil14/VideoAgent_Data

Preview • Updated Jul 17, 2025 • 4

abhranil14/syn_qs_and_soln_cleaned_0_and_less20_multiple_soln_per_qs_1937545

Viewer • Updated May 12, 2025 • 1.94M • 3

abhranil14/syn_qs_and_soln_cleaned_0_and_less20_1_soln_per_qs_131845

Viewer • Updated May 12, 2025 • 132k • 11

abhranil14/instruct-human-assistant-prompt-clean-105k

Viewer • Updated Sep 18, 2024 • 105k • 12

abhranil14/first-instruct-human-assistant-prompt-clean-33k

Viewer • Updated Sep 18, 2024 • 33.1k • 14

Collections 8

AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO

R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

Offline Reinforcement Learning for LLM Multi-Step Reasoning

OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Models 82

abhranil14/L8B_GSM8k_H_7473_batch256_lr1e-6_epoch10_linear

abhranil14/L8B_GSM8k_H_Para_7473_batch256_lr1e-6_epoch10_linear

abhranil14/G2B_GSM8k_H_7473_batch64_lr2e-5_epoch10_linear

abhranil14/G2B_GSM8k_H_Para_7473_batch64_lr2e-5_epoch10_linear

abhranil14/Q1.5B_MATH_G_4408_batch64_lr2e-5_epoch10_linear

abhranil14/Q1.5B_MATH_H_4408_batch64_lr2e-5_epoch10_linear

abhranil14/Q1.5B_MATH_W_4408_batch64_lr2e-5_epoch10_linear

abhranil14/L8B_MATH_G_4408_batch256_lr1e-6_epoch10_linear

abhranil14/L8B_MATH_H_4408_batch256_lr1e-6_epoch10_linear

abhranil14/L8B_MATH_W_4408_batch256_lr1e-6_epoch10_linear
