gitlost-murali's Collections
Agentic & Multi-turn Chat

updated Jul 19, 2025

Literature for evaluating agents and multi-turn chat. Blogs: https://arize.com/blog/prompt-learning-using-english-feedback-to-optimize-llm-systems

  • CodeACT: Code Adaptive Compute-efficient Tuning Framework for Code LLMs

    Paper • 2408.02193 • Published Aug 5, 2024 • 1

  • google/frames-benchmark

    Viewer • Updated Oct 15, 2024 • 824 • 6.57k • 238

  • gaia-benchmark/GAIA

    Viewer • Updated Oct 28, 2025 • 932 • 16k • 578

  • callanwu/WebWalkerQA

    Viewer • Updated Sep 8, 2025 • 14.3k • 10k • 45

  • WebSailor: Navigating Super-human Reasoning for Web Agent

    Paper • 2507.02592 • Published Jul 3, 2025 • 123

  • Establishing Best Practices for Building Rigorous Agentic Benchmarks

    Paper • 2507.02825 • Published Jul 3, 2025 • 1

  • promptfoo/CCP-sensitive-prompts

    Viewer • Updated Jan 28, 2025 • 1.36k • 382 • 55