
Imama Shehzad (ImamaS)

AI & ML interests

None yet

Recent Activity

upvoted a paper 5 days ago
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
reacted to m-ric's post with 🔥 11 days ago

A new research paper from KAIST builds on smolagents to push the boundaries of distillation 🥳

➡️ "Distilling LLM Agent into Small Models with Retrieval and Code Tools" shows that, when distilling reasoning capability from a strong LLM ("teacher") into a smaller one ("student"), it is much better to use agent traces than CoT traces. The advantages are:

1. Improved generalization. Intuitively, this is because the agent can encounter more "surprising" results by interacting with its environment: for example, a web search called by the teacher LLM in agent mode can return results the teacher would never have generated in CoT.

2. Reduced hallucinations. The trace can't hallucinate tool-call outputs, because those outputs come from the environment itself!

Thank you @akseljoonas for mentioning this paper!
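The agent-trace vs. CoT-trace distinction in the post above can be sketched as follows. This is a minimal, hypothetical illustration of how the two kinds of distillation examples differ in structure; the function names and dictionary fields are assumptions for illustration, not code from the paper or from smolagents.

```python
def cot_example(question, teacher_reasoning, answer):
    """A CoT distillation example: only the teacher's free-form reasoning,
    so any tool-like claims inside it are unverified (and may be hallucinated)."""
    return {
        "prompt": question,
        "completion": teacher_reasoning + "\nAnswer: " + answer,
    }


def agent_example(question, steps, answer):
    """An agent-trace distillation example: each step pairs a tool call with
    the tool's REAL output (the observation), so the student is trained on
    grounded interactions rather than imagined tool results."""
    trace = []
    for step in steps:
        trace.append(f"Call: {step['tool']}({step['args']})")
        trace.append(f"Observation: {step['output']}")  # comes from the environment
    return {
        "prompt": question,
        "completion": "\n".join(trace) + "\nAnswer: " + answer,
    }


# Hypothetical usage: a web-search step whose observation surprises the teacher.
example = agent_example(
    "Who won the 2024 Nobel Prize in Physics?",
    [{"tool": "web_search", "args": "2024 Nobel Prize Physics",
      "output": "John Hopfield and Geoffrey Hinton"}],
    "John Hopfield and Geoffrey Hinton",
)
```

The key design point the post highlights: the `Observation` lines are produced by the environment, not the teacher, which is why agent traces both generalize better and cannot fabricate tool outputs.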
updated a model 4 months ago
ImamaS/telugu_summary-T5-v2

Organizations

Blog-explorers

Collections 1

Papers I have read
  • The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

    Paper • 2402.17764 • Published Feb 27, 2024 • 618

models 3

ImamaS/telugu_summary-T5-v2

Text2Text Generation • Updated Jan 27 • 3

ImamaS/setfit_model

Updated Jan 27

ImamaS/updated_tokenizer

Updated Oct 8, 2024

datasets 0

None public yet