-
Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
Paper • 2304.09842 • Published • 2 -
ReAct: Synergizing Reasoning and Acting in Language Models
Paper • 2210.03629 • Published • 27 -
Gorilla: Large Language Model Connected with Massive APIs
Paper • 2305.15334 • Published • 5 -
Reflexion: Language Agents with Verbal Reinforcement Learning
Paper • 2303.11366 • Published • 5
Collections
Discover the best community collections!
Collections including paper arxiv:2411.14257
-
Video Creation by Demonstration
Paper • 2412.09551 • Published • 9 -
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation
Paper • 2412.07589 • Published • 49 -
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
Paper • 2412.06531 • Published • 73 -
APOLLO: SGD-like Memory, AdamW-level Performance
Paper • 2412.05270 • Published • 39
-
VILA^2: VILA Augmented VILA
Paper • 2407.17453 • Published • 42 -
Octopus v4: Graph of language models
Paper • 2404.19296 • Published • 119 -
Octo-planner: On-device Language Model for Planner-Action Agents
Paper • 2406.18082 • Published • 49 -
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models
Paper • 2408.15518 • Published • 43
-
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Paper • 2411.14257 • Published • 13 -
Distinguishing Ignorance from Error in LLM Hallucinations
Paper • 2410.22071 • Published -
DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations
Paper • 2410.18860 • Published • 11 -
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
Paper • 2410.11779 • Published • 27
-
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Paper • 2411.14257 • Published • 13 -
Scaling and evaluating sparse autoencoders
Paper • 2406.04093 • Published • 3 -
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2
Paper • 2408.05147 • Published • 40 -
Disentangling Dense Embeddings with Sparse Autoencoders
Paper • 2408.00657 • Published • 1
-
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
Paper • 2409.10516 • Published • 44 -
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
Paper • 2409.11242 • Published • 7 -
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
Paper • 2409.11136 • Published • 25 -
On the Diagram of Thought
Paper • 2409.10038 • Published • 14
-
Prompt-to-Prompt Image Editing with Cross Attention Control
Paper • 2208.01626 • Published • 2 -
BERT Rediscovers the Classical NLP Pipeline
Paper • 1905.05950 • Published • 3 -
A Multiscale Visualization of Attention in the Transformer Model
Paper • 1906.05714 • Published • 2 -
Analyzing Transformers in Embedding Space
Paper • 2209.02535 • Published • 3
-
TuCo: Measuring the Contribution of Fine-Tuning to Individual Responses of LLMs
Paper • 2506.23423 • Published • 1 -
Stochastic Parameter Decomposition
Paper • 2506.20790 • Published • 1 -
Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization
Paper • 2506.10920 • Published • 6 -
From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit
Paper • 2506.03093 • Published • 2
-
Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
Paper • 2304.09842 • Published • 2 -
ReAct: Synergizing Reasoning and Acting in Language Models
Paper • 2210.03629 • Published • 27 -
Gorilla: Large Language Model Connected with Massive APIs
Paper • 2305.15334 • Published • 5 -
Reflexion: Language Agents with Verbal Reinforcement Learning
Paper • 2303.11366 • Published • 5
-
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Paper • 2411.14257 • Published • 13 -
Distinguishing Ignorance from Error in LLM Hallucinations
Paper • 2410.22071 • Published -
DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations
Paper • 2410.18860 • Published • 11 -
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
Paper • 2410.11779 • Published • 27
-
Video Creation by Demonstration
Paper • 2412.09551 • Published • 9 -
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation
Paper • 2412.07589 • Published • 49 -
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
Paper • 2412.06531 • Published • 73 -
APOLLO: SGD-like Memory, AdamW-level Performance
Paper • 2412.05270 • Published • 39
-
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Paper • 2411.14257 • Published • 13 -
Scaling and evaluating sparse autoencoders
Paper • 2406.04093 • Published • 3 -
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2
Paper • 2408.05147 • Published • 40 -
Disentangling Dense Embeddings with Sparse Autoencoders
Paper • 2408.00657 • Published • 1
-
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
Paper • 2409.10516 • Published • 44 -
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
Paper • 2409.11242 • Published • 7 -
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
Paper • 2409.11136 • Published • 25 -
On the Diagram of Thought
Paper • 2409.10038 • Published • 14
-
VILA^2: VILA Augmented VILA
Paper • 2407.17453 • Published • 42 -
Octopus v4: Graph of language models
Paper • 2404.19296 • Published • 119 -
Octo-planner: On-device Language Model for Planner-Action Agents
Paper • 2406.18082 • Published • 49 -
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models
Paper • 2408.15518 • Published • 43
-
Prompt-to-Prompt Image Editing with Cross Attention Control
Paper • 2208.01626 • Published • 2 -
BERT Rediscovers the Classical NLP Pipeline
Paper • 1905.05950 • Published • 3 -
A Multiscale Visualization of Attention in the Transformer Model
Paper • 1906.05714 • Published • 2 -
Analyzing Transformers in Embedding Space
Paper • 2209.02535 • Published • 3
-
TuCo: Measuring the Contribution of Fine-Tuning to Individual Responses of LLMs
Paper • 2506.23423 • Published • 1 -
Stochastic Parameter Decomposition
Paper • 2506.20790 • Published • 1 -
Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization
Paper • 2506.10920 • Published • 6 -
From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit
Paper • 2506.03093 • Published • 2