Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models Paper • 2504.10615 • Published Apr 14 • 1
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7 • 142
UI is a good thing Collection cool spaces with a cool UI, what could be better? • 5 items • Updated May 5 • 20
Article I Clicked “I Agree”, But What Am I Really Consenting To? By giadap • Mar 26 • 24
Model Hubs and Beyond: Analyzing Model Popularity, Performance, and Documentation Paper • 2503.15222 • Published Mar 19 • 1
The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub Paper • 2405.13058 • Published May 20, 2024 • 2
SpaceByte: Towards Deleting Tokenization from Large Language Modeling Paper • 2404.14408 • Published Apr 22, 2024 • 7
T-FREE: Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings Paper • 2406.19223 • Published Jun 27, 2024 • 11
Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information Paper • 2502.14258 • Published Feb 20 • 26
Foundation Text-Generation Models Below 360M Parameters Collection Great candidates for fine-tuning targeting Wllama and Transformers.js for mobile devices, ordered by number of parameters. • 36 items • Updated Apr 6 • 32
Finch: Prompt-guided Key-Value Cache Compression Paper • 2408.00167 • Published Jul 31, 2024 • 18
OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models Paper • 2503.08686 • Published Mar 11 • 19