Tag: llm
76 posts
MiniMax Sparse Attention: Per-Group Block Selection for Cheap Million-Token Inference
MiniMax Sparse Attention is a practical sparse-attention design for million-token LLMs - it uses a lightweight learne...
Testing MiniMax M3 on real tasks: repo refactor, screenshot debugging, and Spotify recommendations
A hands-on look at MiniMax M3 through Claude Code — what its new MiniMax Sparse Attention (MSA) is and how it differs...
Book Review: 50 ML Projects to Understand LLMs
A review of Mike X Cohen 50 ML Projects To Understand LLMs, a hands-on book that uses code, statistics, and controlle...
Testing MiniMax M2.7 via API on three real ML and coding workflows
An evaluation of MiniMax M2.7 used through Claude Code on three workflows I run regularly — writing code for a Kaggle...
DeepSeek-V4 Review: Why Million-Token Context Needs Efficient Attention, Not Just Larger Windows
DeepSeek V4 pairs a hybrid sparse-attention stack with on-policy distillation across domain specialists to bring 1M-t...
FIPO: Teaching LLMs Which Thoughts Actually Matter
FIPO - an RL algorithm that fixes one of the core limitations of RL for LLM reasoning - credit assignment. Instead of...
Book Review: Unlocking Data with Generative AI and RAG, Second Edition
A review of Keith Bourne second edition of Unlocking Data with Generative AI and RAG, covering the running example th...
Book Review: A Practical Guide to Reinforcement Learning from Human Feedback
A review of Sandip Kulkarni book on RLHF, covering its strengths as a structured learning resource, its reliance on b...
Collaborative Reinforcement Learning: Why HACRL Trains Models in Teams Instead of Isolation
HACRL proposes a new paradigm for reinforcement learning - instead of training models in isolation, multiple agents c...
Beyond Positional Bias: How DroPE Unlocks Zero-Shot Long Context in LLMs
A review of DroPE, a simple but counterintuitive method that extends LLM context length by dropping positional embedd...
Kimi k2.5 Review: Native Multimodality and Agent Swarms at 1 Trillion Parameters
A deep-dive review of Kimi K2.5, a next-generation open multimodal model that combines native vision-language trainin...
Paper Review: mHC: Manifold-Constrained Hyper-Connections
My review of the paper mHC Manifold-Constrained Hyper-Connections
Paper Review: SAM 3: Segment Anything with Concepts
Meta's unified model for detecting, segmenting, and tracking objects using text or image prompts — trained on 4M conc...
Paper Review: HunyuanImage 3.0 Technical Report
My review of the paper HunyuanImage 3.0 Technical Report
Paper Review: Chronos-2: From Univariate to Universal Forecasting
Chronos-2 extends zero-shot time series forecasting to multivariate and covariate settings with a new group attention...
Paper Review: The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
A biologically inspired LLM built as a graph of spiking neurons with Hebbian learning — it matches GPT-2 scaling whil...
Paper Review: Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing
My review of the paper Sharing is Caring Efficient LM Post-Training with Collective RL Experience Sharing
Paper Review: Group Sequence Policy Optimization
My review of the paper Group Sequence Policy Optimization
Paper Review: Subliminal Learning: Language models transmit behavioral traits via hidden signals in data
My review of the paper Subliminal Learning Language models transmit behavioral traits via hidden signals in data
Paper Review: ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
My review of the paper ProRL Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Paper Review: Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
Only ~20% of tokens actually matter when training LLMs to reason with RL. Updating the low-entropy majority actively ...
Paper Review: SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents
My review of the paper SWE-rebench An Automated Pipeline for Task Collection and Decontaminated Evaluation of Softwar...
Paper Review: Visual Planning: Lets Think Only with Images
My review of the paper Visual Planning Let's Think Only with Images
Paper Review: AlphaEvolve: A coding agent for scientific and algorithmic discovery
DeepMind's autonomous coding agent that evolves algorithms through LLM-driven iteration — it discovered the first imp...
Paper Review: AgentA/B: Automated and Scalable Web A/BTesting with Interactive LLM Agents
My review of the paper AgentA/B Automated and Scalable Web A/BTesting with Interactive LLM Agents
Paper Review: M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
My review of the paper M1 Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper Review: Large Language Diffusion Models
LLaDA replaces autoregressive token generation with diffusion-based masked prediction, rivaling LLaMA3 8B while natur...
Paper Review: Titans: Learning to Memorize at Test Time
A new architecture that pairs attention with a learnable long-term memory module, scaling to 2M+ tokens and outperfor...
Paper Review: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
How pure reinforcement learning (without supervised fine-tuning) can teach LLMs to reason, producing open-source mode...
Paper Review: Training Large Language Models to Reason in a Continuous Latent Space
Coconut lets LLMs reason in latent space instead of generating text tokens, enabling breadth-first exploration of rea...
Paper Review: Byte Latent Transformer: Patches Scale Better Than Tokens
My review of the paper Byte Latent Transformer Patches Scale Better Than Tokens
Paper Review: Reverse Thinking Makes LLMs Stronger Reasoners
My review of the paper Reverse Thinking Makes LLMs Stronger Reasoners
Paper Review: Project Sid: Many-agent simulations toward AI civilization
What happens when you put 1k AI agents in Minecraft and let them self-organize? They form governments, transmit cultu...
Paper Review: Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
My review of the paper Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
Paper Review: Unbounded: A Generative Infinite Game of Character Life Simulation
My review of the paper Unbounded A Generative Infinite Game of Character Life Simulation
Paper Review: Training Language Models to Self-Correct via Reinforcement Learning
My review of the paper Training Language Models to Self-Correct via Reinforcement Learning
Paper Review: Agentic Retrieval-Augmented Generation for Time Series Analysis
My review of the paper Agentic Retrieval-Augmented Generation for Time Series Analysis
Paper Review: Winning Amazon KDD Cup24
My review of the paper Winning Amazon KDD Cup24
Paper Review: Wolf: Captioning Everything with a World Summarization Framework
My review of the paper Wolf Captioning Everything with a World Summarization Framework
Paper Review: RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
My review of the paper RankRAG Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
Paper Review: Unveiling Encoder-Free Vision-Language Models
My review of the paper Unveiling Encoder-Free Vision-Language Models
Paper Review: Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
My review of the paper Husky A Unified, Open-Source Language Agent for Multi-Step Reasoning
Paper Review: FlowMind: Automatic Workflow Generation with LLMs
My review of the paper FlowMind Automatic Workflow Generation with LLMs
Paper Review: Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models
My review of the paper Ferret-v2 An Improved Baseline for Referring and Grounding with Large Language Models
Paper Review: Chronos: Learning the Language of Time Series
Amazon's framework that tokenizes time series data for pretrained language models, enabling zero-shot forecasting tha...
Paper Review: Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting
My review of the paper Lag-Llama Towards Foundation Models for Probabilistic Time Series Forecasting
Paper Review: Ferret: Refer and Ground Anything Anywhere at Any Granularity
My review of the paper Ferret Refer and Ground Anything Anywhere at Any Granularity
Paper Review: DocLLM: A layout-aware generative language model for multimodal document understanding
My review of the paper DocLLM A layout-aware generative language model for multimodal document understanding
Paper Review: Pixel Aligned Language Models
My review of the paper Pixel Aligned Language Models
Paper Review: Orca 2: Teaching Small Language Models How to Reason
My review of the paper Orca 2 Teaching Small Language Models How to Reason
Paper Review: Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models
My review of the paper Chain-of-Note Enhancing Robustness in Retrieval-Augmented Language Models
Paper Review: Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM
My review of the paper Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM
Paper Review: Collaborative Large Language Model for Recommender Systems
My review of the paper Collaborative Large Language Model for Recommender Systems
Paper Review: Zephyr: Direct Distillation of LM Alignment
My review of the paper Zephyr Direct Distillation of LM Alignment
Paper Review: Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
My review of the paper Self-RAG Learning to Retrieve, Generate, and Critique through Self-Reflection
Paper Review: PaLI-3 Vision Language Models: Smaller, Faster, Stronger
My review of the paper PaLI-3 Vision Language Models Smaller, Faster, Stronger
Paper Review: InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
My review of the paper InstructRetro Instruction Tuning post Retrieval-Augmented Pretraining
Paper Review: Mistral 7B
My review of the paper Mistral 7B
Paper Review: Think before you speak: Training Language Models With Pause Tokens
My review of the paper Think before you speak Training Language Models With Pause Tokens
Paper Review: QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
My review of the paper QA-LoRA Quantization-Aware Low-Rank Adaptation of Large Language Models
Paper Review: DreamLLM: Synergistic Multimodal Comprehension and Creation
My review of the paper DreamLLM Synergistic Multimodal Comprehension and Creation
Paper Review: Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
My review of the paper Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
Paper Review: RecMind: Large Language Model Powered Agent For Recommendation
My review of the paper RecMind Large Language Model Powered Agent For Recommendation
Paper Review: Giraffe: Adventures in Expanding Context Lengths in LLMs
My review of the paper Giraffe Adventures in Expanding Context Lengths in LLMs
Paper Review: OBELISC: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
My review of the paper OBELISC An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
Paper Review: Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
A systematic survey of what's broken in RLHF — from reward hacking to evaluation gaps — and what techniques can fix, ...
Paper Review: UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition
My review of the paper UniversalNER Targeted Distillation from Large Language Models for Open Named Entity Recognition
Paper Review: Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
My review of the paper Skeleton-of-Thought Large Language Models Can Do Parallel Decoding
Paper Review: Retentive Network: A Successor to Transformer for Large Language Models
My review of the paper Retentive Network A Successor to Transformer for Large Language Models
Paper Review: Multilingual End to End Entity Linking
My review of the paper Multilingual End to End Entity Linking
Paper Review: Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
My review of the paper Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
Paper Review: Chain of Hindsight Aligns Language Models with Feedback
My review of the paper Chain of Hindsight Aligns Language Models with Feedback
Paper Review: DarkBERT: A Language Model for the Dark Side of the Internet
My review of the paper DarkBERT A Language Model for the Dark Side of the Internet
Paper Review: Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
My review of the paper Distilling Step-by-Step Outperforming Larger Language Models with Less Training Data and Small...
Paper Review: Hyena Hierarchy: Towards Larger Convolutional Language Models
My review of the paper Hyena Hierarchy Towards Larger Convolutional Language Models
Paper Review: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
My review of the paper Visual ChatGPT Talking, Drawing and Editing with Visual Foundation Models