Tag: deeplearning

Jul 20, 2026

Harness Handbook: The Missing Layer for Editing AI Agents

Harness Handbook proposes a new way to navigate large AI agent codebases by organizing them around behaviors instead ...

paperreview deeplearning llm nlp

Jul 17, 2026

Book Review: Time Series with PyTorch

A review of Time Series with PyTorch, a broad deep-learning-for-forecasting book that teaches from honest evaluation ...

blogpost books timeseries forecasting

Jun 22, 2026

LocateAnything Explained: Parallel Box Decoding and how it makes visual grounding faster and more precise

A review of LocateAnything, an NVIDIA vision-language model that treats each bounding box as one atomic unit and deco...

paperreview deeplearning computervision objectdetection

Jun 15, 2026

MiniMax Sparse Attention: Per-Group Block Selection for Cheap Million-Token Inference

MiniMax Sparse Attention is a practical sparse-attention design for million-token LLMs - it uses a lightweight learne...

paperreview deeplearning llm attention

Jun 01, 2026

Gamma-World: Simplex Agent Encoding and Hub Attention for Multi-Agent World Models

A review of Gamma-World, NVIDIA's generative multi-agent world model that produces shared, action-controllable video ...

paperreview deeplearning computervision generativemodels

Apr 24, 2026

DeepSeek-V4 Review: Why Million-Token Context Needs Efficient Attention, Not Just Larger Windows

DeepSeek-V4 pairs a hybrid sparse-attention stack with on-policy distillation across domain specialists to bring 1M-t...

paperreview deeplearning llm moe

Apr 20, 2026

FIPO: Teaching LLMs Which Thoughts Actually Matter

FIPO is an RL algorithm that fixes one of the core limitations of RL for LLM reasoning: credit assignment. Instead of...

paperreview deeplearning llm rl

Mar 16, 2026

Collaborative Reinforcement Learning: Why HACRL Trains Models in Teams Instead of Isolation

HACRL proposes a new paradigm for reinforcement learning - instead of training models in isolation, multiple agents c...

paperreview deeplearning rl llm

Feb 23, 2026

Beyond Positional Bias: How DroPE Unlocks Zero-Shot Long Context in LLMs

A review of DroPE, a simple but counterintuitive method that extends LLM context length by dropping positional embedd...

paperreview deeplearning llm attention

Feb 16, 2026

Kimi K2.5 Review: Native Multimodality and Agent Swarms at 1 Trillion Parameters

A deep-dive review of Kimi K2.5, a next-generation open multimodal model that combines native vision-language trainin...

paperreview deeplearning llm vlm

Feb 09, 2026

Paper Review: PaperBanana: Automating Academic Illustration for AI Scientists

My review of the paper PaperBanana Automating Academic Illustration for AI Scientists

paperreview deeplearning agent vlm

Jan 26, 2026

Paper Review: mHC: Manifold-Constrained Hyper-Connections

My review of the paper mHC Manifold-Constrained Hyper-Connections

paperreview deeplearning architecture llm

Dec 24, 2025

Top-10 ML papers I read in 2025

Top-10 ML and AI papers I read in 2025

paperreview deeplearning blogpost

Dec 22, 2025

Paper Review: NitroGen: A Foundation Model for Generalist Gaming Agents

My review of the paper NitroGen A Foundation Model for Generalist Gaming Agents

paperreview deeplearning flowmatching cv

Nov 24, 2025

Paper Review: SAM 3: Segment Anything with Concepts

Meta's unified model for detecting, segmenting, and tracking objects using text or image prompts — trained on 4M conc...

paperreview deeplearning imagesegmentation llm

Nov 17, 2025

Paper Review: HunyuanImage 3.0 Technical Report

My review of the paper HunyuanImage 3.0 Technical Report

paperreview deeplearning llm imagegeneration

Nov 03, 2025

Paper Review: Chronos-2: From Univariate to Universal Forecasting

Chronos-2 extends zero-shot time series forecasting to multivariate and covariate settings with a new group attention...

paperreview deeplearning llm timeseries

Oct 27, 2025

Paper Review: The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

A biologically inspired LLM built as a graph of spiking neurons with Hebbian learning — it matches GPT-2 scaling whil...

paperreview deeplearning nlp llm

Oct 06, 2025

Paper Review: LongLive: Real-time Interactive Long Video Generation

My review of the paper LongLive Real-time Interactive Long Video Generation

paperreview deeplearning imagegeneration videogeneration

Sep 15, 2025

Paper Review: Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing

My review of the paper Sharing is Caring Efficient LM Post-Training with Collective RL Experience Sharing

paperreview deeplearning nlp llm

Sep 01, 2025

Paper Review: Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

My review of the paper Pref-GRPO Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

paperreview deeplearning imagegeneration cv

Aug 25, 2025

Paper Review: DINOv3

Meta's self-supervised vision model trained on 17 billion images, introducing Gram anchoring to prevent feature degra...

paperreview deeplearning cv pytorch

Aug 04, 2025

Paper Review: Group Sequence Policy Optimization

My review of the paper Group Sequence Policy Optimization

paperreview deeplearning llm rl

Jul 28, 2025

Paper Review: Subliminal Learning: Language models transmit behavioral traits via hidden signals in data

My review of the paper Subliminal Learning Language models transmit behavioral traits via hidden signals in data

paperreview deeplearning llm distillation

Jun 30, 2025

Paper Review: ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

My review of the paper ProRL Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

paperreview deeplearning llm rl

Jun 23, 2025

Paper Review: V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning

A self-supervised video model trained on 1M+ hours of video that understands motion, anticipates actions, and — with ...

paperreview deeplearning cv selfsupervised

Jun 09, 2025

Paper Review: Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Only ~20% of tokens actually matter when training LLMs to reason with RL. Updating the low-entropy majority actively ...

paperreview deeplearning llm rl

Jun 02, 2025

Paper Review: SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

My review of the paper SWE-rebench An Automated Pipeline for Task Collection and Decontaminated Evaluation of Softwar...

paperreview deeplearning llm evaluation

May 26, 2025

Paper Review: Visual Planning: Let's Think Only with Images

My review of the paper Visual Planning Let's Think Only with Images

paperreview deeplearning llm rl

May 15, 2025

Paper Review: AlphaEvolve: A coding agent for scientific and algorithmic discovery

DeepMind's autonomous coding agent that evolves algorithms through LLM-driven iteration — it discovered the first imp...

paperreview deeplearning agent nlp

Apr 28, 2025

Paper Review: AgentA/B: Automated and Scalable Web A/BTesting with Interactive LLM Agents

My review of the paper AgentA/B Automated and Scalable Web A/BTesting with Interactive LLM Agents

paperreview deeplearning agent nlp

Apr 21, 2025

Paper Review: M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

My review of the paper M1 Towards Scalable Test-Time Compute with Mamba Reasoning Models

paperreview deeplearning rnn distillation

Apr 07, 2025

Paper Review: TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes

My review of the paper TextCrafter Accurately Rendering Multiple Texts in Complex Visual Scenes

paperreview deeplearning cv imagegeneration

Mar 24, 2025

Paper Review: Video-T1: Test-Time Scaling for Video Generation

My review of the paper Video-T1 Test-Time Scaling for Video Generation

paperreview deeplearning cv

Mar 24, 2025

Paper Review: RWKV-7 Goose with Expressive Dynamic State Evolution

My review of the paper RWKV-7 Goose with Expressive Dynamic State Evolution

paperreview deeplearning nlp rnn

Mar 17, 2025

Paper Review: Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

My review of the paper Audio Flamingo 2 An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Ab...

paperreview deeplearning transformer nlp

Mar 10, 2025

Paper Review: Large Language Diffusion Models

LLaDA replaces autoregressive token generation with diffusion-based masked prediction, rivaling LLaMA3 8B while natur...

paperreview deeplearning nlp transformer

Mar 03, 2025

Paper Review: NeoBERT: A Next-Generation BERT

A compact 250M-parameter bidirectional encoder that incorporates RoPE, SwiGLU, and modern pretraining to outperform m...

paperreview deeplearning nlp transformer

Feb 24, 2025

Paper Review: SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Google's upgraded vision-language encoders that add self-supervised learning and online data curation to SigLIP, deli...

paperreview deeplearning transformer cv

Feb 17, 2025

Paper Review: Goku: Flow Based Video Generative Foundation Models

My review of the paper Goku Flow Based Video Generative Foundation Models

paperreview deeplearning transformer imagegeneration

Feb 03, 2025

Paper Review: Titans: Learning to Memorize at Test Time

A new architecture that pairs attention with a learnable long-term memory module, scaling to 2M+ tokens and outperfor...

paperreview deeplearning llm nlp

Jan 27, 2025

Paper Review: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

How pure reinforcement learning (without supervised fine-tuning) can teach LLMs to reason, producing open-source mode...

paperreview deeplearning llm rl

Jan 13, 2025

Paper Review: STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

My review of the paper STAR Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolu...

paperreview deeplearning cv video

Jan 06, 2025

Paper Review: Training Large Language Models to Reason in a Continuous Latent Space

Coconut lets LLMs reason in latent space instead of generating text tokens, enabling breadth-first exploration of rea...

paperreview deeplearning nlp llm

Dec 23, 2024

Paper Review: Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

BERT rebuilt with modern tricks — 2 trillion training tokens, 8192 context length, Flash Attention, and rotary embedd...

paperreview deeplearning nlp transformer

Dec 16, 2024

Paper Review: Byte Latent Transformer: Patches Scale Better Than Tokens

My review of the paper Byte Latent Transformer Patches Scale Better Than Tokens

paperreview deeplearning nlp llm

Dec 09, 2024

Paper Review: Reverse Thinking Makes LLMs Stronger Reasoners

My review of the paper Reverse Thinking Makes LLMs Stronger Reasoners

paperreview deeplearning nlp llm

Nov 25, 2024

Paper Review: Project Sid: Many-agent simulations toward AI civilization

What happens when you put 1k AI agents in Minecraft and let them self-organize? They form governments, transmit cultu...

paperreview deeplearning nlp llm

Nov 11, 2024

Paper Review: Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

My review of the paper Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

paperreview deeplearning nlp llm

Oct 29, 2024

Paper Review: Unbounded: A Generative Infinite Game of Character Life Simulation

My review of the paper Unbounded A Generative Infinite Game of Character Life Simulation

paperreview deeplearning nlp llm

Oct 21, 2024

Paper Review: Contextual Document Embeddings

My review of the paper Contextual Document Embeddings

paperreview deeplearning transformer embedding

Oct 14, 2024

Paper Review: Differential Transformer

My review of the paper Differential Transformer

paperreview deeplearning transformer attention

Oct 07, 2024

Paper Review: Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

My review of the paper Depth Pro Sharp Monocular Metric Depth in Less Than a Second

paperreview deeplearning cv depthestimation

Sep 23, 2024

Paper Review: Training Language Models to Self-Correct via Reinforcement Learning

My review of the paper Training Language Models to Self-Correct via Reinforcement Learning

paperreview deeplearning rl llm

Sep 16, 2024

Paper Review: Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

My review of the paper Loopy Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

paperreview deeplearning diffusion video

Sep 04, 2024

Paper Review: Agentic Retrieval-Augmented Generation for Time Series Analysis

My review of the paper Agentic Retrieval-Augmented Generation for Time Series Analysis

paperreview deeplearning llm timeseries

Aug 19, 2024

Paper Review: Winning Amazon KDD Cup24

My review of the paper Winning Amazon KDD Cup24

paperreview deeplearning llm qa

Aug 12, 2024

Paper Review: Wolf: Captioning Everything with a World Summarization Framework

My review of the paper Wolf Captioning Everything with a World Summarization Framework

paperreview deeplearning llm vlm

Aug 05, 2024

Paper Review: Diffusion Feedback Helps CLIP See Better

My review of the paper Diffusion Feedback Helps CLIP See Better

paperreview deeplearning clip diffusion

Jul 29, 2024

Paper Review: Masked Attention is All You Need for Graphs

My review of the paper Masked Attention is All You Need for Graphs

paperreview deeplearning graph transformer

Jul 22, 2024

Paper Review: RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs

My review of the paper RankRAG Unifying Context Ranking with Retrieval-Augmented Generation in LLMs

paperreview deeplearning llm rag

Jul 15, 2024

Paper Review: Unveiling Encoder-Free Vision-Language Models

My review of the paper Unveiling Encoder-Free Vision-Language Models

paperreview deeplearning llm vlm

Jul 01, 2024

Paper Review: Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning

My review of the paper Husky A Unified, Open-Source Language Agent for Multi-Step Reasoning

paperreview deeplearning llm agent

Jun 17, 2024

Paper Review: Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

My review of the paper Samba Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

paperreview deeplearning swa nlp

Jun 10, 2024

Paper Review: σ-GPTs: A New Approach to Autoregressive Models

My review of the paper σ-GPTs A New Approach to Autoregressive Models

paperreview deeplearning nlp gpt

Jun 03, 2024

Paper Review: LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models

My review of the paper LiteVAE Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models

paperreview deeplearning cv autoencoder

May 27, 2024

Paper Review: YOLOv10: Real-Time End-to-End Object Detection

My review of the paper YOLOv10 Real-Time End-to-End Object Detection

paperreview deeplearning objectdetection yolo

May 20, 2024

Paper Review: Chameleon: Mixed-Modal Early-Fusion Foundation Models

My review of the paper Chameleon Mixed-Modal Early-Fusion Foundation Models

paperreview deeplearning mllm multimodal

May 13, 2024

Paper Review: Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models

My review of the paper Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models

paperreview deeplearning mllm multimodal

May 06, 2024

Paper Review: FlowMind: Automatic Workflow Generation with LLMs

My review of the paper FlowMind Automatic Workflow Generation with LLMs

paperreview deeplearning llm agent

Apr 15, 2024

Paper Review: Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models

My review of the paper Ferret-v2 An Improved Baseline for Referring and Grounding with Large Language Models

paperreview deeplearning llm cv

Apr 08, 2024

Paper Review: Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

My review of the paper Visual Autoregressive Modeling Scalable Image Generation via Next-Scale Prediction

paperreview deeplearning generation

Apr 01, 2024

Paper Review: Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures

My review of the paper Vision-RWKV Efficient and Scalable Visual Perception with RWKV-Like Architectures

paperreview deeplearning cv rnn

Mar 25, 2024

Paper Review: Chronos: Learning the Language of Time Series

Amazon's framework that tokenizes time series data for pretrained language models, enabling zero-shot forecasting tha...

paperreview deeplearning llm timeseries

Mar 19, 2024

Paper Review: Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks

My review of the paper Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks

paperreview deeplearning recommender gnn

Mar 11, 2024

Paper Review: NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

My review of the paper NaturalSpeech 3 Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

paperreview deeplearning tts speech

Mar 04, 2024

Paper Review: Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

My review of the paper Griffin Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

paperreview deeplearning recurrent attention

Feb 26, 2024

Paper Review: YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

My review of the paper YOLOv9 Learning What You Want to Learn Using Programmable Gradient Information

paperreview deeplearning cv objectdetection

Feb 19, 2024

Paper Review: LiRank: Industrial Large Scale Ranking Models at LinkedIn

My review of the paper LiRank Industrial Large Scale Ranking Models at LinkedIn

paperreview deeplearning recommender

Feb 12, 2024

Paper Review: Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

My review of the paper Lag-Llama Towards Foundation Models for Probabilistic Time Series Forecasting

paperreview deeplearning llm timeseries

Jan 29, 2024

Paper Review: Lumiere: A Space-Time Diffusion Model for Video Generation

My review of the paper Lumiere A Space-Time Diffusion Model for Video Generation

paperreview deeplearning cv stablediffusion

Jan 22, 2024

Paper Review: Scalable Pre-training of Large Autoregressive Image Models

My review of the paper Scalable Pre-training of Large Autoregressive Image Models

paperreview deeplearning cv

Jan 15, 2024

Paper Review: Ferret: Refer and Ground Anything Anywhere at Any Granularity

My review of the paper Ferret Refer and Ground Anything Anywhere at Any Granularity

paperreview deeplearning llm cv

Jan 08, 2024

Paper Review: DocLLM: A layout-aware generative language model for multimodal document understanding

My review of the paper DocLLM A layout-aware generative language model for multimodal document understanding

paperreview deeplearning llm attention

Dec 25, 2023

Paper Review: StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation

My review of the paper StreamDiffusion A Pipeline-Level Solution for Real-Time Interactive Generation

paperreview deeplearning stablediffusion cv

Dec 18, 2023

Paper Review: Pixel Aligned Language Models

My review of the paper Pixel Aligned Language Models

paperreview deeplearning llm cv

Dec 12, 2023

Paper Review: EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

My review of the paper EfficientSAM Leveraged Masked Image Pretraining for Efficient Segment Anything

paperreview deeplearning imagesegmentation cv

Dec 07, 2023

Paper Review: Translatotron 3: Speech to Speech Translation with Monolingual Data

My review of the paper Translatotron 3 Speech to Speech Translation with Monolingual Data

paperreview deeplearning languagetranslation speechtranslation

Dec 04, 2023

Paper Review: Adversarial Diffusion Distillation

My review of the paper Adversarial Diffusion Distillation

paperreview deeplearning cv stablediffusion

Nov 30, 2023

Paper Review: Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion

My review of the paper Diffuse, Attend, and Segment Unsupervised Zero-Shot Segmentation using Stable Diffusion

paperreview deeplearning cv stablediffusion

Nov 27, 2023

Paper Review: Diffusion Model Alignment Using Direct Preference Optimization

Adapting DPO from language models to image generation — training Stable Diffusion XL on 851K human preferences to sig...

paperreview deeplearning cv stablediffusion

Nov 23, 2023

Paper Review: Orca 2: Teaching Small Language Models How to Reason

My review of the paper Orca 2 Teaching Small Language Models How to Reason

paperreview deeplearning nlp llm

Nov 20, 2023

Paper Review: Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models

My review of the paper Chain-of-Note Enhancing Robustness in Retrieval-Augmented Language Models

paperreview deeplearning nlp llm

Nov 16, 2023

Paper Review: Deep Learning for Day Forecasts from Sparse Observations

My review of the paper Deep Learning for Day Forecasts from Sparse Observations

paperreview deeplearning forecasting

Nov 13, 2023

Paper Review: Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM

My review of the paper Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM

paperreview deeplearning llm qa

Nov 09, 2023

Paper Review: CogVLM: Visual Expert for Pretrained Language Models

My review of the paper CogVLM Visual Expert for Pretrained Language Models

paperreview deeplearning cv pretraining

Nov 06, 2023

Paper Review: Collaborative Large Language Model for Recommender Systems

My review of the paper Collaborative Large Language Model for Recommender Systems

paperreview deeplearning llm recommender

Nov 02, 2023

Paper Review: SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

My review of the paper SAM-CLIP Merging Vision Foundation Models towards Semantic and Spatial Understanding

paperreview deeplearning cv imagesegmentation

Oct 30, 2023

Paper Review: Zephyr: Direct Distillation of LM Alignment

My review of the paper Zephyr Direct Distillation of LM Alignment

paperreview deeplearning nlp llm

Oct 26, 2023

Paper Review: Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture

My review of the paper Monarch Mixer A Simple Sub-Quadratic GEMM-Based Architecture

paperreview deeplearning nlp cv

Oct 23, 2023

Paper Review: Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

My review of the paper Self-RAG Learning to Retrieve, Generate, and Critique through Self-Reflection

paperreview deeplearning llm nlp

Oct 19, 2023

Paper Review: PaLI-3 Vision Language Models: Smaller, Faster, Stronger

My review of the paper PaLI-3 Vision Language Models Smaller, Faster, Stronger

paperreview deeplearning llm vlm

Oct 16, 2023

Paper Review: InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining

My review of the paper InstructRetro Instruction Tuning post Retrieval-Augmented Pretraining

paperreview deeplearning llm nlp

Oct 12, 2023

Paper Review: Mistral 7B

My review of the paper Mistral 7B

paperreview deeplearning llm nlp

Oct 09, 2023

Paper Review: Think before you speak: Training Language Models With Pause Tokens

My review of the paper Think before you speak Training Language Models With Pause Tokens

paperreview deeplearning llm nlp

Oct 05, 2023

Paper Review: QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models

My review of the paper QA-LoRA Quantization-Aware Low-Rank Adaptation of Large Language Models

paperreview deeplearning llm nlp

Oct 02, 2023

Paper Review: LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models

My review of the paper LAVIE High-Quality Video Generation with Cascaded Latent Diffusion Models

paperreview deeplearning cv diffusion

Sep 28, 2023

Paper Review: DreamLLM: Synergistic Multimodal Comprehension and Creation

My review of the paper DreamLLM Synergistic Multimodal Comprehension and Creation

paperreview deeplearning llm cv

Sep 25, 2023

Paper Review: FreeU: Free Lunch in Diffusion U-Net

My review of the paper FreeU Free Lunch in Diffusion U-Net

paperreview deeplearning stablediffusion unet

Sep 21, 2023

Paper Review: Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers

My review of the paper Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers

paperreview deeplearning llm promptengineering

Sep 18, 2023

Paper Review: SLiMe: Segment Like Me

My review of the paper SLiMe Segment Like Me

paperreview deeplearning cv imagesegmentation

Sep 14, 2023

Paper Review: TSMixer: An All-MLP Architecture for Time Series Forecasting

My review of the paper TSMixer An All-MLP Architecture for Time Series Forecasting

paperreview deeplearning mlp timeseries

Sep 11, 2023

Paper Review: Explaining grokking through circuit efficiency

My review of the paper Explaining grokking through circuit efficiency

paperreview deeplearning

Sep 07, 2023

Paper Review: Contrastive Feature Masking Open-Vocabulary Vision Transformer

My review of the paper Contrastive Feature Masking Open-Vocabulary Vision Transformer

paperreview deeplearning cv objectdetection

Sep 04, 2023

Paper Review: RecMind: Large Language Model Powered Agent For Recommendation

My review of the paper RecMind Large Language Model Powered Agent For Recommendation

paperreview deeplearning llm

Aug 31, 2023

Paper Review: CoTracker: It is Better to Track Together

My review of the paper CoTracker It is Better to Track Together

paperreview deeplearning cv objecttracking

Aug 28, 2023

Paper Review: Giraffe: Adventures in Expanding Context Lengths in LLMs

My review of the paper Giraffe Adventures in Expanding Context Lengths in LLMs

paperreview deeplearning nlp llm

Aug 24, 2023

Paper Review: OBELISC: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents

My review of the paper OBELISC An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents

paperreview deeplearning nlp llm

Aug 21, 2023

Paper Review: LISA: Reasoning Segmentation via Large Language Model

My review of the paper LISA Reasoning Segmentation via Large Language Model

paperreview deeplearning cv imagesegmentation

Aug 17, 2023

Paper Review: FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization

My review of the paper FastViT A Fast Hybrid Vision Transformer using Structural Reparameterization

paperreview deeplearning cv sota

Aug 10, 2023

Paper Review: Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

A systematic survey of what's broken in RLHF — from reward hacking to evaluation gaps — and what techniques can fix, ...

paperreview deeplearning nlp llm

Aug 10, 2023

Paper Review: UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition

My review of the paper UniversalNER Targeted Distillation from Large Language Models for Open Named Entity Recognition

paperreview deeplearning nlp llm

Aug 07, 2023

Paper Review: Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding

My review of the paper Skeleton-of-Thought Large Language Models Can Do Parallel Decoding

paperreview deeplearning nlp llm

Aug 03, 2023

Paper Review: Tracking Anything in High Quality

My review of the paper Tracking Anything in High Quality

paperreview deeplearning objectdetection imagesegmentation

Jul 31, 2023

Paper Review: TabR: Unlocking the Power of Retrieval-Augmented Tabular Deep Learning

My review of the paper TabR Unlocking the Power of Retrieval-Augmented Tabular Deep Learning

paperreview deeplearning tabular

Jul 27, 2023

Paper Review: Meta-Transformer: A Unified Framework for Multimodal Learning

My review of the paper Meta-Transformer A Unified Framework for Multimodal Learning

paperreview deeplearning nlp transformer

Jul 24, 2023

Paper Review: Retentive Network: A Successor to Transformer for Large Language Models

My review of the paper Retentive Network A Successor to Transformer for Large Language Models

paperreview deeplearning nlp transformer

Jul 20, 2023

Paper Review: Llama 2: Open Foundation and Fine-Tuned Chat Models

Meta's open-source LLM family (7B–70B parameters) with chat fine-tuning that matched or beat closed-source models on ...

paperreview deeplearning nlp finetuning

Jul 17, 2023

Paper Review: Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

My review of the paper Scaling Autoregressive Multi-Modal Models Pretraining and Instruction Tuning

paperreview deeplearning cv nlp

Jul 13, 2023

Paper Review: UniverSeg: Universal Medical Image Segmentation

My review of the paper UniverSeg Universal Medical Image Segmentation

paperreview deeplearning cv imagesegmentation

Jul 10, 2023

Paper Review: Recognize Anything: A Strong Image Tagging Model

My review of the paper Recognize Anything A Strong Image Tagging Model

paperreview deeplearning cv imagecaptioning

Jul 06, 2023

Paper Review: Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

My review of the paper Hiera A Hierarchical Vision Transformer without the Bells-and-Whistles

paperreview deeplearning cv transformer

Jul 03, 2023

Paper Review: Multilingual End to End Entity Linking

My review of the paper Multilingual End to End Entity Linking

paperreview deeplearning nlp llm

Jun 29, 2023

Paper Review: Fast Segment Anything

My review of the paper Fast Segment Anything

paperreview deeplearning cv imagesegmentation

Jun 26, 2023

Paper Review: Tracking Everything Everywhere All at Once

My review of the paper Tracking Everything Everywhere All at Once

paperreview deeplearning cv motiontracking

Jun 23, 2023

Paper Review: Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale

My review of the paper Voicebox Text-Guided Multilingual Universal Speech Generation at Scale

paperreview deeplearning audio tts

Jun 19, 2023

Paper Review: Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

My review of the paper Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

paperreview deeplearning nlp llm

Jun 15, 2023

Paper Review: Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture

Yann LeCun's I-JEPA learns semantic image representations by predicting masked patch features — no data augmentation ...

paperreview deeplearning selfsupervised pretraining

Jun 12, 2023

Paper Review: BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks

My review of the paper BiomedGPT A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, L...

paperreview deeplearning nlp gpt

Jun 08, 2023

Paper Review: StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners

My review of the paper StableRep Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners

paperreview deeplearning stablediffusion nlp

Jun 05, 2023

Paper Review: The effectiveness of MAE pre-pretraining for billion-scale pretraining

My review of the paper The effectiveness of MAE pre-pretraining for billion-scale pretraining

paperreview deeplearning cv pretraining

Jun 01, 2023

Paper Review: QLoRA: Efficient Finetuning of Quantized LLMs

My review of the paper QLoRA Efficient Finetuning of Quantized LLMs

paperreview deeplearning finetuning optimization

May 30, 2023

Paper Review: Chain of Hindsight Aligns Language Models with Feedback

My review of the paper Chain of Hindsight Aligns Language Models with Feedback

paperreview deeplearning nlp llm

May 25, 2023

Paper Review: MMS: Scaling Speech Technology to 1000+ languages

My review of the paper MMS Scaling Speech Technology to 1000+ languages

paperreview deeplearning audio pytorch

May 22, 2023

Paper Review: Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold

My review of the paper Drag Your GAN Interactive Point-based Manipulation on the Generative Image Manifold

paperreview deeplearning cv gan

May 18, 2023

Paper Review: DarkBERT: A Language Model for the Dark Side of the Internet

My review of the paper DarkBERT A Language Model for the Dark Side of the Internet

paperreview deeplearning nlp pretraining

May 15, 2023

Paper Review: NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers

My review of the paper NaturalSpeech 2 Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers

paperreview deeplearning audio diffusion

May 10, 2023

Paper Review: ImageBind: One Embedding Space To Bind Them All

My review of the paper ImageBind One Embedding Space To Bind Them All

paperreview deeplearning nlp cv

May 08, 2023

Paper Review: Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

My review of the paper Distilling Step-by-Step Outperforming Larger Language Models with Less Training Data and Small...

paperreview deeplearning nlp distillation

May 04, 2023

Paper Review: Phoenix: Democratizing ChatGPT across Languages

My review of the paper Phoenix Democratizing ChatGPT across Languages

paperreview deeplearning nlp

May 01, 2023

Paper Review: Scaling Transformer to 1M tokens and beyond with RMT

My review of the paper Scaling Transformer to 1M tokens and beyond with RMT

paperreview deeplearning transformer

Apr 27, 2023

Paper Review: Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations

My review of the paper Speed Is All You Need On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizat...

paperreview deeplearning diffusion

Apr 24, 2023

Paper Review: Generative Agents: Interactive Simulacra of Human Behavior

My review of the paper Generative Agents Interactive Simulacra of Human Behavior

paperreview deeplearning nlp

Apr 20, 2023

Paper Review: DINOv2: Learning Robust Visual Features without Supervision

How Meta built all-purpose visual features by scaling self-supervised pretraining to a curated 142M-image dataset, pr...

paperreview deeplearning cv pytorch

Apr 17, 2023

Paper Review: InceptionNeXt: When Inception Meets ConvNeXt

My review of the paper InceptionNeXt When Inception Meets ConvNeXt

paperreview deeplearning cv pytorch

Apr 08, 2023

Paper Review: Segment Anything

My review of the paper Segment Anything

paperreview deeplearning cv pytorch

Apr 02, 2023

Paper Review: BloombergGPT: A Large Language Model for Finance

Bloomberg trained a 50B-parameter LLM on 363B tokens of proprietary financial data. It crushes existing models on fin...

paperreview deeplearning nlp

Mar 27, 2023

Paper Review: ReBotNet: Fast Real-time Video Enhancement

My review of the paper ReBotNet Fast Real-time Video Enhancement

paperreview deeplearning cv video

Mar 20, 2023

Paper Review: Hyena Hierarchy: Towards Larger Convolutional Language Models

My review of the paper Hyena Hierarchy Towards Larger Convolutional Language Models

paperreview deeplearning nlp cv

Mar 13, 2023

Paper Review: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models

My review of the paper Visual ChatGPT Talking, Drawing and Editing with Visual Foundation Models

paperreview deeplearning nlp transformer

Mar 09, 2023

Paper Review: PaLM-E: An Embodied Multimodal Language Model

My review of the paper PaLM-E An Embodied Multimodal Language Model

paperreview deeplearning nlp transformer

Mar 06, 2023

Paper Review: In-Context Instruction Learning

My review of the paper In-Context Instruction Learning

paperreview deeplearning nlp transformer

Feb 26, 2023

Paper Review: LLaMA: Open and Efficient Foundation Language Models

My review of the paper LLaMA Open and Efficient Foundation Language Models

paperreview deeplearning nlp transformer

Feb 20, 2023

Paper Review: Scaling Vision Transformers to 22 Billion Parameters

My review of the paper Scaling Vision Transformers to 22 Billion Parameters

paperreview deeplearning cv transformer

Feb 13, 2023

Paper Review: Dual PatchNorm

My review of the paper Dual PatchNorm

paperreview deeplearning transformer cv

Feb 06, 2023

Paper Review: Cut and Learn for Unsupervised Object Detection and Instance Segmentation

My review of the paper Cut and Learn for Unsupervised Object Detection and Instance Segmentation

paperreview deeplearning cv objectdetection

Jan 29, 2023

Paper Review: StyleGAN-T Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

My review of the paper StyleGAN-T Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

paperreview deeplearning cv gan

Jul 24, 2022

Paper Review: Next-ViT Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios

My review of the paper Next-ViT Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial S...

paperreview deeplearning cv transformer

Dec 10, 2021

Paper Review: NL-Augmenter A Framework for Task-Sensitive Natural Language Augmentation

My review of the paper NL-Augmenter A Framework for Task-Sensitive Natural Language Augmentation and my contribution ...

paperreview deeplearning nlp augmentation

Nov 25, 2021

Paper Review: NÜWA Visual Synthesis Pre-training for Neural visUal World creAtion

My review of the paper NÜWA Visual Synthesis Pre-training for Neural visUal World creAtion

paperreview deeplearning cv transformer

Nov 19, 2021

Paper Review: Swin Transformer V2 Scaling Up Capacity and Resolution

My review of the paper Swin Transformer V2 Scaling Up Capacity and Resolution

paperreview deeplearning cv transformer

Oct 10, 2021

Paper Review: A Recipe For Arbitrary Text Style Transfer with Large Language Models

My review of the paper A Recipe For Arbitrary Text Style Transfer with Large Language Models

paperreview deeplearning nlp styletransfer

Sep 13, 2021

Paper Review: SwinIR Image Restoration Using Swin Transformer

My review of the paper SwinIR Image Restoration Using Swin Transformer

paperreview deeplearning cv transformer

Sep 01, 2021

Paper Review: Efficient Visual Pretraining with Contrastive Detection

My review of the paper Efficient Visual Pretraining with Contrastive Detection

paperreview deeplearning cv pretraining

Aug 15, 2021

Paper Review: Domain-Aware Universal Style Transfer

My review of the paper Domain-Aware Universal Style Transfer

paperreview deeplearning cv styletransfer

Jul 23, 2021

Paper Review: YOLOX Exceeding YOLO Series in 2021

My review of the paper YOLOX Exceeding YOLO Series in 2021

paperreview deeplearning cv objectdetection

Jul 12, 2021

Paper Review: Long-Short Transformer Efficient Transformers for Language and Vision

My review of the paper Long-Short Transformer Efficient Transformers for Language and Vision

paperreview deeplearning cv nlp

Jun 18, 2021

Paper Review: Semi-Autoregressive Transformer for Image Captioning

My review of the paper Semi-Autoregressive Transformer for Image Captioning

paperreview deeplearning imagecaptioning multimodal

Jun 10, 2021

Paper Review: CoAtNet Marrying Convolution and Attention for All Data Sizes

My review of the paper CoAtNet Marrying Convolution and Attention for All Data Sizes

paperreview deeplearning cv pretraining

Jun 02, 2021

Paper Review: ByT5 Towards a token-free future with pre-trained byte-to-byte models

My review of the paper ByT5 Towards a token-free future with pre-trained byte-to-byte models

paperreview deeplearning nlp pretraining

May 21, 2021

Paper Review: Long Text Generation by Modeling Sentence-Level and Discourse-Level Coherence

My review of the paper Long Text Generation by Modeling Sentence-Level and Discourse-Level Coherence

paperreview deeplearning nlp nlg

May 10, 2021

Paper Review: Are Pre-trained Convolutions Better than Pre-trained Transformers?

My review of the paper Are Pre-trained Convolutions Better than Pre-trained Transformers?

paperreview deeplearning nlp cnn

May 04, 2021

Paper Review: MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding

My review of the paper MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding.

paperreview deeplearning objectdetection multimodal

Apr 07, 2021

Paper Review: Generating Furry Cars: Disentangling Object Shape and Appearance across Multiple Domains

My review of the paper Generating Furry Cars Disentangling Object Shape and Appearance across Multiple Domains.

paperreview cv gan deeplearning

Apr 02, 2021

Paper Review: EfficientNetV2: Smaller Models and Faster Training

My review of the paper EfficientNetV2 Smaller Models and Faster Training.

paperreview cv sota nas

Mar 29, 2021

Paper Review: Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning

My review of the paper Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning.

paperreview nlp fewshotlearning augmentation

Mar 21, 2021

Paper Review: LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval

My review of the paper LightningDOT Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval.

paperreview pretraining realtime ranking

Feb 22, 2021

Paper Review: Real-World Super-Resolution of Face-Images from Surveillance Cameras

My review of the paper Real-World Super-Resolution of Face-Images from Surveillance Cameras.

paperreview deeplearning superresolution cv

Feb 07, 2021

Paper Review: ObjectAug: Object-level Data Augmentation for Semantic Image Segmentation

My review of the paper ObjectAug Object-level Data Augmentation for Semantic Image Segmentation.

paperreview deeplearning augmentation imageinpainting

Jan 31, 2021

Paper Review: JigsawGAN: Self-supervised Learning for Solving Jigsaw Puzzles with Generative Adversarial Networks

My review of the paper JigsawGAN Self-supervised Learning for Solving Jigsaw Puzzles with Generative Adversarial Netw...

paperreview deeplearning jigsaw selfsupervised

Aug 19, 2020

Paper Review: Language-agnostic BERT Sentence Embedding

My review of the paper Language-agnostic BERT Sentence Embedding.

paperreview deeplearning transformer nlp

Jul 28, 2020

Paper Review: Funnel Activation for Visual Recognition

My review of the paper Funnel Activation for Visual Recognition.

paperreview deeplearning activationfunction cv

Jul 04, 2020

Paper Review: ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network

My review of the paper ReXNet Diminishing Representational Bottleneck on Convolutional Neural Network.

paperreview deeplearning pretraining transferlearning

Jun 14, 2020

Paper Review: VirTex: Learning Visual Representations from Textual Annotations

My review of the paper VirTex Learning Visual Representations from Textual Annotations.

paperreview imagecaptioning cv visual

Jun 10, 2020

Paper Review: Linformer: Self-Attention with Linear Complexity

My review of the paper Linformer Self-Attention with Linear Complexity.

paperreview deeplearning attention transformer

May 28, 2020

Paper Review: End-to-End Object Detection with Transformers

My review of the paper End-to-End Object Detection with Transformers.

paperreview deeplearning objectdetection transformer

May 23, 2020

Paper Review: SpERT Span-based Joint Entity and Relation Extraction with Transformer Pre-training

My review of the paper SpERT Span-based Joint Entity and Relation Extraction with Transformer Pre-training.

paperreview nlp deeplearning transformer