Tag: paperreview
194 posts
Collaborative Reinforcement Learning: Why HACRL Trains Models in Teams Instead of Isolation
HACRL proposes a new paradigm for reinforcement learning - instead of training models in isolation, multiple agents c...
Beyond Positional Bias: How DroPE Unlocks Zero-Shot Long Context in LLMs
A review of DroPE, a simple but counterintuitive method that extends LLM context length by dropping positional embedd...
Kimi K2.5 Review: Native Multimodality and Agent Swarms at 1 Trillion Parameters
A deep-dive review of Kimi K2.5, a next-generation open multimodal model that combines native vision-language trainin...
Paper Review: PaperBanana: Automating Academic Illustration for AI Scientists
My review of the paper PaperBanana Automating Academic Illustration for AI Scientists
Paper Review: mHC: Manifold-Constrained Hyper-Connections
My review of the paper mHC Manifold-Constrained Hyper-Connections
Top-10 ML papers I read in 2025
Top-10 ML and AI papers I read in 2025
Paper Review: NitroGen: A Foundation Model for Generalist Gaming Agents
My review of the paper NitroGen A Foundation Model for Generalist Gaming Agents
Paper Review: SAM 3: Segment Anything with Concepts
Meta's unified model for detecting, segmenting, and tracking objects using text or image prompts — trained on 4M conc...
Paper Review: HunyuanImage 3.0 Technical Report
My review of the paper HunyuanImage 3.0 Technical Report
Paper Review: Chronos-2: From Univariate to Universal Forecasting
Chronos-2 extends zero-shot time series forecasting to multivariate and covariate settings with a new group attention...
Paper Review: The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
A biologically inspired LLM built as a graph of spiking neurons with Hebbian learning — it matches GPT-2 scaling whil...
Paper Review: LongLive: Real-time Interactive Long Video Generation
My review of the paper LongLive Real-time Interactive Long Video Generation
Paper Review: Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing
My review of the paper Sharing is Caring Efficient LM Post-Training with Collective RL Experience Sharing
Paper Review: Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
My review of the paper Pref-GRPO Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
Paper Review: DINOv3
Meta's self-supervised vision model trained on 17 billion images, introducing Gram anchoring to prevent feature degra...
Paper Review: Group Sequence Policy Optimization
My review of the paper Group Sequence Policy Optimization
Paper Review: Subliminal Learning: Language models transmit behavioral traits via hidden signals in data
My review of the paper Subliminal Learning Language models transmit behavioral traits via hidden signals in data
Paper Review: ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
My review of the paper ProRL Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Paper Review: V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning
A self-supervised video model trained on 1M+ hours of video that understands motion, anticipates actions, and — with ...
Paper Review: Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
Only ~20% of tokens actually matter when training LLMs to reason with RL. Updating the low-entropy majority actively ...
Paper Review: SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents
My review of the paper SWE-rebench An Automated Pipeline for Task Collection and Decontaminated Evaluation of Softwar...
Paper Review: Visual Planning: Let's Think Only with Images
My review of the paper Visual Planning Let's Think Only with Images
Paper Review: AlphaEvolve: A coding agent for scientific and algorithmic discovery
DeepMind's autonomous coding agent that evolves algorithms through LLM-driven iteration — it discovered the first imp...
Paper Review: AgentA/B: Automated and Scalable Web A/BTesting with Interactive LLM Agents
My review of the paper AgentA/B Automated and Scalable Web A/BTesting with Interactive LLM Agents
Paper Review: M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
My review of the paper M1 Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper Review: TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
My review of the paper TextCrafter Accurately Rendering Multiple Texts in Complex Visual Scenes
Paper Review: Video-T1: Test-Time Scaling for Video Generation
My review of the paper Video-T1 Test-Time Scaling for Video Generation
Paper Review: RWKV-7 Goose with Expressive Dynamic State Evolution
My review of the paper RWKV-7 Goose with Expressive Dynamic State Evolution
Paper Review: Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities
My review of the paper Audio Flamingo 2 An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Ab...
Paper Review: Large Language Diffusion Models
LLaDA replaces autoregressive token generation with diffusion-based masked prediction, rivaling LLaMA3 8B while natur...
Paper Review: NeoBERT: A Next-Generation BERT
A compact 250M-parameter bidirectional encoder that incorporates RoPE, SwiGLU, and modern pretraining to outperform m...
Paper Review: SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
Google's upgraded vision-language encoders that add self-supervised learning and online data curation to SigLIP, deli...
Paper Review: Goku: Flow Based Video Generative Foundation Models
My review of the paper Goku Flow Based Video Generative Foundation Models
Paper Review: Titans: Learning to Memorize at Test Time
A new architecture that pairs attention with a learnable long-term memory module, scaling to 2M+ tokens and outperfor...
Paper Review: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
How pure reinforcement learning (without supervised fine-tuning) can teach LLMs to reason, producing open-source mode...
Paper Review: STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
My review of the paper STAR Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolu...
Paper Review: Training Large Language Models to Reason in a Continuous Latent Space
Coconut lets LLMs reason in latent space instead of generating text tokens, enabling breadth-first exploration of rea...
Paper Review: Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
BERT rebuilt with modern tricks — 2 trillion training tokens, 8192 context length, Flash Attention, and rotary embedd...
Paper Review: Byte Latent Transformer: Patches Scale Better Than Tokens
My review of the paper Byte Latent Transformer Patches Scale Better Than Tokens
Paper Review: Reverse Thinking Makes LLMs Stronger Reasoners
My review of the paper Reverse Thinking Makes LLMs Stronger Reasoners
Paper Review: Project Sid: Many-agent simulations toward AI civilization
What happens when you put 1k AI agents in Minecraft and let them self-organize? They form governments, transmit cultu...
Paper Review: Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
My review of the paper Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
Paper Review: Unbounded: A Generative Infinite Game of Character Life Simulation
My review of the paper Unbounded A Generative Infinite Game of Character Life Simulation
Paper Review: Contextual Document Embeddings
My review of the paper Contextual Document Embeddings
Paper Review: Differential Transformer
My review of the paper Differential Transformer
Paper Review: Depth Pro: Sharp Monocular Metric Depth in Less Than a Second
My review of the paper Depth Pro Sharp Monocular Metric Depth in Less Than a Second
Paper Review: Training Language Models to Self-Correct via Reinforcement Learning
My review of the paper Training Language Models to Self-Correct via Reinforcement Learning
Paper Review: Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency
My review of the paper Loopy Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency
Paper Review: Agentic Retrieval-Augmented Generation for Time Series Analysis
My review of the paper Agentic Retrieval-Augmented Generation for Time Series Analysis
Paper Review: Winning Amazon KDD Cup24
My review of the paper Winning Amazon KDD Cup24
Paper Review: Wolf: Captioning Everything with a World Summarization Framework
My review of the paper Wolf Captioning Everything with a World Summarization Framework
Paper Review: Diffusion Feedback Helps CLIP See Better
My review of the paper Diffusion Feedback Helps CLIP See Better
Paper Review: Masked Attention is All You Need for Graphs
My review of the paper Masked Attention is All You Need for Graphs
Paper Review: RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
My review of the paper RankRAG Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
Paper Review: Unveiling Encoder-Free Vision-Language Models
My review of the paper Unveiling Encoder-Free Vision-Language Models
Paper Review: Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
My review of the paper Husky A Unified, Open-Source Language Agent for Multi-Step Reasoning
Paper Review: Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
My review of the paper Samba Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Paper Review: σ-GPTs: A New Approach to Autoregressive Models
My review of the paper σ-GPTs A New Approach to Autoregressive Models
Paper Review: LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models
My review of the paper LiteVAE Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models
Paper Review: YOLOv10: Real-Time End-to-End Object Detection
My review of the paper YOLOv10 Real-Time End-to-End Object Detection
Paper Review: Chameleon: Mixed-Modal Early-Fusion Foundation Models
My review of the paper Chameleon Mixed-Modal Early-Fusion Foundation Models
Paper Review: Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models
My review of the paper Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models
Paper Review: FlowMind: Automatic Workflow Generation with LLMs
My review of the paper FlowMind Automatic Workflow Generation with LLMs
Paper Review: Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models
My review of the paper Ferret-v2 An Improved Baseline for Referring and Grounding with Large Language Models
Paper Review: Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
My review of the paper Visual Autoregressive Modeling Scalable Image Generation via Next-Scale Prediction
Paper Review: Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
My review of the paper Vision-RWKV Efficient and Scalable Visual Perception with RWKV-Like Architectures
Paper Review: Chronos: Learning the Language of Time Series
Amazon's framework that tokenizes time series data for pretrained language models, enabling zero-shot forecasting tha...
Paper Review: Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks
My review of the paper Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks
Paper Review: NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
My review of the paper NaturalSpeech 3 Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
Paper Review: Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
My review of the paper Griffin Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Paper Review: YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
My review of the paper YOLOv9 Learning What You Want to Learn Using Programmable Gradient Information
Paper Review: LiRank: Industrial Large Scale Ranking Models at LinkedIn
My review of the paper LiRank Industrial Large Scale Ranking Models at LinkedIn
Paper Review: Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting
My review of the paper Lag-Llama Towards Foundation Models for Probabilistic Time Series Forecasting
Paper Review: Lumiere: A Space-Time Diffusion Model for Video Generation
My review of the paper Lumiere A Space-Time Diffusion Model for Video Generation
Paper Review: Scalable Pre-training of Large Autoregressive Image Models
My review of the paper Scalable Pre-training of Large Autoregressive Image Models
Paper Review: Ferret: Refer and Ground Anything Anywhere at Any Granularity
My review of the paper Ferret Refer and Ground Anything Anywhere at Any Granularity
Paper Review: DocLLM: A layout-aware generative language model for multimodal document understanding
My review of the paper DocLLM A layout-aware generative language model for multimodal document understanding
Paper Review: StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
My review of the paper StreamDiffusion A Pipeline-Level Solution for Real-Time Interactive Generation
Paper Review: Pixel Aligned Language Models
My review of the paper Pixel Aligned Language Models
Paper Review: EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
My review of the paper EfficientSAM Leveraged Masked Image Pretraining for Efficient Segment Anything
Paper Review: Translatotron 3: Speech to Speech Translation with Monolingual Data
My review of the paper Translatotron 3 Speech to Speech Translation with Monolingual Data
Paper Review: Adversarial Diffusion Distillation
My review of the paper Adversarial Diffusion Distillation
Paper Review: Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion
My review of the paper Diffuse, Attend, and Segment Unsupervised Zero-Shot Segmentation using Stable Diffusion
Paper Review: Diffusion Model Alignment Using Direct Preference Optimization
Adapting DPO from language models to image generation — training Stable Diffusion XL on 851K human preferences to sig...
Paper Review: Orca 2: Teaching Small Language Models How to Reason
My review of the paper Orca 2 Teaching Small Language Models How to Reason
Paper Review: Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models
My review of the paper Chain-of-Note Enhancing Robustness in Retrieval-Augmented Language Models
Paper Review: Deep Learning for Day Forecasts from Sparse Observations
My review of the paper Deep Learning for Day Forecasts from Sparse Observations
Paper Review: Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM
My review of the paper Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM
Paper Review: CogVLM: Visual Expert for Pretrained Language Models
My review of the paper CogVLM Visual Expert for Pretrained Language Models
Paper Review: Collaborative Large Language Model for Recommender Systems
My review of the paper Collaborative Large Language Model for Recommender Systems
Paper Review: SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding
My review of the paper SAM-CLIP Merging Vision Foundation Models towards Semantic and Spatial Understanding
Paper Review: Zephyr: Direct Distillation of LM Alignment
My review of the paper Zephyr Direct Distillation of LM Alignment
Paper Review: Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture
My review of the paper Monarch Mixer A Simple Sub-Quadratic GEMM-Based Architecture
Paper Review: Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
My review of the paper Self-RAG Learning to Retrieve, Generate, and Critique through Self-Reflection
Paper Review: PaLI-3 Vision Language Models: Smaller, Faster, Stronger
My review of the paper PaLI-3 Vision Language Models Smaller, Faster, Stronger
Paper Review: InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
My review of the paper InstructRetro Instruction Tuning post Retrieval-Augmented Pretraining
Paper Review: Mistral 7B
My review of the paper Mistral 7B
Paper Review: Think before you speak: Training Language Models With Pause Tokens
My review of the paper Think before you speak Training Language Models With Pause Tokens
Paper Review: QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
My review of the paper QA-LoRA Quantization-Aware Low-Rank Adaptation of Large Language Models
Paper Review: LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models
My review of the paper LAVIE High-Quality Video Generation with Cascaded Latent Diffusion Models
Paper Review: DreamLLM: Synergistic Multimodal Comprehension and Creation
My review of the paper DreamLLM Synergistic Multimodal Comprehension and Creation
Paper Review: FreeU: Free Lunch in Diffusion U-Net
My review of the paper FreeU Free Lunch in Diffusion U-Net
Paper Review: Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
My review of the paper Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
Paper Review: SLiMe: Segment Like Me
My review of the paper SLiMe Segment Like Me
Paper Review: TSMixer: An All-MLP Architecture for Time Series Forecasting
My review of the paper TSMixer An All-MLP Architecture for Time Series Forecasting
Paper Review: Explaining grokking through circuit efficiency
My review of the paper Explaining grokking through circuit efficiency
Paper Review: Contrastive Feature Masking Open-Vocabulary Vision Transformer
My review of the paper Contrastive Feature Masking Open-Vocabulary Vision Transformer
Paper Review: RecMind: Large Language Model Powered Agent For Recommendation
My review of the paper RecMind Large Language Model Powered Agent For Recommendation
Paper Review: CoTracker: It is Better to Track Together
My review of the paper CoTracker It is Better to Track Together
Paper Review: Giraffe: Adventures in Expanding Context Lengths in LLMs
My review of the paper Giraffe Adventures in Expanding Context Lengths in LLMs
Paper Review: OBELISC: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
My review of the paper OBELISC An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
Paper Review: LISA: Reasoning Segmentation via Large Language Model
My review of the paper LISA Reasoning Segmentation via Large Language Model
Paper Review: FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization
My review of the paper FastViT A Fast Hybrid Vision Transformer using Structural Reparameterization
Paper Review: Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
A systematic survey of what's broken in RLHF — from reward hacking to evaluation gaps — and what techniques can fix, ...
Paper Review: UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition
My review of the paper UniversalNER Targeted Distillation from Large Language Models for Open Named Entity Recognition
Paper Review: Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
My review of the paper Skeleton-of-Thought Large Language Models Can Do Parallel Decoding
Paper Review: Tracking Anything in High Quality
My review of the paper Tracking Anything in High Quality
Paper Review: TabR: Unlocking the Power of Retrieval-Augmented Tabular Deep Learning
My review of the paper TabR Unlocking the Power of Retrieval-Augmented Tabular Deep Learning
Paper Review: Meta-Transformer: A Unified Framework for Multimodal Learning
My review of the paper Meta-Transformer A Unified Framework for Multimodal Learning
Paper Review: Retentive Network: A Successor to Transformer for Large Language Models
My review of the paper Retentive Network A Successor to Transformer for Large Language Models
Paper Review: Llama 2: Open Foundation and Fine-Tuned Chat Models
Meta's open-source LLM family (7B–70B parameters) with chat fine-tuning that matched or beat closed-source models on ...
Paper Review: Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
My review of the paper Scaling Autoregressive Multi-Modal Models Pretraining and Instruction Tuning
Paper Review: UniverSeg: Universal Medical Image Segmentation
My review of the paper UniverSeg Universal Medical Image Segmentation
Paper Review: Recognize Anything: A Strong Image Tagging Model
My review of the paper Recognize Anything A Strong Image Tagging Model
Paper Review: Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
My review of the paper Hiera A Hierarchical Vision Transformer without the Bells-and-Whistles
Paper Review: Multilingual End to End Entity Linking
My review of the paper Multilingual End to End Entity Linking
Paper Review: Fast Segment Anything
My review of the paper Fast Segment Anything
Paper Review: Tracking Everything Everywhere All at Once
My review of the paper Tracking Everything Everywhere All at Once
Paper Review: Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
My review of the paper Voicebox Text-Guided Multilingual Universal Speech Generation at Scale
Paper Review: Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
My review of the paper Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
Paper Review: Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture
Yann LeCun's I-JEPA learns semantic image representations by predicting masked patch features — no data augmentation ...
Paper Review: BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks
My review of the paper BiomedGPT A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, L...
Paper Review: StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners
My review of the paper StableRep Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners
Paper Review: The effectiveness of MAE pre-pretraining for billion-scale pretraining
My review of the paper The effectiveness of MAE pre-pretraining for billion-scale pretraining
Paper Review: QLoRA: Efficient Finetuning of Quantized LLMs
My review of the paper QLoRA Efficient Finetuning of Quantized LLMs
Paper Review: Chain of Hindsight Aligns Language Models with Feedback
My review of the paper Chain of Hindsight Aligns Language Models with Feedback
Paper Review: MMS: Scaling Speech Technology to 1000+ languages
My review of the paper MMS Scaling Speech Technology to 1000+ languages
Paper Review: Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
My review of the paper Drag Your GAN Interactive Point-based Manipulation on the Generative Image Manifold
Paper Review: DarkBERT: A Language Model for the Dark Side of the Internet
My review of the paper DarkBERT A Language Model for the Dark Side of the Internet
Paper Review: NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
My review of the paper NaturalSpeech 2 Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
Paper Review: ImageBind: One Embedding Space To Bind Them All
My review of the paper ImageBind One Embedding Space To Bind Them All
Paper Review: Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
My review of the paper Distilling Step-by-Step Outperforming Larger Language Models with Less Training Data and Small...
Paper Review: Phoenix: Democratizing ChatGPT across Languages
My review of the paper Phoenix Democratizing ChatGPT across Languages
Paper Review: Scaling Transformer to 1M tokens and beyond with RMT
My review of the paper Scaling Transformer to 1M tokens and beyond with RMT
Paper Review: Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations
My review of the paper Speed Is All You Need On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizat...
Paper Review: Generative Agents: Interactive Simulacra of Human Behavior
My review of the paper Generative Agents Interactive Simulacra of Human Behavior
Paper Review: DINOv2: Learning Robust Visual Features without Supervision
How Meta built all-purpose visual features by scaling self-supervised pretraining to a curated 142M-image dataset, pr...
Paper Review: InceptionNeXt: When Inception Meets ConvNeXt
My review of the paper InceptionNeXt When Inception Meets ConvNeXt
Paper Review: Segment Anything
My review of the paper Segment Anything
Paper Review: BloombergGPT: A Large Language Model for Finance
Bloomberg trained a 50B-parameter LLM on 363B tokens of proprietary financial data. It crushes existing models on fin...
Paper Review: ReBotNet: Fast Real-time Video Enhancement
My review of the paper ReBotNet Fast Real-time Video Enhancement
Paper Review: Hyena Hierarchy: Towards Larger Convolutional Language Models
My review of the paper Hyena Hierarchy Towards Larger Convolutional Language Models
Paper Review: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
My review of the paper Visual ChatGPT Talking, Drawing and Editing with Visual Foundation Models
Paper Review: PaLM-E: An Embodied Multimodal Language Model
My review of the paper PaLM-E An Embodied Multimodal Language Model
Paper Review: In-Context Instruction Learning
My review of the paper In-Context Instruction Learning
Paper Review: LLaMA: Open and Efficient Foundation Language Models
My review of the paper LLaMA Open and Efficient Foundation Language Models
Paper Review: Scaling Vision Transformers to 22 Billion Parameters
My review of the paper Scaling Vision Transformers to 22 Billion Parameters
Paper Review: Dual PatchNorm
My review of the paper Dual PatchNorm
Paper Review: Cut and Learn for Unsupervised Object Detection and Instance Segmentation
My review of the paper Cut and Learn for Unsupervised Object Detection and Instance Segmentation
Paper Review: StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
My review of the paper StyleGAN-T Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
Paper Review: Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios
My review of the paper Next-ViT Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial S...
Paper Review: NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
My review of the paper NL-Augmenter A Framework for Task-Sensitive Natural Language Augmentation and my contribution ...
Paper Review: NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
My review of the paper NÜWA Visual Synthesis Pre-training for Neural visUal World creAtion
Paper Review: Swin Transformer V2: Scaling Up Capacity and Resolution
My review of the paper Swin Transformer V2 Scaling Up Capacity and Resolution
Paper Review: A Recipe For Arbitrary Text Style Transfer with Large Language Models
My review of the paper A Recipe For Arbitrary Text Style Transfer with Large Language Models
Paper Review: SwinIR: Image Restoration Using Swin Transformer
My review of the paper SwinIR Image Restoration Using Swin Transformer
Paper Review: Efficient Visual Pretraining with Contrastive Detection
My review of the paper Efficient Visual Pretraining with Contrastive Detection
Paper Review: Domain-Aware Universal Style Transfer
My review of the paper Domain-Aware Universal Style Transfer
Paper Review: YOLOX: Exceeding YOLO Series in 2021
My review of the paper YOLOX Exceeding YOLO Series in 2021
Paper Review: Long-Short Transformer: Efficient Transformers for Language and Vision
My review of the paper Long-Short Transformer Efficient Transformers for Language and Vision
Paper Review: Semi-Autoregressive Transformer for Image Captioning
My review of the paper Semi-Autoregressive Transformer for Image Captioning
Paper Review: CoAtNet: Marrying Convolution and Attention for All Data Sizes
My review of the paper CoAtNet Marrying Convolution and Attention for All Data Sizes
Paper Review: ByT5: Towards a token-free future with pre-trained byte-to-byte models
My review of the paper ByT5 Towards a token-free future with pre-trained byte-to-byte models
Paper Review: Long Text Generation by Modeling Sentence-Level and Discourse-Level Coherence
My review of the paper Long Text Generation by Modeling Sentence-Level and Discourse-Level Coherence
Paper Review: Are Pre-trained Convolutions Better than Pre-trained Transformers?
My review of the paper Are Pre-trained Convolutions Better than Pre-trained Transformers?
Paper Review: MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
My review of the paper MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding.
Paper Review: Generating Furry Cars: Disentangling Object Shape and Appearance across Multiple Domains
My review of the paper Generating Furry Cars Disentangling Object Shape and Appearance across Multiple Domains.
Paper Review: EfficientNetV2: Smaller Models and Faster Training
My review of the paper EfficientNetV2 Smaller Models and Faster Training.
Paper Review: Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning
My review of the paper Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning.
Paper Review: LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval
My review of the paper LightningDOT Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval.
Paper Review: Revisiting ResNets: Improved Training and Scaling Strategies
My review of the paper Revisiting ResNets, Improved Training and Scaling Strategies.
Paper Review: Contrastive Semi-supervised Learning for ASR
My review of the paper Contrastive Semi-supervised Learning for ASR.
Paper Review: Real-World Super-Resolution of Face-Images from Surveillance Cameras
My review of the paper Real-World Super-Resolution of Face-Images from Surveillance Cameras.
Paper Review: ObjectAug: Object-level Data Augmentation for Semantic Image Segmentation
My review of the paper ObjectAug Object-level Data Augmentation for Semantic Image Segmentation.
Paper Review: JigsawGAN: Self-supervised Learning for Solving Jigsaw Puzzles with Generative Adversarial Networks
My review of the paper JigsawGAN Self-supervised Learning for Solving Jigsaw Puzzles with Generative Adversarial Netw...
Paper Review: Language-agnostic BERT Sentence Embedding
My review of the paper Language-agnostic BERT Sentence Embedding.
Paper Review: Funnel Activation for Visual Recognition
My review of the paper Funnel Activation for Visual Recognition.
Paper Review: ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network
My review of the paper ReXNet Diminishing Representational Bottleneck on Convolutional Neural Network.
Paper Review: VirTex: Learning Visual Representations from Textual Annotations
My review of the paper VirTex Learning Visual Representations from Textual Annotations.
Paper Review: Linformer: Self-Attention with Linear Complexity
My review of the paper Linformer Self-Attention with Linear Complexity.
Paper Review: End-to-End Object Detection with Transformers
My review of the paper End-to-End Object Detection with Transformers.
Paper Review: SpERT: Span-based Joint Entity and Relation Extraction with Transformer Pre-training
My review of the paper SpERT Span-based Joint Entity and Relation Extraction with Transformer Pre-training.
Paper Review: Transformer Reasoning Network for Image-Text Matching and Retrieval
My review of the paper Transformer Reasoning Network for Image-Text Matching and Retrieval.
Paper Review: Named Entity Recognition without Labelled Data: A Weak Supervision Approach
My review of the paper Named Entity Recognition without Labelled Data A Weak Supervision Approach.