Tag: transformer

36 posts

Feb 23, 2026
Beyond Positional Bias: How DroPE Unlocks Zero-Shot Long Context in LLMs
A review of DroPE, a simple but counterintuitive method that extends LLM context length by dropping positional embedd...
paperreview deeplearning llm attention
Mar 24, 2025
Paper Review: RWKV-7 Goose with Expressive Dynamic State Evolution
My review of the paper RWKV-7 Goose with Expressive Dynamic State Evolution
paperreview deeplearning nlp rnn
Mar 17, 2025
Paper Review: Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities
My review of the paper Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Ab...
paperreview deeplearning transformer nlp
Mar 10, 2025
Paper Review: Large Language Diffusion Models
LLaDA replaces autoregressive token generation with diffusion-based masked prediction, rivaling LLaMA3 8B while natur...
paperreview deeplearning nlp transformer
Mar 03, 2025
Paper Review: NeoBERT: A Next-Generation BERT
A compact 250M-parameter bidirectional encoder that incorporates RoPE, SwiGLU, and modern pretraining to outperform m...
paperreview deeplearning nlp transformer
Feb 24, 2025
Paper Review: SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
Google's upgraded vision-language encoders that add self-supervised learning and online data curation to SigLIP, deli...
paperreview deeplearning transformer cv
Feb 17, 2025
Paper Review: Goku: Flow Based Video Generative Foundation Models
My review of the paper Goku: Flow Based Video Generative Foundation Models
paperreview deeplearning transformer imagegeneration
Feb 03, 2025
Paper Review: Titans: Learning to Memorize at Test Time
A new architecture that pairs attention with a learnable long-term memory module, scaling to 2M+ tokens and outperfor...
paperreview deeplearning llm nlp
Dec 23, 2024
Paper Review: Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
BERT rebuilt with modern tricks — 2 trillion training tokens, 8192 context length, Flash Attention, and rotary embedd...
paperreview deeplearning nlp transformer
Dec 16, 2024
Paper Review: Byte Latent Transformer: Patches Scale Better Than Tokens
My review of the paper Byte Latent Transformer: Patches Scale Better Than Tokens
paperreview deeplearning nlp llm
Oct 21, 2024
Paper Review: Contextual Document Embeddings
My review of the paper Contextual Document Embeddings
paperreview deeplearning transformer embedding
Oct 14, 2024
Paper Review: Differential Transformer
My review of the paper Differential Transformer
paperreview deeplearning transformer attention
Jul 29, 2024
Paper Review: Masked Attention is All You Need for Graphs
My review of the paper Masked Attention is All You Need for Graphs
paperreview deeplearning graph transformer
Aug 17, 2023
Paper Review: FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization
My review of the paper FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization
paperreview deeplearning cv sota
Jul 27, 2023
Paper Review: Meta-Transformer: A Unified Framework for Multimodal Learning
My review of the paper Meta-Transformer: A Unified Framework for Multimodal Learning
paperreview deeplearning nlp transformer
Jul 24, 2023
Paper Review: Retentive Network: A Successor to Transformer for Large Language Models
My review of the paper Retentive Network: A Successor to Transformer for Large Language Models
paperreview deeplearning nlp transformer
Jul 06, 2023
Paper Review: Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
My review of the paper Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
paperreview deeplearning cv transformer
May 01, 2023
Paper Review: Scaling Transformer to 1M tokens and beyond with RMT
My review of the paper Scaling Transformer to 1M tokens and beyond with RMT
paperreview deeplearning transformer
Mar 13, 2023
Paper Review: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
My review of the paper Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
paperreview deeplearning nlp transformer
Mar 09, 2023
Paper Review: PaLM-E: An Embodied Multimodal Language Model
My review of the paper PaLM-E: An Embodied Multimodal Language Model
paperreview deeplearning nlp transformer
Mar 06, 2023
Paper Review: In-Context Instruction Learning
My review of the paper In-Context Instruction Learning
paperreview deeplearning nlp transformer
Feb 26, 2023
Paper Review: LLaMA: Open and Efficient Foundation Language Models
My review of the paper LLaMA: Open and Efficient Foundation Language Models
paperreview deeplearning nlp transformer
Feb 20, 2023
Paper Review: Scaling Vision Transformers to 22 Billion Parameters
My review of the paper Scaling Vision Transformers to 22 Billion Parameters
paperreview deeplearning cv transformer
Feb 13, 2023
Paper Review: Dual PatchNorm
My review of the paper Dual PatchNorm
paperreview deeplearning transformer cv
Jul 24, 2022
Paper Review: Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios
My review of the paper Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial S...
paperreview deeplearning cv transformer
Nov 25, 2021
Paper Review: NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
My review of the paper NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
paperreview deeplearning cv transformer
Nov 19, 2021
Paper Review: Swin Transformer V2: Scaling Up Capacity and Resolution
My review of the paper Swin Transformer V2: Scaling Up Capacity and Resolution
paperreview deeplearning cv transformer
Sep 13, 2021
Paper Review: SwinIR: Image Restoration Using Swin Transformer
My review of the paper SwinIR: Image Restoration Using Swin Transformer
paperreview deeplearning cv transformer
Jul 12, 2021
Paper Review: Long-Short Transformer: Efficient Transformers for Language and Vision
My review of the paper Long-Short Transformer: Efficient Transformers for Language and Vision
paperreview deeplearning cv nlp
May 10, 2021
Paper Review: Are Pre-trained Convolutions Better than Pre-trained Transformers?
My review of the paper Are Pre-trained Convolutions Better than Pre-trained Transformers?
paperreview deeplearning nlp cnn
Aug 19, 2020
Paper Review: Language-agnostic BERT Sentence Embedding
My review of the paper Language-agnostic BERT Sentence Embedding.
paperreview deeplearning transformer nlp
Jun 14, 2020
Paper Review: VirTex: Learning Visual Representations from Textual Annotations
My review of the paper VirTex: Learning Visual Representations from Textual Annotations.
paperreview imagecaptioning cv visual
Jun 10, 2020
Paper Review: Linformer: Self-Attention with Linear Complexity
My review of the paper Linformer: Self-Attention with Linear Complexity.
paperreview deeplearning attention transformer
May 28, 2020
Paper Review: End-to-End Object Detection with Transformers
My review of the paper End-to-End Object Detection with Transformers.
paperreview deeplearning objectdetection transformer
May 23, 2020
Paper Review: SpERT: Span-based Joint Entity and Relation Extraction with Transformer Pre-training
My review of the paper SpERT: Span-based Joint Entity and Relation Extraction with Transformer Pre-training.
paperreview nlp deeplearning transformer
May 17, 2020
Paper Review: Transformer Reasoning Network for Image-Text Matching and Retrieval
My review of the paper Transformer Reasoning Network for Image-Text Matching and Retrieval.
paperreview transformer cv imagetextmatching