Tag: attention
- Paper Review: Differential Transformer (14 Oct 2024)
- Paper Review: Masked Attention is All You Need for Graphs (29 Jul 2024)
- Paper Review: Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures (01 Apr 2024)
- Paper Review: Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models (04 Mar 2024)
- Paper Review: DocLLM: A layout-aware generative language model for multimodal document understanding (08 Jan 2024)
- Paper Review: Long-Short Transformer: Efficient Transformers for Language and Vision (12 Jul 2021)
- Paper Review: CoAtNet: Marrying Convolution and Attention for All Data Sizes (10 Jun 2021)
- Paper Review: Linformer: Self-Attention with Linear Complexity (10 Jun 2020)