Tag: optimization
- Paper Review: Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference (23 Dec 2024)
- Paper Review: Byte Latent Transformer: Patches Scale Better Than Tokens (16 Dec 2024)
- Paper Review: QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models (05 Oct 2023)
- Paper Review: QLoRA: Efficient Finetuning of Quantized LLMs (01 Jun 2023)
- Paper Review: Linformer: Self-Attention with Linear Complexity (10 Jun 2020)