Mar 04, 2024
Paper Review: Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
My review of the paper Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
paperreview
deeplearning
recurrent
attention