Tag: bert
3 posts
Paper Review: NeoBERT: A Next-Generation BERT
A compact 250M-parameter bidirectional encoder that incorporates RoPE, SwiGLU, and modern pretraining to outperform m...
Paper Review: Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
BERT rebuilt with modern tricks — 2 trillion training tokens, 8192 context length, Flash Attention, and rotary embedd...
Paper Review: SpERT: Span-based Joint Entity and Relation Extraction with Transformer Pre-training
My review of the paper SpERT: Span-based Joint Entity and Relation Extraction with Transformer Pre-training.