Tag: nlp
- Paper Review: Training Large Language Models to Reason in a Continuous Latent Space (06 Jan 2025)
- Paper Review: Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference (23 Dec 2024)
- Paper Review: Byte Latent Transformer: Patches Scale Better Than Tokens (16 Dec 2024)
- Paper Review: Reverse Thinking Makes LLMs Stronger Reasoners (09 Dec 2024)
- Paper Review: Project Sid: Many-agent simulations toward AI civilization (25 Nov 2024)
- Paper Review: Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level (11 Nov 2024)
- Paper Review: Unbounded: A Generative Infinite Game of Character Life Simulation (29 Oct 2024)
- Paper Review: Contextual Document Embeddings (21 Oct 2024)
- Paper Review: Differential Transformer (14 Oct 2024)
- Paper Review: Training Language Models to Self-Correct via Reinforcement Learning (23 Sep 2024)
- Paper Review: Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling (17 Jun 2024)
- Paper Review: σ-GPTs: A New Approach to Autoregressive Models (10 Jun 2024)
- Paper Review: Orca 2: Teaching Small Language Models How to Reason (23 Nov 2023)
- Paper Review: Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models (20 Nov 2023)
- Paper Review: Zephyr: Direct Distillation of LM Alignment (30 Oct 2023)
- Paper Review: Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture (26 Oct 2023)
- Paper Review: Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection (23 Oct 2023)
- Paper Review: PaLI-3 Vision Language Models: Smaller, Faster, Stronger (19 Oct 2023)
- Paper Review: InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining (16 Oct 2023)
- Paper Review: Mistral 7B (12 Oct 2023)
- Paper Review: Think before you speak: Training Language Models With Pause Tokens (09 Oct 2023)
- Paper Review: QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models (05 Oct 2023)
- Paper Review: DreamLLM: Synergistic Multimodal Comprehension and Creation (28 Sep 2023)
- Paper Review: Giraffe: Adventures in Expanding Context Lengths in LLMs (28 Aug 2023)
- Paper Review: OBELISC: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents (24 Aug 2023)
- Paper Review: Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback (10 Aug 2023)
- Paper Review: UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition (10 Aug 2023)
- Paper Review: Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding (07 Aug 2023)
- Paper Review: Meta-Transformer: A Unified Framework for Multimodal Learning (27 Jul 2023)
- Paper Review: Retentive Network: A Successor to Transformer for Large Language Models (24 Jul 2023)
- Paper Review: Llama 2: Open Foundation and Fine-Tuned Chat Models (20 Jul 2023)
- Paper Review: Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning (17 Jul 2023)
- Paper Review: Multilingual End to End Entity Linking (03 Jul 2023)
- Paper Review: Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision (19 Jun 2023)
- Paper Review: BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks (12 Jun 2023)
- Paper Review: StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners (08 Jun 2023)
- Paper Review: Chain of Hindsight Aligns Language Models with Feedback (30 May 2023)
- Paper Review: DarkBERT: A Language Model for the Dark Side of the Internet (18 May 2023)
- Paper Review: ImageBind: One Embedding Space To Bind Them All (10 May 2023)
- Paper Review: Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes (08 May 2023)
- Paper Review: Phoenix: Democratizing ChatGPT across Languages (04 May 2023)
- Paper Review: Generative Agents: Interactive Simulacra of Human Behavior (24 Apr 2023)
- Paper Review: BloombergGPT: A Large Language Model for Finance (02 Apr 2023)
- Paper Review: Hyena Hierarchy: Towards Larger Convolutional Language Models (20 Mar 2023)
- Paper Review: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models (13 Mar 2023)
- Paper Review: PaLM-E: An Embodied Multimodal Language Model (09 Mar 2023)
- Paper Review: In-Context Instruction Learning (06 Mar 2023)
- Paper Review: LLaMA: Open and Efficient Foundation Language Models (26 Feb 2023)
- Medical-chat bot: the history of our attempt to do it (09 Sep 2022)
- Paper Review: NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation (10 Dec 2021)
- Paper Review: A Recipe For Arbitrary Text Style Transfer with Large Language Models (10 Oct 2021)
- Paper Review: Long-Short Transformer: Efficient Transformers for Language and Vision (12 Jul 2021)
- Paper Review: ByT5: Towards a token-free future with pre-trained byte-to-byte models (02 Jun 2021)
- Paper Review: Long Text Generation by Modeling Sentence-Level and Discourse-Level Coherence (21 May 2021)
- Paper Review: Are Pre-trained Convolutions Better than Pre-trained Transformers? (10 May 2021)
- Paper Review: Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning (29 Mar 2021)
- Paper Review: Language-agnostic BERT Sentence Embedding (19 Aug 2020)
- Paper Review: SpERT: Span-based Joint Entity and Relation Extraction with Transformer Pre-training (23 May 2020)
- Paper Review: Named Entity Recognition without Labelled Data: A Weak Supervision Approach (10 May 2020)
- Approaches to sentiment analysis on a small imbalanced dataset without Deep Learning (09 Aug 2019)