Write an engaging LinkedIn post with relevant emoji and hashtags promoting my blog post.
Examples:
1
FIPO: Teaching LLMs Which Thoughts Actually Matter
Most current approaches (GRPO, DAPO, RLVR) optimize for the final answer. But they ignore a key question: which parts of the reasoning actually mattered?
I just published a review of FIPO (Future-Influenced Policy Optimization) - a method that tackles this directly.
https://lnkd.in/dKQyMSF2
https://lnkd.in/dw84CCPN
Instead of rewarding entire trajectories, FIPO assigns token-level credit based on future impact.
* Important reasoning steps → amplified
* Irrelevant tokens → suppressed
This turns RL from outcome optimization into process-aware optimization. The results show that reasoning length scales from ~4k to 10k+ tokens, and performance continues to improve rather than plateau.
Most methods improve reasoning by scaling models, improving sampling, or designing better rewards. FIPO changes how reward is distributed inside a trajectory.
#MachineLearning #LLM #ReinforcementLearning #AIResearch #DeepLearning #GenerativeAI #AI
2
🚀 New paper review: DroPE, dropping positional embeddings to unlock long context in LLMs
https://lnkd.in/ejbn7u8c
Most long-context tricks (YaRN, NTK-RoPE, etc.) look good on perplexity but quietly fail when important information appears deep in the sequence.
DroPE proposes a solution:
🧠 positional embeddings help LLMs train
🚫 but they actively hurt long-context generalization
✂️ drop them after pretraining + short recalibration → strong zero-shot long context
In this review, I break down:
- why NoPE underperforms despite equal expressivity
- why RoPE scaling inevitably distorts semantic attention
- how DroPE gets the best of both worlds
- and why this is surprisingly cheap to apply even to trillion-token models
#LLM #Transformers #DeepLearning #NLP #Attention #Scaling #Pretraining #PaperReview #SOTA
https://lnkd.in/eBbVy2MD
My post on Medium:
https://lnkd.in/eP37MZTC
3
Just published a deep-dive review of KIMI K2.5, one of the more architecturally interesting agentic multimodal models released recently.
https://lnkd.in/dTXEU7CE
Two things stood out:
• 🧠 Joint text-vision training done properly (not late fusion)
• ⚙️ Agent Swarm: learned parallel agent execution with ~4.5x lower latency
What I found especially interesting: outcome-based visual RL improves text-only reasoning, which is not something we usually see.
It seems that parallel agents are the next real scaling axis: Anthropic has already adopted them!
https://lnkd.in/d4pJXw6B
My post on Medium:
https://lnkd.in/dsNAK4BT