Write an engaging LinkedIn post with relevant emoji and hashtags promoting my blog post.

Examples:

1. 𝗙𝗜𝗣𝗢: 𝗧𝗲𝗮𝗰𝗵𝗶𝗻𝗴 𝗟𝗟𝗠𝘀 𝗪𝗵𝗶𝗰𝗵 𝗧𝗵𝗼𝘂𝗴𝗵𝘁𝘀 𝗔𝗰𝘁𝘂𝗮𝗹𝗹𝘆 𝗠𝗮𝘁𝘁𝗲𝗿

Most current approaches (GRPO, DAPO, RLVR) optimize for the final answer. But they ignore a key question: 𝗪𝗵𝗶𝗰𝗵 𝗽𝗮𝗿𝘁𝘀 𝗼𝗳 𝘁𝗵𝗲 𝗿𝗲𝗮𝘀𝗼𝗻𝗶𝗻𝗴 𝗮𝗰𝘁𝘂𝗮𝗹𝗹𝘆 𝗺𝗮𝘁𝘁𝗲𝗿𝗲𝗱?

I just published a review of 𝗙𝗜𝗣𝗢 (𝗙𝘂𝘁𝘂𝗿𝗲-𝗞𝗟 𝗜𝗻𝗳𝗹𝘂𝗲𝗻𝗰𝗲𝗱 𝗣𝗼𝗹𝗶𝗰𝘆 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻) - a method that tackles this directly.
https://lnkd.in/dKQyMSF2
https://lnkd.in/dw84CCPN

Instead of rewarding entire trajectories, FIPO assigns 𝘁𝗼𝗸𝗲𝗻-𝗹𝗲𝘃𝗲𝗹 𝗰𝗿𝗲𝗱𝗶𝘁 𝗯𝗮𝘀𝗲𝗱 𝗼𝗻 𝗳𝘂𝘁𝘂𝗿𝗲 𝗶𝗺𝗽𝗮𝗰𝘁:
* Important reasoning steps → amplified
* Irrelevant tokens → suppressed

This turns RL from outcome optimization into process-aware optimization. The results show that reasoning length scales from ~4k to 10k+ tokens, and performance keeps improving instead of plateauing.

Most methods improve reasoning through bigger models, better sampling, or better rewards. FIPO changes 𝗵𝗼𝘄 𝗿𝗲𝘄𝗮𝗿𝗱 𝗶𝘀 𝗱𝗶𝘀𝘁𝗿𝗶𝗯𝘂𝘁𝗲𝗱 𝗶𝗻𝘀𝗶𝗱𝗲 𝗮 𝘁𝗿𝗮𝗷𝗲𝗰𝘁𝗼𝗿𝘆.

#MachineLearning #LLM #ReinforcementLearning #AIResearch #DeepLearning #GenerativeAI #AI

2. 🚀 New paper review: DroPE - dropping positional embeddings to unlock long context in LLMs
https://lnkd.in/ejbn7u8c

Most long-context tricks (YaRN, NTK-RoPE, etc.) look good on perplexity but quietly fail when important information appears deep in the sequence.

DroPE's key idea:
🧠 positional embeddings help LLMs train
🚫 but they actively hurt long-context generalization
✂️ drop them after pretraining + a short recalibration → strong zero-shot long context

In this review, I break down:
- why NoPE underperforms despite equal expressivity
- why RoPE scaling inevitably distorts semantic attention
- how DroPE gets the best of both worlds
- and why this is surprisingly cheap to apply, even to trillion-token models

#LLM #Transformers #DeepLearning #NLP #Attention #Scaling #Pretraining #PaperReview #SOTA

https://lnkd.in/eBbVy2MD
My post on Medium: https://lnkd.in/eP37MZTC

3. Just published a deep-dive review of KIMI K2.5, one of the most architecturally interesting agentic multimodal models released recently.
https://lnkd.in/dTXEU7CE

Two things stood out:
• 🧠 Joint text–vision training done properly (not late fusion)
• ⚙️ Agent Swarm: learned parallel agent execution with ~4.5x lower latency

What I found especially interesting: outcome-based visual RL improves text-only reasoning, which is not something we usually see.

It seems parallel agents are the next real scaling axis; Anthropic has already adopted them!
https://lnkd.in/d4pJXw6B

My post on Medium: https://lnkd.in/dsNAK4BT