Write an engaging LinkedIn post with relevant emoji and hashtags promoting my blog post.

Examples:

1. 𝗙𝗜𝗣𝗢: 𝗧𝗲𝗮𝗰𝗵𝗶𝗻𝗴 𝗟𝗟𝗠𝘀 𝗪𝗵𝗶𝗰𝗵 𝗧𝗵𝗼𝘂𝗴𝗵𝘁𝘀 𝗔𝗰𝘁𝘂𝗮𝗹𝗹𝘆 𝗠𝗮𝘁𝘁𝗲𝗿

Most current approaches (GRPO, DAPO, RLVR) optimize for the final answer. But they ignore a key question: 𝗪𝗵𝗶𝗰𝗵 𝗽𝗮𝗿𝘁𝘀 𝗼𝗳 𝘁𝗵𝗲 𝗿𝗲𝗮𝘀𝗼𝗻𝗶𝗻𝗴 𝗮𝗰𝘁𝘂𝗮𝗹𝗹𝘆 𝗺𝗮𝘁𝘁𝗲𝗿𝗲𝗱?

I just published a review of 𝗙𝗜𝗣𝗢 (𝗙𝘂𝘁𝘂𝗿𝗲-𝗞𝗟 𝗜𝗻𝗳𝗹𝘂𝗲𝗻𝗰𝗲𝗱 𝗣𝗼𝗹𝗶𝗰𝘆 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻) - a method that tackles this directly.
https://lnkd.in/dKQyMSF2
https://lnkd.in/dw84CCPN

Instead of rewarding entire trajectories, FIPO assigns 𝘁𝗼𝗸𝗲𝗻-𝗹𝗲𝘃𝗲𝗹 𝗰𝗿𝗲𝗱𝗶𝘁 𝗯𝗮𝘀𝗲𝗱 𝗼𝗻 𝗳𝘂𝘁𝘂𝗿𝗲 𝗶𝗺𝗽𝗮𝗰𝘁:
* Important reasoning steps → amplified
* Irrelevant tokens → suppressed

This turns RL from outcome optimization into process-aware optimization. The results show that reasoning length scales from ~4k to 10k+ tokens, and performance keeps improving instead of plateauing.

Most methods improve reasoning through bigger models, better sampling, or better rewards. FIPO changes 𝗵𝗼𝘄 𝗿𝗲𝘄𝗮𝗿𝗱 𝗶𝘀 𝗱𝗶𝘀𝘁𝗿𝗶𝗯𝘂𝘁𝗲𝗱 𝗶𝗻𝘀𝗶𝗱𝗲 𝗮 𝘁𝗿𝗮𝗷𝗲𝗰𝘁𝗼𝗿𝘆.

#MachineLearning #LLM #ReinforcementLearning #AIResearch #DeepLearning #GenerativeAI #AI

2. 🚀 New paper review: DroPE - dropping positional embeddings to unlock long context in LLMs
https://lnkd.in/ejbn7u8c

Most long-context tricks (YaRN, NTK-RoPE, etc.) look good on perplexity but quietly fail when important information appears deep in the sequence.

DroPE's key idea:
🧠 positional embeddings help LLMs train
🚫 but they actively hurt long-context generalization
✂️ drop them after pretraining + a short recalibration → strong zero-shot long context

In this review, I break down:
- why NoPE underperforms despite equal expressivity
- why RoPE scaling inevitably distorts semantic attention
- how DroPE gets the best of both worlds
- and why this is surprisingly cheap to apply, even to trillion-token models

#LLM #Transformers #DeepLearning #NLP #Attention #Scaling #Pretraining #PaperReview #SOTA

https://lnkd.in/eBbVy2MD
My post on Medium: https://lnkd.in/eP37MZTC

3. Just published a deep-dive review of KIMI K2.5, one of the most architecturally interesting agentic multimodal models released recently.
https://lnkd.in/dTXEU7CE

Two things stood out:
• 🧠 Joint text–vision training done properly (not late fusion)
• ⚙️ Agent Swarm: learned parallel agent execution with ~4.5x lower latency

What I found especially interesting: outcome-based visual RL improves text-only reasoning, which is not something we usually see.

It seems parallel agents are the next real scaling axis; Anthropic has already adopted them!
https://lnkd.in/d4pJXw6B

My post on Medium: https://lnkd.in/dsNAK4BT