<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>RL on CheaSim Blog</title>
    <link>https://www.cheasim.com/tags/rl/</link>
    <description>Recent content in RL on CheaSim Blog</description>
    <generator>Hugo</generator>
    <language>zh-cn</language>
    <lastBuildDate>Wed, 24 Jun 2026 21:30:00 +0000</lastBuildDate>
    <atom:link href="https://www.cheasim.com/tags/rl/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Skill0: Borrow Skills During Training, Remove Them at Inference</title>
      <link>https://www.cheasim.com/2026/06/24/skill0-arxiv-2604-02268/</link>
      <pubDate>Wed, 24 Jun 2026 21:30:00 +0000</pubDate>
      <guid>https://www.cheasim.com/2026/06/24/skill0-arxiv-2604-02268/</guid>
      <description>&lt;p&gt;The previous two posts covered &lt;a href=&#34;https://www.cheasim.com/2026/06/21/sga-mcts-arxiv-2604-14712/&#34;&gt;SGA-MCTS&lt;/a&gt; and &lt;a href=&#34;https://www.cheasim.com/2026/06/22/skillx-arxiv-2604-04804/&#34;&gt;SkillX&lt;/a&gt;, and today's paper, &lt;a href=&#34;https://arxiv.org/abs/2604.02268&#34;&gt;SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization&lt;/a&gt;, neatly completes that line of work.&lt;/p&gt;
&lt;p&gt;If SGA-MCTS and SkillX both discuss “how to store agent experience in an external system”, Skill0 asks a different and harder question: can the external skill library be used only during training, with the skills ultimately internalized into the model parameters, so that the agent no longer depends on runtime skill retrieval at test time?&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
