Repost: My Favorite LLM x AI Papers of 2025

2025 is almost over.

This year I kept up reading and sharing papers almost every day, and I learned a great deal from it.

Below is my personal collection of favorite papers from the year:

1. Test Time Scaling

  • What exactly is “thinking”?

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

s1: Simple test-time scaling

2. Efficient Reasoning

  • “What if thinking is a budget?”

L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning

Scalable Chain of Thoughts via Elastic Reasoning

3. Reasoning Analysis

  • How can we better understand “reasoning”?

(How) Do reasoning models reason?

DeepSeek-R1 Thoughtology: Let’s think about LLM reasoning

Rethinking Reflection in Pre-Training

Thinking vs. Doing: Agents that Reason by Scaling Test-Time Interaction

4. CLI Agent

  • A 2024 paper, but one that deeply influenced 2025.

SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering

5. LLM X RL Agentic

  • For this, I coined a term: “protocol token”

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning

ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning

6. Parallel Reasoning

  • Parallel reasoning, beyond the single-threaded imagination!

Learning Adaptive Parallel Reasoning with Language Models

Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

Learning to Keep a Promise: Scaling Language Model Decoding Parallelism with Learned Asynchronous Decoding

7. RL X Reasoning

  • These four RLVR papers rapidly heated up, and then cooled down, RL X Reasoning!

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Learning to Reason without External Rewards

Spurious Rewards: Rethinking Training Signals in RLVR

8. Agent, Interaction, Deep Research Agent

  • Revolving around experience, memory, interaction, and workflow

Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving

Alita-G: Self-Evolving Generative Agent for Agent Generation

Sleep-time Compute: Beyond Inference Scaling at Test-time

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

WebDancer: Towards Autonomous Information Seeking Agency

9. Risk modeling X LLM

  • “Make Bayes great again!”

Model Predictive Task Sampling for Efficient and Robust Adaptation

Can Prompt Difficulty be Online Predicted for Accelerating RL Finetuning of Reasoning Models?

10. Multi-Agentic

  • How should multi-agent systems be used?

Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning

CodeContests+: High-Quality Test Case Generation for Competitive Programming

11. Sentient Agent

  • AI + the humanities, full of heart!

RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents

Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models

12. LLM security and alignment

  • Understanding you really isn't easy, LLM!

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

Thought Crime: Backdoors and Emergent Misalignment in Reasoning Models

Natural Emergent Misalignment from Reward Hacking in Production RL

Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs

13. Model Steering

  • I want to control myself, LLM!

AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders

Persona Vectors: Monitoring and Controlling Character Traits in Language Models

How to Train Your Advisor: Steering Black-Box LLMs with Advisor Models

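Finally, this list is my personal selection. With limited time and energy, there are probably many excellent works I never had the chance to read. Thank you to the authors of these papers for showing me the landscape of wisdom!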

2025 is almost over, and I will miss it!

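In 2026, I will keep introducing the best work I come across every day. Many of the in-depth analyses will go into the subscriber-only content. You are welcome to subscribe, and let's grow together!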

Reposted from the original post: My Favorite LLM x AI Papers of 2025
