About

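I am a software engineer with more than six years of Python development experience, now making a systematic move from application development to LLM inference systems and AI infrastructure.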

Current direction

I focus on inference serving optimization, the KV cache lifecycle, prefix cache hits, KV offloading, cache-aware routing, and reproducible experimental methods.