<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>ELL3N HSU Technical Notes</title><description>A personal technical site for LLM inference systems, AI infrastructure notes, experiments, and projects.</description><link>http://localhost:4321/</link><item><title>Prefill vs Decode: The Two Phases of LLM Inference</title><link>http://localhost:4321/notes/prefill-vs-decode/</link><guid isPermaLink="true">http://localhost:4321/notes/prefill-vs-decode/</guid><description>Understand the difference between prefill and decode in LLM inference, and why prefill is better suited to batching.</description><pubDate>Sun, 21 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Why Does the KV Cache Consume So Much GPU Memory?</title><link>http://localhost:4321/notes/kv-cache-memory/</link><guid isPermaLink="true">http://localhost:4321/notes/kv-cache-memory/</guid><description>A walkthrough of the KV Cache data structure, how to estimate its GPU memory footprint, and why long contexts amplify the problem.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate></item><item><title>How Does the Prefix Cache Hit Rate Affect TTFT?</title><link>http://localhost:4321/notes/prefix-cache-ttft/</link><guid isPermaLink="true">http://localhost:4321/notes/prefix-cache-ttft/</guid><description>An analysis of how Prefix Cache hits and misses affect time-to-first-token (TTFT) latency, with notes on the follow-up benchmark plan.</description><pubDate>Fri, 19 Jun 2026 00:00:00 GMT</pubDate></item></channel></rss>