Archive

61 articles in total

Interconnect

2026-05-14 Interconnection Networks (2): Adjoint Communication Operators
2026-04-30 Interconnection Networks (1): Collective Communication Primitives

Memos

Model Parallelism

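Large Model Memory and FLOPs Analysis

2026-05-12 GPU Memory Analysis of Transformer Models (3): Which Intermediate Results Must Be Saved for Backpropagation?
2026-05-11 GPU Memory Analysis of Transformer Models (2): Inference
2026-05-03 GPU Memory Analysis of Transformer Models (1): Training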

GPU Reverse Engineering

Training Optimization

TrioSim Simulator

Large Model Architecture

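2026-06-01 DeepSeek-V3 (1): Sparse Attention Mechanism (DSA)
2026-05-30 DeepSeek-V2 (2): MLA
2026-05-30 Mathematical Derivation of the MoE Load-Balancing Loss: From the Lower Bound of the Loss Function to MoE's Differentiable Auxiliary Loss
2026-05-30 Positional Encoding (1): Rotary Position Embedding (RoPE)
2026-05-29 DeepSeek-V2 (1): DeepSeekMoE
2026-05-25 Transformer Architecture: Deriving the Scaling Factor of Scaled Dot-Product Attention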

LLM Inference Frameworks

2026-06-02 AI Infra: A Learning Roadmap for LLM Inference Optimization

RLHF

2026-06-02 RLHF Learning Roadmap
2026-06-02 Translation: HybridFlow: A Flexible and Efficient RLHF Framework (unfinished)

API Relay

2026-06-22 CPAMC-API Relay

Ideas

2026-06-22 Ideas Awaiting Implementation

Agent Memory

2026-06-26 MemoryBank: Enhancing Large Language Models with Long-Term Memory
2026-06-25 Reflexion: Language Agents with Verbal Reinforcement Learning
2026-06-24 Generative Agents: Interactive Simulacra of Human Behavior
2026-06-24 Improving language models by retrieving from trillions of tokens
2026-06-23 Lost in the Middle: How Language Models Use Long Contexts
2026-06-23 Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
2026-06-23 Table of Contents

Mathematical Principles