标签 - 论文阅读 - 滑滑蛋的个人博客

03-02

【论文阅读】GLM-5:from Vibe Coding to Agentic Engineering

01-09

【论文阅读】Efficient Memory Management for Large Language Model Serving with PagedAttention（vLLM论文）

12-28

【论文阅读】ByteScale:Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000

12-22

【论文阅读】ScheMoE:An Extensible Mixture-of-Experts Distributed Training System with Tasks Scheduling

12-13

【论文阅读】The Llama 3 Herd of Models（Section 3 Pre-Training）

12-08

【论文阅读】Reducing Activation Recomputation in Large Transformer Models

12-07

【论文阅读】Megatron-LM论文阅读

08-18

【论文阅读】{MegaScale}:Scaling Large Language Model Training to More Than 10,000 {GPUs}

08-17

【论文阅读】Fluid:Dataset Abstraction and Elastic Acceleration for Cloud-native Deep Learning Training Jobs

05-01

【论文阅读】Gödel:Unified Large-Scale Resource Management and Scheduling at ByteDance