COMP 620

Materials

Subtopic 1: I/O Aware & Exact Attention

Paper | Link
FlashAttention 1, 2, 3 | PDF, PDF, PDF
PagedAttention (vLLM) | PDF
SGLang | PDF
FlexAttention | PDF
FlashInfer | PDF
SpargeAttention | PDF
SageAttention 1, 2 | PDF, PDF

Subtopic 2: Sparse Attention

Paper | Link
DejaVu | PDF
H2O | PDF
SpAttn | PDF
MoE | PDF
DeepSeekMoE | PDF

Subtopic 3: Kernel Generation & Compiler

Paper | Link
TVM | PDF
Ansor | PDF
MLIR | PDF

Subtopic 4: Execution Optimization/Serving

Paper | Link
Alpa | PDF
Orca | PDF
FlexGen | PDF
ZeRO-Offload | PDF
Megatron-LM | PDF
FlashDecoding++ | PDF
Sarathi-Serve | PDF

Chapter II: Efficient LLM

Subtopic 1: LLM 101

Paper | Link
Attention is All You Need | PDF
BERT | PDF
GPT-3 | PDF
Scaling Laws | PDF
RLHF | PDF
PPO/DPO | PDF, PDF

Subtopic 2: Efficient Inference & Long-context

Paper | Link
StreamingLLM & DuoAttention | PDF, PDF
MInference | PDF
H2O | PDF
TOVA/KIVI | PDF, PDF
Speculative Decoding | PDF, PDF
Multi-token prediction: DeepSeek-V3 | PDF

Subtopic 3: Model Compression (Quant & Pruning)

Paper | Link
LLM.int8()/GPTQ | PDF, PDF
AWQ | PDF
LLM-Pruner | PDF
Sheared LLaMA | PDF

Subtopic 4: Efficient Training

Paper | Link
ZeRO | PDF
Megatron-LM | PDF
LoRA & QLoRA | PDF, PDF

Subtopic 5: Efficient Model Designs

Paper | Link
Switch Transformers/Outrageously Large Neural Networks | PDF, PDF
MLA Attention | PDF
Mamba | PDF

Chapter III: Video Generation

Subtopic 1: SOTA/Baseline Model

Paper | Link
CogVideoX | PDF
HunyuanVideo | PDF
WAN | PDF
Seaweed-7B | PDF

Subtopic 2: Optimization Techniques

Topic | Link
Pruning | UniCP
Cache | PDF (more to be added)
Compression | PDF
Sparsity | PDF

Subtopic 3: Long Video Generation

Paper | Link
Tuning-Free Multi-Event Long Video Generation | PDF
Long Context Tuning for Video Generation | PDF
One-Minute Video Generation with Test-Time Training | PDF
SkyReels-V2: Infinite-Length Film Generative Model | PDF

Subtopic 4: Video Super Resolution

Paper | Link
SeedVR | PDF
MGLD-VSR | PDF
DynamicScaler | PDF

Chapter IV: Secure LLM

Subtopic 1: Diffusion Model/Flow Matching

Subtopic 2: Watermarking

Subtopic 3: Efficient CNN

Subtopic 4: Encryption


Chapter V: MLLM Video Understanding

Subtopic 1: SOTA/Baseline

Subtopic 2: Optimization Techniques

Subtopic 3: Algorithm Design