Papers and books I am reading, with short notes and a relevance rating
(out of 5). Click a title for my notes; "source" links to the original.
S. L. Brunton, J. N. Kutz · Textbook, 2nd ed., Cambridge University Press · ★★★★½ 4.5/5
The canonical Brunton-Kutz textbook tying SVD, sparsity, ML, dynamical systems, control, ROMs, and physics-informed ML into one toolkit.
notes · source
Jacob Austin, Sholto Douglas, Roy Frostig, Anselm Levskaya, Charlie Chen, Reiner Pope, et al. (Google DeepMind) · Online book (jax-ml.github.io/scaling-book), Google DeepMind · ★★★★½ 4.5/5
A first-principles, roofline-driven playbook for scaling Transformer training and inference across thousands of TPUs (and GPUs).
notes · source
Kevin P. Murphy · arXiv 2024 (v5, Dec 2025) · ★★★★½ 4.5/5
Kevin Murphy's 250-page modern RL monograph: value/policy/model-based, multi-agent, and a deep LLMs-and-RL chapter (RLHF, RLVR, PPO/GRPO/DPO).
notes · source
Zhili Li, Kangyang Chai, Zhihao Wang, Xiaowei Jia, Yanhua Li, Gengchen Mai, Sergii Skakun, Dinesh Manocha, Yiqun Xie · arXiv 2026 (v2); under review at IEEE TPAMI · ★★★★☆ 4/5
A remote-sensing super-resolution benchmark that scores SR models by downstream task utility, not just PSNR/SSIM.
notes · notes (v1) · source
R. Taft, I. Sharif, A. Matei, N. VanBenschoten, J. Lewis, et al. (Cockroach Labs) · SIGMOD 2020 (Industry Track) · ★★★★☆ 4/5
How CockroachDB delivers serializable, geo-distributed SQL transactions on commodity clouds without atomic clocks.
notes · source
G. Dexter, S. Tang, A. Fatahi Baarzi, Q. Song, T. Dharamsi, A. Gupta (LinkedIn; Nubank) · NeurIPS 2025 (Poster); also arXiv 2502.04677 · ★★★★☆ 4/5
Theory + algorithm for scheduling LLM queries under prefix-cache reuse and TTFT limits; k-LPM cuts P99 latency vs FCFS/LPM.
notes · source
K. Behdin, A. Fatahibaarzi, Q. Song, Y. Dai, A. Gupta, Z. Wang, et al. (LinkedIn / MIT) · EMNLP 2025 (Industry Track) · ★★★★☆ 4/5
LinkedIn's playbook for shrinking a 100B+ RecSys LLM 20x via distillation, structured pruning, and FP8 quantization, then serving it fast.
notes · source
X. Ning, K. Tieu, D. Fu, T. Wei et al. (UIUC, Meta, Stanford); senior authors H. Tong, J. He, T. Zhang · arXiv 2026 (v1), ~102pp survey · ★★★½☆ 3.5/5
Survey reframing code as the operational "harness" for LLM agents: reasoning, acting, state, verification, and multi-agent coordination.
notes · source
Yun Dai, Tejas Dharamsi, Byron Hsu, Tao Song, Hamed Firooz (LinkedIn) · ICML 2024 ES-FoMo workshop (short paper), PMLR 235 · ★★★½☆ 3.5/5
Finds and fixes a GPU race condition in ZeRO++ hpZ that silently breaks 40B-70B LLM training on low-bandwidth clusters.
notes · source
Gemini Embedding Team, Google DeepMind (M. Shanbhogue, Z. Li, S. Zhang, G. Hernández Ábrego, et al.) · arXiv 2026 (v1), tech report · ★★★½☆ 3.5/5
Google's native multimodal embedding model: one Gemini-initialized encoder maps text, image, video, and audio into a shared vector space.
notes · source
G. Winata, F. Hudi, P. A. Irawan, D. Anugraha, R. A. Putri et al. (60+ authors; senior authors incl. D. I. Adelani, A. Oh, A. F. Aji, T. Watanabe, C.-W. Ngo) · arXiv 2024 (v5, May 2025); NAACL 2025 · ★★½☆☆ 2.5/5
1M-sample VQA benchmark probing whether vision-language models recognize dishes and origins across 30 languages and 189 countries.
notes · source