
The ML Engineer

January 7, 2026

The ML Engineer 07-01-2026

📝 Engineering Notes

DGX Spark: Hello World (svnscha​.de). DGX Spark brings a personal AI supercomputer with 128GB of memory; exploring Ollama, LibreChat, ComfyUI, and Stable Diffusion on Blackwell hardware

From Tables to Triumph: A PhD Journey in Uncertainty-Aware Scientific Data Extraction (ws-dl​.blogspot​.com). A PhD journey in uncertainty-aware scientific table data extraction using TTA-m, TSR-OCR-UQ, SciTableQA, and SCITEUQ at ODU with Dr. Jian Wu

2025: Career in Review (sajalsharma​.com). Year in AI product building, teaching courses, and viral blogging with LangGraph, Liminal, Claude Code, and multi-agent systems

Looking back at 2025 (blog​.lawrencejones​.dev). Lawrence Jones chronicles incident.io's AI SRE rise: Series B, telemetry push, 18 engineers, Sev0 demo, and hands-on tooling and backtests

🚦 ML/LLMOps

Drift Detection in Robust Machine Learning Systems (towardsdatascience​.com). Drift detection in ML systems using data and concept drift, KS test, PSI, chi-square, autoencoder-based multivariate checks, with practical guidance
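
The post's univariate checks are easy to prototype. A minimal sketch on synthetic data (not the article's code), using a two-sample KS test and a quantile-binned PSI:

```python
# Two 1-D feature windows: a training reference and a recent serving sample.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 5_000)    # training-time feature values
production = rng.normal(0.3, 1.1, 5_000)   # drifted serving-time values

# Kolmogorov-Smirnov: a small p-value suggests the two distributions differ.
stat, p_value = ks_2samp(reference, production)
print(f"KS statistic={stat:.3f}, p={p_value:.4f}")

def psi(expected, actual, bins=10):
    """Population Stability Index over quantile bins of the reference."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf          # catch out-of-range values
    e = np.histogram(expected, edges)[0] / len(expected)
    a = np.histogram(actual, edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))

print(f"PSI={psi(reference, production):.3f}")     # >0.2 is a common alert level
```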

Why 90% of AI Agents Never Leave the Demo (mikulskibartosz​.name). Practical AI engineering guide exploring three production pitfalls and how to build reliable AI pilots with abstractions, metrics, and real-data testing

Actor Mesh: Enterprise Architecture for Scalable AI Engineering (blog​.kodigy​.com). A practical guide to scalable AI engineering using the Actor Model, distributed sagas, Kubernetes, and monotonic content enrichment

The Control Layers of AI (dri​.es). Deterministic workflows, AI decisioning, and open-source orchestration tools like n8n and Activepieces for reliable enterprise AI

Kubernetes v1.35: New level of efficiency with in-place Pod restart (kubernetes​.io). Kubernetes 1.35 introduces in-place Pod restart (alpha), enabling full Pod restarts to reset state for AI/ML workloads

Getting metrics by logging (natemeyvis​.com). Using logs to emit metrics in AWS CloudWatch with Python, refactoring for testability and performance benefits
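
One standard way to do this is CloudWatch's Embedded Metric Format, where a structured JSON log line becomes a metric with no PutMetricData call. A minimal sketch; the namespace, dimension, and metric names are made up, and the article's own helper may look different:

```python
import json
import time

def log_metric(name, value, unit="Milliseconds", namespace="MyApp"):
    # One JSON line on stdout; CloudWatch turns EMF-shaped log events
    # into metrics, so no PutMetricData API call is required.
    print(json.dumps({
        "_aws": {
            "Timestamp": int(time.time() * 1000),
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [["Service"]],
                "Metrics": [{"Name": name, "Unit": unit}],
            }],
        },
        "Service": "checkout",   # hypothetical dimension value
        name: value,             # the metric value itself
    }))

log_metric("InferenceLatency", 42.7)
```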

⚙️ PyTorch, LLMs & GPUs

Accelerating Multimodal Inference in vLLM: The One-Line Optimization for Large Multimodal Models (rocm​.blogs​.amd​.com). Batch-level data parallelism for vision encoders in vLLM boosts throughput on AMD MI300X using a one-line --mm-encoder-tp-mode data switch

Optimizing Data Transfer in AI/ML Workloads (towardsdatascience​.com). Explores data-transfer bottlenecks in AI/ML workloads using NVIDIA Nsight Systems and PyTorch, with CUDA streams and prefetching
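
The prefetching pattern the post profiles can be sketched with a side CUDA stream that copies batch i+1 while batch i computes. A stand-in model and synthetic pinned batches, not the article's code:

```python
import torch

model = torch.nn.Linear(512, 512).cuda()                    # stand-in model
batches = [torch.randn(256, 512).pin_memory() for _ in range(8)]
copy_stream = torch.cuda.Stream()

def prefetch(cpu_batch):
    # Issue the H2D copy on a side stream; non_blocking needs pinned memory.
    with torch.cuda.stream(copy_stream):
        return cpu_batch.to("cuda", non_blocking=True)

next_gpu = prefetch(batches[0])
for cpu_batch in batches[1:]:
    # Make the default stream wait until the prefetched copy has landed.
    torch.cuda.current_stream().wait_stream(copy_stream)
    current, next_gpu = next_gpu, prefetch(cpu_batch)
    out = model(current)       # compute on batch i overlaps the copy of i+1
torch.cuda.current_stream().wait_stream(copy_stream)
out = model(next_gpu)          # last batch
```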

Inspecting and Visualizing Torch FX Graph (leimao​.github​.io). Inspect and visualize Torch FX graphs and ATen IR with PyTorch 2.x, using FxGraphDrawer, TorchFunctionMode, and TorchDispatchMode
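
For a quick taste of the workflow: symbolic tracing plus the graph's tabular dump already shows the IR, and FxGraphDrawer (covered in the post) renders the same graph via Graphviz. A minimal sketch:

```python
import torch
import torch.fx

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))

traced = torch.fx.symbolic_trace(Net())
traced.graph.print_tabular()   # one row per node: opcode, name, target, args
print(traced.code)             # the Python source regenerated from the graph
```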

Train Your Large Model on Multiple GPUs with Tensor Parallelism (machinelearningmastery​.com). Tensor parallelism for large transformers on multi-GPU systems using PyTorch, with TP plans, DTensor, and 2D parallelism concepts
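
A DTensor-style TP plan boils down to mapping submodule names to parallel styles. A minimal sketch for one MLP, launched with torchrun; module names and sizes here are illustrative, not the article's:

```python
import os
import torch
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel, RowwiseParallel, parallelize_module)

dist.init_process_group("nccl")
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
mesh = init_device_mesh("cuda", (dist.get_world_size(),))

torch.manual_seed(0)  # same replicated input on every rank
mlp = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 1024),
).cuda()

# Column-shard the up-projection and row-shard the down-projection so the
# intermediate activation stays sharded and only one all-reduce is needed.
parallelize_module(mlp, mesh, {"0": ColwiseParallel(), "2": RowwiseParallel()})

y = mlp(torch.randn(8, 1024, device="cuda"))  # output is replicated per rank
```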

Parameter-efficient fine-tuning in tinygrad (dxuuu​.xyz). Parameter-efficient fine-tuning with LoRA in tinygrad on Llama 3.2 1B, exploring implementation and inference
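
The LoRA idea itself fits in a dozen lines. Here is the core in plain PyTorch rather than tinygrad (hypothetical shapes, not the post's code): freeze W and learn a low-rank update BA scaled by alpha/r.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                       # freeze pretrained W
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init
        self.scale = alpha / r

    def forward(self, x):
        # W x + (alpha/r) * B A x; only A and B receive gradients.
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(512, 512))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)   # 8,192 trainable params vs 262,656 in the frozen base
```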

SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning (rocm​.blogs​.amd​.com). SparK enables query-aware unstructured KV cache pruning and recoverable channels to compress LLM KV caches on AMD Instinct GPUs

Finding Hotspots in Your Code with the Intel VTune Command-Line Interface (nas​.nasa​.gov). Profiling hotspots with the Intel VTune command-line interface for serial, Python, OpenMP, and MPI (MPT) codes on NASA HECC clusters

Zhang et al (2024) TinyLlama (adrian​.idv​.hk). TinyLlama 1.1B trains on SlimPajama-derived data to outperform larger models; uses the Llama 2 architecture, FSDP, FlashAttention, and xFormers; reports 24K tokens/s per A100-40G

🎯 Applications

Use Claude Code to Query 600 GB Indexes over Hacker News, ArXiv, etc. (exopriors​.com). Semantic search over alignment documents with Claude Code, vector mixing, debiasing, and public/private embeddings for arXiv, Hacker News, and more

Not just numbers: Understanding cities through their words (gisagents​.org). Explores how housing, tourism, digital economy, and government reports reveal city dynamics using online reviews and social data

Machine Learning for Optimization: Toy Example (juanitorduz​.github​.io). Brute-force ML optimization in Python using HistGradientBoostingRegressor to tune bids, with Nelder-Mead/Powell, and visualization of x1/x2 synthetic data
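
A toy version of that recipe, with a synthetic quadratic objective rather than the post's bid data: fit a surrogate on noisy samples, then hand it to a derivative-free optimizer.

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor
from scipy.optimize import minimize

rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(2_000, 2))                 # (x1, x2) samples
y = (X[:, 0] - 1) ** 2 + (X[:, 1] + 2) ** 2 + rng.normal(0, 0.3, len(X))

surrogate = HistGradientBoostingRegressor().fit(X, y)   # learn the objective

# Minimize the surrogate with a derivative-free method (Nelder-Mead).
res = minimize(lambda x: surrogate.predict(x.reshape(1, -1))[0],
               x0=np.zeros(2), method="Nelder-Mead")
print(res.x)   # should land near the true optimum (1, -2)
```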

When Kevin Malone meets Claude (koaning​.io). Concise guide to HuberRegressor in scikit-learn, its robust loss, epsilon tuning, and practical usage
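
For reference, the scikit-learn usage is nearly a one-liner; on data with a few corrupted labels, Huber's slope stays close to the truth while OLS gets dragged (synthetic data, not the post's):

```python
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X.ravel() + 1.0 + rng.normal(0, 0.5, 200)
y[:10] += 50                                  # corrupt a few labels

# epsilon sets where the loss switches from quadratic to linear.
huber = HuberRegressor(epsilon=1.35).fit(X, y)
ols = LinearRegression().fit(X, y)
print(huber.coef_, ols.coef_)   # Huber stays near 3.0; OLS is pulled upward
```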

How Netflix Uses Matrix Factorization to Predict Your Next Favorite Movie (journal​.hexmos​.com). Explains Matrix Factorization with latent factors and SGD for sparse Netflix-like data using Python vectors
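
The SGD update reduces to two lines per observed rating; a bare-bones sketch on synthetic (user, item, rating) triples:

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_items, k = 50, 40, 8
ratings = [(rng.integers(n_users), rng.integers(n_items),
            rng.uniform(1, 5)) for _ in range(2_000)]

P = rng.normal(0, 0.1, (n_users, k))   # user latent factors
Q = rng.normal(0, 0.1, (n_items, k))   # item latent factors
lr, reg = 0.02, 0.05

for epoch in range(20):
    for u, i, r in ratings:
        err = r - P[u] @ Q[i]          # prediction error on this rating
        pu = P[u].copy()
        P[u] += lr * (err * Q[i] - reg * P[u])   # gradient step on p_u
        Q[i] += lr * (err * pu - reg * Q[i])     # gradient step on q_i

print(P[0] @ Q[0])   # predicted rating for user 0, item 0
```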

Generating Human Faces with Variational Autoencoders (mayberay​.bearblog​.dev). Variational autoencoders (VAEs) explored: KL-divergence, reparameterization, MNIST digits, and convolutional faces using PyTorch
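
The reparameterization trick in isolation, with the closed-form KL between a diagonal Gaussian and N(0, I); a minimal PyTorch sketch:

```python
import torch

def reparameterize(mu, log_var):
    eps = torch.randn_like(mu)            # the noise carries the randomness
    return mu + torch.exp(0.5 * log_var) * eps

mu = torch.zeros(4, 16, requires_grad=True)
log_var = torch.zeros(4, 16, requires_grad=True)
z = reparameterize(mu, log_var)           # gradients flow through mu, log_var

# KL( N(mu, sigma^2) || N(0, I) ) in closed form, averaged over the batch.
kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp(), dim=1).mean()
kl.backward()                             # differentiable end to end
```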

⏱️ Forecasting & State Space

Forecasting Hierarchical Models - Part III (juanitorduz​.github​.io). Hybrid deep state-space forecasting in NumPyro with Flax NNX, station embeddings, and SVI in Python

Python examples for ‘Beyond Nelson-Siegel and splines: A model-agnostic Machine Learning framework for discount curve calibration, interpolation and extrapolation’ (thierrymoudiki​.github​.io). Python examples using yieldcurveml to interpolate and extrapolate discount curves with Laguerre and Cubic bases in ML models

Forecasting benchmark: Dynrmf (a new serious competitor in town) vs Theta Method on M-Competitions and Tourism competition (thierrymoudiki​.github​.io). A benchmarking study comparing Dynrmf and the Theta Method on M3, M1, and Tourism datasets using R, parallel processing, and standard accuracy metrics

Modelling time to next reported fault (shape-of-code​.com). Explores modelling time to next fault reports using Poisson processes, exponential interarrival times, and user activity effects in software fault prediction
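
Under the Poisson assumption, interarrival gaps are exponential, so the rate estimate and a goodness-of-fit check take a few lines. Synthetic gaps, not the post's data; note that testing against fitted parameters makes the KS p-value optimistic:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
gaps = rng.exponential(scale=3.5, size=500)   # days between fault reports

rate = 1.0 / gaps.mean()                      # MLE of the Poisson rate
print(f"estimated rate: {rate:.3f} faults/day")

# KS test of the exponential assumption, with scale fixed at the fitted value.
stat, p = stats.kstest(gaps, "expon", args=(0, 1 / rate))
print(f"KS p-value: {p:.3f}")                 # large p: consistent with Poisson
```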

🧪 Causal & Statistical Evidence

Testing Super Learner's Coverage - A Note To Myself (kenkoonwong​.com). Explores SuperLearner with TMLE in R, comparing XGBoost, Random Forest, GLM, NNLS, and parallel computation

Effects of Algorithmic Flagging on Fairness: Quasi-experimental Evidence from Wikipedia (mako​.cc). Quasi-experimental analysis using regression discontinuity to study algorithmic flagging effects on Wikipedia fairness across editor groups

Willful Incompetence: Repeating False Claims Does Not Make them True (replicationindex​.com). Schimmack defends z-curve, critiques Pek et al., discusses EDR, LLN, bias, and meta-analysis in emotion psychology with R-like statistical nuance

🧠 Representation & Vision

Neural Networks: Zero to Hero (karpathy​.ai). A practical guide to neural networks with PyTorch and Python, detailing concepts from linear models to deep learning, including backpropagation and optimization

Introducing Brand New Face Recognition in DeepFace (sefiks​.com). DeepFace adds register, build_index, and search to enable scalable, stateless face recognition with databases and ANN

Human embeddings (trfetzer​.com). Explores human embeddings for team design using 1024D vector space, open-source embeddings, GPT-OSS, and clustering to form diverse groups

Beating BERT? Small LLMs vs Fine-Tuned Encoders for Classification (alex-jacobs​.com). 32 experiments compare small LLMs to fine-tuned encoders like BERT/DeBERTa for classification, revealing nuanced performance and throughput insights

The 1,000 neuron challenge (thetransmitter​.org). Small neural models in a 1,000-neuron Braincraft competition explore energy-efficient AI with Python/GitHub, featuring Rougier and Churchland

🧲 Physics & Spectra

Robustness, interpretability, and scaling of eigenvalue models (alexshtf​.github​.io). Robustness, interpretability, and scaling of eigenvalue models using PyTorch on real data (California Housing), examining spectral bounds and feature importance

DeepSeek’s mHC Explained: Manifold-Constrained Hyper-Connections (aipapersacademy​.com). Explains DeepSeek’s mHC: manifold-constrained hyper-connections that stabilize and enhance residual streams in LLMs using doubly stochastic mixing via Sinkhorn–Knopp

The Physics of mHC: Why Deep Learning Needs Energy Conservation (toooold​.com). Conservation-inspired mHC uses doubly stochastic matrices and Sinkhorn iterations to build energy-conservative neural layers, illustrated with Python-like pseudocode
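
The Sinkhorn-Knopp projection both posts lean on is a short alternating normalization; a minimal sketch:

```python
import numpy as np

def sinkhorn(M, iters=50):
    M = np.exp(M)                            # ensure strictly positive entries
    for _ in range(iters):
        M /= M.sum(axis=1, keepdims=True)    # normalize rows to sum to 1
        M /= M.sum(axis=0, keepdims=True)    # normalize columns to sum to 1
    return M                                 # approximately doubly stochastic

A = sinkhorn(np.random.default_rng(0).normal(size=(4, 4)))
print(A.sum(axis=0), A.sum(axis=1))          # both ~[1, 1, 1, 1]
```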

The Transformer as Renormalization Group Flow (symmetrybroken​.com). Transformer attention as a Bayesian RG flow: coarse-graining tokens to semantic attractors using neural networks, physics-inspired interpretation

Information entropy for spectra (nirpyresearch​.com). Shannon entropy applied to NIR spectra with Python (SciPy, NumPy); derivative entropy for smoothing optimization, inspired by K. G. Larkin
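
A sketch of the idea on a synthetic peak: treat the magnitude of the derivative spectrum as a probability distribution and compare entropy before and after Savitzky-Golay smoothing; the smoother derivative concentrates mass and scores lower. Not the article's code.

```python
import numpy as np
from scipy.stats import entropy
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 500)
spectrum = np.exp(-((x - 0.5) / 0.1) ** 2) + rng.normal(0, 0.01, x.size)

def spectral_entropy(s):
    p = np.abs(s) / np.abs(s).sum()   # normalize magnitudes to a distribution
    return entropy(p)                 # Shannon entropy in nats

deriv_raw = np.gradient(spectrum, x)
deriv_smooth = savgol_filter(spectrum, 31, polyorder=3, deriv=1)
print(spectral_entropy(deriv_raw), spectral_entropy(deriv_smooth))
```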

Limits of the Transformer Architecture and a QCD-like Alternative (symmetrybroken​.com). A physics-inspired critique of transformers, exploring UV/IR limits, QCD analogies, and multi-scale architectures for cognition using Python-like pseudocode

📚 Research

Can Small Training Runs Reliably Guide Data Curation? Rethinking Proxy-Model Practice (arxiv:cs). Shows small proxy runs can mis-rank data recipes because optimal hyperparameters depend on data. Lower learning rates make rankings correlate with tuned large-scale pretraining better

Analyzing Communication Predictability in LLM Training (arxiv:cs). Characterizes predictable communication patterns in distributed LLM training and models overhead analytically. ConfigTuner uses the model to pick parallelism settings, boosting throughput up to 1.36×

Disordered Dynamics in High Dimensions: Connections to Random Matrices and Machine Learning (arxiv:stat). Pedagogical dynamical mean-field theory for high-dimensional random-matrix learning dynamics: gradient flow, SGD, random features, deep linear nets. Explains loss nonmonotonicity and bias–variance scaling for practitioners

Conformal Prediction Under Distribution Shift: A COVID-19 Natural Experiment (arxiv:stat). Uses COVID-19 supply-chain shifts to stress-test conformal prediction; coverage collapses unpredictably. Finds SHAP importance concentration predicts failures and suggests monitoring plus quarterly retraining triggers operationally
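
The split-conformal guarantee being stress-tested is easy to restate in code: calibrate a residual quantile on held-out data and widen predictions by it. Synthetic data; exactly this marginal coverage is what breaks under distribution shift.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(3_000, 5))
y = X[:, 0] * 2 + np.sin(X[:, 1]) + rng.normal(0, 0.5, 3_000)

X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.3, random_state=0)
model = RandomForestRegressor(n_estimators=100).fit(X_tr, y_tr)

# Finite-sample-corrected quantile of absolute calibration residuals.
alpha = 0.1
scores = np.abs(y_cal - model.predict(X_cal))
q = np.quantile(scores, np.ceil((len(scores) + 1) * (1 - alpha)) / len(scores))

x_new = rng.normal(size=(1, 5))
pred = model.predict(x_new)[0]
print(f"90% interval: [{pred - q:.2f}, {pred + q:.2f}]")
```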
