The ML Engineer

December 9, 2025

The ML Engineer 09-12-2025

ML practices, fairness debates, future AI architectures, perception science

🔧 Company Engineering Blogs

Getting from tested to battle-tested (blog​.janestreet​.com). Jane Street details Aria testing, Antithesis end-to-end chaos testing, and lessons from battle-testing distributed systems using OCaml-centric tooling

The Hidden Cost of Convenience: Rethinking Old ORM Patterns for Scale (eng​.wealthfront​.com). Wealthfront rewrites an old ORM-heavy balance system using modern architecture to cut runtime and scale efficiently

How Agentforce Achieved 3–5x Faster Response Times While Solving Enterprise-Scale Architectural Complexity (engineering​.salesforce​.com). Salesforce refactors deterministic and LLM tasks in Apex, cuts latency by 75%, and deploys multi-brand Agentforce agents with a tailored brand voice

We Got Claude to Fine-Tune an Open Source LLM (huggingface​.co). Demonstrates Claude fine-tuning an open-source LLM (Qwen3-0.6B) through Hugging Face Skills, using SFT, DPO, and GRPO

Mob Programming: Smells Like Team Spirit (tech​.trivago​.com). Mob programming at trivago Intelligence: structured collaboration, quick onboarding, and improved PR flow using diverse team roles


📚 Applied ML Practice

Improving Water Quality Predictions with Machine Learning (eesa​.lbl​.gov). Ensemble machine learning improves river water temperature predictions in data-sparse regions using multiple model types and high-performance computing

Yes, You Need To Work Through Concrete Examples (justinmath​.com). Concrete calculations and bottom-up intuition build mastery in ML; avoid cargo-cult abstractions and push beyond gradient descent

How Thousands of Citizen Readers Helped Build the Largest Open-Vocabulary Dataset of Narrative Emotions (txtlab​.org). Open-vocabulary narrative emotion dataset in which 3,738 citizen readers on Zooniverse contributed 200k annotations across 43k passages, modeled with VAD and NRC emotions

Building a Bayesian Spam Classifier from First Principles (journal​.hexmos​.com). Bayesian spam filtering with Naive Bayes, Enron data, Python code, and CPT/Laplace smoothing explained


⚖️ Fairness & Causality

Willful Incompetence: Questionable Modeling Practices in Implicit Bias Research (replicationindex​.com). Schimmack critiques IAT practices, argues shared method variance inflates validity, promotes multimethod models and careful interpretation

Talk at the JFLI, at the NII (国立情報学研究所) in Tokyo (東京) (freakonometrics​.hypotheses​.org). Talk on counterfactual and transport-based methods for understanding indirect discrimination in algorithmic systems using causal reasoning and optimal transport

Is the Implicit Association Test Too Big To Fail? (replicationindex​.com). IAT validity challenged; latent models, method variance, and adversarial collaboration analyzed using psychometric critiques

Actuarial Pricing Discrimination and Fairness (freakonometrics​.hypotheses​.org). Actuarial pricing, fairness concepts, proxies, and regulatory tensions in discrimination-aware insurance modeling


🤖 Agent Architectures

Titans + MIRAS: Helping AI have long-term memory (research​.google). Titans and MIRAS enable long-term memory in AI with on-the-fly learning, surprise metrics, and deep memory modules

Glia: A Human-Inspired AI for Systems Design and Optimization (sigops​.org). Glia uses a human-inspired multi-agent AI to autonomously design and optimize AI infrastructure, including LLM serving routers, batch schedulers, and autoscalers

Scaling Agentic Inference Across Heterogeneous Compute with Zain Asgar - #757 (twimlai​.com). Gimlet Labs’ Zain Asgar discusses heterogeneous AI inference across diverse hardware including H100s, CPUs, and older GPUs, with a three-layer architecture

The Inverted Agent (jlowin​.dev). SEP-1577 enables MCP sampling with tools, flipping agent architecture to server-driven control using FastMCP and structured outputs


🎯 Perception & Sensing

Traffic Modeling Using Machine Learning (calendar​.perfplanet​.com). Predicts lab-to-field LCP relationships using Python, pandas, XGBoost with synthetic CrUX data and engineered features

Measuring What Actually Matters in Real-Time ADAS Perception (blog​.us​.fixstars​.com). Memory bandwidth and latency dominate real-time ADAS perception; highlights include edge compute constraints, dataflow optimization, and federated learning approaches

Grounding DINO: Open Vocabulary Object Detection on Videos (pyimagesearch​.com). Open vocabulary object detection on videos using Grounding DINO with Hugging Face and Gradio

Augmented reality meets neuroendoscopy (cs​.jhu​.edu). Hopkins researchers enable real-time 3D neuroendoscopy navigation with R2D2-E, AI-based feature tracking, and augmented visualization

From Waveforms to Wisdom: The New Benchmark for Auditory Intelligence (research​.google). Google Research introduces Massive Sound Embedding Benchmark (MSEB) to unify eight sound tasks, datasets like SVQ, and a robust evaluation framework


📡 Observability & Benchmarks

Five Things to Check Before You Trust AI-Generated Machine Learning Code (statisticalhorizons​.com). Practical five-check framework for evaluating AI-generated ML code across parameters, tuning, splits, metrics, and interpretation

How I Simplified LLM Telemetry Using Dual-Destination Observability Without Performance Degradation (blog​.mphomphego​.co​.za). Python-based dual-destination telemetry bridge using OpenTelemetry and Traceloop SDK for Instana and Langfuse with circuit breakers

How to Benchmark C++ Code? (codspeed​.io). Guided Google Benchmark-based C++ performance benchmarking with fixtures, parameters, and CI integration tips

A Guide to Web Application Monitoring (blog​.saeloun​.com). A practical guide to monitoring Rails API, React frontend, and PostgreSQL using metrics, logs, traces, and RUM


🛠️ Data Science Tools

Rogue Scholar is improving subject classification (Version 2) (blog​.front-matter​.de). OpenAlex/CWTS subject classification of Rogue Scholar posts using OpenAlex subfields and a machine learning classifier

Haskell IS a Great Language for Data Science (jcarroll​.com​.au). Haskell, dataHaskell, dataframe, and knitr integration showcase strong typing for data science workflows

QGIS to (Geo)Pandas – part 3 (anitagraser​.com). QGIS to GeoPandas uses QgsArrowIterator to stream features as Arrow batches with Python GeoPandas integration


🗃️ Data Systems & Vectors

Product Quantization (arpitbhayani​.me). Explains Product Quantization for compressing high-dimensional vectors, subspace coding, PQ codebooks, and distance computations with Python snippets

A complete guide to vector search (redis​.io). Vector search explained with encoding models, KNN to ANN, hybrid filtering, and hybrid search in a unified data platform context

Adaptive Query Optimizer for MariaDB Vector – Innovation Winner of MariaDB Python Hackathon 2025 (mariadb​.org). Innovation winner: adaptive query optimizer for MariaDB Vector using Python in Hackathon 2025 with Aakanksha Singh and Mihir Phalke

Polars in Aggregate: Polars Cloud, Streaming engine, and New Data Types (pola​.rs). Polars Cloud, streaming engine, and new Decimal and Int128 types empower scalable dataframes in Python and Rust

Apache Flink 2.2.0: Advancing Real-Time Data + AI and Empowering Stream Processing for the AI Era (flink​.apache​.org). Flink 2.2.0 advances real-time data processing with AI features, materialized tables, Delta Joins, improved connectors, and PyFlink support, enabling LLM inference and vector search

Branch, Test, Deploy: A Git-Inspired Approach for Data (ssp​.sh). Git-like workflows for data: branching, zero-copy cloning, Prolly Trees, LakeFS, and Nessie to enable testing and deploying data pipelines


🖥️ Serving & Queues

Optimizing PyTorch Model Inference on CPU (towardsdatascience​.com). CPU inference optimization on Intel Xeon with PyTorch 2.x, AMP, channels-last, IPEX, OpenVINO, and ONNX in a toy ResNet50 workflow

How to run Ollama with docker compose and GPU support (sleeplessbeastie​.eu). GPU-enabled Ollama setup using Docker Compose for accelerated model inference with Nvidia devices

Trying out the Absurd queue for AI Workloads (leblancfg​.com). Explores Absurd, a Postgres-based durable queue for AI workloads, with Python/TypeScript SDKs and a FastAPI demo by François Leblanc


🚀 GPU Training Pipelines

Accelerating Autonomous Driving Model Training on AMD ROCm™ Software (rocm​.blogs​.amd​.com). ROCm-accelerated autonomous driving model training with awesome-rocm-autodrive, MMCV optimizations, NHWC, bmm reshaping, and MIOpen tuning on AMD GPUs

Decoding high-bandwidth memory: A practical guide to GPU memory for fine-tuning AI models (cloud​.google​.com). Guides memory‑efficient fine‑tuning on GPUs with PEFT, LoRA, quantization, FlashAttention, and multi‑GPU strategies

DGL in Depth: SE(3)-Transformer on ROCm 7 (rocm​.blogs​.amd​.com). SE(3)-Transformer runs efficiently with DGL on AMD ROCm 7/MI300X, exploring 3D graphs, equivariant attention, and cross‑platform benchmarks

Introducing checkpointless and elastic training on Amazon SageMaker HyperPod (aws​.amazon​.com). SageMaker HyperPod introduces checkpointless and elastic training to accelerate AI model training

New serverless customization in Amazon SageMaker AI accelerates model fine-tuning (aws​.amazon​.com). Serverless customization in SageMaker AI enables fine-tuning of models like Llama, Qwen, and GPT-OSS with UI or code, automating deployment


🧵 Distributed Training Internals

Machine Learning (danieldk​.eu). Explores Dish Activation, attention mechanisms, logits, quantization, and multi-GPU model parallelism with Tensor Parallelism on machine learning models

Support FSDP2 as A Training Backend for Miles (lmsys​.org). Miles adds FSDP2 as a flexible training backend, enabling DTensor-based sharding, true on-policy training, data packing, and CP/DP optimization for Qwen3-Next and VLM RL

Writing an LLM from scratch, part 28 -- training a base model from scratch on an RTX 3090 (gilesthomas​.com). Trains a GPT-2-style base model on an RTX 3090 with Hugging Face datasets and FineWeb 10B samples; covers tokenization, token counts, and data scaling

Learning to love mesh-oriented sharding (blog​.ezyang​.com). DTensor and mesh-oriented sharding clash: open vs closed designs, extensibility with Placement, and implications for PyTorch and JAX in distributed ML


🧠 LLM Representations

Cross Layer Transcoders for the Qwen3 LLM Family (lesswrong​.com). Explores sparse autoencoders and cross layer transcoders (CLTs) for Qwen3 LLMs with BLUELightAI’s CLT features and TDA tools

Concept Subspaces for Targeted Model Editing (gojiberries​.io). Counterfactual concept subspaces for targeted model editing using activation directions, PCA, orthogonalization, and subspace-constrained adapters

Picking Optimal Token IDs (notes​.hella​.cheap). How to arrange token IDs with PCA-like ordering to maximize zero runs in sparse bit vectors

Interactively Visualizing the Qwen3 MoE Architecture (vkethana​.com). Interactive Qwen3 MoE architecture visualizations explore grouped query attention, RMS normalization, and RoPE rotations

DeepSeek V3.2 (aarnphm​.xyz). DeepSeek V3.2 introduces Sparse Attention (DSA) and FP8 indexing for efficient memory and FLOPs, detailing MHA/MQA training and inference, compressed caches, and Hadamard transforms


📐 Statistical Diagnostics

New Preprint: Model Checking for Vector Autoregressive Models (jmbh​.github​.io). Tutorial on VAR model checking with diagnostics, plots, simulations, and R code for multilevel VAR in psychological time series

Notes - NLA MT25, Marchenko-Pastur theorem (ollybritton​.com). Notes on Marchenko-Pastur theorem for random matrices X, singular values, distribution, and conditioning estimates

Data Science Notes: 1. Bland-Altman plots (hoyleanalytics​.org). Rotating the data to create Bland-Altman plots reveals reproducibility and bias patterns, with examples in Python (statsmodels)

Notes - NLA MT25, Gaussian random matrices (ollybritton​.com). Gaussian random matrices, orthogonal invariance, and basics of G ~ N(0,1) entries for m x n matrices


📚 Academic Research

PystachIO: Efficient Distributed GPU Query Processing with PyTorch over Fast Networks & Fast Storage (arxiv:cs). Introduces PystachIO, a PyTorch-based distributed OLAP engine optimizing GPU, network, and NVMe utilization, offering up to 3x faster analytical queries on modern GPU clusters

Interaction Tensor Shap (arxiv:cs). IT-SHAP reformulates high-order Shapley interaction indices as tensor-network contractions, enabling polynomial-time computation of exact interaction attributions, scaling explainability to deep, high-dimensional models

Robust Tabular Foundation Models (arxiv:cs). Proposes Robust Tabular Foundation Models, adversarially adapting synthetic-data generators using an optimality-gap objective, improving TabPFN performance and robustness on diverse tabular benchmarks with limited pretraining

Hyperparameter Transfer Enables Consistent Gains of Matrix-Preconditioned Optimizers Across Scales (arxiv:cs). Analyzes how learning rate and weight decay should scale for matrix-preconditioned optimizers like Shampoo and Muon, enabling consistent speedups over AdamW across language-model sizes

Gradient Descent with Provably Tuned Learning-rate Schedules (arxiv:cs). Develops theory and algorithms for tuning learning-rate schedules and momentum in gradient descent on nonconvex, nonsmooth objectives, giving complexity guarantees applicable to neural network training

👋 Before you go...

I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can, by joining the Patreon page. Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month.

If you are getting value from Blaze, checking this out would mean the absolute world. But if you can't contribute, no worries - the newsletters keep coming either way. Thanks for reading and being part of this nerdy corner of the internet. All the best for the coming week - Alastair.
