The ML Engineer

December 2, 2025

The ML Engineer 02-12-2025

transformers v5, AI culture and strategy, machine learning for physical systems

🔧 Company Engineering Blogs

Why developers still flock to Python: Guido van Rossum on readability, AI, and the future of programming (github​.blog). Guido van Rossum discusses Python’s readability, its pivotal role in AI, and the language’s evolution and ecosystem

Transformers v5: Simple model definitions powering the AI ecosystem (huggingface​.co). Transformers v5 introduces simplicity, modular model definitions, training at scale, improved inference, and quantization with PyTorch focus and ecosystem interoperability

Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures (machinelearning​.apple​.com). Conjugate moment measures and convex potentials to sample and map from a log-concave distribution using input-convex neural networks

Reducing experiment duration with predicted control variates (etsy​.com). Reducing experiment duration using predicted control variates with Python, R, and machine learning on A/B tests


💼 Career & Community

Joining Eventual (five-eights​.com). Joining Eventual: a distributed systems engineer’s path from Honeycomb to Eventual and building a scalable data-processing product

27/Nov/2025 - Platform Engineering is ON (chris​.funderburg​.me). Platform engineering lead builds a Video AI Engine on AWS with Terraform; weather AI framework for energy forecasting; London work-life balance

Feeling Thankful Today and Reflecting on Two Incredible Years at Red Hat (terrytangyuan​.github​.io). Gratitude for two years at Red Hat, mentorship, leadership growth, and open source collaboration with Kubeflow and teammates


🌍 AI Culture & Strategy

Is there a Multicultural Advantage in AI? (blog​.alexsanjoseph​.com). Explores multilingual diversity, Sapir-Whorf ideas, and cultural cognition to boost AI reasoning beyond English-centric data

Digital Inbreeding: What Happens When AI Runs Out of “Surprise”? (shawnharris​.com). Explores model collapse, Surplexity, balancing loops, and the need for Gold Standard data to curb synthetic content flooding the web

AI Capex Risk as Predictable Engineering (paulkedrosky​.com). AI capex risk, pre-training scaling, deterministic engineering, Ilya Sutskever, OpenAI, compute + data + parameters + training = capability

Quoting Qwen3-VL Technical Report (simonwillison​.net). Quoting Qwen3-VL: long-context video QA evaluation and YaRN-based extension achieving near-perfect accuracy

Circuit discovery through chain of thought using policy gradients (lesswrong​.com). Policy gradients and integrated gradients enable tracing chain-of-thought in RL-augmented circuit discovery for LLMs, using subgraph z variables to patch edges


🛠️ Data Platforms & Tooling

An Example ETL Pipeline with dlt + SQLMesh + DuckDB (aetperf​.github​.io). ETL with dlt, SQLMesh, and DuckDB in Python: Yahoo Finance data, transformations, and dashboards by François Pacull

Alex Bradbury: Minipost: Benchmarking the Hetzner AX102 vs CCX53 (muxup​.com). Benchmarking Hetzner AX102 and CCX53 using Arch Linux, Clang/LLVM, LLD, Ninja, and LLVM tests

Plan for Clojure AI, ML, and high-performance Uncomplicate ecosystem in 2026 (dragan​.rocks). Clojure AI/ML ecosystem plan for 2026: Neanderthal, Deep Diamond, Diamond ONNX Runtime, ClojureCUDA, ClojureCPP, Apple Presets, METAL bindings, and more, powered by Clojurists Together funding

tisthemachinelearner: New Workflow with uv for R Integration of scikit-learn (thierrymoudiki​.github​.io). tisthemachinelearner integrates uv for seamless R and Python scikit-learn workflow in a lightweight interface

Hachi: An Image Search Engine (eagledot​.xyz). Self-hosted image search engine with Nim and Python backend, ML embeddings, vector-indexing, and a modular, minimalistic design


🤖 Agents, Services & MLOps

A Whole New World (annievella​.com). Explores AI engineering shifts, non-deterministic systems, and the AI application layer with Gen AI, LLMs, and modern tooling

On "AI Brendans" or "Virtual Brendans" (brendangregg​.com). AI performance tuning agents named 'AI Brendans' and 'Virtual Brendans' emerge, tracing history from virtual Adrian to flame-graph-based optimization and in-house vs. commercial tooling

Demo to Production: An Open Source Architecture for Reliable AI Agents (lfaidata​.foundation). Open source, vendor-neutral architecture using ONNX, KServe, Temporal, OPA, OpenTelemetry, FAISS for robust AI agents

Python Services in 2025: Why They Matter & How They Power Modern Tech (howtolearnmachinelearning​.com). Modern Python services drive AI, data, and backend work with FastAPI, AI code assistants, and MLOps in 2025

Evaluate models with the Amazon Nova evaluation container using Amazon SageMaker AI (aws​.amazon​.com). Nova evaluation container adds BYOM metrics, log probabilities, metadata passthrough, LLM-as-a-judge, and multi-node scaling for SageMaker AI model evaluation


🧪 ML for Physical Systems

NeurIPS Paper #1: INC, An Indirect Neural Corrector for Auto-Regressive Hybrid PDE Solvers (ge​.in​.tum​.de). INC uses indirect neural corrections embedded in governing equations to stabilize auto-regressive hybrid PDE solvers and boost performance

NeurIPS Paper #2: Neural Emulator Superiority, When Machine Learning for PDEs Surpasses its Training Data (ge​.in​.tum​.de). Neural emulators for PDEs can outperform high-fidelity solvers under certain conditions using data-driven models and multi-step rollout analysis

VLM Fine-Tuning for Robotics on AMD Enterprise AI Suite (rocm​.blogs​.amd​.com). LoRA fine-tuning of OpenCLIP for robotics on AMD ROCm Enterprise AI Suite using BridgeData V2 with Kubernetes and Hugging Face PEFT

How to Scale Data Generation for Physical AI with the NVIDIA Cosmos Cookbook (developer​.nvidia​.com). Scalable synthetic data generation for physical AI with NVIDIA Cosmos WFMs and Cosmos Transfer recipes in the Cookbook

Don’t go hacking my chart (sciencespot​.co​.uk). An anomaly-detection system for smart hospitals combines time-series analysis with image classification to detect irregular data and security breaches


🧮 Classical ML Algorithms

Unexpected Side Effects of Let’s Add More jobs (tiago​.rio​.br). Debugging a model deployment: Python, XGBoost, joblib Loky, n_jobs misconfiguration in a Pipeline

The Q, K, V Matrices (arpitbhayani​.me). How Q, K, V projections drive self-attention with Python numpy examples and dimension choices

Scalable Exploration via Ensemble++ (richardli​.xyz). Ensemble++ enables scalable Thompson Sampling-like exploration in neural bandits using a shared ensemble matrix factor with incremental updates

Data leakage (markusloecher​.github​.io). Explores data leakage concepts, preprocessing before train/test splits, StandardScaler, pipelines, cross-validation, and one-hot encoding in Python ML workflows

The Greedy Boruta Algorithm: Faster Feature Selection Without Sacrificing Recall (towardsdatascience​.com). Greedy Boruta speeds up all-relevant feature selection using a relaxed confirmation criterion in Python with boruta_py and GreedyBorutaPy

Why False Positives Are Costing Banks More Than Fraud - with Suvaleena Paul of Bank of America (podcast​.emerj​.com). Bank of America fraud AI lead discusses integrating AI-driven fraud prevention, speeding threat detection, cross-team collaboration, and ROI-focused model refinement

“Decision Tree Regression from Scratch with Pointers Using C#” in Visual Studio Magazine (jamesmccaffreyblog​.com). Decision tree regression with C# in Visual Studio Magazine demonstrates pointer-based nodes and MSE splits


📐 Mathematical Foundations

Fisher Information Explained: Python and Visual Illustrations (data-processing​.club). Explains Fisher information and KL divergence with Python examples (NumPy) and visualizations, including Monte Carlo methods and a multivariate generalization

Generalized Worley Noise (ianthehenry​.com). Explores generalized Worley noise, signed distance functions, and Bauble for 3D procedural textures and distortion

Diffeomorphic deformations of Markov diffusions (djalil​.chafai​.net). Diffeomorphic deformation of Markov diffusions: Itô calculus, semigroups, generators, and flattening via diffusion-diffeomorphism tricks

Scaling Time Series Using the Inverse Hyperbolic Sine (minimizeregret​.com). Scaling time series with inverse hyperbolic sine transformation used after standardization for variance stabilization

The Manifold Hypothesis in SAT Prep (justinmath​.com). Manifold hypothesis in SAT prep: reconciling countless subskills with a smaller, enumerable core of recurring concepts

Matrix Moore-Penrose Pseudo-Inverse Via SVD Using C# (jamesmccaffreyblog​.com). Moore-Penrose pseudo-inverse via SVD (Jacobi) implemented in C# with MatDecompSVD and helper matrix operations


📚 Academic Research

Probabilistic Hash Embeddings for Online Learning of Categorical Features (arxiv:stat). Introduces probabilistic hash embeddings for streaming categorical features, using Bayesian online learning to avoid forgetting. Great for engineers building memory-efficient, order-invariant recommenders or classification systems

On the Effect of Regularization on Nonparametric Mean-Variance Regression (arxiv:stat). Analyzes overparameterized mean-variance regression using statistical field theory, revealing sharp regularization-driven phase transitions. Helps practitioners tune uncertainty-aware models more systematically on tabular and scientific datasets

Privacy-Utility-Bias Trade-offs for Privacy-Preserving Recommender Systems (arxiv:cs). Evaluates differential privacy mechanisms across recommender architectures, measuring accuracy and fairness. Offers guidance on privacy–utility–bias tradeoffs when deploying DP-SGD or local DP in recommendation systems

A Probabilistic Framework for Temporal Distribution Generalization in Industry-Scale Recommender Systems (arxiv:cs). Proposes ELBO_TDS, a causal-variational framework tackling temporal distribution shift in industrial recommenders. Combines time-aware data augmentation and self-supervised objectives, delivering GMV gains at Shopee scale

On Information Theoretic Fairness With A Bounded Point-Wise Statistical Parity Constraint: An Information Geometric Approach (arxiv:cs). Develops information-theoretic methods for fair representations under statistical parity constraints. Uses information geometry to derive quadratic optimization solutions balancing compression, task relevance, and fairness guarantees

👋 Before you go...

I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can, by joining the Patreon page. Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month.

If you are getting value from Blaze, checking this out would mean the absolute world. But if you can't contribute, no worries - the newsletters keep coming either way. Thanks for reading and being part of this nerdy corner of the internet. All the best for the coming week - Alastair.
