The ML Engineer 28-10-2025
Scaling privacy, high-impact engineering, quantum computing in ML, increasing data efficiency
🔧 Company Engineering Blogs
Scaling Privacy Infrastructure for GenAI Product Innovation (engineering.fb.com). Meta scales Privacy Aware Infrastructure (PAI) for GenAI with data lineage, policy enforcement, and on-device/off-cloud processing using PrivacyLib in AI glasses
The Engineer’s Guide to Impact: Finding and Focusing on High-Leverage Work (engineering.gusto.com). Identify high-leverage work using a Venn diagram of low effort, high impact, and unique knowledge to multiply impact
Multi-Table Predictions in Data Cloud: Enabling Machine Learning Across Related Data Objects (engineering.salesforce.com). Multi-DMO in Data Cloud enables cross-object predictions with UI/UX refinements, SQL performance tuning, and 99.999% reliability
Streaming datasets: 100x More Efficient (huggingface.co). Streaming datasets enable 100x faster data loading with a simple streaming API, dedupe storage (Xet), Parquet CDC, and scalable pipelines
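The Xet-style dedupe behind this item rests on content-defined chunking: cut points depend on content rather than position, so an edit early in a file leaves later chunks intact and deduplicable. A toy pure-Python sketch of that idea (hash, window, and mask are illustrative, not Xet's actual algorithm):

```python
def cdc_chunks(data: bytes, window: int = 16, mask: int = 0x3F,
               min_size: int = 32) -> list[bytes]:
    """Cut wherever a hash of the trailing `window` bytes matches a boundary
    pattern. Because boundaries depend on content, not byte offsets, an
    insertion early in the file perturbs only nearby chunks and later chunks
    tend to realign, which is what makes chunk-level dedupe work."""
    chunks, start = [], 0
    for i in range(len(data)):
        if i + 1 < window or i + 1 - start < min_size:
            continue  # not enough trailing bytes yet, or chunk still too small
        h = 0
        for b in data[i + 1 - window : i + 1]:
            h = (h * 31 + b) & 0xFFFFFFFF  # cheap polynomial hash of the window
        if h & mask == 0:
            chunks.append(data[start : i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

payload = bytes(range(256)) * 8          # 2 KiB of varied content
chunks = cdc_chunks(payload)
assert b"".join(chunks) == payload       # lossless partition of the input
```

Real systems use a rolling hash so each step is O(1); the quadratic loop here keeps the sketch short.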
Identify User Journeys at Pinterest (medium.com/pinterest-engineering). Explore dynamic extraction, hierarchical clustering, journey naming, ranking with diversification, stage prediction, and LLM-based evaluation for Pinterest's user journeys
🔭 Emerging Methods & Evaluation
AI Agents: The case for Eval Driven Development (sdarchitect.blog). Makes the case for Eval-Driven Development (EDD) for AI agents: evals, the MCP protocol, A2A, drift, bias, compliance, observability, and continuous testing, with quotes from Andrew Ng and Hamel Husain
Why Should We Bother with Quantum Computing in ML? (towardsdatascience.com). Quantum computing in ML: gate-model and annealing approaches, quantum kernels, QUBOs, hybrid quantum-classical workflows, and practical cybersecurity implications
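The QUBO formulation mentioned here is just "minimize x^T Q x over binary x" — the objective annealers sample from. A minimal classical sketch with an illustrative 4-variable instance (values invented for the example):

```python
import itertools
import numpy as np

# Toy QUBO: diagonal entries reward picking an item, off-diagonal entries
# penalize conflicting pairs (items 0&1, items 1&2).
Q = np.array([
    [-1.0,  2.0,  0.0,  0.0],
    [ 0.0, -1.0,  2.0,  0.0],
    [ 0.0,  0.0, -1.0,  0.0],
    [ 0.0,  0.0,  0.0, -1.0],
])

def qubo_energy(x: np.ndarray) -> float:
    return float(x @ Q @ x)

# Brute force over all 2^4 binary assignments; an annealer instead samples
# low-energy states of this same objective.
best = min((np.array(bits) for bits in itertools.product([0, 1], repeat=4)),
           key=qubo_energy)
# best picks items {0, 2, 3}: all rewards, no conflict penalties.
```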
From fields to surface-specific points, geomorphons, and networks (spatialists.ch). Transformations from fields to surface-specific points, geomorphons, and surface networks using R for DEM analysis
🔬 Applied Models & Representations
Draw high dimensional tensors as a matrix of matrices (blog.ezyang.com). Shows how to visualize high-dimensional tensors as a matrix of matrices across 0D–5D, using PyTorch's view and split to reveal axes
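The core trick, sketched here in numpy (reshape/transpose playing the role of torch's view): a 4D tensor of shape (A, B, C, D) is an A×B outer grid whose cells are C×D inner matrices.

```python
import numpy as np

t = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)   # 4D example tensor

# Flatten the grid into one big (A*C) x (B*D) matrix by interleaving axes:
# outer-row, inner-row, outer-col, inner-col.
big = t.transpose(0, 2, 1, 3).reshape(2 * 4, 3 * 5)

# Cell (i, j) of the printed grid is exactly the (i, j) inner matrix.
assert np.array_equal(big[0:4, 0:5], t[0, 0])
assert np.array_equal(big[4:8, 5:10], t[1, 1])
```

Printing `big` shows all five axes of information on a 2D page, which is the visualization the post builds up.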
From Structure to Function: Leveraging AlphaFold's Evoformer Embeddings for Downstream AI (rewire.it). AlphaFold's Evoformer embeddings enable downstream AI tasks: stability, binding site prediction, variant effect, and drug design using embeddings beyond structure
Multimodal Conditional 3D Face Geometry Generation (studios.disneyresearch.com). Diffusion-based multimodal 3D face geometry generation with cross-attention adapters for sketches, photos, edges, FLAME parameters, landmarks, and text prompts
Meta’s new free transformer (kiledjian.com). Meta's Free Transformer introduces a latent variable layer enabling working memory for improved planning in generation
Visual Features Across Modalities: SVG and ASCII Art Reveal Cross-Modal Understanding (simonwillison.net). SVG and ASCII art reveal cross-modal understanding of features across eyes, faces, and concepts like dogs, cats, unicorns, and bikes
Technical Demonstration (lyndonhill.com). Demonstrations of motion estimation, optical flow, SfM/SLAM, and advanced 3D representations for Spatial AI
the bug that taught me more about PyTorch than years of using it (elanapearl.github.io). Plateauing loss reveals a PyTorch MPS non-contiguous kernel bug in addcmul_ and addcdiv_ with encoder weights on Apple Silicon
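The failure mode hinges on contiguity: a transposed weight is a strided view sharing storage with the original, and a kernel that ignores strides silently reads the wrong elements. The same layout distinction, translated to numpy, with the standard defensive fix:

```python
import numpy as np

w = np.ones((4, 8), dtype=np.float32)
wt = w.T                                 # view: no copy, strided memory layout
assert w.flags["C_CONTIGUOUS"]
assert not wt.flags["C_CONTIGUOUS"]      # rows are no longer adjacent in memory

# Defensive fix: materialize a contiguous copy before handing the buffer
# to a layout-sensitive kernel (torch's equivalent is .contiguous()).
wt_safe = np.ascontiguousarray(wt)
assert wt_safe.flags["C_CONTIGUOUS"]
```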
Building Gemma3 from scratch in Rust (lucas-montes.com). Rust-based recreation of Gemma3 components: Linear, TransformerBlock, GQA, RMSNorm, RoPE, masking, and Safetensors loading
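One of the listed building blocks, RMSNorm, is small enough to state exactly; a reference sketch in numpy (the Rust post implements the same arithmetic; variants such as Gemma's add 1 to the learned weight):

```python
import numpy as np

def rms_norm(x: np.ndarray, weight: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Normalize by the root-mean-square of the last axis instead of
    mean/variance as in LayerNorm, then apply a learned per-channel scale."""
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * weight
```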
lecture nine (aarnphm.xyz). Lecture nine covers PyTorch and JAX, including NumPy+Autograd, JAX jit, memory handling, static_argnums, and normalization techniques
🏟️ Domain ML & Validation
Barrels Are All You Need (runningonnumbers.com). Explores barrels, xwOBA, and ML models (HistGradientBoostingClassifier) to predict barrels from batting and pitching features on 2023–2025 MLB data
Denoising Images of Cats and Dogs with Autoencoders (mayberay.bearblog.dev). Denoising cat and dog images with autoencoders, RED-Net with symmetric skip connections, MNIST and Caltech-256 notes, and latent-space concepts
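RED-Net itself is beyond a digest snippet, but the denoising-autoencoder idea has a minimal linear stand-in: a linear autoencoder with tied weights is equivalent to PCA, so encoding to a small latent space and decoding discards most of the noise. A sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "images": 200 samples lying near a 5-dim subspace of 64 dims.
latent = rng.normal(size=(200, 5))
basis = rng.normal(size=(5, 64))
clean = latent @ basis
noisy = clean + rng.normal(scale=0.3, size=clean.shape)

# "Encoder/decoder": project onto the top-5 principal directions.
mean = noisy.mean(0)
u, s, vt = np.linalg.svd(noisy - mean, full_matrices=False)
codes = (noisy - mean) @ vt[:5].T        # encode: 64 -> 5
denoised = codes @ vt[:5] + mean         # decode: 5 -> 64

# The low-dim bottleneck removes most of the off-subspace noise.
assert np.mean((denoised - clean) ** 2) < np.mean((noisy - clean) ** 2)
```

A nonlinear autoencoder generalizes this: the bottleneck plays the same role, but the encoder/decoder are learned networks.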
Validating Machine Learning Models in Fintech (barnesanalytics.com). Practical ML validation for fintech: data integrity, reproducibility, performance, explainability, fairness, security, and independent validation with bank governance
Making medical images make sense (news.wm.edu). GuanNan Wang and collaborators assess AI-generated medical images using FDA, DDPMs, and spherical projection to ensure clinical fidelity
🛠️ ML Systems & Quantization
Half-Quadratic Quantization of large machine learning models (dropbox.tech). HQQ quantization speeds up large model quantization without calibration data, outperforming GPTQ and AWQ on Llama-2 models
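HQQ's contribution is solving for quantization parameters without calibration data; the setting it operates in is ordinary uniform affine quantization. A minimal round-trip (per-tensor scale/zero-point here; real HQQ works on weight groups and uses a half-quadratic solver):

```python
import numpy as np

def quantize(w: np.ndarray, bits: int = 4):
    """Map floats onto 2^bits integer levels spanning [min, max]."""
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / (2 ** bits - 1)
    q = np.clip(np.round((w - lo) / scale), 0, 2 ** bits - 1).astype(np.uint8)
    return q, scale, lo

def dequantize(q: np.ndarray, scale: float, zero: float) -> np.ndarray:
    return q.astype(np.float32) * scale + zero

rng = np.random.default_rng(0)
w = rng.normal(size=(64,)).astype(np.float32)
q, s, z = quantize(w)
err = np.abs(dequantize(q, s, z) - w).max()   # bounded by half a step
```

Methods like HQQ, GPTQ, and AWQ differ in how they choose (and correct for) these scale/zero parameters, not in this basic arithmetic.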
Accelerating Hybrid Inference in SGLang with KTransformers CPU Kernels (lmsys.org). KTransformers CPU kernels, AMX/AVX-512, NUMA-aware tensor parallelism, CUDA Graphs, Expert Deferral, SGLang integration for MoE hybrid inference
duckdb-mlpack 0.0.2: mlpack is now a duckdb community extension (dirk.eddelbuettel.com). duckdb-mlpack 0.0.2 extends mlpack as a duckdb community extension, enabling per-user per-version loading and binary distribution across Linux arm64/amd64 (with macOS, Windows, WASM planned) and showcasing adaBoost and regularized linear regression models
MLflow System Tables: Analyze Data Across All Your Experiments (databricks.com). Explore MLflow metadata across workspaces with system tables in Unity Catalog for scalable querying and dashboards
Accelerate large-scale AI training with Amazon SageMaker HyperPod training operator (aws.amazon.com). Deploy and manage large-scale AI training with SageMaker HyperPod, EKS add-on, fault resilience, log monitoring, and FSDP-based multi-node PyTorch
⚡ Performance & Infrastructure
GMKTec Evo-X2 Ryzen AI Max 395+ Benchmarks (nishtahir.com). Benchmarks of GMKTec Evo-X2 Ryzen AI Max 395+ on Ubuntu 24.04, ROCm setup, AMD driver tweaks, HIP builds, and ggml/gguf models
Train an LLM on an NVIDIA Blackwell Desktop with Unsloth—and Scale It (developer.nvidia.com). Democratizing LLM fine-tuning on NVIDIA Blackwell with Unsloth, from local 20B/40B experiments to DGX Cloud scalability
NVIDIA DGX Spark vs Mac Studio vs RTX-4080: Ollama Performance Comparison (glukhov.org). Ollama performance across NVIDIA DGX Spark, Mac Studio, and RTX 4080 for GPT-OSS 120b (65GB MoE model) with CPU/GPU offloading
Talking about HP’s ZGX Nano on the “Intelligent Machines” podcast (globalnerdy.com). HP’s ZGX Nano, built around NVIDIA GB10 Grace Blackwell, scales to 400B parameters with ConnectX-7, plus demos of MCP servers like Too Many Cats

📐 Mathematical Foundations
Spherical Ensemble (djalil.chafai.net). Spherical Ensemble: Coulomb gas on S^2, determinantal structure, Möbius invariance, Kostlan observation, and spectral radius analysis
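For reference, the model behind the post as usually stated: the eigenvalues of $A^{-1}B$ for independent complex Ginibre matrices $A,B$, whose joint density on $\mathbb{C}^n$ is

```latex
% Spherical ensemble: eigenvalues of A^{-1}B, A and B iid complex Ginibre.
% Stereographic projection of the z_k onto S^2 yields the determinantal
% Coulomb gas with the Mobius invariance discussed in the post.
\[
  p(z_1,\dots,z_n) \;\propto\;
  \prod_{i<j} |z_i - z_j|^2 \,
  \prod_{k=1}^{n} \frac{1}{\left(1+|z_k|^2\right)^{\,n+1}}
\]
```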
Paper Review: The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain (andlukyane.com). Biologically inspired BDH: graph-based neuron units, Hebbian learning, and Transformer-level performance with GPU efficiency
The Geometry of Grammar: An Investigation into Interpretable Dimensions in Word Embeddings (rewire.it). Explores tense vectors in Word2Vec with PCA/t-SNE visualizations; compares static embeddings to BERT’s contextual representations
Notes - Machine Learning MT23, Multivariate Gaussians (ollybritton.com). Multivariate Gaussian density with mean μ, covariance eigenstructure, and directions of maximal variance
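The density and the eigenstructure claim, spelled out:

```latex
\[
  \mathcal{N}(x;\mu,\Sigma)
  = (2\pi)^{-d/2}\,|\Sigma|^{-1/2}
    \exp\!\Big(-\tfrac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)\Big)
\]
% Eigendecomposition Sigma = U Lambda U^T: the eigenvectors u_i are the
% principal axes of the density's ellipsoidal level sets, and each
% eigenvalue lambda_i is the variance along the corresponding axis.
```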
The Future of Interpretability is Geometric (lesswrong.com). Geometric activation-space insights from Anthropic’s 'When Models Manipulate Manifolds' support unsupervised detection of complex LLM structures
Exploring the multi-dimensional refusal subspace in reasoning models (lesswrong.com). Explores multi-dimensional refusal subspace in LLMs using DIM, probes, BigBench-style dataset, SSR-inspired harm data, MINCOS, SVD, and weight ablation on Qwen3/12B-14B models
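"Weight ablation" against a refusal direction, in its simplest single-direction form (the post works with learned multi-dimensional subspaces; one unit vector shown here for illustration): remove a direction v from a weight matrix's output space so the layer can no longer write along v.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))      # stand-in for a model weight matrix
v = rng.normal(size=8)
v /= np.linalg.norm(v)           # unit "refusal direction"

# Project v out of the output space: W' = (I - v v^T) W.
W_ablated = W - np.outer(v, v @ W)

# Any output of the ablated layer has zero component along v.
x = rng.normal(size=8)
assert abs(v @ (W_ablated @ x)) < 1e-10
```

Ablating an r-dimensional subspace replaces `np.outer(v, v @ W)` with the analogous projection through an orthonormal basis V of the subspace.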
Learning from Failure to Tackle Extremely Hard Problems (blog.ml.cmu.edu). BaNEL: Bayesian Negative Evidence Learning uses a failure-only generative model to guide post-training with negative rewards and minimize reward evaluations
📚 Academic Research
Ridge Boosting is Both Robust and Efficient (arxiv:stat). Shows that a single ridge-boosting step yields semiparametric efficiency and robustness to RKHS-bounded distribution shifts. Practical takeaway: one model can be both low-variance and distributionally robust
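The "single ridge-boosting step" pattern, sketched with a plain linear ridge correction on synthetic data (the paper's version is kernel ridge in an RKHS; this is only the shape of the procedure): fit any pilot model, then add one closed-form ridge fit on its residuals.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=300)

pilot = np.full(300, y.mean())           # deliberately crude pilot predictions
resid = y - pilot

lam = 1.0                                # ridge penalty
beta = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ resid)
boosted = pilot + X @ beta               # one boosting step, closed form
```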
Direct Debiased Machine Learning via Bregman Divergence Minimization (arxiv:cs). Presents a unified, end-to-end framework for Neyman-targeted debiased ML using generalized Riesz regression and Bregman divergences. Engineers gain principled algorithms for causal/semiparametric inference with ML nuisances
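The divergence family the framework minimizes, for a convex generator $\varphi$:

```latex
\[
  D_\varphi(p, q)
  = \varphi(p) - \varphi(q) - \langle \nabla\varphi(q),\, p - q \rangle
\]
% Familiar special cases: phi(p) = ||p||^2 recovers squared error;
% phi(p) = sum_i p_i log p_i recovers the KL divergence.
```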
Centrum: Model-based Database Auto-tuning with Minimal Distributional Assumptions (arxiv:stat). Introduces Centrum: stochastic gradient boosting ensembles plus conformal inference for distribution-free point and interval estimates in DBMS auto-tuning. High engineering impact—better tuning accuracy, intervals, and faster convergence in practice
Scalable LinUCB: Low-Rank Design Matrix Updates for Recommenders with Large Action Spaces (arxiv:stat). Develops a dynamical low-rank parametrization and stable rank-1/batched updates for the inverse design matrix, reducing LinUCB costs to O(dr). Enables scalable, memory-efficient contextual bandits for large-scale recommenders
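Context for the cost being attacked: classic LinUCB maintains the inverse design matrix $A^{-1}$ for $A = \lambda I + \sum x x^\top$, updated per step via Sherman–Morrison in O(d²) rather than O(d³) re-inversion. The paper's low-rank scheme pushes below that; the rank-1 baseline it builds on:

```python
import numpy as np

def sherman_morrison_update(A_inv: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Return (A + x x^T)^{-1} given A^{-1}, in O(d^2)."""
    Ax = A_inv @ x
    return A_inv - np.outer(Ax, Ax) / (1.0 + x @ Ax)

d = 6
rng = np.random.default_rng(0)
A = np.eye(d)                    # lambda = 1 regularizer
A_inv = np.eye(d)
for _ in range(20):
    x = rng.normal(size=d)       # observed context vector
    A += np.outer(x, x)
    A_inv = sherman_morrison_update(A_inv, x)

# Incremental inverse matches direct inversion.
assert np.allclose(A_inv, np.linalg.inv(A), atol=1e-6)
```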
A Theory of the Mechanics of Information: Generalization Through Measurement of Uncertainty (Learning is Measuring) (arxiv:stat). Formulates construct-validity conditions for predictive benchmarking, making explicit the assumptions behind benchmark-driven claims. Essential reading for engineers designing evaluations and interpreting benchmark-based scientific inferences