ML Engineering newsletter


            
        September 2, 2025
    
    
   Blaze Email
  


            🔧 Company Engineering Blogs
           

             What the interns have wrought, 2025 edition
            

             (blog.janestreet.com)
            
            . Intern projects include faster JSQL evaluation, better Torch bindings via OxCaml, Memtrace memory leaks, and ref-counted shared memory with OxCaml modes
           

             Moving ahead faster with fallbacks
            

             (booking.ai)
            
            . Fallbacks in ranking service enable fast experimentation, reliability, and ML-induced innovation without outages
           

             Engineering stories behind the Medium Daily Digest Algorithm: Part 1
            

             (medium.engineering)
            
            . How handling Apple Mail Privacy Protection and adjusting filtering rules, checked with A/B testing, boosted digest quality and engagement
           

             Simplifying Large-Scale LLM Processing across Instacart with Maple
            

             (tech.instacart.com)
            
            . Maple: Instacart’s batch LLM processing service for scalable, cost-efficient, auditable prompts across catalogs, fulfillment, and search
           

             Revolutionizing warehouse automation with scientific simulation
            

             (amazon.science)
            
            . Sensor Workbench (SWB) on NVIDIA Isaac Sim enables parallel GPU-based sensor simulations, CAD-to-OpenUSD pipeline, and OpenUSD ground-truth for barcode detection in warehouses
           

            🔬 Applied ML Research & Domain Applications
           

             Building a YOLOX Plate Detector – Setup, Fine-Tuning, Metrics, Dashcam inference
            

             (poeticoding.com)
            
            . YOLOX plate detector setup, fine-tuning, COCO annotations, ONNX export, dashcam inference, and evaluation
           

             Two Papers Accepted at APSIPA 2025 in Singapore
            

             (bagustris.blogspot.com)
            
            . Two papers on dementia prediction from optimized prosodic features and longitudinal cough TB detection for APSIPA 2025 Singapore
           

             SURP Student Spotlight: Nava Wolfish
            

             (dunlap.utoronto.ca)
            
            . SURP spotlight on Nava Wolfish: ML-based stellar abundances from JWST NIRSpec and APOGEE data with contrastive learning
           

             Team Brings Lung Cancer Into Focus with 3D Imaging Innovation
            

             (cmu.edu)
            
            . NIH-funded CMU collaboration uses Magnify expansion microscopy and omni-mesoscopes to map 3D tumor microenvironments at nanoscale for lung cancer research
           

             Empowering air quality research with secure, ML-driven predictive analytics
            

             (aws.amazon.com)
            
            . Low-code SageMaker Canvas-based PM2.5 imputation using AWS AI services, Lambda, Step Functions, and RDS in Africa’s sensor networks
           

            🎯 Classification, NLP & Feature Engineering
           

             Interestingness First Classifiers
            

             (data-processing.club)
            
            . EUREKA: selecting interesting features with pairwise LLM comparisons to build rule-based classifiers on tabular data
           

             Engineering a Scalable Topic Pipeline: A BERTopic and GenAI Case Study
            

             (medium.com/gumgum-tech)
            
            . Hybrid BERTopic + Agglomerative Clustering pipeline with cuML GPU acceleration, post-processing, and GenAI for topic merging and naming
           

             Symmetry in subword segmentation
            

             (languagelog.ldc.upenn.edu)
            
            . Symmetry in subword segmentation across 32 languages; comparison of manual, probabilistic, and BPE-MR methods for minimizing text redundancy
           

             Algebraic approach reveals how to restore complex altered gene networks
            

             (phys.org)
            
            . KAIST uses Boolean networks, semi-tensor product, and Taylor approximation to identify gene control targets restoring altered stimulus–response patterns
           

            📈 Statistical Learning & Forecasting Methods
           

             external regressors in ahead::dynrmf’s interface for Machine learning forecasting
            

             (thierrymoudiki.github.io)
            
            . External regressors in ahead::dynrmf interface demonstrated with USAccDeaths, AirPassengers, fpp2 a10, fdeaths; xreg creation; runs with ridge and glmnet cv.glmnet
           

             Historical notes on semi-parametric theory and estimation
            

             (herbsusmann.com)
            
            . Historical notes on von Mises functionals, sample splitting, one-step estimation, and key references in semiparametric theory
           

             PyData Berlin 2025: Introduction to Stochastic Variational Inference with NumPyro
            

             (juanitorduz.github.io)
            
            . Intro to Stochastic Variational Inference with NumPyro: SVI concepts, Gamma toy, AutoNormal, Predictive, BNNs, Flax NNX integration, SVI with AutoGuides
           

             A One-Slide Summary of Tree-Based Classification and Regression
            

             (jamesmccaffrey.wpcomstaging.com)
            
            . One-slide refresher on tree-based classification and regression: bagging, bootstrap aggregation, boosting, random forests, weak learners, and memory aids
           

             Marginal Effect of Hyperparameter Tuning with XGBoost
            

             (towardsdatascience.com)
            
            . Bayesian hyperparameter optimization with hyperopt (TPE), SMBO, EI, and broader vs narrower XGBoost search spaces
           

            ⚡ ML Systems & GPU Optimization
           

             Draft - Efficient RL Training - Optimizing Weight Sync in slime
            

             (hebiao064.github.io)
            
            . Weight synchronization in slime: CUDA IPC, asynchronous tensor gathering, tensor bucketing, SGLang server calls, and 120s→7s optimizations for RL training with Megatron, PPO/GRPO
           

             The Parallelism Mesh Zoo
            

             (blog.ezyang.com)
            
            . Overview of device mesh concepts and parallelism strategies: DP, FSDP, HSDP, TP, SP, Ulysses, CP, PP, EP across multi-dimensional device meshes
           

             Why are CUDA kernels hard to optimize?
            

             (johndcook.com)
            
            . Investigates GPU kernel optimization challenges, memory hierarchies, tiling, block size, prefetching, caching, and autotuning across eight GPUs with PTX/SASS exploration
           

             Deploying DeepSeek on 96 H100 GPUs
            

             (lmsys.org)
            
            . Deploying DeepSeek with PD Disaggregation and Large-Scale Expert Parallelism on 96 H100 GPUs
           

            📊 Embeddings, Quantization & Information Theory
           

             GeoTessera Python library released for geospatial embeddings
            

             (anil.recoil.org)
            
            . GeoTessera Python library for accessing 128-band, 10 m² geospatial embeddings derived from Sentinel data, for use in GIS workflows
           

             quantisation basics
            

             (aarnphm.xyz)
            
            . Quantization, uniform/non-uniform, MSQE; KV cache pruning; KV quantization (KVQuant, SKVQ, KIVI, AdaKV, PyramidKV); multi-head attention, per-token KV; RoPE conflicts; DeepSeek KV compression; two-batch overlap (TBO); RDMA/NIXL; KV-aware routing; prefill/decode timing
           

             How big are our embeddings now and why?
            

             (veekaybee.github.io)
            
            . Embedding sizes grow from 768 to 4096+, OpenAI’s 1536 norm, HuggingFace standardization, MTEB benchmarks, matryoshka representations, vector database commoditization
           

             The Theoretical Limitations of Embedding-Based Retrieval
            

             (arxiv.org)
            
            . Theoretical limitations of embedding-based retrieval, kernelized similarity, retrieval error bounds, and implications for practical IR systems
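
For anyone new to the terms in the quantisation basics entry above, here is a minimal sketch of uniform quantization and its mean squared quantization error (MSQE). It is an illustration written for this digest, not code from the linked note; the function names `uniform_quantize` and `msqe` are hypothetical, and the min/max range scheme is one simple choice among several.

```python
def uniform_quantize(values, num_bits=4):
    """Map floats onto 2**num_bits evenly spaced levels, then map back.

    Hypothetical helper for illustration: the range is taken from the
    min and max of the input, which is only one possible calibration.
    """
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1                      # highest integer code
    scale = (hi - lo) / levels if hi > lo else 1.0  # width of one step
    codes = [round((v - lo) / scale) for v in values]
    dequantized = [lo + c * scale for c in codes]
    return codes, dequantized


def msqe(original, reconstructed):
    """Mean squared quantization error between two equal-length lists."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


values = [0.0, 0.1, 0.5, 1.0]
codes_2bit, deq_2bit = uniform_quantize(values, num_bits=2)
codes_8bit, deq_8bit = uniform_quantize(values, num_bits=8)
print(msqe(values, deq_2bit), msqe(values, deq_8bit))
```

More bits means smaller steps, so the 8-bit error comes out well below the 2-bit error on the same inputs.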
           

            🧠 Deep Learning Theory & Mathematical Foundations
           

             Cracking the Black Box: Six Lenses for Understanding Deep Learning
            

             (kalhansblog.blogspot.com)
            
            . Six lenses—NTK, Information Bottleneck, Mean-Field theory, Loss landscape, Geometric Deep Learning, PAC-Bayes—explain deep learning generalisation
           

             Information bottleneck method
            

             (aarnphm.xyz)
            
            . Information bottleneck principle for representing X with compressed T to maximize I(T;Y) while minimizing I(X;T), via Lagrangian β, information plane, mutual information, conditional entropy, and TBP visualization
           

             Notes: A Brief History of Intelligence (Bennett)
            

             (scyy.fi)
            
            . Notes: A Brief History of Intelligence (Bennett) traces steering, affect, TD learning, pattern recognition, neocortex function, vicarious trial and error, episodic memory, theory of mind, language, and morality across nematodes to humans
           

             The Price of Unearned Knowledge: Jung’s Warning and the Crisis of Modern Machine Learning
            

             (medium.com/intuitionmachine)
            
            . Jungian warning on unearned knowledge applied to ML: FER, UFR, compression-decompression, temporal ordering, curriculums, meta-learning, and evolvability
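
For readers who prefer symbols, the trade-off in the information bottleneck entry above is usually written as a Lagrangian over the encoder p(t|x). This is the standard textbook form rather than something copied from the linked note, so the notation there may differ.

```latex
\min_{p(t \mid x)} \; \mathcal{L}\big[p(t \mid x)\big] = I(X;T) - \beta \, I(T;Y)
```

A larger β favors keeping information about Y; a smaller β favors compressing X.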
           

            📐 Mathematical Structures & Computational Geometry
           

             Million Point Sculptures: an exploration tool written in Metal
            

             (hunsley.io)
            
            . Ynfold: a Metal-based macOS/iPadOS explorer for Million Point Sculptures, real-time MPS rendering, hashing, RNG options, and focus-based geometric invariants
           

             Linkage
            

             (11011110.github.io)
            
            . 3D and layered QR codes, developable surfaces from flat strips, AI slop in knowledge, matroid parity, cubical spheres, topological book embeddings, LATIN call, Wikipedia search critique
           

             Multilinear polynomials: survival kit
            

             (blog.lambdaclass.com)
            
            . Multilinear polynomials, hypercube interpolation, Lagrange basis, coordinates via evaluations, tensor product structure, and variable-dependence tests for products p_k(X)
           

             Five-arc fractal
            

             (11011110.github.io)
            
            . Five-arc fractal replaces arcs with five congruent sub-arcs; C1 smooth curve with no convex arcs, connects to preprint on Stabbing faces by a convex curve
           

             Deliberate play
            

             (koaning.io)
            
            . Deliberate play concept, interactive Matrix widget (wigglystuff), PCA demo, reactive notebooks, curiosity-driven exploration in linear algebra
           

             The biggest math symbol
            

             (johndcook.com)
            
            . Riemann P-symbol (Papperitz) for solutions to Riemann’s differential equation with three regular singular points a, b, c and Möbius-transformation behavior
           

            📚 Academic Research
           

             Interestingness First Classifiers
            

             (arxiv:stat)
            
            . Exposes critical security vulnerabilities in Python's pickle serialization used by ML frameworks, demonstrating bypass techniques against existing scanners. Essential reading for any Python ML engineer dealing with model serialization and supply chain security
           

             FORGE: Foundational Optimization Representations from Graph Embeddings
            

             (arxiv:cs)
            
            . Introduces a pre-trained graph autoencoder for mixed-integer programming instances, enabling transfer learning across optimization problems. Significant for ML engineers working on optimization problems and combinatorial challenges in production systems
           

             Fast and Scalable Mixed Precision Euclidean Distance Calculations Using GPU Tensor Cores
            

             (arxiv:cs)
            
            . Achieves 2.5-51× speedup in Euclidean distance calculations using GPU tensor cores with mixed precision arithmetic. Critical for ML engineers optimizing similarity searches, clustering, and nearest neighbor algorithms at scale
           

             A Mixture of Experts Gating Network for Enhanced Surrogate Modeling in External Aerodynamics
            

             (arxiv:cs)
            
            . NVIDIA researchers combine three specialized neural architectures using mixture-of-experts with entropy regularization for CFD surrogate modeling. Demonstrates practical MoE implementation patterns valuable for complex multi-domain prediction tasks
           

             MERIT: Maximum-normalized Element-wise Ratio for Language Model Large-batch Training
            

             (arxiv:cs)
            
            . Proposes new optimizer using max-norm trust ratios and element-wise scaling for stable large-batch training of neural networks. Addresses fundamental optimization challenges in distributed training with rigorous mathematical foundations
           

             Efficient Best-of-Both-Worlds Algorithms for Contextual Combinatorial Semi-Bandits
            

             (arxiv:stat)
            
            . Develops first algorithm guaranteeing optimal regret bounds in both adversarial and stochastic settings with efficient KKT-based projections. Advances mathematical foundations of online learning algorithms with practical computational improvements
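
A note on the Euclidean distance entry above: distance calculations can be phrased as matrix multiplication because of the identity ||a - b||² = ||a||² + ||b||² - 2 a·b. The sketch below shows only that identity in plain Python. It is an assumption that the paper builds on this formulation, and nothing here reproduces its mixed-precision kernels; `pairwise_sq_dists` is a hypothetical name.

```python
def pairwise_sq_dists(A, B):
    """Squared Euclidean distances between rows of A and rows of B.

    Uses ||a||^2 + ||b||^2 - 2 a.b, so the dominant cost is the
    Gram matrix A @ B^T, i.e. a matrix multiplication.
    """
    a_norms = [sum(x * x for x in a) for a in A]
    b_norms = [sum(y * y for y in b) for b in B]
    # Gram matrix of dot products: the matrix-multiply part of the work.
    gram = [[sum(x * y for x, y in zip(a, b)) for b in B] for a in A]
    # Clamp at zero because rounding can push tiny distances negative.
    return [
        [max(a_norms[i] + b_norms[j] - 2.0 * gram[i][j], 0.0) for j in range(len(B))]
        for i in range(len(A))
    ]


A = [[0.0, 0.0], [3.0, 4.0]]
B = [[0.0, 0.0], [1.0, 1.0]]
print(pairwise_sq_dists(A, B))
```

The clamp matters more at lower precision, where cancellation between the norm terms and the dot product is the main source of error.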
           

            👋 Before you go
           

            I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can.
That's why I'm launching a Patreon page! Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month. In return, you'll get:
           

             Real say in how Blaze evolves — vote on new topics, features, topic curation ideas
            

             First dibs on merch (details still cooking)
            

             That warm fuzzy feeling knowing you're supporting something that saves you time and keeps you plugged into great tech writing
            

            If you are getting value from Blaze, checking this out would mean the world. And if you can't contribute, no worries: the newsletters keep coming either way, and you can follow along on Patreon for free.
Thanks for reading and being part of this nerdy corner of the internet. All the best - Alastair.
           

              Have an idea for how Blaze could be better? Please visit the feedback form to let us know. To update your preferences, or to unsubscribe, please go to blaze.email/unsubscribe.
