The ML Engineer 21-01-2026
🔧 Company Engineering Blogs
Adapting the Facebook Reels RecSys AI Model Based on User Feedback (engineering.fb.com). Facebook Reels uses UTIS to align recommendations with true user interests via surveys, boosting niche content and engagement
Introducing OptiMind, a research model designed for optimization (huggingface.co). OptiMind: Microsoft Research transforms natural language optimization problems into solver-ready mathematical formulations on Hugging Face for open-source exploration
Next generation medical image interpretation with MedGemma 1.5 and medical speech to text with MedASR (research.google). Google Research releases MedGemma 1.5 4B for high-dimensional medical imaging and MedASR for medical speech-to-text, plus a MedGemma Impact Challenge
Customizing multiturn AI agents with reinforcement learning (amazon.science). Reinforcement learning improves multiturn agent customization using environment simulators, ground-truth rewards, and small models with AppWorld and DeepSearch Agent
Paper Announcement: A Practical Approach to Replenishment Optimization with Extended (R, s, Q) Policy and Probabilistic Models (engineering.zalando.com). Zalando's ZEOS replenishment engine uses probabilistic forecasting, extended (R, s, Q) policy, and discrete event simulation to boost GMV
🧭 Careers & Open Source
Projections for Data-Related Roles in The AI Era (cesarsotovalero.net). Data systems ownership, production monitoring, and governance reshape data roles, shifting the focus from models to end-to-end systems, with Python and PSI drift monitoring
Stepping up as probabl’s CSO to supercharge scikit-learn and its ecosystem (gael-varoquaux.info). Gaël Varoquaux steps up as Probabl’s CSO to accelerate scikit-learn and its ecosystem using Python, open source, and enterprise tooling
AI model choices 2026-01 (kau.sh). Stable AI model choices for 2026, detailing tools, languages, and strategies Kaushik Gopal uses to achieve reliable results
🧪 Production ML Reality
Why Your ML Model Works in Training But Fails in Production (towardsdatascience.com). Explores production ML failures from time leaks, default signals, and population shifts with real-world fraud and payments examples
SE Radio 703: Sahaj Garg on Low Latency AI (se-radio.net). Low latency AI approaches, systems, and tools discussed by Sahaj Garg with insights on optimization and deployment
GPUs: Enterprise AI’s New Architectural Control Point (oreilly.com). GPU-bound enterprise AI shifts from elastic compute to constrained architecture, driving cost, latency, and governance considerations
From deployment slop to production reality: How BriX bridges the gap with enterprise-grade AI infrastructure (engineering.grab.com). BriX turns prototypes into production-grade AI tools with model switches, MCP data access, and enterprise-grade security
The race toward Confidential AI inference (gjolly.fr). Surveys confidential AI inference, covering AMD SEV-SNP/Intel TDX, Apple, Google Gemini, startups Confer and Tinfoil, Canonical, Kubernetes, TLS, and attestation
🏗️ Data Engineering Practice
AI Agents, Scaling Pipelines, and the Evolution of Python (blog.prokulski.science). AI agents, pipeline scaling, and Python evolution in 2026, covering generative AI, agentic AI, API security, MCP (Model Context Protocol), SVM, anomaly detection, DuckDB, dbt-checkpoint, and Python/Geospatial tooling
Setting Up A Cluster of Tiny PCs For Parallel Computing - A Note To Myself (kenkoonwong.com). Setting up an Ubuntu cluster of tiny PCs with passwordless SSH, automated R package installs, and parallel simulations using multicore futures and TMLE in R
Apache Hudi™ at Uber: Engineering for Trillion-Record-Scale Data Lake Operations (hudi.apache.org). Uber shares engineering insights on building a trillion-record data lake using Apache Hudi, Spark, and Java, focusing on scalability and reliability
A Diary of a Data Engineer (ssp.sh). A veteran data engineer traces 50 years of evolution from BI to modern pipelines, emphasizing fundamentals, data modeling, and human impact
🧬 Semantics & Data Quality
Explainable unsupervised query tagging (emiruz.com). Explainable unsupervised query tagging using Python, pyEvidence, and OpenStreetMap gazetteers for England
Semantic Mappings Enable Automated Assembly (cthoyt.com). Semantic mappings unify heterogeneous vocabularies for knowledge graphs and lexical resources using SSSOM, JSKOS, SeMRA, and Biomappings
Spot check random samples of your data (anderspoirel.net). Spot check random samples of data using DuckDB's TABLESAMPLE reservoir syntax to uncover inconsistencies across sources
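For readers unfamiliar with the reservoir sampling named in the spot-check item above, here is a minimal Python sketch of the classic Algorithm R. It is a generic illustration only, not DuckDB's implementation or the linked post's code; the function name `reservoir_sample` and its parameters are hypothetical.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Pick k rows uniformly at random from a stream of unknown length (Algorithm R)."""
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Row i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = row
    return reservoir


# Usage: sample 5 rows from a large stream without materializing it.
sample = reservoir_sample(range(1_000_000), k=5, seed=42)
```

The appeal for spot checks is that the stream is read once and only k rows are held in memory, whatever the table size.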
🔎 Retrieval & Vector Search
VideoSummarizer: Reduced RAG for Video (Shots → Scenes → Evidence) (mostlylucid.net). VideoSummarizer uses reduced RAG for video with CLIP-based embeddings, deduplication, batching, and multi-signal scene assembly (shots → scenes → evidence) in lucidRAG
An introduction to XET, Hugging Face's storage system (part 2) (00f.net). Content-defined chunking with GearHash in XET, detailing rolling hashes, chunk boundaries, LZ4F compression, and bit/byte grouping for AI model weights
ClickHouse as a Vector Database (lorbic.com). ClickHouse can store embeddings and run vector search with HNSW indexes using SQL for semantic retrieval
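The content-defined chunking with a Gear rolling hash mentioned in the XET item above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not XET's actual code: the gear table (seeded pseudo-random values), the boundary mask, and the minimum and maximum chunk sizes are all made-up parameters, and `chunk_boundaries` is a hypothetical name.

```python
import random
from typing import List

MASK64 = (1 << 64) - 1

# Assumption: a reproducible table of 256 pseudo-random 64-bit values.
# Real systems ship their own fixed table.
_table_rng = random.Random(0)
GEAR = [_table_rng.getrandbits(64) for _ in range(256)]

# Assumption: cut when the top 13 bits of the hash are zero
# (about one candidate boundary every 8 KiB on random data).
BOUNDARY_MASK = ((1 << 13) - 1) << (64 - 13)


def chunk_boundaries(
    data: bytes,
    min_size: int = 2048,
    max_size: int = 65536,
) -> List[int]:
    """Return the end offsets of content-defined chunks in `data`."""
    boundaries: List[int] = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Gear update: shift left by one and add the table entry for this byte,
        # so bytes older than 64 positions no longer influence the hash.
        h = ((h << 1) + GEAR[byte]) & MASK64
        size = i - start + 1
        if size < min_size:
            continue  # never cut before the minimum chunk size
        if (h & BOUNDARY_MASK) == 0 or size >= max_size:
            boundaries.append(i + 1)
            start = i + 1
            h = 0
    if start < len(data):
        boundaries.append(len(data))  # trailing partial chunk
    return boundaries
```

Because a boundary depends only on the most recent bytes, an insertion early in a file tends to shift later boundaries rather than change them, which is what makes deduplicating large, mostly unchanged files practical.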
📏 Evals & Benchmarks
What LLM benchmarks get wrong about measuring model performance (t-redactyl.io). Critically examines LLM benchmarks through the lens of psychometrics, focusing on validity and measurement issues in evaluating model performance
Evaluation (inkdroid.org). A practical guide to evaluating genAI against non-genAI systems, benchmarks, and different models, with benchmarking insights and recommendations
From AI Agent Prototype to Product: Lessons from Building AWS DevOps Agent (efekarakus.com). Four mechanisms for transforming AI agent prototypes into reliable products: evals, trajectory visualization, fast feedback loops, and production sampling using AWS DevOps Agent and OpenTelemetry tooling
🚀 GPU Training Systems
Technical Deep Dive: How DigitalOcean and AMD Delivered a 2x Production Inference Performance Increase for Character.ai (digitalocean.com). How DigitalOcean and AMD boosted Character.ai’s production inference throughput using MI300X/MI325X GPUs, DP1/TP8/EP8 and DP2/TP4/EP4 optimizations
Deep Dive into Primus: High-Performance Training for Large Language Models (rocm.blogs.amd.com). Unified Primus training framework with optimized backends (Megatron-LM, TorchTitan) for dense LLMs on AMD Instinct GPUs, featuring GEMM/FlashAttention tuning and AITER-based kernels
Pipeline Parallelism in SGLang: Scaling to Million-Token Contexts and Beyond (lmsys.org). SGLang's Chunked Pipeline Parallelism, Async P2P, and Dynamic Prefill boost ultra-long context inference across multi-node GPU clusters
Applying Compute Partitioning for Workloads on MI300X GPUs (rocm.blogs.amd.com). GPU compute partitioning on MI300X boosts throughput for GROMACS MD ensembles and REINVENT4 AI workflows using CPX mode with multiple partitions
🔧 Fine-Tuning Workflows
How to Fine-Tune Vision Models for Expert AI Results (cognitivetoday.com). Fine-tuning vision models with a 7-step blueprint, addressing domain shift, data prep, hyperparameters, and advanced strategies
Transform AI development with new Amazon SageMaker AI model customization and large-scale training capabilities (aws.amazon.com). SageMaker AI enables serverless model customization, elastic and checkpointless training, and MLflow observability for frontier models
Advanced fine-tuning techniques for multi-agent orchestration: Patterns from Amazon at scale (aws.amazon.com). Advanced fine-tuning techniques for agentic AI using SFT, PPO, DPO, GRPO, DAPO, GSPO with Amazon Bedrock and SageMaker across real-world Amazon cases
🧠 Model Ideas & Recsys
Parameters Are Like Pixels (lesswrong.com). Explores non-linear scaling, data quality, and mixture concepts in ML, using an analogy between pixels and parameters to discuss model performance
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models – Deepseek strikes again (blog.quintarelli.it). Conditional memory via scalable lookup adds a new sparsity axis to LLMs, using Engram's near-O(1) lookups and MoE tradeoffs for improved reasoning and long-context retrieval
Starting from scratch: Training a 30M Topological Transformer (tuned.org.uk). Training a 30M Tauformer: topological attention, Laplacian energy, domain memory, and 60k tokens/s on H100 with 5k steps
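[Notes] Generative Recommendation: OpenOneRec Technical Report (Kuaishou, 2026) (arthurchiao.art). A walkthrough of the OpenOneRec technical report: implementation essentials of two-stage alignment, Itemic Tokens, and RecIF-Bench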
🔮 Forecasting Practice
Forecasting the Future: Time Series, Prophets, and Cross-Validation (jrogel.com). Practical forecasting with Prophet (Meta), time series cross-validation, and defensible evaluation for real-world deployments
ACX 2025 Prediction Contest Retrospective (entropicthoughts.com). Forecasting insights from ACX 2025 contest; Brier scores, questions, Vox comparison, and predictions across economics and tech events
Predicting Best Picture at the 2026 Academy Awards (markhw.com). Mark H. White II uses a probabilistic Best Picture model to forecast Oscars 2026 favorites and nomination risks
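The Brier score cited in the ACX retrospective above is the mean squared difference between forecast probabilities and binary outcomes, so lower is better and 0.25 is what a constant 50% forecast earns. A minimal sketch follows; the function name `brier_score` and the sample numbers are illustrative and not taken from the linked post.

```python
from typing import Sequence


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between probabilities in [0, 1] and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Usage with made-up numbers: a confident, correct forecaster
# versus always answering 50%.
confident = brier_score([0.9, 0.2], [1, 0])
coin_flip = brier_score([0.5, 0.5], [1, 0])
```

Here `confident` comes out lower than `coin_flip`, reflecting that the score rewards forecasts that are both calibrated and decisive.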
🛰️ Vision & Segmentation
Earth Observation on a Budget: Finding Solar Farms with a 42k-Parameter Model (toao.com). Finding solar farms with a 42k-parameter UNet using Tessera embeddings and OpenStreetMap/REPD data
Rolling your own serverless OCR in 40 lines of code (ckrapu.github.io). Serverless OCR with Modal in 40 lines using DeepSeek OCR, PyTorch, and FastAPI for batch PDF processing
Grounded SAM 2: From Open-Set Detection to Segmentation and Tracking (pyimagesearch.com). Grounded SAM 2 enables open-set detection, segmentation, and video tracking using Grounding DINO and SAM 2 with Python, Gradio, and OpenCV
Watershed Segmentation Using OpenCV (opencv.org). Learn watershed segmentation in OpenCV to separate touching objects using marker-based methods with preprocessing and Python code
🌲 Classical ML Math
Learning better decision tree splits - LLMs as Heuristics for Program Synthesis (mchav.github.io). Explores decision tree splits, program synthesis, and Haskell tooling for data science and feature generation
Science Discovery: The Advanced Matrix Factorization and Decomposition Jungle Page (nuit-blanche.blogspot.com). Explores advanced matrix factorization and decomposition topics with notes on CS, ML, MF, and references
Randomized SVD (leimao.github.io). Efficient approximation of SVD for large matrices using random projections, QR, and compact SVD, with Python/Matlab-like math exposition
Spectra smoothing based on information entropy (nirpyresearch.com). Smoothing spectra with information entropy using Savitzky-Golay parameters and delentropy criteria in Python
📚 Academic Research
Improved Algorithms for Fair Matroid Submodular Maximization (arxiv:cs). Microsoft Research and CMU propose better algorithms for fair submodular maximization under matroid constraints. Near-full fairness (1−ε) with constant approximation helps clustering, recommendation, coverage tasks
Optimising for Energy Efficiency and Performance in Machine Learning (arxiv:cs). Cambridge introduces ECOpt, a hyperparameter tuner optimizing accuracy and energy, including inference cost. It exposes Pareto frontiers and finds greener CIFAR‑10 models across common hardware
A pipeline for enabling path-specific causal fairness in observational health data (arxiv:cs). Columbia and Oxford present a pipeline for path-specific causal fairness on EHR data. It robustly estimates direct/indirect effects and tests mitigation tradeoffs across clinical tasks
Evaluating the Ability of Explanations to Disambiguate Models in a Rashomon Set (arxiv:cs). Oxford researchers propose AXE to evaluate local feature-importance explanations without ground-truth labels. On-manifold kNN surrogates reveal misleading metrics and detect fairwashing attacks in Rashomon sets
Combinatorial Optimization Augmented Machine Learning (arxiv:cs). TUM, Inria and Institut Polytechnique de Paris survey combinatorial-optimization-augmented machine learning. A unifying framework links prediction to decisions, highlighting algorithms, applications, and open research frontiers