AI news, receipts-first

Provenance Brief

Latest AI news: plain-English for humans, deep links for builders.

Plain-English gists · No jargon · Receipts + links · Benchmarks & diffs · 20s · 2m · deep
Updated 2025-12-28T19:45:48+00:00 · 0 new / 24h · Sources 13/13

Latest

Tap a headline for the gist. Open receipts when you care.

  1. AWS Machine Learning Blog · Dec 24, 2025 17:22 UTC · AI agent-driven browser automation for enterprise workflow management

    Enterprise organizations increasingly rely on web-based applications for critical business processes, yet many workflows remain manually intensive, creating operational inefficiencies and compliance risks.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  2. AWS Machine Learning Blog · Dec 24, 2025 17:20 UTC · Agentic QA automation using Amazon Bedrock AgentCore Browser and Amazon Nova Act

    Quality assurance (QA) testing has long been the backbone of software development, but traditional QA approaches haven’t kept pace with modern development cycles and complex UIs.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  3. AWS Machine Learning Blog · Dec 24, 2025 17:17 UTC · Optimizing LLM inference on Amazon SageMaker AI with BentoML’s LLM-Optimizer

    The rise of powerful large language models (LLMs) that can be consumed via API calls has made it remarkably straightforward to integrate artificial intelligence (AI) capabilities into applications.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  4. AWS Machine Learning Blog · Dec 23, 2025 22:18 UTC · Exploring the zero operator access design of Mantle

    At Amazon, our culture, built on honest and transparent discussion of our growth opportunities, enables us to focus on investing and innovating to continually raise the standard on our ability to deliver value for our…

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  5. AWS Machine Learning Blog · Dec 23, 2025 17:36 UTC · AWS AI League: Model customization and agentic showdown

    Building intelligent agents to handle complex, real-world tasks can be daunting.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  6. AWS Machine Learning Blog · Dec 23, 2025 17:32 UTC · Accelerate Enterprise AI Development using Weights & Biases and Amazon Bedrock AgentCore

    This post is co-written by Thomas Capelle and Ray Strickland from Weights & Biases (W&B).

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  7. AWS Machine Learning Blog · Dec 23, 2025 17:24 UTC · How dLocal automated compliance reviews using Amazon Quick Automate

    dLocal, Uruguay’s first unicorn, has established itself as a pioneer in cross-border payments since its founding in 2016.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  8. AWS Machine Learning Blog · Dec 23, 2025 17:11 UTC · Advancing ADHD diagnosis: How Qbtech built a mobile AI assessment Model Using Amazon SageMaker AI

    This post is cowritten with Dr.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  9. AWS Machine Learning Blog · Dec 23, 2025 17:06 UTC · Accelerating your marketing ideation with generative AI – Part 1: From idea to generation with the Amazon Nova foundation models

    Marketing teams face increasing pressure to create engaging campaigns quickly while maintaining brand consistency and creative quality.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  10. Google DeepMind Blog · Dec 23, 2025 17:01 UTC · Google's year in review: 8 areas with research breakthroughs in 2025

    Google 2025 recap: Research breakthroughs of the year

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Labs Research
  11. Google AI · Dec 23, 2025 17:00 UTC · Google's year in review: 8 areas with research breakthroughs in 2025

    This year saw new AI models, transformative products and new breakthroughs in science and robotics.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Labs Research
  12. AWS Machine Learning Blog · Dec 23, 2025 16:45 UTC · Introducing Visa Intelligent Commerce on AWS: Enabling agentic commerce with Amazon Bedrock AgentCore

    This post is cowritten with Sangeetha Bharath and Seemal Zaman from Visa.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  13. Hugging Face Blog · Dec 23, 2025 14:07 UTC · AprielGuard: A Guardrail for Safety and Adversarial Robustness in Modern LLM Systems

    Hugging Face Blog: AprielGuard: A Guardrail for Safety and Adversarial Robustness in Modern LLM Systems

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Open Source Tools
  14. AWS Machine Learning Blog · Dec 22, 2025 18:37 UTC · Move Beyond Chain-of-Thought with Chain-of-Draft on Amazon Bedrock

    As organizations scale their generative AI implementations, the critical challenge of balancing quality, cost, and latency becomes increasingly complex.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  15. AWS Machine Learning Blog · Dec 22, 2025 18:32 UTC · Deploy Mistral AI’s Voxtral on Amazon SageMaker AI

    Mistral AI’s Voxtral models combine text and audio processing capabilities in a single framework.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  16. AWS Machine Learning Blog · Dec 22, 2025 18:26 UTC · Enhance document analytics with Strands AI Agents for the GenAI IDP Accelerator

    Extracting structured information from unstructured data is a critical first step to unlocking business value.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  17. AWS Machine Learning Blog · Dec 22, 2025 18:21 UTC · Build a multimodal generative AI assistant for root cause diagnosis in predictive maintenance using Amazon Bedrock

    Predictive maintenance is a strategy that uses data from equipment sensors and advanced analytics to predict when a machine is likely to fail, ensuring maintenance can be performed proactively to prevent breakdowns.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  18. Google AI · Dec 22, 2025 17:00 UTC · 60 of our biggest AI announcements in 2025

    Look back on Google AI news in 2025 across Gemini, Search, Pixel and more products.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Labs Research
  19. OpenAI News · Dec 22, 2025 00:00 UTC · One in a million: celebrating the customers shaping AI’s future

    More than one million customers around the world now use OpenAI to empower their teams and unlock new opportunities.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Labs Product
  20. OpenAI News · Dec 22, 2025 00:00 UTC · Continuously hardening ChatGPT Atlas against prompt injection

    OpenAI is strengthening ChatGPT Atlas against prompt injection attacks using automated red teaming trained with reinforcement learning.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Labs Product
  21. AWS Machine Learning Blog · Dec 19, 2025 18:23 UTC · Introducing SOCI indexing for Amazon SageMaker Studio: Faster container startup times for AI/ML workloads

    Today, we are excited to introduce a new feature for SageMaker Studio: SOCI (Seekable Open Container Initiative) indexing.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
  22. Google AI · Dec 19, 2025 16:00 UTC · 40 of our most helpful AI tips from 2025

    Learn more about the AI tips and tools Google shared in 2025.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Labs Research
  23. Google AI · Dec 19, 2025 14:00 UTC · 5 ways AI agents will transform the way we work in 2026

    Today, Google Cloud dropped its 2026 AI Agent Trends Report.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Labs Research
  24. AWS Machine Learning Blog · Dec 18, 2025 17:26 UTC · Build and deploy scalable AI agents with NVIDIA NeMo, Amazon Bedrock AgentCore, and Strands Agents

    This post is co-written with Ranjit Rajan, Abdullahi Olaoye, and Abhishek Sawarkar from NVIDIA.

    Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

    Cloud Product
Research papers (35)
  1. arXiv cs.AI · Dec 24, 2025 18:59 UTC · Optimizing Decoding Paths in Masked Diffusion Models by Quantifying Uncertainty

    Masked Diffusion Models (MDMs) offer flexible, non-autoregressive generation, but this freedom introduces a challenge: final output quality is highly sensitive to the decoding order.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  2. arXiv cs.LG · Dec 24, 2025 18:59 UTC · Autonomous Uncertainty Quantification for Computational Point-of-care Sensors

    Computational point-of-care (POC) sensors enable rapid, low-cost, and accessible diagnostics in emergency, remote and resource-limited areas that lack access to centralized medical facilities.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  3. arXiv cs.AI · Dec 24, 2025 18:59 UTC · C2LLM Technical Report: A New Frontier in Code Retrieval via Adaptive Cross-Attention Pooling

    We present C2LLM - Contrastive Code Large Language Models, a family of code embedding models in both 0.5B and 7B sizes.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  4. arXiv stat.ML · Dec 24, 2025 18:54 UTC · Measuring all the noises of LLM Evals

    Separating signal from noise is central to experimental science.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  5. arXiv cs.LG · Dec 24, 2025 18:46 UTC · Parallel Token Prediction for Language Models

    We propose Parallel Token Prediction (PTP), a universal framework for parallel sequence generation in language models.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  6. arXiv cs.LG · Dec 24, 2025 18:37 UTC · Variationally correct operator learning: Reduced basis neural operator with a posteriori error estimation

    Minimizing PDE-residual losses is a common strategy to promote physical consistency in neural operators.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  7. arXiv cs.AI · Dec 24, 2025 18:24 UTC · Scaling Laws for Economic Productivity: Experimental Evidence in LLM-Assisted Consulting, Data Analyst, and Management Tasks

    This paper derives "Scaling Laws for Economic Impacts": empirical relationships between the training compute of Large Language Models (LLMs) and professional productivity.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  8. arXiv stat.ML · Dec 24, 2025 18:21 UTC · Does the Data Processing Inequality Reflect Practice? On the Utility of Low-Level Tasks

    The data processing inequality is an information-theoretic principle stating that the information content of a signal cannot be increased by processing the observations.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  9. arXiv cs.LG · Dec 24, 2025 18:14 UTC · Learning to Solve PDEs on Neural Shape Representations

    Solving partial differential equations (PDEs) on shapes underpins many shape analysis and engineering tasks; yet, prevailing PDE solvers operate on polygonal/triangle meshes while modern 3D assets increasingly live as…

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  10. arXiv cs.LG · Dec 24, 2025 17:39 UTC · Transcriptome-Conditioned Personalized De Novo Drug Generation for AML Using Metaheuristic Assembly and Target-Driven Filtering

    Acute Myeloid Leukemia (AML) remains a clinical challenge due to its extreme molecular heterogeneity and high relapse rates.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  11. arXiv cs.AI · Dec 24, 2025 17:10 UTC · Model Merging via Multi-Teacher Knowledge Distillation

    Model merging has emerged as a lightweight alternative to joint multi-task learning (MTL), yet the generalization properties of merged models remain largely unexplored.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  12. arXiv cs.AI · Dec 24, 2025 16:59 UTC · SMART SLM: Structured Memory and Reasoning Transformer, A Small Language Model for Accurate Document Assistance

    Users of Engineering Manuals (EMs) find them difficult to read because they are long and densely formatted, combining written documents, step-by-step procedures, and standard parameter lists for engineering…

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  13. arXiv cs.AI · Dec 24, 2025 15:43 UTC · Learning Factors in AI-Augmented Education: A Comparative Study of Middle and High School Students

    The increasing integration of AI tools in education has led prior research to explore their impact on learning processes.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  14. arXiv cs.AI · Dec 24, 2025 15:36 UTC · LookPlanGraph: Embodied Instruction Following Method with VLM Graph Augmentation

    Methods that use Large Language Models (LLM) as planners for embodied instruction following tasks have become widespread.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  15. arXiv cs.AI · Dec 24, 2025 15:35 UTC · Improving the Convergence Rate of Ray Search Optimization for Query-Efficient Hard-Label Attacks

    In hard-label black-box adversarial attacks, where only the top-1 predicted label is accessible, the prohibitive query complexity poses a major obstacle to practical deployment.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  16. arXiv cs.LG · Dec 24, 2025 15:29 UTC · Assessing the Software Security Comprehension of Large Language Models

    Large language models (LLMs) are increasingly used in software development, but their level of software security expertise remains unclear.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  17. arXiv cs.AI · Dec 24, 2025 15:25 UTC · Casting a SPELL: Sentence Pairing Exploration for LLM Limitation-breaking

    Large language models (LLMs) have revolutionized software development through AI-assisted coding tools, enabling developers with limited programming expertise to create sophisticated applications.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  18. arXiv cs.LG · Dec 24, 2025 15:15 UTC · MiST: Understanding the Role of Mid-Stage Scientific Training in Developing Chemical Reasoning Models

    Large Language Models can develop reasoning capabilities through online fine-tuning with rule-based rewards.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  19. arXiv cs.AI · Dec 24, 2025 15:07 UTC · PhononBench: A Large-Scale Phonon-Based Benchmark for Dynamical Stability in Crystal Generation

    In this work, we introduce PhononBench, the first large-scale benchmark for dynamical stability in AI-generated crystals.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  20. arXiv cs.AI · Dec 24, 2025 15:02 UTC · Leveraging Lightweight Entity Extraction for Scalable Event-Based Image Retrieval

    Retrieving images from natural language descriptions is a core task at the intersection of computer vision and natural language processing, with wide-ranging applications in search engines, media archiving, and…

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  21. arXiv cs.AI · Dec 24, 2025 15:01 UTC · RoboSafe: Safeguarding Embodied Agents via Executable Safety Logic

    Embodied agents powered by vision-language models (VLMs) are increasingly capable of executing complex real-world tasks, yet they remain vulnerable to hazardous instructions that may trigger unsafe behaviors.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  22. arXiv stat.ML · Dec 24, 2025 14:51 UTC · Causal-driven attribution (CDA): Estimating channel influence without user-level data

    Attribution modelling lies at the heart of marketing effectiveness, yet most existing approaches depend on user-level path data, which are increasingly inaccessible due to privacy regulations and platform restrictions.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  23. arXiv cs.AI · Dec 24, 2025 14:33 UTC · SpidR-Adapt: A Universal Speech Representation Model for Few-Shot Adaptation

    Human infants, with only a few hundred hours of speech exposure, acquire basic units of new languages, highlighting a striking efficiency gap compared to the data-hungry self-supervised speech models.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  24. arXiv cs.AI · Dec 24, 2025 14:28 UTC · Schrödinger's Navigator: Imagining an Ensemble of Futures for Zero-Shot Object Navigation

    Zero-shot object navigation (ZSON) requires a robot to locate a target object in a previously unseen environment without relying on pre-built maps or task-specific training.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  25. arXiv stat.ML · Dec 24, 2025 11:59 UTC · Active inference and artificial reasoning

    This technical note considers the sampling of outcomes that provide the greatest amount of information about the structure of underlying world models.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  26. arXiv stat.ML · Dec 24, 2025 11:18 UTC · Statistical and computational challenges in ranking

    We consider the problem of ranking $n$ experts according to their abilities, based on the correctness of their answers to $d$ questions.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  27. arXiv stat.ML · Dec 24, 2025 09:39 UTC · Understanding Scaling Laws in Deep Neural Networks via Feature Learning Dynamics

    The empirical success of deep learning is often attributed to scaling laws that predict consistent gains as model, data, and compute grow; however, large models can exhibit training instability and diminishing…

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  28. arXiv stat.ML · Dec 24, 2025 07:34 UTC · Enhancing diffusion models with Gaussianization preprocessing

    Diffusion models are a class of generative models that have demonstrated remarkable success in tasks such as image generation.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  29. arXiv stat.ML · Dec 24, 2025 07:10 UTC · Learning from Neighbors with PHIBP: Predicting Infectious Disease Dynamics in Data-Sparse Environments

    Modeling sparse count data, which arise across numerous scientific fields, presents significant statistical challenges.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  30. arXiv stat.ML · Dec 24, 2025 03:39 UTC · Invariant Feature Extraction Through Conditional Independence and the Optimal Transport Barycenter Problem: the Gaussian case

    A methodology is developed to extract $d$ invariant features $W=f(X)$ that predict a response variable $Y$ without being confounded by variables $Z$ that may influence both $X$ and $Y$.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  31. arXiv stat.ML · Dec 23, 2025 22:20 UTC · Weighted MCC: A Robust Measure of Multiclass Classifier Performance for Observations with Individual Weights

    Several performance measures are used to evaluate binary and multiclass classification tasks.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  32. arXiv stat.ML · Dec 23, 2025 20:49 UTC · Subgroup Discovery with the Cox Model

    We study the problem of subgroup discovery for survival analysis, where the goal is to find an interpretable subset of the data on which a Cox model is highly accurate.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  33. arXiv cs.LG · Dec 23, 2025 14:40 UTC · GeoTransolver: Learning Physics on Irregular Domains Using Multi-scale Geometry Aware Physics Attention Transformer

    We present GeoTransolver, a Multiscale Geometry-Aware Physics Attention Transformer for CAE that replaces standard attention with GALE, coupling physics-aware self-attention on learned state slices with…

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  34. arXiv cs.AI · Dec 19, 2025 18:11 UTC · Interpretable Plant Leaf Disease Detection Using Attention-Enhanced CNN

    Plant diseases pose a significant threat to global food security, necessitating accurate and interpretable disease detection methods.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research
  35. arXiv cs.AI · Dec 18, 2025 21:29 UTC · When F1 Fails: Granularity-Aware Evaluation for Dialogue Topic Segmentation

    Dialogue topic segmentation supports summarization, retrieval, memory management, and conversational continuity.

    Builder: scan the abstract + experiments; look for code, datasets, and evals.

    Research