AI news, receipts-first

Provenance Brief

Latest AI news: plain-English for humans, deep links for builders.


Plain-English gists · No jargon · Receipts + links · Benchmarks & diffs · 20s / 2m / deep

Updated 2025-12-28T19:45:48+00:00 · 0 new / 24h · Sources 13/13

Latest

Tap a headline for the gist. Open receipts when you care.



AWS Machine Learning Blog · Dec 24, 2025 17:26 UTC · Programmatically creating an IDP solution with Amazon Bedrock Data Automation

Intelligent Document Processing (IDP) transforms how organizations handle unstructured document data, enabling automatic extraction of valuable information from invoices, contracts, and reports.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

AWS Machine Learning Blog · Dec 24, 2025 17:22 UTC · AI agent-driven browser automation for enterprise workflow management

Enterprise organizations increasingly rely on web-based applications for critical business processes, yet many workflows remain manually intensive, creating operational inefficiencies and compliance risks.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

AWS Machine Learning Blog · Dec 24, 2025 17:20 UTC · Agentic QA automation using Amazon Bedrock AgentCore Browser and Amazon Nova Act

Quality assurance (QA) testing has long been the backbone of software development, but traditional QA approaches haven’t kept pace with modern development cycles and complex UIs.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

AWS Machine Learning Blog · Dec 24, 2025 17:17 UTC · Optimizing LLM inference on Amazon SageMaker AI with BentoML’s LLM-Optimizer

The rise of powerful large language models (LLMs) that can be consumed via API calls has made it remarkably straightforward to integrate artificial intelligence (AI) capabilities into applications.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

AWS Machine Learning Blog · Dec 23, 2025 22:18 UTC · Exploring the zero operator access design of Mantle

At Amazon, our culture, built on honest and transparent discussion of our growth opportunities, enables us to focus on investing and innovating to continually raise the standard on our ability to deliver value for our…

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

AWS Machine Learning Blog · Dec 23, 2025 17:36 UTC · AWS AI League: Model customization and agentic showdown

Building intelligent agents to handle complex, real-world tasks can be daunting.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

AWS Machine Learning Blog · Dec 23, 2025 17:32 UTC · Accelerate Enterprise AI Development using Weights & Biases and Amazon Bedrock AgentCore

This post is co-written by Thomas Capelle and Ray Strickland from Weights & Biases (W&B).

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

AWS Machine Learning Blog · Dec 23, 2025 17:24 UTC · How dLocal automated compliance reviews using Amazon Quick Automate

dLocal, Uruguay’s first unicorn, has established itself as a pioneer in cross-border payments since its founding in 2016.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

AWS Machine Learning Blog · Dec 23, 2025 17:11 UTC · Advancing ADHD diagnosis: How Qbtech built a mobile AI assessment Model Using Amazon SageMaker AI

This post is cowritten with Dr.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

AWS Machine Learning Blog · Dec 23, 2025 17:06 UTC · Accelerating your marketing ideation with generative AI – Part 1: From idea to generation with the Amazon Nova foundation models

Marketing teams face increasing pressure to create engaging campaigns quickly while maintaining brand consistency and creative quality.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

Google DeepMind Blog · Dec 23, 2025 17:01 UTC · Google's year in review: 8 areas with research breakthroughs in 2025

Google 2025 recap: Research breakthroughs of the year

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Labs Research

Google AI · Dec 23, 2025 17:00 UTC · Google's year in review: 8 areas with research breakthroughs in 2025

This year saw new AI models, transformative products and new breakthroughs in science and robotics.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Labs Research

AWS Machine Learning Blog · Dec 23, 2025 16:45 UTC · Introducing Visa Intelligent Commerce on AWS: Enabling agentic commerce with Amazon Bedrock AgentCore

This post is cowritten with Sangeetha Bharath and Seemal Zaman from Visa.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

Hugging Face Blog · Dec 23, 2025 14:07 UTC · AprielGuard: A Guardrail for Safety and Adversarial Robustness in Modern LLM Systems

Hugging Face Blog: AprielGuard: A Guardrail for Safety and Adversarial Robustness in Modern LLM Systems

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Open Source Tools

AWS Machine Learning Blog · Dec 22, 2025 18:37 UTC · Move Beyond Chain-of-Thought with Chain-of-Draft on Amazon Bedrock

As organizations scale their generative AI implementations, the critical challenge of balancing quality, cost, and latency becomes increasingly complex.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

AWS Machine Learning Blog · Dec 22, 2025 18:32 UTC · Deploy Mistral AI’s Voxtral on Amazon SageMaker AI

Mistral AI’s Voxtral models combine text and audio processing capabilities in a single framework.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

AWS Machine Learning Blog · Dec 22, 2025 18:26 UTC · Enhance document analytics with Strands AI Agents for the GenAI IDP Accelerator

Extracting structured information from unstructured data is a critical first step to unlocking business value.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

AWS Machine Learning Blog · Dec 22, 2025 18:21 UTC · Build a multimodal generative AI assistant for root cause diagnosis in predictive maintenance using Amazon Bedrock

Predictive maintenance is a strategy that uses data from equipment sensors and advanced analytics to predict when a machine is likely to fail, ensuring maintenance can be performed proactively to prevent breakdowns.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

Google AI · Dec 22, 2025 17:00 UTC · 60 of our biggest AI announcements in 2025

Look back on Google AI news in 2025 across Gemini, Search, Pixel and more products.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Labs Research

OpenAI News · Dec 22, 2025 00:00 UTC · One in a million: celebrating the customers shaping AI’s future

More than one million customers around the world now use OpenAI to empower their teams and unlock new opportunities.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Labs Product

OpenAI News · Dec 22, 2025 00:00 UTC · Continuously hardening ChatGPT Atlas against prompt injection

OpenAI is strengthening ChatGPT Atlas against prompt injection attacks using automated red teaming trained with reinforcement learning.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Labs Product

AWS Machine Learning Blog · Dec 19, 2025 18:23 UTC · Introducing SOCI indexing for Amazon SageMaker Studio: Faster container startup times for AI/ML workloads

Today, we are excited to introduce a new feature for SageMaker Studio: SOCI (Seekable Open Container Initiative) indexing.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product

Google AI · Dec 19, 2025 16:00 UTC · 40 of our most helpful AI tips from 2025

Learn more about the AI tips and tools Google shared in 2025.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Labs Research

Google AI · Dec 19, 2025 14:00 UTC · 5 ways AI agents will transform the way we work in 2026

Today, Google Cloud dropped its 2026 AI Agent Trends Report.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Labs Research

AWS Machine Learning Blog · Dec 18, 2025 17:26 UTC · Build and deploy scalable AI agents with NVIDIA NeMo, Amazon Bedrock AgentCore, and Strands Agents

This post is co-written with Ranjit Rajan, Abdullahi Olaoye, and Abhishek Sawarkar from NVIDIA.

Builder: read docs/changelog; watch for breaking changes, quotas, and pricing.

Cloud Product


Research papers (35)

arXiv cs.AI · Dec 24, 2025 18:59 UTC · Optimizing Decoding Paths in Masked Diffusion Models by Quantifying Uncertainty

Masked Diffusion Models (MDMs) offer flexible, non-autoregressive generation, but this freedom introduces a challenge: final output quality is highly sensitive to the decoding order.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.LG · Dec 24, 2025 18:59 UTC · Autonomous Uncertainty Quantification for Computational Point-of-care Sensors

Computational point-of-care (POC) sensors enable rapid, low-cost, and accessible diagnostics in emergency, remote and resource-limited areas that lack access to centralized medical facilities.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 24, 2025 18:59 UTC · C2LLM Technical Report: A New Frontier in Code Retrieval via Adaptive Cross-Attention Pooling

We present C2LLM - Contrastive Code Large Language Models, a family of code embedding models in both 0.5B and 7B sizes.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv stat.ML · Dec 24, 2025 18:54 UTC · Measuring all the noises of LLM Evals

Separating signal from noise is central to experimental science.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.LG · Dec 24, 2025 18:46 UTC · Parallel Token Prediction for Language Models

We propose Parallel Token Prediction (PTP), a universal framework for parallel sequence generation in language models.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.LG · Dec 24, 2025 18:37 UTC · Variationally correct operator learning: Reduced basis neural operator with a posteriori error estimation

Minimizing PDE-residual losses is a common strategy to promote physical consistency in neural operators.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 24, 2025 18:24 UTC · Scaling Laws for Economic Productivity: Experimental Evidence in LLM-Assisted Consulting, Data Analyst, and Management Tasks

This paper derives 'Scaling Laws for Economic Impacts': empirical relationships between the training compute of Large Language Models (LLMs) and professional productivity.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv stat.ML · Dec 24, 2025 18:21 UTC · Does the Data Processing Inequality Reflect Practice? On the Utility of Low-Level Tasks

The data processing inequality is an information-theoretic principle stating that the information content of a signal cannot be increased by processing the observations.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.LG · Dec 24, 2025 18:14 UTC · Learning to Solve PDEs on Neural Shape Representations

Solving partial differential equations (PDEs) on shapes underpins many shape analysis and engineering tasks; yet, prevailing PDE solvers operate on polygonal/triangle meshes while modern 3D assets increasingly live as…

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.LG · Dec 24, 2025 17:39 UTC · Transcriptome-Conditioned Personalized De Novo Drug Generation for AML Using Metaheuristic Assembly and Target-Driven Filtering

Acute Myeloid Leukemia (AML) remains a clinical challenge due to its extreme molecular heterogeneity and high relapse rates.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 24, 2025 17:10 UTC · Model Merging via Multi-Teacher Knowledge Distillation

Model merging has emerged as a lightweight alternative to joint multi-task learning (MTL), yet the generalization properties of merged models remain largely unexplored.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 24, 2025 16:59 UTC · SMART SLM: Structured Memory and Reasoning Transformer, A Small Language Model for Accurate Document Assistance

The user of Engineering Manuals (EM) finds it difficult to read EMs because they are long, have a dense format which includes written documents, step by step procedures, and standard parameter lists for engineering…

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 24, 2025 15:43 UTC · Learning Factors in AI-Augmented Education: A Comparative Study of Middle and High School Students

The increasing integration of AI tools in education has led prior research to explore their impact on learning processes.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 24, 2025 15:36 UTC · LookPlanGraph: Embodied Instruction Following Method with VLM Graph Augmentation

Methods that use Large Language Models (LLM) as planners for embodied instruction following tasks have become widespread.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 24, 2025 15:35 UTC · Improving the Convergence Rate of Ray Search Optimization for Query-Efficient Hard-Label Attacks

In hard-label black-box adversarial attacks, where only the top-1 predicted label is accessible, the prohibitive query complexity poses a major obstacle to practical deployment.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.LG · Dec 24, 2025 15:29 UTC · Assessing the Software Security Comprehension of Large Language Models

Large language models (LLMs) are increasingly used in software development, but their level of software security expertise remains unclear.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 24, 2025 15:25 UTC · Casting a SPELL: Sentence Pairing Exploration for LLM Limitation-breaking

Large language models (LLMs) have revolutionized software development through AI-assisted coding tools, enabling developers with limited programming expertise to create sophisticated applications.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.LG · Dec 24, 2025 15:15 UTC · MiST: Understanding the Role of Mid-Stage Scientific Training in Developing Chemical Reasoning Models

Large Language Models can develop reasoning capabilities through online fine-tuning with rule-based rewards.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 24, 2025 15:07 UTC · PhononBench: A Large-Scale Phonon-Based Benchmark for Dynamical Stability in Crystal Generation

In this work, we introduce PhononBench, the first large-scale benchmark for dynamical stability in AI-generated crystals.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 24, 2025 15:02 UTC · Leveraging Lightweight Entity Extraction for Scalable Event-Based Image Retrieval

Retrieving images from natural language descriptions is a core task at the intersection of computer vision and natural language processing, with wide-ranging applications in search engines, media archiving, and…

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 24, 2025 15:01 UTC · RoboSafe: Safeguarding Embodied Agents via Executable Safety Logic

Embodied agents powered by vision-language models (VLMs) are increasingly capable of executing complex real-world tasks, yet they remain vulnerable to hazardous instructions that may trigger unsafe behaviors.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv stat.ML · Dec 24, 2025 14:51 UTC · Causal-driven attribution (CDA): Estimating channel influence without user-level data

Attribution modelling lies at the heart of marketing effectiveness, yet most existing approaches depend on user-level path data, which are increasingly inaccessible due to privacy regulations and platform restrictions.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 24, 2025 14:33 UTC · SpidR-Adapt: A Universal Speech Representation Model for Few-Shot Adaptation

Human infants, with only a few hundred hours of speech exposure, acquire basic units of new languages, highlighting a striking efficiency gap compared to the data-hungry self-supervised speech models.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 24, 2025 14:28 UTC · Schrödinger's Navigator: Imagining an Ensemble of Futures for Zero-Shot Object Navigation

Zero-shot object navigation (ZSON) requires a robot to locate a target object in a previously unseen environment without relying on pre-built maps or task-specific training.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv stat.ML · Dec 24, 2025 11:59 UTC · Active inference and artificial reasoning

This technical note considers the sampling of outcomes that provide the greatest amount of information about the structure of underlying world models.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv stat.ML · Dec 24, 2025 11:18 UTC · Statistical and computational challenges in ranking

We consider the problem of ranking $n$ experts according to their abilities, based on the correctness of their answers to $d$ questions.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv stat.ML · Dec 24, 2025 09:39 UTC · Understanding Scaling Laws in Deep Neural Networks via Feature Learning Dynamics

The empirical success of deep learning is often attributed to scaling laws that predict consistent gains as model, data, and compute grow; however, large models can exhibit training instability and diminishing…

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv stat.ML · Dec 24, 2025 07:34 UTC · Enhancing diffusion models with Gaussianization preprocessing

Diffusion models are a class of generative models that have demonstrated remarkable success in tasks such as image generation.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv stat.ML · Dec 24, 2025 07:10 UTC · Learning from Neighbors with PHIBP: Predicting Infectious Disease Dynamics in Data-Sparse Environments

Modeling sparse count data, which arise across numerous scientific fields, presents significant statistical challenges.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv stat.ML · Dec 24, 2025 03:39 UTC · Invariant Feature Extraction Through Conditional Independence and the Optimal Transport Barycenter Problem: the Gaussian case

A methodology is developed to extract $d$ invariant features $W=f(X)$ that predict a response variable $Y$ without being confounded by variables $Z$ that may influence both $X$ and $Y$.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv stat.ML · Dec 23, 2025 22:20 UTC · Weighted MCC: A Robust Measure of Multiclass Classifier Performance for Observations with Individual Weights

Several performance measures are used to evaluate binary and multiclass classification tasks.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv stat.ML · Dec 23, 2025 20:49 UTC · Subgroup Discovery with the Cox Model

We study the problem of subgroup discovery for survival analysis, where the goal is to find an interpretable subset of the data on which a Cox model is highly accurate.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.LG · Dec 23, 2025 14:40 UTC · GeoTransolver: Learning Physics on Irregular Domains Using Multi-scale Geometry Aware Physics Attention Transformer

We present GeoTransolver, a Multiscale Geometry-Aware Physics Attention Transformer for CAE that replaces standard attention with GALE, coupling physics-aware self-attention on learned state slices with…

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 19, 2025 18:11 UTC · Interpretable Plant Leaf Disease Detection Using Attention-Enhanced CNN

Plant diseases pose a significant threat to global food security, necessitating accurate and interpretable disease detection methods.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research

arXiv cs.AI · Dec 18, 2025 21:29 UTC · When F1 Fails: Granularity-Aware Evaluation for Dialogue Topic Segmentation

Dialogue topic segmentation supports summarization, retrieval, memory management, and conversational continuity.

Builder: scan the abstract + experiments; look for code, datasets, and evals.

Research
