Mobrief

The Brief

Labs just now

NVIDIA Blackwell Sets STAC-AI Record for LLM Inference in Finance

Large language models (LLMs) are revolutionizing the financial trading landscape by enabling sophisticated analysis of vast amounts of unstructured data to...

NVIDIA Developer
Product just now

FlexAttention + FlashAttention-4: Fast and Flexible

TL;DR: On Hopper and Blackwell GPUs, FlexAttention now has a FlashAttention-4 backend.

The post's authors added support in PyTorch to automatically generate CuTeDSL score/mask modification functions, and to JIT-instantiate FlashAttention-4 for custom attention variants. This leads to performance gains of 1.2×…

Why it matters

This leads to performance gains of 1.2× to 3.2× over the existing Triton implementation on compute-bound workloads.

PyTorch Blog
Labs 1h ago

Controlling Floating-Point Determinism in NVIDIA CCCL

A computation is considered deterministic if multiple runs with the same input data produce the same bitwise result.

While this may seem like a simple property...
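
As a minimal illustration of why bitwise reproducibility is not automatic (a sketch that is not taken from the NVIDIA post, with values chosen only for illustration): floating-point addition is not associative, so summing the same inputs in a different order, as a parallel reduction may do, can change the result bits. The helper name `bits` is hypothetical.

```python
import struct


def bits(x: float) -> str:
    """Return the IEEE 754 double-precision bit pattern of x as a hex string."""
    return struct.pack(">d", x).hex()


# Same three inputs, two different association orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# The sums are mathematically equal, yet their bit patterns differ.
same_bits = bits(left_to_right) == bits(right_to_left)
print(left_to_right, right_to_left, same_bits)
```

Comparing bit patterns rather than using a tolerance matches the definition quoted above, which asks for the same bitwise result across runs.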

NVIDIA Developer
Labs 8h ago

Reasoning models struggle to control their chains of thought, and that’s good

OpenAI introduces CoT-Control and finds reasoning models struggle to control their chains of thought, reinforcing monitorability as an AI safety safeguard.

OpenAI News
Labs 8h ago

GPT-5.4 Thinking System Card

OpenAI News: GPT-5.4 Thinking System Card.

OpenAI News
ALL STORIES

60 stories from 88 sources

Research just now

Chatbot Arena Elo Rankings — Top 20 Models

LMArena Elo Rankings — Chatbot Arena Elo Rankings — Top 20 Models. Compare and track AI model performance.

LMArena Elo Rankings
Research just now

How to Power Data 360 with Code Extension

Code Extension resolves a dilemma an architect faces: the need to operationalize complex data manipulations without leaving the Salesforce trust boundary.

Traditionally, handling advanced requirements like parsing complex XML, managing encrypted data, or designing custom AI chunking algorithms, led architects to export data to external systems or rely on unmanaged local…

Why it matters

Moving the data to perform advanced processing on it introduces risks, including rogue code security vulnerabilities, and compliance…

Salesforce AI Research
Labs just now

Ask a Techspert: How does AI understand my visual searches?

Learn more about AI Mode in Search’s query fan-out method for visual search.

Google AI Blog
Research 1h ago

ChatGPT users research products but won't buy there, forcing OpenAI to rethink its commerce strategy

OpenAI wanted to turn ChatGPT into a shopping destination, but only about a dozen retailers signed up and users weren't buying.

Now the company is handing off purchases to app partners like Instacart and Target.

The Decoder
Product 1h ago

Make kulinseth and albanD emeritus for MPS/Metal backend

PyTorch Releases: Make kulinseth and albanD emeritus for MPS/Metal backend.

PyTorch Releases
Product 1h ago

Scaling AI opportunity across the globe: Learnings from GitHub and Andela

Across the globe, developer talent is abundant.

But what has been historically inequitable is the access to emerging technologies, mentorship, and enablement when those technologies are reshaping the industry. Developers in regions like Africa, South America, and…

Why it matters

Developers in regions like Africa, South America, and Southeast Asia can build products at scale, yet access to emerging tools and…

GitHub Blog
Product 1h ago

The ultimate Nano Banana prompting guide

Creating precise, high-quality images often involves endless trial and error.

You need a model that actually understands what you’re asking for. Built on the Gemini 3 family of models, Nano Banana models apply deep reasoning capabilities to fully understand your prompt before generating an…

Google Cloud AI Blog
Tech 1h ago

Roblox is censoring chats with AI

Roblox is using AI to alter the content of chat messages on its platform in real time using a new feature rolling out today.

Real-time chat rephrasing goes beyond the current filtering for banned language, which replaces certain words and phrases with "#" symbols. Now, Roblox says those words and phrases can be "translated into […]

The Verge Tech
Business 1h ago

Gemini’s Canvas in AI Mode Available in Google Search in US

The expansion of the workspace platform follows a gradual rollout.

AI Business
Tech 1h ago

Meta’s AI glasses reportedly send sensitive footage to human reviewers in Kenya

Meta's AI-powered smart glasses could be sending sensitive footage to human reviewers in Nairobi, Kenya, according to an investigation by the Swedish outlets Svenska Dagbladet and Göteborgs-Posten.

The report, which was published last week, claims Meta contractors in Kenya have seen videos captured with the smart glasses that show "bathroom visits, sex and other intimate […]

The Verge Tech
Research 1h ago

Google Search quietly becomes an AI assistant as Canvas feature launches for US users

Google is turning AI search into a workspace.

Canvas lets users build interactive dashboards, documents, and code prototypes directly in AI mode.

The Decoder
Labs 1h ago

The latest AI news we announced in February

Here are Google’s latest AI updates from February 2026

Google AI Blog
Product 2h ago

Drive organizational growth with Amazon Lex multi-developer CI/CD pipeline

As your conversational AI initiatives evolve, developing Amazon Lex assistants becomes increasingly complex.

Multiple developers working on the same shared Lex instance leads to configuration conflicts, overwritten changes, and slower iteration cycles. Scaling Amazon Lex development requires isolated environments, version…

AWS Machine Learning
Research 2h ago

Olmo Hybrid and future LLM architectures

So-called hybrid architectures are far from new in open-weight models these days.

Recent examples include Qwen 3.5 (previewed by Qwen3-Next), Kimi Linear last fall (a smaller release than their flagship Kimi K2 models), Nvidia’s Nemotron 3 Nano (with the bigger models…

Why it matters

This is one of those times when a research trend looks like it’s getting adopted everywhere at once (maybe the Muon optimizer too, soon?).

Interconnects (Nathan Lambert)
Product 2h ago

Building custom model provider for Strands Agents with LLMs hosted on SageMaker AI endpoints

AWS Machine Learning: Building custom model provider for Strands Agents with LLMs hosted on SageMaker AI endpoints.

AWS Machine Learning
Product 2h ago

Deploying PyTorch Models to the Micro-Edge with ExecuTorch and Arm

PyTorch Blog: Deploying PyTorch Models to the Micro-Edge with ExecuTorch and Arm.

PyTorch Blog
Business 3h ago

Databricks built a RAG agent it says can handle every kind of enterprise search

Most enterprise RAG pipelines are optimized for one search behavior.

They fail silently on the others. A model trained to synthesize cross-document reports handles constraint-driven entity search poorly. A model tuned for simple lookup tasks falls apart on multi-step reasoning over…

VentureBeat
Research 3h ago

Social Relationship Management Software For Small Teams

Key takeaways: Social relationship management software lets small teams respond to customer inquiries faster and with more personalized context, focusing on the relationship aspect of marketing.

Integrating social with a unified data system, like a CRM, ensures that every interaction is recorded and accessible to your teams. Salesforce offers automated tools for connecting to your social network, helping to…

Salesforce AI Research
Research 3h ago

Tech giants make non-binding White House pledge to cover AI data center energy costs

Google, Microsoft, Meta, Amazon, Oracle, xAI, and OpenAI signed a voluntary pledge at the White House to cover the electricity costs of their data centers themselves.


The Decoder
Press 3h ago

The Download: an AI agent’s hit piece, and preventing lightning

This is today’s edition of The Download, MIT Technology Review's weekday newsletter that provides a daily dose of what’s going on in the world of technology.

Online harassment is entering its AI era Scott Shambaugh didn’t think twice when he denied an AI agent’s request to contribute to matplotlib, a software library he helps manage. Then things got weird. In the middle of…

MIT Technology Review
Research 4h ago

Apple puts AI disclosure responsibility on labels and distributors

Apple Music is rolling out Transparency Tags that let labels and distributors flag AI-generated content across four categories: Artwork, Tracks, Compositions, and Music Videos.


The Decoder
Research 4h ago

Anthropic CEO attacks OpenAI's Pentagon deal as "safety theater" while investors scramble for de-escalation

Anthropic CEO Dario Amodei attacks OpenAI's Pentagon contract as "80% safety theater" in a leaked memo and accuses the Trump administration of punishing his company for a lack of political loyalty.

OpenAI hastily updates its contract, investors push for de-escalation, and a major tech industry group backs Anthropic. Meanwhile, Amodei is making a last-ditch attempt to negotiate directly with the Under Secretary…

The Decoder
Tech 4h ago

Apple Music adds optional labels for AI songs and visuals

Apple is asking artists and record labels on its music streaming platform to voluntarily label songs that were made using AI.

The new "Transparency Tags" metadata system for Apple Music was announced in a newsletter to industry partners yesterday, according to Music Business Worldwide, and covers four categories, including track,…

The Verge Tech
Business 4h ago

Euro Regulators Question Meta Over AI Glasses Privacy Fears

The allegations involve a U.S.-based data annotation and labeling vendor.

AI Business
Tech 4h ago

AI tools can unmask anonymous accounts

Do you have a Reddit alt, secret X, finsta, or Glassdoor account you trash your boss with?

AI might have just made it a lot easier to unmask you. That's the conclusion of a recently published study, which hints at some uncomfortable consequences for staying private online - even if it's not quite time to […]

The Verge Tech
Product 5h ago

GPT 5.4 is now on AI Gateway

GPT-5.4 is now available on AI Gateway. This model brings the agentic and reasoning leaps from GPT-5.3-Codex to all domains.

This includes knowledge work like reports, spreadsheets, presentations, and analysis in addition to coding. It handles complex multi-step workflows more reliably, including tasks that involve tools, research, and…

Why it matters

GPT-5.4 is faster and also more token-efficient than previous iterations (GPT-5.2). To use this model, set model to openai/gpt-5.4 in the…

Vercel Blog
Research 5h ago

Alibaba's chief AI developer quits, taking key team members with him

Alibaba's chief AI developer Junyang Lin has unexpectedly resigned, and several core members of the Qwen team followed him out the door.

The departures were reportedly triggered by an internal reorganization.

The Decoder
News 6h ago

How Human Work Will Remain Valuable in an AI World

The Road to Reality — Episode 1

Towards Data Science
Tech 6h ago

Anthropic makes last-ditch effort to salvage deal with Pentagon after blowup

Anthropic CEO Dario Amodei is reportedly back at the negotiating table with the Department of Defense in an attempt to salvage the company's relationship with the US military and prevent it from being iced out of…

The Verge Tech
News 7h ago

Vector Databases vs. Graph RAG for Agent Memory: When to Use Which

Machine Learning Mastery: Vector Databases vs. Graph RAG for Agent Memory: When to Use Which.

Machine Learning Mastery
Press 7h ago

How much wildfire prevention is too much?

The race to prevent the worst wildfires has been an increasingly high-tech one.

Companies are proposing AI fire detection systems and drones that can stamp out early blazes. And now, one Canadian startup says it’s going after lightning. Lightning-sparked fires can be a big deal: The Canadian…

MIT Technology Review
Business 7h ago

An Interview with Gregory Allen About Anthropic and the U.S. Government

Stratechery: An Interview with Gregory Allen About Anthropic and the U.S. Government.

Stratechery
Research 8h ago

Entomologists Use a Particle Accelerator to Image Ants at Scale

The ants that animators once morphed into googly-eyed caricatures in films such as A Bug’s Life and Antz just received a meticulously precise anatomical reboot.

Writing today in Nature Methods, an international team…

Why it matters

Writing today in Nature Methods, an international team of entomologists, accelerator physicists, computer scientists, and biological…

IEEE Spectrum AI
Press 8h ago

Online harassment is entering its AI era

Scott Shambaugh didn’t think twice when he denied an AI agent’s request to contribute to matplotlib, a software library that he helps manage.

Like many open-source projects, matplotlib has been overwhelmed by a glut of AI code contributions, and so Shambaugh and his fellow maintainers have instituted a policy that all AI-written code must be reviewed and…

Why it matters

He rejected the request and went to bed.

MIT Technology Review
Labs 9h ago

Ensuring AI use in education leads to opportunity

OpenAI shares new tools, certifications, and measurement resources to help schools and universities close AI capability gaps and expand opportunity.

OpenAI News
News 9h ago

LWiAI Podcast #235 - Sonnet 4.6, Deep-thinking tokens, Anthropic vs Pentagon

Last Week in AI's 235th episode with a summary and discussion of last week’s big AI news!

Recorded on 02/27/2026. Hosted by Andrey Kurenkov and Jeremie Harris. Note from Andrey: the startup Astrocade is hiring for engineers, marketing, product, growth, and more! If you’re in the Bay Area, would like to join…

Last Week in AI
Labs 11h ago

High Error Rate in Realtime API (EU)

Status: Resolved. All impacted services have now fully recovered.

Affected components: Search (Operational), Images (Operational), Audio (Operational), Login (Operational), Connectors/Apps (Operational), Realtime (Operational), Files (Operational), Video generation (Operational), Login…

OpenAI Status
Labs 11h ago

User may experience errors in ChatGPT

Status: Resolved. All impacted services have now fully recovered.

Affected components: Conversations (Operational)

OpenAI Status
Product 13h ago

Revert "[BE] Apply up007 and up045 to .ci through tools "

This reverts commit f1da356.

Reverted #176458 on behalf of https://github.com/pytorch-auto-revert. Reason: reverted automatically by pytorch's autorevert; to avoid this behaviour, add the tag autorevert: disable.

PyTorch Releases
Research 23h ago

SimpliHuMoN: Simplifying Human Motion Prediction

Human motion prediction combines the tasks of trajectory forecasting and human pose prediction.

For each of the two tasks, specialized models have been developed. Combining these models for holistic human motion prediction is non-trivial, and recent methods have struggled to compete on established benchmarks for…

arXiv · Computer Vision
Research 23h ago

Accurate and Efficient Hybrid-Ensemble Atmospheric Data Assimilation in Latent Space with Uncertainty Quantification

Data assimilation (DA) combines model forecasts and observations to estimate the optimal state of the atmosphere with its uncertainty, providing initial conditions for weather prediction and reanalyses for climate…

Yet, existing traditional and machine-learning DA methods struggle to achieve accuracy, efficiency and uncertainty quantification simultaneously. Here, the authors propose HLOBA (Hybrid-Ensemble Latent…

Why it matters

Here, the authors propose HLOBA (Hybrid-Ensemble Latent Observation-Background Assimilation), a three-dimensional hybrid-ensemble DA…

arXiv · Machine Learning
Research 23h ago

SELDON: Supernova Explosions Learned by Deep ODE Networks

The discovery rate of optical transients will explode to 10 million public alerts per night once the Vera C. Rubin Observatory's Legacy Survey of Space and Time comes online, overwhelming the traditional physics-based inference pipelines.

A continuous-time forecasting AI model is of interest because it can deliver…

Why it matters

A continuous-time forecasting AI model is of interest because it can deliver millisecond-scale inference for thousands of objects per…

arXiv · Machine Learning
Research 23h ago

A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development

WebGIS development requires rigor, yet agentic AI frequently fails due to five large language model (LLM) limitations: context constraints, cross-session forgetting, stochasticity, instruction failure, and adaptation…

The authors propose a dual-helix governance framework that reframes these challenges as structural governance problems that model capacity alone cannot resolve. They implement the framework as a 3-track…

arXiv · Artificial Intelligence
Research 23h ago

ZipMap: Linear-Time Stateful 3D Reconstruction with Test-Time Training

Feed-forward transformer models have driven rapid progress in 3D vision, but state-of-the-art methods such as VGGT and $π^3$ have a computational cost that scales quadratically with the number of input images, making…

Sequential-reconstruction approaches reduce this cost but sacrifice reconstruction quality. The authors introduce ZipMap, a stateful feed-forward model that achieves linear-time, bidirectional 3D reconstruction while…

arXiv · Computer Vision
Research 23h ago

AgentIR: Reasoning-Aware Retrieval for Deep Research Agents

Deep Research agents are rapidly emerging as primary consumers of modern retrieval systems.

Unlike human users who issue and refine queries without documenting their intermediate thought processes, Deep Research agents generate explicit natural language reasoning before each search call, revealing rich…

Why it matters

To exploit this overlooked signal, the authors introduce: (1) Reasoning-Aware Retrieval, a retrieval paradigm that jointly embeds the…

arXiv · NLP & Language
Research 23h ago

Turning Trust to Transactions: Tracking Affiliate Marketing and FTC Compliance in YouTube's Influencer Economy

YouTube has evolved into a powerful platform where creators monetize their influence through affiliate marketing, raising concerns about transparency and ethics, especially when creators fail to disclose their…

Although regulatory agencies like the US Federal Trade Commission (FTC) have issued guidelines to address these issues, non-compliance and consumer harm persist, and the extent of these problems remains unclear. In…

arXiv · Machine Learning
Research 23h ago

TaxonRL: Reinforcement Learning with Intermediate Rewards for Interpretable Fine-Grained Visual Reasoning

Traditional vision-language models struggle with contrastive fine-grained taxonomic reasoning, particularly when distinguishing between visually similar species within the same genus or family.

The authors introduce TaxonRL, a reinforcement learning approach using Group Relative Policy Optimization with intermediate rewards that decomposes the reasoning process into hierarchical taxonomic predictions…

Why it matters

The method incentivizes models to explicitly reason about species-level, genus-level, and family-level features before making…

arXiv · Computer Vision
Research 23h ago

Helios: Real Real-Time Long Video Generation Model

The authors introduce Helios, the first 14B video generation model that runs at 19.5 FPS on a single NVIDIA H100 GPU and supports minute-scale generation while matching the quality of a strong baseline.

The authors report breakthroughs along three key dimensions: (1) robustness to long-video drifting without commonly used anti-drifting heuristics such as self-forcing, error-banks, or keyframe sampling; (2) real-time…

Why it matters

Specifically, Helios is a 14B autoregressive diffusion model with a unified input representation that natively supports T2V, I2V, and V2V…

arXiv · Computer Vision
Research 23h ago

Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization

As Large Language Models (LLMs) transition into autonomous multi-agent ecosystems, robust minimax training becomes essential yet remains prone to instability when highly non-linear policies induce extreme local…

Standard remedies that enforce global Jacobian bounds are overly conservative, suppressing sensitivity in all directions and inducing a large Price of Robustness. The authors introduce Adversarially-Aligned Jacobian…

Why it matters

The authors introduce Adversarially-Aligned Jacobian Regularization (AAJR), a trajectory-aligned approach that controls sensitivity…

arXiv · Artificial Intelligence
Research 23h ago

Knowledge: Evaluating Conversational Agents over Unstructured Knowledge

Conversational agents are increasingly deployed in knowledge-intensive settings, where correct behavior depends on retrieving and applying domain-specific knowledge from large, proprietary, and unstructured corpora…

Yet most existing benchmarks evaluate retrieval or tool use independently of each other, creating a gap in realistic, fully agentic evaluation over unstructured data in long-horizon interactions. The authors…

arXiv · Artificial Intelligence
Research 23h ago

Low-Resource Guidance for Controllable Latent Audio Diffusion

Generative audio requires fine-grained controllable outputs, yet most existing methods require model retraining on specific controls or inference-time controls (e.g., guidance) that can also be…

By examining the bottlenecks of existing guidance-based controls, in particular their high cost-per-step due to decoder backpropagation, the authors introduce a guidance-based approach through selective TFG and…

arXiv · Artificial Intelligence
Research 23h ago

Dual-Modality Multi-Stage Adversarial Safety Training: Robustifying Multimodal Web Agents Against Cross-Modal Attacks

Multimodal web agents that process both screenshots and accessibility trees are increasingly deployed to interact with web interfaces, yet their dual-stream architecture opens an underexplored attack surface: an…

The authors' vulnerability analysis on MiniWob++ reveals that attacks including a visual component far outperform text-only injections, exposing critical gaps in text-centric VLM safety training. Motivated by this…

arXiv · Artificial Intelligence
Research 23h ago

Robust Unscented Kalman Filtering via Recurrent Meta-Adaptation of Sigma-Point Weights

The Unscented Kalman Filter (UKF) is a ubiquitous tool for nonlinear state estimation; however, its performance is limited by the static parameterization of the Unscented Transform (UT).

Conventional weighting schemes, governed by fixed scaling parameters, assume implicit Gaussianity and fail to adapt to time-varying dynamics or heavy-tailed measurement noise. This work introduces the Meta-Adaptive…

Why it matters

This work introduces the Meta-Adaptive UKF (MA-UKF), a framework that reformulates sigma-point weight synthesis as a hyperparameter…

arXiv · Machine Learning
Research 23h ago

Dissecting Quantization Error: A Concentration-Alignment Perspective

Quantization can drastically increase the efficiency of large language and vision models, but typically incurs an accuracy drop.

Recently, function-preserving transforms (e.g. rotations, Hadamard transform, channel-wise scaling) have been successfully applied to reduce post-training quantization error, yet a principled explanation remains…

Why it matters

The authors analyze linear-layer quantization via the signal-to-quantization-noise ratio (SQNR), showing that for uniform integer…
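
For readers unfamiliar with the metric, here is a minimal sketch of SQNR under uniform integer quantization. It is not the paper's analysis: it assumes the common definition SQNR = 10·log10(signal power / quantization-noise power), a symmetric per-tensor scale, and synthetic Gaussian values, and the function names are hypothetical.

```python
import math
import random


def quantize_uniform(values, num_bits):
    """Symmetric uniform integer quantization followed by dequantization."""
    max_abs = max(abs(v) for v in values)
    levels = 2 ** (num_bits - 1) - 1  # e.g. 127 for 8 bits
    scale = max_abs / levels if max_abs > 0 else 1.0
    # Round to the nearest integer level, clamp to range, map back to floats.
    return [max(-levels, min(levels, round(v / scale))) * scale for v in values]


def sqnr_db(values, quantized):
    """Signal-to-quantization-noise ratio in decibels."""
    signal = sum(v * v for v in values)
    noise = sum((v - q) ** 2 for v, q in zip(values, quantized))
    return math.inf if noise == 0 else 10.0 * math.log10(signal / noise)


rng = random.Random(0)  # fixed seed so the run is repeatable
weights = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
sqnr_4 = sqnr_db(weights, quantize_uniform(weights, 4))
sqnr_8 = sqnr_db(weights, quantize_uniform(weights, 8))
print(f"4-bit: {sqnr_4:.1f} dB, 8-bit: {sqnr_8:.1f} dB")
```

A higher SQNR means the quantized values track the originals more closely, so the 8-bit figure should come out well above the 4-bit one.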

arXiv · Artificial Intelligence