Mobrief

Thursday, March 5, 2026 · 45 new · 90 sources · 15 papers

Product 7h ago

Quantization-Aware Training in TorchAO (II)

In our previous Quantization-Aware Training (QAT) blog, we introduced the initial QAT flow in TorchAO for large language models targeting edge devices with ExecuTorch.

Since then, we extended this flow to also target fast CUDA kernels like the ones in MSLK for fast inference in vLLM, and incorporated this flow into popular fine-tuning frameworks like Unsloth and…

PyTorch Blog

Business 3h ago

Nvidia Takes on Telco Industry With Open Source Model

While Nvidia's approach focuses on enabling more autonomous workflows for telco companies, it faces competition from traditional network vendors such as Ericsson and Nokia.

AI Business

Tech 8h ago

Google faces wrongful death lawsuit after Gemini allegedly ‘coached’ man to die by suicide

A lawsuit filed on Wednesday accuses Google's Gemini AI chatbot of trapping 36-year-old Jonathan Gavalas in a "collapsing reality" that involved a series of violent missions, ultimately ending with…

In the days leading up to his death, Gemini allegedly convinced Gavalas that he was "executing a covert plan to liberate his […]

The Verge Tech

Community 1h ago

Qwen3.5-24B-A3B-REAP-0.32: 32% Expert-Pruned for Agentic Coding (GGUF)

I forked CerebrasResearch/reap and added some custom patches for Qwen3.5 support, I have just released a REAPed…

I wanted to run the MoE model on my 16GB nvidia card and no one had pruned the model yet so I started this.

Reddit LocalLLaMA

Community 2h ago

I'm running a Truman Show for an AI agent. It writes its own code, files its own bugs, and doesn't know you're watching.

Four days ago I wrote a 200-line coding agent in Rust.

Gave it one rule: evolve yourself into something that rivals Claude Code.

Reddit LocalLLaMA

Product 6h ago

LangSmith CLI & Skills

We’re releasing a CLI along with our first set of skills to give AI coding agents expertise in the LangSmith ecosystem.

This includes adding tracing to agents, understanding their execution, building test sets, and evaluating performance.

LangChain Blog

ALL STORIES

60 stories from 90 sources

Research just now

Chatbot Arena Elo Rankings — Top 20 Models

Compare and track AI model performance.

LMArena Elo Rankings

Community 1h ago

AMD engineer leverages AI to help make a pure-Python AMD GPU user-space driver

Reddit Artificial

Community 1h ago

Anthropic CEO Dario Amodei calls OpenAI's messaging around military deal 'straight up lies,' report says | TechCrunch

Anthropic gave up its contract with the Pentagon over AI safety disagreements -- then,…

Reddit singularity

Press 1h ago

Grammarly Is Offering ‘Expert’ AI Reviews From Your Favorite Authors—Dead or Alive

The tool, offered by the recently-rebranded company Superhuman, gives feedback based on…

Wired AI

Community 1h ago

YuanLabAI/Yuan3.0-Ultra • Huggingface

Yuan 3.0 is a multimodal large model based on MoE architecture.

It supports multimodal inputs including text, images, tables and documents, and…

Reddit LocalLLaMA

Community 2h ago

GPT-5.4 on lmarena

Go try for yourself, both text and image input.

Reddit singularity

Community 2h ago

Bernie Sanders meets with Eliezer Yudkowsky and Nate Soares (MIRI) to discuss AI Risk

Reddit singularity

Press 2h ago

What AI Models for War Actually Look Like

While companies like Anthropic debate limits on military uses of AI, Smack Technologies…

Wired AI

Product 3h ago

Embed Amazon Quick Suite chat agents in enterprise applications

First, users need answers where they work—in their CRM, support console, or analytics…

AWS Machine Learning

Product 3h ago

Unlock powerful call center analytics with Amazon Nova foundation models

Call center analytics play a crucial role in improving customer experience and…

With foundation models (FMs), you can improve the quality and efficiency of call center…

AWS Machine Learning

Product 3h ago

How Ricoh built a scalable intelligent document processing solution on AWS

This post is cowritten by Jeremy Jacobson and Rado Fulek from Ricoh.

This post demonstrates how enterprises can overcome document processing scaling limits…

AWS Machine Learning

Business 4h ago

Black Forest Labs' new Self-Flow technique makes training multimodal AI models 2.8x more efficient

To create coherent images or videos, generative AI diffusion models like Stable…

But this reliance has come at a cost: a "bottleneck" where scaling up the model no…

VentureBeat

Research 5h ago

OpenAI's Codex app lands on Windows after topping a million Mac downloads in its first week

OpenAI brings its AI coding tool Codex to Windows, with native support for Windows…

The Decoder

Tech 5h ago

Google’s AI-powered workspace is now available to more users in Search

Google is bringing Canvas to everyone in the US using AI Mode in Search.

The feature opens up a dedicated workspace within its AI-powered search tool, allowing…

The Verge Tech

Research 6h ago

Google faces wrongful death suit after Gemini allegedly convinced a man to die and become digital

According to a lawsuit filed in a US federal court in Northern California on Wednesday,…

The Decoder

Labs 6h ago

Phi-4-reasoning-vision and the lessons of training a multimodal reasoning model

It is a broadly capable model that allows for natural interaction for a wide array of…

Microsoft Research

Product 6h ago

v5.3.0: EuroBERT, VibeVoice ASR, TimesFM2.5, PP-DocLayoutV2, OlmoHybrid, ModernVBert, Higgs Audio V2

New model additions: EuroBERT is a multilingual encoder model based on a…

It supports a mixture of European and widely spoken languages, with sequences of up to…

HF Transformers Releases

Research 7h ago

Meta signs multi-year AI deal with News Corp worth up to $50 million a year

Meta is paying News Corp up to $50 million a year for AI training data.

Good for individual publishers, bad for the industry as a whole.

The Decoder

News 7h ago

5 Essential Security Patterns for Robust Agentic AI

Machine Learning Mastery

Research 7h ago

GPT-5.4 reportedly brings a million-token context window and an extreme reasoning mode

GPT-5.4 is coming soon: double the context window of GPT-5.2, more reliable performance…

The Decoder

Labs 7h ago

Elevated errors on Claude Haiku 4.5

Mar 4, 17:01 UTC Resolved - Errors have returned to the baseline as of 8:08 PT / 16:08…

Mar 4, 16:13 UTC Monitoring - A fix has been implemented and we are monitoring the…

Anthropic Status

Labs 7h ago

Use Canvas in AI Mode to get things done and bring your ideas to life, right in Search.

Canvas in AI Mode is now available for everyone in the U.S.

Plus, it can now help you draft documents or build interactive tools.

Google AI Blog

Labs 7h ago

Tuning Flash Attention for Peak Performance in NVIDIA CUDA Tile

In this post, we dive into one of the most critical workloads in modern AI: Flash…

NVIDIA Developer

Product 7h ago

Azure IaaS series: Explore new resources for building a stronger, more efficient infrastructure

Why a modern cloud infrastructure foundation is critical to your business…

As organizations accelerate digital transformation, infrastructure decisions…

Azure Blog

Product 7h ago

Small models, high quality: Inside BMW Group’s experiments evaluating domain-specific language models

A car you can talk to has been a longstanding dream, whether as the basis for…

One way of achieving better, more natural voice commands is by incorporating AI…

Google Cloud AI Blog

Research 7h ago

Supreme Court AI copyright decision sounds sweeping but actually settles very little

AI inventor Stephen Thaler wanted the US Supreme Court to recognize a machine as the…

The court refused, but the ruling only covers this extreme case.

The Decoder

Research 8h ago

US military uses Anthropic's Claude for AI-driven strike planning in Iran war

In the war against Iran, the US military is using generative AI at scale for target…

Of all models, it's the one from the company Washington just banned.

The Decoder

Business 9h ago

OpenAI Says ChatGPT Instant 5.3 is Less Cringe, More Accurate

The AI model maker said it is responding to user criticisms.

AI Business

Research 9h ago

Do Your Customers Have Analysis Paralysis? Find Out

Key takeaways: Analysis paralysis in customers happens when you offer too many choices…

Instead, use integrated data to provide personalized recommendations that guide shoppers.

Salesforce AI Research

Press 10h ago

Bridging the operational AI gap

The transformational potential of AI is already well established.

Enterprise use cases are building momentum and organizations are transitioning from…

MIT Technology Review

Business 10h ago

Pentagon vendor cutoff exposes the AI dependency map most enterprises never built

The federal directive ordering all U.S. government agencies to cease using Anthropic technology comes with a six-month phaseout…

VentureBeat

News 11h ago

Escaping the Prototype Mirage: Why Enterprise AI Stalls

Too many prototypes, too few products

Towards Data Science

Press 11h ago

The Download: Earth’s rumblings, and AI for strikes on Iran

This is today’s edition of The Download, our weekday newsletter that provides a daily…

Listen to Earth’s rumbling, secret soundtrack: The boom of a calving glacier.

MIT Technology Review

Product 11h ago

MCP Apps support on Vercel

Teams can now build and deploy MCP Apps on Vercel with full support for Next.js. MCP…

They run inside iframes and communicate with any compatible host, such as ChatGPT,…

Vercel Blog

News 12h ago

RAG with Hybrid Search: How Does Keyword Search Work?

Understanding keyword search, TF-IDF, and BM25

Towards Data Science

Research 12h ago

Meta creates new applied AI engineering division

Meta is building a new applied AI engineering organization, according to an internal…

The Decoder

Research 13h ago

Anthropic nears $20 billion revenue run rate despite Pentagon feud

Anthropic is on track to generate nearly $20 billion in annual revenue based on current…

The Decoder

Business 13h ago

Anthropic’s Skyrocketing Revenue, A Contract Compromise?, Nvidia Earnings

Anthropic's enterprise business is reaching escape velocity, which increases the…

Then, agents dramatically increase demand for Nvidia chips, even if they threaten…

Stratechery

Research 14h ago

OpenAI is building a GitHub competitor that could challenge its biggest investor

OpenAI is building its own alternative to GitHub, Microsoft's widely used platform for…

The Decoder

Research yesterday

Utonia: Toward One Encoder for All Point Clouds

We dream of a future where point clouds from all domains can come together to shape a…

Toward this goal, we present Utonia, a first step toward training a single…

arXiv · Computer Vision

Research yesterday

MIBURI: Towards Expressive Interactive Gesture Synthesis

Embodied Conversational Agents (ECAs) aim to emulate human face-to-face interaction…

Current large language model (LLM)-based conversational agents lack embodiment and the…

arXiv · Computer Vision

Research yesterday

CFG-Ctrl: Control-Based Classifier-Free Diffusion Guidance

Classifier-Free Guidance (CFG) has emerged as a central approach for enhancing semantic…

In this paper, we explore a unified framework called CFG-Ctrl, which reinterprets CFG…

arXiv · Machine Learning

Research yesterday

How to Peel with a Knife: Aligning Fine-Grained Manipulation with Human Preference

Many essential manipulation tasks - such as food preparation, surgery, and…

These tasks are characterized not only by contact-rich, force-sensitive dynamics, but…

arXiv · Machine Learning

Research yesterday

ULTRA: Unified Multimodal Control for Autonomous Humanoid Whole-Body Loco-Manipulation

Achieving autonomous and versatile whole-body loco-manipulation remains a central…

Yet existing approaches are fundamentally constrained: retargeted data are often scarce…

arXiv · Computer Vision

Research yesterday

Tether: Autonomous Functional Play with Correspondence-Driven Trajectory Warping

The ability to conduct and learn from interaction and experience is a central challenge…

However, realizing such "play" requires (1) a policy robust to diverse, potentially…

arXiv · Artificial Intelligence

Research yesterday

Beyond Language Modeling: An Exploration of Multimodal Pretraining

The visual world offers a critical axis for advancing foundation models beyond language.

Despite growing interest in this direction, the design space for native multimodal…

arXiv · Computer Vision

Research yesterday

Learning Demographic-Conditioned Mobility Trajectories with Aggregate Supervision

Human mobility trajectories are widely studied in public health and social science,…

However, existing trajectory generation models rarely capture this heterogeneity…

arXiv · Machine Learning

Research yesterday

Gravity Falls: A Comparative Analysis of Domain-Generation Algorithm (DGA) Detection Methods for Mobile Device Spearphishing

Mobile devices are frequent targets of eCrime threat actors through SMS spearphishing…

Despite this, DGA research and evaluation largely emphasize malware C2 and email…

arXiv · Machine Learning

Research yesterday

LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory

Feedforward geometric foundation models achieve strong short-window reconstruction, yet…

We present LoGeR (Long-context Geometric Reconstruction), a novel architecture that…

arXiv · Machine Learning

Research yesterday

DuoMo: Dual Motion Diffusion for World-Space Human Reconstruction

We present DuoMo, a generative method that recovers human motion in world-space…

Reconstructing such motion requires solving a fundamental trade-off: generalizing from…

arXiv · Computer Vision

Research yesterday

Physics-informed post-processing of stabilized finite element solutions for transient convection-dominated problems

The numerical simulation of convection-dominated transient transport phenomena poses…

Classical discretization methods often generate spurious oscillations, requiring…

arXiv · Machine Learning

Research yesterday

Inherited Goal Drift: Contextual Pressure Can Undermine Agentic Goals

The accelerating adoption of language models (LMs) as agents for deployment in…

While prior-generation language model agents have been shown to be susceptible to…

arXiv · Artificial Intelligence

Research yesterday

Valet: A Standardized Testbed of Traditional Imperfect-Information Card Games

AI algorithms for imperfect-information games are typically compared using performance…

Card games are a natural domain for imperfect information due to hidden hands and…

arXiv · Artificial Intelligence

Research yesterday

Speculative Speculative Decoding

Autoregressive decoding is bottlenecked by its sequential nature.

Speculative decoding has become a standard way to accelerate inference by using a fast…

arXiv · Machine Learning
