Research


Accelerating Single-Pass SGD for Generalized Linear Prediction


Hugging Face Daily Papers · Mar 02, 2026 15:04 UTC · ~4 min read

2-Minute Brief

According to Hugging Face Daily Papers: We study generalized linear prediction under a streaming setting, where each iteration uses only one fresh data point for a gradient-level update. While momentum is well-established in deterministic optimization, a fundamental open question is whether it can accelerate such single-pass non-quadratic stochastic optimization. We propose the first algorithm that successfully incorporates momentum via a novel data-dependent proximal method, achieving dual-momentum acceleration. Our derived excess ri
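To make the streaming setting concrete, here is a minimal sketch of single-pass SGD on a logistic model (one instance of generalized linear prediction), where every step draws one fresh sample and never revisits it. The optional `beta` term is classical heavy-ball momentum, included only as a naive baseline; it is not the data-dependent proximal method the abstract refers to, whose details are not given here. The names `sample_point` and `single_pass_sgd`, the synthetic Gaussian data, and all step sizes are illustrative assumptions.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sample_point(w_true, rng):
    """Draw one fresh (x, y) pair from a hypothetical logistic model."""
    x = [rng.gauss(0.0, 1.0) for _ in w_true]
    p = sigmoid(sum(wi * xi for wi, xi in zip(w_true, x)))
    y = 1.0 if rng.random() < p else 0.0
    return x, y


def single_pass_sgd(w_true, n_steps, lr=0.05, beta=0.0, seed=0):
    """Streaming SGD: each step sees one new sample, never revisited.

    beta > 0 adds classical heavy-ball momentum (an illustrative baseline,
    NOT the data-dependent proximal scheme described in the abstract).
    """
    rng = random.Random(seed)
    d = len(w_true)
    w = [0.0] * d
    v = [0.0] * d  # momentum buffer
    for _ in range(n_steps):
        x, y = sample_point(w_true, rng)
        # Gradient of the logistic loss at the single fresh point
        # is (sigmoid(<w, x>) - y) * x.
        residual = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
        for j in range(d):
            v[j] = beta * v[j] - lr * residual * x[j]
            w[j] += v[j]
    return w


if __name__ == "__main__":
    w_true = [1.0, -2.0, 0.5]  # assumed ground-truth parameter
    print(single_pass_sgd(w_true, n_steps=20000, lr=0.05, beta=0.0))
    print(single_pass_sgd(w_true, n_steps=20000, lr=0.02, beta=0.5))
```

Because each gradient comes from a single sample, the iterates stay noisy under a constant step size. Whether momentum can help in this noisy, non-quadratic regime is the open question the abstract raises.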
