Interpreting Speaker Characteristics in the Dimensions of Self-Supervised Speech Features
How do speech models trained through self-supervised learning structure their representations? Previous studies have looked at how information is encoded in feature vectors across different layers...
arXiv cs.CL · Paper: ~15 min
2-Minute Brief
According to arXiv cs.CL: How do speech models trained through self-supervised learning structure their representations? Previous studies have looked at how information is encoded in feature vectors across different layers. But few studies have considered whether speech characteristics are captured within individual dimensions of SSL features. In this paper, we specifically look at speaker information using PCA on utterance-averaged representations. Using WavLM, we find that the principal dimension that explains the most variance...
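The analysis the brief describes (PCA over utterance-averaged SSL features) can be sketched in a few lines. This is a minimal, hedged illustration of the general recipe, not the paper's actual code: random arrays stand in for real WavLM layer outputs, and the utterance counts and the 768-dimensional feature size are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for WavLM features: 200 utterances, each a
# (num_frames, 768) array of frame-level hidden states.
utterances = [
    rng.normal(size=(int(rng.integers(50, 150)), 768)) for _ in range(200)
]

# Step 1: average frame-level features within each utterance, giving one
# fixed-size vector per utterance.
X = np.stack([u.mean(axis=0) for u in utterances])  # shape (200, 768)

# Step 2: PCA via SVD on the mean-centered matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of variance explained by each principal component.
var_ratio = S**2 / np.sum(S**2)
print(var_ratio[:5])

# Step 3: project every utterance onto the first principal dimension.
# With real features, these per-utterance scores are what one would
# relate to speaker characteristics.
pc1_scores = Xc @ Vt[0]
```

With real data, `utterances` would instead hold per-layer WavLM outputs extracted from audio; the averaging and PCA steps stay the same.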