Interpreting Speaker Characteristics in the Dimensions of Self-Supervised Speech Features
How do speech models trained through self-supervised learning structure their representations? Previous studies have looked at how information is encoded in feature vectors across different layers...
arXiv cs.CL · Paper: ~15 min
2-Minute Brief
According to arXiv cs.CL: How do speech models trained through self-supervised learning structure their representations? Previous studies have looked at how information is encoded in feature vectors across different layers. But few studies have considered whether speech characteristics are captured within individual dimensions of SSL features. In this paper, we specifically look at speaker information using PCA on utterance-averaged representations. Using WavLM, we find that the principal dimension that explains the most variance...
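The analysis the brief describes (PCA over utterance-averaged SSL features) can be sketched in a few lines. This is a minimal, hedged illustration of the general recipe, not the paper's actual code: random arrays stand in for real WavLM layer outputs, and the utterance counts and the 768-dimensional feature size are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for WavLM features: 200 utterances, each a
# (num_frames, 768) array of frame-level hidden states.
utterances = [
    rng.normal(size=(int(rng.integers(50, 150)), 768)) for _ in range(200)
]

# Step 1: average frame-level features within each utterance, giving one
# fixed-size vector per utterance.
X = np.stack([u.mean(axis=0) for u in utterances])  # shape (200, 768)

# Step 2: PCA via SVD on the mean-centered matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of variance explained by each principal component.
var_ratio = S**2 / np.sum(S**2)
print(var_ratio[:5])

# Step 3: project every utterance onto the first principal dimension.
# With real features, these per-utterance scores are what one would
# relate to speaker characteristics.
pc1_scores = Xc @ Vt[0]
```

With real data, `utterances` would instead hold per-layer WavLM outputs extracted from audio; the averaging and PCA steps stay the same.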