arxiv:2308.07170

Human Voice Pitch Estimation: A Convolutional Network with Auto-Labeled and Synthetic Data

Published on Aug 14, 2023

Authors:

Jeremy Cochoy

Abstract

A specialized convolutional neural network combines synthetic and auto-labeled acapella data for effective pitch extraction in various musical and vocal contexts.

AI-generated summary

In the domain of music and sound processing, pitch extraction plays a pivotal role. Our research presents a specialized convolutional neural network designed for pitch extraction, particularly from the human singing voice in acapella performances. Notably, our approach combines synthetic data with auto-labeled acapella sung audio, creating a robust training environment. Evaluation across datasets comprising synthetic sounds, opera recordings, and time-stretched vowels demonstrates its efficacy. This work paves the way for enhanced pitch extraction in both music and voice settings.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2308.07170 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2308.07170 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2308.07170 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.