Author Goto, Masataka
Abstract This paper describes a predominant-F0 (fundamental frequency) estimation method called PreFEst, which can detect melody and bass lines in monaural audio signals containing sounds of various instruments. While most previous methods assumed mixtures of only a few sounds and had difficulty dealing with such complex signals, our method can estimate the F0 of the melody and bass lines in compact-disc recordings without assuming the number of sound sources. In this paper we propose three extensions to our previous PreFEst to make it more adaptive and flexible: introducing multiple harmonic-structure tone models, estimating the shape of the tone models, and introducing a prior distribution over the tone-model shapes and F0 estimates. These extensions were implemented as MAP (maximum a posteriori probability) estimation using the Expectation-Maximization algorithm. Experimental results with compact-disc recordings showed that our real-time system based on the extended PreFEst achieved improved performance.
Publisher Date 2001-01-01