RefDoc


Titre du document / Document title

Word boundary detection with mel-scale frequency bank in noisy environment

Auteur(s) / Author(s)

WU G.-D. (1); LIN C.-T. (1)

Affiliation(s) du ou des auteurs / Author(s) Affiliation(s)

(1) Department of Electrical and Control Engineering, National Chiao-Tung University, Hsinchu 300, Taiwan

Résumé / Abstract

This paper addresses the problem of automatic word boundary detection in the presence of noise. We first propose an adaptive time-frequency (ATF) parameter for extracting both the time and frequency features of noisy speech signals. The ATF parameter extends the TF parameter proposed by Junqua et al. from single-band to multiband spectrum analysis, where the frequency bands help to distinguish speech from noise clearly. The ATF parameter can extract useful frequency information by adaptively choosing proper bands of the mel-scale frequency bank. It increased the recognition rate by about 3% over a TF-based robust algorithm, which has been shown to outperform several commonly used algorithms for word boundary detection in the presence of noise. The ATF parameter also reduced the recognition error rate due to endpoint detection to about 20%. Based on the ATF parameter, we further propose a new word boundary detection algorithm that uses a neural fuzzy network (called SONFIN) to identify islands of word signals in a noisy environment. Owing to the self-learning ability of SONFIN, the proposed algorithm avoids the need to determine empirically the thresholds and ambiguous rules used in conventional word boundary detection algorithms. Compared with conventional neural networks, SONFIN can always find an economical network size for itself with high learning speed. Our results also showed that SONFIN's performance is not significantly affected by the size of the training set. The ATF-based SONFIN achieved a recognition rate about 5% higher than that of the TF-based robust algorithm. It also reduced the recognition error rate due to endpoint detection to about 10%, compared with an average of approximately 30% obtained with the TF-based robust algorithm and 50% obtained with the modified version of the Lamel et al. algorithm.
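The abstract refers to a mel-scale frequency bank without giving its construction. As a rough illustration only, the sketch below builds band edges equally spaced on the mel scale and sums a power spectrum into those bands. The formula 2595·log10(1 + f/700), the rectangular (non-overlapping) bands, and all function names are assumptions chosen for illustration; none of this is the authors' ATF definition or their band-selection rule, which the abstract does not specify.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mels with the common 2595*log10(1 + f/700) formula (an assumption)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_bands, f_min_hz, f_max_hz):
    """Return n_bands + 1 edge frequencies (Hz), equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / n_bands
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 1)]


def band_energies(power_spectrum, sample_rate_hz, edges_hz):
    """Sum power-spectrum bins into rectangular, non-overlapping bands.

    power_spectrum is assumed to hold bins 0..N/2 of an N-point FFT,
    so the last bin sits at the Nyquist frequency.
    """
    bin_hz = (sample_rate_hz / 2.0) / (len(power_spectrum) - 1)
    energies = [0.0] * (len(edges_hz) - 1)
    for k, power in enumerate(power_spectrum):
        f = k * bin_hz
        for b in range(len(energies)):
            if edges_hz[b] <= f < edges_hz[b + 1]:
                energies[b] += power
                break
    return energies


# Hypothetical usage: 20 bands covering 0-4000 Hz for 8 kHz speech.
edges = mel_band_edges(20, 0.0, 4000.0)
flat_spectrum = [1.0] * 129  # 256-point FFT, flat (white) power spectrum
energies = band_energies(flat_spectrum, 8000.0, edges)
```

Because the bands widen with frequency, a flat spectrum yields larger energies in the upper bands. A multiband detector along the lines the abstract describes would then have to decide which bands to trust in a given noise condition; that selection step is not sketched here.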

Revue / Journal Title

IEEE transactions on speech and audio processing, ISSN 1063-6676, CODEN IESPEJ

Source / Source

2000, vol. 8, no. 5, pp. 541-554 (23 ref.)

Langue / Language

English

Editeur / Publisher

Institute of Electrical and Electronics Engineers, New York, NY, United States (1993-2005) (Journal)

Mots-clés anglais / English Keywords

Automatic measurement; Fuzzy control; Neural network; Self learning; Spectrum analysis; Signal to noise ratio; Speech analysis; Noise source; Recognition; Error rate; Adaptive method; Spectral analysis; Word; Filter bank; Frequency band; Time frequency domain method

Mots-clés français / French Keywords

Mesure automatique; Commande floue; Réseau neuronal; Autoapprentissage; Analyse spectre; Rapport signal bruit; Analyse parole; Source bruit; Reconnaissance; Taux erreur; Méthode adaptative; Analyse spectrale; Mot; Banc filtre; Bande fréquence; Méthode domaine temps fréquence

Mots-clés espagnols / Spanish Keywords

Medición automática; Control difusa; Red neuronal; Autodidactismo; Análisis espectro; Relación señal ruido; Análisis palabra; Fuente ruido; Reconocimiento; Indice error; Método adaptativo; Análisis espectral; Palabra; Banco filtro; Banda frecuencia; Método dominio tiempo frecuencia

Localisation / Location

INIST-CNRS, INIST shelf number: 26266, 35400009090003.0050

RefDoc record no. (ud4): 1450733


