
Author: Almaadeed, Noor
Author: Aggoun, Amar
Author: Amira, Abbes
Date available: 2024-08-11T05:39:17Z
Publication date: 2012
Publication name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Source: Scopus
ISBN: 3029743
URI: http://dx.doi.org/10.1007/978-3-642-34475-6_8
URI: http://hdl.handle.net/10576/57541
Abstract: Analyses of facial and audio features have been considered separately in conventional speaker identification systems. Herein, we propose a robust algorithm for text-independent speaker identification based on decision-level and feature-level fusion of facial and audio features. The suggested approach makes use of Mel-frequency cepstral coefficients (MFCCs) for audio signal processing, the Viola-Jones Haar cascade algorithm for face detection from video, and eigenface features (EFF) and Gaussian mixture models (GMMs) for feature-level and decision-level fusion of audio and video. Decision-level fusion is carried out using PCA for face and GMM for audio through AND voting. Feature-level fusion is investigated by combining both MFCC (audio) and PCA (face) features to construct a hybrid GMM for each speaker. Testing on GRID, a multi-speaker audio-visual database, shows that the decision-level fusion of PCA (face) and GMM (audio) achieves 98.2% accuracy and is almost 15% more efficient than feature-level fusion.
Language: en
Publisher: Springer
Subject: audio-visual fusion; Gaussian mixture models; Mel-frequency cepstral coefficients; principal component analysis; speaker identification
Title: Audio-visual feature fusion for speaker identification
Type: Conference Paper
Pages: 56-67
Issue number: PART 1
Volume number: 7663 LNCS
Access type: Abstract Only
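
Illustrative note: the abstract names two fusion strategies, AND voting at the decision level and a combined MFCC/PCA feature vector at the feature level. The record does not give their exact form, so the following is only a minimal sketch under stated assumptions: AND voting is read as "accept an identity only when the face and audio classifiers pick the same speaker", and feature-level fusion is read as plain concatenation of the two feature vectors. The function names `and_vote` and `fuse_features` are hypothetical and do not come from the paper.

```python
from typing import List, Optional, Sequence


def and_vote(face_scores: Sequence[float],
             audio_scores: Sequence[float]) -> Optional[int]:
    """Decision-level fusion sketch (assumed form of AND voting).

    Each list holds one score per enrolled speaker, higher meaning a
    better match. The speaker index is returned only when both
    modalities select the same speaker; otherwise None is returned.
    """
    if len(face_scores) == 0 or len(face_scores) != len(audio_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    face_pick = max(range(len(face_scores)), key=lambda i: face_scores[i])
    audio_pick = max(range(len(audio_scores)), key=lambda i: audio_scores[i])
    return face_pick if face_pick == audio_pick else None


def fuse_features(mfcc: Sequence[float], pca: Sequence[float]) -> List[float]:
    """Feature-level fusion sketch (assumed to be simple concatenation).

    The fused vector would then be used to train one model per speaker.
    """
    return list(mfcc) + list(pca)


# Both modalities favour speaker 1, so the vote accepts that identity.
agreed = and_vote([0.1, 0.9, 0.3], [-5.0, -1.0, -7.0])
# The modalities disagree, so the vote rejects.
rejected = and_vote([0.9, 0.1], [-5.0, -1.0])
```

Under this reading, a disagreement between modalities yields no identification rather than a guess, which is one plausible way an AND rule could trade coverage for precision.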


Files in this item

There are no files associated with this item.
