Enhancing Voice Liveness Detection in Speaker Verification systems using Machine Learning approach (Record no. 138785)

MARC details
000 -LEADER
fixed length control field 04220ngm a22001577a 4500
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 220707b |||||||| |||| 00| 0 eng d
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number TT000111
Item number MAN
100 ## - MAIN ENTRY--PERSONAL NAME
Personal name Mankad, Sapan Hareshbhai
245 ## - TITLE STATEMENT
Title Enhancing Voice Liveness Detection in Speaker Verification systems using Machine Learning approach
Statement of responsibility, etc by Sapan Hareshbhai Mankad
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT)
Place of publication, distribution, etc Ahmedabad
Name of publisher, distributor, etc Nirma Institute of Technology
Date of publication, distribution, etc April 2021
300 ## - PHYSICAL DESCRIPTION
Extent 119p Ph. D. Thesis with Synopsis and CD
500 ## - GENERAL NOTE
General note Guided by: Dr Sanjay Garg With Synopsis and CD <br/>15EXTPHDE143<br/><br/>ABSTRACT:<br/>Biometric authentication has recently replaced conventional authentication mechanisms of "holding" or "remembering" some evidence to claim one's identity. With the advent of these biometric-enabled operations, the issue of "forgotten password" or "theft" has been resolved, but at the same time, there are some threats to these biometric systems. Over the last decade, there has been an increasing interest in developing voice-controlled biometric systems, often termed Automatic Speaker Verification (ASV) systems. Voice biometric systems are vulnerable to four common spoofing attacks, namely, impersonation (mimicry), replay (playback), voice conversion and speech synthesis. Among these, playback spoofing attacks are the easiest to implement from the perspective of the attacker. Thus, being low-effort attacks, they present the maximum threat to voice biometric systems. This thesis addresses this problem and investigates approaches to detect such attacks. Thus, the goal of this thesis is to supplement the state of the art in playback spoofing detection in automatic speaker verification systems by presenting countermeasures and antispoofing approaches. In this work, we have attempted to address security issues in voice biometric systems. These systems are most susceptible to playback spoofing attacks due to the availability of smartphones to any end user. Due to this, ASV systems are not yet commercialized significantly. We have used the ASVspoof 2017 dataset for implementation and presented our findings for replay spoofing detection using fusion of short-term spectral features. Results using inverted Mel frequency cepstral coefficients (IMFCC) are promising and point to directions for further research with the help of these potential features. ASV systems are the most susceptible to replay spoofing attacks.
Liveness of an input audio sample has to be ensured to counter such attacks. Experiments are conducted with standalone and fused feature representations of audio in this thesis to assess the performance of the antispoofing systems using spoofing detection equal error rate (EER). Further, the impact of the proposed static IMFCC based system under mismatched conditions is evaluated, along with other systems, by training and testing it in different environments (with different background conditions). Results show that the proposed system outperforms the other systems used in this study in the experiments. Motivated by the promising results of IMFCCs, a deep investigation into high-frequency regions of the audio signals is carried out on features derived from intrinsic mode functions (IMF) obtained through Empirical Mode Decomposition (EMD). Experiments show promising results for the replay spoofing detection task on the benchmark corpus. An emphasis on the first IMF based representation serves as a preprocessing technique to retrieve high-frequency components of an underlying signal. In order to examine the role of recording instruments in detecting recorded speech, a novel multiclass classification based framework using transfer learning has been proposed. A comparison of the spoofing detection system as a binary vs. multiclass classification task has been performed. Data augmentation has been attempted on audio data to supplement the available corpus to understand the impact of artificially generated data samples on the playback spoofing detection task. This has been accomplished using a conventional oversampling technique and a modern generative adversarial networks based technique.
856 ## - ELECTRONIC LOCATION AND ACCESS
Public note Institute Repository (Campus Access)
Uniform Resource Identifier https://repository.nirmauni.ac.in/jspui/handle/123456789/11136
856 ## - ELECTRONIC LOCATION AND ACCESS
Public note Shodhganga
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme Dewey Decimal Classification
Koha item type Thesis

No items available.
