Enhancing Voice Liveness Detection in Speaker Verification systems using Machine Learning approach (Record no. 138785)

MARC details
000 -LEADER
fixed length control field	04220ngm a22001577a 4500
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field	220707b \|\|\|\|\|\|\|\| \|\|\|\| 00\| 0 eng d
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number	TT000111
Item number	MAN
100 ## - MAIN ENTRY--PERSONAL NAME
Personal name	Mankad, Sapan Hareshbhai
245 ## - TITLE STATEMENT
Title	Enhancing Voice Liveness Detection in Speaker Verification systems using Machine Learning approach
Statement of responsibility, etc	by Sapan Hareshbhai Mankad
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT)
Place of publication, distribution, etc	Ahmedabad
Name of publisher, distributor, etc	Nirma Institute of Technology
Date of publication, distribution, etc	April 2021
300 ## - PHYSICAL DESCRIPTION
Extent	119 p. Ph.D. Thesis with Synopsis and CD
500 ## - GENERAL NOTE
General note	Guided by: Dr Sanjay Garg. With Synopsis and CD
15EXTPHDE143

ABSTRACT:
Biometric authentication has recently replaced conventional authentication mechanisms of "holding" or "remembering" some evidence to claim one's identity. With the advent of these biometric-enabled operations, the issue of "forgotten password" or "theft" has been resolved, but at the same time, there are some threats to these biometric systems. Over the last decade, there has been an increasing interest in developing voice-controlled biometric systems, often termed Automatic Speaker Verification (ASV) systems. Voice biometric systems are vulnerable to four common spoofing attacks, namely, impersonation (mimicry), replay (playback), voice conversion and speech synthesis. Among these, playback spoofing attacks are the easiest to implement from the perspective of the attacker. Thus, being low-effort attacks, these attacks present the maximum threat to voice biometric systems. This thesis addresses this problem and investigates approaches to detect such attacks. Thus, the goal of this thesis is to supplement the state of the art in playback spoofing detection in automatic speaker verification systems by presenting countermeasures and antispoofing approaches.
In this work, we have attempted to address security issues in voice biometric systems. These systems are most susceptible to playback spoofing attacks due to the availability of smartphones to any end user. Due to this, ASV systems are not yet commercialized significantly. We have used the ASVspoof 2017 dataset for implementation and presented our findings for replay spoofing detection using fusion of short-term spectral features. Results using inverted Mel frequency cepstral coefficients (IMFCC) are promising and point to directions for further research with the help of these potential features. ASV systems are the most susceptible to replay spoofing attacks.
Liveness of an input audio sample has to be ensured to counter such attacks. Experiments are conducted with standalone and fused feature representations of audio in this thesis to assess the performance of the antispoofing systems using the spoofing detection equal error rate (EER). Further, the impact of the proposed static IMFCC based system under mismatched conditions is evaluated, along with other systems, by training and testing it in different environments (with different background conditions). Results show that the proposed system outperforms the other systems used in this study in the experiments.
Motivated by the promising results of IMFCCs, a deep investigation into high-frequency regions of the audio signals is carried out on features derived from intrinsic mode functions (IMF) obtained through Empirical Mode Decomposition (EMD). Experiments show promising results for the replay spoofing detection task on a benchmark corpus. An emphasis on the first IMF based representation serves as a preprocessing technique to retrieve high-frequency components of an underlying signal.
In order to examine the role of recording instruments in detecting recorded speech, a novel multiclass classification based framework using transfer learning has been proposed. A comparison of the spoofing detection system as a binary vs. multiclass classification task has been performed. Data augmentation has been attempted on audio data to supplement the available corpus and to understand the impact of artificially generated data samples on the playback spoofing detection task. This has been accomplished using a conventional oversampling technique and a modern generative adversarial networks based technique.
856 ## - ELECTRONIC LOCATION AND ACCESS
Public note	Institute Repository (Campus Access)
Uniform Resource Identifier	https://repository.nirmauni.ac.in/jspui/handle/123456789/11136
856 ## - ELECTRONIC LOCATION AND ACCESS
Public note	Shodhganga
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme	Dewey Decimal Classification
Koha item type	Thesis

No items available.
