Tutorial/Keynotes
SIRS'23 Speaker: Dr. S. R. Mahadeva Prasanna, Professor, Dept. of Electrical Engineering, Indian Institute of Technology Dharwad, Dharwad, India
Title of the Talk: Nonlinear Speech Processing by Deep Learning
Biography: Dr. S. R. Mahadeva Prasanna is a Professor of Electronics and Communication Engineering at the Indian Institute of Technology Dharwad. Coming from a family with deep roots in education, he brings a profound enthusiasm for teaching that is reflected in his pedagogy and in consistently strong student feedback. He has made pivotal contributions to the development of IIT Guwahati and to the establishment of IIT Dharwad. He introduced essential courses such as speech processing, neural networks, and deep learning, tailored to students from diverse academic backgrounds, an approach that has helped students secure admission to prestigious institutions in India and abroad. Beyond the classroom, Dr. Prasanna's commitment to research and innovation is commendable. He leads research groups and actively fosters collaboration between academia and industry. Notably, he mentors faculty from various engineering colleges in teaching methodology, research, and the writing of project proposals. His initiatives span faculty development programs, workshops, conferences, invited talks, and cultural events, enriching both the campus community and external institutions.
Abstract: Speech processing is a forefront application area of signal processing, and most developments in digital signal processing are applied directly to speech. The two fields have therefore advanced hand in hand for decades. Before the advent of deep learning, feature extraction by signal processing largely determined the performance of pattern recognition systems. For more than a decade now, deep learning has offered an alternative route to building speech technologies with far superior performance. These contributions can be viewed along two axes: representation learning, which performs nonlinear signal processing for feature extraction, and machine learning, which models pattern information from much larger amounts of data. This talk will present some of the interesting results of deep learning viewed as nonlinear signal processing, which has delivered speech technologies with human-level performance.
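To make the idea of deep learning as nonlinear signal processing concrete, here is a minimal, illustrative PyTorch sketch (our illustration, not material from the talk) of a learnable front-end: a stack of 1-D convolutions with nonlinear activations that maps raw waveform samples to a feature sequence, playing the role that fixed transforms such as mel filterbanks traditionally played. All layer sizes and window choices are assumptions.

```python
import torch
import torch.nn as nn

class LearnableFrontEnd(nn.Module):
    """Illustrative learned feature extractor for raw speech.

    Each Conv1d + ReLU stage is a nonlinear filtering operation whose
    parameters are learned from data, in contrast to a fixed transform
    such as the mel filterbank used for MFCCs.
    """
    def __init__(self, n_features: int = 40):
        super().__init__()
        self.net = nn.Sequential(
            # ~25 ms analysis window with ~10 ms hop at 16 kHz (assumed rate)
            nn.Conv1d(1, 64, kernel_size=400, stride=160),
            nn.ReLU(),
            nn.Conv1d(64, n_features, kernel_size=3, padding=1),
            nn.ReLU(),
        )

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # waveform: (batch, 1, samples) -> features: (batch, n_features, frames)
        return self.net(waveform)

if __name__ == "__main__":
    x = torch.randn(2, 1, 16000)   # two 1-second utterances at 16 kHz
    feats = LearnableFrontEnd()(x)
    print(feats.shape)             # torch.Size([2, 40, 98])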
SIRS'23 Speaker: Dr. Sri Krishnan, Professor of Electrical, Computer, and Biomedical Engineering, Toronto Metropolitan University, Toronto, Canada
Title of the Talk: Biomedical Signal Analysis and Digital Health
Biography: Sri Krishnan joined Toronto Metropolitan University, Toronto, Canada in 1999, where he is now a Professor of Electrical, Computer, and Biomedical Engineering. His research interests are in biomedical signal analysis, audio signal analysis, and explainable machine learning. He is a Fellow of the Canadian Academy of Engineering and, from 2007 to 2017, held a Canada Research Chair in Biomedical Signal Analysis. He is a recipient of the Outstanding Canadian Biomedical Engineer Award, the Achievement in Innovation Award from Innovate Calgary, the Sarwan Sahota Distinguished Scholar Award, the Young Engineer Achievement Award from Engineers Canada, the New Pioneers Award in Science and Technology, and the Exemplary Service Award from the IEEE Toronto Section.
Abstract: Biomedical data possess dynamic and complex characteristics that must be processed with adaptive, advanced algorithms and digital tools for data-driven decision support systems (DSS) and computer-aided diagnosis (CAD). Machine learning and artificial intelligence (AI) hold great promise for designing proactive digital healthcare DSS and CAD. To ensure trustworthy and fair results, AI techniques need to be explainable to both domain experts and end users. This talk will elaborate on the process of explainable AI through case studies from research projects under way at the Signal Analysis Research Lab at Toronto Metropolitan University, Canada.
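As a toy illustration of the kind of explainability the talk refers to (not the lab's actual pipeline), the sketch below trains a classifier on synthetic stand-ins for biomedical signal features and uses permutation importance, a model-agnostic explanation technique, to rank which features drive its decisions. The feature names and data are invented for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for extracted biomedical features
# (e.g., heart-rate variability statistics); names are illustrative.
feature_names = ["mean_rr", "sdnn", "rmssd", "lf_hf_ratio", "entropy"]
X = rng.normal(size=(500, 5))
# Make the label depend mostly on two features so the explanation is checkable.
y = (1.5 * X[:, 1] - 2.0 * X[:, 3] + 0.3 * rng.normal(size=500) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Model-agnostic explanation: how much does shuffling each feature hurt accuracy?
result = permutation_importance(clf, X_te, y_te, n_repeats=20, random_state=0)
for name, score in sorted(zip(feature_names, result.importances_mean),
                          key=lambda p: -p[1]):
    print(f"{name:12s} importance = {score:.3f}")
```

Run as written, the two features the label was constructed from (sdnn and lf_hf_ratio) should rank highest, which is the sanity check such explanations provide to domain experts.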
Tutorial Outline:
- Overview of Audio Deepfakes: Understanding the concept and the technology behind audio deepfakes.
- Challenges in Detecting Audio Deepfakes: Discussion of why traditional techniques fall short.
- Introduction to MFAAN: Exploring the architecture, principles, and motivations behind MFAAN.
- Working of MFAAN: A closer look at how MFAAN utilizes MFCC, LFCC, and Chroma-STFT for effective audio deepfake detection (see the sketch after this list).
- Hands-on Demonstration: A walkthrough of implementing MFAAN, showcasing its efficacy in real-world scenarios.
- Q&A and Discussion: Addressing queries and fostering a dialogue on future advancements and potential refinements in audio deepfake detection.
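For orientation ahead of the hands-on session, here is a minimal PyTorch sketch (our illustration, not the presenters' reference code) of the multi-feature idea: compute MFCC, LFCC, and Chroma-STFT representations of the same clip, pass each through its own convolutional branch, and fuse the branch embeddings for a real/fake decision. Layer sizes, transform settings, and the class names are assumptions.

```python
import torch
import torch.nn as nn
import torchaudio

class Branch(nn.Module):
    """One convolutional branch; input shape (batch, n_coeffs, frames)."""
    def __init__(self, in_ch: int, emb: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over time -> fixed-size vector
            nn.Flatten(),
            nn.Linear(32, emb),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class MultiFeatureDetector(nn.Module):
    """Illustrative MFAAN-style fusion of MFCC, LFCC, and chroma branches."""
    def __init__(self, n_mfcc: int = 40, n_lfcc: int = 40, n_chroma: int = 12):
        super().__init__()
        self.mfcc = torchaudio.transforms.MFCC(sample_rate=16000, n_mfcc=n_mfcc)
        self.lfcc = torchaudio.transforms.LFCC(sample_rate=16000, n_lfcc=n_lfcc)
        self.branches = nn.ModuleList(
            [Branch(n_mfcc), Branch(n_lfcc), Branch(n_chroma)]
        )
        self.head = nn.Linear(3 * 32, 2)  # fused embeddings -> real/fake logits

    def forward(self, wav: torch.Tensor, chroma: torch.Tensor) -> torch.Tensor:
        # wav: (batch, samples); chroma: (batch, 12, frames),
        # e.g., precomputed with librosa.feature.chroma_stft.
        feats = [self.mfcc(wav), self.lfcc(wav), chroma]
        embs = [branch(f) for branch, f in zip(self.branches, feats)]
        return self.head(torch.cat(embs, dim=1))

if __name__ == "__main__":
    wav = torch.randn(2, 16000)      # two 1-second clips at 16 kHz
    chroma = torch.randn(2, 12, 32)  # random placeholder for Chroma-STFT features
    logits = MultiFeatureDetector()(wav, chroma)
    print(logits.shape)              # torch.Size([2, 2])
```

Because each branch pools over time before fusion, the three feature streams need not share a hop length or frame count, which keeps the parallel-branch design simple.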
Prerequisites:
- Basic understanding of deep learning concepts.
- Familiarity with audio processing and feature extraction is advantageous.
- Prior exposure to the PyTorch framework is helpful but not mandatory.