UnivIS
Information system of Friedrich-Alexander-University Erlangen-Nuremberg © Config eG 
Selected Topics of Deep Learning for Audio, Speech, and Music Processing (DLA), 2.5 ECTS
(English title: Selected Topics of Deep Learning for Audio, Speech, and Music Processing)

Module coordinators: Emanuël A. P. Habets, Meinard Müller
Lecturers: Emanuël A. P. Habets, Meinard Müller


Start semester: SS 2021
Duration: 1 semester
Cycle: annually (SS)
Contact hours: 20 h
Self-study: 55 h
Language: English

Lectures:


Recommended prerequisites:

It is recommended to complete the following modules before starting this module:

Deep Learning (WS 2020/2021)


Content:

Many recent advances in audio, speech, and music processing have been driven by techniques based on deep learning (DL). For example, DL-based techniques have led to significant improvements in speaker separation, speech synthesis, acoustic scene analysis, audio retrieval, chord recognition, melody estimation, and beat tracking. Considering specific audio, speech, and music processing tasks, we study various DL-based approaches and their capability to extract complex features and make predictions based on hidden structures and relations. Rather than giving a comprehensive overview, we will study selected and generally applicable DL-based techniques. Furthermore, in the context of challenging application scenarios, we will critically review the potential and limitations of recent deep learning techniques. One main objective of the lecture is to discuss how domain knowledge can be integrated into neural network architectures to obtain explainable models that are less vulnerable to data biases and confounding factors.

The course starts with two overview lectures that introduce current research problems in audio, speech, and music processing. We will then continue with 6 to 8 lectures on selected audio processing topics and DL-based techniques. These lectures are based on articles from the research literature, which we will explain in detail and in mathematical depth; we may also try to invite some of the original authors to serve as guest lecturers. Finally, we round off the course with a concluding lecture covering practical aspects (e.g., hardware, software, version control, reproducibility, datasets) that are relevant when working with DL-based techniques.

Learning objectives and competencies:

  • Students will be able to understand central tasks in audio, speech, and music processing, present the main ideas and challenges in their own words, and outline possible solutions.

  • Students will be able to apply and adapt deep learning techniques for the analysis, synthesis, comparison, classification, and decomposition of audio signals.

  • Students will be able to discuss the meaning and impact of parameters for various audio processing tasks.

  • Students will be able to question assumptions that are often implicitly made when using deep learning methods.

  • Students will be able to predict when methods might work for analyzing specific audio signals and when they typically fail.

  • Students will be able to explain how domain knowledge can be taken into account.

  • Students will be able to prepare for the lecture using selected literature and online resources (e.g., Jupyter notebooks).

  • Students will be able to question existing approaches regarding their generalizability and applicability in practice.

  • Students will be able to test their understanding of what they have learned by formulating questions and posing them to the lecturer and the audience during the lecture.

  • Students will be able to independently organize learning groups in which the subject is discussed and deepened.

Organizational matters:

In this course, we require a good knowledge of deep learning techniques, machine learning, and pattern recognition as well as a strong mathematical background. Furthermore, we require a solid background in general digital signal processing and some experience with audio, image, or video processing.

It is recommended to complete the following modules (or to have equivalent knowledge) before starting this module:

  • Lecture Deep Learning

  • Digitale Signalverarbeitung

  • Statistische Signalverarbeitung

  • Sprach- und Audiosignalverarbeitung


Further information:

Keywords: AudioLabs, Deep Learning, Audio, Speech, Music
www: https://www.audiolabs-erlangen.de/fau/professor/mueller/teaching/2021s_dla

Applicability of the module / placement in the model study plan:
The module can be used in the context of the following degree programs/specializations:

  1. Advanced Signal Processing & Communications Engineering (Master of Science)
    (Exam regulations version 2016w | TechFak | Advanced Signal Processing & Communications Engineering (Master of Science) | Overall account | Elective modules | Technical Electives | Selected Topics of Deep Learning for Audio, Speech, and Music Processing)
  2. Advanced Signal Processing & Communications Engineering (Master of Science)
    (Exam regulations version 2020w | TechFak | Advanced Signal Processing & Communications Engineering (Master of Science) | Overall account | Technical Electives | Selected Topics of Deep Learning for Audio, Speech, and Music Processing)
  3. Communications and Multimedia Engineering (Master of Science)
    (Exam regulations version 2011 | TechFak | Communications and Multimedia Engineering (Master of Science) | Overall account | Elective modules | Technical elective modules | Selected Topics of Deep Learning for Audio, Speech, and Music Processing)
  4. Information and Communication Technology (Master of Science)
    (Exam regulations version 2019s | TechFak | Information and Communication Technology (Master of Science) | Overall account | Elective modules | Elective modules offered by EEI and Computer Science | Selected Topics of Deep Learning for Audio, Speech, and Music Processing)
  5. Information and Communication Technology (Master of Science)
    (Exam regulations version 2019s | TechFak | Information and Communication Technology (Master of Science) | Overall account | Elective modules | Elective modules offered by the Faculty of Engineering or the Faculty of Sciences | Selected Topics of Deep Learning for Audio, Speech, and Music Processing)

Course and examination achievements:

Selected Topics of Deep Learning for Audio, Speech, and Music Processing (examination number: 45211)
Examination, oral exam, duration (in minutes): 30, graded, 2.5 ECTS
Share in the calculation of the module grade: 100.0 %
Examination language: English

First attempt: SS 2021, 1st retake: WS 2021/2022
1st examiner: Emanuël A. P. Habets, 2nd examiner: Meinard Müller

UnivIS is a product of Config eG, Buckenhof