Deep Learning for Audio

This two-day workshop is designed to give participants the skills needed to take Deep Learning into the real world. Whether deploying to the cloud, at the edge on mobile, or in the browser, we look at the strategies, frameworks and model changes needed to get the best performance from various types of models.

What You'll Learn

✓ Understanding the fundamentals of audio as a data source
✓ Building pipelines to pre-process audio for neural networks
✓ How to build various types of Audio Classification models

✓ Using relevant audio libraries like Librosa
✓ How ASR models work and how their losses differ from those of other models
✓ Understanding some of the models used in Alexa and Google Home

Course Overview

The use of Deep Learning to build audio models has until recently lagged behind its use in the image and text domains. Now, though, Deep Learning is being used in a wide variety of audio models and applications. From Automatic Speech Recognition (ASR) and speaker diarization through to audio manipulation models for tasks like noise reduction and signal processing, Deep Learning is finding its way into all areas of digital audio, from classifying audio to creating new audio.

In this course we look at the different types of audio models, data pipelines and techniques. Audio files often require conversion, and there are a variety of ways to manipulate audio as data, from models that apply convolutions after converting audio into spectrograms through to techniques applied to raw waveforms to generate new audio. The course covers a range of audio manipulations for the common tasks people want to perform with audio.
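
As a rough illustration of the conversion step described above, turning a raw waveform into a time-frequency image that a convolutional model can consume, here is a minimal Python sketch using only NumPy. The function name `spectrogram`, the frame and hop sizes, and the synthetic 1 kHz test tone are assumptions chosen for the example, not material taken from the course. In practice a library such as Librosa would typically handle this step.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a simple short-time Fourier transform.

    frame_len and hop are illustrative defaults, not course settings.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the waveform into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps the non-negative frequencies: frame_len // 2 + 1 bins.
    return np.abs(np.fft.rfft(frames, axis=1)).T  # (freq_bins, time_frames)

# Synthetic one-second 1 kHz tone sampled at 8 kHz (hypothetical input).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sample_rate / 256  # bin index -> frequency in Hz
print(spec.shape, peak_hz)
```

The resulting 2-D array can be treated much like a single-channel image, which is why convolutional layers are a natural fit for it.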

Overall, this course is designed to give participants a practical, hands-on approach. Students will be taught from, and given, real-world code examples, along with in-class challenges to work through and complete during the class. The goal is to prepare students for the applications, challenges and needs they will face day to day as data scientists working on audio tasks.

Topics covered include:

Audio classification
Detecting voices
Audio pipelines
Processing audio with spectrograms
Multi-label audio problems
Wavenets for audio and speech generation
Intro to Automatic Speech Recognition (ASR)
Noise reduction
How wake word models work on mobile and custom hardware

Duration

2 days live + 7 hours online

Pricing

$1800 per pax

Prerequisites

A solid understanding of Deep Learning and TensorFlow

Technologies we teach will include:

Certificate

Earn a certificate upon completion

Training Level

Intermediate Level

Time to Complete

Approx. 21 hours to complete

Why Study Artificial Intelligence?

1. Demand for AI/DL jobs is at an all-time high.

2. Developers need these skills

3. AI/DL jobs pay more than standard developer jobs

Request More Info

Sign up to receive additional information about this course. Find out what other learners are doing with the skills they gained, and evaluate if this course is the right fit for you.

Frequently Asked Questions

Do I have to log in at a set time? How does the 360-degree assessment work? At this point, you probably have a few questions, and we’ve got answers.

About RDAI
