
ASR Guide 2

Automatic Speech Recognition (ASR), also known as speech-to-text, is the task of automatically transcribing spoken language. In this post, you will learn how to use NVIDIA’s Neural Modules (NeMo) toolkit to train an end-to-end ASR system, and Weights & Biases to track experiments and performance metrics.
Created on August 25 | Last edited on August 26

Setting up the Environment

Now that we have a rough idea of what Automatic Speech Recognition is and which tools we'll use in this post, the first step is to set up an environment where we can run the code.
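As a quick sanity check before going further, a small script like the one below can confirm that the two toolkits used in this post are importable. This is only a sketch: the helper name `missing_packages` is made up for illustration, and the pip names (`nemo_toolkit[asr]` and `wandb`) are assumptions that may differ depending on the versions you install.

```python
import importlib.util

# Import name -> pip package name (assumed; adjust to your setup).
REQUIRED = {
    "nemo": "nemo_toolkit[asr]",
    "wandb": "wandb",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose top-level module cannot be found."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("Environment looks ready.")
```

The check only looks for the modules on the import path without importing them, so it runs quickly even though NeMo itself is slow to load.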