Speech To Text

Speech To Text Conversion Using WaveNet

License: Apache License 2.0

Tags: WaveNet, TensorFlow, NLP, Audio Processing, Audio to Text

WAVENET FOR SPEECH TO TEXT CONVERSION

WHAT IS IT?

WaveNet speech-to-text conversion. This is a TensorFlow implementation of speech recognition based on DeepMind's WaveNet: A Generative Model for Raw Audio. The main ingredient of WaveNet is the causal convolution. Causal convolutions ensure that the model cannot violate the ordering in which the data is modeled: the output at a given timestep depends only on the current and earlier timesteps, never on future ones.
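To make the idea of a causal convolution concrete, here is a minimal pure-Python sketch. It is an illustration only, not code from this repository: the function name causal_conv1d and its parameters are assumptions chosen for the example, and the real model uses TensorFlow layers rather than Python loops.

```python
def causal_conv1d(x, weights, dilation=1):
    """Illustrative dilated causal 1-D convolution (hypothetical helper).

    Computes y[t] = sum_k weights[k] * x[t - k * dilation], treating
    samples before the start of the signal as zero.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            # Only read the current or earlier samples, never later ones.
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
out = causal_conv1d(signal, [0.5, 0.5], dilation=2)

# Changing a later sample leaves all earlier outputs untouched.
changed = list(signal)
changed[5] = 100.0
out_changed = causal_conv1d(changed, [0.5, 0.5], dilation=2)
print(out[:5] == out_changed[:5])
```

Because each output reads only the current and earlier samples, editing the sample at index 5 cannot affect outputs 0 through 4, which is the ordering guarantee described above.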

HOW TO USE?

To run the inference script, use this command: python run.py <model path> <mp3 path>

Sample Command

python run.py asset/train asset/train/sample.mp3

ARGUMENTS	DETAILS	HELP OPTIONS
First Argument	MODELPATH	Path to the model
Second Argument	INPUT	Path to the input audio file
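If the inference script needs to be called from another Python program, the command above could be wrapped roughly as follows. This is a sketch under assumptions: the helper names build_inference_command and run_inference are hypothetical, run.py is assumed to be in the current working directory, and the script is assumed to print its result to standard output.

```python
import subprocess
import sys


def build_inference_command(model_path, audio_path):
    """Build the argument list for: python run.py MODELPATH INPUT."""
    # sys.executable reuses the interpreter that runs this script.
    return [sys.executable, "run.py", model_path, audio_path]


def run_inference(model_path, audio_path):
    """Run run.py and return whatever it writes to standard output.

    Assumption: run.py prints its result to stdout; adjust if it
    writes to a file instead.
    """
    cmd = build_inference_command(model_path, audio_path)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


cmd = build_inference_command("asset/train", "asset/train/sample.mp3")
print(" ".join(cmd[1:]))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a path contains spaces.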

WHAT ARE THE REQUIREMENTS?

To install all the requirements and dependencies, run the command that matches your hardware:

For GPU - pip install -r gpu_requirements.txt
For CPU - pip install -r cpu_requirements.txt

Note

Only CUDA 8 is supported for GPU inference.

Make sure to install these dependencies as admin, for all users:

sudo apt-get -qq -y install libsndfile-dev
sudo apt-get -qq -y install ffmpeg
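Before running inference, it can help to confirm that the system packages above are visible. The check below is a rough sketch using only the Python standard library. The function name missing_system_deps is hypothetical, and find_library relies on the platform's library lookup tools, so a missing result is a hint rather than proof.

```python
import shutil
from ctypes.util import find_library


def missing_system_deps():
    """Return the names of system dependencies that could not be located."""
    missing = []
    # ffmpeg is a command-line tool, so look for it on PATH.
    if shutil.which("ffmpeg") is None:
        missing.append("ffmpeg")
    # libsndfile is a shared library, so ask the system library lookup.
    if find_library("sndfile") is None:
        missing.append("libsndfile")
    return missing


missing = missing_system_deps()
if missing:
    print("Missing system dependencies: " + ", ".join(missing))
else:
    print("ffmpeg and libsndfile were found.")
```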

Framework	TensorFlow
OS Used	Linux
