Speech To Text

by SHIVAM GARG

Speech To Text Conversion Using WaveNet


License: Apache License 2.0

Tags: WaveNet, TensorFlow, NLP, Audio Processing, Audio to Text

Model stats and performance
Framework: TensorFlow
OS Used: Linux


WAVENET FOR SPEECH TO TEXT CONVERSION

WHAT IS IT?

WaveNet speech-to-text conversion. This is a TensorFlow implementation of speech recognition based on DeepMind's "WaveNet: A Generative Model for Raw Audio". The main ingredient of WaveNet is the causal convolution. Causal convolutions ensure that the model cannot violate the ordering in which the data is modeled: the prediction at a given timestep cannot depend on any future timestep.
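To make the idea of a causal convolution concrete, here is a minimal sketch in plain NumPy. It is not taken from this model's code; the function name `causal_conv1d` and its parameters are illustrative assumptions. The `dilation` argument reflects the dilated causal convolutions described in the WaveNet paper.

```python
import numpy as np

def causal_conv1d(x, weights, dilation=1):
    """1-D causal convolution (illustrative sketch, not this model's code).

    out[t] = sum_i weights[i] * x[t - i * dilation]
    so out[t] only ever reads the current and past samples.
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    k = len(weights)
    # Left-pad with zeros so that early outputs never reach into the future.
    pad = (k - 1) * dilation
    padded = np.concatenate([np.zeros(pad), x])
    out = np.zeros(len(x))
    for t in range(len(x)):
        for i in range(k):
            # padded[pad + t - i * dilation] is x[t - i * dilation]
            out[t] += weights[i] * padded[pad + t - i * dilation]
    return out

# Changing the last input sample leaves all earlier outputs untouched.
a = causal_conv1d([1, 2, 3, 4], [0.5, 0.25])
b = causal_conv1d([1, 2, 3, 99], [0.5, 0.25])
print(np.allclose(a[:3], b[:3]))
```

The left-only zero padding is what makes the operation causal: a regular "same" convolution would pad both sides and let each output see future samples.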

HOW TO USE?

To run the inference script, use this command: python run.py <model-path> <mp3-path>

Sample Command

python run.py asset/train asset/train/sample.mp3

ARGUMENTS          DETAILS      HELP OPTIONS
First Argument     MODELPATH    Path to the model
Second Argument    INPUT        Path of the input audio file
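If you want to call the script from another Python program, the command above can be wrapped as in the sketch below. The helper names `build_inference_command` and `run_inference` are hypothetical, and the sketch assumes that run.py writes its result to standard output, which the page does not confirm.

```python
import subprocess
import sys
from pathlib import Path

def build_inference_command(model_path, audio_path, script="run.py"):
    """Return the argument list for: python run.py <model-path> <mp3-path>."""
    return [sys.executable, script, str(model_path), str(audio_path)]

def run_inference(model_path, audio_path, script="run.py"):
    """Run the inference script and return whatever it printed (assumption)."""
    # Fail early with a clear message if the input audio file is missing.
    if not Path(audio_path).is_file():
        raise FileNotFoundError(f"Input audio not found: {audio_path}")
    cmd = build_inference_command(model_path, audio_path, script)
    # check=True raises CalledProcessError if the script exits with an error.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

With the sample paths from this page, the call would be `run_inference("asset/train", "asset/train/sample.mp3")`.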

WHAT ARE THE REQUIREMENTS?

To install all the requirements and dependencies, run the command for your hardware:

For GPU: pip install -r gpu_requirements.txt
For CPU: pip install -r cpu_requirements.txt

Note

Only CUDA 8 is supported for GPU inference. Make sure to install these dependencies as admin, for all users:

sudo apt-get -qq -y install libsndfile-dev
sudo apt-get -qq -y install ffmpeg
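Before running inference, it can help to confirm that the two system packages are visible. The following is a small sketch using only the Python standard library; the function name `check_system_dependencies` is hypothetical, and the check only looks for an `ffmpeg` executable on the PATH and a shared library named `sndfile`, which may not cover every installation layout.

```python
import ctypes.util
import shutil

def check_system_dependencies():
    """Report whether ffmpeg and libsndfile appear to be installed."""
    return {
        # shutil.which returns None when the executable is not on the PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # find_library returns None when no matching shared library is found.
        "libsndfile": ctypes.util.find_library("sndfile") is not None,
    }

for name, found in check_system_dependencies().items():
    print(f"{name}: {'found' if found else 'missing'}")
```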

Author

SHIVAM GARG
New Delhi, India

A philosophy student cleverly disguised as a Coax Deep Learning engineer, spending the whole day, practically every day, experimenting with TensorFlow, PyTorch, and Caffe; dabbling with Python and C++; and drinking a wide variety of coffee every day.
