DeePhi Descartes Efficient Speech Recognition Engine

Latest Version: 2018.02.2b
Linux/Unix

Product Overview

This is an end-to-end automatic speech recognition (ASR) system by DeePhi with FPGA acceleration on AWS F1. We modified a PyTorch implementation of Baidu's DeepSpeech2 (https://github.com/SeanNaren/deepspeech.pytorch) for our algorithm, software and hardware co-design solution, and used the LibriSpeech 1000h dataset (http://www.openslr.org/12/) for model training and compression. Our model consists of 2 convolution layers (with Batch Normalization and Hardtanh), 5 bi-directional LSTM layers and 1 fully connected layer, followed by a Softmax layer. We focus on accelerating the CNN and LSTM layers on the FPGA, while the remaining parts run on the CPU. For a 1-second test audio clip, the entire end-to-end ASR system achieves a latency of 20.59 ms on AWS F1 with this acceleration, about a 2.06X speedup over a cuDNN solution tested locally on a P4 GPU. Users can run the test scripts to compare CPU and FPGA performance and to recognize a single sentence.
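To put the quoted figures in context, the short Python sketch below works out two derived numbers: the real-time factor on F1 and the GPU baseline latency implied by the 2.06X speedup. Only the three constants come from the overview; the function names are illustrative, and the GPU latency is an inference from the stated ratio, not a published measurement.

```python
# Figures quoted in the product overview.
FPGA_LATENCY_MS = 20.59   # end-to-end latency on AWS F1 for the test clip
AUDIO_LENGTH_MS = 1000.0  # the test clip is 1 second long
SPEEDUP_VS_GPU = 2.06     # reported speedup over the cuDNN baseline on a P4 GPU


def real_time_factor(latency_ms: float, audio_ms: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    return latency_ms / audio_ms


def implied_baseline_latency_ms(latency_ms: float, speedup: float) -> float:
    """Baseline latency implied by a speedup ratio (baseline = accelerated * speedup).

    This is derived from the quoted ratio, not a measured value.
    """
    return latency_ms * speedup


rtf = real_time_factor(FPGA_LATENCY_MS, AUDIO_LENGTH_MS)
gpu_ms = implied_baseline_latency_ms(FPGA_LATENCY_MS, SPEEDUP_VS_GPU)

print(f"Real-time factor on F1: {rtf:.4f}")
print(f"Implied GPU baseline latency: {gpu_ms:.1f} ms")
```

Under these assumptions the system processes audio roughly 50 times faster than real time, and the implied GPU baseline is about 42.4 ms for the same 1-second clip.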

Version

2018.02.2b

By

Operating System

Linux/Unix, CentOS 7 (kernel 3.10.0-693.2.2.el7.x86_64)

Fulfillment Methods

  • Amazon Machine Image
