Update README.md

b1048999 · Aleksi Papalitsas · 73632e38 · b1048999
Commit b1048999 authored 3 years ago by Aleksi Papalitsas
--- a/README.md
+++ b/README.md
-# CommonVoice-TH Recipe
-A commonvoice-th recipe for training ASR engine using Kaldi. The following recipe follows `commonvoice` recipe with slight modification
+# CommonVoice-FI Recipe
+Forked from [commonvoice-th](https://github.com/vistec-AI/commonvoice-th).
+
+A commonvoice-fi recipe for training an ASR engine using Kaldi. This recipe follows the `commonvoice` and `commonvoice-th` recipes with slight modifications.
+

 ## Installation
 The author uses Docker to run the container. **A GPU is required** to train `tdnn_chain`; without one, the script can only train up to `tri3b`.
-### Building Docker
-```bash
-$ docker build -t <docker-name> .
-```
+
+### Downloading SRILM
+Before building the Docker image, the SRILM archive needs to be downloaded. You can download it from [here](http://www.speech.sri.com/projects/srilm/download.html). Once the file is downloaded, remove the version from its name (e.g. rename `srilm-1.7.3.tar.gz` to `srilm.tar.gz`) and place it inside the `docker` directory. Your `docker` directory should then contain 2 files: `dockerfile` and `srilm.tar.gz`.
+
 ### Run docker and attach command line
+Since a GPU is required, you will need the Kaldi GPU image (`kaldiasr/kaldi:gpu-latest`).
+
 ```bash
-$ docker run -it -v <path-to-repo>:/opt/kaldi/egs/commonvoice-th -v <path-to-labels>:/mnt/labels -v <path-to-cv-corpus>:/mnt --gpus all --name <container-name> <built-docker-name> bash
+$ docker run -it --runtime=nvidia -v <path-to-repo>:/opt/kaldi/egs/commonvoice-th -v <path-to-labels>:/mnt/labels -v <path-to-cv-corpus>:/mnt --gpus all --name <container-name> kaldiasr/kaldi:gpu-latest bash
 ```
 Once you finish this step, you should be in a bash shell inside the Docker container.

@@ -20,21 +25,12 @@ $ cd /opt/kaldi/egs/commonvoice-th
 $ ./run.sh --stage 0
 ```

-## Experiment Results
-Here are some experiment results evaluated on dev set:
-|Model|dev WER|
-|:----|:----:|
-|mono|-%|
-|tri1|-%|
-|tri2a|-%|
-|tri2b|-%|
-|tri3b|-%|
-|tdnn-chain|-%|
-
-Here is final `test` set result evaluated on `tdnn-chain`
-|Model|dev WER|test WER|
-|:----|:------|:------:|
-|tdnn-chain|-%|-%|
-
-## Author
-Chompakorn Chaksangchaichot
+Building the model takes roughly 4 hours with the voice dataset from [Mozilla Common Voice](https://commonvoice.mozilla.org/fi/datasets).
+
+Since the dataset contains only 14 hours of audio, it does not cover enough words for the resulting dictionary to be usable for actual voice recognition.
+
+## Constructing a working VOSK-model
+
+Vosk is a higher-level library that uses Kaldi internally for voice recognition. It requires a certain type of Kaldi model in order to work.
+There is a list on [Vosk's own website](https://alphacephei.com/vosk/models#training-your-own-model) of what the model folder should contain.
+Find these files among the output of the scripts and put them in the right folders to create a working model. NOTE: take the files from the nnet directories. Files from `tri3b` or from models created by earlier stages won't work.
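
Once the files have been copied, a quick check can catch files that were forgotten. The following is a minimal sketch, not part of the recipe: the function name `check_vosk_model` is hypothetical, and the file list is an assumed, partial subset of the layout described on the Vosk model page, so verify it against that page before relying on it.

```bash
#!/usr/bin/env bash
# Sketch: report which files of an ASSUMED, PARTIAL Vosk model layout are
# missing from a model directory. The list below is an assumption taken
# from the Vosk model page linked above; it is not exhaustive (for
# example, the decoding graph files are not checked).

check_vosk_model() {
  local model_dir="$1"
  local missing=0
  local rel
  for rel in \
    am/final.mdl \
    conf/mfcc.conf \
    conf/model.conf \
    graph/phones/word_boundary.int \
    ivector/final.ie; do
    if [ ! -f "$model_dir/$rel" ]; then
      # Print each missing file relative to the model directory.
      echo "missing: $rel"
      missing=1
    fi
  done
  # Exit status 0 when every listed file exists, 1 otherwise.
  return "$missing"
}
```

For example, `check_vosk_model ./model` (where `./model` is a placeholder for your assembled model folder) prints one `missing: ...` line per absent file and returns a non-zero status if any are missing.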