Building against latest Kaldi #11

bjascob · 2017-03-08T02:45:36Z

There are a few issues with building this code against the latest Kaldi.

In OnlineDecoder.h we need a std:: in front of vector in a few places (or a using namespace std; at the top).
The constructor for SingleUtteranceNnet3Decoder in kaldi/src/online2/online-nnet3-decoding.h has changed. It now takes a LatticeFasterDecoderConfig instead of an OnlineNnet3DecodingConfig, and a nnet3::DecodableNnetSimpleLoopedInfo instead of a nnet3::AmNnetSimple.
I was able to do a hack fix by finding an older version of online-nnet3-decoding.cc/.h and online-nnet3-decodable-simple.cc/.h and changing the include/Makefile to use the local versions instead.
I also had to comment out computer.Forward() in online-nnet3-decodable-simple.cc because the new Kaldi lib doesn't seem to have this method (and I'm not sure yet what the impact will be).
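
For the first item, here is a minimal sketch of what the std:: qualification looks like in a header. The struct and function names are hypothetical stand-ins, since the actual declarations in OnlineDecoder.h are not shown in this thread.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a type declared in OnlineDecoder.h; the real
// member names and element types are not shown in this thread.
struct DecodedWord {
    int word_id;
    float start_sec;
};

// Before (only compiles if a `using namespace std;` leaked in from some
// earlier header):
//     size_t CountWords(const vector<DecodedWord> &words);
//
// After: spell out the namespace so the header no longer depends on what
// other headers happen to pull into scope.
inline std::size_t CountWords(const std::vector<DecodedWord> &words) {
    return words.size();
}
```

Qualifying each use is generally the safer of the two options, because a using namespace std; at the top of a header affects every file that includes it.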

It looks like a real fix shouldn't be too hard but I'm not very familiar with Kaldi so I'd need to do a bunch of digging before I could switch the classes around to use the new lib correctly.

These changes may be enough to get things to work for me. At this point it does compile and link but I haven't gotten it fully running yet.

qharlie · 2017-03-08T15:09:24Z

Are you able to get results out of ASR-Server? I can only ever get 'yes' or 'no'.

bjascob · 2017-03-08T15:35:00Z

Yes, this is doing ASR now and I'm getting back full sentences.

I had to make one more change than listed above. In online-nnet3-decodable-simple.cc, instead of commenting out computer.Forward() I replaced the line with computer.Run(). From the documentation it looks like Run() does a forward then backward computation, not just forward, but this does work for me.

I was able to get it to decode a test wave and got the same results as using the model offline, directly in Kaldi. (Note that I ripped out the fcgi stuff and just opened an ifstream directly for testing. This is still "online" decoding, only I'm simulating a mic input by directly piping in a test.wav.)

qharlie · 2017-03-08T16:09:57Z

Awesome news! Do you have a branch you are working off? I'd love to get my copy working.

bjascob · 2017-03-08T17:04:29Z

Attached is the severely hacked test code.
Untar it and read README_HACKER_CODE_INFO.txt for how to compile and run it.
APIAI_Server.tar.gz

If you want to use the fcgi input methods as intended, you can just uncomment them where I've commented them out. This code is simply to try to get the online ASR portion working.

qharlie · 2017-03-08T17:07:03Z

@bjascob thank you so much for this, I'm reading over it now. After I get it working, I'll try and create a new fork so we can keep it working with the latest Kaldi

bjascob · 2017-03-08T17:25:02Z

This isn't really the "right" fix since it pulls in some older Kaldi files in order to make it work. The right way is to change the code to use LatticeFasterDecoderConfig and DecodableNnetSimpleLoopedInfo when instantiating SingleUtteranceNnet3Decoder.
I might look into this at some point but since it works for now, I'll probably get a little further along in my project before I go back and try this. If you end up making these changes, please post.

realill · 2017-03-08T18:21:05Z

bjascob, fork the project, fix it, and make a pull request. Our team cannot maintain this project right now, so help is appreciated.

bjascob · 2017-03-09T02:22:28Z

Done. I found some example code of how to use the new API in Kaldi for SingleUtteranceNnet3Decoder so the changes in my fork should be the "right" way to do this.

bjascob · 2017-03-10T13:51:31Z

With the latest merge this issue should be resolved.
(BTW... the new code will not work with Kaldi from before 2/9/2017)

mikenewman1 · 2018-01-05T18:26:38Z

Hi bjascob,
A number of people (including myself) have recently reported problems running the server (see issues #30, #31, #32). Personally I only see problems when I try to use a model built from the latest Kaldi, but I don't think that is the case for the others.
I am trying to debug this, but I am not very familiar with Kaldi internals. I am very suspicious that something subtle has changed recently in the API, maybe in some model format. I do know that in src/Nnet3LatgenFasterDecoder.cc there is an additional change needed just to get new models to load:

-    decode_fst_ = fst::ReadFstKaldi(fst_rxfilename_);
+    decode_fst_ = fst::ReadFstKaldiGeneric(fst_rxfilename_);

I would love to hear if you (or anyone else) have any suggestions.

bjascob · 2018-01-05T23:01:12Z

I downloaded the latest version of Kaldi today from GitHub, compiled it, and ran a quick test. Everything worked fine for me, so it looks like this code works with the latest Kaldi and the model it was written to run (see https://github.com/dialogflow/api-ai-english-asr-model).

As far as integrating new models, I'm not familiar enough with Kaldi to make any suggestions.

bjascob closed this as completed Mar 10, 2017

bjascob mentioned this issue Jan 5, 2018

simple wav decoding don't work #30

Open
