Building against latest Kaldi #11

Closed
bjascob opened this issue Mar 8, 2017 · 11 comments

Comments

@bjascob
Contributor

bjascob commented Mar 8, 2017

There are a few issues with building this code against the latest Kaldi.

  • In OnlineDecoder.h, vector needs a std:: prefix in a few places (or a using namespace std; at the top).
  • The constructor for SingleUtteranceNnet3Decoder in kaldi/src/online2/online-nnet3-decoding.h
    has changed. It now takes a LatticeFasterDecoderConfig instead of an OnlineNnet3DecodingConfig
    and a nnet3::DecodableNnetSimpleLoopedInfo instead of a nnet3::AmNnetSimple.
    As a hack fix, I found an older version of online-nnet3-decoding.cc/.h and online-nnet3-decodable-simple.cc/.h and changed the include/Makefile to use the local copies instead.
    I also had to comment out computer.Forward() in online-nnet3-decodable-simple.cc because the new Kaldi lib doesn't seem to have this method (and I'm not sure yet what the impact will be).
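
The first bullet can be illustrated without any Kaldi headers. This is a minimal sketch, assuming the breakage comes from a header that used to rely on some other include providing `using namespace std;` (the thread does not say why it broke). `ResultCollector` is a hypothetical name for illustration, not a class from this project.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a declaration in a header such as OnlineDecoder.h.
// Writing `vector<std::string>` here compiles only if an earlier include
// happened to bring in `using namespace std;`. Qualifying the name removes
// that hidden dependency.
class ResultCollector {
 public:
  // Stores one recognition hypothesis.
  void Add(const std::string &hypothesis) { results_.push_back(hypothesis); }

  // Read-only access to everything stored so far.
  const std::vector<std::string> &results() const { return results_; }

  // Returns the first stored hypothesis.
  const std::string &Best() const {
    assert(!results_.empty());  // caller must add at least one hypothesis
    return results_.front();
  }

 private:
  std::vector<std::string> results_;
};
```

Qualifying the names in the header is usually the safer of the two options, since a `using namespace std;` at the top of a header leaks into every file that includes it.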

It looks like a real fix shouldn't be too hard but I'm not very familiar with Kaldi so I'd need to do a bunch of digging before I could switch the classes around to use the new lib correctly.

These changes may be enough to get things to work for me. At this point it does compile and link but I haven't gotten it fully running yet.

@qharlie

qharlie commented Mar 8, 2017

Are you able to get results out of ASR-Server? I can only ever get 'yes' or 'no'.

@bjascob
Contributor Author

bjascob commented Mar 8, 2017

Yes, this is doing ASR now and I'm getting back full sentences.

I had to make one more change beyond those listed above. In online-nnet3-decodable-simple.cc, instead of commenting out computer.Forward(), I replaced the line with computer.Run(). From the documentation it looks like Run() does a forward then backward computation, not just forward, but this does work for me.

I was able to get it to decode a test wave file and got the same results as using the model offline, directly in Kaldi. (Note that I ripped out the fcgi stuff and just opened an ifstream directly for testing. This is still "online" decoding; I'm only simulating mic input by piping in a test.wav directly.)
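
For readers who want to try the same kind of test, the idea of simulating a microphone from a stream can be sketched roughly as follows. This is not the code from the attached tarball. `FeedInChunks` and its `consume` callback are hypothetical names, and WAV header parsing and sample conversion are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <vector>

// Reads `in` in fixed-size chunks and hands each chunk to `consume`, the way
// a live audio source would deliver data piece by piece.
// Returns the total number of bytes delivered.
// Assumption: `consume` stands in for whatever feeds the decoder.
std::size_t FeedInChunks(
    std::istream &in, std::size_t chunk_bytes,
    const std::function<void(const std::vector<char> &)> &consume) {
  assert(chunk_bytes > 0);
  std::vector<char> buffer(chunk_bytes);
  std::size_t total = 0;
  while (in) {
    in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    const std::streamsize got = in.gcount();
    if (got <= 0) break;  // nothing left to read
    // The final chunk may be shorter than chunk_bytes.
    consume(std::vector<char>(buffer.begin(), buffer.begin() + got));
    total += static_cast<std::size_t>(got);
  }
  return total;
}
```

Because the function takes any `std::istream`, it can be exercised with an in-memory `std::istringstream` first and then pointed at `std::ifstream wav("test.wav", std::ios::binary);` for a real file.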

@qharlie

qharlie commented Mar 8, 2017

Awesome news! Do you have a branch you are working off? I'd love to get my copy working.

@bjascob
Contributor Author

bjascob commented Mar 8, 2017

Attached is the severely hacked test code.
Untar it and read README_HACKER_CODE_INFO.txt for how to compile and run it.
APIAI_Server.tar.gz

If you want to use the fcgi input methods as intended, you can just uncomment them where I've commented them out. This code is simply to try to get the online ASR portion working.

@qharlie

qharlie commented Mar 8, 2017

@bjascob thank you so much for this, I'm reading over it now. After I get it working, I'll try to create a new fork so we can keep it working with the latest Kaldi.

@bjascob
Contributor Author

bjascob commented Mar 8, 2017

This isn't really the "right" fix since it pulls in some older Kaldi files in order to make it work. The right way is to change the code to use LatticeFasterDecoderConfig and DecodableNnetSimpleLoopedInfo when instantiating SingleUtteranceNnet3Decoder.
I might look into this at some point but since it works for now, I'll probably get a little further along in my project before I go back and try this. If you end up making these changes, please post.

@realill
Contributor

realill commented Mar 8, 2017

bjascob, fork the project, fix it, and make a pull request. Our team cannot maintain this project right now, so help is appreciated.

@bjascob
Contributor Author

bjascob commented Mar 9, 2017

Done. I found some example code showing how to use the new Kaldi API for SingleUtteranceNnet3Decoder, so the changes in my fork should be the "right" way to do this.

@bjascob
Contributor Author

bjascob commented Mar 10, 2017

With the latest merge this issue should be resolved.
(BTW... the new code will not work with Kaldi from before 2/9/2017)

@bjascob bjascob closed this as completed Mar 10, 2017
@mikenewman1

Hi bjascob,
A number of people (including myself) have recently reported problems running the server (see issues #30, #31, #32). Personally I only see problems when I try to use a model built from the latest Kaldi, but I don't think that is the case for the others.
I am trying to debug this, but I am not very familiar with Kaldi internals. I strongly suspect that something subtle has changed recently in the API, maybe in some model format. I do know that in src/Nnet3LatgenFasterDecoder.cc there is an additional change needed just to get new models to load:

-    decode_fst_ = fst::ReadFstKaldi(fst_rxfilename_);
+    decode_fst_ = fst::ReadFstKaldiGeneric(fst_rxfilename_);

I would love to hear if you (or anyone else) have any suggestions.

@bjascob
Contributor Author

bjascob commented Jan 5, 2018

I downloaded the latest version of Kaldi today from GitHub, compiled it, and ran a quick test. Everything worked fine for me, so it looks like this code works with the latest Kaldi and the model it was written to run (see https://github.com/dialogflow/api-ai-english-asr-model).

As far as integrating new models, I'm not familiar enough with Kaldi to make any suggestions.
