Commit 46e7e96

Author: Oleg Mazko
Committed: ecnn47
0 parents, commit 46e7e96

26 files changed, +7129 -0 lines changed

.github/README.md

Lines changed: 213 additions & 0 deletions
@@ -0,0 +1,213 @@
# TensorFlow Lite Keyword Spotting
Native C/C++. Suitable for embedded devices.

    ~$ git clone --recursive --depth 1 https://github.com/42io/tflite_kws.git

### Inference
The default models are pre-trained on the words 0-9: zero one two three four five six seven eight nine.

    ~$ arecord -f S16_LE -c1 -r16000 -d1 test.wav
    ~$ aplay test.wav
    ~$ dataset/dataset/google_speech_commands/src/features/build.sh
    ~$ src/brain/build.sh
    ~$ alias fe=dataset/dataset/google_speech_commands/bin/fe
    ~$ fe test.wav | bin/guess models/mlp.tflite
    ~$ fe test.wav | bin/guess models/cnn.tflite
    ~$ fe test.wav | bin/guess models/rnn.tflite
    ~$ fe test.wav | bin/guess models/dcnn.tflite
    ~$ fe test.wav | head -48 | tail -47 | bin/guess models/dcnn47.tflite

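The `head -48 | tail -47` step in the last command keeps lines 2 through 48 of whatever it is fed, which is 47 lines in total. A minimal sketch of that selection, with `seq` standing in for the feature extractor; it assumes that `fe` prints one feature frame per line, and the frame count of 49 is a hypothetical value chosen only so that there is something to trim:

```shell
# Stand-in for feature output: 49 numbered lines (hypothetical frame count).
frames() { seq 49; }

# Same trimming as in the dcnn47 command: drop line 1 and everything after line 48.
selected=$(frames | head -48 | tail -47)

echo "$selected" | wc -l      # how many lines survive
echo "$selected" | head -1    # first surviving line
echo "$selected" | tail -1    # last surviving line
```
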
### Real Time
Microphone quality is very important. Think about how to keep fan noise out of the mic; using a headset seems like a good idea :)

The helpers below post-process the model output: `argmax` prints the zero-based index of the largest value on each line, `stable N` reports a class once it has repeated N times in a row, and `ignore T` drops class indices of T and above.

    ~$ argmax() { mawk -Winteractive '{m=$1;j=1;for(i=j;i<=NF;i++)if($i>m){m=$i;j=i;}print j-1}'; }
    ~$ stable() { mawk -Winteractive -v u=$1 '{if(x!=$1){c=0;x=$1}else if(++c==u&&y!=x)print y=x}'; }
    ~$ ignore() { mawk -Winteractive -v t=$1 '{if($1<t)print $1}'; }

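To see what the three filters do without a microphone, they can be fed a few hand-written lines. This is only a sketch: plain `awk` replaces `mawk -Winteractive` (an assumption made so it runs where mawk is not installed), and all input values are made up.

```shell
# Same awk programs as in the README, run with plain awk for portability.
argmax() { awk '{m=$1;j=1;for(i=j;i<=NF;i++)if($i>m){m=$i;j=i;}print j-1}'; }
stable() { awk -v u="$1" '{if(x!=$1){c=0;x=$1}else if(++c==u&&y!=x)print y=x}'; }
ignore() { awk -v t="$1" '{if($1<t)print $1}'; }

# One made-up score row: the largest value is in column 2, so index 1 is printed.
printf '0.1 0.7 0.2\n' | argmax

# Made-up class stream: 3 and 7 repeat long enough to be reported, the lone 5 does not,
# and the second run of 3 is not reported again because 3 was the last reported class.
printf '%s\n' 3 3 3 5 3 3 3 7 7 7 | stable 2

# With a threshold of 10, indices 10 and 11 are dropped and the rest pass through.
printf '%s\n' 3 11 7 10 | ignore 10
```

In the real pipelines the threshold of 10 would filter out the two non-digit classes, assuming class indices follow the row order of the confusion matrices (digits first, then `#unk#` and `#pub#`).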
Simple non-streaming mode. The model receives the whole input sequence and then returns the classification result:

    ~$ arecord -f S16_LE -c1 -r16000 -t raw | fe | \
        bin/ring 47 | bin/guess models/dcnn47.tflite | argmax | stable 10 | ignore 10

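The README does not describe `bin/ring`, but feeding a fixed-length model from a live stream generally needs some kind of sliding window over the most recent frames. The sketch below is a hypothetical stand-in written in awk, not the actual `bin/ring` implementation; the name `window` and its behavior are assumptions made for illustration.

```shell
# Hypothetical stand-in, not the real bin/ring: once n lines have arrived,
# re-emit the latest n lines every time a new line comes in.
window() {
  awk -v n="$1" '{ buf[NR % n] = $0 }
    NR >= n { for (i = NR - n + 1; i <= NR; i++) print buf[i % n] }'
}

# Five numbered "frames" through a window of three:
# the output is 1 2 3, then 2 3 4, then 3 4 5, one value per line.
seq 5 | window 3
```

Under this reading, every new frame makes the model reprocess a full window, which is the repeated work that the streaming mode avoids.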
[Streaming](https://arxiv.org/abs/2005.06720) mode is more CPU friendly because it reduces the number of MAC (multiply-accumulate) operations in the neural network. The model receives a portion of the input sequence and classifies it incrementally:

    ~$ arecord -f S16_LE -c1 -r16000 -t raw | fe | \
        bin/guess models/dcnn13.tflite | argmax | stable 10 | ignore 10

### Training
Jupyter Notebooks [MLP](jupyter/mlp.ipynb) | [CNN](jupyter/cnn.ipynb) | [RNN](jupyter/rnn.ipynb) | [DCNN](jupyter/dcnn.ipynb) | [DCNN47](jupyter/dcnn47.ipynb) | [DCNN13](jupyter/dcnn13.ipynb) | [EDCNN47](jupyter/edcnn47.ipynb) | [ECNN47](jupyter/ecnn47.ipynb).

Each notebook generates a model file. To evaluate model accuracy:

    ~$ apt install gcc lrzip wget
    ~$ wget https://github.com/42io/dataset/releases/download/v1.0/0-9up.lrz -O /tmp/0-9up.lrz
    ~$ lrunzip /tmp/0-9up.lrz -o /tmp/0-9up.data # md5 87fc2460c7b6cd3dcca6807e9de78833
    ~$ dataset/matrix.sh /tmp/0-9up.data

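The `# md5` comment gives a checksum that can be compared before running the evaluation. A minimal sketch of such a check follows; the helper name `check_md5` is made up here, `md5sum` is assumed to be the GNU coreutils tool, and the demonstration uses an empty temporary file so that it runs without the download.

```shell
# check_md5 FILE EXPECTED: succeed only if FILE hashes to EXPECTED.
# Hypothetical helper, not part of the repository.
check_md5() {
  actual=$(md5sum "$1" | cut -d' ' -f1)
  [ "$actual" = "$2" ]
}

# Demonstration on an empty file, whose MD5 is d41d8cd98f00b204e9800998ecf8427e.
tmp=$(mktemp)
check_md5 "$tmp" d41d8cd98f00b204e9800998ecf8427e && echo "md5 ok"
check_md5 "$tmp" 87fc2460c7b6cd3dcca6807e9de78833 || echo "md5 mismatch"
rm -f "$tmp"
```

With the dataset in place this would be called as `check_md5 /tmp/0-9up.data 87fc2460c7b6cd3dcca6807e9de78833`, assuming the checksum in the comment refers to the unpacked file rather than the `.lrz` archive.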
Confusion matrices for the pre-trained models:

    MLP confusion matrix...
    zero  0.93 0.00 0.03 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.03 0.00 | 603
    one   0.00 0.85 0.00 0.00 0.01 0.01 0.00 0.00 0.00 0.05 0.06 0.01 | 575
    two   0.03 0.00 0.86 0.02 0.02 0.00 0.00 0.01 0.01 0.00 0.04 0.01 | 564
    three 0.00 0.00 0.01 0.90 0.00 0.01 0.01 0.01 0.04 0.01 0.01 0.01 | 548
    four  0.00 0.01 0.01 0.00 0.90 0.01 0.00 0.00 0.00 0.00 0.05 0.01 | 605
    five  0.00 0.01 0.00 0.01 0.01 0.80 0.01 0.03 0.01 0.03 0.09 0.01 | 607
    six   0.00 0.00 0.00 0.00 0.00 0.00 0.96 0.00 0.00 0.00 0.02 0.01 | 462
    seven 0.01 0.00 0.03 0.01 0.00 0.00 0.01 0.90 0.00 0.00 0.03 0.01 | 574
    eight 0.00 0.00 0.01 0.07 0.00 0.00 0.03 0.00 0.84 0.01 0.03 0.01 | 547
    nine  0.00 0.04 0.00 0.01 0.00 0.01 0.00 0.01 0.00 0.86 0.06 0.01 | 596
    #unk# 0.02 0.03 0.03 0.05 0.06 0.07 0.02 0.03 0.02 0.07 0.58 0.02 | 730
    #pub# 0.00 0.00 0.01 0.00 0.00 0.01 0.01 0.00 0.00 0.00 0.00 0.96 | 730
    MLP guessed wrong 1029...

    CNN confusion matrix...
    zero  0.97 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.02 0.00 | 603
    one   0.00 0.93 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.01 0.05 0.00 | 575
    two   0.01 0.00 0.95 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.03 0.00 | 564
    three 0.00 0.00 0.00 0.91 0.00 0.00 0.01 0.01 0.01 0.00 0.06 0.00 | 548
    four  0.00 0.00 0.00 0.00 0.90 0.00 0.00 0.00 0.00 0.00 0.09 0.00 | 605
    five  0.00 0.00 0.00 0.00 0.00 0.93 0.00 0.00 0.01 0.01 0.06 0.00 | 607
    six   0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.00 0.00 0.00 0.01 0.00 | 462
    seven 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.97 0.00 0.00 0.02 0.00 | 574
    eight 0.00 0.00 0.01 0.01 0.00 0.01 0.01 0.00 0.93 0.00 0.03 0.00 | 547
    nine  0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.93 0.06 0.00 | 596
    #unk# 0.01 0.01 0.00 0.02 0.02 0.00 0.00 0.00 0.00 0.01 0.92 0.01 | 730
    #pub# 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.98 | 730
    CNN guessed wrong 427...

    RNN confusion matrix...
    zero  0.98 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 | 603
    one   0.00 0.95 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.01 0.02 0.00 | 575
    two   0.00 0.00 0.98 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 | 564
    three 0.00 0.00 0.00 0.97 0.00 0.00 0.01 0.00 0.01 0.00 0.01 0.00 | 548
    four  0.00 0.00 0.00 0.00 0.97 0.00 0.00 0.00 0.00 0.00 0.02 0.00 | 605
    five  0.00 0.00 0.00 0.00 0.01 0.98 0.00 0.00 0.00 0.00 0.01 0.00 | 607
    six   0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.00 0.00 0.00 0.00 0.00 | 462
    seven 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.98 0.00 0.00 0.01 0.00 | 574
    eight 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.97 0.00 0.01 0.00 | 547
    nine  0.00 0.01 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.97 0.02 0.00 | 596
    #unk# 0.00 0.01 0.00 0.01 0.02 0.02 0.00 0.00 0.01 0.02 0.91 0.00 | 730
    #pub# 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.99 | 730
    RNN guessed wrong 220...

    DCNN confusion matrix...
    zero  0.98 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 | 603
    one   0.00 0.98 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.01 0.00 | 575
    two   0.01 0.00 0.98 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 | 564
    three 0.00 0.00 0.00 0.97 0.00 0.00 0.01 0.00 0.01 0.00 0.00 0.00 | 548
    four  0.00 0.00 0.00 0.00 0.98 0.00 0.00 0.00 0.00 0.00 0.01 0.00 | 605
    five  0.00 0.00 0.00 0.00 0.00 0.98 0.00 0.00 0.00 0.00 0.01 0.00 | 607
    six   0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.00 0.00 0.00 0.00 0.00 | 462
    seven 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 | 574
    eight 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.00 0.01 0.00 | 547
    nine  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.98 0.01 0.00 | 596
    #unk# 0.00 0.01 0.01 0.01 0.01 0.00 0.00 0.00 0.00 0.00 0.94 0.00 | 730
    #pub# 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 | 730
    DCNN guessed wrong 143...

    DCNN47 confusion matrix...
    zero  0.99 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 | 603
    one   0.00 0.98 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.01 0.00 | 575
    two   0.00 0.00 0.99 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 | 564
    three 0.00 0.00 0.01 0.97 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 | 548
    four  0.00 0.00 0.00 0.00 0.99 0.00 0.00 0.00 0.00 0.00 0.01 0.00 | 605
    five  0.00 0.00 0.00 0.00 0.00 0.99 0.00 0.00 0.00 0.00 0.00 0.00 | 607
    six   0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 | 462
    seven 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 | 574
    eight 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.00 0.00 0.00 | 547
    nine  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.01 0.00 | 596
    #unk# 0.00 0.01 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.97 0.00 | 730
    #pub# 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 | 730
    DCNN47 guessed wrong 88...

    DCNN13 confusion matrix...
    zero  1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 | 603
    one   0.00 0.98 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.01 0.00 | 575
    two   0.00 0.00 0.99 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 | 564
    three 0.00 0.00 0.00 0.98 0.00 0.00 0.01 0.00 0.00 0.00 0.01 0.00 | 548
    four  0.00 0.00 0.00 0.00 0.99 0.00 0.00 0.00 0.00 0.00 0.01 0.00 | 605
    five  0.00 0.00 0.00 0.00 0.00 0.99 0.00 0.00 0.00 0.00 0.01 0.00 | 607
    six   0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 | 462
    seven 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.00 0.00 0.00 0.00 | 574
    eight 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.00 0.01 0.00 | 547
    nine  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.01 0.00 | 596
    #unk# 0.00 0.01 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.97 0.00 | 730
    #pub# 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 | 730
    DCNN13 guessed wrong 82...

    EDCNN47 confusion matrix...
    zero  0.98 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 | 603
    one   0.00 0.98 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.02 0.00 | 575
    two   0.00 0.00 0.98 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.02 0.00 | 564
    three 0.00 0.00 0.00 0.97 0.00 0.00 0.01 0.00 0.00 0.00 0.03 0.00 | 548
    four  0.00 0.00 0.00 0.00 0.97 0.00 0.00 0.00 0.00 0.00 0.03 0.00 | 605
    five  0.00 0.00 0.00 0.00 0.00 0.98 0.00 0.00 0.00 0.00 0.01 0.00 | 607
    six   0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.00 0.00 0.00 0.01 0.00 | 462
    seven 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.00 0.00 0.01 0.00 | 574
    eight 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 | 547
    nine  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.01 0.00 | 596
    #unk# 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.98 0.00 | 730
    #pub# 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 | 730
    EDCNN47 guessed wrong 116...

    ECNN47 confusion matrix...
    zero  1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 | 603
    one   0.00 0.98 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.01 0.00 | 575
    two   0.00 0.00 0.99 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 | 564
    three 0.00 0.00 0.00 0.98 0.00 0.00 0.01 0.00 0.00 0.00 0.01 0.00 | 548
    four  0.00 0.00 0.00 0.00 0.99 0.00 0.00 0.00 0.00 0.00 0.01 0.00 | 605
    five  0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 | 607
    six   0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 | 462
    seven 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 | 574
    eight 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 | 547
    nine  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.99 0.00 0.00 | 596
    #unk# 0.00 0.01 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.98 0.00 | 730
    #pub# 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 | 730
    ECNN47 guessed wrong 63...

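The "guessed wrong" counts can be turned into an overall accuracy figure for easier comparison between models. The sketch below assumes that the number after `|` in each row is the count of test samples for that class, which is an interpretation of the output rather than something the README states; the helper name `accuracy` is made up for this example.

```shell
# Per-class sample counts, copied from the last column of the matrices.
total=$(printf '%s\n' 603 575 564 548 605 607 462 574 547 596 730 730 |
  awk '{ s += $1 } END { print s }')

# accuracy WRONG TOTAL: share of correct guesses as a percentage, two decimals.
accuracy() { awk -v w="$1" -v t="$2" 'BEGIN { printf "%.2f\n", 100 * (t - w) / t }'; }

echo "samples: $total"
echo "MLP:     $(accuracy 1029 "$total")"
echo "ECNN47:  $(accuracy 63 "$total")"
```

Under that assumption the test set has 7141 samples, so the gap between 1029 and 63 wrong guesses corresponds to roughly 85.6% versus 99.1% accuracy.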
Evaluate false positives:

    ~$ wget https://data.deepai.org/timit.zip -O /tmp/timit.zip
    ~$ unzip -q /tmp/timit.zip -d /tmp/timit # md5 5b736303c55cf4970926bb9978b655fe
    ~$ dataset/false.sh /tmp/timit 100

A false positive is a result that indicates a given condition exists when it does not.

    EDCNN47 2042 | 11191
    ECNN47  4494 | 11191
    DCNN13  4787 | 11191
    DCNN47  4517 | 11191
    MLP     5091 | 10991
    CNN     4958 | 10991
    RNN     4527 | 10991

### Heap Memory Usage
Some magic numbers to know before stepping into the embedded world.

    ~$ valgrind dataset/dataset/google_speech_commands/bin/fe test.wav # 606,416 bytes allocated
    ~$ fe test.wav | valgrind bin/guess models/mlp.tflite # 347,138 bytes allocated
    ~$ fe test.wav | valgrind bin/guess models/cnn.tflite # 1,793,114 bytes allocated
    ~$ fe test.wav | valgrind bin/guess models/rnn.tflite # 2,442,810 bytes allocated
    ~$ seq 637 | valgrind bin/guess models/dcnn.tflite # 595,958 bytes allocated
    ~$ seq 611 | valgrind bin/guess models/dcnn47.tflite # 968,482 bytes allocated
    ~$ seq 13 | valgrind bin/guess models/dcnn13.tflite # 671,398 bytes allocated
    ~$ seq 611 | valgrind bin/guess models/edcnn47.tflite # 1,661,132 bytes allocated
    ~$ seq 611 | valgrind bin/guess models/ecnn47.tflite # 8,625,814 bytes allocated

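The `seq N` commands feed dummy numbers to `bin/guess` in place of real features. The values of N look like frame count times values per frame. This is an inference from the numbers, not something the README states: 13 values per frame would give 13 for a single frame, 611 for 47 frames and 637 for 49 frames. A quick arithmetic check under that assumption (the helper name `inputs` is made up):

```shell
# Assumed, not stated in the README: 13 values per feature frame.
coeffs=13

# inputs FRAMES: number of input values for a model that takes FRAMES frames.
inputs() { echo $(( $1 * coeffs )); }

# Frame counts 49, 47 and 1 are inferred from the seq arguments.
for frames in 49 47 1; do
  echo "$frames frames x $coeffs values = $(inputs "$frames") inputs"
done
```
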
### Play
Let's consider voice control for an LED bulb.

    ~$ bigram() { mawk -Winteractive '{if(s)print prev,$0; prev=$0; s=1}'; }
    ~$ intent() { mawk -Winteractive '
        /0 6/{system("./on.sh")}
        /0 7/{system("./off.sh")}
        /0 8/{system("./yellow.sh")}
        /0 9/{system("./white.sh")}
       '; }

There are 4 commands here: turn on, turn off, and two color changes. When we speak the words `zero six`, the script `./on.sh` is executed, and so on.

    ~$ arecord -f S16_LE -c1 -r16000 -t raw | fe | \
        bin/guess models/dcnn13.tflite | argmax | stable 10 | ignore 10 | bigram | intent
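
The word-pair matching can be tried without a microphone or a bulb. In this sketch plain `awk` replaces `mawk -Winteractive`, and `intent_dry` is a hypothetical dry-run variant that prints the script name instead of calling `system()`; the input digits are made up.

```shell
# Same pairing logic as bigram above: print each line together with the previous one.
bigram() { awk '{if(s)print prev,$0; prev=$0; s=1}'; }

# Dry-run variant of intent (hypothetical): print the script instead of running it.
intent_dry() { awk '
  /0 6/{print "./on.sh"}
  /0 7/{print "./off.sh"}
  /0 8/{print "./yellow.sh"}
  /0 9/{print "./white.sh"}
'; }

# Made-up recognizer output: "zero six" followed by "zero seven".
# The pairs are "0 6", "6 0" and "0 7"; the middle one matches no command.
printf '%s\n' 0 6 0 7 | bigram | intent_dry
```
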
