sevagh
diff --git a/LICENSE b/LICENSE
Lines changed: 21 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 86 additions & 0 deletions
diff --git a/audio/drum.wav b/audio/drum.wav
316 KB
diff --git a/audio/harm_driedger_nonrealtime.wav b/audio/harm_driedger_nonrealtime.wav
315 KB
diff --git a/audio/harm_fitzgerald_nonrealtime.wav b/audio/harm_fitzgerald_nonrealtime.wav
315 KB
diff --git a/audio/harm_rt.wav b/audio/harm_rt.wav
316 KB
diff --git a/audio/mestis_el_mestizo_shorter.wav b/audio/mestis_el_mestizo_shorter.wav
938 KB
diff --git a/audio/mestis_harm_offline_shorter.wav b/audio/mestis_harm_offline_shorter.wav
938 KB
diff --git a/audio/mestis_harm_rt_shorter.wav b/audio/mestis_harm_rt_shorter.wav
938 KB
diff --git a/audio/mestis_perc_offline_shorter.wav b/audio/mestis_perc_offline_shorter.wav
938 KB
diff --git a/audio/mestis_perc_rt_shorter.wav b/audio/mestis_perc_rt_shorter.wav
938 KB
diff --git a/audio/mixed.wav b/audio/mixed.wav
316 KB
diff --git a/audio/perc_driedger_nonrealtime.wav b/audio/perc_driedger_nonrealtime.wav
315 KB
diff --git a/audio/perc_fitzgerald_nonrealtime.wav b/audio/perc_fitzgerald_nonrealtime.wav
315 KB
diff --git a/audio/perc_rt.wav b/audio/perc_rt.wav
316 KB
diff --git a/audio/viola.wav b/audio/viola.wav
631 KB
diff --git a/chunked_wav_example.py b/chunked_wav_example.py
Lines changed: 53 additions & 0 deletions
diff --git a/images/drum_waveform.png b/images/drum_waveform.png
71.2 KB
diff --git a/images/drumspecgram.png b/images/drumspecgram.png
575 KB
diff --git a/images/drumspecgram_orig.png b/images/drumspecgram_orig.png
531 KB
diff --git a/images/hann_taper_window.png b/images/hann_taper_window.png
61.7 KB
diff --git a/images/harm_3way.png b/images/harm_3way.png
320 KB
diff --git a/images/harm_binary.png b/images/harm_binary.png
404 KB
diff --git a/images/harm_driedger_cmp.png b/images/harm_driedger_cmp.png
209 KB
diff --git a/images/harm_driedger_waveform.png b/images/harm_driedger_waveform.png
68.5 KB
diff --git a/images/harm_fitzgerald_cmp.png b/images/harm_fitzgerald_cmp.png
198 KB
diff --git a/images/harm_fitzgerald_waveform.png b/images/harm_fitzgerald_waveform.png
73.2 KB
diff --git a/images/harm_realtime.png b/images/harm_realtime.png
600 KB
diff --git a/images/harm_realtime_cmp.png b/images/harm_realtime_cmp.png
197 KB
diff --git a/images/harm_soft.png b/images/harm_soft.png
575 KB
diff --git a/images/harm_specgram_matplotlib.png b/images/harm_specgram_matplotlib.png
356 KB
diff --git a/images/hpss_causality.png b/images/hpss_causality.png
640 KB
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2020 Sevag Hanssian
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
@@ -0,0 +1,86 @@
+# Real-Time-HPSS
+
+This repository contains a real-time implementation of the median-filtering HPSS algorithm [[1]](http://dafx10.iem.at/papers/DerryFitzGerald_DAFx10_P15.pdf), [[2]](https://www.audiolabs-erlangen.de/content/05-fau/assistant/00-driedger/01-publications/2014_DriedgerMuellerDisch_ExtensionsHPSeparation_ISMIR.pdf).
+
+The original (offline) algorithm computes the STFT of the whole audio signal, derives harmonic and percussive masks from its magnitude spectrogram, and applies those masks to the STFT. Taking the ISTFT of each masked STFT yields the separated harmonic and percussive audio signals. By merging the STFT and ISTFT loops into a single sliding STFT, the separation can be done in real time:
+
+<img src="./images/rt_hpss_diagram.png" width=700>
+
+### MATLAB implementation
+
+There's a [demo script](./matlab/HPSSMicrophone.m) that performs live, real-time HPSS on a microphone input stream (watch out for feedback: you should probably have your output and input devices in different rooms). PDFs of the report and presentation (built from the LaTeX sources in this repo) are also available on the [releases page](https://github.com/sevagh/Real-Time-HPSS/releases). The spectrograms below were created with [HPSSRtWav.m](./matlab/HPSSRtWav.m), which processes WAV files in chunks to demonstrate the validity of real-time HPSS without the complications of microphone recordings.
+
+Mixed spectrogram:
+
+<img src="./images/mixedspecgram.png" width=512>
+
+Real-time harmonic separation:
+
+<img src="./images/harm_realtime.png" width=512>
+
+Real-time percussive separation:
+
+<img src="./images/perc_realtime.png" width=512>
+
+### Python demo
+
+The file [chunked_wav_example.py](./chunked_wav_example.py) uses the `hpss_rt` package in the [python subdirectory](./python):
+
+```python
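+import matplotlib.pyplot as plt
+import numpy
+import scipy.io.wavfile
+from hpss_rt import HPSSRT
+
+fs, x = scipy.io.wavfile.read("mixed.wav")
+hpss = HPSSRT(fs)
+
+# zero-initialized so that any tail skipped by the loop is silence, not garbage
+h = numpy.zeros(x.shape)
+p = numpy.zeros(x.shape)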
+
+x_ptr = 0
+while x_ptr < len(x):
+    if len(x[x_ptr : x_ptr + hpss.hop]) != hpss.hop:
+        # skip uneven/non-hop-sized last chunk
+        break
+    h_, p_ = hpss.process_next_hop(x[x_ptr : x_ptr + hpss.hop])
+    h[x_ptr : x_ptr + hpss.hop] = h_
+    p[x_ptr : x_ptr + hpss.hop] = p_
+    x_ptr += hpss.hop
+
+scipy.io.wavfile.write("h_rt_sep_python.wav", fs, h)
+scipy.io.wavfile.write("p_rt_sep_python.wav", fs, p)
+
+fs, xm = scipy.io.wavfile.read("mixed.wav")
+fs, xh = scipy.io.wavfile.read("h_rt_sep_python.wav")
+fs, xp = scipy.io.wavfile.read("p_rt_sep_python.wav")
+_, _, _, im = plt.specgram(xm, Fs=fs, NFFT=1024, noverlap=256)
+plt.show()
+_, _, _, im = plt.specgram(xh, Fs=fs, NFFT=1024, noverlap=256)
+plt.show()
+_, _, _, im = plt.specgram(xp, Fs=fs, NFFT=1024, noverlap=256)
+plt.show()
+```
+
+Mixed spectrogram:
+
+<img src="./images/mixed_specgram_matplotlib.png" width=512>
+
+Real-time harmonic separation:
+
+<img src="./images/harm_specgram_matplotlib.png" width=512>
+
+Real-time percussive separation:
+
+<img src="./images/perc_specgram_matplotlib.png" width=512>
+
+### Project files
+
+* audio - audio clips used throughout the report and presentation to generate results, plots
+* images - plots, etc. for the report and presentation
+* latex - LaTeX sources for the report and presentation PDFs
+* matlab - MATLAB scripts
+    * HPSS.m - median-filtering HPSS (with both the 2010 and 2014 techniques)
+    * HPSSWav.m - loads a WAV file and applies HPSS.m
+    * HPSSMicrophone.m - a real-time implementation that separates the microphone input and outputs either the percussive or the harmonic component
+    * HPSSRtWav.m - a modification of HPSSMicrophone.m to test the real-time implementation with WAV files
+* python - Python library + class, `from hpss_rt import HPSSRT`
+
+### About this project
+
+Real-Time-HPSS is presented as my final project for MUMT 501, Winter 2020.
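
Note on the README above: the offline pipeline it describes (STFT, harmonic and percussive masks, ISTFT) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in this repository: it uses soft masks in the spirit of the 2010 paper, and the function name `hpss_offline`, the filter lengths, the FFT size, and the hop are all illustrative choices.

```python
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import istft, stft


def hpss_offline(x, fs, n_fft=1024, hop=256, l_harm=17, l_perc=17, power=2.0):
    """Offline median-filtering HPSS with soft masks (illustrative sketch)."""
    noverlap = n_fft - hop
    # STFT: rows are frequency bins, columns are time frames.
    _, _, X = stft(x, fs=fs, nperseg=n_fft, noverlap=noverlap)
    S = np.abs(X)

    # Harmonic content is steady over time: median-filter along the time axis.
    H = median_filter(S, size=(1, l_harm))
    # Percussive content is broadband: median-filter along the frequency axis.
    P = median_filter(S, size=(l_perc, 1))

    # Soft masks; mask_p is defined as the complement so that h + p rebuilds x.
    eps = np.finfo(float).eps
    Hp, Pp = H ** power, P ** power
    mask_h = Hp / (Hp + Pp + eps)
    mask_p = 1.0 - mask_h

    # Apply the masks to the complex STFT and invert each result.
    _, h = istft(X * mask_h, fs=fs, nperseg=n_fft, noverlap=noverlap)
    _, p = istft(X * mask_p, fs=fs, nperseg=n_fft, noverlap=noverlap)
    return h[: len(x)], p[: len(x)]


# Small self-check on a synthetic 440 Hz tone (hypothetical test signal).
fs = 8000
tone = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
h, p = hpss_offline(tone, fs)
```

Because the two masks sum to one in every time-frequency bin, the harmonic and percussive outputs add back up to the input. The real-time variant replaces the whole-signal STFT with a sliding one, so the time-direction median filter can only draw on frames that have already arrived.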
@@ -0,0 +1,53 @@
+#!/usr/bin/env python3.7
+
+from hpss_rt import HPSSRT
+import sys
+import time
+import numpy
+import scipy
+import scipy.io.wavfile
+import matplotlib.pyplot as plt
+
+
+if __name__ == "__main__":
+    infile = ""
+    try:
+        infile = sys.argv[1]
+    except IndexError:
+        print("usage: {0} /path/to/wav/file".format(sys.argv[0]), file=sys.stderr)
+        sys.exit(1)
+
+    fs, x = scipy.io.wavfile.read(infile)
+    hpss = HPSSRT(fs)
+
+    # zero-initialized so that any tail skipped by the loop is silence, not garbage
+    h = numpy.zeros(x.shape)
+    p = numpy.zeros(x.shape)
+
+    x_ptr = 0
+    total = 0
+    while x_ptr < len(x):
+        if len(x[x_ptr : x_ptr + hpss.hop]) != hpss.hop:
+            # skip uneven/non-hop-sized last chunk
+            break
+        start = time.time()
+        h_, p_ = hpss.process_next_hop(x[x_ptr : x_ptr + hpss.hop])
+        total += time.time() - start
+        h[x_ptr : x_ptr + hpss.hop] = h_
+        p[x_ptr : x_ptr + hpss.hop] = p_
+        x_ptr += hpss.hop
+
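+    # x_ptr is always a multiple of hpss.hop; guard against inputs shorter than one hop
+    n_hops = x_ptr // hpss.hop
+    if n_hops > 0:
+        total /= n_hops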
+    print("average time per loop iter: {0}".format(total))
+
+    scipy.io.wavfile.write("h_rt_sep_python.wav", fs, h)
+    scipy.io.wavfile.write("p_rt_sep_python.wav", fs, p)
+
+    fs, xm = scipy.io.wavfile.read(infile)
+    fs, xh = scipy.io.wavfile.read("h_rt_sep_python.wav")
+    fs, xp = scipy.io.wavfile.read("p_rt_sep_python.wav")
+    _, _, _, im = plt.specgram(xm, Fs=fs, NFFT=1024, noverlap=256)
+    plt.show()
+    _, _, _, im = plt.specgram(xh, Fs=fs, NFFT=1024, noverlap=256)
+    plt.show()
+    _, _, _, im = plt.specgram(xp, Fs=fs, NFFT=1024, noverlap=256)
+    plt.show()
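
Note on the script above: it skips a final chunk that is shorter than one hop, and it reports the average time per iteration without comparing it to the hop duration. The sketch below shows one possible alternative, assuming zero-padding the tail is acceptable: it pads the last chunk instead of dropping it, and it reports a real-time factor (average processing time per hop divided by `hop / fs`), where a value below 1 means the processing keeps up with the audio. The helpers `iter_hops` and `realtime_factor` are hypothetical names and are not part of `hpss_rt`.

```python
import time

import numpy as np


def iter_hops(x, hop):
    """Yield (start, chunk) pairs of exactly `hop` samples, zero-padding the tail."""
    for start in range(0, len(x), hop):
        chunk = x[start : start + hop]
        if len(chunk) < hop:
            # Pad the short final chunk with zeros (keeps any channel axis intact).
            padded = np.zeros((hop,) + x.shape[1:], dtype=x.dtype)
            padded[: len(chunk)] = chunk
            chunk = padded
        yield start, chunk


def realtime_factor(process, x, fs, hop):
    """Average processing time per hop, relative to the hop duration hop / fs."""
    elapsed, n_hops = 0.0, 0
    for _, chunk in iter_hops(x, hop):
        t0 = time.perf_counter()
        process(chunk)
        elapsed += time.perf_counter() - t0
        n_hops += 1
    if n_hops == 0:
        return 0.0
    return (elapsed / n_hops) / (hop / fs)
```

With the class used in the script, this would presumably be called as `realtime_factor(hpss.process_next_hop, x, fs, hpss.hop)`. When writing padded results back into `h` and `p`, only the first `len(x) - start` samples of the last chunk should be kept.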