[fix] doc and bpm work

This commit is contained in:
alban 2020-09-29 00:56:54 +02:00
parent 7f42b10b95
commit 9887f62202
2 changed files with 183 additions and 75 deletions

README.md

@@ -1,14 +1,108 @@
# (Audio Analysis | redis ) == <3

Redilysis sends audio analysis to a redis server.
The idea is to share a single audio analysis to many Visual Jockey filters, in our case for lasers.

**Two modes are available, so you might need to run two processes for full analysis.**

### Redis Keys and Contents
Each **word in bold** is a key which you can query the redis server for. Ex:
```
$ redis-cli get spectrum_120
"[2.21, 0.56, 0.51, 0.32, 0.27, 0.21, 0.18, 0.17, 0.18, 0.23]"
```
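
The same keys can be read from Python. A minimal sketch, assuming a Redis server on `127.0.0.1:6379` and the `redis` Python package installed:

```python
import json
import redis

# Connect to the same Redis server that redilysis writes to
r = redis.Redis(host="127.0.0.1", port=6379)

raw = r.get("spectrum_120")      # bytes, or None if the analyzer is not running
if raw is not None:
    spectrum = json.loads(raw)   # the array keys hold JSON-encoded lists
    print(len(spectrum), spectrum[:5])
```
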
**rms**
* **Mode** spectrum
* **Type** float number
* **Length** scalar
* **Meaning** Represents the root-mean-square (a mean value) over all frequencies between ```C0``` and ```C9```, i.e. roughly between 16Hz and 8,372Hz.
* **Use** Basic information about the overall audio volume of the scene.
* **Example**
  * ```"0.12"```
  * The audio volume for the scene is pretty low.
  * It is obtained by averaging the RMS of every audio frame during the capture (see the sketch below).
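
As an illustration of that averaging idea (not the exact code of this repository), the RMS of a single audio buffer can be computed like this:

```python
import numpy

def rms(samples):
    """Root-mean-square of an audio buffer."""
    samples = numpy.asarray(samples, dtype=numpy.float32)
    return float(numpy.sqrt(numpy.mean(samples ** 2)))

print(rms([0.1, -0.2, 0.05, 0.15]))  # small value for a quiet buffer
```
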
**spectrum_10**
* **Mode** spectrum
* **Type** array of float numbers (0.0-10.0)
* **Length** 10
* **Meaning** Represents the audio volume for the 10 **octaves** between ```C0``` and ```C9```, i.e. roughly between 16Hz and 8,372Hz.
* **Use** A simple and useful way to get a global idea of the sound landscape.
* **Example**
  * ```"[2.21, 0.56, 0.51, 0.32, 0.27, 0.21, 0.18, 0.17, 0.18, 0.23]"```
  * The audio volume for the `C4` octave is `spectrum_10[4]`.
  * That value, `0.27`, is pretty low, meaning there is almost no audio volume in that octave.
  * It is calculated by averaging the volume of the octave's notes, i.e. `C4, C#4, D4, D#4, E4, F4, F#4, G4, G#4, A4, A#4, B4` (see the sketch below).
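
A minimal sketch of that per-octave averaging, using a hypothetical local `notes` list rather than the script's internal data:

```python
# Collapse 120 per-note volumes into 10 per-octave volumes by averaging
# each run of 12 consecutive notes (C, C#, D, ..., B).
def octaves_from_notes(notes):
    return [sum(notes[i:i + 12]) / 12.0 for i in range(0, 120, 12)]

spectrum_10 = octaves_from_notes([0.5] * 120)  # dummy input: 120 equal volumes
print(spectrum_10)                             # ten values, one per octave
```
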
**spectrum_120**
* **Mode** spectrum
* **Type** array of float numbers (0.0-10.0)
* **Length** 120
* **Meaning** Represents the audio volume for the 120 **notes** between ```C0``` and ```C9```, i.e. roughly between 16Hz and 8,372Hz.
* **Use** More detailed than spectrum_10, it lets you find the notes that stand out in the audio landscape (see the sketch below).
* **Example**
  * ```"[5.55, 2.61, 2.49, 1.79, 2.09, 4.35, 1.99, 1.57, 1.47, 0.77, 0.91, 0.89, 0.85, 0.56, 0.53, 0.73, 0.53, 0.46, 0.43, 0.44, 0.27, 0.45, 0.7, 0.81, 0.98, 0.7, 0.71, 0.6, 0.83, 0.51, 0.32, 0.31, 0.33, 0.24, 0.25, 0.33, 0.39, 0.43, 0.51, 0.28, 0.27, 0.25, 0.38, 0.25, 0.27, 0.3, 0.2, 0.27, 0.35, 0.29, 0.34, 0.3, 0.27, 0.27, 0.22, 0.21, 0.21, 0.29, 0.22, 0.28, 0.18, 0.19, 0.25, 0.26, 0.25, 0.24, 0.2, 0.21, 0.19, 0.18, 0.19, 0.17, 0.2, 0.17, 0.18, 0.17, 0.15, 0.17, 0.19, 0.18, 0.21, 0.16, 0.16, 0.18, 0.15, 0.13, 0.14, 0.16, 0.2, 0.17, 0.17, 0.2, 0.18, 0.16, 0.18, 0.15, 0.15, 0.16, 0.16, 0.19, 0.19, 0.19, 0.17, 0.18, 0.17, 0.19, 0.23, 0.23, 0.2, 0.23, 0.24, 0.36, 0.34, 0.23, 0.22, 0.2, 0.19, 0.18, 0.21, 0.21]"```
  * The audio volume for the `C2` note is `spectrum_120[24]` (12 × 2, counting from `C0` at index 0).
  * That value, `0.98`, is average, meaning there is some audio volume for that note.
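
For instance, a client could pick the loudest notes out of `spectrum_120` like this (an illustrative sketch, assuming the spectrum process is running and that index 0 is `C0`):

```python
import json
import redis

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

r = redis.Redis(host="127.0.0.1", port=6379)
spectrum = json.loads(r.get("spectrum_120"))

# Sort note indices by volume, loudest first, and keep the top 3
top = sorted(range(len(spectrum)), key=lambda i: spectrum[i], reverse=True)[:3]
for i in top:
    print("{}{} -> {}".format(NOTE_NAMES[i % 12], i // 12, spectrum[i]))
```
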
**bpm**
* **Mode** bpm
* **Type** float number
* **Length** scalar
* **Meaning** Represents the estimated tempo of the last analyzed audio window, in beats per minute.
* **Use** Drive animations at the music tempo. The key expires after twice the sampling interval, so an empty reply means the bpm process is not running.
* **Example**
  * ```"120.0"```
  * The detected tempo is 120 beats per minute, i.e. one beat every 0.5 second.
**bpm_sample_interval**
* **Mode** bpm
* **Type** float number
* **Length** scalar
* **Meaning** Represents the duration, in seconds, of the audio window used for each bpm analysis (the `--sampling-frequency` value).
* **Example**
  * ```"0.5"```
  * Each tempo estimate is computed over half a second of audio.
**bpm_delay**
* **Mode** bpm
* **Type** float number
* **Length** scalar
* **Meaning** Represents a timing offset, in seconds, meant to help clients re-align beat predictions; it is computed as `sample_interval + capture_time - write_time` when the bpm keys are written (see the timing diagram in `redilysis.py`).
* **Example**
  * ```"0.48"```
  * The keys were written shortly after the capture window ended.
**beats**
* **Mode** bpm
* **Type** array of float numbers
* **Length** variable (one entry per detected beat)
* **Meaning** Represents the positions of the detected beats, in seconds, inside the analyzed audio window (see the client sketch below).
* **Example**
  * ```"[0.02, 0.52]"```
  * Two beats were detected, 0.5 second apart, which matches a tempo of 120 bpm.
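
A rough client-side sketch of how these four keys fit together (this is an interpretation of the keys above, not code from this repository; it assumes the bpm process is running):

```python
import json
import redis

r = redis.Redis(host="127.0.0.1", port=6379)

bpm = float(r.get("bpm"))                        # expires if the analyzer stops
beats = json.loads(r.get("beats"))               # beat times inside the last window, in seconds
interval = float(r.get("bpm_sample_interval"))   # length of that window, in seconds
delay = float(r.get("bpm_delay"))                # timing offset written with the keys

period = 60.0 / bpm                              # seconds between two consecutive beats
print("tempo {:.1f} bpm -> one beat every {:.3f}s".format(bpm, period))
print("window {}s, offset {:.3f}s, beats at {}".format(interval, delay, beats))
```

From there a client can extrapolate upcoming beats by adding multiples of the period to the last detected beat, which is what the `Next Beat Calculation` diagram in `redilysis.py` describes.
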
### Requirements and installation
* python 2.7
* audio card
* redis server
#### Installation
```bash
sudo apt install python-pyaudio python
git clone https://git.interhacker.space/tmplab/redilysis.git
cd redilysis
pip install -r requirements.txt
python redilysis.py --help
```
### Running in Spectrum Mode
```
python redilysis.py -m spectrum
```
This is the default mode.
It performs some frequency analysis (Fast Fourier Transform) to detect "energy" across the human hearing range.
@@ -18,53 +112,15 @@ It can run at sub-second frequency (100ms) with no problem.
It reports realistic data: spectrum analysis is the easy part.
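
Conceptually, the per-band "energy" comes from a Fourier transform of each capture buffer. A simplified numpy illustration (not the script's exact pipeline):

```python
import numpy

RATE = 44100                                     # capture rate in Hz (the default)
buf = numpy.zeros(4410, dtype=numpy.float32)     # one 100ms buffer (silence here)

# Magnitude spectrum: how much energy sits at each frequency bin
mags = numpy.abs(numpy.fft.rfft(buf))
freqs = numpy.fft.rfftfreq(len(buf), d=1.0 / RATE)

# Energy in one octave, e.g. from C4 (~262Hz) to C5 (~523Hz)
band = (freqs >= 262.0) & (freqs < 523.0)
print(mags[band].sum())
```
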

### Running in BPM Mode
```
python redilysis.py -m bpm -s 0.5
```
This mode is experimental.
It attempts to detect beats based on complex parameters.
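
Under the hood this mode relies on librosa's beat tracker on each captured window, roughly as follows (a trimmed-down sketch of the call used in `redilysis.py`, run here on a silent dummy buffer):

```python
import numpy
import librosa

RATE = 44100
audio_data = numpy.zeros(int(RATE * 0.5), dtype=numpy.float32)  # 0.5s of audio, as with `-s 0.5`

# Estimate the tempo (bpm) and the beat positions (in seconds) inside the window
new_bpm, beats = librosa.beat.beat_track(y=audio_data, sr=RATE, trim=False, units="time")
print(new_bpm, beats)
```
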

redilysis.py

@@ -34,42 +34,52 @@ _FRAMES_PER_BUFFER = 4410
_N_FFT = 4096
_RATE = 44100
_SAMPLING_FREQUENCY = 0.1
_BPM_MIN=10
_BPM_MAX=400
# Argument parsing
# Audio Args
parser = argparse.ArgumentParser(prog='realtime_redis')
# Audio Capture Args
parser.add_argument('--list-devices','-L', action='store_true', help='Which devices are detected by pyaudio')
parser.add_argument('--mode','-m', required=False, default='spectrum', choices=['spectrum', 'bpm'], type=str, help='Which mode to use. Default=spectrum')
parser.add_argument('--device','-d', required=False, type=int, help='Which pyaudio device to use')
parser.add_argument('--sampling-frequency','-s', required=False, default=0.1, type=float, help='The sampling interval, in seconds. Default={} '.format(_SAMPLING_FREQUENCY))
parser.add_argument('--channels','-c', required=False, default=_CHANNELS, type=int, help='How many channels. Default={} '.format(_CHANNELS))
parser.add_argument('--rate','-r', required=False, default=44100, type=int, help='The audio capture rate in Hz. Default={} '.format(_RATE))
#parser.add_argument('--frames','-f', required=False, default=4410, type=int, help='How many frames per buffer. Default={}'.format(_FRAMES_PER_BUFFER))
# BPM Mode Args
parser.add_argument('--bpm-min', required=False, default=_BPM_MIN, type=int, help='BPM mode only. The low BPM threshold. Default={} '.format(_BPM_MIN))
parser.add_argument('--bpm-max', required=False, default=_BPM_MAX, type=int, help='BPM mode only. The high BPM threshold. Default={} '.format(_BPM_MAX))
# Redis Args
parser.add_argument("-i","--ip",help="IP address of the Redis server ",default="127.0.0.1",type=str)
parser.add_argument("-p","--port",help="Port of the Redis server ",default="6379",type=str)
# Standard Args
parser.add_argument("-v","--verbose",action="store_true",help="Verbose") parser.add_argument("-v","--verbose",action="store_true",help="Verbose")
args = parser.parse_args() args = parser.parse_args()
# global # global
bpm = 120.0 bpm = 120.0
start = 0
# Set real variables
F_LO = librosa.note_to_hz('C0')
F_HI = librosa.note_to_hz('C10')
BAND_TONES = _BAND_TONES
N_FFT = _N_FFT
CHANNELS = args.channels
DEVICE = args.device
FRAMES_PER_BUFFER = int(args.rate * args.sampling_frequency )
LIST_DEVICES = args.list_devices
MODE = args.mode
RATE = args.rate
SAMPLING_FREQUENCY = args.sampling_frequency
bpm_min = args.bpm_min
bpm_max = args.bpm_max
ip = args.ip
port = args.port
verbose = args.verbose
if( MODE == "bpm" and SAMPLING_FREQUENCY < 0.5 ): if( MODE == "bpm" and SAMPLING_FREQUENCY < 0.5 ):
debug( "You should use a --sampling_frequency superior to 0.5 in BPM mode...") debug( "You should use a --sampling_frequency superior to 0.5 in BPM mode...")
@@ -106,32 +116,73 @@ p = pyaudio.PyAudio()
def m_bpm(audio_data):
    """
    This function saves slow analysis to redis
    * bpm
    * beat
    """
    global bpm
    global start
    bpm_delay = SAMPLING_FREQUENCY + start - time.time()

    # Detect tempo / bpm
    new_bpm, beats = librosa.beat.beat_track(
        y = audio_data,
        sr = RATE,
        trim = False,
        #start_bpm = bpm,
        units = "time"
    )
    '''
    new_bpm = librosa.beat.tempo(y = audio_data, sr=RATE)[0]
    '''
    # Correct a possible octave error
    if new_bpm < bpm_min or new_bpm > bpm_max:
        octaveErrorList = [ 0.5, 2, 0.3333, 3 ]
        for key, factor in enumerate(octaveErrorList):
            correction = new_bpm * factor
            if correction > bpm_min and correction < bpm_max:
                debug( "Corrected bpm to:{}".format(correction))
                new_bpm = correction
                break
        # If no correction factor brought it into range, clamp to the bounds
        if new_bpm < bpm_min:
            new_bpm = bpm_min
        elif new_bpm > bpm_max:
            new_bpm = bpm_max
    '''
                         |start                 end|
    Capture              |........................|
    BPM detect+Redis set                           ||
    Client Redis get                                             |
    Time                 |........................||.............|
                          ---SAMPLING_FREQUENCY----
                                                   - < TIME-START
    Read Delay                                     --------------- < 2*SAMPLING_FREQUENCY - PTTL
    Delay                -----------------------------------------
    Beats                                |last beat
      . known            ...b....b....b....b....b.
      . passed                           (...b....b....b.)
      . guessed                           (..b....b....b....b...
    Next Beat Calculation                 b....b....b....b.|..b
      => (Delay - last beat) + x*BPM/60 (with x >= read_delay/BPM/60)

    Redis:
      bpm_sample_interval
                         |........................|
      bpm_delay
                         |.........................|
    '''
    bpm = new_bpm

    # Save to Redis
    r.set( 'bpm', new_bpm, px=( 2* int(SAMPLING_FREQUENCY * 1000)))
    r.set( 'bpm_sample_interval', SAMPLING_FREQUENCY )
    r.set( 'bpm_delay', bpm_delay )
    r.set( 'beats', json.dumps( beats.tolist() ) )
    debug( "bpm:{} bpm_delay:{} beats:{}".format(bpm,bpm_delay,beats) )

    return True
def m_spectrum(audio_data):
@@ -179,6 +230,7 @@ def m_spectrum(audio_data):
def callback(in_data, frame_count, time_info, status):
    audio_data = numpy.fromstring(in_data, dtype=numpy.float32)
    global start
    start = time.time()
    if MODE == 'spectrum':
        m_spectrum(audio_data)