
Arkwood, my corrupt Belgian buddy, has long harboured a wish to be a rock star. He’s done his fair share of auditions, but, alas, rejection is an all-too-familiar bedfellow. ‘Not to worry,’ I told him, slapping him on the spine, ‘I will write some Python code that will act as your very own backing band!’

Here’s the plan. The code will record a sample of Arkwood’s singing voice. It will then analyse the sample, to work out if Arkwood is belting out some rock classic or just crooning the odd word. Lastly, it will play a guitar suited to the style of his singing. Perfect!

Let’s open Python Tools for Visual Studio on my Windows 7 PC and start writing Python…

Record audio

I am going to use PyAudio to record Arkwood warbling into the microphone attached to my PC.

I downloaded PyAudio from www.lfd.uci.edu, unzipped the file and dropped pyaudio.py and _portaudio.pyd into my Anaconda site-packages folder. I dropped portaudio_x64.dll into the folder where my Visual Studio project sits.

Here’s my AudioRecord class:

from recordscript import *

class AudioRecord(object):

    # record voice
    def voice(self):
        return record_script()

Not much to it, as it simply acts as a wrapper for eugene’s Detect & Record Audio in Python code on stackoverflow. Check out eugene’s code: it waits until sound is picked up through the microphone, then continues to record until it goes silent again. Hey presto! A sample of Arkwood singing, recorded to a wav file.

I’ve placed eugene’s code into a Python file named recordscript. I’ve made a couple of minor amendments to the code, calling it with record_script() and returning the name of the created file. Otherwise it works great as is.
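To give a feel for the "wait for sound, then record until silence" idea, here is a minimal sketch of my own. It is not eugene's code: it works on plain lists of integer samples rather than a live PyAudio stream, and the function name, the threshold of 500 and the limit of 3 silent chunks are all assumptions chosen for illustration.

```python
def capture_until_silence(chunks, threshold=500, max_silent_chunks=3):
    """Keep chunks from the first loud one until a run of silent ones.

    chunks: iterable of non-empty lists of integer samples.
    threshold and max_silent_chunks are illustrative values only.
    """
    recorded = []
    started = False
    silent_run = 0
    for chunk in chunks:
        # a chunk is loud if any sample reaches the threshold
        loud = max(abs(val) for val in chunk) >= threshold
        if not started:
            if not loud:
                continue  # still waiting for the singer to begin
            started = True
        recorded.append(chunk)
        # count consecutive silent chunks, resetting on any loud one
        silent_run = 0 if loud else silent_run + 1
        if silent_run >= max_silent_chunks:
            break  # the singer has gone quiet
    return recorded


# made-up input: two quiet chunks, some singing, then a long hush
chunks = [[0, 1], [2, 0], [900, 10], [0, 0], [800, 3],
          [0, 0], [1, 0], [0, 2], [700, 0]]
recorded = capture_until_silence(chunks)
```

The leading quiet chunks are skipped, and recording stops once three silent chunks arrive in a row, so the final loud chunk in the example is never reached.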

Analyse audio

So we have just recorded Arkwood singing into the microphone, and saved the audio to file. Next we need to analyse the recording to determine whether he is belting out a rock classic or crooning the odd word.

For this I will use the wavfile module of SciPy. I already have SciPy as part of my Anaconda Python distribution, so no need to fetch it.

Sam Carcagno’s blog post was a great help in understanding how to analyse an audio wav file. There’s also a very useful stackoverflow post from NightHallow.

Here’s my AudioAnalysis class:

from scipy.io import wavfile

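class AudioAnalysis(object):

    # a sample counts as a peak if it exceeds this value (data is scaled to -1..1)
    PEAK_VALUE_THRESHOLD = 0.5
    # a voice is rich if it has more peaks than this
    PEAK_COUNT_THRESHOLD = 1000

    # inspect voice file to determine if rich
    def is_rich(self, wave_file):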

        # read voice file and scale 16-bit samples to floats between -1 and 1
        rate, data = wavfile.read(wave_file)
        data = data / (2.**15)

        # get number of peaks
        peaks = 0
        for val in data:
            if val > self.PEAK_VALUE_THRESHOLD:
                peaks += 1
        # return True if sound is rich (i.e. many peaks)
        if peaks > self.PEAK_COUNT_THRESHOLD:
            return True

        # otherwise return False
        return False

I’ve decided to keep things simple for my first foray into audio analysis. Once my wav file of Arkwood’s voice is loaded into rate and data, a calculation is made to convert the data into floats ranging in value from -1 to 1.
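The scaling step can be tried out without SciPy. The sketch below is an assumption-laden stand-in, not the code used in this post: it uses the standard library's wave and struct modules, handles only mono 16-bit files, and the helper names write_wav and read_normalised are made up for the example.

```python
import os
import struct
import tempfile
import wave


def write_wav(path, int_samples, rate=8000):
    """Hypothetical helper: write mono 16-bit PCM samples to a wav file."""
    wav = wave.open(path, "wb")
    try:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bits
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(int_samples), *int_samples))
    finally:
        wav.close()


def read_normalised(path):
    """Read a mono 16-bit wav and scale each sample by 2**15."""
    wav = wave.open(path, "rb")
    try:
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    finally:
        wav.close()
    count = len(frames) // 2  # two bytes per sample
    ints = struct.unpack("<%dh" % count, frames)
    return rate, [val / (2.0 ** 15) for val in ints]


# round-trip the extreme and middle values of a 16-bit sample
path = os.path.join(tempfile.mkdtemp(), "demo.wav")
write_wav(path, [-32768, -16384, 0, 16384, 32767])
rate, data = read_normalised(path)
```

The most negative 16-bit value maps to exactly -1, while the most positive lands just under 1, which is why the scaled range is usually described as -1 to 1.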

Next we loop over the data and count all values above a threshold of 0.5. The plan is that if the sample of Arkwood’s voice is rich (i.e. he’s belting out a rock classic) then the number of values above the threshold will be great. If he is crooning only a couple of words then the number of values above the threshold will be small. Our is_rich method returns True or False, depending on whether the count has breached 1000.
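The counting rule can be checked on made-up signals before pointing it at a real recording. This is only a sketch: the sine-wave tone and silence helpers, the 8000 Hz rate and the amplitudes are my assumptions standing in for Arkwood's voice, while the two thresholds are the ones described above.

```python
import math

PEAK_VALUE_THRESHOLD = 0.5    # value quoted in the post
PEAK_COUNT_THRESHOLD = 1000   # value quoted in the post


def count_peaks(samples, threshold=PEAK_VALUE_THRESHOLD):
    """Count samples whose value exceeds the threshold."""
    return sum(1 for val in samples if val > threshold)


def is_rich(samples):
    """A signal is rich if it has more peaks than the count threshold."""
    return count_peaks(samples) > PEAK_COUNT_THRESHOLD


def tone(seconds, rate=8000, freq=220.0, amplitude=0.9):
    """Hypothetical helper: a sine wave standing in for a sung note."""
    n = int(seconds * rate)
    return [amplitude * math.sin(2 * math.pi * freq * i / rate)
            for i in range(n)]


def silence(seconds, rate=8000):
    """Hypothetical helper: a stretch of hush between words."""
    return [0.0] * int(seconds * rate)


# two seconds of sustained sound versus two short bursts with long gaps
belting = tone(2.0)
crooning = tone(0.1) + silence(0.9) + tone(0.1) + silence(0.9)
```

Both signals are equally loud while sounding, yet only the sustained one passes the count threshold, which matches the point that the measure reflects how much hush there is rather than volume alone.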

All this is better explained with a demo. I have used the following code to plot Arkwood belting out a rock classic:

import numpy as np
import matplotlib.pyplot as plt

# plot voice file (rate and data as loaded in is_rich)
time = np.arange(len(data))*1.0/rate
plt.plot(time, data)
plt.show()


We can see from the graphical representation of the wav file that there are many values above our threshold of 0.5 (3808, as it happens). Our is_rich method will return True.

Now let’s take a look at Arkwood crooning just a couple of words:


Four words, to be precise. There are only 487 values above our threshold of 0.5. That’s not to say that Arkwood’s voice is not loud – it’s just to say that there is quite a bit of hush in between his words. Our is_rich method will return False.

Play audio

Okay. So far we have recorded Arkwood singing, and we have analysed his voice to determine if he is belting out a rock classic or crooning a few words. Now we are going to play a guitar sound that will suit his singing.

Here’s the class:

import pygame

class AudioPlay(object):

    def __init__(self):
        # initialise the mixer so that sounds can be loaded and played
        pygame.mixer.init()

    def guitar(self, is_rich_voice):
        # select guitar sound to suit voice...
        if is_rich_voice:
            guitar = pygame.mixer.Sound("audio/259663__frankyboomer__guitar-chords.wav")
        else:
            guitar = pygame.mixer.Sound("audio/62060__erh__guitar-acstg-v2201b2-62-10.wav")

        # play the guitar
        guitar.play()

I am using Pygame to play the guitar. I downloaded Pygame from www.lfd.uci.edu, unzipped the file and dropped pygame into my Anaconda site-packages folder.

Our guitar method simply selects a guitar sound based on whether Arkwood’s voice is rich or not (as determined in the previous analysis step). The guitar is then played through the computer speakers.
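If you want to test the selection logic without any speakers attached, one option is to split it out from the playback. The sketch below is a suggestion rather than how the class is written: choose_guitar and the two constant names are hypothetical, and only the file paths come from the AudioPlay class.

```python
# File paths are the ones used in the AudioPlay class.
ROCK_GUITAR = "audio/259663__frankyboomer__guitar-chords.wav"
CROON_GUITAR = "audio/62060__erh__guitar-acstg-v2201b2-62-10.wav"


def choose_guitar(is_rich_voice):
    """Hypothetical helper: return the sound file that suits the voice."""
    return ROCK_GUITAR if is_rich_voice else CROON_GUITAR
```

The returned path could then be handed to pygame.mixer.Sound and played as before, leaving the choice itself free of any Pygame dependency.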

Freesound provided the guitar licks…

FrankyBoomer’s hot jam perfectly suits Arkwood’s voice, when belting out a rock classic: https://www.freesound.org/people/FrankyBoomer/sounds/259663/

ERH’s atmospheric guitar works well with Arkwood’s crooning voice: https://www.freesound.org/people/ERH/sounds/62060/

Bringing it all together

All that’s left to do is write some Python code that will make use of our classes for recording, analysing and playing audio:

from audiorecord import AudioRecord
from audioanalysis import AudioAnalysis
from audioplay import AudioPlay
from time import sleep

audio_record = AudioRecord()
audio_analysis = AudioAnalysis()
audio_play = AudioPlay()

while True:
    print("Arkwood! Get ready to sing...")

    # record Arkwood's singing voice
    voice_file = audio_record.voice()

    # inspect voice to determine if rich
    is_rich_voice = audio_analysis.is_rich(voice_file)

    # play guitar to suit voice
    audio_play.guitar(is_rich_voice)

    # wait 5 seconds before starting over
    sleep(5)

Lovely. Once inside the while loop, we instruct Arkwood to get ready to sing into the microphone attached to my PC.

We record a sample of his voice, obtaining the name of the newly created audio file.

We analyse the audio file, to determine if his voice is rich (i.e. it’s rich if he’s belting out a rock classic).

Finally we play a guitar sound that suits his voice, through the computer speakers.

We stall the code for 5 seconds then start over again.
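The record, analyse, play sequence can be exercised without a microphone or speakers by passing in stand-in objects. This is a sketch under my own assumptions: run_once and the three Fake classes are hypothetical names, and they only mimic the voice, is_rich and guitar methods described above.

```python
def run_once(recorder, analyser, player):
    """One pass of the loop: record, analyse, then play."""
    voice_file = recorder.voice()
    is_rich_voice = analyser.is_rich(voice_file)
    player.guitar(is_rich_voice)
    return voice_file, is_rich_voice


# Hypothetical stand-ins so the flow can be run without hardware.
class FakeRecord(object):
    def voice(self):
        return "demo.wav"  # pretend a file was recorded


class FakeAnalysis(object):
    def is_rich(self, wave_file):
        return wave_file == "demo.wav"  # pretend the voice is rich


class FakePlay(object):
    def __init__(self):
        self.played = []

    def guitar(self, is_rich_voice):
        self.played.append(is_rich_voice)  # note what would be played


player = FakePlay()
result = run_once(FakeRecord(), FakeAnalysis(), player)
```

Because each step only depends on the method names, the real AudioRecord, AudioAnalysis and AudioPlay objects could be passed to the same function inside the while loop.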

‘So, my friend,’ I asked him, ‘How does it feel to have your very own backing band?’

Arkwood curled his top lip. ‘The song is a bit repetitive. Don’t you think you could mix it up a bit with drums and bass?’

Clearly he knows nothing of pop music, and the simple hooks that belie a masterpiece of political satire.