
Arkwood teased gel through his floppy black hair. ‘When are you going to make me a rock star?’ he asked.

As my buddy’s manager, I had failed to launch his musical career. But I had an idea. ‘I will use one of my robots as a voice coach. You will soon be warbling like a soprano.’

I added a new feature to SaltwashAR – a Python Augmented Reality application – so that Rocky Robot can act as a mixing desk, replaying Arkwood’s vocals to guitar and drums. We will soon have the killer hit.

Here’s the Python code for the Mixing Desk feature:

from features.base import Feature, Speaking
import speech_recognition as sr
import pygame
from time import sleep

class MixingDesk(Feature, Speaking):

    GUITAR = "guitar"
    GUITAR_FILENAME = "scripts/features/mixingdesk/guitar.wav"

    DRUMS = "drums"
    DRUMS_FILENAME = "scripts/features/mixingdesk/drums.wav"

    def __init__(self, text_to_speech, speech_to_text):
        Feature.__init__(self)
        Speaking.__init__(self, text_to_speech)
        self.speech_to_text = speech_to_text
        self.recognizer = sr.Recognizer()
        pygame.mixer.init(frequency=8000)

    def _thread(self, args):
    
        # mixing desk asks for vocal
        self._text_to_speech("Sing the vocal now...")
    
        # user sings
        with sr.Microphone() as source:
            print("listening...")
            wav_data = self.recognizer.listen(source).get_wav_data()

        # check whether to stop thread
        if self.is_stop: return
      
        # mixing desk asks for instruments
        self._text_to_speech("What instruments do you want?")

        # user gives instruments
        instruments = self._speech_to_text()

        # mixing desk gets mixing...
        pygame.mixer.set_num_channels(3)
        
        vocals = pygame.mixer.Channel(0)
        vocals.set_volume(0.8)       
        vocals.play(pygame.mixer.Sound(wav_data))

        if self.GUITAR in instruments:
            guitar = pygame.mixer.Channel(1)
            guitar.set_volume(0.3)
            guitar.play(pygame.mixer.Sound(self.GUITAR_FILENAME))

        if self.DRUMS in instruments:
            drums = pygame.mixer.Channel(2)
            drums.set_volume(0.6)
            drums.play(pygame.mixer.Sound(self.DRUMS_FILENAME))

        while vocals.get_busy():
            sleep(0.1)  # poll until the vocal channel finishes, without a tight busy-wait

        if self.GUITAR in instruments:
            guitar.stop()

        if self.DRUMS in instruments:
            drums.stop()

        sleep(4)

    def _speech_to_text(self):
        text = self.speech_to_text.convert()
        if not text: return []

        return text.lower().split()

The Mixing Desk feature inherits from the Feature base class, which provides threading (all features run in threads so as not to block the main application process from rendering to the screen). We also inherit from the Speaking base class, to let the robot’s mouth move when she speaks.
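The Feature base class itself isn’t shown in this post. As a rough sketch of how such a class might work — this is an assumption based on the description above, not the actual SaltwashAR source — each feature runs its `_thread` method on a worker thread and exposes an `is_stop` flag the worker can check:

```python
import threading

class Feature(object):
    """Hypothetical sketch of a threaded feature base class.
    The real class lives in SaltwashAR's features/base.py."""

    def __init__(self):
        self.is_stop = False
        self.thread = None

    def start(self, args=None):
        # run the feature's work off the main thread,
        # so rendering to screen is not blocked
        if self.thread and self.thread.is_alive():
            return
        self.is_stop = False
        self.thread = threading.Thread(target=self._thread, args=(args,))
        self.thread.start()

    def stop(self):
        # signal the worker to exit at its next is_stop check
        self.is_stop = True

    def _thread(self, args):
        # subclasses such as MixingDesk override this
        raise NotImplementedError
```

This is why MixingDesk can simply test `if self.is_stop: return` mid-way through its work: the main application flips the flag when the robot leaves the webcam’s view, and the worker thread exits at its next checkpoint.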

The __init__ method is passed Text To Speech and Speech To Text parameters, so that we can communicate with the robot. It sets up a self.recognizer variable so that Arkwood can record his voice. Pygame’s mixer is initialized with a frequency of 8000, so that Arkwood’s voice can be replayed with guitar and drums.

As a postman would say, the _thread method is where all the shit happens.

First, the robot uses Text To Speech through the computer speakers to ask Arkwood to “Sing the vocal now…”.

Arkwood sings his vocal through the computer microphone, which gets recorded by the self.recognizer variable, yielding wav data.

Next, we check whether we should stop the thread. Why? Well, if the robot is no longer in front of the webcam then it ain’t gonna be able to do any mixing of Arkwood’s vocals.

With my chum’s vocals safely stored to wav data, the robot then asks him what instruments he would like. Speech To Text through the computer microphone is used to convert Arkwood’s reply of “guitar drums” into text.
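The `_speech_to_text` helper lowercases the reply and splits it into word tokens, so the feature can check for "guitar" and "drums" with simple membership tests, whatever else Arkwood says around them. A standalone illustration of that behaviour (the function name here is mine, for demonstration only):

```python
def reply_to_instruments(text):
    # mirror MixingDesk._speech_to_text: lowercase the reply
    # and split it into word tokens; empty reply gives no instruments
    if not text:
        return []
    return text.lower().split()

instruments = reply_to_instruments("Guitar and Drums")
print("guitar" in instruments)  # True
print("drums" in instruments)   # True
```

Note that tokenising means a reply of “guitar and drums please” still matches, since the membership checks ignore the extra words.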

Now we get to the sexy stuff, mixing Arkwood’s vocal with some instruments.

We set up 3 Pygame channels, for vocal, guitar and drums.

Channel 0 plays the wav data, the recording of Arkwood’s singing. The channel volume is set to 0.8.

If Arkwood has requested a guitar, then Channel 1 loads and plays a guitar wav file at volume 0.3.

If Arkwood has requested drums, then Channel 2 loads and plays a drums wav file at volume 0.6.

The vocal, guitar and drums will continue to play until the vocal recording comes to an end. The guitar and drums are then stopped.

The thread sleeps for 4 seconds before starting over again (assuming that Rocky Robot is still in front of the webcam).

Let’s have a look at Arkwood putting the Mixing Desk through its paces:

Terrific! Arkwood’s singing is replayed to drums, and his second warble is replayed to guitar and drums.

‘How is the voice?’ I asked him. ‘Has my robot coached you to a number one single?’

He dropped his shades to reveal red-rimmed eyes and a powdery nose. Sure, he was well on his way to Rock God stardom.

Ciao!

Please check out the SaltwashAR Wiki for details on how to install and help develop the SaltwashAR Python Augmented Reality application.
