In my post Camera angles with OpenCV I was able to construct a camera rig from a pizza box:


There are 4 webcams on the rig – at front, back, left and right – all plugged into a single PC. With a bit of Python code I could read frames from the 4 webcams simultaneously, saving the images to disk.

In this post, I will use Google Speech Recognition to cut between camera angles and direct a movie fight scene!

Here’s the main Python program:

import cv2
from speech import Speech

# setup camera angles
camera_angles = ['front', 'back', 'left', 'right']
camera = camera_angles[0]

# setup speech recognition and start listening in the background
speech = Speech()
speech.start()

# setup frame counter
frame_counter = 600

# loop
while frame_counter < 1100:

    # get director's chosen camera
    current_speech = speech.get_current_speech()
    if current_speech in camera_angles:
        camera = current_speech

    # load frame from camera
    frame = cv2.imread('{}/image_{}.jpg'.format(camera, frame_counter))

    # show frame from camera (cv2.imread returns None if the file cannot be read)
    if frame is not None:
        cv2.imshow('movie fight scene', frame)

        # give the window time to draw; the delay in ms (100 is an arbitrary choice) sets the speed
        cv2.waitKey(100)

    # increment frame counter
    frame_counter += 1


First up, we import OpenCV, which will take care of reading the webcam frames from disk and showing our movie.

We also import a Speech class, so that we can act as a movie director and say which camera angle we want to cut to. We start with the ‘front’ webcam from our camera_angles list.

We loop on a frame_counter variable – starting at frame number 600 and stopping just before frame number 1100 (I don’t want the frames either side in my movie – no fat!)

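Once inside the loop, we use our speech recognition class to get the latest thing the director said. If it matches one of our camera angles, it becomes the new value of the camera variable. For example, if the microphone plugged into my PC has picked up the director saying “left”, then the camera variable is set to ‘left’. Anything else is ignored and the camera stays where it is.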
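That selection rule can be pulled out into a small function, which makes it easy to try without a microphone. This is only a sketch: choose_camera is a name I have made up for illustration, and it is not part of the main program.

```python
# Hypothetical helper (not in the original program): pick the camera to show.
def choose_camera(current_camera, spoken, camera_angles):
    # cut to the spoken angle only if it names one of our cameras
    if spoken in camera_angles:
        return spoken
    # otherwise (nothing heard yet, or an unrecognised word) stay put
    return current_camera


camera_angles = ['front', 'back', 'left', 'right']
camera = camera_angles[0]

# simulate what the director might say over four frames
for heard in [None, 'left', 'action', 'back']:
    camera = choose_camera(camera, heard, camera_angles)
```

Here the camera starts on ‘front’, cuts to ‘left’, ignores “action” and finishes on ‘back’.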

Next, we load a frame for the movie using OpenCV. The camera variable determines which folder we load the frame from e.g. ‘left’. The frame_counter variable determines the filename of the frame e.g. image_759.jpg
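To make the folder and filename convention concrete, here is a minimal sketch of how the path could be built. The frame_path helper and its root parameter are my own assumptions for illustration; the main program simply formats the string inline.

```python
import os


# Hypothetical helper: build the path of a saved frame.
# Assumes one folder per camera, with files named image_<number>.jpg.
def frame_path(camera, frame_counter, root='.'):
    filename = 'image_{}.jpg'.format(frame_counter)
    return os.path.join(root, camera, filename)


# the frame the 'left' webcam saved at frame number 759
path = frame_path('left', 759)
```

Worth knowing: cv2.imread does not raise an error when a file is missing or unreadable, it returns None. So it is a good idea to check the result before handing it to cv2.imshow.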

With a frame loaded from disk, we can show it using OpenCV. A call to cv2.waitKey after cv2.imshow gives the window time to draw, and its delay in milliseconds can be used to slow down / speed up the movie.
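If you would rather think in frames per second than milliseconds, the delay can be worked out with a little arithmetic. This is a sketch under my own assumptions: delay_for_fps is a made-up helper, and the real playback speed will be a bit slower than the target because loading and drawing each frame also takes time.

```python
# Hypothetical helper: convert a target playback rate into a cv2.waitKey delay.
def delay_for_fps(fps):
    # cv2.waitKey takes milliseconds, and a delay of 0 means "wait forever
    # for a key press", so never return less than 1
    return max(1, int(round(1000.0 / fps)))


slow_motion = delay_for_fps(10)   # 100 ms per frame
normal = delay_for_fps(25)        # 40 ms per frame
```

The result would then be passed to cv2.waitKey in place of a hard-coded number.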

Finally, we increment our frame_counter variable, ready for the next frame.

Here I am directing the movie fight scene:

The movie starts on the ‘front’ camera. As the director, I say “left” and the camera angle switches to ‘left’. Next, I say “right” and the camera angle switches to ‘right’. Finally I say “back” and the camera angle switches to ‘back’.

The video above then shows the whole fight scene from each angle.

And there you have it – with a bit of Python code we can be a movie director, shooting a stramash between The Dude and a robot from multiple camera angles!



Here’s the Speech class, which employs a thread so as not to block the main program:

from speechtotext import SpeechToText
from threading import Thread

class Speech:

    def __init__(self):
        self.current_speech = None
        self.speech_to_text = SpeechToText()

    # create thread for capturing speech
    def start(self):
        thread = Thread(target=self._update_speech, args=())
        # daemon thread, so it does not keep the program alive after the movie ends
        thread.daemon = True
        thread.start()

    # keep listening, storing the latest thing the director said
    def _update_speech(self):
        while True:
            self.current_speech = self.speech_to_text.convert()

    # get the current speech
    def get_current_speech(self):
        return self.current_speech

The Speech class in turn uses the SpeechToText class, which is available on GitHub and incorporates Google Speech Recognition.
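If you want to try the camera cutting without a microphone, a stand-in with the same convert method can be handy. The sketch below assumes only what the Speech class relies on, namely that convert returns the recognised text. ScriptedSpeechToText is a made-up name and is not part of the GitHub code.

```python
# Hypothetical stand-in for SpeechToText: instead of listening to the
# microphone, it hands back a pre-written script one word at a time.
class ScriptedSpeechToText:

    def __init__(self, words):
        self._words = iter(words)

    # same method name the Speech class calls on the real class
    def convert(self):
        # returns None once the script has run out
        return next(self._words, None)


director = ScriptedSpeechToText(['left', 'right', 'back'])
first_cut = director.convert()
second_cut = director.convert()
```

Swapping this in for SpeechToText inside the Speech class would let the main program run through a fixed sequence of cuts.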

Also, if you want to count down to the movie in the main program, use the following code:

from time import sleep

# countdown to movie
countdown = 20
while countdown > 0:
    if countdown == 5:
        print("Ready")
    elif countdown <= 3:
        print(countdown)
    # wait a second between counts
    sleep(1)
    countdown -= 1