
Arkwood was furious. The postman had left his parcel of top-shelf literature with a neighbour whilst he was out, and the magazines were soiled. ‘Not to worry,’ I told him, ‘I will write some Python code that will instruct your postie to leave any future parcels by the back garden gate’.

First off, I need to set up a webcam to monitor his front door, so as to detect when a visitor comes knocking. An excellent article by Matthias Stein provided the code for working with the Intel-developed OpenCV library, which I wrapped into a handy class:

import cv2
from datetime import datetime
class Webcam(object):
    # obtain changes between images
    def _delta(self, t0, t1, t2):
        d1 = cv2.absdiff(t2, t1)
        d2 = cv2.absdiff(t1, t0)
        return cv2.bitwise_and(d1, d2)
    # wait until motion is detected
    def detect_motion(self):
        # set up webcam
        webcam = cv2.VideoCapture(0)
        # set up surveillance window
        window_name = "Arkwood's Surveillance"
        cv2.namedWindow(window_name, cv2.WINDOW_AUTOSIZE)
        # set motion threshold
        threshold = 170000
        # hold three b/w images at any one time
        t_minus = cv2.cvtColor(webcam.read()[1], cv2.COLOR_BGR2GRAY)
        t = cv2.cvtColor(webcam.read()[1], cv2.COLOR_BGR2GRAY)
        t_plus = cv2.cvtColor(webcam.read()[1], cv2.COLOR_BGR2GRAY)
        # now let's loop until we detect some motion
        while True:
          # obtain the changes between our three images
          delta = self._delta(t_minus, t, t_plus)
          # display changes in surveillance window
          cv2.imshow(window_name, delta)
          cv2.waitKey(10)  # give HighGUI a moment to render the frame

          # obtain white pixel count i.e. where motion detected
          count = cv2.countNonZero(delta)
          # debug
          #print (count)
          # if the threshold has been breached, save some snaps to disk
          # and get the hell out of function...
          if (count > threshold):
              filename = datetime.now().strftime('%Y%m%d_%Hh%Mm%Ss%f') + '.jpg'
              cv2.imwrite('WebCam/Motion/' + filename, delta)
              cv2.imwrite('WebCam/Photograph/' + filename, webcam.read()[1])
              # tidy up and get the hell out of the function
              webcam.release()
              cv2.destroyWindow(window_name)
              return
          # ...otherwise, let's handle a new snap
          t_minus = t
          t = t_plus
          t_plus = cv2.cvtColor(webcam.read()[1], cv2.COLOR_BGR2GRAY)

The Webcam class has a detect_motion function, which connects to the webcam attached to my Raspberry Pi and streams the captured images into a window on the Pi’s desktop. The while loop is concerned with whether the images have changed significantly (by comparing affected pixels against a threshold), and if so, we store two snaps of the detected bod (the pixel image and a nice coloured photograph) before exiting.
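To see why ANDing the two difference images isolates genuine motion, here's a minimal sketch of the same three-frame trick using plain NumPy in place of cv2 — the absolute difference and `&` stand in for cv2.absdiff and cv2.bitwise_and, and the tiny 4x4 "frames" are made up purely for illustration:

```python
import numpy as np

def delta(t0, t1, t2):
    # pixels that changed in BOTH consecutive frame pairs survive the AND
    d1 = np.abs(t2.astype(int) - t1.astype(int)).astype(np.uint8)
    d2 = np.abs(t1.astype(int) - t0.astype(int)).astype(np.uint8)
    return d1 & d2

t0 = np.zeros((4, 4), dtype=np.uint8)  # frame at time t-1: empty scene
t1 = np.zeros((4, 4), dtype=np.uint8)  # frame at time t
t2 = np.zeros((4, 4), dtype=np.uint8)  # frame at time t+1
t1[1, 1] = 255  # blob appears then vanishes: changed in both frame pairs
t2[2, 2] = 255  # blob appears only in the last frame: changed in one pair

d = delta(t0, t1, t2)
count = np.count_nonzero(d)  # stands in for cv2.countNonZero
# only the (1, 1) pixel survives the AND, so count == 1
```

Note that a bitwise AND only approximates "changed in both pairs" — two non-zero diffs with no overlapping bits (say 2 and 1) AND to zero — but with real movement the surviving pixel count is more than enough to breach the threshold.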

Now that we have detected a person at the front door, the next thing to do is to determine if it is the postman. For this we will use Google’s text to speech service to ask the visitor some questions, and Google’s speech to text service to obtain the visitor’s replies. I’ve covered the finer details of these services in a previous post, but here’s my Speech class for reference:

from subprocess import Popen, PIPE, call
import urllib
class Speech(object):
    # converts speech to text
    def speech_to_text(self, filepath):
        try:
            # utilise PiAUISuite to turn speech into text
            text = Popen(['sudo', filepath], stdout=PIPE).communicate()[0]
            # tidy up text
            text = text.replace('"', '').strip()
            # debug
            #print (text)
            return text
        except Exception:
            print ("Error translating speech")
    # converts text to speech
    def text_to_speech(self, text):
        try:
            # truncate text as google only allows 100 chars
            text = text[:100]
            # encode the text
            query = urllib.quote_plus(text)
            # build endpoint
            endpoint = "http://translate.google.com/translate_tts?tl=en&q=" + query
            # get google to translate and mplayer to play
            call(["mplayer", endpoint], shell=False, stdout=PIPE, stderr=PIPE)
        except Exception:
            print ("Error translating text")
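The class above targets Python 2 (hence urllib.quote_plus); under Python 3 the same truncate-and-encode step lives in urllib.parse. Here's a sketch of just that step — build_tts_url is a hypothetical helper name of my own, and the 100-character cap mirrors the limit the class works around:

```python
from urllib.parse import quote_plus

def build_tts_url(text, base="http://translate.google.com/translate_tts"):
    # the endpoint refuses requests over 100 characters,
    # so truncate before URL-encoding
    text = text[:100]
    return base + "?tl=en&q=" + quote_plus(text)

url = build_tts_url("Are you the postman?")
# spaces become '+', '?' becomes '%3F'
```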

All that is left is to code up the main program:

from webcam import Webcam
from speech import Speech

# wait until motion detected at front door
Webcam().detect_motion()
# ask visitor for identification
Speech().text_to_speech("State your name punk")

# capture the visitor's reply
visitor_name = Speech().speech_to_text('/home/pi/PiAUISuite/VoiceCommand/speech-recog.sh')

# ask visitor if postman
Speech().text_to_speech("Are you the postman " + visitor_name)

# capture the visitor's reply
is_postman = Speech().speech_to_text('/home/pi/PiAUISuite/VoiceCommand/speech-recog.sh')

# if postman, provide instruction
if (is_postman == "yes"):
    Speech().text_to_speech("Please leave the parcel at the back gate and leave")
    Speech().text_to_speech("Fuck off")

Once the Webcam class indicates that a person has been detected in the front garden, a speaker asks the visitor for identification and a microphone listens for a reply. It then enquires as to whether the visitor is the postman, and if the visitor says yes then an instruction is issued regarding the parcel. Seamless.
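One fragile spot in that flow: the reply is compared to the exact string "yes", but speech-to-text output may well come back capitalised, padded with whitespace, or embellished ("Yes mate"). A small helper along these lines — is_affirmative is a hypothetical name, not part of the code above — would make the check more forgiving:

```python
def is_affirmative(reply):
    # normalise whatever the speech-to-text returned before comparing;
    # transcriptions can arrive as "Yes", " yes ", "yes please", etc.
    return reply.strip().lower().startswith("yes")
```

The main program's `if (is_postman == "yes"):` line would then become `if is_affirmative(is_postman):`.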

Click on the following screenshot to see the program output, along with the desktop window sporting the pixel image:


Arkwood was beside himself with joy as we rigged up the hardware and waited for the postman. ‘Let’s not answer the door when he comes!’

So, I guess you’ll want to know the outcome? Well, for some reason the postman did not take kindly to our virtual host and ripped up Arkwood’s jazz mags, stuffing them into a hedge. Charming. ‘Don’t worry,’ I said to my distraught Belgian pervert, ‘We have his coloured photograph saved on the Pi’s SD card, we’ll get our revenge.’

Now, you must excuse me as I have to implement the face recognition features of OpenCV, so as to spot the letter-handling culprit and administer some automated punishment.