‘Where are my automobile keys?’ I asked Arkwood, my verminous Belgian buddy. ‘I’m not telling you! Not until you add atmospheric sounds to my saunters.’

You see, I wrote some Python code that opened a browser and took a random stroll through Google Street View. Arkwood asked me to add music – for example, if the street was sunny I would play bird sounds; if it was shady I would play some steamy jazz. But my friend’s intentions were not good. No, they were sordid and unwholesome, and I wanted no part of it.

Nevertheless, it did set me off eyeballing OpenCV Histograms. In my first post, OpenCV Histograms for Google Street View, I used grayscale screenshots of Google Street View to calculate histograms and infer how bright each photograph was. My second post, Histograms for Google Street View (Mark II), tested out a few techniques such as reducing the number of histogram bins and utilising an image mask. So what’s next? Colour, that’s what.

Let’s look at the code from my initial post, reworked to handle colour screenshots of Google Street View:

import cv2
import numpy
import ImageGrab
import winsound
from storage import Storage

class Histogram(object):
    SCREENSHOT_FILE = 'gsv/streetview_screenshot{}.jpg'
    HISTOGRAM_FILE = 'gsv/streetview_histogram.txt'
    BLUESOUND_FILE = 'gsv/145297__dxe10__blues-lick-in-a.wav'
    GREENSOUND_FILE = 'gsv/182502__swiftoid__birds-chirping-02.wav'
    REDSOUND_FILE = 'gsv/234469__beansqueso31__skillet-sizzle.wav'
    THRESHOLD = 16

    # initialise
    def __init__(self):
        self.storage = Storage()
        self.screenshot_counter = 0

    # grab screenshot
    def _grab_screenshot(self):
        screenshot = ImageGrab.grab(bbox=(0,100,1500,850))
        screenshot = cv2.cvtColor(numpy.array(screenshot), cv2.COLOR_RGB2BGR)
        cv2.imwrite(self.SCREENSHOT_FILE.format(self.screenshot_counter), screenshot)
        self.screenshot_counter += 1

        return screenshot

    # calculate histogram
    def _calc_histogram(self, img, channel):
        hist = cv2.calcHist([img],[channel],None,[64],[0,256])
        self.storage.write_value(self.HISTOGRAM_FILE, str(hist))
        return hist

    # calculate top bin index
    def _calc_top_bin_index(self, hist):
        top_bin_index = numpy.argmax(hist)
        self.storage.write_value(self.HISTOGRAM_FILE, "Top bin index: {}".format(top_bin_index))
        return top_bin_index

    # play sound of top channel
    def _play_sound(self, channels):
        top_channel = numpy.argmax(channels)
        if(channels[top_channel] < self.THRESHOLD): return

        if top_channel == 0:
            winsound.PlaySound(self.BLUESOUND_FILE, winsound.SND_FILENAME)
        elif top_channel == 1:
            winsound.PlaySound(self.GREENSOUND_FILE, winsound.SND_FILENAME)
        elif top_channel == 2:
            winsound.PlaySound(self.REDSOUND_FILE, winsound.SND_FILENAME)

    # calculate
    def calculate(self):
        # grab screenshot
        img = self._grab_screenshot()

        # get top bin index for each channel
        channels = [0,0,0]
        for i in xrange(len(channels)):
            hist = self._calc_histogram(img, i)
            channels[i] = self._calc_top_bin_index(hist)

        # play sound of top channel
        self._play_sound(channels)

So what has changed? Let’s take a gander at the public method calculate and find out.

We grab a screenshot of Google Street View rendered in our browser, just as before. However, this time we are converting the red-green-blue screenshot to blue-green-red instead of grayscale (OpenCV works with images in blue-green-red).
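To see what the channel reordering amounts to, here is a minimal sketch in plain Python. It is not how cv2.cvtColor works internally; the helper name rgb_to_bgr and the tiny nested-list "image" are made up purely for illustration.

```python
def rgb_to_bgr(image):
    """Reverse the channel order of every pixel in a nested-list image."""
    return [[pixel[::-1] for pixel in row] for row in image]


# A hypothetical 1x2 image: one pure red pixel and one pure blue pixel, in RGB order.
rgb_image = [[(255, 0, 0), (0, 0, 255)]]
bgr_image = rgb_to_bgr(rgb_image)

# After the swap, index 0 of each pixel holds blue and index 2 holds red.
print(bgr_image)
```

This is why the code treats channel 0 as blue, channel 1 as green and channel 2 as red.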

Next, we loop through each blue-green-red channel of the photograph and calculate its histogram, putting the top bin index into a channel array for later use. We calculate our histograms using 64 bins – as mentioned in the previous posts, the darker pixels go in the lower bins and the lighter pixels in the upper bins. To work out which of the 64 bins has the highest number of pixels, we use the numpy.argmax function.
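The binning arithmetic can be sketched without OpenCV. The function below is an illustrative stand-in for the cv2.calcHist and numpy.argmax pair, not their implementation; the name top_bin_index and the sample pixel values are assumptions. With 64 bins spread over the range 0 to 255, each bin covers four intensity values, so a value v lands in bin v // 4.

```python
def top_bin_index(values, bins=64, value_range=256):
    """Return the index of the fullest histogram bin for 8-bit values."""
    bin_width = value_range // bins  # 256 // 64 = 4 intensity levels per bin
    counts = [0] * bins
    for value in values:
        counts[value // bin_width] += 1
    # Like numpy.argmax, pick the first bin when several are equally full.
    return max(range(bins), key=lambda i: counts[i])


# Hypothetical values for one channel: mostly dark, with a few bright pixels.
sample_channel = [32, 33, 34, 35, 35, 130, 200, 201]
print(top_bin_index(sample_channel))  # 8, because 32..35 all fall in bin 8
```

A top bin index of 8 therefore means the most common intensities sit somewhere between 32 and 35 on the 0 to 255 scale.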

All that’s left to do is play a sound depending on the brightness of the photograph. Again, we utilise the numpy.argmax function, this time to determine which of our channels – blue, green or red – has the highest top bin index. If this index falls below our threshold of 16 then my computer speakers stay silent. Otherwise, we play a sound relating to the winning channel – the blue channel plays a nice blues guitar riff; the green channel plays nature’s bird tweets; the red channel plays sizzling meat.
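The decision rule can be tried out on its own. This is a rough sketch rather than the class method itself: pick_sound and the SOUNDS labels are hypothetical stand-ins, and the function returns a label instead of calling winsound so that it runs on any platform.

```python
THRESHOLD = 16
SOUNDS = ("blues riff", "bird tweets", "sizzling meat")  # blue, green, red order


def pick_sound(channels, threshold=THRESHOLD):
    """Return the sound label for the winning channel, or None for silence."""
    # First index wins on a tie, matching numpy.argmax.
    top_channel = max(range(len(channels)), key=lambda i: channels[i])
    if channels[top_channel] < threshold:
        return None
    return SOUNDS[top_channel]


# Sample top bin indices in blue, green, red order.
print(pick_sound([8, 9, 9]))     # None
print(pick_sound([25, 27, 32]))  # sizzling meat
```

Note that a channel only has to reach the threshold, not exceed it, so a top bin index of exactly 16 still triggers a sound.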

Okay, let’s have a demo. Here’s the first screenshot:


Top bin index: 8 [Blue]
Top bin index: 9 [Green]
Top bin index: 9 [Red]

Well, none of our colour channels came close to breaching the threshold of 16, so no sound file was played. Clearly the photograph is too dark.

Here’s the next screenshot:


Top bin index: 25 [Blue]
Top bin index: 27 [Green]
Top bin index: 32 [Red]

That’s more like it! All of our channels have easily beaten the threshold, with the winning colour being red. The bright red flag and shopfront triumph in an altogether dazzling photograph. Sounds of sizzling meat fill the air.

So there you have it. OpenCV histograms can help us understand whether a photograph is shady or sunny. They can even let us know if one colour is particularly luminous and bountiful.

Right, where has that miscreant Arkwood gone with my automobile keys? Honestly, these friends of mine.


Here’s a graph for each photograph, showing the blue, green and red channels across the 64 bins. First up, our photograph of the taxis and girls outside the nightclubs:


And of our shops and flag:


The OpenCV documentation provides the code to generate these graphs.

I sourced the blues riff, bird tweets and sizzling meat from Freesound.

I used Python Tools for Visual Studio to run the Python code on a Windows 7 PC.

I used Internet Explorer 11 to render the Google Street View photographs of Hong Kong Island.