In my previous post, OpenCV Histograms for Google Street View, I wrote some Python code to calculate the histograms of Google Street View photographs.

What the devil is a histogram? you ask me. Well, it is a count of how many pixels sit at each brightness level, and I am using it to work out whether each photograph is of a shady alley or a sun-soaked boulevard. Once I know the answer, I will play music appropriate to the scene, i.e. funky jazz if the photo is dark and bird tweets if the photo is light.

Why the hell would you want to do that? you exclaim, throwing your hands in the air. It’s all part of a promise I made to my buddy, Arkwood. You see, I once wrote a program that could take a random stroll through Google Street View. Arkwood was thrilled. Now he wants music, so that the experience can be more… well… sordid.

Let’s try to improve the code from the previous post.

hist = cv2.calcHist([img],[0],None,[16],[0,256])

So, what has changed here? Well, the fourth parameter to the calcHist function (histSize, the number of bins) has dropped from 256 to 16. Instead of a separate bin for each of the 256 intensity values in a grayscale image, we are now grouping our pixels into 16 bins, each covering 16 consecutive intensities. The code still works as before, with the darker pixels in the lower bins and the lighter pixels in the upper bins, only that by grouping pixel values together we can hope to get a steadier reading of the photograph’s brightness, one less swayed by a spike at a single intensity.
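To make the grouping concrete, here is a minimal sketch in plain Python. It assumes the 16 bins are of equal width over the range 0 to 256, which is how calcHist behaves with its default uniform bins. The helper names and the toy histogram are made up for illustration and are not part of the code from the post.

```python
# Hypothetical helpers for illustration only.
LEVELS = 256            # grayscale intensities 0..255
BINS = 16               # the new bin count
WIDTH = LEVELS // BINS  # 16 consecutive intensities share one bin


def bin_index(intensity):
    """Return the 16-bin index for a grayscale intensity in 0..255."""
    return intensity // WIDTH


def regroup(hist256):
    """Collapse a 256-entry histogram into 16 bins by summing neighbours."""
    return [sum(hist256[i:i + WIDTH]) for i in range(0, LEVELS, WIDTH)]


# Toy 256-bin histogram: three nearby dark spikes and one lone bright spike.
hist256 = [0] * LEVELS
hist256[33] = 50
hist256[39] = 60
hist256[44] = 55
hist256[200] = 58

hist16 = regroup(hist256)

print(bin_index(39))           # 2
print(hist16[2], hist16[12])   # 165 58
```

With 256 bins the dark spike at 39 only narrowly beats the bright spike at 200 (60 against 58). With 16 bins the three dark spikes pool into bin 2 and the dark cluster wins comfortably.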

Okay, let’s see how the photographs from my previous post get on now that I am using 16 bins. I’ve adjusted the threshold to 4, so if the top bin index (the index of the bin holding the most pixels) is below the threshold we play dark funky music, otherwise we play light and airy bird tweets:

streetview_screenshot0

Top bin index: 2

streetview_screenshot5

Top bin index: 2

streetview_screenshot7

Top bin index: 2

This time, all three images are well below our chosen threshold and the funky music plays.
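The decision rule itself is tiny. Here is a rough sketch of it, assuming the histogram has been turned into a flat list of 16 counts (calcHist hands back a 16x1 array, so it would need flattening first). The function names and the two toy histograms are my own inventions for the example.

```python
THRESHOLD = 4  # the 16-bin threshold quoted in the post


def top_bin_index(hist):
    """Index of the fullest bin; the first one wins if there is a tie."""
    return max(range(len(hist)), key=lambda i: hist[i])


def pick_soundtrack(hist, threshold=THRESHOLD):
    """Dark photo (peak below the threshold) gets jazz, otherwise birds."""
    if top_bin_index(hist) < threshold:
        return "funky jazz"
    return "bird tweets"


# Toy histograms, not real Street View data.
murky = [1, 5, 9, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
bright = [0, 1, 2, 3, 4, 5, 9, 4, 2, 1, 0, 0, 0, 0, 0, 0]

print(top_bin_index(murky), pick_soundtrack(murky))    # 2 funky jazz
print(top_bin_index(bright), pick_soundtrack(bright))  # 6 bird tweets
```

Note that a peak sitting exactly on the threshold (index 4) counts as light, because the test is strictly "below".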

Let’s have a gander at another photograph in our random stroll through the streets of Google:

streetview_screenshot2

Top bin index: 6

Clearly all the light tiles on the shopfront have pushed the snap well above the threshold. Not exactly a photograph suited to bird tweets, mind. Nevertheless, using fewer bins to store our pixel values does afford us more consistency when describing whether a photograph is murky or not.

There’s one more thing I would like to try, though. If you look closely at the photographs, you’ll see that Google has put the name of the road in a black box at top left. This concerns me. What if my histogram calculation is being skewed by the box, forcing more black pixels into the lower bins? Let’s try using a mask, as described in the OpenCV Histograms documentation, to remove the offending rectangle:

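import cv2
import numpy

# get mask: keep rows 120-749 and columns 0-1499, ignore everything else
mask = numpy.zeros(img.shape[:2], numpy.uint8)
mask[120:750, 0:1500] = 255

# calculate histogram over the masked region only
hist = cv2.calcHist([img],[0],mask,[16],[0,256])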

We simply create a mask for our calcHist function, so as to ignore the top slice of the photograph. So how did our masked snaps get on?
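Before the results, here is a small sketch of what the mask is doing, run on a synthetic image rather than a real screenshot. The 800x1500 size, the grey level and the position of the fake black box are all assumptions for the demo. I have also swapped in numpy.histogram as a stand-in for calcHist so that the example needs nothing but NumPy.

```python
import numpy

# Synthetic grayscale "photo": mid-grey everywhere (made-up size and value)...
img = numpy.full((800, 1500), 100, numpy.uint8)
# ...with a made-up black label box near the top-left corner.
img[20:80, 20:400] = 0

# Same mask as in the post: non-zero pixels are the ones that get counted.
mask = numpy.zeros(img.shape[:2], numpy.uint8)
mask[120:750, 0:1500] = 255

# Pixels that survive the mask.
kept = img[mask > 0]

# numpy.histogram stands in for cv2.calcHist here: 16 equal bins over 0..256.
hist_all, _ = numpy.histogram(img, bins=16, range=(0, 256))
hist_kept, _ = numpy.histogram(kept, bins=16, range=(0, 256))

print(kept.size)                   # 945000, i.e. 630 rows x 1500 columns
print(hist_all[0], hist_kept[0])   # 22800 0
```

The black box fills bin 0 in the unmasked histogram and vanishes once the mask is applied. The slice also discards every row from 750 downwards, so if a screenshot is taller than that, a strip along the bottom is ignored as well as the strip along the top.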

streetview_screenshot0_masked

For 16 bins, our top bin index is still 2. For 256 bins, our top bin index has dropped from 39 to 34.

streetview_screenshot5_masked

For 16 bins, our top bin index is still 2. For 256 bins, our top bin index has risen from 40 to 42.

streetview_screenshot7_masked

For 16 bins, our top bin index is still 2. For 256 bins, our top bin index has risen from 41 to 44.

What does this tell us? When using 16 bins, the omission of the black box has had no effect on the top bin index. When using 256 bins we see a minor shift in the top bin index, though not always in the same direction. However, chopping off the whole top of the photograph is not the same as just removing the black box. Plus, the black box has white writing on it, which may counter-balance the blackness.
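If one did want to remove just the box rather than the whole top slice, the mask could be inverted: start with everything switched on, then switch off the rectangle. The sketch below is untested against real screenshots, and the box coordinates are pure guesses. The function name is hypothetical.

```python
import numpy


def box_only_mask(shape, top, bottom, left, right):
    """Mask that keeps every pixel except one rectangle.

    shape is the image shape (only the first two entries are used);
    rows top..bottom-1 and columns left..right-1 are excluded.
    """
    mask = numpy.full(shape[:2], 255, numpy.uint8)
    mask[top:bottom, left:right] = 0
    return mask


# Hypothetical box position; the real coordinates would have to be
# measured from an actual Street View screenshot.
mask = box_only_mask((800, 1500), 20, 80, 20, 400)

excluded = int((mask == 0).sum())
print(excluded)              # 22800
print(mask.size - excluded)  # 1177200
```

The resulting array could be passed as the third argument to calcHist in place of the top-slice mask. It would throw away the white writing along with the black background, so the counter-balancing question would need a separate test.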

Well, it’s been a hoot. But now ’tis time for some rum and a bed. Lest Arkwood provokes a breakdown.

Ciao!

P.S.

Here’s the histogram output for our first photograph, when using 16 bins:

[[  14340.]
 [ 121830.]
 [ 257828.]
 [ 217896.]
 [ 140902.]
 [  66767.]
 [  35488.]
 [  28071.]
 [  16921.]
 [  12659.]
 [   9418.]
 [   7426.]
 [   5028.]
 [   3227.]
 [   3175.]
 [   4024.]]

Note that we are handling a zero-based array, so top value 257828 is at bin index 2.
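As a quick check on that printout, here is a plain Python sketch that finds the top bin and translates it back into a range of pixel intensities, assuming 16 equal-width bins over 0 to 256. The counts are copied from the output above; the variable names are my own.

```python
# Counts copied from the 16-bin histogram printed above.
hist16 = [14340, 121830, 257828, 217896, 140902, 66767, 35488, 28071,
          16921, 12659, 9418, 7426, 5028, 3227, 3175, 4024]

# Zero-based index of the fullest bin.
top = max(range(len(hist16)), key=lambda i: hist16[i])

# Each bin spans 256 / 16 = 16 consecutive intensities.
width = 256 // len(hist16)
low, high = top * width, top * width + width - 1

# Share of all counted pixels that land in the top bin.
total = sum(hist16)
share = hist16[top] / total

# The 256-bin peaks quoted earlier in the post, masked and unmasked.
peaks_256 = [39, 34, 40, 42, 41, 44]
same_bin = all(p // width == top for p in peaks_256)

print(top, low, high)    # 2 32 47
print(total)             # 945000
print(round(share, 3))   # 0.273
print(same_bin)          # True
```

So bin index 2 covers intensities 32 to 47, and every one of the 256-bin peaks reported for the three photographs, with or without the mask, falls inside that range. That is why the 16-bin top index never budged. The counts total 945,000, which matches the 630 x 1500 region kept by the mask, so this printout may well have been taken with the mask applied.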

And if we wish to eyeball our photograph with the mask on top, just save to disk the following image:

masked_img = cv2.bitwise_and(img,img,mask = mask)

I sourced the funky music and bird tweets from Freesound:
https://www.freesound.org/people/Julezhaze/sounds/73621/
https://www.freesound.org/people/swiftoid/sounds/182502/

I used Python Tools for Visual Studio to run the Python code on a Windows 7 PC.

I used Internet Explorer 11 to render the Google Street View photographs of Hong Kong Island.
