In my post Camera pose using OpenCV and OpenGL I used the OpenCV computer vision library and the OpenGL graphics library to project a cube onto an optical glyph. What does this mean? Well, here’s a glyph:

[Image: glyph_01]

And here’s another one:

[Image: glyph_02]

If I hold a glyph in front of the webcam, my Python code will detect it and draw a 3D object on top of it, matching the position and orientation of the glyph. We have augmented reality!

For this post I will project a cone and a sphere instead. I will do the following:

  1. Create my 3D objects using Blender
  2. Import my 3D objects into OpenGL
  3. Add support for multiple glyph detection using OpenCV
  4. Draw my 3D objects using OpenGL

Let’s go!

Create 3D objects using Blender

Okay, so this is my first time using Blender. Blender is free and open source and will allow me to create 3D objects that I can import into OpenGL.

I downloaded Blender 2.76-rc3 Release Candidate for Windows 64-bit.

Once installed, I set about learning the basics, as I wanted to create some shapes with textures. The YouTube video Making A Textured Cube In Blender by Sri Harsha Chilakapati helped me out (it also explained how to export my creations in Wavefront OBJ format, for use with OpenGL).
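
For the curious, a Wavefront OBJ file is just plain text. Here’s a small hand-written excerpt (illustrative only, not an actual Blender export) showing the gist of the format: v lines are vertex positions, vt texture coordinates, vn normals, and f faces that index into the three lists:

# illustrative OBJ excerpt, not a real Blender export
v 0.0 0.0 0.0
v 1.0 0.0 0.0
v 1.0 1.0 0.0
vt 0.0 0.0
vt 1.0 0.0
vt 1.0 1.0
vn 0.0 0.0 1.0
f 1/1/1 2/2/1 3/3/1

A companion .mtl material file names the texture image to apply to the faces.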

Import 3D objects into OpenGL

With my shapes in hand, I now set about importing them into OpenGL. Thankfully, there is already code out there to do this, and I plumped for the OBJFileLoader hosted on Pygame. Indeed, the only amendment I had to make was to comment out the lines:

glEnable(GL_TEXTURE_2D)

And:

glDisable(GL_TEXTURE_2D)

As I don’t want my augmented reality experience being drawn over!
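
For reference, using the loader is simple; a minimal sketch (the same calls appear in my main program at the foot of this post):

from OpenGL.GL import *
from objloader import *

cone = OBJ('cone.obj')    # parse the .obj file and compile an OpenGL display list
glCallList(cone.gl_list)  # draw the compiled geometry (called each frame)

The OBJ class reads the vertices, normals, texture coordinates and faces, and compiles them into a display list, so drawing the shape later is a single call.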

Add support for multiple glyph detection using OpenCV

I am using OpenCV computer vision to detect glyphs in the webcam feed. If you want a breakdown of how this all works, check out my post Glyph recognition using OpenCV and Python.

My Python code now supports multiple glyph detection. Each glyph has a unique pattern, so one glyph will render a cone and another will render a sphere. All the code is at the foot of this post.
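
Each detected glyph comes back from my Glyphs class as a rotation vector, a translation vector and a shape name, so handling multiple glyphs is just a loop. A minimal sketch of using the detector on its own (test_frame.jpg is a hypothetical image containing one or more glyphs):

import cv2
from glyphs import Glyphs

glyphs = Glyphs()
frame = cv2.imread('test_frame.jpg')  # hypothetical test image

for rvecs, tvecs, glyph_name in glyphs.detect(frame):
    print(glyph_name)  # "cone" or "sphere", per constants.py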

Draw 3D objects using OpenGL

Now that we have code in place to import our shapes and detect our glyphs, all that’s left is to create some augmented reality by drawing the cone and sphere. My post Augmented Reality using OpenCV and OpenGL will help explain some of the OpenGL code.

Again, the main OpenGL program is at the foot of this post, but let’s first take a look at it in action:

[Video: the cone and sphere drawn live on their glyphs]

Superb! Our cone and sphere are in the correct position and rotation. And they look quite smart too.

Well, that’s a whistle-stop tour of augmented reality using OpenCV, OpenGL and Blender. Please drop me a line if you want me to explain anything in greater detail. Right now, I’m off to turn a few pages of The Inscrutable Diaries Of Rodger Saltwash.

Ciao!

P.S.

I ran the code on my Windows 7 PC using Python Tools for Visual Studio.

Here’s my main OpenGL program:

from OpenGL.GL import *
from OpenGL.GLUT import *
from OpenGL.GLU import *
import cv2
from PIL import Image
import numpy as np
from webcam import Webcam
from glyphs import Glyphs
from objloader import *
from constants import *

class OpenGLGlyphs:
 
    # constants
    INVERSE_MATRIX = np.array([[ 1.0, 1.0, 1.0, 1.0],
                               [-1.0,-1.0,-1.0,-1.0],
                               [-1.0,-1.0,-1.0,-1.0],
                               [ 1.0, 1.0, 1.0, 1.0]])

    def __init__(self):
        # initialise webcam and start thread
        self.webcam = Webcam()
        self.webcam.start()

        # initialise glyphs
        self.glyphs = Glyphs()

        # initialise shapes
        self.cone = None 
        self.sphere = None

        # initialise texture
        self.texture_background = None

    def _init_gl(self, Width, Height):
        glClearColor(0.0, 0.0, 0.0, 0.0)
        glClearDepth(1.0)
        glDepthFunc(GL_LESS)
        glEnable(GL_DEPTH_TEST)
        glShadeModel(GL_SMOOTH)
        glMatrixMode(GL_PROJECTION)
        glLoadIdentity()
        gluPerspective(33.7, 1.3, 0.1, 100.0)
        glMatrixMode(GL_MODELVIEW)
        
        # assign shapes
        self.cone = OBJ('cone.obj')
        self.sphere = OBJ('sphere.obj')

        # assign texture
        glEnable(GL_TEXTURE_2D)
        self.texture_background = glGenTextures(1)

    def _draw_scene(self):
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)
        glLoadIdentity()

        # get image from webcam
        image = self.webcam.get_current_frame()

        # convert image to OpenGL texture format
        bg_image = cv2.flip(image, 0)
        bg_image = Image.fromarray(bg_image)
        ix = bg_image.size[0]
        iy = bg_image.size[1]
        # tobytes() replaces the tostring() call removed from recent Pillow
        bg_image = bg_image.tobytes("raw", "BGRX", 0, -1)
 
        # create background texture
        glBindTexture(GL_TEXTURE_2D, self.texture_background)
        glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST)
        glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST)
        glTexImage2D(GL_TEXTURE_2D, 0, 3, ix, iy, 0, GL_RGBA, GL_UNSIGNED_BYTE, bg_image)
        
        # draw background
        glBindTexture(GL_TEXTURE_2D, self.texture_background)
        glPushMatrix()
        glTranslatef(0.0,0.0,-10.0)
        self._draw_background()
        glPopMatrix()

        # handle glyphs
        self._handle_glyphs(image)

        glutSwapBuffers()

    def _handle_glyphs(self, image):

        # attempt to detect glyphs
        glyphs = []

        try:
            glyphs = self.glyphs.detect(image)
        except Exception as ex: 
            print(ex)

        if not glyphs: 
            return

        for glyph in glyphs:
            
            rvecs, tvecs, glyph_name = glyph

            # build view matrix
            rmtx = cv2.Rodrigues(rvecs)[0]

            view_matrix = np.array([[rmtx[0][0], rmtx[0][1], rmtx[0][2], tvecs[0][0]],
                                    [rmtx[1][0], rmtx[1][1], rmtx[1][2], tvecs[1][0]],
                                    [rmtx[2][0], rmtx[2][1], rmtx[2][2], tvecs[2][0]],
                                    [0.0,        0.0,        0.0,        1.0        ]])

            # element-wise multiply with INVERSE_MATRIX flips the y and z axes,
            # converting OpenCV's camera coordinates (y down, z forward) to
            # OpenGL's (y up, z backward)
            view_matrix = view_matrix * self.INVERSE_MATRIX

            # transpose to the column-major layout that glLoadMatrixd expects
            view_matrix = np.transpose(view_matrix)

            # load view matrix and draw shape
            glPushMatrix()
            glLoadMatrixd(view_matrix)

            if glyph_name == SHAPE_CONE:
                glCallList(self.cone.gl_list)
            elif glyph_name == SHAPE_SPHERE:
                glCallList(self.sphere.gl_list)

            glPopMatrix()

    def _draw_background(self):
        # draw background
        glBegin(GL_QUADS)
        glTexCoord2f(0.0, 1.0); glVertex3f(-4.0, -3.0, 0.0)
        glTexCoord2f(1.0, 1.0); glVertex3f( 4.0, -3.0, 0.0)
        glTexCoord2f(1.0, 0.0); glVertex3f( 4.0,  3.0, 0.0)
        glTexCoord2f(0.0, 0.0); glVertex3f(-4.0,  3.0, 0.0)
        glEnd( )

    def main(self):
        # setup and run OpenGL
        glutInit()
        glutInitDisplayMode(GLUT_RGBA | GLUT_DOUBLE | GLUT_DEPTH)
        glutInitWindowSize(640, 480)
        glutInitWindowPosition(800, 400)
        self.window_id = glutCreateWindow("OpenGL Glyphs")
        glutDisplayFunc(self._draw_scene)
        glutIdleFunc(self._draw_scene)
        self._init_gl(640, 480)
        glutMainLoop()
 
# run an instance of OpenGL Glyphs 
openGLGlyphs = OpenGLGlyphs()
openGLGlyphs.main()
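
Note that the program expects cone.obj and sphere.obj (my Blender exports) in the working directory, along with the webcam_calibration_ouput.npz calibration data that glyphfunctions.py loads.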

glyphs.py

import cv2
from glyphfunctions import *
from glyphdatabase import *

class Glyphs:
    
    QUADRILATERAL_POINTS = 4
    BLACK_THRESHOLD = 100
    WHITE_THRESHOLD = 155

    def detect(self, image):

        glyphs = []

        # Stage 1: Detect edges in image
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        gray = cv2.GaussianBlur(gray, (5,5), 0)
        edges = cv2.Canny(gray, 100, 200)

        # Stage 2: Find contours (note: OpenCV 3.x returns an extra first
        # value here; OpenCV 2.4 and 4.x return two)
        contours, _ = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
        contours = sorted(contours, key=cv2.contourArea, reverse=True)[:10]

        for contour in contours:

            # Stage 3: Shape check
            perimeter = cv2.arcLength(contour, True)
            approx = cv2.approxPolyDP(contour, 0.01*perimeter, True)

            if len(approx) == self.QUADRILATERAL_POINTS:

                # Stage 4: Perspective warping
                topdown_quad = get_topdown_quad(gray, approx.reshape(4, 2))

                # Stage 5: Border check (a genuine glyph has a black border,
                # so sample a point 5% in from the corner)
                if topdown_quad[int(topdown_quad.shape[0] * 0.05),
                                int(topdown_quad.shape[1] * 0.05)] > self.BLACK_THRESHOLD: continue

                # Stage 6: Match glyph pattern
                glyph_pattern = get_glyph_pattern(topdown_quad, self.BLACK_THRESHOLD, self.WHITE_THRESHOLD)
                glyph_found, _, glyph_name = match_glyph_pattern(glyph_pattern)

                if glyph_found:

                    # Stage 7: Get rotation and translation vectors
                    rvecs, tvecs = get_vectors(image, approx.reshape(4, 2))
                    glyphs.append([rvecs, tvecs, glyph_name])

        return glyphs

glyphfunctions.py

import cv2
import numpy as np

def order_points(points):

    # the top-left corner has the smallest x + y sum and the bottom-right the
    # largest; y - x is smallest at the top-right and largest at the bottom-left
    s = points.sum(axis=1)
    diff = np.diff(points, axis=1)
    
    ordered_points = np.zeros((4,2), dtype="float32")

    ordered_points[0] = points[np.argmin(s)]
    ordered_points[2] = points[np.argmax(s)]
    ordered_points[1] = points[np.argmin(diff)]
    ordered_points[3] = points[np.argmax(diff)]

    return ordered_points

def max_width_height(points):

    (tl, tr, br, bl) = points

    top_width = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
    bottom_width = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    max_width = max(int(top_width), int(bottom_width))

    left_height = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    right_height = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    max_height = max(int(left_height), int(right_height))

    return (max_width,max_height)

def topdown_points(max_width, max_height):
    return np.array([
        [0, 0],
        [max_width-1, 0],
        [max_width-1, max_height-1],
        [0, max_height-1]], dtype="float32")

def get_topdown_quad(image, src):

    # src and dst points
    src = order_points(src)

    (max_width,max_height) = max_width_height(src)
    dst = topdown_points(max_width, max_height)
 
    # warp perspective
    matrix = cv2.getPerspectiveTransform(src, dst)
    warped = cv2.warpPerspective(image, matrix, (max_width, max_height))

    # return top-down quad
    return warped

def get_glyph_pattern(image, black_threshold, white_threshold):

    # sample one pixel from the centre of each pattern cell, left to right,
    # top to bottom; the glyph is a 5x5 grid, so with tenths of the image as
    # half-cell units, the inner 3x3 cell centres sit at 30%, 50% and 70%
    cells = []

    cell_half_width = int(round(image.shape[1] / 10.0))
    cell_half_height = int(round(image.shape[0] / 10.0))

    row1 = cell_half_height*3
    row2 = cell_half_height*5
    row3 = cell_half_height*7
    col1 = cell_half_width*3
    col2 = cell_half_width*5
    col3 = cell_half_width*7

    cells.append(image[row1, col1])
    cells.append(image[row1, col2])
    cells.append(image[row1, col3])
    cells.append(image[row2, col1])
    cells.append(image[row2, col2])
    cells.append(image[row2, col3])
    cells.append(image[row3, col1])
    cells.append(image[row3, col2])
    cells.append(image[row3, col3])

    # threshold pixels to either black or white
    for idx, val in enumerate(cells):
        if val < black_threshold:
            cells[idx] = 0
        elif val > white_threshold:
            cells[idx] = 1
        else:
            return None

    return cells

def get_vectors(image, points):
    
    # order points
    points = order_points(points)

    # load calibration data
    with np.load('webcam_calibration_ouput.npz') as X:
        mtx, dist, _, _ = [X[i] for i in ('mtx','dist','rvecs','tvecs')]
  
    # set up criteria, image, points and axis
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    imgp = np.array(points, dtype="float32")

    # the glyph modelled as a unit square in its own plane
    objp = np.array([[0., 0., 0.], [1., 0., 0.],
                     [1., 1., 0.], [0., 1., 0.]], dtype="float32")

    # calculate rotation and translation vectors (OpenCV 3.x+ also returns a
    # success flag first: _, rvecs, tvecs, _ = cv2.solvePnPRansac(...))
    cv2.cornerSubPix(gray, imgp, (11,11), (-1,-1), criteria)
    rvecs, tvecs, _ = cv2.solvePnPRansac(objp, imgp, mtx, dist)

    return rvecs, tvecs
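
A quick worked example of order_points and its sum/difference trick:

import numpy as np
from glyphfunctions import order_points

# corners of a unit square, deliberately shuffled
points = np.array([[1, 0], [0, 1], [0, 0], [1, 1]], dtype="float32")

print(order_points(points))
# [[0. 0.]     top-left     (smallest x + y)
#  [1. 0.]     top-right    (smallest y - x)
#  [1. 1.]     bottom-right (largest x + y)
#  [0. 1.]]    bottom-left  (largest y - x)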

glyphdatabase.py

from constants import *

# glyph table: each record holds the four 90-degree rotations of a
# glyph's 3x3 pattern, plus the shape it maps to
GLYPH_TABLE = [[[[0, 1, 0, 1, 0, 0, 0, 1, 1],
                 [0, 0, 1, 1, 0, 1, 0, 1, 0],
                 [1, 1, 0, 0, 0, 1, 0, 1, 0],
                 [0, 1, 0, 1, 0, 1, 1, 0, 0]], SHAPE_CONE],
               [[[1, 0, 0, 0, 1, 0, 1, 0, 1],
                 [0, 0, 1, 0, 1, 0, 1, 0, 1],
                 [1, 0, 1, 0, 1, 0, 0, 0, 1],
                 [1, 0, 1, 0, 1, 0, 1, 0, 0]], SHAPE_SPHERE]]

# match glyph pattern to database record
def match_glyph_pattern(glyph_pattern):
    glyph_found = False
    glyph_rotation = None
    glyph_name = None
    
    for glyph_record in GLYPH_TABLE:
        for idx, val in enumerate(glyph_record[0]):    
            if glyph_pattern == val: 
                glyph_found = True
                glyph_rotation = idx
                glyph_name = glyph_record[1]
                break
        if glyph_found: break

    return (glyph_found, glyph_rotation, glyph_name)
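
Each glyph appears in the table four times, once per 90-degree rotation, so a glyph is matched whichever way up it faces the camera. Those four rows needn’t be typed out by hand; here’s a sketch of generating them with NumPy (np.rot90 reproduces the table’s counter-clockwise rotation order):

import numpy as np

def pattern_rotations(pattern):
    # return the four 90-degree (counter-clockwise) rotations of a
    # 9-cell glyph pattern, in the order used by GLYPH_TABLE
    grid = np.array(pattern).reshape(3, 3)
    return [list(np.rot90(grid, k).flatten()) for k in range(4)]

# pattern_rotations([0, 1, 0, 1, 0, 0, 0, 1, 1]) reproduces the four cone rows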

constants.py

# shape constants
SHAPE_CONE = "cone"
SHAPE_SPHERE = "sphere"

webcam.py

import cv2
from threading import Thread
 
class Webcam:
 
    def __init__(self):
        self.video_capture = cv2.VideoCapture(0)
        self.current_frame = self.video_capture.read()[1]
         
    # create a daemon thread for capturing images,
    # so it won't block program exit
    def start(self):
        Thread(target=self._update_frame, args=(), daemon=True).start()
 
    def _update_frame(self):
        while(True):
            self.current_frame = self.video_capture.read()[1]
                 
    # get the current frame
    def get_current_frame(self):
        return self.current_frame