2D, 3D, Augmented Reality, Camera Calibration, camera matrix, distortion coefficients, extrinsic parameters, intrinsic parameters, Lego, OpenCV, Pose Estimation, Python, Python Tools for Visual Studio, Webcam
Wouldn’t it be fine and dandy to add augmented reality to a Lego scene? You say No, but stick with me on this.
Before we get going though, we need to calibrate our webcam and prove some basic 3D effects.
Why calibrate our webcam? Well, we need to sort out any picture distortion – such as straight lines appearing curved, or some areas of the image appearing closer than expected – before we start to render our 3D objects. The OpenCV Camera Calibration article provides all the detail.
Here’s my checklist of what I’ll be using:
- Windows 7 PC
- Python Tools for Visual Studio
- Logitech HD720p webcam
- Printout of a ‘chessboard’ grid
- An unnerving aptitude for tedium
Okay, here’s my grid printout through the eye of my webcam:
I used Microsoft Word tables to construct the grid, but sensible people would probably just print the chessboard pattern that OpenCV provides online.
The idea is that we will project 3D images onto this grid, which will be placed within a Lego scene. And when I say Lego scene, I mean a couple of Lego policemen with a Lego motorbike and Lego ambulance. Probably a Lego nurse administering CPR. That sort of thing.
So let the calibration begin! First up, I want to obtain 10 sample images of my grid at different angles and locations (pretty much mimicking the OpenCV sample images at sources/samples/cpp/left01.jpg – left14.jpg).
Here’s the code I used to display and save snaps from my webcam:
```python
from webcam import Webcam
import cv2
from datetime import datetime

webcam = Webcam()
webcam.start()

while True:

    # get image from webcam
    image = webcam.get_current_frame()

    # display image
    cv2.imshow('grid', image)
    cv2.waitKey(3000)

    # save image to file, if pattern found
    ret, corners = cv2.findChessboardCorners(cv2.cvtColor(image, cv2.COLOR_BGR2GRAY), (7,6), None)

    if ret == True:
        filename = datetime.now().strftime('%Y%m%d_%Hh%Mm%Ss%f') + '.jpg'
        cv2.imwrite("pose/sample_images/" + filename, image)
```
You’ll notice that I am only saving images that have corners detected (if we can’t find corners on the grid, then the snap is of no use to us).
And here’s my Webcam class – used in the code above – which runs in a thread:
```python
import cv2
from threading import Thread

class Webcam:

    def __init__(self):
        self.video_capture = cv2.VideoCapture(0)
        # read() returns a (success, frame) tuple - we only want the frame
        self.current_frame = self.video_capture.read()[1]

    # create thread for capturing images
    def start(self):
        Thread(target=self._update_frame, args=()).start()

    def _update_frame(self):
        while(True):
            self.current_frame = self.video_capture.read()[1]

    # get the current frame
    def get_current_frame(self):
        return self.current_frame
```
Note: in previous OpenCV projects I found that the webcam images were lagging, i.e. the frame displayed on screen was not the current frame but up to 5 frames behind, because VideoCapture buffers frames internally. To solve this, my Webcam class reads frames continuously in a thread, so get_current_frame always returns the latest one.
I used the free Python Tools for Visual Studio to run the code on my Windows 7 PC.
Here’s one of the 10 sample images:
Next up, we loop through our 10 images and build arrays to store our object points (3D points in real world space) and image points (2D points in image plane) of the grid corners. The OpenCV Camera Calibration article provides the code.
Note: I had to amend the article code slightly to work with my version of OpenCV, 2.4.9. Some of the OpenCV functions return None rather than a result, and when that None was next used the program blew up (yes, my PC actually caught fire). An example of how to fix this: remove the corners2 variable, as cv2.cornerSubPix refines the corners parameter in place.
```python
cv2.cornerSubPix(gray, corners, (11,11), (-1,-1), criteria)
```
Great. Let’s take a peek at one of our sample images with the grid corners drawn upon it (courtesy of the cv2.drawChessboardCorners function):
Okay, now that we have our arrays to store our object and image points, let’s go calibrate that camera:
```python
# calibrate webcam and save output
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, gray.shape[::-1], None, None)
np.savez("pose/webcam_calibration_ouput", ret=ret, mtx=mtx, dist=dist, rvecs=rvecs, tvecs=tvecs)
```
cv2.calibrateCamera consumes our object and image points, yielding the distortion coefficients, the intrinsic parameters (the camera matrix, which holds the focal lengths and optical centre) and the extrinsic parameters (the rotation and translation vectors that locate the grid relative to the camera). We save all these parameters to disk – they will help us create 3D effects.
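Later scripts can pull those parameters straight back off disk. A quick sketch (note that np.savez tacks .npz onto the filename):

```python
import numpy as np

# reload the calibration output saved above; fall back gracefully if the
# calibration hasn't been run yet
try:
    with np.load("pose/webcam_calibration_ouput.npz") as calibration:
        mtx = calibration['mtx']    # 3x3 camera matrix (intrinsics)
        dist = calibration['dist']  # distortion coefficients k1, k2, p1, p2, k3
except IOError:
    mtx, dist = None, None
```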
The OpenCV Camera Calibration article also provides code to test out our calibration efforts. Here’s a brand new image of the grid – our future Lego scene – taken from my webcam:
And here it is once we’ve applied the cv2.undistort function:
It would take a keen eye to spot the difference, but flipping between the two images shows a correction in the grid squares. And we can use the Re-projection Error calculation in the OpenCV article to gauge how accurate our calibration is.
Now that we have our webcam calibrated, let’s create some 3D effects! The OpenCV Pose Estimation article provides all the detail (including code).
Here’s the axis points projected onto our future Lego scene:
And here’s a cube, which could become a virtual jail for our Lego criminals:
Oh, the Lego policemen are salivating at the thought of a spanking new building to bang up all those vagabonds.
Well, that’s that. Time to turn a few pages of The Inscrutable Diaries Of Rodger Saltwash and unwind. Then it’s back to the drawing board, and turning those 3D effects into mind-shattering Lego augmented reality. Watch this space!