Tracking Objects: Acquiring and Analyzing Image Sequences in MATLAB

By Dan Lee, MathWorks and Steve Eddins, MathWorks


Four-dimensional arrays are about to become a lot more common in MATLAB®. With the new Image Acquisition Toolbox, you can easily stream images from your frame grabbers and scientific cameras directly into MATLAB, often as an array with four dimensions: height, width, color, and time.

In this article, we’ll demonstrate how to get video image sequences into MATLAB and illustrate basic object tracking techniques using the Image Processing Toolbox.

Image Acquisition

A typical image acquisition session includes these steps:

Step 1. Connect to the device.
    Toolbox functions: videoinput
    Inputs: adaptor name, device ID, video format (optional)

Step 2. Configure acquisition properties and preview the results.
    Toolbox functions: set, get, inspect, preview
    Inputs: video input object, property names, and settings

Step 3. Acquire and process data.
    Toolbox functions: start, getdata, getsnapshot
    Inputs: video input object, number of frames

Step 4. Disconnect from the device and free resources.
    Toolbox functions: clear, delete
    Inputs: video input object

First, we connect to a Windows video device using the videoinput function. (Use imaqhwinfo to determine your device’s identifier number and supported video formats.)

vid = videoinput('winvideo', 1, 'RGB24_352x288')
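
If you are not sure which device ID or format string applies to your hardware, imaqhwinfo can list them. A minimal sketch (the fields you inspect depend on the adaptors and devices installed on your system):

info = imaqhwinfo('winvideo');         % devices available through the winvideo adaptor
info.DeviceInfo(1).SupportedFormats    % video formats supported by the first device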

Next, we specify that we want to acquire 50 frames at 3 frames per second.

set(vid, 'FramesPerTrigger', 50)
set(getselectedsource(vid), 'FrameRate', 3)
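
Before starting, it can help to confirm framing and lighting with a live preview. This optional check uses the toolbox preview function:

preview(vid)        % open a live preview window for the video input object
closepreview(vid)   % close the preview window when you are done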

Now we start acquiring images.

start(vid)

By default, acquisition begins immediately after the start call. Because the acquisition runs in the background, the MATLAB command line remains free, so we could begin processing frames while they are still being acquired. In this example the processing does not need to start before the acquisition finishes, so we use the wait function to block until the acquisition stops. The getdata function then transfers the acquired images into the MATLAB workspace.

wait(vid)
[f, t] = getdata(vid);
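
If you did want to process frames while the acquisition is still running, one rough sketch (an optional aside, not used in this example) is to drain frames one at a time as they become available:

while islogging(vid) || get(vid, 'FramesAvailable') > 0
    frame = getdata(vid, 1);    % retrieve a single frame as soon as it is available
    % ... process the frame here ...
end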

Variable f is a four-dimensional array of size 288-by-352-by-3-by-50. It represents an image sequence with 288 rows, 352 columns, 3 color components, and 50 frames. The vector t contains the time stamps for each frame. We display the tenth frame using the Image Processing Toolbox function imview. The frame shows a ball, attached by a string to the ceiling, swinging over the state of Alabama.

imview(f(:,:,:,10))
Image A. The tenth acquired frame.

Before continuing, we disconnect from the camera to enable other applications to use it, and clear vid from the workspace.

delete(vid)
clear vid

Object Tracking

Now that we have the image sequence in MATLAB, we'll explore two simple techniques for tracking the ball: frame differencing and background subtraction. We'll use functions in the Image Processing Toolbox.

Frame Differencing

The absolute difference between successive frames can be used to divide an image frame into changed and unchanged regions. Since only the ball moves, we expect the changed region to be associated only with the ball, or possibly with its shadow.

To begin, we convert each frame to grayscale using rgb2gray. Running the loop “backwards,” from numframes down to 1, is a common MATLAB programming trick to ensure that g is initialized to its final size the first time through the loop.

numframes = size(f, 4);
for k = numframes:-1:1
    g(:, :, k) = rgb2gray(f(:, :, :, k));
end
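
If you prefer explicit preallocation to the backwards-loop trick, a roughly equivalent version looks like this (this assumes the acquired frames are uint8, as they are for the RGB24 format used here):

g = repmat(uint8(0), [size(f, 1) size(f, 2) numframes]);   % preallocate the grayscale stack
for k = 1:numframes
    g(:, :, k) = rgb2gray(f(:, :, :, k));                  % convert frame k to grayscale
end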

Next, we compute frame differences using imabsdiff.

for k = numframes-1:-1:1
    d(:, :, k) = imabsdiff(g(:, :, k), g(:, :, k+1));
end
imview(d(:, :, 1), [])
Image B. Absolute difference between the first two frames.

The two bright spots correspond to the ball locations in frames 1 and 2. The dim spots are the ball’s shadow. The function graythresh computes a threshold that divides an image into background and foreground pixels. Since graythresh returns a normalized value in the range [0,1], we must scale it to fit our data range, [0,255].

thresh = graythresh(d)
bw = (d >= thresh * 255);
imview(bw(:, :, 1))
Image C. Thresholded difference image for the first frame.

As you can see, the resulting binary image has a small extra spot that should be removed. The technique we’ll use, area opening, removes objects in a binary image that are too small. The call to bwareaopen

bw2 = bwareaopen(bw, 20, 8);

removes all objects containing fewer than 20 pixels. The third argument, 8, tells bwareaopen to assume that pixels are connected only to their immediate 8 neighbors in each frame. bwareaopen will then treat bw as a sequence of two-dimensional images rather than one three-dimensional image.
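
To make the connectivity argument concrete, the single three-dimensional call above should match processing each frame on its own. A per-frame sketch of the same operation:

for k = size(bw, 3):-1:1
    bw2(:, :, k) = bwareaopen(bw(:, :, k), 20, 8);   % remove objects smaller than 20 pixels in frame k
end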

Finally, we label each individual object (using bwlabel) and compute its corresponding center of mass (using regionprops).

s = regionprops(bwlabel(bw2(:,:,1)), 'centroid');
c = [s.Centroid]
c =

  226.8231   53.1538  260.3750   43.3167
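
As an optional check (a sketch, not part of the original workflow), you can mark these centroids on the first frame. Recall that c interleaves x and y coordinates, so the odd entries are x and the even entries are y:

imshow(f(:, :, :, 1))
hold on
plot(c(1:2:end), c(2:2:end), 'r+')   % mark the centroid of each detected object
hold off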

Background Subtraction

Another approach to tracking the ball is to estimate the background image and subtract it from each frame. Here we estimate the background as the pixel-wise maximum over several neighboring frames: because the moving ball is darker than the scene behind it, the maximum at each pixel is dominated by the background. That is exactly what grayscale morphological dilation computes if you use a structuring element oriented along the frame (third) dimension.

background = imdilate(g, ones(1, 1, 5));
imview(background(:,:,1))
Image D. Background estimate for the first frame.
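
For intuition, this dilation amounts to a pixel-wise sliding maximum over a five-frame window (clipped at the ends of the sequence). A loop-based sketch of the same idea, writing into a separate array background2 introduced here only for comparison:

background2 = g;                                       % preallocate with the size and class of g
for k = 1:numframes
    idx = max(1, k-2):min(numframes, k+2);             % up to five neighboring frames
    background2(:, :, k) = max(g(:, :, idx), [], 3);   % pixel-wise maximum over the window
end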

Next, we compute the absolute difference between each frame and its corresponding background estimate. Since the array of grayscale frames, g, and the array of background images, background, have the same size, we don't need a loop.

d = imabsdiff(g, background);
thresh = graythresh(d);
bw = (d >= thresh * 255);

Now we want to compute the location of the ball in each frame. As before, some frames contain small extra spots, most of which result from the ball’s shadow. We solve this problem by assuming that the ball is the largest object in each frame.

centroids = zeros(numframes, 2);
for k = 1:numframes
    L = bwlabel(bw(:, :, k));                  % label the connected objects in frame k
    s = regionprops(L, 'area', 'centroid');
    area_vector = [s.Area];
    [tmp, idx] = max(area_vector);             % index of the largest object (the ball)
    centroids(k, :) = s(idx(1)).Centroid;
end

Visualization

To finish this example, let’s visualize the ball’s motion by plotting the centroid locations as a function of time:

subplot(2, 1, 1)
plot(t, centroids(:,1)), ylabel('x')
subplot(2, 1, 2)
plot(t, centroids(:, 2)), ylabel('y')
xlabel('time (s)')

Figure 1. Ball location versus time. 
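
Another quick view (optional, not part of the original figure) is the trajectory in the image plane; axis ij flips the y-axis so the plot matches image coordinates:

figure
plot(centroids(:, 1), centroids(:, 2), '.-')
axis ij                                       % image coordinates: y increases downward
xlabel('x (pixels)'), ylabel('y (pixels)')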

This article should get you started with mixing MATLAB, cameras, four-dimensional arrays, and a little image processing. If you want to experiment with this data, download the Gravity Measurement Case Study from MATLAB Central.

Published 2003