Introduction
In computer vision, there is often a need to track the motion of an object or blob across the frames of a video sequence. Various methods have been investigated in the literature for this purpose. One very interesting method for motion estimation and tracking is optical flow. In simple terms, optical flow measures the movement of a pixel or a block between two consecutive frames. This movement is given in the form of a vector, where the magnitude of the vector signifies the amount of motion and its angle specifies the direction of the motion. This vector is called a motion vector.
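Concretely, a motion vector is just a displacement (dx, dy), and its magnitude and angle follow from basic trigonometry. The Python snippet below illustrates this with a made-up vector; the numbers are purely hypothetical:

import math

# Hypothetical motion vector: a block moved 3 px right and 4 px down.
dx, dy = 3, 4
magnitude = math.hypot(dx, dy)              # amount of motion: 5.0 pixels
angle = math.degrees(math.atan2(dy, dx))    # direction: ~53.13 degrees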
Example
Consider the image below:
Concentrate on the objects in the image above and the motion vectors drawn on them. Analyzing the motion vectors, we can see that a bicycle is moving towards the left while two pedestrians are moving towards the right. The speed of their motion is indicated by the magnitude of the motion vectors and their direction by the angle of the motion vectors.
How are motion vectors calculated?
Motion vectors can be defined at the block level or, sometimes, at the pixel level as well. The pixel-level calculation is called dense optical flow. In real-time computer vision applications, motion vectors are usually calculated at the block level.
Let's see how the optical flow algorithm works. Consider two successive frames A and B. The current frame A is divided into small blocks (say of size 8x8). For each block in frame A, we then try to find its closest match in frame B. Various similarity or distance measures can be used to compare the block against each candidate block and find the best match; these include the sum of absolute differences (SAD), Euclidean distance, and correlation. When the best match is found, a vector is drawn from the block's location in frame A to the location of the matched block in frame B.
This step is repeated for each block of frame A, and as a result we get an image with vectors drawn on top that highlight the motion of objects in the scene. The result of the optical flow algorithm can be seen in the sequence of pictures given below:
In the figure above, four successive frames are analyzed and optical flow is calculated for them. The green lines are the motion vectors, with their end points shown as red dots. We can see that objects that are stationary across frames have motion vectors of zero length (shown by only red dots), while blocks that move more have larger motion vectors.
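To make the block-matching procedure concrete, here is a minimal sketch in Python with NumPy. It assumes the two frames are grayscale NumPy arrays; the block size, search radius, and the choice of SAD as the distance measure are illustrative assumptions, not the only possible ones:

import numpy as np

def block_matching_flow(frame_a, frame_b, block=8, radius=4):
    # Brute-force block matching between two grayscale frames.
    # Returns one (dy, dx) motion vector per block of frame_a, found by
    # minimizing the sum of absolute differences (SAD) over a small
    # search window in frame_b.
    h, w = frame_a.shape
    vectors = np.zeros((h // block, w // block, 2), dtype=np.int32)
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            ref = frame_a[by:by + block, bx:bx + block].astype(np.int32)
            best = None
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    y, x = by + dy, bx + dx
                    if y < 0 or x < 0 or y + block > h or x + block > w:
                        continue
                    cand = frame_b[y:y + block, x:x + block].astype(np.int32)
                    sad = int(np.abs(ref - cand).sum())
                    if best is None or sad < best[0]:
                        best = (sad, dy, dx)
            vectors[by // block, bx // block] = best[1:]
    return vectors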
Applications
This analysis of motion vectors is very useful in studying the motion in a video, which can then be used for various applications. These include:
- Object tracking
- Video compression
- Video stabilization
- Structure from motion
OpenCV for calculating Optical Flow
In OpenCV, there are various implementations for calculating optical flow. The general principle is simple. The program reads a video and then extracts its frames. The frames are passed to the optical flow function in pairs, because an optical flow function takes the previous and current frames as input arguments along with a few other parameters. For example, the OpenCV function for finding optical flow with Gunnar Farneback's algorithm is given below. It calculates dense optical flow, i.e. motion vectors are calculated for each pixel in the frame.
void calcOpticalFlowFarneback(InputArray prev, InputArray next, InputOutputArray flow, double pyr_scale, int levels, int winsize, int iterations, int poly_n, double poly_sigma, int flags)
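From Python, the same function is available as cv2.calcOpticalFlowFarneback. A minimal sketch of a single call is shown below; the parameter values are the ones commonly used in the OpenCV examples and are illustrative rather than mandatory:

import cv2

vid = cv2.VideoCapture("test.avi")
_, frame1 = vid.read()
_, frame2 = vid.read()
prev_gray = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
next_gray = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

# flow is an H x W x 2 float array holding a (dx, dy) vector per pixel.
flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)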
Similarly, there is another function in OpenCV that calculates optical flow for a sparse feature set using the iterative Lucas-Kanade method with pyramids. It is given here:
void calcOpticalFlowPyrLK(InputArray prevImg, InputArray nextImg, InputArray prevPts, InputOutputArray nextPts, OutputArray status, OutputArray err, Size winSize, int maxLevel, TermCriteria criteria, int flags, double minEigThreshold)
Here, prevPts is the vector of 2D points for which the flow needs to be found, and nextPts is the output vector of 2D points containing the calculated new positions of the input features in the second image.
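A minimal Python sketch of this sparse version is given below. The points to track are picked with cv2.goodFeaturesToTrack, and the window size, pyramid depth, and termination criteria are the illustrative values used in the OpenCV tutorials:

import cv2

vid = cv2.VideoCapture("test.avi")
_, frame1 = vid.read()
_, frame2 = vid.read()
prev_gray = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
next_gray = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

# Pick corner-like points in the first frame to track (prevPts).
prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                                   qualityLevel=0.3, minDistance=7)

# nextPts holds the new positions; status marks points whose flow was found.
next_pts, status, err = cv2.calcOpticalFlowPyrLK(
    prev_gray, next_gray, prev_pts, None,
    winSize=(15, 15), maxLevel=2,
    criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))

good_new = next_pts[status == 1]   # tracked positions in the second frame
good_old = prev_pts[status == 1]   # their original positions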
Details of the other arguments of the two optical flow functions given above are as follows:
- prevImg – first 8-bit input image or pyramid
- nextImg – second input image or pyramid of the same size
- status – output status vector (of unsigned chars); each element of the vector is set to 1 if the flow for the corresponding features has been found, otherwise, it is set to 0.
- err – output vector of errors; each element of the vector is set to an error for the corresponding feature, type of the error measure can be set in flags parameter; if the flow wasn’t found then the error is not defined (use the status parameter to find such cases).
- winSize – size of the search window at each pyramid level.
- maxLevel – 0-based maximal pyramid level number; if set to 0, pyramids are not used (single level), if set to 1, two levels are used, and so on; if pyramids are passed to input then algorithm will use as many levels as pyramids have but no more than maxLevel.
- criteria – parameter specifying the termination criteria of the iterative search algorithm
- pyr_scale – parameter specifying the image scale (<1) to build pyramids for each image; pyr_scale=0.5 means a classical pyramid, where each layer is half the size of the previous one.
- levels – number of pyramid layers including the initial image
- winsize – averaging window size
- iterations – number of iterations the algorithm does at each pyramid level.
- poly_n – size of the pixel neighborhood used to find polynomial expansion in each pixel; larger values mean that the image will be approximated with smoother surfaces, yielding a more robust algorithm and a more blurred motion field; typically poly_n = 5 or 7.
- poly_sigma – standard deviation of the Gaussian that is used to smooth derivatives used as a basis for the polynomial expansion; for poly_n=5, you can set poly_sigma=1.1, for poly_n=7, a good value would be poly_sigma=1.5.
For beginners, the following related functions are also useful:
To read a video:
import cv2

vid = cv2.VideoCapture("test.avi")
To extract a frame from the video vid:
ret, frame1 = vid.read()   # read() returns a (success_flag, frame) pair
Two extracted frames are passed to the optical flow function, which calculates the relative motion between the two frames. The result of a dense optical flow function looks something like this:
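A common way to produce such a picture is to map each flow vector's angle to hue and its magnitude to brightness. The sketch below follows the HSV visualization used in the OpenCV tutorials; the file name and parameter values are illustrative:

import cv2
import numpy as np

vid = cv2.VideoCapture("test.avi")
ret, frame1 = vid.read()
prev_gray = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)

hsv = np.zeros_like(frame1)
hsv[..., 1] = 255                      # full saturation everywhere

while True:
    ret, frame2 = vid.read()
    if not ret:                        # stop at the end of the video
        break
    next_gray = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    # Convert each (dx, dy) vector to polar form: angle -> hue, length -> value.
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv[..., 0] = ang * 180 / np.pi / 2
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
    cv2.imshow("dense optical flow", cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR))
    if cv2.waitKey(30) & 0xFF == 27:   # press Esc to quit early
        break
    prev_gray = next_gray

vid.release()
cv2.destroyAllWindows()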
Conclusion
In this overview, we have learned about optical flow, motion vectors, and their calculation. We will look at further applications in the Video Compression and Video Stabilization articles. More details on calculating optical flow can be found on the OpenCV website.