Structure From Motion (SFM) Capture Guide

Structure From Motion (SFM) is a photogrammetric range imaging technique for estimating three-dimensional structures from two-dimensional image sequences that may be coupled with local motion signals. It is studied in the fields of computer vision and visual perception.

In this post, we will learn how to capture photos that give the best possible 3D reconstruction. First of all, keep in mind that, as a general rule, the more well-overlapped photographs you take, the better the model will be.

Here are some suggestions for the camera/lens configuration:

  • Choose your distance from the subject and focus. Then tape the focus ring in place.
  • Use prime lenses rather than zoom lenses. If a zoom lens must be used, set it to either end of its zoom range (widest or longest).
  • The camera’s aperture must remain constant during the capture sequence. On a 35mm camera, it is good practice not to stop down beyond f/11 (that is, avoid f/16, f/22, and so on). At apertures smaller than f/11, diffraction blurs the image and significantly reduces the camera’s effective resolution.
  • Use the lowest possible ISO setting. The higher the ISO setting, the more electronic noise is generated in the camera sensor. This noise makes the matching of pixels in different photographs more difficult.
  • Turn off image stabilization and auto-rotate camera functions.
  • The camera should be set to aperture priority mode (preferably f/5.6–f/11 to get the sharpest images).
  • To obtain the best results, ensure that the camera configuration does not change within a given sequence of photos.
  • If a change of camera or lens configuration is necessary, group the subsequent photos together in a different set from the previous photos. Calibrate the sets of photos separately.
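
To make the f/11 advice above more concrete, here is a minimal sketch that compares the size of the diffraction blur spot (the Airy disk, roughly 2.44 × wavelength × f-number) with the size of one pixel. The 550 nm wavelength (green light) and the sensor figures in the usage example are illustrative assumptions, not values from this guide, and the function names are hypothetical.

```python
def airy_disk_diameter_um(f_number: float, wavelength_nm: float = 550.0) -> float:
    """Diameter of the Airy disk (out to the first dark ring), in micrometres.

    Uses the standard approximation d = 2.44 * wavelength * f-number.
    The 550 nm default is an assumed mid-spectrum (green) wavelength.
    """
    return 2.44 * (wavelength_nm / 1000.0) * f_number


def pixel_pitch_um(sensor_width_mm: float, image_width_px: int) -> float:
    """Width of one pixel on the sensor, in micrometres."""
    return sensor_width_mm * 1000.0 / image_width_px


if __name__ == "__main__":
    # Hypothetical camera: 36 mm wide sensor, 6000 pixels across.
    pitch = pixel_pitch_um(36.0, 6000)
    for n in (5.6, 8, 11, 16, 22):
        d = airy_disk_diameter_um(n)
        print(f"f/{n}: blur spot {d:.1f} um, about {d / pitch:.1f} pixels wide")
```

The blur spot grows in direct proportion to the f-number, so each further stop down spreads every point of the subject over more pixels. Where exactly the softening becomes visible depends on the sensor and the lens, so treat the numbers as a rough guide only.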

How to determine where to take the photographs:

  • To maintain a consistent 66% overlap, move the camera from left to right by a distance equal to 34% of the camera’s field of view between photographs.
  • Be sure to begin the first row of photos positioned such that two-thirds of the field of view is to the left of the imaging subject.
  • Ensure the entire subject is covered by at least three frames.
  • Proceed systematically from left to right along the length of the subject and take as many photos as necessary to ensure complete coverage.
  • For higher quality results and greater imaging redundancy, which help reduce point-matching and depth uncertainty:
    • Raise the camera vertically and aim the camera downward 15 degrees to re-photograph the previously captured area.
    • At the same time, rotate the camera 90 degrees to portrait mode and use the same 66% overlap from left to right.
    • When the second row is finished, lower the camera vertically and aim the camera upward 15 degrees to re-photograph the previously captured area.
    • Rotate the camera 180 degrees (for a total of 270 degrees), and again capture the area in the same way.
  • It is important to maintain a consistent distance from the subject.
  • For multi-resolution applications, or to increase or decrease resolution, change the camera position (closer to or farther from the subject) or the focal length of the lens (for example, from 24mm to 50mm). Between one set of photos and the next, change the resolution by no more than a factor of two (up to twice, or down to one half, the resolution of the previous set).
    • Follow this rule for as many sets of photos as necessary to reach the desired resolution.
    • Calibrate each set of photos separately.
  • Because of the flexibility of this technique, it is possible to obtain high-accuracy 3D data from subjects at almost any orientation (horizontal, vertical, above, or below) relative to the camera position.
  • For round subjects, capture photos every 10 to 15 degrees and overlap the beginning and end photos to complete the circuit.
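
The spacing rules above come down to simple arithmetic, sketched below as a planning aid. It assumes a simple pinhole model (frame width on the subject = distance × sensor width / focal length), which is only an approximation and works best when the subject is far away compared with the focal length. The function names and the example numbers are hypothetical, and two choices are assumptions of this sketch rather than rules from the guide: a row ends with two-thirds of the frame to the right of the subject (mirroring the start), and closing a circuit takes one extra frame that repeats the starting position.

```python
import math


def footprint_width(distance: float, focal_length_mm: float, sensor_width_mm: float) -> float:
    """Scene width covered by one frame (pinhole approximation).

    The result is in the same unit as `distance`.
    """
    return distance * sensor_width_mm / focal_length_mm


def step_between_shots(footprint: float, overlap: float = 0.66) -> float:
    """Sideways camera move between frames for the given overlap fraction."""
    return footprint * (1.0 - overlap)


def shots_per_row(subject_width: float, footprint: float, overlap: float = 0.66) -> int:
    """Frames in one left-to-right row.

    The first frame has two-thirds of its width left of the subject, as in
    the guide. Ending with two-thirds of the frame right of the subject is
    an assumption made here by symmetry.
    """
    travel = subject_width + footprint / 3.0
    return math.ceil(travel / step_between_shots(footprint, overlap)) + 1


def shots_per_circuit(angle_step_deg: float) -> int:
    """Frames around a round subject, plus one that repeats the start (assumed)."""
    return math.ceil(360.0 / angle_step_deg) + 1


def resolution_sets_needed(ratio: float) -> int:
    """Additional photo sets needed to change resolution by `ratio` overall,
    when each set may differ from the previous one by at most a factor of 2."""
    return math.ceil(abs(math.log2(ratio)))


if __name__ == "__main__":
    # Hypothetical setup: 50 mm lens, 36 mm wide sensor, 2 m from a 5 m wide subject.
    width = footprint_width(2.0, 50.0, 36.0)
    print(f"Frame covers {width:.2f} m; move {step_between_shots(width):.2f} m per shot")
    print(f"Shots in one row: {shots_per_row(5.0, width)}")
    print(f"Shots per circuit at 15 degrees: {shots_per_circuit(15)}")
    print(f"Sets needed for 8x resolution: {resolution_sets_needed(8)}")
```

Multiply the row count by the number of rows (three, if you add the raised and lowered passes) to estimate the total number of photos before going on site.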

Additionally, a couple of important things to bear in mind: always seek the landowner’s permission to be on site, and take notes in the field (camera model and settings used, weather conditions, site name and location, type of monument being recorded, and condition of the object recorded).