=Overview

We consider two important structures in understanding and editing images: modeling regular, program-like texture or patterns in 2D planes, and 3D posing of these planes in the scene. Unlike prior work on image-based program synthesis, which assumes the image contains a single visible 2D plane, we present Box Program Induction (BPI), which infers a program-like scene representation that simultaneously models repeated structure on multiple 2D planes, the 3D position and orientation of the planes, and camera parameters, all from a single image. Our model assumes a box prior, i.e., that the image captures either an inner view or an outer view of a box in 3D. It uses neural networks to infer visual cues such as vanishing points, wireframe lines, or plane segmentations to guide a search-based algorithm to find the program that best explains the image. Such a holistic, structured scene representation enables 3D-aware interactive image editing operations such as inpainting missing pixels, changing camera parameters, and extrapolate the image contents.

Figure 1: We present Box Program Induction (BPI), which infers a program-like scene representation that simultaneously models repeated structure on multiple 2D planes, the 3D position and orientation of the planes, and camera parameters, all from a single image. The inferred program can be used to guide perspective-aware and regularity-aware image manipulation tasks, including image inpainting, extrapolation, and view synthesis.

=Resources

=Related Publications

Perspective Plane Program Induction from a Single Image

Yikai Li*, Jiayuan Mao*, Xiuming Zhang, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

(*: First two authors contributed equally.)

Program-Guided Image Manipulators

Jiayuan Mao*, Xiuming Zhang*, Yikai Li, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

(*: First two authors contributed equally; order determined by a coin toss.)

Learning to Describe Scenes with Programs

Yunchao Liu, Zheng Wu, Daniel Ritchie William T. Freeman, Joshua B. Tenenbaum, and Jiajun Wu