Image Synthesis Considerations for Image Refocusing
Figure 1. Causal 5x5 neighborhoods under different
assignment orders: the fixed neighborhood (a) of
scanline-order assignment, and typical
neighborhoods (b), (c), and (d) in confidence-order
assignment. Black pixels are those not included in
the neighborhood.
2.3. Order of Assignment
Recent projects in image synthesis with
neighborhood-based constraints have achieved results
using symmetrical neighborhoods with a scanline or
random assignment order, beginning from initial
pixel value estimates and assigning each pixel more
than once in multiple successive iterations [2, 6]. In
[6], each pixel is weighted by confidence, such that it
bears more strongly on neighborhood comparisons as
it converges to its most probable value.
In multi-resolution texture synthesis – where the
neighborhood used in synthesizing a given resolution
level may extend into lower, already fully assigned
levels – random order [5] and scanline order [4]
assignment are successful even with a causal
neighborhood. Because this synthesis begins with
the lowest-resolution level, which is given, and
proceeds upward, we might say that these algorithms
assign pixels in confidence order. In conjunction
with the causal neighborhood, this order ensures that
the value of any pixel is determined only by pixels
with equal or greater confidence.
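As an illustration of how the causal neighborhood depends on the assignment order (compare Figure 1), the following minimal sketch marks which pixels of a 5 x 5 window have already been assigned when a given pixel is visited. The function name and the `rank` array (each pixel's position in the assignment order) are hypothetical and introduced only for this example.

```python
import numpy as np

def causal_neighborhood(rank, y, x, size=5):
    """Boolean size x size mask of window pixels assigned before (y, x).

    rank[i, j] is the position of pixel (i, j) in the assignment order
    (hypothetical representation; smaller means assigned earlier).
    """
    r = size // 2
    h, w = rank.shape
    mask = np.zeros((size, size), dtype=bool)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            yy, xx = y + dy, x + dx
            # Window positions outside the image stay excluded.
            if 0 <= yy < h and 0 <= xx < w:
                mask[dy + r, dx + r] = rank[yy, xx] < rank[y, x]
    return mask

# Scanline order: rank increases left to right, top to bottom.
scan_rank = np.arange(49).reshape(7, 7)
scan_mask = causal_neighborhood(scan_rank, 3, 3)
```

Under scanline order the mask is the same fixed shape at every interior pixel (the two rows above plus the two pixels to the left), whereas a confidence-derived `rank` would produce a different shape at each pixel.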
We hypothesize that confidence-order assignment is
also beneficial under a symmetrical neighborhood.
However, our image space differs from the multi-
resolution pyramid in that there is no obvious
gradient from more confident to less confident pixels.
The best notion of pixel confidence we are able to
establish prior to assigning any pixels is the
probability P_image of the most probable candidate
value given by the image-scene constraint,
normalized over all candidates. A map of this
confidence metric over a sample input image is
presented in Figure 2.
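The confidence metric described above can be sketched as follows. This is only an illustration under stated assumptions: each pixel is taken to have the same number K of candidate values, and `candidate_scores` is a hypothetical array of non-negative, unnormalized scores from the image-scene constraint. How those scores are computed is not shown here.

```python
import numpy as np

def pixel_confidence(candidate_scores):
    """Normalized probability of the most probable candidate at each pixel.

    candidate_scores: (H, W, K) array of non-negative scores, one per
    candidate value (assumed layout, for illustration only).
    """
    scores = np.asarray(candidate_scores, dtype=float)
    totals = scores.sum(axis=-1)
    best = scores.max(axis=-1)
    # Pixels whose candidates all score zero get zero confidence.
    return np.divide(best, totals, out=np.zeros_like(best), where=totals > 0)

# One pixel with a dominant candidate, one with three equally likely ones.
scores = np.array([[[8.0, 1.0, 1.0], [1.0, 1.0, 1.0]]])
conf = pixel_confidence(scores)
```

The first pixel receives a confidence of 0.8, while the second, whose candidates are equally probable, receives 1/3, the lowest value possible with three candidates.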
In general, the pixels of greatest confidence tend to
lie in the areas of greatest contrast in the misfocused
input image. These correspond to neighborhoods in
the misfocused input image that resemble very few
neighborhoods in the blurred focused input images,
and thus map with high probability to a select few
candidate values. Neighborhoods in large, uniformly
colored regions resemble many patches in the blurred
focused images, and thus map to many candidate
values with roughly equal probability, resulting in
low confidence.
Rather than order the pixels by these raw confidence
values, we give precedence to pixels whose
neighborhoods in W have high confidence, since the
value of a pixel is determined as much by the scene
texture constraint as by the image-scene constraint.
We therefore convolve the pixel confidence map with
a uniform 5 x 5 square kernel to get a neighborhood
confidence map, wherein the value of any pixel is the
average pixel confidence across the 5 x 5
neighborhood around it. Sorting the list of pixels in
W by this neighborhood confidence attribute yields
our order of assignment.
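A minimal sketch of this ordering step is given below. Two details are assumptions rather than statements from the text: border pixels are handled by replicating edge values, and pixels are sorted from highest to lowest neighborhood confidence. The function names and the boolean `mask` representing W are hypothetical.

```python
import numpy as np

def neighborhood_confidence(conf, size=5):
    """Mean of `conf` over a size x size window centered on each pixel."""
    r = size // 2
    padded = np.pad(conf, r, mode="edge")  # assumption: replicate borders
    h, w = conf.shape
    total = np.zeros((h, w), dtype=float)
    # Sum the shifted copies of the map, one per window offset.
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + h, dx:dx + w]
    return total / (size * size)

def assignment_order(conf, mask, size=5):
    """Pixels of W (mask == True) as (row, col), most confident first."""
    nconf = neighborhood_confidence(conf, size)
    rows, cols = np.nonzero(mask)
    # Stable sort keeps row-major order among equally confident pixels.
    order = np.argsort(-nconf[rows, cols], kind="stable")
    return list(zip(rows[order].tolist(), cols[order].tolist()))
```

Averaging before sorting means that an isolated high-confidence pixel surrounded by uncertain ones is visited later than a pixel embedded in a uniformly confident region.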
3. Evaluating the Components
In order to determine the contribution of the
components described in the previous section to the
performance of the overall system, we compared the
output of several variations of the system. For these
evaluations we used the photograph shown in Figure
2(a) as our misfocused input image, and an in-focus
photograph taken from the same view as our lone
focused input image. Along with each input image
we provide the program with a hand-made bitmask
defining the boundary of the object to be refocused.
Refocused values are synthesized only for the pixels
within this boundary in the misfocused input, and our
focused patch library is built only from the pixels
within this boundary in the focused input.
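Restricting the patch library to the masked region might look like the sketch below. The 5 x 5 patch size, the rule that only patch centers need to lie inside the mask, and the skipping of centers too close to the image border are all assumptions made for illustration. The function name is hypothetical.

```python
import numpy as np

def masked_patch_library(image, mask, size=5):
    """Collect size x size patches centered on pixels where mask is True."""
    r = size // 2
    h, w = mask.shape
    centers, patches = [], []
    for y, x in zip(*np.nonzero(mask)):
        # Assumption: skip centers whose window would leave the image.
        if r <= y < h - r and r <= x < w - r:
            centers.append((int(y), int(x)))
            patches.append(image[y - r:y + r + 1, x - r:x + r + 1].copy())
    return centers, patches
```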
In order to acquire an accurate PSF for the
misfocused image, we photographed a simulated
point light source placed at the same depth as the
object, using identical focus and aperture settings.
The point light source was simulated using a
fluorescent desk lamp positioned behind a pinhole
mask, with a diffuser in between. Since the ostensibly
focused image of the light source was not in fact a
point, but instead a few pixels in diameter, we took
our actual PSF to be the function which, when
convolved with the focused image of the light source,
most closely resembled the misfocused image of the
light source, constraining that PSF to be a disk with
integer pixel diameter.
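The PSF fit described above amounts to a one-dimensional search over disk diameters, which can be sketched as follows. The similarity measure (sum of squared differences), the disk rasterization, the zero-padded convolution, and the `max_diameter` bound are assumptions for this illustration. The text does not specify them, and all function names are hypothetical.

```python
import numpy as np

def disk_kernel(diameter):
    """Normalized binary disk with the given integer pixel diameter."""
    c = (diameter - 1) / 2.0
    y, x = np.mgrid[:diameter, :diameter]
    k = ((y - c) ** 2 + (x - c) ** 2 <= (diameter / 2.0) ** 2).astype(float)
    return k / k.sum()

def convolve_same(image, kernel):
    """Zero-padded 'same'-size filtering by shifted sums.

    This computes a correlation, which equals convolution here because
    the disk kernel is symmetric under 180-degree rotation.
    """
    kh, kw = kernel.shape
    top, left = kh // 2, kw // 2
    padded = np.pad(image, ((top, kh - 1 - top), (left, kw - 1 - left)))
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def fit_disk_psf(focused, misfocused, max_diameter=15):
    """Return the diameter and kernel minimizing the squared error (assumed metric)."""
    best_d, best_err = None, np.inf
    for d in range(1, max_diameter + 1):
        blurred = convolve_same(focused, disk_kernel(d))
        err = np.sum((blurred - misfocused) ** 2)
        if err < best_err:
            best_d, best_err = d, err
    return best_d, disk_kernel(best_d)
```

Because the search compares the blurred focused image of the light source against the misfocused one, the finite size of the source is accounted for automatically. Even diameters introduce a half-pixel offset in this simple sketch, which a more careful implementation would need to handle.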