Visual Perception Laboratory







Research Topics


    Insight problem solving - symmetry

    Consider the following problem: I have two quart beakers. Beaker A holds a pint of coffee and beaker B holds a pint of milk. I take a cup, say a measuring glass, and (i) fill it with coffee from A, pour it into B, and mix thoroughly; then (ii) return a cupful of the mixture to A and mix again. Now each beaker again holds a pint of liquid. Is the concentration of milk in A the same as the concentration of coffee in B? See my book "Problem Solving" for explanations.
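
    For readers who want to check their answer numerically, the two pouring steps can be simulated with exact fractions. This is only a sketch of the arithmetic, not the explanation given in the book. It assumes that one pint equals two cups and that mixing is perfect; the helper name transfer and the variable names are illustrative.

```python
from fractions import Fraction


def transfer(src, dst, amount):
    """Move `amount` cups of perfectly mixed liquid from src to dst."""
    total = sum(src.values())  # volume in src before anything is removed
    for liquid in list(src):
        moved = src[liquid] * amount / total
        src[liquid] -= moved
        dst[liquid] = dst.get(liquid, 0) + moved


# Assumption: one pint is two cups. Exact fractions avoid rounding noise.
beaker_a = {"coffee": Fraction(2), "milk": Fraction(0)}
beaker_b = {"coffee": Fraction(0), "milk": Fraction(2)}

transfer(beaker_a, beaker_b, Fraction(1))  # step (i): a cup of coffee into B
transfer(beaker_b, beaker_a, Fraction(1))  # step (ii): a cup of mixture back into A

milk_in_a = beaker_a["milk"] / sum(beaker_a.values())
coffee_in_b = beaker_b["coffee"] / sum(beaker_b.values())
print(milk_in_a, coffee_in_b)
```

    Changing the cup size or the initial volumes in the sketch is a quick way to see whether the result depends on those numbers.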


    Integration of closed contours using log-polar representation

    When the visual system solves problems in the early areas of the visual cortex, it uses a log-polar representation of the retina. This representation simplifies the problem of integrating closed contours. When a closed contour surrounds the center of the retina, its representation in V1 is a path. It follows that finding the shortest path in V1 is equivalent to integrating a closed contour on the retina. The main stages of our model are shown here. See Hii and Pizlo (2023) for details.
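
    The geometric point can be illustrated with a few lines of code. The sketch below is not the model of Hii and Pizlo (2023); it only applies the standard log-polar map (log r, theta) to sampled contours. The ellipse contours and the function names are assumptions chosen for illustration. A contour that surrounds the center covers nearly the whole angular axis, so it unrolls into a path running from one edge of the map to the other, whereas a contour that does not surround the center covers only a narrow range of angles.

```python
import math


def log_polar(x, y):
    """Map a retinal point (x, y), not at the origin, to (log r, theta)."""
    return math.log(math.hypot(x, y)), math.atan2(y, x)


def sample_ellipse(cx, cy, a, b, n=360):
    """Sample n points of an axis-aligned ellipse, used here as a closed contour."""
    return [(cx + a * math.cos(2 * math.pi * k / n),
             cy + b * math.sin(2 * math.pi * k / n)) for k in range(n)]


def theta_span(contour):
    """Range of angles (in radians) covered by the contour in the log-polar map."""
    thetas = [log_polar(x, y)[1] for x, y in contour]
    return max(thetas) - min(thetas)


around_center = sample_ellipse(0.0, 0.0, 2.0, 1.0)  # surrounds the center
off_center = sample_ellipse(5.0, 0.0, 2.0, 1.0)     # does not surround it

print(theta_span(around_center))  # close to 2*pi: a path across the whole map
print(theta_span(off_center))     # small: still a closed curve in the map
```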


    Reconstruction of 3D shapes of natural objects - psychophysics and model

    Subjects reconstructed the 3D shapes of three types of stimuli by adjusting the aspect ratio of the 3D shape: natural shapes, random symmetrical polyhedra, and polyhedra composed of rectangular boxes. See here for several examples. Subjects' reconstructions of natural shapes and of polyhedra composed of rectangular boxes were veridical. See Beers and Pizlo (2024) for details. The computational model performs the same 3D reconstruction with both orthographic and perspective images.


    Figure-ground organization (FGO)

    3D objects are symmetrical or nearly symmetrical, but a configuration of unrelated objects is not likely to be symmetrical. It follows that establishing the smallest number of 3D symmetry planes that can account for a 3D scene solves the problem of identifying objects in the scene. Specifically, each symmetry plane corresponds to an object. See Michaux et al. (2016) for details and go here to see examples.


    Recovering 3D shapes of real objects from real images

    A priori constraints are as important as the sensory data in 3D vision. Michaux et al. (2017) combined the 3D symmetry of an object with a pair of real camera images. 3D recovery of shapes, sizes and positions was absolutely perfect. When the object has 2 planes of mirror symmetry, both the front and the back of the object are recovered. Go here to see a few examples.


    Shape Perception

    Jayadevan et al. (2018) tested human shape perception with symmetrical and nearly symmetrical shapes. Viewing was binocular or monocular. Our computational model recovered these 3D shapes as well as the subjects did. The model and the subjects adjusted three parameters from the family of 3D affine transformations. The recovery in the model corresponded to the minimum of a cost function. The model used the same cost function for monocular and binocular viewing. This is the first and the only computational model that explains both viewing conditions. Monocular performance of our subjects and of our model shows that any theory of 3D shape perception that requires multiple views is inadequate.

    See the animation illustrating the three parameter family. Here is how the psychophysical experiment looked. And on this site you can see 3D shapes reconstructed by our subjects and our model.

    Contribution of stereoacuity to 3D shape recovery

    Stereoacuity refers to the binocular ability to judge the depth order of features. Stereoacuity is a hyperacuity which, in technical jargon, means subpixel resolution. Stereoacuity has never been used in theories of shape perception because it does not allow reconstructing depth intervals. So, all previous theories of binocular shape perception used ordinary binocular disparity. It turns out that when stereoacuity is combined with the a priori constraint of symmetry, the recovery result is absolutely perfect. How this works is described by Li et al. (2011) and illustrated in this animation.


    Symmetry and skewed symmetry

    Most natural objects are symmetrical: animals are symmetrical because of the way they move, plants are symmetrical because of the way they grow, and man-made objects are symmetrical because of the functions they serve. Once the utility and omnipresence of symmetry are appreciated, one should expect symmetry to be used by visual systems (both human and computer) as an important a priori constraint (an assumption) that allows them to produce accurate perceptual interpretations of the 3D shapes of objects in their natural environment. Using symmetry effectively for this purpose is complicated by the fact that the 2D retinal image of a symmetrical 3D object is always asymmetrical. Note, though, that the symmetry of the object is only distorted in its 2D image; it is not destroyed. We have been able to show that the human visual system is able to detect the distorted (skewed) symmetry inherent in a 2D retinal image and then use this information to recover the shape of the symmetrical 3D object. Several examples of 3D recovery can be seen here. Details are described in our 2014 book "Making a machine that sees like us."
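
    One property that survives projection can be checked directly. The sketch below is an illustration, not our recovery model. It assumes an orthographic projection and a made-up viewing direction, builds random pairs of 3D points that are mirror-symmetric about the plane x = 0, rotates them, and drops the depth coordinate. The 2D segments joining corresponding points remain parallel to one another, which is one signature of skewed symmetry. Under perspective projection these segments would instead converge to a vanishing point.

```python
import math
import random


def rotate(p, yaw, pitch):
    """Rotate point p about the y axis by yaw, then about the x axis by pitch."""
    x, y, z = p
    x, z = (x * math.cos(yaw) + z * math.sin(yaw),
            -x * math.sin(yaw) + z * math.cos(yaw))
    y, z = (y * math.cos(pitch) - z * math.sin(pitch),
            y * math.sin(pitch) + z * math.cos(pitch))
    return (x, y, z)


def project(p):
    """Orthographic projection onto the image plane: drop the depth coordinate."""
    return (p[0], p[1])


rng = random.Random(1)
pairs = []
for _ in range(6):
    x, y, z = rng.uniform(0.2, 1.0), rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    pairs.append(((x, y, z), (-x, y, z)))  # mirror images about the plane x = 0

yaw, pitch = math.radians(35), math.radians(20)  # hypothetical viewing direction

segments = []
for left, right in pairs:
    u = project(rotate(left, yaw, pitch))
    v = project(rotate(right, yaw, pitch))
    segments.append((v[0] - u[0], v[1] - u[1]))

# The 2D cross product of two parallel vectors is zero.
ref = segments[0]
crosses = [ref[0] * s[1] - ref[1] * s[0] for s in segments]
print(max(abs(c) for c in crosses))
```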

    Note, however, that 3D symmetry is not sufficient for reliable recovery: it turns out that any 2D retinal image has 3D symmetrical interpretations (Sawada et al., 2011). Here are example 1 and example 2. For 3D symmetry to be fully effective, additional constraints, such as planarity, must be used as well; see example 3.

    Problem Solving - Traveling Salesman

    Problem solving is one of human beings' fundamental cognitive abilities. It is at least as important as the other, more commonly studied mental activities, namely perception, memory, decision making and learning. We approach problem solving by adopting an information-processing methodology and use it to study computationally difficult (intractable) problems that can be presented to the subject visually, for example, the Traveling Salesman Problem (TSP). Human subjects produce near-optimal solutions to such combinatorial optimization problems in linear time. A hierarchical (pyramid) algorithm is the only model that can emulate human performance. It performs fine-to-coarse or coarse-to-fine hierarchical clustering of states (cities) and then produces a solution tour by using a sequence of successive approximations in a coarse-to-fine direction. The model emulates the non-uniform distribution of receptors in the human retina, as well as eye movements that move the model's attention. See a demo that shows how the model solves a 50-city TSP.
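
    The coarse-to-fine idea can be conveyed with a deliberately simplified sketch. This is not our pyramid model: it has only two levels, a fixed grid instead of hierarchical clustering, and no emulation of the retina or of eye movements. It assumes cities lie in the unit square; the function names, the grid size and the nearest-neighbor rule inside each cell are all illustrative choices. The coarse level orders grid cells, and the fine level orders the cities within each cell.

```python
import math
import random


def tour_length(cities, order):
    """Length of the closed tour that visits the cities in the given order."""
    return sum(math.dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
               for i in range(len(order)))


def coarse_to_fine_tour(cities, grid=4):
    """Two-level sketch: order the grid cells first, then the cities in each cell."""
    # Coarse level: group city indices by grid cell (cities lie in the unit square).
    cells = {}
    for idx, (x, y) in enumerate(cities):
        col = min(int(x * grid), grid - 1)
        row = min(int(y * grid), grid - 1)
        cells.setdefault((row, col), []).append(idx)

    # Coarse tour: snake through the rows so that consecutive cells are adjacent.
    cell_order = []
    for row in range(grid):
        cols = range(grid) if row % 2 == 0 else range(grid - 1, -1, -1)
        cell_order.extend((row, col) for col in cols if (row, col) in cells)

    # Fine level: inside each cell, greedily visit the nearest unvisited city.
    tour = []
    for cell in cell_order:
        remaining = list(cells[cell])
        while remaining:
            if tour:
                last = cities[tour[-1]]
                nxt = min(remaining, key=lambda i: math.dist(last, cities[i]))
            else:
                nxt = remaining[0]
            remaining.remove(nxt)
            tour.append(nxt)
    return tour


rng = random.Random(0)
cities = [(rng.random(), rng.random()) for _ in range(50)]
tour = coarse_to_fine_tour(cities)
print(len(tour), tour_length(cities, tour))
```

    Each city is assigned to a cell once and compared only with cities in its own cell, which is the sense in which a coarse level keeps the fine-level work local.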

    In 2013 (Pizlo and Stefanov) we modified the model so that its working memory can store only a few pieces of information at a time. This modification did not reduce the quality or the speed of the solution. Four demos illustrate how the model's visual representation zooms out and zooms in while analyzing spatially global and spatially local features of the problem.

    demo 2

    demo 3

    demo 4

    demo 5


    Phi Phenomenon

    In 1912, Max Wertheimer (1880-1943), the founder of the Gestalt School of Psychology, published a monograph on the perception of apparent motion that profoundly influenced subsequent perceptual research and theory. Wertheimer's contribution was inspired by his serendipitous observation of what he called a "pure" apparent movement. It was pure in the sense that the motion was not associated with perceiving any object changing its location in space. He called this pure motion the "phi-phenomenon" to distinguish it from "optimal" apparent movement (called "beta"). In the demo you can see beta and "magniphi", which is our vivid version of Wertheimer's phi. Our description of this phenomenon, including its history, is in Steinman et al. (2000).


Yll Haxhimusa. Created: March 28, 2008; Last change: April 11th, 2008