SkyFinder Image Searching

Researchers at the Visual Computing division of Microsoft Research have developed SkyFinder, a search engine designed to identify features in images of the sky and allow users to perform semantic searches for specific elements within a picture. Current image search engines rely on associated text to classify images and often return many results that don't match the user's needs. This new technique would make images without accompanying text searchable and give users far greater control over which images are returned, so that results correspond more closely to the specific search terms.
A series of returned results from the SkyFinder image search engine (Source: Microsoft Research)

The SkyFinder search tool currently works with over half a million images of the sky downloaded into a database. Each image is automatically examined and assigned values for a set of sky attributes including category (blue sky, cloudy sky, or sunset), layout (landscape, normal sky, object in sky, etc.), horizon height, sun position, and richness. These attributes are computed offline and then fed to an online, interactive search interface for direct user access. The system responds well to semantic searches, matching keywords within a natural language sentence to the various attributes in its system.
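The keyword-to-attribute matching can be pictured with a minimal sketch. The attribute names follow the categories listed above, but the keyword table and function names are illustrative assumptions, not the actual SkyFinder query parser.

```python
# Hypothetical sketch: map keywords in a natural-language sentence to
# sky-attribute filters. The keyword table below is an assumption for
# illustration, not SkyFinder's real vocabulary.
KEYWORD_MAP = {
    "sunset": ("category", "sunset"),
    "cloudy": ("category", "cloudy sky"),
    "blue": ("category", "blue sky"),
    "landscape": ("layout", "landscape"),
}

def parse_query(sentence):
    """Extract attribute filters from a natural-language sentence."""
    filters = {}
    for word in sentence.lower().split():
        if word in KEYWORD_MAP:
            attribute, value = KEYWORD_MAP[word]
            filters[attribute] = value
    return filters

print(parse_query("show me a cloudy landscape photo"))
# {'category': 'cloudy sky', 'layout': 'landscape'}
```

A real system would of course need stemming and a much richer vocabulary; the point is only that free-form text reduces to filters over the precomputed attributes.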

The system automates the classification process using trainable agents, each designed to look for a single feature. The agents were shown many thousands of examples of their feature along with many more examples of images that don't qualify. This enabled them to define a range of allowable positives and to quickly identify most of the pictures that are negative fits (that is, that don't meet the criteria of their specific feature). While not perfect, the use of so many baseline images of both positive and negative results led to a surprisingly high degree of accuracy.
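The training setup described above can be sketched as one binary classifier per feature, fit on labeled positives and negatives. The nearest-mean rule below is an illustrative stand-in, not the classifier SkyFinder actually uses, and the toy feature vectors are invented.

```python
# Sketch: a per-feature binary classifier trained on positive and negative
# examples. Nearest-mean is a hypothetical stand-in for SkyFinder's method.

def train(positives, negatives):
    """Learn mean feature vectors for the positive and negative classes."""
    def mean(vectors):
        n = len(vectors)
        return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]
    return mean(positives), mean(negatives)

def classify(x, model):
    """Return True if x is closer to the positive-class mean."""
    pos_mean, neg_mean = model
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return dist(x, pos_mean) < dist(x, neg_mean)

# Toy example: a single-feature agent for "sunset", trained on invented
# two-dimensional color statistics (warm vs. blue).
sunset_model = train(
    positives=[[0.9, 0.1], [0.8, 0.2]],
    negatives=[[0.1, 0.9], [0.2, 0.8]],
)
print(classify([0.85, 0.15], sunset_model))  # True
```

One such model per attribute, each trained on its own positive and negative sets, mirrors the "one agent per feature" design.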

Most images are not homogeneous and may contain elements of more than one of the allowed values for category or layout. The system accounts for this by using what it calls a "bag of words," which contains the collection of values assigned to each 16 x 16 pixel section of the image. Thus, although a full image can be classified as a sunset image, each image is actually assigned scores granting it some percentage of blue sky, cloudy sky, and sunset.
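The per-patch scoring can be sketched as follows: split an image into 16 x 16 pixel blocks, label each block, and report the fraction of blocks in each category. The brightness thresholds here are a hypothetical stand-in for the learned per-patch labeler.

```python
# Sketch: assign a category to every 16 x 16 block of a grayscale image
# and return the fraction of blocks per category. The threshold rule is
# a hypothetical stand-in for SkyFinder's learned patch classifier.

def patch_categories(image, patch=16):
    """Return per-category fractions over all patch x patch blocks."""
    h, w = len(image), len(image[0])
    counts = {"blue sky": 0, "cloudy sky": 0, "sunset": 0}
    total = 0
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            block = [image[y + dy][x + dx]
                     for dy in range(patch) for dx in range(patch)]
            brightness = sum(block) / len(block)
            if brightness > 200:          # very bright: call it cloud
                counts["cloudy sky"] += 1
            elif brightness > 100:        # mid-range: call it blue sky
                counts["blue sky"] += 1
            else:                         # dark, warm: call it sunset
                counts["sunset"] += 1
            total += 1
    return {k: v / total for k, v in counts.items()}

# 32 x 32 image: bright top half (clouds), dark bottom half (sunset glow).
img = [[230] * 32 for _ in range(16)] + [[60] * 32 for _ in range(16)]
print(patch_categories(img))
# {'blue sky': 0.0, 'cloudy sky': 0.5, 'sunset': 0.5}
```

The whole-image label can then simply be the category with the largest fraction, while the fractions themselves support the mixed-value searches described next.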

In addition to a traditional text search interface, SkyFinder includes a graphical search interface. This interface allows a user to select from the continuous range of values for each of the three categories by clicking in a triangle. Each corner of the triangle is assigned one of the three keywords. To find images rated 100 percent sunset, the user would click on that corner of the triangle. For values rated partially as sunset and partially as cloudy sky, the user would click along the side connecting those two corners. Mixtures of all three values are found by clicking inside the triangle. Horizon level and sun position can be approximated by the user drawing within a meta image, while the desired layout and richness values can be selected from dropdown menus.
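A click inside such a triangle maps naturally to blend weights for the three categories via barycentric coordinates. This is a sketch under that assumption; the corner coordinates are invented, and the paper does not specify that SkyFinder computes the blend this way.

```python
# Sketch: convert a click position inside the category triangle into
# blend weights using barycentric coordinates. Corner positions are
# assumed for illustration.
CORNERS = {
    "blue sky": (0.0, 0.0),
    "cloudy sky": (1.0, 0.0),
    "sunset": (0.5, 1.0),
}

def blend_weights(click):
    """Barycentric weights of the click point relative to the corners."""
    (x1, y1), (x2, y2), (x3, y3) = CORNERS.values()
    px, py = click
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (px - x3) + (x3 - x2) * (py - y3)) / det
    w2 = ((y3 - y1) * (px - x3) + (x1 - x3) * (py - y3)) / det
    w3 = 1.0 - w1 - w2
    return dict(zip(CORNERS, (w1, w2, w3)))

# Clicking the "sunset" corner yields 100 percent sunset; clicking the
# midpoint of a side yields an even split between its two corners.
print(blend_weights((0.5, 1.0)))
```

Clicks on a corner, an edge, or the interior thus reproduce the three cases described above: a pure category, a two-way mix, and a three-way mix.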

TFOT has previously reported on other advances in search technology including Adobe's efforts to make Flash content searchable, the introduction of the human-driven search engine Mahalo, and an MIT initiative that enables searching within video lectures. TFOT has also reported on several studies exploring search technology, including a Penn State study that modeled user interactions with search engines and a UCLA study showing that surfing the internet can improve brain function in older adults.

Read more about the SkyFinder image search engine in this paper outlining its basic capabilities (PDF).
