Query Specific Fusion for Image Retrieval

Abstract: Recent image retrieval algorithms based on local features indexed by a vocabulary tree and holistic features indexed by compact hashing codes both demonstrate excellent scalability. However, their retrieval precision may vary dramatically among queries. This motivates us to investigate how to fuse the ordered retrieval sets given by multiple retrieval methods, to further enhance the retrieval precision. Thus, we propose a graph-based query specific fusion approach where multiple retrieval sets are merged and reranked by conducting a link analysis on a fused graph. The retrieval quality of an individual method is measured by the consistency of the top candidates' nearest neighborhoods. Hence, the proposed method is capable of adaptively integrating the strengths of the retrieval methods using local or holistic features for different query images without any supervision. Extensive experiments demonstrate very competitive performance on 4 public datasets, i.e., the UKbench, Corel-5K, Holidays and San Francisco Landmarks datasets.

Papers and patents

• Shaoting Zhang, Ming Yang, Timothee Cour, Kai Yu and Dimitris Metaxas: Query Specific Fusion for Image Retrieval. The 12th European Conference on Computer Vision (ECCV), 2012.
• Ming Yang, Shaoting Zhang, Kai Yu: Query Specific Fusion for Image Retrieval. NEC Laboratories America, Inc., invention record, IR number 11075.

Algorithms: To fuse the ranked retrieval results given by different methods, the critical issue is how to automatically measure and compare their quality, since no supervision or user relevance feedback is available online. The similarity scores of candidates may vary widely among queries, especially for the vocabulary-tree-based method, and are not comparable across retrieval methods. Thus, a reasonable idea is to measure the consistency among the top candidates returned by one retrieval method as a query-specific estimate of that method's retrieval quality. Therefore, for each query image, we construct a weighted undirected graph (Figure 1) from the retrieval results of one method, where the retrieval quality, or relevance, is modeled by the edge weights using the Jaccard similarity coefficient of two neighborhood image sets. Then we fuse these graphs into one (Figure 2) and either perform a localized PageRank or find the weighted maximum density subgraph (Figure 3) centered at the query image to rerank the retrieval results. As a result, the fused retrieval results tend to be consistent across different image representations.
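As a minimal sketch of the first-layer graph construction, assuming each method's output is available as a dict `ranked` mapping an image id to its ranked candidate list (the names `build_graph`, `k`, and the decay value `alpha` are illustrative, not the paper's implementation):

```python
def reciprocal_neighbors(img, ranked, k):
    """k-reciprocal neighbors: images in img's top-k whose own top-k
    also contains img, plus img itself."""
    top_k = ranked[img][:k]
    return {x for x in top_k if img in ranked[x][:k]} | {img}

def build_graph(query, ranked, k, alpha=0.8):
    """First-layer sketch: connect the query to each reciprocal neighbor d,
    weighting the edge by alpha times the Jaccard coefficient of their
    reciprocal-neighbor sets."""
    n_q = reciprocal_neighbors(query, ranked, k)
    edges = {}
    for d in n_q - {query}:
        n_d = reciprocal_neighbors(d, ranked, k)
        edges[(query, d)] = alpha * len(n_q & n_d) / len(n_q | n_d)
    return edges

# Toy ranked lists from one retrieval method (ids are placeholders).
ranked = {
    "q": ["a", "b", "c"], "a": ["q", "b", "c"],
    "b": ["q", "a", "c"], "c": ["x", "y", "z"],
}
edges = build_graph("q", ranked, k=3)
```

Note that "c" is dropped from the query's neighborhood because it does not rank "q" highly in return, which is exactly the consistency signal the method exploits.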

Figure 1. An example of graph construction, where the query q links to its reciprocal neighbors (i.e., q and the green discs in the green zone). d is a candidate at the first layer with its reciprocal neighbors in the blue zone, whose Jaccard coefficient to q is 3/7 (# of nodes in the intersection divided by # of nodes in the union of the green and blue zones). The radius of the disc representing a node indicates the influence of the decay coefficient alpha.
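The 3/7 coefficient in the caption can be checked numerically; the node labels below are placeholders standing in for the green- and blue-zone members:

```python
def jaccard(a, b):
    """Jaccard similarity coefficient of two neighborhood sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Figure 1 example: the two zones share 3 nodes (n1, n2, n3)
# and cover 7 distinct nodes in total, giving 3/7.
green = {"q", "n1", "n2", "n3"}             # reciprocal neighbors of q
blue = {"d", "n1", "n2", "n3", "n4", "n5"}  # reciprocal neighbors of d
coeff = jaccard(green, blue)
```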

Figure 2. Fusion of two graphs where the green and yellow graphs are derived from two different retrieval methods.
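The fusion and PageRank-based reranking can be sketched as follows; summing edge weights over the union of edges, the restart probability `1 - damping`, and the names `fuse_graphs` / `localized_pagerank` are simplifying assumptions, not the paper's exact formulation:

```python
from collections import defaultdict

def fuse_graphs(*graphs):
    """Fuse per-method graphs by summing weights over the union of edges."""
    fused = defaultdict(float)
    for g in graphs:
        for (u, v), w in g.items():
            fused[frozenset((u, v))] += w
    return dict(fused)

def localized_pagerank(edges, query, damping=0.85, iters=50):
    """Random walk with restart at the query; final scores rerank candidates."""
    adj = defaultdict(dict)
    for e, w in edges.items():
        u, v = tuple(e)
        adj[u][v] = w
        adj[v][u] = w
    nodes = list(adj)
    score = {n: float(n == query) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - damping) * (n == query) for n in nodes}
        for u in nodes:
            out = sum(adj[u].values())
            for v, w in adj[u].items():
                nxt[v] += damping * score[u] * w / out
        score = nxt
    return sorted(nodes, key=score.get, reverse=True)

# Toy graphs from two methods (edge weights are placeholders).
g_sift = {("q", "a"): 0.5}
g_hsv = {("a", "q"): 0.3, ("q", "b"): 0.4}
fused = fuse_graphs(g_sift, g_hsv)
ranking = localized_pagerank(fused, "q")
```

A candidate supported by both graphs ("a" here) accumulates weight from each, so it outranks a candidate seen by only one method.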

Figure 3. Graph fusion by finding the weighted maximum density subgraph. |V(G')| denotes the number of vertices in subgraph G', and w(e) is the Jaccard similarity coefficient of edge e; the density of G' is the sum of its edge weights divided by |V(G')|.
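A greedy sketch of this density-based alternative, growing G' from the query until the density sum(w(e)) / |V(G')| stops increasing (the greedy strategy is an illustrative simplification, not necessarily the paper's search procedure):

```python
from collections import defaultdict

def max_density_subgraph(edges, query):
    """Grow a subgraph from the query, each step adding the frontier node
    whose edges into the subgraph most increase the density
    sum(w(e)) / |V(G')|; stop when the density would drop."""
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = w
        adj[v][u] = w
    sub, total = {query}, 0.0
    while True:
        frontier = {n for v in sub for n in adj[v]} - sub
        if not frontier:
            break
        gains = {n: sum(w for m, w in adj[n].items() if m in sub)
                 for n in frontier}
        best = max(gains, key=gains.get)
        if (total + gains[best]) / (len(sub) + 1) <= total / len(sub):
            break
        sub.add(best)
        total += gains[best]
    return sub

# Toy fused graph: "c" is weakly connected and gets excluded.
edges = {("q", "a"): 1.0, ("q", "b"): 0.8, ("a", "b"): 0.9, ("q", "c"): 0.1}
dense = max_density_subgraph(edges, "q")
```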

Results:
• UKbench: N-S score = 3.83 after fusing SIFT and HSV features.
• Holidays: mAP = 84.64% after fusing SIFT and HSV features.
• Corel5K: two fusion examples are shown in Figures 4 and 5.
Please refer to our paper for more results.

Figure 4. A fusion example from Corel5K dataset (by fusing GIST and SIFT).

Figure 5. A fusion example from Corel5K dataset (by fusing GIST and SIFT).