Unsupervised Discovery of Co-occurrence in Sparse High Dimensional Data
Type of document
příspěvek z konference - elektronickýAuthor
Chum, Ondřej
Matas, Jiří
Rights
© 2010 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.Metadata
Show full item recordAbstract
An efficient min-Hash based algorithm for discovery of dependencies in sparse high-dimensional data is presented. The dependencies are represented by sets of features co-occurring with high probability and are called co-ocsets. Sparse high dimensional descriptors, such as bag of words, have been proven very effective in the domain of image retrieval. To maintain high efficiency even for very large data collection, features are assumed independent. We show experimentally that co-ocsets are not rare, i.e. the independence assumption is often violated, and that they may ruin retrieval performance if present in the query image. Two methods for managing co-ocsets in such cases are proposed. Both methods significantly outperform the state-of-the-art in image retrieval, one is also significantly faster.
Collections
The following license files are associated with this item: