Statistical Learning for Automatic Image/Video Understanding
1. Salient Object Detection: We treat salient objects as the mid-level building blocks for automatic image/video understanding, and we have developed multiple approaches for detecting them automatically.
- Salient Object Detection Flowchart:
- Salient Object Detection Results:
2. Semantic Image/Video Interpretation: We use a concept ontology to organize large numbers of salient objects and image/video concepts. The same ontology also organizes the learning process, determining how classifiers are trained for semantic image/video interpretation.
- Image Concept Ontology Visualization without ICONS:
- Image Concept Ontology Visualization with ICONS:
- Video Concept Ontology Visualization without ICONS:
- Video Concept Ontology Visualization with ICONS:
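To make the role of the concept ontology concrete, here is a minimal Python sketch of top-down annotation over a toy ontology. It is an illustration under stated assumptions, not the method used in the publications listed on this page: the names `ConceptNode`, `annotate`, `feature_scorer`, and `build_toy_ontology` are hypothetical, the concepts are made up, and each node's scorer is a simple lookup standing in for a trained classifier.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical feature representation: a dict of named scores in [0, 1].
Features = Dict[str, float]


@dataclass
class ConceptNode:
    """One concept in a toy ontology; `scorer` stands in for a trained classifier."""
    name: str
    scorer: Callable[[Features], float]
    children: List["ConceptNode"] = field(default_factory=list)


def annotate(root: ConceptNode, x: Features, threshold: float = 0.5) -> List[str]:
    """Descend from the root, following the best-scoring child at each level.

    Returns the concept names on the path from coarse to fine. Descent stops
    at a leaf, or when no child scores at or above `threshold`.
    """
    path = [root.name]
    node = root
    while node.children:
        best = max(node.children, key=lambda child: child.scorer(x))
        if best.scorer(x) < threshold:
            break
        path.append(best.name)
        node = best
    return path


def feature_scorer(key: str) -> Callable[[Features], float]:
    """Assumed stand-in classifier: reads one named score from the features."""
    return lambda x: x.get(key, 0.0)


def build_toy_ontology() -> ConceptNode:
    """Build a three-level ontology: scene -> {outdoor, indoor} -> {beach, forest}."""
    beach = ConceptNode("beach", feature_scorer("sand"))
    forest = ConceptNode("forest", feature_scorer("tree"))
    outdoor = ConceptNode("outdoor", feature_scorer("sky"), [beach, forest])
    indoor = ConceptNode("indoor", feature_scorer("wall"))
    return ConceptNode("scene", lambda x: 1.0, [outdoor, indoor])


if __name__ == "__main__":
    ontology = build_toy_ontology()
    print(annotate(ontology, {"sky": 0.9, "sand": 0.8, "tree": 0.1}))
```

The design choice illustrated here is that a coarse decision (outdoor versus indoor) narrows which finer classifiers are consulted, so each classifier only has to separate sibling concepts.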
3. Visualization for Interactive Image/Video Retrieval: To bridge the gap between content-based image/video retrieval (CBIR/CBVR) systems and users' real needs, we have developed interactive visualization that lets users formulate their queries interactively and access large-scale image/video collections easily.
4. Future Work: Our future work will focus on leveraging social media (weakly-tagged images/videos) to train a large number of inter-related classifiers for semantic image/video interpretation. A concept network is more suitable than a concept ontology for organizing the learning tasks, and parallel computing may play an important role in addressing scalability.
- Concept Network Construction:
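As a rough illustration of what a concept network over weakly-tagged items might look like, the sketch below links tags that co-occur on the same item and weights each link by Jaccard similarity. This is an assumption made for the example, not the construction used in the work described here; the function names `build_concept_network` and `neighbors` and the sample tags are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Tuple

Network = Dict[FrozenSet[str], float]


def build_concept_network(tagged_items: Iterable[Iterable[str]],
                          min_weight: float = 0.0) -> Network:
    """Map each co-occurring tag pair to its Jaccard similarity.

    Jaccard(a, b) = |items with both| / |items with a or b|.
    Pairs whose weight falls below `min_weight` are dropped.
    """
    tag_count: Counter = Counter()
    pair_count: Counter = Counter()
    for tags in tagged_items:
        unique = sorted(set(tags))  # ignore repeated tags on one item
        tag_count.update(unique)
        pair_count.update(frozenset(p) for p in combinations(unique, 2))

    network: Network = {}
    for pair, both in pair_count.items():
        a, b = tuple(pair)
        union = tag_count[a] + tag_count[b] - both
        weight = both / union
        if weight >= min_weight:
            network[pair] = weight
    return network


def neighbors(network: Network, concept: str) -> List[Tuple[str, float]]:
    """List concepts linked to `concept`, strongest link first."""
    linked = []
    for pair, weight in network.items():
        if concept in pair:
            (other,) = pair - {concept}
            linked.append((other, weight))
    # Sort by descending weight, then by name for a stable order.
    return sorted(linked, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    items = [["beach", "sea", "sky"], ["sea", "sky"], ["beach", "sand"], ["sky"]]
    for other, weight in neighbors(build_concept_network(items), "beach"):
        print(other, round(weight, 3))
```

Unlike a tree-shaped ontology, a structure like this lets one concept relate to many others, which is one way to read the claim that a network suits the organization of inter-related learning tasks.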
5. Representative Publications
- J. Fan, Y. Shen, C. Yang, N. Zhou,
``Harvesting Large-Scale Weakly-Tagged Image Databases from the Web",
IEEE Conf. on Computer Vision and Pattern Recognition (CVPR'10), 2010.
- Y. Shen, J. Fan,
``Leveraging Loosely-Tagged Images and Inter-Object Correlations for Tag Recommendation",
ACM Multimedia, 2010.
- H. Luo, J. Fan, Y. Zhou, ``Multimedia news exploration and retrieval by integrating keywords, relations and visual features", Multimedia Tools and Applications, vol.51, pp.625–648, 2011.
- J. Fan, Y. Shen, C. Yang, N. Zhou,
``Structured Max-Margin Learning for Inter-Related Classifier Training and Multi-Label Image Annotation",
IEEE Trans. on Image Processing, 2010.
- Z. Li, H. Luo, J. Fan,
``Incorporating Camera Metadata for Attended Region Detection and Consumer Photo Classification",
ACM Multimedia (MM'09), Beijing, 2009. A longer journal version was also published in Multimedia Tools and Applications, 2010.
- J. Fan, H. Luo, Y. Shen, C. Yang,
``Integrating Visual and Semantic Contexts for Topic Network Generation and Word Sense Disambiguation",
ACM Conf. on Image and Video Retrieval (CIVR'09), 2009.
- Y. Gao, J. Peng, H. Luo, D. Keim, J. Fan,
``An Interactive Approach for Filtering out Junk Images from Keyword-Based Google Search Results",
IEEE Trans. on Circuits and Systems for Video Technology, vol. 19, no.10, 2009. Some preliminary results are also presented at MMM'08. An online demo is available: Junk Images filtering demo.
- Z. Li, J. Fan, ``Stochastic Contour Approach for Automatic Image
Segmentation", Journal of Electronic Imaging, vol.18, no.04, 2009.
- J. Fan, D.A. Keim, Y. Gao, H. Luo, Z. Li, ``JustClick: Personalized Image
Recommendation via Exploratory Search from Large-Scale Flickr Images",
IEEE Trans. on Circuits and Systems for Video Technology, vol. 19, no.2, pp.273-288, 2009.
- J. Fan, Y. Gao, H. Luo, ``Integrating Concept Ontology and
Multi-Task Learning to Achieve More Effective Classifier Training for Multi-Level Image Annotation",
IEEE Trans. on Image Processing, vol. 17, no.3, pp.407-426, 2008.
- J. Fan, Y. Gao, H. Luo, R. Jain, ``Mining Multi-Level Image
Semantics via Hierarchical Classification", IEEE Trans. on Multimedia, special issue on Multimedia Data Mining,
vol. 10, no.1, pp.167-187, 2008. Some preliminary results are also presented at
- H. Luo, J. Fan, S. Satoh, J. Yang, W. Ribarsky, ``Integrating Multi-Modal
Content Analysis and Hyperbolic Visualization for Large-Scale News Videos Retrieval and Exploration", Signal Processing:
Image Communication, special issue on Semantic Analysis for Interactive Multimedia Services, vol.23, no.8, pp.538-553, 2008.
Some preliminary results are also presented at IEEE VAST'07 and IEEE VAST'06. Video.
- J. Fan, Y. Gao, H. Luo, ``Hierarchical classification
for automatic image annotation", ACM SIGIR,
Amsterdam, pp.111-118, 2007.
- B. Li, M. Chi, J. Fan, X. Xue,
``Support Cluster Machine", International Conference on Machine Learning (ICML'07), June 20-24, 2007.
- J. Fan, H. Luo, Y. Gao, R. Jain, ``Incorporating Concept Ontology
for Hierarchical Video Classification, Annotation and Visualization", IEEE
Trans. on Multimedia, special issue on Semantic Image and Video Indexing in Broad Domains,
vol. 9, no.5, pp. 939-957, 2007. Some preliminary results
are also presented at ACM Multimedia'06.
- Y. Gao, J. Fan, H. Luo, X. Xue, R. Jain,
``Automatic Image Annotation by Incorporating Feature Hierarchy and Boosting to Scale up
SVM Classifiers", ACM Multimedia, Santa Barbara,
Presentation Slides, Video.
- J. Fan, Y. Gao, H. Luo, ``Multi-level annotation of natural scenes
using dominant image components and semantic image concepts", ACM Multimedia,
New York, Oct.10-15, 2004 (Best Paper Runner-up),
Video. An extended journal version of this work: J. Fan, Y. Gao, H. Luo, G. Xu, ``Statistical modeling and conceptualization of
natural images", Pattern Recognition, vol.38, no.6,
- J. Fan, H. Luo, A.K. Elmagarmid, ``Concept-Oriented Indexing of Video
More Effective Retrieval and Browsing", IEEE Trans. on Image Processing, vol.13, no.7, pp.974-992, 2004.
- J. Fan, X. Zhu, A.K. Elmagarmid, W.G. Aref, L. Wu,
``ClassView: Hierarchical Video Shot Classification, Indexing,
and Accessing", IEEE Trans. on Multimedia, vol.6, no.1, pp.70-86, 2004.
- E. Bertino, J. Fan, E. Ferrari, M.-S. Hacid, A.K. Elmagarmid, Xingquan Zhu,
``A Hierarchical Access Control Model for Video
Database Systems", ACM Trans. on Information Systems, vol.21, no.2, pp.155-191, 2003.
- J. Fan, David K.Y. Yau, Ahmed K. Elmagarmid, Walid G. Aref, ``Automatic
image segmentation by integrating color edge detection and
seeded region growing", IEEE Trans. on Image Processing, vol.10,
no.10, Oct., pp.1454-1466, 2001. Source Code
If we knew what we were doing, it wouldn't be research, would it?