Please use this identifier to cite or link to this item: http://localhost/handle/Hannan/234517
Title: Discovery of Shared Semantic Spaces for Multiscene Video Query and Summarization
Authors: Xun Xu;Timothy M. Hospedales;Shaogang Gong
Year: 2017
Publisher: IEEE
Abstract: The growing rate of public space closed-circuit television (CCTV) installations has generated a need for automated methods for exploiting video surveillance data, including scene understanding, query, behavior annotation, and summarization. For this reason, extensive research has been performed on surveillance scene understanding and analysis. However, most studies have considered single scenes or groups of adjacent scenes. The semantic similarity between different but related scenes (e.g., many different traffic scenes of a similar layout) is not generally exploited to improve any automated surveillance tasks and reduce manual effort. Exploiting commonality and sharing any supervised annotations between different scenes is, however, challenging due to the following reason: some scenes are totally unrelated and thus any information sharing between them would be detrimental, whereas others may share only a subset of common activities and thus information sharing is only useful if it is selective. Moreover, semantically similar activities that should be modeled together and shared across scenes may have quite different pixel-level appearances in each scene. To address these issues, we develop a new framework for distributed multiple-scene global understanding that clusters surveillance scenes by their ability to explain each other's behaviors and further discovers which subset of activities are shared versus scene specific within each cluster. We show how to use this structured representation of multiple scenes to improve common surveillance tasks, including scene activity understanding, cross-scene query-by-example, behavior classification with reduced supervised labeling requirements, and video summarization. In each case, we demonstrate how our multiscene model improves on a collection of standard single-scene models and a flat model of all scenes.
URI: http://localhost/handle/Hannan/234517
volume: 27
issue: 6
More Information: pp. 1353-1367
Appears in Collections:2017

Files in This Item:
File: 7422088.pdf
Size: 5.39 MB
Format: Adobe PDF