Please use this identifier to cite or link to this item: http://localhost/handle/Hannan/172794
Title: HNIP: Compact Deep Invariant Representations for Video Matching, Localization, and Retrieval
Authors: Jie Lin;Ling-Yu Duan;Shiqi Wang;Yan Bai;Yihang Lou;Vijay Chandrasekhar;Tiejun Huang;Alex Kot;Wen Gao
Year: 2017
Publisher: IEEE
Abstract: With emerging demand for large-scale video analysis, MPEG initiated the compact descriptor for video analysis (CDVA) standardization in 2014. Beyond handcrafted descriptors adopted by the current MPEG-CDVA reference model, we study the problem of deep learned global descriptors for video matching, localization, and retrieval. First, inspired by a recent invariance theory, we propose a nested invariance pooling (NIP) method to derive compact deep global descriptors from convolutional neural networks (CNNs), by progressively encoding translation, scale, and rotation invariances into the pooled descriptors. Second, our empirical studies have shown that a sequence of well designed pooling moments (e.g., max or average) may drastically impact video matching performance, which motivates us to design hybrid pooling operations via NIP (HNIP). HNIP has further improved the discriminability of deep global descriptors. Third, the technical merits and performance improvements by combining deep and handcrafted descriptors are provided to better investigate the complementary effects. We evaluate the effectiveness of HNIP within the well-established MPEG-CDVA evaluation framework. The extensive experiments have demonstrated that HNIP outperforms the state-of-the-art deep and canonical handcrafted descriptors with significant mAP gains of 5.5% and 4.7%, respectively. In particular the combination of HNIP incorporated CNN descriptors and handcrafted global descriptors has significantly boosted the performance of CDVA core techniques with comparable descriptor size.
URI: http://localhost/handle/Hannan/172794
volume: 19
issue: 9
More Information: 1968, 1983
Appears in Collections: 2017

Files in This Item:
File | Size | Format
7944594.pdf | 1.75 MB | Adobe PDF
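
Note: the abstract describes nested invariance pooling only at a high level, so the following is a minimal sketch of the idea rather than the authors' implementation. It assumes the CNN features have already been extracted and arranged as features[rotation][scale][position], each entry being a per-channel vector. The function names (pool, nested_invariance_pooling), the data layout, and the particular sequence of moments (average, average, max) are illustrative assumptions, not taken from the record.

```python
from statistics import mean


def pool(vectors, moment):
    """Reduce a list of equal-length vectors to one vector, channel by channel."""
    if moment == "avg":
        reducer = mean
    elif moment == "max":
        reducer = max
    else:
        raise ValueError(f"unknown pooling moment: {moment}")
    # zip(*vectors) groups the values of each channel across all vectors.
    return [reducer(channel) for channel in zip(*vectors)]


def nested_invariance_pooling(features, moments=("avg", "avg", "max")):
    """Pool over positions, then scales, then rotations (hypothetical layout).

    features[r][s][p] is the channel vector for rotation r, scale s, position p.
    moments gives the pooling moment used at each of the three nesting levels;
    the default choice here is an assumption made for illustration only.
    """
    per_rotation = []
    for rotation in features:
        # Level 1: pool over spatial positions (translation).
        per_scale = [pool(positions, moments[0]) for positions in rotation]
        # Level 2: pool the results over scales.
        per_rotation.append(pool(per_scale, moments[1]))
    # Level 3: pool the results over rotations.
    return pool(per_rotation, moments[2])


# Toy input: 2 rotations x 2 scales x 2 positions, 2 channels each.
example_features = [
    [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]],
    [[[0.0, 0.0], [2.0, 2.0]], [[4.0, 4.0], [6.0, 6.0]]],
]
descriptor = nested_invariance_pooling(example_features)
```

Because each level reduces with a symmetric operation, the resulting descriptor does not change when the rotated copies (or the scales, or the positions) are presented in a different order, which is the sense in which the pooled vector is invariant in this toy setting. Swapping entries in moments gives a different "hybrid" sequence of pooling operations.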