Author (English name): Man-Yu Lee
Thesis title (English): Unsupervised Person Re-Identification with Hard Samples Rectification in a Real-time Multi-Camera Tracking System
Advisor (English name): Shao-Yi Chien
Oral defense committee (English names): Yung-Yu Chuang, Yu Tsao, Chu-Song Chen, Hsing-Kuo Pao
Keywords (English): Deep learning, Unsupervised learning, Person re-identification, Multi-camera tracking
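In this thesis, we propose an unsupervised algorithm with Hard Samples Rectification (HSR) for person re-identification, which addresses the problem that clustering is easily affected by hard samples and therefore performs poorly. The proposed HSR has two facets. The first is inter-camera hard positive sample mining, which helps recognize the same person under different cameras. The second examines local homogeneity to distinguish different people with similar appearance, that is, hard negative samples. With this two-facet training method, the hard samples can be rectified and the model can be trained on accurately labelled data to improve performance. We conduct extensive experiments to show that our method outperforms the current state-of-the-art unsupervised methods.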
Multi-Camera Tracking (MCT), which aims to track multiple people through a network of cameras, is a crucial technology for the envisioned smart city. While MCT is a notoriously difficult problem to solve, a popular research topic has derived from the final step of its matching scheme: person re-identification (re-ID), which addresses the problem of recognizing people across cameras by visual appearance. Although person re-ID has improved greatly with the rise of Convolutional Neural Networks (CNNs) and supervised learning methods, unsupervised cross-domain re-ID remains challenging owing to the lack of labelled data in the target domain. In addition to matching accuracy, being able to implement a workable MCT system is also a critical factor for IoT surveillance applications. Moreover, as IoT devices become more widespread, the MCT system will need to be implemented on edge devices to reduce network latency and data transmission and to enable more efficient real-time applications.
In this thesis, we propose an unsupervised learning scheme of Hard Samples Rectification (HSR) for person re-ID, which resolves the weakness of original clustering-based methods, namely their vulnerability to hard positive and hard negative samples in the dataset. Our proposed HSR contains two learning facets: an inter-camera mining technique, which helps recognize the same person under different camera views (hard positives), and a part-based homogeneity technique, which lets the re-ID model distinguish different people with similar appearance (hard negatives) by examining local homogeneity. By jointly rectifying the hard samples with our dual-faceted learning scheme, the re-ID model can learn from more accurately labelled hard cases and improve its performance. Extensive experiments on two large-scale benchmarks demonstrate the superiority of our HSR over state-of-the-art methods.
Furthermore, we propose a multi-camera tracking system with an efficient framework and implement it on real-world hardware to demonstrate its viability on edge devices. Specifically, we leverage the system pipeline and the characteristics of each operator in the MCT system to eliminate the need for a tremendous amount of computational resources. By effectively allocating computing power, our proposed framework achieves favorable performance and is able to run in real time on mobile hardware. Comprehensive experiments are conducted to illustrate the correlation between the components of MCT and to show the utility of our proposed MCT system.
Abstract
List of Figures
List of Tables
1 Introduction
1.1 Re-identification
1.2 Multi-Camera Tracking
1.3 Contribution
2 Hard Samples Rectification for Unsupervised Person Re-identification
2.1 Related Work
2.2 Proposed Method
2.2.1 Overview of HSR learning scheme
2.2.2 Inter-Camera Mining
2.2.3 Part-based Homogeneity
2.2.4 Optimization Procedure
2.3 Experiments
2.3.1 Datasets and Evaluation Protocol
2.3.2 Implementation Details
2.3.3 Comparison with State-of-the-arts
2.3.4 Ablation Study
2.4 Conclusion
3 Multi-Camera Tracking
3.1 Offline Multi-Camera Tracking
3.1.1 MCT framework
3.1.2 Experiments
3.2 Online MCT System
3.2.1 System Overview
3.2.2 Experiments
3.3 Proposed Framework of Real-time MCT System
3.3.1 System Design
3.3.2 Experiments
4 Conclusion
Reference
