# Dataset Zoo
We provide several relevant datasets for training and evaluating the Joint Detection and Embedding (JDE) model. 
Annotations are provided in a unified format. If you want to use these datasets, please **follow their licenses**, 
and if you use any of these datasets in your research, please cite the original work (you can find the BibTeX entries at the bottom).
## Data Format
All the datasets have the following structure:
```
Caltech
   |——————images
   |        └——————00001.jpg
   |        |—————— ...
   |        └——————0000N.jpg
   └——————labels_with_ids
            └——————00001.txt
            |—————— ...
            └——————0000N.txt
```
Every image has a corresponding annotation text file. Given an image path, 
the annotation file path can be generated by replacing the string `images` with `labels_with_ids` and replacing `.jpg` with `.txt`.
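
The path rule above can be sketched in a few lines of Python. The helper name `image_to_label_path` is hypothetical and not part of the repository; it applies the two string replacements literally, so it assumes that `images` and `.jpg` do not occur anywhere else in the path.

```python
def image_to_label_path(image_path: str) -> str:
    """Derive the annotation path from an image path (hypothetical helper)."""
    # Swap the `images` directory name for `labels_with_ids`,
    # then swap the `.jpg` extension for `.txt`.
    return image_path.replace("images", "labels_with_ids").replace(".jpg", ".txt")
```

For example, `Caltech/images/00001.jpg` maps to `Caltech/labels_with_ids/00001.txt`.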

In the annotation text, each line describes one bounding box and has the following format:
```
[class] [identity] [x_center] [y_center] [width] [height]
```
The field `[class]` should be `0`. Only single-class multi-object tracking is supported in this version. 

The field `[identity]` is an integer from `0` to `num_identities - 1`, or `-1` if this box has no identity annotation.

**Note** that the values of `[x_center] [y_center] [width] [height]` are normalized by the width/height of the image, so they are floating point numbers ranging from 0 to 1.
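
As an illustration of the format above, the following Python sketch parses one annotation line and converts the normalized box back to pixel corner coordinates. The names `Box` and `parse_annotation_line` are hypothetical, and the image width and height are assumed to be known to the caller; this is not the loader used by the repository.

```python
from typing import NamedTuple


class Box(NamedTuple):
    cls: int        # always 0 in this version (single class)
    identity: int   # 0 .. num_identities - 1, or -1 if unlabeled
    x1: float       # left edge in pixels
    y1: float       # top edge in pixels
    x2: float       # right edge in pixels
    y2: float       # bottom edge in pixels


def parse_annotation_line(line: str, img_w: int, img_h: int) -> Box:
    """Parse `[class] [identity] [x_center] [y_center] [width] [height]`."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    # Assumption: tolerate integer fields written as floats, e.g. `0.0`.
    cls = int(float(fields[0]))
    identity = int(float(fields[1]))
    xc, yc, w, h = (float(v) for v in fields[2:])
    # Undo the normalization: x values scale by image width, y values by height.
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return Box(cls, identity, x1, y1, x2, y2)
```

A caller would typically skip or specially handle boxes whose `identity` is `-1`, since those carry no identity annotation.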

## Download

### Caltech Pedestrian
Baidu NetDisk: 
[[0]](https://pan.baidu.com/s/1sYBXXvQaXZ8TuNwQxMcAgg)
[[1]](https://pan.baidu.com/s/1lVO7YBzagex1xlzqPksaPw) 
[[2]](https://pan.baidu.com/s/1PZXxxy_lrswaqTVg0GuHWg)
[[3]](https://pan.baidu.com/s/1M93NCo_E6naeYPpykmaNgA)
[[4]](https://pan.baidu.com/s/1ZXCdPNXfwbxQ4xCbVu5Dtw)
[[5]](https://pan.baidu.com/s/1kcZkh1tcEiBEJqnDtYuejg)
[[6]](https://pan.baidu.com/s/1sDjhtgdFrzR60KKxSjNb2A)
[[7]](https://pan.baidu.com/s/18Zvp_d33qj1pmutFDUbJyw)

Google Drive: [[annotations]](https://drive.google.com/file/d/1h8vxl_6tgi9QVYoer9XcY9YwNB32TE5k/view?usp=sharing). 
Please also download all the image `.tar` files from [this page](http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/datasets/USA/) and extract the images under `Caltech/images`.

You may need [this tool](https://github.com/mitmul/caltech-pedestrian-dataset-converter) to convert the original data format to JPEG images.

Original dataset webpage: [CaltechPedestrians](http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/)
### CityPersons
Baidu NetDisk: 
[[0]](https://pan.baidu.com/s/1g24doGOdkKqmbgbJf03vsw)
[[1]](https://pan.baidu.com/s/1mqDF9M5MdD3MGxSfe0ENsA) 
[[2]](https://pan.baidu.com/s/1Qrbh9lQUaEORCIlfI25wdA)
[[3]](https://pan.baidu.com/s/1lw7shaffBgARDuk8mkkHhw)

Google Drive:
[[0]](https://drive.google.com/file/d/1DgLHqEkQUOj63mCrS_0UGFEM9BG8sIZs/view?usp=sharing)
[[1]](https://drive.google.com/file/d/1BH9Xz59UImIGUdYwUR-cnP1g7Ton_LcZ/view?usp=sharing) 
[[2]](https://drive.google.com/file/d/1q_OltirP68YFvRWgYkBHLEFSUayjkKYE/view?usp=sharing)
[[3]](https://drive.google.com/file/d/1VSL0SFoQxPXnIdBamOZJzHrHJ1N2gsTW/view?usp=sharing)

Original dataset webpage: [CityPersons pedestrian detection dataset](https://bitbucket.org/shanshanzhang/citypersons)

### CUHK-SYSU
Baidu NetDisk: 
[[0]](https://pan.baidu.com/s/1YFrlyB1WjcQmFW3Vt_sEaQ)

Google Drive:
[[0]](https://drive.google.com/file/d/1D7VL43kIV9uJrdSCYl53j89RE2K-IoQA/view?usp=sharing)

Original dataset webpage: [CUHK-SYSU Person Search Dataset](http://www.ee.cuhk.edu.hk/~xgwang/PS/dataset.html)

### PRW
Baidu NetDisk: 
[[0]](https://pan.baidu.com/s/1iqOVKO57dL53OI1KOmWeGQ)

Google Drive:
[[0]](https://drive.google.com/file/d/116_mIdjgB-WJXGe8RYJDWxlFnc_4sqS8/view?usp=sharing)

Original dataset webpage: [Person Search in the Wild dataset](http://www.liangzheng.com.cn/Project/project_prw.html)

### ETHZ (overlapping videos with MOT-16 removed)
Baidu NetDisk: 
[[0]](https://pan.baidu.com/s/14EauGb2nLrcB3GRSlQ4K9Q)

Google Drive:
[[0]](https://drive.google.com/file/d/19QyGOCqn8K_rc9TXJ8UwLSxCx17e0GoY/view?usp=sharing)

Original dataset webpage: [ETHZ pedestrian dataset](https://data.vision.ee.ethz.ch/cvl/aess/dataset/)

### MOT-17
Baidu NetDisk: 
[[0]](https://pan.baidu.com/s/1lHa6UagcosRBz-_Y308GvQ)

Google Drive:
[[0]](https://drive.google.com/file/d/1ET-6w12yHNo8DKevOVgK1dBlYs739e_3/view?usp=sharing)

Original dataset webpage: [MOT-17](https://motchallenge.net/data/MOT17/)

### MOT-16 (for evaluation)
Baidu NetDisk: 
[[0]](https://pan.baidu.com/s/10pUuB32Hro-h-KUZv8duiw)

Google Drive:
[[0]](https://drive.google.com/file/d/1254q3ruzBzgn4LUejDVsCtT05SIEieQg/view?usp=sharing)

Original dataset webpage: [MOT-16](https://motchallenge.net/data/MOT16/)


# Citation
Caltech:
```
@inproceedings{ dollarCVPR09peds,
       author = "P. Doll\'ar and C. Wojek and B. Schiele and  P. Perona",
       title = "Pedestrian Detection: A Benchmark",
       booktitle = "CVPR",
       month = "June",
       year = "2009",
       city = "Miami",
}
```
CityPersons:
```
@INPROCEEDINGS{Shanshan2017CVPR,
  Author = {Shanshan Zhang and Rodrigo Benenson and Bernt Schiele},
  Title = {CityPersons: A Diverse Dataset for Pedestrian Detection},
  Booktitle = {CVPR},
  Year = {2017}
 }

@INPROCEEDINGS{Cordts2016Cityscapes,
title={The Cityscapes Dataset for Semantic Urban Scene Understanding},
author={Cordts, Marius and Omran, Mohamed and Ramos, Sebastian and Rehfeld, Timo and Enzweiler, Markus and Benenson, Rodrigo and Franke, Uwe and Roth, Stefan and Schiele, Bernt},
booktitle={Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2016}
}
```
CUHK-SYSU:
```
@inproceedings{xiaoli2017joint,
  title={Joint Detection and Identification Feature Learning for Person Search},
  author={Xiao, Tong and Li, Shuang and Wang, Bochao and Lin, Liang and Wang, Xiaogang},
  booktitle={CVPR},
  year={2017}
}
```
PRW:
```
@inproceedings{zheng2017person,
  title={Person re-identification in the wild},
  author={Zheng, Liang and Zhang, Hengheng and Sun, Shaoyan and Chandraker, Manmohan and Yang, Yi and Tian, Qi},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={1367--1376},
  year={2017}
}
```
ETHZ:
```
@InProceedings{eth_biwi_00534,
author = {A. Ess and B. Leibe and K. Schindler and L. van Gool},
title = {A Mobile Vision System for Robust Multi-Person Tracking},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08)},
year = {2008},
month = {June},
publisher = {IEEE Press},
keywords = {}
}
```
MOT-16&17:
```
@article{milan2016mot16,
  title={MOT16: A benchmark for multi-object tracking},
  author={Milan, Anton and Leal-Taix{\'e}, Laura and Reid, Ian and Roth, Stefan and Schindler, Konrad},
  journal={arXiv preprint arXiv:1603.00831},
  year={2016}
}
```