Add torchreid (deep-person-reid) to docker and Readme

Ruben van de Ven 2023-05-01 15:56:14 +02:00
parent cbab35e6d3
commit e4ac1c3aab
3 changed files with 142 additions and 101 deletions

131
README.md

@ -1,101 +1,30 @@
# Towards-Realtime-MOT
**NEWS:**
- **[2021.08.19]** A [pure C++ re-implementation](https://github.com/samylee/Towards-Realtime-MOT-Cpp) by [samylee](https://github.com/samylee). Helpful if you want to deploy JDE in your own project!
- **[2021.06.01]** A [nice re-implementation](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.1/configs/mot) (and document) by Baidu [PaddlePaddle](https://github.com/PaddlePaddle) team.
- **[2020.07.14]** Our paper is accepted to ECCV 2020!
- **[2020.01.29]** More models uploaded! The fastest one runs at around **38 FPS!**.
- **[2019.10.11]** Training and evaluation data uploaded! Please see [DATASET_ZOO.md](https://github.com/Zhongdao/Towards-Realtime-MOT/blob/master/DATASET_ZOO.md) for details.
- **[2019.10.01]** Demo code and pre-trained model released!
## Introduction
This repo is a codebase for the Joint Detection and Embedding (JDE) model. JDE is a fast and high-performance multiple-object tracker that learns the object detection task and the appearance embedding task simultaneously in a shared neural network. Technical details are described in our [ECCV 2020 paper](https://arxiv.org/pdf/1909.12605v1.pdf). By using this repo, you can achieve **MOTA 64%+** on the "private" protocol of the [MOT-16 challenge](https://motchallenge.net/tracker/JDE), at a near real-time speed of **22~38 FPS** (note that this speed is for the entire system, including the detection step).
We hope this repo will help researchers/engineers to develop more practical MOT systems. For algorithm development, we provide training data, baseline models and evaluation methods to create a level playing field. For application usage, we also provide a small video demo that takes raw videos as input without any bells and whistles.
## Requirements
* Python 3.6
* [Pytorch](https://pytorch.org) >= 1.2.0
* python-opencv
* [py-motmetrics](https://github.com/cheind/py-motmetrics) (`pip install motmetrics`)
* cython-bbox (`pip install cython_bbox`)
* (Optional) ffmpeg (used in the video demo)
* (Optional) [syncbn](https://github.com/ytoon/Synchronized-BatchNorm-PyTorch) (compile and place it under utils/syncbn, or simply replace with nn.BatchNorm [here](https://github.com/Zhongdao/Towards-Realtime-MOT/blob/master/models.py#L12))
* ~~[maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark) (Their GPU NMS is used in this project)~~
## Video Demo
<img src="assets/MOT16-03.gif" width="400"/> <img src="assets/MOT16-14.gif" width="400"/>
<img src="assets/IMG_0055.gif" width="400"/> <img src="assets/000011-00001.gif" width="400"/>
Usage:
```
python demo.py --input-video path/to/your/input/video --weights path/to/model/weights
--output-format video --output-root path/to/output/root
```
## Docker demo example
```bash
docker build -t towards-realtime-mot docker/
docker run --rm --gpus all -v $(pwd)/:/Towards-Realtime-MOT -ti towards-realtime-mot /bin/bash
cd /Towards-Realtime-MOT;
python demo.py --input-video path/to/your/input/video --weights path/to/model/weights
--output-format video --output-root path/to/output/root
```
## Dataset zoo
Please see [DATASET_ZOO.md](https://github.com/Zhongdao/Towards-Realtime-MOT/blob/master/DATASET_ZOO.md) for detailed description of the training/evaluation datasets.
## Pretrained model and baseline models
Darknet-53 ImageNet pretrained model: [[DarkNet Official]](https://pjreddie.com/media/files/darknet53.conv.74)
Trained models with different input resolutions:
|Model| MOTA | IDF1 | IDS | FP | FN | FPS | Link |
|-----|------|------|-----|----|----|-----|------|
|JDE-1088x608| 73.1| 68.9| 1312|6593| 21788| 22.2| [[Google]](https://drive.google.com/open?id=1nlnuYfGNuHWZztQHXwVZSL_FvfE551pA) [[Baidu]](https://pan.baidu.com/s/1Ifgn0Y_JZE65_qSrQM2l-Q) |
|JDE-864x480| 70.8| 65.8| 1279| 5653| 25806| 30.3| [[Google]](https://drive.google.com/open?id=1UKgkYrsV-59kYaHgWeJ70p5Mij3QWuFr) [[Baidu]](https://pan.baidu.com/s/1rBQ7DFjhLQbEq6JTJRntKA) |
|JDE-576x320| 63.7| 63.3| 1307| 6657| 32794| 37.9|[[Google]](https://drive.google.com/file/d/1sca65sHMnxY7YJ89FJ6Dg3S3yAjbLdMz/view?usp=sharing) [[Baidu]](https://pan.baidu.com/s/1cCulbPNneIXOpRRjrTgJ4g) |
The performance is tested on the MOT-16 training set, just for reference. Running speed is tested on an Nvidia Titan Xp GPU. For a more comprehensive comparison with other methods you can test on MOT-16 test set and submit a result to the [MOT-16 benchmark](https://motchallenge.net/results/MOT16/?det=Private). Note that the results should be submitted to the private detector track.
## Test on MOT-16 Challenge
```
python track.py --cfg ./cfg/yolov3_1088x608.cfg --weights /path/to/model/weights
```
By default the script runs evaluation on the MOT-16 training set. If you want to evaluate on the test set, please add `--test-mot16` to the command line.
Results are saved in text files in `$DATASET_ROOT/results/*.txt`. You can also add `--save-images` or `--save-videos` flags to obtain the visualized results. Visualized results are saved in `$DATASET_ROOT/outputs/`
## Training instruction
- Download the training datasets.
- Edit `cfg/ccmcpe.json` to configure the training/validation combinations. A dataset is represented by an image list; please see `data/*.train` for an example.
- Run the training script:
```
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python train.py
```
We use 8x Nvidia Titan Xp to train the model, with a batch size of 32. You can adjust the batch size (and the learning rate together) according to how many GPUs you have. You can also train with a smaller image size, which gives faster inference. Note, however, that the image size should be a multiple of 32 (the down-sampling rate).
### Train with custom datasets
Adding custom datasets is quite simple: all you need to do is organize your annotation files in the same format as in our training sets. Please refer to [DATASET_ZOO.md](https://github.com/Zhongdao/Towards-Realtime-MOT/blob/master/DATASET_ZOO.md) for the dataset format.
## Related Resources
- [FairMOT](https://github.com/ifzhang/FairMOT): An improved method based on the JDE framework, SOTA performance.
- [CSTrack](https://arxiv.org/pdf/2010.12138.pdf): Better disentangled detection/embedding heads for JDE.
- [JDE-Paddle](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.1/configs/mot): A nice re-implementation (and document) by Baidu [PaddlePaddle](https://github.com/PaddlePaddle) team.
- [JDE-CPP](https://github.com/samylee/Towards-Realtime-MOT-Cpp): A pure C++ re-implementation by [samylee](https://github.com/samylee). Helpful if you want to deploy JDE in your own project!
## Acknowledgement
A large portion of code is borrowed from [ultralytics/yolov3](https://github.com/ultralytics/yolov3) and [longcw/MOTDT](https://github.com/longcw/MOTDT), many thanks to their wonderful work!
## Citation
If you find this repo useful in your project or research, please consider citing it:
```
@article{wang2019towards,
title={Towards Real-Time Multi-Object Tracking},
author={Wang, Zhongdao and Zheng, Liang and Liu, Yixuan and Wang, Shengjin},
journal={The European Conference on Computer Vision (ECCV)},
year={2020}
}
```
This repo is built on Towards-Realtime-MOT, but is now used to work with multiple multi-object trackers.
## Towards Realtime MOT
See also [Original Readme](README.original.md).
## Torchreid Feature Extractor
See [documentation](https://kaiyangzhou.github.io/deep-person-reid/user_guide.html#use-torchreid-as-a-feature-extractor-in-your-projects).
Torchreid offers [various pretrained models](https://kaiyangzhou.github.io/deep-person-reid/MODEL_ZOO), for example:
```bash
# osnet_ain_x1_0
# (URLs are quoted so the shell does not treat `?` as a glob character)
wget --content-disposition 'https://drive.google.com/uc?id=1-CaioD9NaqbHK_kzSMW8VE4_3KcsRjEo'
# osnet_ain_x1_0_msmt17_256x128_amsgrad_ep50_lr0.0015_coslr_b64_fb10_softmax_labsmth_flip_jitter.pth
wget --content-disposition 'https://drive.google.com/uc?id=1SigwBE6mPdqiJMqhuIY4aqC7--5CsMal'
```
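A downloaded checkpoint can then be loaded through Torchreid's `FeatureExtractor`, as described in the documentation linked above. The following is only a minimal sketch under assumptions: the helper names `extract_features` and `cosine_distance` are hypothetical and not part of this repo, the checkpoint path is whatever filename the download produced, and comparing embeddings by cosine distance is one common choice rather than something this repo prescribes.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def extract_features(image_paths, model_path, device="cuda"):
    """Return one embedding per image crop (hypothetical helper).

    Torchreid is imported lazily so the rest of this module can be used
    without it being installed.
    """
    from torchreid.utils import FeatureExtractor

    extractor = FeatureExtractor(
        model_name="osnet_ain_x1_0",  # must match the downloaded checkpoint
        model_path=model_path,        # assumed: path of the downloaded .pth file
        device=device,
    )
    # Calling the extractor on a list of image paths returns a torch.Tensor
    # with one row per image.
    return extractor(image_paths)
```

With two person crops, `cosine_distance(*extract_features([crop_a, crop_b], model_path).tolist())` would give a value near 0 for similar appearances and larger values for dissimilar ones.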
## TODO:
- DeepSort, e.g. https://medium.com/axinc-ai/deepsort-a-machine-learning-model-for-tracking-people-1170743b5984
- https://github.com/ifzhang/ByteTrack?ref=blog.roboflow.com
- https://paperswithcode.com/sota/multi-object-tracking-on-mot17?p=bytetrack-multi-object-tracking-by-1
- https://github.com/tryolabs/norfair
- https://github.com/ale152/ClusterSkeletonTracklets

101
README.original.md Normal file

@ -0,0 +1,101 @@
# Towards-Realtime-MOT
**NEWS:**
- **[2021.08.19]** A [pure C++ re-implementation](https://github.com/samylee/Towards-Realtime-MOT-Cpp) by [samylee](https://github.com/samylee). Helpful if you want to deploy JDE in your own project!
- **[2021.06.01]** A [nice re-implementation](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.1/configs/mot) (and document) by Baidu [PaddlePaddle](https://github.com/PaddlePaddle) team.
- **[2020.07.14]** Our paper is accepted to ECCV 2020!
- **[2020.01.29]** More models uploaded! The fastest one runs at around **38 FPS!**.
- **[2019.10.11]** Training and evaluation data uploaded! Please see [DATASET_ZOO.md](https://github.com/Zhongdao/Towards-Realtime-MOT/blob/master/DATASET_ZOO.md) for details.
- **[2019.10.01]** Demo code and pre-trained model released!
## Introduction
This repo is a codebase for the Joint Detection and Embedding (JDE) model. JDE is a fast and high-performance multiple-object tracker that learns the object detection task and the appearance embedding task simultaneously in a shared neural network. Technical details are described in our [ECCV 2020 paper](https://arxiv.org/pdf/1909.12605v1.pdf). By using this repo, you can achieve **MOTA 64%+** on the "private" protocol of the [MOT-16 challenge](https://motchallenge.net/tracker/JDE), at a near real-time speed of **22~38 FPS** (note that this speed is for the entire system, including the detection step).
We hope this repo will help researchers/engineers to develop more practical MOT systems. For algorithm development, we provide training data, baseline models and evaluation methods to create a level playing field. For application usage, we also provide a small video demo that takes raw videos as input without any bells and whistles.
## Requirements
* Python 3.6
* [Pytorch](https://pytorch.org) >= 1.2.0
* python-opencv
* [py-motmetrics](https://github.com/cheind/py-motmetrics) (`pip install motmetrics`)
* cython-bbox (`pip install cython_bbox`)
* (Optional) ffmpeg (used in the video demo)
* (Optional) [syncbn](https://github.com/ytoon/Synchronized-BatchNorm-PyTorch) (compile and place it under utils/syncbn, or simply replace with nn.BatchNorm [here](https://github.com/Zhongdao/Towards-Realtime-MOT/blob/master/models.py#L12))
* ~~[maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark) (Their GPU NMS is used in this project)~~
## Video Demo
<img src="assets/MOT16-03.gif" width="400"/> <img src="assets/MOT16-14.gif" width="400"/>
<img src="assets/IMG_0055.gif" width="400"/> <img src="assets/000011-00001.gif" width="400"/>
Usage:
```
python demo.py --input-video path/to/your/input/video --weights path/to/model/weights
--output-format video --output-root path/to/output/root
```
## Docker demo example
```bash
docker build -t towards-realtime-mot docker/
docker run --rm --gpus all -v $(pwd)/:/Towards-Realtime-MOT -ti towards-realtime-mot /bin/bash
cd /Towards-Realtime-MOT;
python demo.py --input-video path/to/your/input/video --weights path/to/model/weights
--output-format video --output-root path/to/output/root
```
## Dataset zoo
Please see [DATASET_ZOO.md](https://github.com/Zhongdao/Towards-Realtime-MOT/blob/master/DATASET_ZOO.md) for detailed description of the training/evaluation datasets.
## Pretrained model and baseline models
Darknet-53 ImageNet pretrained model: [[DarkNet Official]](https://pjreddie.com/media/files/darknet53.conv.74)
Trained models with different input resolutions:
|Model| MOTA | IDF1 | IDS | FP | FN | FPS | Link |
|-----|------|------|-----|----|----|-----|------|
|JDE-1088x608| 73.1| 68.9| 1312|6593| 21788| 22.2| [[Google]](https://drive.google.com/open?id=1nlnuYfGNuHWZztQHXwVZSL_FvfE551pA) [[Baidu]](https://pan.baidu.com/s/1Ifgn0Y_JZE65_qSrQM2l-Q) |
|JDE-864x480| 70.8| 65.8| 1279| 5653| 25806| 30.3| [[Google]](https://drive.google.com/open?id=1UKgkYrsV-59kYaHgWeJ70p5Mij3QWuFr) [[Baidu]](https://pan.baidu.com/s/1rBQ7DFjhLQbEq6JTJRntKA) |
|JDE-576x320| 63.7| 63.3| 1307| 6657| 32794| 37.9|[[Google]](https://drive.google.com/file/d/1sca65sHMnxY7YJ89FJ6Dg3S3yAjbLdMz/view?usp=sharing) [[Baidu]](https://pan.baidu.com/s/1cCulbPNneIXOpRRjrTgJ4g) |
The performance is tested on the MOT-16 training set, just for reference. Running speed is tested on an Nvidia Titan Xp GPU. For a more comprehensive comparison with other methods you can test on MOT-16 test set and submit a result to the [MOT-16 benchmark](https://motchallenge.net/results/MOT16/?det=Private). Note that the results should be submitted to the private detector track.
## Test on MOT-16 Challenge
```
python track.py --cfg ./cfg/yolov3_1088x608.cfg --weights /path/to/model/weights
```
By default the script runs evaluation on the MOT-16 training set. If you want to evaluate on the test set, please add `--test-mot16` to the command line.
Results are saved in text files in `$DATASET_ROOT/results/*.txt`. You can also add `--save-images` or `--save-videos` flags to obtain the visualized results. Visualized results are saved in `$DATASET_ROOT/outputs/`
## Training instruction
- Download the training datasets.
- Edit `cfg/ccmcpe.json` to configure the training/validation combinations. A dataset is represented by an image list; please see `data/*.train` for an example.
- Run the training script:
```
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python train.py
```
We use 8x Nvidia Titan Xp to train the model, with a batch size of 32. You can adjust the batch size (and the learning rate together) according to how many GPUs you have. You can also train with a smaller image size, which gives faster inference. Note, however, that the image size should be a multiple of 32 (the down-sampling rate).
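The multiple-of-32 constraint on the image size can be checked before training. A minimal sketch; `round_to_stride` is a hypothetical helper for illustration and is not part of this repo:

```python
def round_to_stride(size, stride=32):
    """Round a side length to the nearest positive multiple of `stride`."""
    return max(stride, int(round(size / stride)) * stride)


# Example: snap an arbitrary target resolution to valid network input sizes.
width, height = round_to_stride(1080), round_to_stride(600)
print(width, height)
```

The three released resolutions (1088x608, 864x480, 576x320) already satisfy this constraint, so the helper leaves them unchanged.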
### Train with custom datasets
Adding custom datasets is quite simple: all you need to do is organize your annotation files in the same format as in our training sets. Please refer to [DATASET_ZOO.md](https://github.com/Zhongdao/Towards-Realtime-MOT/blob/master/DATASET_ZOO.md) for the dataset format.
## Related Resources
- [FairMOT](https://github.com/ifzhang/FairMOT): An improved method based on the JDE framework, SOTA performance.
- [CSTrack](https://arxiv.org/pdf/2010.12138.pdf): Better disentangled detection/embedding heads for JDE.
- [JDE-Paddle](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.1/configs/mot): A nice re-implementation (and document) by Baidu [PaddlePaddle](https://github.com/PaddlePaddle) team.
- [JDE-CPP](https://github.com/samylee/Towards-Realtime-MOT-Cpp): A pure C++ re-implementation by [samylee](https://github.com/samylee). Helpful if you want to deploy JDE in your own project!
## Acknowledgement
A large portion of code is borrowed from [ultralytics/yolov3](https://github.com/ultralytics/yolov3) and [longcw/MOTDT](https://github.com/longcw/MOTDT), many thanks to their wonderful work!
## Citation
If you find this repo useful in your project or research, please consider citing it:
```
@article{wang2019towards,
title={Towards Real-Time Multi-Object Tracking},
author={Wang, Zhongdao and Zheng, Liang and Liu, Yixuan and Wang, Shengjin},
journal={The European Conference on Computer Vision (ECCV)},
year={2020}
}
```

@ -14,7 +14,18 @@ RUN pip install ipython
RUN pip install ipywidgets
#RUN pip install panel jupyter_bokeh
# from https://github.com/KaiyangZhou/deep-person-reid/blob/master/Dockerfile
ENV DEBIAN_FRONTEND=noninteractive
ENV TZ=Europe/Amsterdam
RUN apt update && apt install -y python3-opencv ca-certificates python3-dev git wget sudo ninja-build
RUN git clone https://github.com/KaiyangZhou/deep-person-reid/ /deep-person-reid
WORKDIR /deep-person-reid
RUN pip install -r requirements.txt
RUN python setup.py develop
# for bokeh
EXPOSE 5006
ENV TORCH_HOME=/Towards-Realtime-MOT/.torch
WORKDIR /Towards-Realtime-MOT
CMD python -m ipykernel_launcher -f $DOCKERNEL_CONNECTION_FILE