Compare commits

...

31 commits

Author SHA1 Message Date
Ruben van de Ven
cebe102e74 Tracker more sensitive and cv_renderer animates lines 2024-12-23 15:35:51 +01:00
Ruben van de Ven
888ef7ff93 fixes during playtest 2024-12-20 16:22:31 +01:00
Ruben van de Ven
dc5e2ff28c change render switches 2024-12-18 15:24:42 +01:00
Ruben van de Ven
8bcca04ecc fixes for projection trial 2024-12-18 12:04:29 +01:00
Ruben van de Ven
5ceeda05d7 slowly fade predicted tracks 2024-12-17 11:10:31 +01:00
Ruben van de Ven
6b12ddf08a CV renderer and binning options 2024-12-17 10:37:30 +01:00
Ruben van de Ven
d8004e9125 Update YOLO & move Trajectron dependency to git 2024-12-13 13:37:24 +01:00
Ruben van de Ven
a7c6aaacd3 Notebook updates 2024-12-10 15:43:30 +01:00
Ruben van de Ven
d6d3092e43 Blacklist tools 2024-12-10 15:43:06 +01:00
Ruben van de Ven
5bf60e4579 Notebook to visualise predictions of a trained model 2024-12-06 08:29:42 +01:00
Ruben van de Ven
dac5caf4fb Visualise data used for training 2024-12-06 08:28:48 +01:00
Ruben van de Ven
d6eac14898 Different smoothing and filtering before parsing data 2024-12-06 08:27:17 +01:00
Ruben van de Ven
0f96611771 Tweak training process 2024-12-03 15:21:52 +01:00
Ruben van de Ven
a4e57ae637 add tensorboard for training monitoring 2024-12-03 15:20:43 +01:00
Ruben van de Ven
a590a0dc35 Tools for blacklisting tracks 2024-11-28 16:08:55 +01:00
Ruben van de Ven
30648b9bb8 Formatting 2024-11-21 13:29:56 +01:00
Ruben van de Ven
389da6701f Tweak tracker to get better tracks 2024-11-20 16:59:22 +01:00
Ruben van de Ven
abc80727da Test a custom LSTM/RNN 2024-11-17 19:39:32 +01:00
Ruben van de Ven
cc952424e0 Data and viz options 2024-11-14 17:01:32 +01:00
Ruben van de Ven
627b320ec7 Build python-opencv with gstreamer support 2024-11-14 11:37:37 +01:00
Ruben van de Ven
204f8a836b Fix issue with logging queue 2024-11-14 11:37:05 +01:00
Ruben van de Ven
0612aa2048 Experiment with prediction 2024-11-12 21:37:20 +01:00
Ruben van de Ven
a2ced9646f Threaded logging to avoid pauses in code 2024-11-12 21:36:37 +01:00
Ruben van de Ven
f2d71a9da3 Different cornerpin and some tweaks for tryout 2024-11-08 18:47:15 +01:00
Ruben van de Ven
dd10ce13af Track directly from rtsp 2024-11-08 14:59:21 +01:00
Ruben van de Ven
b6360c09a3 fix vertical position of image 2024-11-07 16:22:40 +01:00
Ruben van de Ven
e6187964d3 display debug points from homography 2024-11-07 16:07:56 +01:00
Ruben van de Ven
0af5030845 Gstreamer for rtsp 2024-11-07 15:18:33 +01:00
Ruben van de Ven
9284ce8849 tracker tool for fast tracking data n camera undistort 2024-11-06 16:22:03 +01:00
Ruben van de Ven
a0c63c4929 Preliminary rendering of second window with only animation 2024-11-05 19:16:57 +01:00
Ruben van de Ven
2e2bd76b05 Experiment with separate renderer for actual projection 2024-11-05 16:56:43 +01:00
22 changed files with 10860 additions and 1982 deletions

1
.gitignore vendored
View file

@ -1,6 +1,7 @@
.idea/
OUT/
EXPERIMENTS/
runs/
## Core latex/pdflatex auxiliary files:
*.aux

24
README.md Normal file
View file

@ -0,0 +1,24 @@
# Trajectory Prediction Video installation
## Install
* Run `bash build_opencv_with_gstreamer.sh` to build opencv with gstreamer support
* Use pyenv + poetry to install
## How to
> See also the sibling repo [traptools](https://git.rubenvandeven.com/security_vision/traptools) for camera calibration and homography tools that are needed for this repo.
These are roughly the steps to go from datagathering to training
1. Make sure to have some recordings with a fixed camera. [UPDATE: not needed anymore, except for calibration & homography footage]
* Recording can be done with `ffmpeg -rtsp_transport udp -i rtsp://USER:PASS@IP:554/Streaming/Channels/1.mp4 hof2-cam-$(date "+%Y%m%d-%H%M").mp4`
2. Follow the steps in the auxilary [traptools](https://git.rubenvandeven.com/security_vision/traptools) repository to obtain (1) camera matrix, lens distortion, image dimensions, and (2+3) homography
3. Run the tracker, e.g. `poetry run tracker --detector ultralytics --homography ../DATASETS/NAME/homography.json --video-src ../DATASETS/NAME/*.mp4 --calibration ../DATASETS/NAME/calibration.json --save-for-training EXPERIMENTS/raw/NAME/`
* Note: You can run this right of the camera stream: `poetry run tracker --eval_device cuda:0 --detector ultralytics --video-src rtsp://USER:PW@ADDRESS/STREAM --homography ../DATASETS/NAME/homography.json --calibration ../DATASETS/NAME/calibration.json --save-for-training EXPERIMENTS/raw/NAME/`, each recording adding a new file to the `raw` folder.
4. Parse tracker data to Trajectron format: `poetry run process_data --src-dir EXPERIMENTS/raw/NAME --dst-dir EXPERIMENTS/trajectron-data/ --name NAME` Optionally, smooth tracks: `--smooth-tracks`
5. Train Trajectron model `poetry run trajectron_train --eval_every 10 --vis_every 1 --train_data_dict NAME_train.pkl --eval_data_dict NAME_val.pkl --offline_scene_graph no --preprocess_workers 8 --log_dir EXPERIMENTS/models --log_tag _NAME --train_epochs 100 --conf EXPERIMENTS/config.json --batch_size 256 --data_dir EXPERIMENTS/trajectron-data `
6. The run!
* On a video file (you can use a wildcard) `DISPLAY=:1 poetry run trapserv --remote-log-addr 100.69.123.91 --eval_device cuda:0 --detector ultralytics --homography ../DATASETS/NAME/homography.json --eval_data_dict EXPERIMENTS/trajectron-data/hof2s-m_test.pkl --video-src ../DATASETS/NAME/*.mp4 --model_dir EXPERIMENTS/models/models_DATE_NAME/--smooth-predictions --smooth-tracks --num-samples 3 --render-window --calibration ../DATASETS/NAME/calibration.json` (the DISPLAY environment variable is used here to running over SSH connection and display on local monitor)
* or on the RTSP stream. Which uses gstreamer to substantially reduce latency compared to the default ffmpeg bindings in OpenCV.
* To just have a single trajectory pulled from distribution use `--full-dist`. Also try `--z_mode`.

View file

@ -0,0 +1,45 @@
#!/bin/bash
# When using RTSP gstreamer can provides a way lower latency then ffmpeg
# and exposes more options to tweak the connection. However, the pypi
# version of python-opencv is build without gstreamer. Thus, we need to
# build our own python wheel.
# adapted from https://github.com/opencv/opencv-python/issues/530#issuecomment-1006343643
# install gstreamer dependencies
sudo apt-get install --quiet -y --no-install-recommends \
gstreamer1.0-gl \
gstreamer1.0-opencv \
gstreamer1.0-plugins-bad \
gstreamer1.0-plugins-good \
gstreamer1.0-plugins-ugly \
gstreamer1.0-tools \
libgstreamer-plugins-base1.0-dev \
libgstreamer1.0-0 \
libgstreamer1.0-dev
# ffmpeg deps
sudo apt install ffmpeg libgtk2.0-dev libavformat-dev libavcodec-dev libavutil-dev libswscale-dev libtbb-dev libjpeg-dev libpng-dev libtiff-dev
OPENCV_VER="84" #fix at 4.10.0.84, or use rolling release: "4.x"
STARTDIR=$(pwd)
TMPDIR=$(mktemp -d)
# Build and install OpenCV from source.
echo $TMPDIR
# pyenv compatibility
cp .python-version $TMPDIR
cd "${TMPDIR}"
git clone --branch ${OPENCV_VER} --depth 1 --recurse-submodules --shallow-submodules https://github.com/opencv/opencv-python.git opencv-python-${OPENCV_VER}
cd opencv-python-${OPENCV_VER}
export ENABLE_CONTRIB=0
# export ENABLE_HEADLESS=1
# We want GStreamer support enabled.
export CMAKE_ARGS="-DWITH_GSTREAMER=ON -DWITH_FFMPEG=ON"
python -m pip wheel . --verbose -w $STARTDIR
# # Install OpenCV
# python3 -m pip install opencv_python*.whl
# cp opencv_python*.whl $STARTDIR

11
custom_bytetrack.yaml Normal file
View file

@ -0,0 +1,11 @@
# Ultralytics YOLO 🚀, AGPL-3.0 license
# Default YOLO tracker settings for ByteTrack tracker https://github.com/ifzhang/ByteTrack
tracker_type: bytetrack # tracker type, ['botsort', 'bytetrack']
track_high_thresh: 0.0001 # threshold for the first association
track_low_thresh: 0.0001 # threshold for the second association
new_track_thresh: 0.0001 # threshold for init new track if the detection does not match any tracks
track_buffer: 50 # buffer to calculate the time when to remove tracks
match_thresh: 0.95 # threshold for matching tracks
fuse_score: True # Whether to fuse confidence scores with the iou distances before matching
# min_box_area: 10 # threshold for min box areas(for tracker evaluation, not used for now)

3477
poetry.lock generated

File diff suppressed because it is too large Load diff

View file

@ -7,12 +7,18 @@ readme = "README.md"
[tool.poetry.scripts]
trapserv = "trap.plumber:start"
tracker = "trap.tools:tracker_preprocess"
compare = "trap.tools:tracker_compare"
process_data = "trap.process_data:main"
blacklist = "trap.tools:blacklist_tracks"
rewrite_tracks = "trap.tools:rewrite_raw_track_files"
[tool.poetry.dependencies]
python = "^3.10,<3.12,"
trajectron-plus-plus = { path = "../Trajectron-plus-plus/", develop = true }
#trajectron-plus-plus = { path = "../Trajectron-plus-plus/", develop = true }
trajectron-plus-plus = { git = "https://git.rubenvandeven.com/security_vision/Trajectron-plus-plus/" }
torch = [
{ version="1.12.1" },
# { url = "https://download.pytorch.org/whl/cu113/torch-1.12.1%2Bcu113-cp38-cp38-linux_x86_64.whl", markers = "python_version ~= '3.8' and sys_platform == 'linux'" },
@ -25,13 +31,19 @@ torchvision = [
{ url = "https://download.pytorch.org/whl/cu113/torchvision-0.13.1%2Bcu113-cp310-cp310-linux_x86_64.whl", markers = "python_version ~= '3.10' and sys_platform == 'linux'" },
]
deep-sort-realtime = "^1.3.2"
ultralytics = "^8.0.200"
ultralytics = "^8.3"
ffmpeg-python = "^0.2.0"
torchreid = "^0.2.5"
gdown = "^4.7.1"
pandas-helper-calc = {git = "https://github.com/scls19fr/pandas-helper-calc"}
tsmoothie = "^1.0.5"
pyglet = "^2.0.15"
pyglet-cornerpin = "^0.3.0"
opencv-python = {file="./opencv_python-4.10.0.84-cp310-cp310-linux_x86_64.whl"}
setproctitle = "^1.3.3"
bytetracker = { git = "https://github.com/rubenvandeven/bytetrack-pip" }
jsonlines = "^4.0.0"
tensorboardx = "^2.6.2.2"
[build-system]
requires = ["poetry-core"]

3624
test_custom_rnn.ipynb Normal file

File diff suppressed because one or more lines are too long

1079
test_model.ipynb Normal file

File diff suppressed because one or more lines are too long

332
test_tracking_data.ipynb Normal file

File diff suppressed because one or more lines are too long

235
test_training_data.ipynb Normal file

File diff suppressed because one or more lines are too long

566
trap/animation_renderer.py Normal file
View file

@ -0,0 +1,566 @@
# used for "Forward Referencing of type annotations"
from __future__ import annotations
import time
import tracemalloc
import ffmpeg
from argparse import Namespace
import datetime
import logging
from multiprocessing import Event
from multiprocessing.synchronize import Event as BaseEvent
import cv2
import numpy as np
import pyglet
import pyglet.event
import zmq
import tempfile
from pathlib import Path
import shutil
import math
from pyglet import shapes
from PIL import Image
import json
from trap.frame_emitter import DetectionState, Frame, Track
from trap.preview_renderer import DrawnTrack, PROJECTION_IMG, PROJECTION_MAP
from trap.utils import convert_world_space_to_img_space, display_top
logger = logging.getLogger("trap.renderer")
# COLOR_PRIMARY = (0,0,0,255)
COLOR_PRIMARY = (255,255,255, 255)
class AnimationRenderer:
def __init__(self, config: Namespace, is_running: BaseEvent):
tracemalloc.start()
self.config = config
self.is_running = is_running
context = zmq.Context()
self.prediction_sock = context.socket(zmq.SUB)
self.prediction_sock.setsockopt(zmq.CONFLATE, 1) # only keep latest frame. NB. make sure this comes BEFORE connect, otherwise it's ignored!!
self.prediction_sock.setsockopt(zmq.SUBSCRIBE, b'')
self.prediction_sock.connect(config.zmq_prediction_addr)
self.tracker_sock = context.socket(zmq.SUB)
self.tracker_sock.setsockopt(zmq.CONFLATE, 1) # only keep latest frame. NB. make sure this comes BEFORE connect, otherwise it's ignored!!
self.tracker_sock.setsockopt(zmq.SUBSCRIBE, b'')
self.tracker_sock.connect(config.zmq_trajectory_addr)
self.frame_noimg_sock = context.socket(zmq.SUB)
self.frame_noimg_sock.setsockopt(zmq.CONFLATE, 1) # only keep latest frame. NB. make sure this comes BEFORE connect, otherwise it's ignored!!
self.frame_noimg_sock.setsockopt(zmq.SUBSCRIBE, b'')
self.frame_noimg_sock.connect(config.zmq_frame_noimg_addr)
self.H = self.config.H
self.inv_H = np.linalg.pinv(self.H)
# TODO: get FPS from frame_emitter
# self.out = cv2.VideoWriter(str(filename), fourcc, 23.97, (1280,720))
self.fps = 60
self.frame_size = (self.config.camera.w,self.config.camera.h)
self.hide_stats = self.config.render_hide_stats
self.hide_bg = True
self.pause = False
self.out_writer = None # self.start_writer() if self.config.render_file else None
self.streaming_process = self.start_streaming() if self.config.render_url else None
# if self.config.render_window:
# pass
# # cv2.namedWindow("frame", cv2.WND_PROP_FULLSCREEN)
# # cv2.setWindowProperty("frame",cv2.WND_PROP_FULLSCREEN,cv2.WINDOW_FULLSCREEN)
# else:
if self.streaming_process is not None:
pyglet.options["headless"] = True
config = pyglet.gl.Config(sample_buffers=1, samples=4)
# , fullscreen=self.config.render_window
display = pyglet.canvas.get_display()
idx = -1 if self.config.render_window else 0
screen = display.get_screens()[idx]
print(display.get_screens())
if self.streaming_process is not None:
self.window = pyglet.window.Window(width=self.frame_size[0], height=self.frame_size[1], config=config, fullscreen=False, screen=screen)
else:
self.window = pyglet.window.Window(width=screen.width, height=screen.height, config=config, fullscreen=True, screen=screen)
self.window.set_handler('on_draw', self.on_draw)
self.window.set_handler('on_refresh', self.on_refresh)
self.window.set_handler('on_close', self.on_close)
self.window.set_handler('on_key_press', self.on_key_press)
# don't know why, but importing this before window leads to "x connection to :1 broken (explicit kill or server shutdown)"
from pyglet_cornerpin import PygletCornerPin
# self.pins = PygletCornerPin(self.window, corners=[[-144,-2], [2880,0], [-168,958], [3011,1553]])
# x1 540 y1 760-360
# x2 1380 y2 670-360
self.pins = PygletCornerPin(
self.window,
source_points=[[540, 670-360], [1380,670-360], [540,760-360], [1380,760-360]],
# corners=[[540, 670-360], [1380,670-360], [540,760-360], [1380,760-360]], # original test: short throw?
# corners=[[396, 442], [1644, 734], [350, 516], [1572, 796]], # beamer downstairs
# corners=[[270, 452], [1698, 784], [314, 568], [1572, 860]], # ??
# corners=[[471, 304], [1797, 376], [467, 387], [1792, 484]] # ??
# corners=[[576, 706], [1790, 696], [588, 794], [1728, 796]], # beamer boven
)
self.window.push_handlers(self.pins)
# pyglet.gl.glClearColor(255,255,255,255)
self.fps_display = pyglet.window.FPSDisplay(window=self.window, color=COLOR_PRIMARY)
self.fps_display.label.x = self.window.width - 50
self.fps_display.label.y = self.window.height - 17
self.fps_display.label.bold = False
self.fps_display.label.font_size = 10
self.drawn_tracks: dict[str, DrawnTrack] = {}
self.first_time: float|None = None
self.frame: Frame|None= None
self.tracker_frame: Frame|None = None
self.prediction_frame: Frame|None = None
self.batch_bg = pyglet.graphics.Batch()
self.batch_overlay = pyglet.graphics.Batch()
self.batch_anim = pyglet.graphics.Batch()
self.batch_debug = pyglet.graphics.Batch()
# if self.config.render_debug_shapes:
self.render_debug_shapes = self.config.render_debug_shapes
self.render_lines = True
self.debug_lines = [
pyglet.shapes.Line(1370, self.config.camera.h-360, 1380, 670-360, 2, COLOR_PRIMARY, batch=self.batch_debug),#v
pyglet.shapes.Line(0, 660-360, 1380, 670-360, 2, COLOR_PRIMARY, batch=self.batch_debug), #h
pyglet.shapes.Line(1140, 760-360, 1140, 675-360, 2, COLOR_PRIMARY, batch=self.batch_debug), #h
pyglet.shapes.Line(540, 760-360,540, 675-360, 2, COLOR_PRIMARY, batch=self.batch_debug), #v
pyglet.shapes.Line(0, 770-360, 1380, 770-360, 2, COLOR_PRIMARY, batch=self.batch_debug), #h
]
self.debug_points = []
# print(self.config.debug_points_file)
if self.config.debug_points_file:
with self.config.debug_points_file.open('r') as fp:
img_points = np.array(json.load(fp))
# to place points accurate I used a 2160p image, but during calibration and
# prediction I use(d) a 1440p image, so convert points to different space:
img_points = np.array(img_points)
# first undistort the points so that lines are actually straight
undistorted_img_points = cv2.undistortPoints(np.array([img_points]).astype('float32'), self.config.camera.mtx, self.config.camera.dist, None, self.config.camera.newcameramtx)
dst_img_points = cv2.perspectiveTransform(np.array(undistorted_img_points), self.config.camera.H)
if dst_img_points.shape[1:] == (1,2):
dst_img_points = np.reshape(dst_img_points, (dst_img_points.shape[0], 2))
self.debug_points = [
pyglet.shapes.Circle(p[0], self.window.height - p[1], 3, color=(255,0,0,255), batch=self.batch_debug) for p in dst_img_points
]
self.init_shapes()
self.init_labels()
def start_streaming(self):
"""TODO)) This should be inherited from a generic renderer"""
return (
ffmpeg
.input('pipe:', format='rawvideo',codec="rawvideo", pix_fmt='bgr24', s='{}x{}'.format(*self.frame_size))
.output(
self.config.render_url,
#codec = "copy", # use same codecs of the original video
codec='libx264',
listen=1, # enables HTTP server
pix_fmt="yuv420p",
preset="ultrafast",
tune="zerolatency",
# g=f"{self.fps*2}",
g=f"{60*2}",
analyzeduration="2000000",
probesize="1000000",
f='mpegts'
)
.overwrite_output()
.run_async(pipe_stdin=True)
)
# return process
def init_shapes(self):
'''
Due to error when running headless, we need to configure options before extending the shapes class
'''
class GradientLine(shapes.Line):
def __init__(self, x, y, x2, y2, width=1, color1=[255,255,255], color2=[255,255,255], batch=None, group=None):
# print('colors!', colors)
# assert len(colors) == 6
r, g, b, *a = color1
self._rgba1 = (r, g, b, a[0] if a else 255)
r, g, b, *a = color2
self._rgba2 = (r, g, b, a[0] if a else 255)
# print('rgba', self._rgba)
super().__init__(x, y, x2, y2, width, color1, batch=None, group=None)
# <pyglet.graphics.vertexdomain.VertexList
# pyglet.graphics.vertexdomain
# print(self._vertex_list)
def _create_vertex_list(self):
'''
copy of super()._create_vertex_list but with additional colors'''
self._vertex_list = self._group.program.vertex_list(
6, self._draw_mode, self._batch, self._group,
position=('f', self._get_vertices()),
colors=('Bn', self._rgba1+ self._rgba2 + self._rgba2 + self._rgba1 + self._rgba2 +self._rgba1 ),
translation=('f', (self._x, self._y) * self._num_verts))
def _update_colors(self):
self._vertex_list.colors[:] = self._rgba1+ self._rgba2 + self._rgba2 + self._rgba1 + self._rgba2 +self._rgba1
def color1(self, color):
r, g, b, *a = color
self._rgba1 = (r, g, b, a[0] if a else 255)
self._update_colors()
def color2(self, color):
r, g, b, *a = color
self._rgba2 = (r, g, b, a[0] if a else 255)
self._update_colors()
self.gradientLine = GradientLine
def init_labels(self):
base_color = COLOR_PRIMARY
color_predictor = (255,255,0, 255)
color_info = (255,0, 255, 255)
color_tracker = (0,255, 255, 255)
options = []
for option in ['prediction_horizon','num_samples','full_dist','gmm_mode','z_mode', 'model_dir']:
options.append(f"{option}: {self.config.__dict__[option]}")
self.labels = {
'waiting': pyglet.text.Label("Waiting for prediction"),
'frame_idx': pyglet.text.Label("", x=20, y=self.window.height - 17, color=base_color, batch=self.batch_overlay),
'tracker_idx': pyglet.text.Label("", x=90, y=self.window.height - 17, color=color_tracker, batch=self.batch_overlay),
'pred_idx': pyglet.text.Label("", x=110, y=self.window.height - 17, color=color_predictor, batch=self.batch_overlay),
'frame_time': pyglet.text.Label("t", x=140, y=self.window.height - 17, color=base_color, batch=self.batch_overlay),
'frame_latency': pyglet.text.Label("", x=235, y=self.window.height - 17, color=color_info, batch=self.batch_overlay),
'tracker_time': pyglet.text.Label("", x=300, y=self.window.height - 17, color=color_tracker, batch=self.batch_overlay),
'pred_time': pyglet.text.Label("", x=360, y=self.window.height - 17, color=color_predictor, batch=self.batch_overlay),
'track_len': pyglet.text.Label("", x=800, y=self.window.height - 17, color=color_tracker, batch=self.batch_overlay),
'options1': pyglet.text.Label(options.pop(-1), x=20, y=30, color=base_color, batch=self.batch_overlay),
'options2': pyglet.text.Label(" | ".join(options), x=20, y=10, color=base_color, batch=self.batch_overlay),
}
def refresh_labels(self, dt: float):
"""Every frame"""
if self.frame:
self.labels['frame_idx'].text = f"{self.frame.index:06d}"
self.labels['frame_time'].text = f"{self.frame.time - self.first_time: >10.2f}s"
self.labels['frame_latency'].text = f"{self.frame.time - time.time():.2f}s"
if self.frame.time - self.first_time > 30 and (not hasattr(self, 'has_snap') or self.has_snap == False):
snapshot = tracemalloc.take_snapshot()
display_top(snapshot, 'traceback', 15)
tracemalloc.stop()
self.has_snap = True
if self.tracker_frame and self.frame:
self.labels['tracker_idx'].text = f"{self.tracker_frame.index - self.frame.index}"
self.labels['tracker_time'].text = f"{self.tracker_frame.time - time.time():.3f}s"
self.labels['track_len'].text = f"{len(self.tracker_frame.tracks)} tracks"
if self.prediction_frame and self.frame:
self.labels['pred_idx'].text = f"{self.prediction_frame.index - self.frame.index}"
self.labels['pred_time'].text = f"{self.prediction_frame.time - time.time():.3f}s"
# self.labels['track_len'].text = f"{len(self.prediction_frame.tracks)} tracks"
# cv2.putText(img, f"{frame.index:06d}", (20,17), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
# cv2.putText(img, f"{frame.time - first_time:.3f}s", (120,17), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
# if prediction_frame:
# # render Δt and Δ frames
# cv2.putText(img, f"{prediction_frame.index - frame.index}", (90,17), cv2.FONT_HERSHEY_PLAIN, 1, info_color, 1)
# cv2.putText(img, f"{prediction_frame.time - time.time():.2f}s", (200,17), cv2.FONT_HERSHEY_PLAIN, 1, info_color, 1)
# cv2.putText(img, f"{len(prediction_frame.tracks)} tracks", (500,17), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
# cv2.putText(img, f"h: {np.average([len(t.history or []) for t in prediction_frame.tracks.values()]):.2f}", (580,17), cv2.FONT_HERSHEY_PLAIN, 1, info_color, 1)
# cv2.putText(img, f"ph: {np.average([len(t.predictor_history or []) for t in prediction_frame.tracks.values()]):.2f}", (660,17), cv2.FONT_HERSHEY_PLAIN, 1, info_color, 1)
# cv2.putText(img, f"p: {np.average([len(t.predictions or []) for t in prediction_frame.tracks.values()]):.2f}", (740,17), cv2.FONT_HERSHEY_PLAIN, 1, info_color, 1)
# options = []
# for option in ['prediction_horizon','num_samples','full_dist','gmm_mode','z_mode', 'model_dir']:
# options.append(f"{option}: {config.__dict__[option]}")
# cv2.putText(img, options.pop(-1), (20,img.shape[0]-30), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
# cv2.putText(img, " | ".join(options), (20,img.shape[0]-10), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
def check_frames(self, dt):
if self.pause:
return
new_tracks = False
try:
self.frame: Frame = self.frame_noimg_sock.recv_pyobj(zmq.NOBLOCK)
if not self.first_time:
self.first_time = self.frame.time
if self.frame.img:
img = self.frame.img
# newcameramtx, roi = cv2.getOptimalNewCameraMatrix(self.config.camera.mtx, self.config.camera.dist, (self.frame.img.shape[1], self.frame.img.shape[0]), 1, (self.frame.img.shape[1], self.frame.img.shape[0]))
img = cv2.undistort(img, self.config.camera.mtx, self.config.camera.dist, None, self.config.camera.newcameramtx)
img = cv2.warpPerspective(img, convert_world_space_to_img_space(self.config.camera.H), (self.config.camera.w, self.config.camera.h))
# img = cv2.GaussianBlur(img, (15, 15), 0)
img = cv2.flip(cv2.cvtColor(img, cv2.COLOR_BGR2RGB), 0)
img = pyglet.image.ImageData(self.frame_size[0], self.frame_size[1], 'RGB', img.tobytes())
# don't draw in batch, so that it is the background
if hasattr(self, 'video_sprite') and self.video_sprite:
self.video_sprite.delete()
self.frame.img = None
self.video_sprite = pyglet.sprite.Sprite(img=img, batch=self.batch_bg)
# transform to flipped coordinate system for pyglet
self.video_sprite.y = self.window.height - self.video_sprite.height
# self.frame.img = np.array([]) # clearing memory?
# self.video_sprite.opacity = 70
except zmq.ZMQError as e:
# idx = frame.index if frame else "NONE"
# logger.debug(f"reuse video frame {idx}")
pass
try:
self.prediction_frame: Frame = self.prediction_sock.recv_pyobj(zmq.NOBLOCK)
new_tracks = True
except zmq.ZMQError as e:
pass
try:
self.tracker_frame: Frame = self.tracker_sock.recv_pyobj(zmq.NOBLOCK)
new_tracks = True
except zmq.ZMQError as e:
pass
if new_tracks:
self.update_tracks()
def update_tracks(self):
"""Updates the track objects and shapes. Called after setting `prediction_frame`
"""
# clean up
# for track_id in list(self.drawn_tracks.keys()):
# if track_id not in self.prediction_frame.tracks.keys():
# # TODO fade out
# del self.drawn_tracks[track_id]
if self.tracker_frame:
for track_id, track in self.tracker_frame.tracks.items():
if track_id not in self.drawn_tracks:
self.drawn_tracks[track_id] = DrawnTrack(track_id, track, self, self.tracker_frame.H, PROJECTION_MAP, self.config.camera)
else:
self.drawn_tracks[track_id].set_track(track)
if self.prediction_frame:
for track_id, track in self.prediction_frame.tracks.items():
if track_id not in self.drawn_tracks:
self.drawn_tracks[track_id] = DrawnTrack(track_id, track, self, self.prediction_frame.H, PROJECTION_MAP, self.config.camera)
else:
self.drawn_tracks[track_id].set_predictions(track)
# clean up
for track_id in list(self.drawn_tracks.keys()):
# TODO make delay configurable
if self.drawn_tracks[track_id].update_at < time.time() - 5:
# TODO fade out
del self.drawn_tracks[track_id]
def on_key_press(self, symbol, modifiers):
print('A key was pressed, use f to hide')
if symbol == ord('f'):
self.window.set_fullscreen(not self.window.fullscreen)
if symbol == ord('h'):
self.hide_stats = not self.hide_stats
if symbol == ord('d'):
self.render_debug_shapes = not self.render_debug_shapes
if symbol == ord('p'):
self.pause = not self.pause
if symbol == ord('b'):
self.hide_bg = not self.hide_bg
if symbol == ord('l'):
self.render_lines = not self.render_lines
def check_running(self, dt):
if not self.is_running.is_set():
self.window.close()
self.event_loop.exit()
print('quit animation renderer')
def on_close(self):
self.is_running.clear()
def on_refresh(self, dt: float):
# update shapes
# self.bg =
for track_id, track in self.drawn_tracks.items():
track.update_drawn_positions(dt)
self.refresh_labels(dt)
# self.shape1 = shapes.Circle(700, 150, 100, color=(50, 0, 30), batch=self.batch_anim)
# self.shape3 = shapes.Circle(800, 150, 100, color=(100, 225, 30), batch=self.batch_anim)
pass
def on_draw(self):
self.window.clear()
if not self.hide_bg:
self.batch_bg.draw()
if self.render_debug_shapes:
self.batch_debug.draw()
self.pins.draw()
if self.render_lines:
for track in self.drawn_tracks.values():
for shape in track.shapes:
shape.draw() # for some reason the batches don't work
for track in self.drawn_tracks.values():
for shapes in track.pred_shapes:
for shape in shapes:
shape.draw()
# self.batch_anim.draw()
# pyglet.graphics.draw(3, pyglet.gl.GL_LINE, ("v2i", (100,200, 600,800)), ('c3B', (255,255,255, 255,255,255)))
if not self.hide_stats:
self.batch_overlay.draw()
self.fps_display.draw()
# if streaming, capture buffer and send
try:
if self.streaming_process or self.out_writer:
buf = pyglet.image.get_buffer_manager().get_color_buffer()
img_data = buf.get_image_data()
data = img_data.get_data() # alternative: .get_data("RGBA", image_data.pitch)
img = np.asanyarray(data).reshape((img_data.height, img_data.width, 4))
img = cv2.cvtColor(img, cv2.COLOR_BGRA2RGB)
img = np.flip(img, 0)
# img = cv2.flip(img, cv2.0)
# cv2.imshow('frame', img)
# cv2.waitKey(1)
if self.streaming_process:
self.streaming_process.stdin.write(img.tobytes())
if self.out_writer:
self.out_writer.write(img)
except Exception as e:
logger.exception(e)
def run(self):
frame = None
prediction_frame = None
tracker_frame = None
i=0
first_time = None
self.event_loop = pyglet.app.EventLoop()
pyglet.clock.schedule_interval(self.check_running, 0.1)
pyglet.clock.schedule(self.check_frames)
self.event_loop.run()
# while self.is_running.is_set():
# i+=1
# # zmq_ev = self.frame_sock.poll(timeout=2000)
# # if not zmq_ev:
# # # when no data comes in, loop so that is_running is checked
# # continue
# try:
# frame: Frame = self.frame_sock.recv_pyobj(zmq.NOBLOCK)
# except zmq.ZMQError as e:
# # idx = frame.index if frame else "NONE"
# # logger.debug(f"reuse video frame {idx}")
# pass
# # else:
# # logger.debug(f'new video frame {frame.index}')
# if frame is None:
# # might need to wait a few iterations before first frame comes available
# time.sleep(.1)
# continue
# try:
# prediction_frame: Frame = self.prediction_sock.recv_pyobj(zmq.NOBLOCK)
# except zmq.ZMQError as e:
# logger.debug(f'reuse prediction')
# if first_time is None:
# first_time = frame.time
# img = decorate_frame(frame, prediction_frame, first_time, self.config)
# img_path = (self.config.output_dir / f"{i:05d}.png").resolve()
# logger.debug(f"write frame {frame.time - first_time:.3f}s")
# if self.out_writer:
# self.out_writer.write(img)
# if self.streaming_process:
# self.streaming_process.stdin.write(img.tobytes())
# if self.config.render_window:
# cv2.imshow('frame',img)
# cv2.waitKey(1)
logger.info('Stopping')
logger.info(f'used corner pins {self.pins.pin_positions}')
print(self.pins.pin_positions)
# if i>2:
if self.streaming_process:
self.streaming_process.stdin.close()
if self.out_writer:
self.out_writer.release()
if self.streaming_process:
# oddly wrapped, because both close and release() take time.
self.streaming_process.wait()
def run_animation_renderer(config: Namespace, is_running: BaseEvent):
renderer = AnimationRenderer(config, is_running)
renderer.run()

View file

@ -1,10 +1,14 @@
import argparse
from pathlib import Path
import types
import numpy as np
import json
from trap.tracker import DETECTORS
from trap.tracker import DETECTORS, TRACKER_BYTETRACK, TRACKERS
from trap.frame_emitter import Camera
from pyparsing import Optional
from trap.frame_emitter import UrlOrPath
class LambdaParser(argparse.ArgumentParser):
"""Execute lambda functions
@ -49,6 +53,51 @@ frame_emitter_parser = parser.add_argument_group('Frame emitter')
tracker_parser = parser.add_argument_group('Tracker')
render_parser = parser.add_argument_group('Renderer')
class HomographyAction(argparse.Action):
def __init__(self, option_strings, dest, nargs=None, **kwargs):
if nargs is not None:
raise ValueError("nargs not allowed")
super().__init__(option_strings, dest, **kwargs)
def __call__(self, parser, namespace, values: Path, option_string=None):
if values.suffix == '.json':
with values.open('r') as fp:
H = np.array(json.load(fp))
else:
H = np.loadtxt(values, delimiter=',')
setattr(namespace, self.dest, values)
setattr(namespace, 'H', H)
class CameraAction(argparse.Action):
def __init__(self, option_strings, dest, nargs=None, **kwargs):
if nargs is not None:
raise ValueError("nargs not allowed")
super().__init__(option_strings, dest, **kwargs)
def __call__(self, parser, namespace, values, option_string=None):
if values is None:
setattr(namespace, self.dest, None)
else:
camera = Camera.from_calibfile(Path(values), namespace.H, namespace.camera_fps)
# values = Path(values)
# with values.open('r') as fp:
# data = json.load(fp)
# # print(data)
# # print(data['camera_matrix'])
# # camera = {
# # 'camera_matrix': np.array(data['camera_matrix']),
# # 'dist_coeff': np.array(data['dist_coeff']),
# # }
# camera = Camera(np.array(data['camera_matrix']), np.array(data['dist_coeff']), data['dim']['width'], data['dim']['height'], namespace.H, namespace.camera_fps)
setattr(namespace, 'camera', camera)
inference_parser.add_argument("--step-size",
# TODO)) Make dataset/model metadata
help="sample step size (should be the same as for data processing and augmentation)",
type=int,
default=1,
)
inference_parser.add_argument("--model_dir",
help="directory with the model to use for inference",
type=str, # TODO: make into Path
@ -166,16 +215,20 @@ inference_parser.add_argument('--num-samples',
default=5)
inference_parser.add_argument("--full-dist",
help="Trajectron.incremental_forward parameter",
type=bool,
default=False)
action='store_true')
inference_parser.add_argument("--gmm-mode",
help="Trajectron.incremental_forward parameter",
type=bool,
default=True)
inference_parser.add_argument("--z-mode",
help="Trajectron.incremental_forward parameter",
type=bool,
default=False)
action='store_true')
inference_parser.add_argument('--cm-to-m',
help="Correct for homography that is in cm (i.e. {x,y}/100). Should also be used when processing data",
action='store_true')
inference_parser.add_argument('--center-data',
help="Center data around cx and cy. Should also be used when processing data",
action='store_true')
# Internal connections.
@ -198,7 +251,12 @@ connection_parser.add_argument('--zmq-prediction-addr',
connection_parser.add_argument('--zmq-frame-addr',
help='Manually specity communication addr for the frame messages',
type=str,
default="ipc:///tmp/feeds_frame")
default="ipc:///tmp/feeds_frame")
connection_parser.add_argument('--zmq-frame-noimg-addr',
help='Manually specity communication addr for the frame messages',
type=str,
default="ipc:///tmp/feeds_frame2")
connection_parser.add_argument('--ws-port',
@ -213,10 +271,10 @@ connection_parser.add_argument('--bypass-prediction',
# Frame emitter
frame_emitter_parser.add_argument("--video-src",
help="source video to track from",
type=Path,
help="source video to track from can be either a relative or absolute path, or a url, like an RTSP resource",
type=UrlOrPath,
nargs='+',
default=lambda: list(Path('../DATASETS/VIRAT_subset_0102x/').glob('*.mp4')))
default=lambda: [UrlOrPath(p) for p in Path('../DATASETS/VIRAT_subset_0102x/').glob('*.mp4')])
frame_emitter_parser.add_argument("--video-offset",
help="Start playback from given frame. Note that when src is an array, this applies to all videos individually.",
default=None,
@ -231,10 +289,20 @@ frame_emitter_parser.add_argument("--video-loop",
# Tracker
tracker_parser.add_argument("--camera-fps",
help="Camera FPS",
type=int,
default=12)
tracker_parser.add_argument("--homography",
help="File with homography params",
type=Path,
default='../DATASETS/VIRAT_subset_0102x/VIRAT_0102_homography_img2world.txt')
default='../DATASETS/VIRAT_subset_0102x/VIRAT_0102_homography_img2world.txt',
action=HomographyAction)
tracker_parser.add_argument("--calibration",
help="File with camera intrinsics and lens distortion params (calibration.json)",
# type=Path,
default=None,
action=CameraAction)
tracker_parser.add_argument("--save-for-training",
help="Specify the path in which to save",
type=Path,
@ -243,11 +311,29 @@ tracker_parser.add_argument("--detector",
help="Specify the detector to use",
type=str,
choices=DETECTORS)
tracker_parser.add_argument("--tracker",
help="Specify the detector to use",
type=str,
default=TRACKER_BYTETRACK,
choices=TRACKERS)
tracker_parser.add_argument("--smooth-tracks",
help="Smooth the tracker tracks before sending them to the predictor",
action='store_true')
# now in calibration.json
# tracker_parser.add_argument("--frame-width",
# help="width of the frames",
# type=int,
# default=1280)
# tracker_parser.add_argument("--frame-height",
# help="height of the frames",
# type=int,
# default=720)
# Renderer
# render_parser.add_argument("--disable-renderer",
# help="Disable the renderer all together. Usefull when using an external renderer",
# action="store_true")
render_parser.add_argument("--render-file",
help="Render a video file previewing the prediction, and its delay compared to the current frame",
@ -255,6 +341,15 @@ render_parser.add_argument("--render-file",
render_parser.add_argument("--render-window",
help="Render a previewing to a window",
action='store_true')
render_parser.add_argument("--render-animation",
help="Render animation (pyglet)",
action='store_true')
render_parser.add_argument("--render-debug-shapes",
help="Lines and points for debugging/mapping",
action='store_true')
render_parser.add_argument("--render-hide-stats",
help="Default toggle to hide (switch with 'h')",
action='store_true')
render_parser.add_argument("--full-screen",
help="Set Window full screen",
action='store_true')
@ -269,3 +364,9 @@ render_parser.add_argument("--render-url",
type=str,
default=None)
render_parser.add_argument("--debug-points-file",
help="A json file with points to test projection/homography etc.",
type=Path,
required=False,
)

539
trap/cv_renderer.py Normal file
View file

@ -0,0 +1,539 @@
# used for "Forward Referencing of type annotations"
from __future__ import annotations
import time
import ffmpeg
from argparse import Namespace
import datetime
import logging
from multiprocessing import Event
from multiprocessing.synchronize import Event as BaseEvent
import cv2
import numpy as np
import json
import pyglet
import pyglet.event
import zmq
import tempfile
from pathlib import Path
import shutil
import math
from typing import Dict, Iterable, Optional
from pyglet import shapes
from PIL import Image
from trap.frame_emitter import DetectionState, Frame, Track, Camera
from trap.preview_renderer import FrameWriter
from trap.tools import draw_track, draw_track_predictions, draw_track_projected, draw_trackjectron_history, to_point
from trap.utils import convert_world_points_to_img_points, convert_world_space_to_img_space
logger = logging.getLogger("trap.simple_renderer")
class CvRenderer:
def __init__(self, config: Namespace, is_running: BaseEvent):
self.config = config
self.is_running = is_running
context = zmq.Context()
self.prediction_sock = context.socket(zmq.SUB)
self.prediction_sock.setsockopt(zmq.CONFLATE, 1) # only keep latest frame. NB. make sure this comes BEFORE connect, otherwise it's ignored!!
self.prediction_sock.setsockopt(zmq.SUBSCRIBE, b'')
# self.prediction_sock.connect(config.zmq_prediction_addr if not self.config.bypass_prediction else config.zmq_trajectory_addr)
self.prediction_sock.connect(config.zmq_prediction_addr)
self.tracker_sock = context.socket(zmq.SUB)
self.tracker_sock.setsockopt(zmq.CONFLATE, 1) # only keep latest frame. NB. make sure this comes BEFORE connect, otherwise it's ignored!!
self.tracker_sock.setsockopt(zmq.SUBSCRIBE, b'')
self.tracker_sock.connect(config.zmq_trajectory_addr)
self.frame_sock = context.socket(zmq.SUB)
self.frame_sock.setsockopt(zmq.CONFLATE, 1) # only keep latest frame. NB. make sure this comes BEFORE connect, otherwise it's ignored!!
self.frame_sock.setsockopt(zmq.SUBSCRIBE, b'')
self.frame_sock.connect(config.zmq_frame_addr)
self.H = self.config.H
self.inv_H = np.linalg.pinv(self.H)
# TODO: get FPS from frame_emitter
# self.out = cv2.VideoWriter(str(filename), fourcc, 23.97, (1280,720))
self.fps = 60
self.frame_size = (self.config.camera.w,self.config.camera.h)
self.hide_stats = False
self.out_writer = self.start_writer() if self.config.render_file else None
self.streaming_process = self.start_streaming() if self.config.render_url else None
self.first_time: float|None = None
self.frame: Frame|None= None
self.tracker_frame: Frame|None = None
self.prediction_frame: Frame|None = None
self.tracks: Dict[str, Track] = {}
self.predictions: Dict[str, Track] = {}
# self.init_shapes()
# self.init_labels()
def init_shapes(self):
'''
Due to error when running headless, we need to configure options before extending the shapes class
'''
class GradientLine(shapes.Line):
def __init__(self, x, y, x2, y2, width=1, color1=[255,255,255], color2=[255,255,255], batch=None, group=None):
# print('colors!', colors)
# assert len(colors) == 6
r, g, b, *a = color1
self._rgba1 = (r, g, b, a[0] if a else 255)
r, g, b, *a = color2
self._rgba2 = (r, g, b, a[0] if a else 255)
# print('rgba', self._rgba)
super().__init__(x, y, x2, y2, width, color1, batch=None, group=None)
# <pyglet.graphics.vertexdomain.VertexList
# pyglet.graphics.vertexdomain
# print(self._vertex_list)
def _create_vertex_list(self):
'''
copy of super()._create_vertex_list but with additional colors'''
self._vertex_list = self._group.program.vertex_list(
6, self._draw_mode, self._batch, self._group,
position=('f', self._get_vertices()),
colors=('Bn', self._rgba1+ self._rgba2 + self._rgba2 + self._rgba1 + self._rgba2 +self._rgba1 ),
translation=('f', (self._x, self._y) * self._num_verts))
def _update_colors(self):
self._vertex_list.colors[:] = self._rgba1+ self._rgba2 + self._rgba2 + self._rgba1 + self._rgba2 +self._rgba1
def color1(self, color):
r, g, b, *a = color
self._rgba1 = (r, g, b, a[0] if a else 255)
self._update_colors()
def color2(self, color):
r, g, b, *a = color
self._rgba2 = (r, g, b, a[0] if a else 255)
self._update_colors()
self.gradientLine = GradientLine
def init_labels(self):
base_color = (255,)*4
color_predictor = (255,255,0, 255)
color_info = (255,0, 255, 255)
color_tracker = (0,255, 255, 255)
options = []
for option in ['prediction_horizon','num_samples','full_dist','gmm_mode','z_mode', 'model_dir']:
options.append(f"{option}: {self.config.__dict__[option]}")
self.labels = {
'waiting': pyglet.text.Label("Waiting for prediction"),
'frame_idx': pyglet.text.Label("", x=20, y=self.window.height - 17, color=base_color, batch=self.batch_overlay),
'tracker_idx': pyglet.text.Label("", x=90, y=self.window.height - 17, color=color_tracker, batch=self.batch_overlay),
'pred_idx': pyglet.text.Label("", x=110, y=self.window.height - 17, color=color_predictor, batch=self.batch_overlay),
'frame_time': pyglet.text.Label("t", x=140, y=self.window.height - 17, color=base_color, batch=self.batch_overlay),
'frame_latency': pyglet.text.Label("", x=235, y=self.window.height - 17, color=color_info, batch=self.batch_overlay),
'tracker_time': pyglet.text.Label("", x=300, y=self.window.height - 17, color=color_tracker, batch=self.batch_overlay),
'pred_time': pyglet.text.Label("", x=360, y=self.window.height - 17, color=color_predictor, batch=self.batch_overlay),
'track_len': pyglet.text.Label("", x=800, y=self.window.height - 17, color=color_tracker, batch=self.batch_overlay),
'options1': pyglet.text.Label(options.pop(-1), x=20, y=30, color=base_color, batch=self.batch_overlay),
'options2': pyglet.text.Label(" | ".join(options), x=20, y=10, color=base_color, batch=self.batch_overlay),
}
def refresh_labels(self, dt: float):
"""Every frame"""
if self.frame:
self.labels['frame_idx'].text = f"{self.frame.index:06d}"
self.labels['frame_time'].text = f"{self.frame.time - self.first_time: >10.2f}s"
self.labels['frame_latency'].text = f"{self.frame.time - time.time():.2f}s"
if self.tracker_frame:
self.labels['tracker_idx'].text = f"{self.tracker_frame.index - self.frame.index}"
self.labels['tracker_time'].text = f"{self.tracker_frame.time - time.time():.3f}s"
self.labels['track_len'].text = f"{len(self.tracker_frame.tracks)} tracks"
if self.prediction_frame:
self.labels['pred_idx'].text = f"{self.prediction_frame.index - self.frame.index}"
self.labels['pred_time'].text = f"{self.prediction_frame.time - time.time():.3f}s"
# self.labels['track_len'].text = f"{len(self.prediction_frame.tracks)} tracks"
# cv2.putText(img, f"{frame.index:06d}", (20,17), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
# cv2.putText(img, f"{frame.time - first_time:.3f}s", (120,17), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
# if prediction_frame:
# # render Δt and Δ frames
# cv2.putText(img, f"{prediction_frame.index - frame.index}", (90,17), cv2.FONT_HERSHEY_PLAIN, 1, info_color, 1)
# cv2.putText(img, f"{prediction_frame.time - time.time():.2f}s", (200,17), cv2.FONT_HERSHEY_PLAIN, 1, info_color, 1)
# cv2.putText(img, f"{len(prediction_frame.tracks)} tracks", (500,17), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
# cv2.putText(img, f"h: {np.average([len(t.history or []) for t in prediction_frame.tracks.values()]):.2f}", (580,17), cv2.FONT_HERSHEY_PLAIN, 1, info_color, 1)
# cv2.putText(img, f"ph: {np.average([len(t.predictor_history or []) for t in prediction_frame.tracks.values()]):.2f}", (660,17), cv2.FONT_HERSHEY_PLAIN, 1, info_color, 1)
# cv2.putText(img, f"p: {np.average([len(t.predictions or []) for t in prediction_frame.tracks.values()]):.2f}", (740,17), cv2.FONT_HERSHEY_PLAIN, 1, info_color, 1)
# options = []
# for option in ['prediction_horizon','num_samples','full_dist','gmm_mode','z_mode', 'model_dir']:
# options.append(f"{option}: {config.__dict__[option]}")
# cv2.putText(img, options.pop(-1), (20,img.shape[0]-30), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
# cv2.putText(img, " | ".join(options), (20,img.shape[0]-10), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
def check_frames(self, dt):
new_tracks = False
try:
self.frame: Frame = self.frame_sock.recv_pyobj(zmq.NOBLOCK)
if not self.first_time:
self.first_time = self.frame.time
img = cv2.GaussianBlur(self.frame.img, (15, 15), 0)
img = cv2.flip(cv2.cvtColor(img, cv2.COLOR_BGR2RGB), 0)
img = pyglet.image.ImageData(self.frame_size[0], self.frame_size[1], 'RGB', img.tobytes())
# don't draw in batch, so that it is the background
self.video_sprite = pyglet.sprite.Sprite(img=img, batch=self.batch_bg)
self.video_sprite.opacity = 100
except zmq.ZMQError as e:
# idx = frame.index if frame else "NONE"
# logger.debug(f"reuse video frame {idx}")
pass
try:
self.prediction_frame: Frame = self.prediction_sock.recv_pyobj(zmq.NOBLOCK)
new_tracks = True
except zmq.ZMQError as e:
pass
try:
self.tracker_frame: Frame = self.tracker_sock.recv_pyobj(zmq.NOBLOCK)
new_tracks = True
except zmq.ZMQError as e:
pass
def on_key_press(self, symbol, modifiers):
print('A key was pressed, use f to hide')
if symbol == ord('f'):
self.window.set_fullscreen(not self.window.fullscreen)
if symbol == ord('h'):
self.hide_stats = not self.hide_stats
def check_running(self, dt):
if not self.is_running.is_set():
self.window.close()
self.event_loop.exit()
def on_close(self):
self.is_running.clear()
def on_refresh(self, dt: float):
# update shapes
# self.bg =
for track_id, track in self.drawn_tracks.items():
track.update_drawn_positions(dt)
self.refresh_labels(dt)
# self.shape1 = shapes.Circle(700, 150, 100, color=(50, 0, 30), batch=self.batch_anim)
# self.shape3 = shapes.Circle(800, 150, 100, color=(100, 225, 30), batch=self.batch_anim)
pass
def on_draw(self):
self.window.clear()
self.batch_bg.draw()
for track in self.drawn_tracks.values():
for shape in track.shapes:
shape.draw() # for some reason the batches don't work
for track in self.drawn_tracks.values():
for shapes in track.pred_shapes:
for shape in shapes:
shape.draw()
# self.batch_anim.draw()
self.batch_overlay.draw()
# pyglet.graphics.draw(3, pyglet.gl.GL_LINE, ("v2i", (100,200, 600,800)), ('c3B', (255,255,255, 255,255,255)))
if not self.hide_stats:
self.fps_display.draw()
# if streaming, capture buffer and send
try:
if self.streaming_process or self.out_writer:
buf = pyglet.image.get_buffer_manager().get_color_buffer()
img_data = buf.get_image_data()
data = img_data.get_data() # alternative: .get_data("RGBA", image_data.pitch)
img = np.asanyarray(data).reshape((img_data.height, img_data.width, 4))
img = cv2.cvtColor(img, cv2.COLOR_BGRA2RGB)
img = np.flip(img, 0)
# img = cv2.flip(img, cv2.0)
# cv2.imshow('frame', img)
# cv2.waitKey(1)
if self.streaming_process:
self.streaming_process.stdin.write(img.tobytes())
if self.out_writer:
self.out_writer.write(img)
except Exception as e:
logger.exception(e)
def start_writer(self):
if not self.config.output_dir.exists():
raise FileNotFoundError("Path does not exist")
date_str = datetime.datetime.now().isoformat(timespec="minutes")
filename = self.config.output_dir / f"render_predictions-{date_str}-{self.config.detector}.mp4"
logger.info(f"Write to {filename}")
return FrameWriter(str(filename), self.fps, self.frame_size)
fourcc = cv2.VideoWriter_fourcc(*'vp09')
return cv2.VideoWriter(str(filename), fourcc, self.fps, self.frame_size)
def start_streaming(self):
return (
ffmpeg
.input('pipe:', format='rawvideo',codec="rawvideo", pix_fmt='bgr24', s='{}x{}'.format(*self.frame_size))
.output(
self.config.render_url,
#codec = "copy", # use same codecs of the original video
codec='libx264',
listen=1, # enables HTTP server
pix_fmt="yuv420p",
preset="ultrafast",
tune="zerolatency",
# g=f"{self.fps*2}",
g=f"{60*2}",
analyzeduration="2000000",
probesize="1000000",
f='mpegts'
)
.overwrite_output()
.run_async(pipe_stdin=True)
)
# return process
def run(self, timer_counter):
frame = None
prediction_frame = None
tracker_frame = None
i=0
first_time = None
cv2.namedWindow("frame", cv2.WINDOW_NORMAL)
# https://gist.github.com/ronekko/dc3747211543165108b11073f929b85e
cv2.moveWindow("frame", 1920, -1)
cv2.setWindowProperty("frame",cv2.WND_PROP_FULLSCREEN,cv2.WINDOW_FULLSCREEN)
while self.is_running.is_set():
i+=1
with timer_counter.get_lock():
timer_counter.value+=1
# zmq_ev = self.frame_sock.poll(timeout=2000)
# if not zmq_ev:
# # when no data comes in, loop so that is_running is checked
# continue
try:
frame: Frame = self.frame_sock.recv_pyobj(zmq.NOBLOCK)
except zmq.ZMQError as e:
# idx = frame.index if frame else "NONE"
# logger.debug(f"reuse video frame {idx}")
pass
# else:
# logger.debug(f'new video frame {frame.index}')
if frame is None:
# might need to wait a few iterations before first frame comes available
time.sleep(.1)
continue
try:
prediction_frame: Frame = self.prediction_sock.recv_pyobj(zmq.NOBLOCK)
for track_id, track in prediction_frame.tracks.items():
prediction_id = f"{track_id}-{track.history[-1].frame_nr}"
self.predictions[prediction_id] = track
except zmq.ZMQError as e:
logger.debug(f'reuse prediction')
try:
tracker_frame: Frame = self.tracker_sock.recv_pyobj(zmq.NOBLOCK)
for track_id, track in tracker_frame.tracks.items():
self.tracks[track_id] = track
except zmq.ZMQError as e:
logger.debug(f'reuse tracks')
if first_time is None:
first_time = frame.time
img = decorate_frame(frame, tracker_frame, prediction_frame, first_time, self.config, self.tracks, self.predictions)
img_path = (self.config.output_dir / f"{i:05d}.png").resolve()
logger.debug(f"write frame {frame.time - first_time:.3f}s")
if self.out_writer:
self.out_writer.write(img)
if self.streaming_process:
self.streaming_process.stdin.write(img.tobytes())
if self.config.render_window:
cv2.imshow('frame',cv2.resize(img, (1920, 1080)))
cv2.waitKey(1)
# clear out old tracks & predictions:
for track_id, track in list(self.tracks.items()):
if get_animation_position(track, frame) == 1:
self.tracks.pop(track_id)
for prediction_id, track in list(self.predictions.items()):
if get_animation_position(track, frame) == 1:
self.predictions.pop(prediction_id)
logger.info('Stopping')
# if i>2:
if self.streaming_process:
self.streaming_process.stdin.close()
if self.out_writer:
self.out_writer.release()
if self.streaming_process:
# oddly wrapped, because both close and release() take time.
logger.info('wait for closing stream')
self.streaming_process.wait()
logger.info('stopped')
# colorset = itertools.product([0,255], repeat=3) # but remove white
# colorset = [(0, 0, 0),
# (0, 0, 255),
# (0, 255, 0),
# (0, 255, 255),
# (255, 0, 0),
# (255, 0, 255),
# (255, 255, 0)
# ]
colorset = [
(255,255,100),
(255,100,255),
(100,255,255),
]
# colorset = [
# (0,0,0),
# ]
def get_animation_position(track: Track, current_frame: Frame):
fade_duration = current_frame.camera.fps * 3
diff = current_frame.index - track.history[-1].frame_nr
return max(0, min(1, diff / fade_duration))
# track.history[-1].frame_nr < (current_frame.index - current_frame.camera.fps * 3)
# track.history[-1].frame_nr < (current_frame.index - current_frame.camera.fps * 3)
# Deprecated
def decorate_frame(frame: Frame, tracker_frame: Frame, prediction_frame: Frame, first_time: float, config: Namespace, tracks: Dict[str, Track], predictions: Dict[str, Track]) -> np.array:
# TODO: replace opencv with QPainter to support alpha? https://doc.qt.io/qtforpython-5/PySide2/QtGui/QPainter.html#PySide2.QtGui.PySide2.QtGui.QPainter.drawImage
# or https://github.com/pygobject/pycairo?tab=readme-ov-file
# or https://pyglet.readthedocs.io/en/latest/programming_guide/shapes.html
# and use http://code.astraw.com/projects/motmot/pygarrayimage.html or https://gist.github.com/nkymut/1cb40ea6ae4de0cf9ded7332f1ca0d55
# or https://api.arcade.academy/en/stable/index.html (supports gradient color in line -- "Arcade is built on top of Pyglet and OpenGL.")
undistorted_img = cv2.undistort(frame.img, config.camera.mtx, config.camera.dist, None, config.camera.newcameramtx)
dst_img = cv2.warpPerspective(undistorted_img,convert_world_space_to_img_space(config.camera.H),(config.camera.w,config.camera.h))
overlay = np.zeros(dst_img.shape, np.uint8)
# Fill image with red color(set each pixel to red)
overlay[:] = (0, 0, 0)
# img = cv2.addWeighted(dst_img, .2, overlay, .3, 0)
img = dst_img.copy()
# all not working:
# if i == 1:
# # thanks to GpG for fixing scaling issue: https://stackoverflow.com/a/39668864
# scale_factor = 1./20 # from 10m to 1000px
# S = np.array([[scale_factor, 0,0],[0,scale_factor,0 ],[ 0,0,1 ]])
# new_H = S * self.H * np.linalg.inv(S)
# warpedFrame = cv2.warpPerspective(img, new_H, (1000,1000))
# cv2.imwrite(str(self.config.output_dir / "orig.png"), warpedFrame)
cv2.rectangle(img, (0,0), (img.shape[1],25), (0,0,0), -1)
if not tracker_frame:
cv2.putText(img, f"and track", (650,17), cv2.FONT_HERSHEY_PLAIN, 1, (255,255,0), 1)
else:
for track_id, track in tracks.items():
inv_H = np.linalg.pinv(tracker_frame.H)
draw_track_projected(img, track, int(track_id), config.camera, convert_world_points_to_img_points)
if not prediction_frame:
cv2.putText(img, f"Waiting for prediction...", (500,17), cv2.FONT_HERSHEY_PLAIN, 1, (255,255,0), 1)
# continue
else:
for track_id, track in predictions.items():
inv_H = np.linalg.pinv(prediction_frame.H)
# draw_track(img, track, int(track_id))
draw_trackjectron_history(img, track, int(track.track_id), convert_world_points_to_img_points)
anim_position = get_animation_position(track, frame)
draw_track_predictions(img, track, int(track.track_id)+1, config.camera, convert_world_points_to_img_points, anim_position=anim_position)
cv2.putText(img, f"{len(track.predictor_history) if track.predictor_history else 'none'}", to_point(track.history[0].get_foot_coords()), cv2.FONT_HERSHEY_COMPLEX, 1, (255,255,255), 1)
base_color = (255,)*3
info_color = (255,255,0)
predictor_color = (255,0,255)
tracker_color = (0,255,255)
cv2.putText(img, f"{frame.index:06d}", (20,17), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
cv2.putText(img, f"{frame.time - first_time: >10.2f}s", (150,17), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
cv2.putText(img, f"{frame.time - time.time():.2f}s", (250,17), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
if prediction_frame:
# render Δt and Δ frames
cv2.putText(img, f"{tracker_frame.index - frame.index}", (90,17), cv2.FONT_HERSHEY_PLAIN, 1, tracker_color, 1)
cv2.putText(img, f"{prediction_frame.index - frame.index}", (120,17), cv2.FONT_HERSHEY_PLAIN, 1, predictor_color, 1)
cv2.putText(img, f"{tracker_frame.time - time.time():.2f}s", (310,17), cv2.FONT_HERSHEY_PLAIN, 1, tracker_color, 1)
cv2.putText(img, f"{prediction_frame.time - time.time():.2f}s", (380,17), cv2.FONT_HERSHEY_PLAIN, 1, predictor_color, 1)
cv2.putText(img, f"{len(tracker_frame.tracks)} tracks", (620,17), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
cv2.putText(img, f"h: {np.average([len(t.history or []) for t in prediction_frame.tracks.values()]):.2f}", (700,17), cv2.FONT_HERSHEY_PLAIN, 1, tracker_color, 1)
cv2.putText(img, f"ph: {np.average([len(t.predictor_history or []) for t in prediction_frame.tracks.values()]):.2f}", (780,17), cv2.FONT_HERSHEY_PLAIN, 1, predictor_color, 1)
cv2.putText(img, f"p: {np.average([len(t.predictions or []) for t in prediction_frame.tracks.values()]):.2f}", (860,17), cv2.FONT_HERSHEY_PLAIN, 1, predictor_color, 1)
options = []
for option in ['prediction_horizon','num_samples','full_dist','gmm_mode','z_mode', 'model_dir']:
options.append(f"{option}: {config.__dict__[option]}")
cv2.putText(img, options.pop(-1), (20,img.shape[0]-30), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
cv2.putText(img, " | ".join(options), (20,img.shape[0]-10), cv2.FONT_HERSHEY_PLAIN, 1, base_color, 1)
return img
def run_cv_renderer(config: Namespace, is_running: BaseEvent, timer_counter):
renderer = CvRenderer(config, is_running)
renderer.run(timer_counter)

View file

@ -1,26 +1,87 @@
from __future__ import annotations
from argparse import Namespace
from dataclasses import dataclass, field
import dataclasses
from enum import IntFlag
from itertools import cycle
import json
import logging
from multiprocessing import Event
from pathlib import Path
import pickle
import sys
import time
from typing import Iterable, Optional
from typing import Iterable, List, Optional
import numpy as np
import cv2
import pandas as pd
import zmq
import os
from deep_sort_realtime.deep_sort.track import Track as DeepsortTrack
from deep_sort_realtime.deep_sort.track import TrackState as DeepsortTrackState
from bytetracker.byte_tracker import STrack as ByteTrackTrack
from bytetracker.basetrack import TrackState as ByteTrackTrackState
from trajectron.environment import Environment, Node, Scene
from urllib.parse import urlparse
from trap.utils import get_bins
from trap.utils import inv_lerp, lerp
logger = logging.getLogger('trap.frame_emitter')
class DataclassJSONEncoder(json.JSONEncoder):
def default(self, o):
if isinstance(o, np.ndarray):
return o.tolist()
if dataclasses.is_dataclass(o):
if isinstance(o, Frame):
tracks = {}
for track_id, track in o.tracks.items():
track_obj = dataclasses.asdict(track)
track_obj['history'] = track.get_projected_history(None, o.camera)
tracks[track_id] = track_obj
d = {
'index': o.index,
'time': o.time,
'tracks': tracks,
'camera': dataclasses.asdict(o.camera),
}
else:
d = dataclasses.asdict(o)
# if isinstance(o, Frame):
# # Don't send images over JSON
# del d['img']
return d
return super().default(o)
class UrlOrPath():
def __init__(self, string):
self.url = urlparse(str(string))
def __str__(self) -> str:
return self.url.geturl()
def is_url(self) -> bool:
return len(self.url.netloc) > 0
def path(self) -> Path:
if self.is_url():
return Path(self.url.path)
return Path(self.url.geturl()) # can include scheme, such as C:/
class Space(IntFlag):
Image = 1 # As detected in the image
Undistorted = 2 # After applying lense undistortiion
World = 4 # After lens undistort and homography
Render = 8 # View space of renderer
class DetectionState(IntFlag):
Tentative = 1 # state before n_init (see DeepsortTrack)
Confirmed = 2 # after tentative
Lost = 4 # lost when DeepsortTrack.time_since_update > 0 but not Deleted
Interpolated = 8 # A position estimated through interpolation of adjecent detections
@classmethod
def from_deepsort_track(cls, track: DeepsortTrack):
@ -32,6 +93,78 @@ class DetectionState(IntFlag):
return cls.Confirmed
raise RuntimeError("Should not run into Deleted entries here")
@classmethod
def from_bytetrack_track(cls, track: ByteTrackTrack):
if track.state == ByteTrackTrackState.New:
return cls.Tentative
if track.state == ByteTrackTrackState.Lost:
return cls.Lost
# if track.time_since_update > 0:
if track.state == ByteTrackTrackState.Tracked:
return cls.Confirmed
raise RuntimeError("Should not run into Deleted entries here")
def H_from_path(path: Path):
if path.suffix == '.json':
with path.open('r') as fp:
H = np.array(json.load(fp))
else:
H = np.loadtxt(path, delimiter=',')
return H
@dataclass
class Camera:
mtx: cv2.Mat
dist: cv2.Mat
w: float
h: float
H: cv2.Mat # homography
newcameramtx: cv2.Mat = field(init=False)
roi: cv2.typing.Rect = field(init=False)
fps: float
def __post_init__(self):
self.newcameramtx, self.roi = cv2.getOptimalNewCameraMatrix(self.mtx, self.dist, (self.w,self.h), 1, (self.w,self.h))
@classmethod
def from_calibfile(cls, calibration_path, H, fps):
with calibration_path.open('r') as fp:
data = json.load(fp)
# print(data)
# print(data['camera_matrix'])
# camera = {
# 'camera_matrix': np.array(data['camera_matrix']),
# 'dist_coeff': np.array(data['dist_coeff']),
# }
return cls(
np.array(data['camera_matrix']),
np.array(data['dist_coeff']),
data['dim']['width'],
data['dim']['height'],
H, fps)
@classmethod
def from_paths(cls, calibration_path, h_path, fps):
H = H_from_path(h_path)
return cls.from_calibfile(calibration_path, H, fps)
# def __init__(self, mtx, dist, w, h, H):
# self.mtx = mtx
# self.dist = dist
# self.w = w
# self.h = h
# self.newcameramtx, self.roi = cv2.getOptimalNewCameraMatrix(mtx, dist, (w,h), 1, (w,h))
# self.H = H # homography
@dataclass
class Position:
x: float
y: float
conf: float
state: DetectionState
frame_nr: int
det_class: str
@dataclass
class Detection:
@ -43,13 +176,19 @@ class Detection:
conf: float # object detector probablity
state: DetectionState
frame_nr: int
det_class: str
def get_foot_coords(self) -> list[tuple[float, float]]:
def get_foot_coords(self) -> list[float, float]:
return [self.l + 0.5 * self.w, self.t+self.h]
@classmethod
def from_deepsort(cls, dstrack: DeepsortTrack):
return cls(dstrack.track_id, *dstrack.to_ltwh(), dstrack.det_conf, DetectionState.from_deepsort_track(dstrack))
def from_deepsort(cls, dstrack: DeepsortTrack, frame_nr: int):
return cls(dstrack.track_id, *dstrack.to_ltwh(), dstrack.det_conf, DetectionState.from_deepsort_track(dstrack), frame_nr, dstrack.det_class)
@classmethod
def from_bytetrack(cls, bstrack: ByteTrackTrack, frame_nr: int):
return cls(bstrack.track_id, *bstrack.tlwh, bstrack.score, DetectionState.from_bytetrack_track(bstrack), frame_nr, bstrack.cls)
def get_scaled(self, scale: float = 1):
if scale == 1:
@ -62,7 +201,9 @@ class Detection:
self.w*scale,
self.h*scale,
self.conf,
self.state)
self.state,
self.frame_nr,
self.det_class)
def to_ltwh(self):
return (int(self.l), int(self.t), int(self.w), int(self.h))
@ -70,7 +211,17 @@ class Detection:
def to_ltrb(self):
return (int(self.l), int(self.t), int(self.l+self.w), int(self.t+self.h))
@dataclass
class Trajectory:
# TODO)) Replace history and predictions in Track with Trajectory
space: Space
fps: int = 12
points: List[Detection] = field(default_factory=list)
def __iter__(self):
for d in self.points:
yield d
@dataclass
class Track:
@ -79,23 +230,171 @@ class Track:
and acceleration.
"""
track_id: str = None
history: [Detection] = field(default_factory=lambda: [])
history: List[Detection] = field(default_factory=list)
predictor_history: Optional[list] = None # in image space
predictions: Optional[list] = None
fps: int = 12 # TODO)) convert this to camera? That way, incorporates H and dist, alternatively, each track is as a whole attached to a space
source: Optional[int] = None # to keep track of processed tracks
def get_projected_history(self, H) -> np.array:
def get_projected_history(self, H: Optional[cv2.Mat] = None, camera: Optional[Camera]= None) -> np.array:
foot_coordinates = [d.get_foot_coords() for d in self.history]
# TODO)) Undistort points before perspective transform
if len(foot_coordinates):
coords = cv2.perspectiveTransform(np.array([foot_coordinates]),H)
if camera:
coords = cv2.undistortPoints(np.array([foot_coordinates]).astype('float32'), camera.mtx, camera.dist, None, camera.newcameramtx)
coords = cv2.perspectiveTransform(np.array(coords),camera.H)
return coords.reshape((coords.shape[0],2))
else:
coords = cv2.perspectiveTransform(np.array([foot_coordinates]),H)
return coords[0]
return np.array([])
def get_projected_history_as_dict(self, H) -> dict:
coords = self.get_projected_history(H)
def get_projected_history_as_dict(self, H, camera: Optional[Camera]= None) -> dict:
coords = self.get_projected_history(H, camera)
return [{"x":c[0], "y":c[1]} for c in coords]
def get_with_interpolated_history(self) -> Track:
# new_history = [Detection(d.track_id, l, t, w, h, d.conf, d.state, d.frame_nr, d.det_class) for l, t, w, h, d in zip(ls,ts,ws,hs, track.history)]
# new_track = Track(track.track_id, new_history, track.predictor_history, track.predictions)
new_history = []
for j in range(len(self.history)):
a = self.history[j]
new_history.append(Detection(a.track_id, a.l, a.t, a.w, a.h, a.conf, a.state, a.frame_nr, a.det_class))
if j+1 >= len(self.history):
break
b = self.history[j+1]
gap = b.frame_nr - a.frame_nr
if gap < 1:
logger.error(f"WARNING, gap between frames {a.frame_nr} -> {b.frame_nr} is negative?")
if gap > 1:
for g in range(1, gap):
l = lerp(a.l, b.l, g/gap)
t = lerp(a.t, b.t, g/gap)
w = lerp(a.w, b.w, g/gap)
h = lerp(a.h, b.h, g/gap)
conf = 0
state = DetectionState.Lost
frame_nr = a.frame_nr + g
new_history.append(Detection(a.track_id, l, t, w, h, conf, state, frame_nr, a.det_class))
return Track(
self.track_id,
new_history,
self.predictor_history,
self.predictions,
self.fps)
def is_complete(self):
diffs = [(b.frame_nr - a.frame_nr) for a,b in zip(self.history[:-1], self.history[1:])]
return any([d != 1 for d in diffs])
def get_sampled(self, step_size = 1, offset=0):
if not self.is_complete():
t = self.get_with_interpolated_history()
else:
t = self
return Track(
t.track_id,
t.history[offset::step_size],
t.predictor_history,
t.predictions,
t.fps/step_size)
def get_binned(self, bin_size, camera: Camera, bin_start=True):
"""
For an experiment: what if we predict using only concrete positions, by mapping
dx,dy to a grid. Thus prediction can be for 8 moves, or rather headings
see ~/notes/attachments example svg
"""
history = self.get_projected_history_as_dict(H=None, camera=camera)
def round_to_grid_precision(x):
factor = 1/bin_size
return round(x * factor) / factor
new_history: List[dict] = []
for i, (det0, det1) in enumerate(zip(history[:-1], history[1:])):
if i == 0:
new_history.append({
'x': round_to_grid_precision(det0['x']),
'y': round_to_grid_precision(det0['y'])
} if bin_start else det0)
continue
if abs(det1['x'] - new_history[-1]['x']) < bin_size and abs(det1['y'] - new_history[-1]['y']) < bin_size:
continue
# det1 falls outside of the box [-bin_size:+bin_size] around last detection
# 1. Interpolate exact point between det0 and det1 that this happens
if abs(det1['x'] - new_history[-1]['x']) >= bin_size:
if det1['x'] - new_history[-1]['x'] >= bin_size:
# det1 left of last
x = new_history[-1]['x'] + bin_size
f = inv_lerp(det0['x'], det1['x'], x)
elif new_history[-1]['x'] - det1['x'] >= bin_size:
# det1 left of last
x = new_history[-1]['x'] - bin_size
f = inv_lerp(det0['x'], det1['x'], x)
y = lerp(det0['y'], det1['y'], f)
if abs(det1['y'] - new_history[-1]['y']) >= bin_size:
if det1['y'] - new_history[-1]['y'] >= bin_size:
# det1 left of last
y = new_history[-1]['y'] + bin_size
f = inv_lerp(det0['y'], det1['y'], y)
elif new_history[-1]['y'] - det1['y'] >= bin_size:
# det1 left of last
y = new_history[-1]['y'] - bin_size
f = inv_lerp(det0['y'], det1['y'], y)
x = lerp(det0['x'], det1['x'], f)
# 2. Find closest point on rectangle (rectangle's four corners, or 4 midpoints)
points = get_bins(bin_size)
points = [[new_history[-1]['x']+p[0], new_history[-1]['y'] + p[1]] for p in points]
distances = [np.linalg.norm([p[0] - x, p[1]-y]) for p in points]
closest = np.argmin(distances)
point = points[closest]
new_history.append({'x': point[0], 'y':point[1]})
# todo Offsets to points:[ history for in points]
return new_history
def to_trajectron_node(self, camera: Camera, env: Environment) -> Node:
positions = self.get_projected_history(None, camera)
velocity = np.gradient(positions, 1/self.fps, axis=0)
acceleration = np.gradient(velocity, 1/self.fps, axis=0)
new_first_idx = self.history[0].frame_nr
data_columns = pd.MultiIndex.from_product([['position', 'velocity', 'acceleration'], ['x', 'y']])
# vx = derivative_of(x, scene.dt)
# vy = derivative_of(y, scene.dt)
# ax = derivative_of(vx, scene.dt)
# ay = derivative_of(vy, scene.dt)
data_dict = {
('position', 'x'): positions[:,0],
('position', 'y'): positions[:,1],
('velocity', 'x'): velocity[:,0],
('velocity', 'y'): velocity[:,1],
('acceleration', 'x'): acceleration[:,0],
('acceleration', 'y'): acceleration[:,1]
}
node_data = pd.DataFrame(data_dict, columns=data_columns)
return Node(node_type=env.NodeType.PEDESTRIAN, node_id=self.track_id, data=node_data, first_timestep=new_first_idx)
@ -106,6 +405,7 @@ class Frame:
time: float= field(default_factory=lambda: time.time())
tracks: Optional[dict[str, Track]] = None
H: Optional[np.array] = None
camera: Optional[Camera] = None
def aslist(self) -> [dict]:
return { t.track_id:
@ -120,6 +420,16 @@ class Frame:
} for t in self.tracks.values()
}
def without_img(self):
return Frame(self.index, None, self.time, self.tracks, self.H, self.camera)
def video_src_from_config(config) -> UrlOrPath:
if config.video_loop:
video_srcs: Iterable[UrlOrPath] = cycle(config.video_src)
else:
video_srcs: Iterable[UrlOrPath] = config.video_src
return video_srcs
class FrameEmitter:
'''
Emit frame in a separate threat so they can be throttled,
@ -134,30 +444,41 @@ class FrameEmitter:
self.frame_sock = context.socket(zmq.PUB)
self.frame_sock.setsockopt(zmq.CONFLATE, 1) # only keep latest frame. make sure to set BEFORE connect/bind
self.frame_sock.bind(config.zmq_frame_addr)
self.frame_noimg_sock = context.socket(zmq.PUB)
self.frame_noimg_sock.setsockopt(zmq.CONFLATE, 1) # only keep latest frame. make sure to set BEFORE connect/bind
self.frame_noimg_sock.bind(config.zmq_frame_noimg_addr)
logger.info(f"Connection socket {config.zmq_frame_addr}")
logger.info(f"Connection socket {config.zmq_frame_noimg_addr}")
if self.config.video_loop:
self.video_srcs: Iterable[Path] = cycle(self.config.video_src)
else:
self.video_srcs: [Path] = self.config.video_src
self.video_srcs = video_src_from_config(self.config)
def emit_video(self):
def emit_video(self, timer_counter):
i = 0
delay_generation = False
for video_path in self.video_srcs:
logger.info(f"Play from '{str(video_path)}'")
if str(video_path).isdigit():
# numeric input is a CV camera
video = cv2.VideoCapture(int(str(video_path)))
# TODO: make config variables
video.set(cv2.CAP_PROP_FRAME_WIDTH, int(1280))
video.set(cv2.CAP_PROP_FRAME_HEIGHT, int(720))
video.set(cv2.CAP_PROP_FRAME_WIDTH, int(self.config.camera.w))
video.set(cv2.CAP_PROP_FRAME_HEIGHT, int(self.config.camera.h))
print("exposure!", video.get(cv2.CAP_PROP_AUTO_EXPOSURE))
video.set(cv2.CAP_PROP_FPS, 5)
fps=5
elif video_path.url.scheme == 'rtsp':
gst = f"rtspsrc location={video_path} latency=0 buffer-mode=auto ! decodebin ! videoconvert ! appsink max-buffers=0 drop=true"
logger.info(f"Capture gstreamer (gst-launch-1.0): {gst}")
video = cv2.VideoCapture(gst, cv2.CAP_GSTREAMER)
fps=12
else:
# os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = "fflags;nobuffer|flags;low_delay|avioflags;direct|rtsp_transport;udp"
video = cv2.VideoCapture(str(video_path))
fps = video.get(cv2.CAP_PROP_FPS)
delay_generation = True
fps = video.get(cv2.CAP_PROP_FPS)
target_frame_duration = 1./fps
logger.info(f"Emit frames at {fps} fps")
@ -167,22 +488,27 @@ class FrameEmitter:
i = self.config.video_offset
if '-' in video_path.stem:
path_stem = video_path.stem[:video_path.stem.rfind('-')]
else:
path_stem = video_path.stem
path_stem += "-homography"
homography_path = video_path.with_stem(path_stem).with_suffix('.txt')
logger.info(f'check homography file {homography_path}')
if homography_path.exists():
logger.info(f'Found custom homography file! Using {homography_path}')
video_H = np.loadtxt(homography_path, delimiter=',')
else:
video_H = None
# if '-' in video_path.path().stem:
# path_stem = video_path.stem[:video_path.stem.rfind('-')]
# else:
# path_stem = video_path.stem
# path_stem += "-homography"
# homography_path = video_path.with_stem(path_stem).with_suffix('.txt')
# logger.info(f'check homography file {homography_path}')
# if homography_path.exists():
# logger.info(f'Found custom homography file! Using {homography_path}')
# video_H = np.loadtxt(homography_path, delimiter=',')
# else:
# video_H = None
video_H = self.config.camera.H
prev_time = time.time()
while self.is_running.is_set():
with timer_counter.get_lock():
timer_counter.value += 1
ret, img = video.read()
# seek to 0 if video has finished. Infinite loop
@ -198,19 +524,23 @@ class FrameEmitter:
# hack to mask out area
cv2.rectangle(img, (0,0), (800,200), (0,0,0), -1)
frame = Frame(index=i, img=img, H=video_H)
frame = Frame(index=i, img=img, H=self.config.H, camera=self.config.camera)
# TODO: this is very dirty, need to find another way.
# perhaps multiprocessing Array?
self.frame_noimg_sock.send(pickle.dumps(frame.without_img()))
self.frame_sock.send(pickle.dumps(frame))
# defer next loop
now = time.time()
time_diff = (now - prev_time)
if time_diff < target_frame_duration:
time.sleep(target_frame_duration - time_diff)
now += target_frame_duration - time_diff
prev_time = now
# only delay consuming the next frame when using a file.
# Otherwise, go ASAP
if delay_generation:
# defer next loop
now = time.time()
time_diff = (now - prev_time)
if time_diff < target_frame_duration:
time.sleep(target_frame_duration - time_diff)
now += target_frame_duration - time_diff
prev_time = now
i += 1
@ -223,7 +553,7 @@ class FrameEmitter:
def run_frame_emitter(config: Namespace, is_running: Event):
def run_frame_emitter(config: Namespace, is_running: Event, timer_counter: int):
router = FrameEmitter(config, is_running)
router.emit_video()
router.emit_video(timer_counter)
is_running.clear()

View file

@ -1,22 +1,28 @@
import atexit
import logging
from logging.handlers import SocketHandler
from logging.handlers import SocketHandler, QueueHandler, QueueListener
from multiprocessing import Event, Process, Queue
import multiprocessing
import signal
import sys
import time
from trap.config import parser
from trap.cv_renderer import run_cv_renderer
from trap.frame_emitter import run_frame_emitter
from trap.prediction_server import run_prediction_server
from trap.renderer import run_renderer
from trap.preview_renderer import run_preview_renderer
from trap.animation_renderer import run_animation_renderer
from trap.socket_forwarder import run_ws_forwarder
from trap.timer import TimerCollection
from trap.tracker import run_tracker
from setproctitle import setproctitle, setthreadtitle
logger = logging.getLogger("trap.plumbing")
class ExceptionHandlingProcess(Process):
def run(self):
@ -31,10 +37,12 @@ class ExceptionHandlingProcess(Process):
atexit.register(exit_handler)
signal.signal(signal.SIGTERM, exit_handler)
signal.signal(signal.SIGINT, exit_handler)
setproctitle(f"trap-{self.name}")
try:
super(Process, self).run()
except Exception as e:
print("finished ", self.name)
except BaseException as e:
logger.critical(f"Exception in {self.name}")
logger.exception(e)
self._kwargs['is_running'].clear()
@ -44,55 +52,112 @@ def start():
loglevel = logging.NOTSET if args.verbose > 1 else logging.DEBUG if args.verbose > 0 else logging.INFO
# print(args)
# exit()
logging.basicConfig(
level=loglevel,
)
# set per handler, so we can set it lower for the root logger if remote logging is enabled
root_logger = logging.getLogger()
[h.setLevel(loglevel) for h in root_logger.handlers]
isRunning = Event()
isRunning.set()
q = multiprocessing.Queue(-1)
queue_handler = QueueHandler(q)
stream_handler = logging.StreamHandler()
log_handlers = [stream_handler]
if args.remote_log_addr:
logging.captureWarnings(True)
root_logger.setLevel(logging.NOTSET) # to send all records to cutelog
# root_logger.setLevel(logging.NOTSET) # to send all records to cutelog
socket_handler = SocketHandler(args.remote_log_addr, args.remote_log_port)
root_logger.addHandler(socket_handler)
socket_handler.setLevel(logging.NOTSET)
log_handlers.append(socket_handler)
queue_listener = QueueListener(q, *log_handlers, respect_handler_level=True)
# root = logging.getLogger()
logging.basicConfig(
level=loglevel,
handlers=[queue_handler]
)
# root_logger = logging.getLogger()
# # set per handler, so we can set it lower for the root logger if remote logging is enabled
# [h.setLevel(loglevel) for h in root_logger.handlers]
# queue_listener.handlers.append(socket_handler)
timers = TimerCollection()
timer_fe = timers.new('frame_emitter')
timer_tracker = timers.new('tracker')
# instantiating process with arguments
procs = [
ExceptionHandlingProcess(target=run_ws_forwarder, kwargs={'config': args, 'is_running': isRunning}, name='forwarder'),
ExceptionHandlingProcess(target=run_frame_emitter, kwargs={'config': args, 'is_running': isRunning}, name='frame_emitter'),
ExceptionHandlingProcess(target=run_tracker, kwargs={'config': args, 'is_running': isRunning}, name='tracker'),
# ExceptionHandlingProcess(target=run_ws_forwarder, kwargs={'config': args, 'is_running': isRunning}, name='forwarder'),
ExceptionHandlingProcess(target=run_frame_emitter, kwargs={'config': args, 'is_running': isRunning, 'timer_counter': timer_fe.iterations}, name='frame_emitter'),
ExceptionHandlingProcess(target=run_tracker, kwargs={'config': args, 'is_running': isRunning, 'timer_counter': timer_tracker.iterations}, name='tracker'),
]
if args.render_file or args.render_url or args.render_window:
# if args.render_file or args.render_url or args.render_window:
if args.render_window or args.render_file or args.render_url:
timer_preview = timers.new('preview')
procs.append(
ExceptionHandlingProcess(target=run_renderer, kwargs={'config': args, 'is_running': isRunning}, name='renderer')
# ExceptionHandlingProcess(target=run_cv_renderer, kwargs={'config': args, 'is_running': isRunning}, name='preview')
ExceptionHandlingProcess(target=run_cv_renderer, kwargs={'config': args, 'is_running': isRunning, 'timer_counter': timer_preview.iterations}, name='preview')
)
if args.render_animation:
procs.append(
ExceptionHandlingProcess(target=run_animation_renderer, kwargs={'config': args, 'is_running': isRunning}, name='renderer')
)
if not args.bypass_prediction:
timer_predict = timers.new('predict')
procs.append(
ExceptionHandlingProcess(target=run_prediction_server, kwargs={'config': args, 'is_running':isRunning}, name='inference'),
ExceptionHandlingProcess(target=run_prediction_server, kwargs={'config': args, 'is_running':isRunning, 'timer_counter': timer_predict.iterations}, name='inference'),
)
def timer_process(timers: TimerCollection, is_running: Event):
while is_running.is_set():
time.sleep(1)
timers.snapshot()
print(timers.to_string())
logger.info("start")
for proc in procs:
proc.start()
# wait for processes to clean up
for proc in procs:
proc.join()
procs.append(
ExceptionHandlingProcess(target=timer_process, kwargs={'is_running':isRunning, 'timers': timers}, name='timer'),
)
logger.info('Stop')
try:
logger.info("start")
for proc in procs:
proc.start()
# if start the listener before the subprocesses, it becomes a mess, because the
# running threat is forked too, but cannot easily be stopped in the forks.
# Thus, only start the queue-listener threat _after_ starting processes
queue_listener.start()
# wait for processes to clean up
for proc in procs:
proc.join()
isRunning.clear()
logger.info('Stop')
except BaseException as e:
# mainly for KeyboardInterrupt
# but in any case, on error all processed need to be signalled to shut down
logger.critical(f"Exception in plumber")
logger.exception(e)
isRunning.clear()
# while True:
# time.sleep(2)
# any_alive = False
# alive = [proc for proc in procs if proc.is_alive()]
# print("alive: ", [p.name for p in alive])
# if len(alive) < 1:
# break
print('stop listener')
queue_listener.stop()
print('stopped listener')
print("finished plumber")
if __name__ == "__main__":
start()

View file

@ -26,7 +26,7 @@ import matplotlib.pyplot as plt
import zmq
from trap.frame_emitter import Frame
from trap.frame_emitter import DataclassJSONEncoder, Frame
from trap.tracker import Track, Smoother
logger = logging.getLogger("trap.prediction")
@ -113,6 +113,34 @@ def get_maps_for_input(input_dict, scene, hyperparams):
return maps_dict
# If homography is in cm, predictions can be terrible. Correct that here
# TODO)) This should actually not be here, but we should use alternative homography
# and then scale up in rendering
def history_cm_to_m(history):
return [(h[0]/100, h[1]/100) for h in history]
# TODO)) variable. Now placeholders for hof2 dataset
cx = 11.874955125
cy = 7.186118765
def prediction_m_to_cm(source):
# histories_dict[t][node]
for t in source:
for node in source[t]:
# source[t][node][:,0] += cx
# source[t][node][:,1] += cy
source[t][node] *= 100
# print(t,node, source[t][node])
return source
def offset_trajectron_dict(source, x, y):
# histories_dict[t][node]
for t in source:
for node in source[t]:
source[t][node][:,0] += x
source[t][node][:,1] += y
return source
class PredictionServer:
def __init__(self, config: Namespace, is_running: Event):
self.config = config
@ -122,7 +150,7 @@ class PredictionServer:
logger.warning("Running on CPU. Specifying --eval_device cuda:0 should dramatically speed up prediction")
if self.config.smooth_predictions:
self.smoother = Smoother(window_len=4)
self.smoother = Smoother(window_len=12, convolution=True) # convolution seems fine for predictions
context = zmq.Context()
self.trajectory_socket: zmq.Socket = context.socket(zmq.SUB)
@ -132,10 +160,18 @@ class PredictionServer:
self.prediction_socket: zmq.Socket = context.socket(zmq.PUB)
self.prediction_socket.bind(config.zmq_prediction_addr)
self.external_predictions = not self.config.zmq_prediction_addr.startswith("ipc://")
# print(self.prediction_socket)
def run(self):
def send_frame(self, frame: Frame):
if self.external_predictions:
# data = json.dumps(frame, cls=DataclassJSONEncoder)
self.prediction_socket.send_json(obj=frame, cls=DataclassJSONEncoder)
else:
self.prediction_socket.send_pyobj(frame)
def run(self, timer_counter):
print(self.config)
if self.config.seed is not None:
random.seed(self.config.seed)
np.random.seed(self.config.seed)
@ -153,6 +189,7 @@ class PredictionServer:
if not os.path.exists(config_file):
raise ValueError('Config json not found!')
with open(config_file, 'r') as conf_json:
logger.info(f"Load config from {config_file}")
hyperparams = json.load(conf_json)
# Add hyperparams from arguments
@ -171,18 +208,9 @@ class PredictionServer:
logger.info(f"Use hyperparams: {hyperparams=}")
output_save_dir = os.path.join(self.config.output_dir, 'pred_figs')
pathlib.Path(output_save_dir).mkdir(parents=True, exist_ok=True)
with open(self.config.eval_data_dict, 'rb') as f:
eval_env = dill.load(f, encoding='latin1')
if eval_env.robot_type is None and hyperparams['incl_robot_node']:
eval_env.robot_type = eval_env.NodeType[0] # TODO: Make more general, allow the user to specify?
for scene in eval_env.scenes:
scene.add_robot_from_nodes(eval_env.robot_type)
logger.info('Loaded data from %s' % (self.config.eval_data_dict,))
# Creating a dummy environment with a single scene that contains information about the world.
@ -200,6 +228,7 @@ class PredictionServer:
model_registrar = ModelRegistrar(self.config.model_dir, self.config.eval_device)
model_iterations = pathlib.Path(self.config.model_dir).glob('model_registrar-*.pt')
highest_iter = max([int(p.stem.split('-')[-1]) for p in model_iterations])
logger.info(f"Loading model {highest_iter}")
model_registrar.load_models(iter_num=highest_iter)
@ -218,6 +247,8 @@ class PredictionServer:
prev_run_time = 0
while self.is_running.is_set():
timestep += 1
with timer_counter.get_lock():
timer_counter.value+=1
# this_run_time = time.time()
# logger.debug(f'test {prev_run_time - this_run_time}')
@ -246,12 +277,16 @@ class PredictionServer:
if self.config.predict_training_data:
input_dict = eval_scene.get_clipped_input_dict(timestep, hyperparams['state'])
else:
# print('await', self.config.zmq_trajectory_addr)
zmq_ev = self.trajectory_socket.poll(timeout=2000)
if not zmq_ev:
# on no data loop so that is_running is checked
continue
t_init = time.time()
data = self.trajectory_socket.recv()
# print('recv tracker frame')
frame: Frame = pickle.loads(data)
# trajectory_data = {t.track_id: t.get_projected_history_as_dict(frame.H) for t in frame.tracks.values()}
# trajectory_data = json.loads(data)
@ -269,33 +304,55 @@ class PredictionServer:
# TODO: modify this into a mapping function between JS data an the expected Node format
# node = FakeNode(online_env.NodeType.PEDESTRIAN)
history = [[h['x'], h['y']] for h in track.get_projected_history_as_dict(frame.H)]
history = np.array(history)
x = history[:, 0]
y = history[:, 1]
# TODO: calculate dt based on input
vx = derivative_of(x, 0.1) #eval_scene.dt
vy = derivative_of(y, 0.1)
ax = derivative_of(vx, 0.1)
ay = derivative_of(vy, 0.1)
data_dict = {('position', 'x'): x[:], # [-10:-1]
('position', 'y'): y[:], # [-10:-1]
('velocity', 'x'): vx[:], # [-10:-1]
('velocity', 'y'): vy[:], # [-10:-1]
('acceleration', 'x'): ax[:], # [-10:-1]
('acceleration', 'y'): ay[:]} # [-10:-1]
data_columns = pd.MultiIndex.from_product([['position', 'velocity', 'acceleration'], ['x', 'y']])
if self.config.step_size > 1:
if (len(track.history) % self.config.step_size) != 0:
# only add when having a new step
continue
track = track.get_sampled(self.config.step_size)
if len(track.history) < 2:
continue
node = track.to_trajectron_node(self.config.camera, online_env)
# print(node.data.data[-1])
input_dict[node] = np.array(object=node.data.data[-1])
# print("history", node.data.data[-10:])
# print("get", node.get(np.array([frame.index-10,frame.index]), {'position': ['x', 'y']}))
# history = [[h['x'], h['y']] for h in track.get_projected_history_as_dict(frame.H, self.config.camera)]
# if self.config.cm_to_m:
# history = history_cm_to_m(history)
# history = np.array(history)
# x = history[:, 0] #- cx # we can create bigger steps by doing history[::5,0]
# y = history[:, 1] #- cy # history[::5,1]
# if self.config.center_data:
# x -= cx
# y -= cy
# # TODO: calculate dt based on input
# vx = derivative_of(x, .1) #eval_scene.dt
# vy = derivative_of(y, .1)
# ax = derivative_of(vx, .1)
# ay = derivative_of(vy, .1)
node_data = pd.DataFrame(data_dict, columns=data_columns)
node = Node(
node_type=online_env.NodeType.PEDESTRIAN,
node_id=identifier,
data=node_data,
first_timestep=timestep
)
# data_dict = {('position', 'x'): x[:], # [-10:-1]
# ('position', 'y'): y[:], # [-10:-1]
# ('velocity', 'x'): vx[:], # [-10:-1]
# ('velocity', 'y'): vy[:], # [-10:-1]
# ('acceleration', 'x'): ax[:], # [-10:-1]
# ('acceleration', 'y'): ay[:]} # [-10:-1]
# data_columns = pd.MultiIndex.from_product([['position', 'velocity', 'acceleration'], ['x', 'y']])
input_dict[node] = np.array([x[-1],y[-1],vx[-1],vy[-1],ax[-1],ay[-1]])
# node_data = pd.DataFrame(data_dict, columns=data_columns)
# node = Node(
# node_type=online_env.NodeType.PEDESTRIAN,
# node_id=identifier,
# data=node_data,
# first_timestep=timestep
# )
# input_dict[node] = np.array(object=[x[-1],y[-1],vx[-1],vy[-1],ax[-1],ay[-1]])
# print(input_dict)
@ -305,7 +362,9 @@ class PredictionServer:
# And want to update the network
# data = json.dumps({})
self.prediction_socket.send_pyobj(frame)
# TODO)) signal doing nothing
# TODO)) And update the network
self.send_frame(frame)
continue
@ -330,19 +389,16 @@ class PredictionServer:
# in the OnlineMultimodalGenerativeCVAE (see trajectron.model.online_mgcvae.py) each node's distribution
# is put stored in self.latent.p_dist by OnlineMultimodalGenerativeCVAE.p_z_x(). Type: torch.distributions.OneHotCategorical
# Later sampling in discrete_latent.py: DiscreteLatent.sample_p()
# print(input_dict)
dists, preds = trajectron.incremental_forward(input_dict,
maps,
prediction_horizon=self.config.prediction_horizon, # TODO: make variable
num_samples=self.config.num_samples, # TODO: make variable
full_dist=self.config.full_dist, # "The models full sampled output, where z and y are sampled sequentially"
full_dist=self.config.full_dist, # "The moldes full sampled output, where z and y are sampled sequentially"
gmm_mode=self.config.gmm_mode, # "If True: The mode of the Gaussian Mixture Model (GMM) is sampled (see trajectron.model.mgcvae.py)"
z_mode=self.config.z_mode # "Predictions from the models most-likely high-level latent behavior mode" (see trajecton.models.components.discrete_latent:sample_p(most_likely_z=z_mode))
)
end = time.time()
logger.debug("took %.2f s (= %.2f Hz) w/ %d nodes and %d edges" % (end - start,
1. / (end - start), len(trajectron.nodes),
trajectron.scene_graph.get_num_edges()))
# unsure what this bit from online_prediction.py does:
# detailed_preds_dict = dict()
# for node in eval_scene.nodes:
@ -354,12 +410,27 @@ class PredictionServer:
# histories_dict provides the trajectory used for prediction
# futures_dict is the Ground Truth, which is unvailable in an online setting
prediction_dict, histories_dict, futures_dict = prediction_output_to_trajectories({timestep: preds},
prediction_dict, histories_dict, futures_dict = prediction_output_to_trajectories({frame.index: preds},
eval_scene.dt,
hyperparams['maximum_history_length'],
hyperparams['prediction_horizon']
)
end = time.time()
logger.debug("took %.2f s (= %.2f Hz) w/ %d nodes and %d edges -- init: %.2f s" % (end - start,
1. / (end - start), len(trajectron.nodes),
trajectron.scene_graph.get_num_edges(), start-t_init))
# if self.config.center_data:
# prediction_dict, histories_dict, futures_dict = offset_trajectron_dict(prediction_dict, cx, cy), offset_trajectron_dict(histories_dict, cx, cy), offset_trajectron_dict(futures_dict, cx, cy)
# print('pred timesteps', list(prediction_dict.keys()))
# print('histories', [n.data.data.shape[0] for n in prediction_dict[frame.index].keys()])
if self.config.cm_to_m:
# convert back to fit homography
prediction_dict, histories_dict, futures_dict = prediction_m_to_cm(prediction_dict), prediction_m_to_cm(histories_dict), prediction_m_to_cm(futures_dict)
assert(len(prediction_dict.keys()) <= 1)
if len(prediction_dict.keys()) == 0:
return
@ -371,13 +442,14 @@ class PredictionServer:
response = {}
logger.debug(f"{histories_dict=}")
for node in histories_dict:
history = histories_dict[node]
# future = futures_dict[node] # ground truth dict
predictions = prediction_dict[node]
# print('preds', len(predictions[0][0]))
if not len(history) or np.isnan(history[-1]).any():
logger.warning('skip for no history')
continue
# response[node.id] = {
@ -388,24 +460,26 @@ class PredictionServer:
# 'predictions': predictions[0].tolist() # use batch 0
# }
frame.tracks[node.id].predictor_history = history.tolist()
frame.tracks[node.id].predictor_history = history.tolist() #node.data[:,{'position': ['x', 'y']}].tolist()
frame.tracks[node.id].predictions = predictions[0].tolist() # use batch 0
# data = json.dumps(response)
if self.config.predict_training_data:
logger.info(f"Frame prediction: {len(trajectron.nodes)} nodes & {trajectron.scene_graph.get_num_edges()} edges. Trajectron: {end - start}s")
else:
logger.info(f"Total frame delay = {time.time()-frame.time}s ({len(trajectron.nodes)} nodes & {trajectron.scene_graph.get_num_edges()} edges. Trajectron: {end - start}s)")
logger.debug(f"Total frame delay = {time.time()-frame.time}s ({len(trajectron.nodes)} nodes & {trajectron.scene_graph.get_num_edges()} edges. Trajectron: {end - start}s)")
if self.config.smooth_predictions:
frame = self.smoother.smooth_frame_predictions(frame)
self.prediction_socket.send_pyobj(frame)
self.send_frame(frame)
logger.info('Stopping')
def run_prediction_server(config: Namespace, is_running: Event):
def run_prediction_server(config: Namespace, is_running: Event, timer_counter):
# attempt to trace the warnings coming from pytorch
# def warn_with_traceback(message, category, filename, lineno, file=None, line=None):
@ -416,4 +490,4 @@ def run_prediction_server(config: Namespace, is_running: Event):
# warnings.showwarning = warn_with_traceback
s = PredictionServer(config, is_running)
s.run()
s.run(timer_counter)

View file

@ -10,7 +10,7 @@ from multiprocessing import Event
from multiprocessing.synchronize import Event as BaseEvent
import cv2
import numpy as np
import json
import pyglet
import pyglet.event
import zmq
@ -18,15 +18,18 @@ import tempfile
from pathlib import Path
import shutil
import math
from typing import List, Optional
from pyglet import shapes
from PIL import Image
from trap.frame_emitter import DetectionState, Frame, Track
from trap.utils import convert_world_points_to_img_points
from trap.frame_emitter import DetectionState, Frame, Track, Camera
logger = logging.getLogger("trap.renderer")
logger = logging.getLogger("trap.preview")
class FrameAnimation:
def __init__(self, frame: Frame):
@ -55,63 +58,107 @@ def relativePointToPolar(origin, point) -> tuple[float, float]:
def relativePolarToPoint(origin, r, angle) -> tuple[float, float]:
return r * np.cos(angle) + origin[0], r * np.sin(angle) + origin[1]
PROJECTION_IMG = 0
PROJECTION_UNDISTORT = 1
PROJECTION_MAP = 2
PROJECTION_PROJECTOR = 4
class DrawnTrack:
def __init__(self, track_id, track: Track, renderer: Renderer, H):
def __init__(self, track_id, track: Track, renderer: PreviewRenderer, H, draw_projection = PROJECTION_IMG, camera: Optional[Camera] = None):
# self.created_at = time.time()
self.draw_projection = draw_projection
self.update_at = self.created_at = time.time()
self.track_id = track_id
self.renderer = renderer
self.set_track(track, H)
self.camera = camera
self.H = H # TODO)) Move H to Camera object
self.drawn_positions = []
self.drawn_predictions = []
self.drawn_pred_history = []
self.shapes: list[pyglet.shapes.Line] = []
self.pred_shapes: list[list[pyglet.shapes.Line]] = []
self.pred_history_shapes: list[pyglet.shapes.Line] = []
def set_track(self, track: Track, H):
self.set_track(track, H)
self.set_predictions(track, H)
def set_track(self, track: Track, H = None):
self.update_at = time.time()
self.track = track
self.H = H
self.coords = [d.get_foot_coords() for d in track.history]
# self.H = H
self.coords = [d.get_foot_coords() for d in track.history] if self.draw_projection == PROJECTION_IMG else track.get_projected_history(None, self.camera)
# perhaps only do in constructor:
self.inv_H = np.linalg.pinv(self.H)
def set_predictions(self, track: Track, H = None):
pred_coords = []
for pred_i, pred in enumerate(track.predictions):
pred_coords.append(cv2.perspectiveTransform(np.array([pred]), self.inv_H)[0].tolist())
pred_history_coords = []
if track.predictions:
if self.draw_projection == PROJECTION_IMG:
for pred_i, pred in enumerate(track.predictions):
pred_coords.append(cv2.perspectiveTransform(np.array([pred]), self.inv_H)[0].tolist())
pred_history_coords = cv2.perspectiveTransform(np.array([track.predictor_history]), self.inv_H)[0].tolist()
elif self.draw_projection == PROJECTION_MAP:
pred_coords = [pred for pred in track.predictions]
pred_history_coords = track.predictor_history
self.pred_track = track
self.pred_coords = pred_coords
self.pred_history_coords = pred_history_coords
# color = (128,0,128) if pred_i else (128,
def update_drawn_positions(self, dt) -> []:
def update_drawn_positions(self, dt) -> List:
'''
use dt to lerp the drawn positions in the direction of current prediction
'''
# TODO: make lerp, currently quick way to get results
def int_or_not(v):
"""quick wrapper to toggle int'ing"""
return v
# return int(v)
# 1. track history
for i, pos in enumerate(self.drawn_positions):
self.drawn_positions[i][0] = int(exponentialDecay(self.drawn_positions[i][0], self.coords[i][0], 16, dt))
self.drawn_positions[i][1] = int(exponentialDecay(self.drawn_positions[i][1], self.coords[i][1], 16, dt))
self.drawn_positions[i][0] = self.coords[i][0]
self.drawn_positions[i][1] = self.coords[i][1]
# self.drawn_positions[i][0] = int_or_not(exponentialDecay(self.drawn_positions[i][0], self.coords[i][0], 16, dt))
# self.drawn_positions[i][1] = int_or_not(exponentialDecay(self.drawn_positions[i][1], self.coords[i][1], 16, dt))
# print(self.drawn_positions)
if len(self.coords) > len(self.drawn_positions):
self.drawn_positions.extend(self.coords[len(self.drawn_positions):])
for a, drawn_prediction in enumerate(self.drawn_predictions):
for i, pos in enumerate(drawn_prediction):
# TODO: this should be done in polar space starting from origin (i.e. self.drawn_posision[-1])
decay = max(3, (18/i) if i else 10) # points further away move with more delay
decay = 6
origin = self.drawn_positions[-1]
drawn_r, drawn_angle = relativePointToPolar( origin, drawn_prediction[i])
pred_r, pred_angle = relativePointToPolar(origin, self.pred_coords[a][i])
r = exponentialDecay(drawn_r, pred_r, decay, dt)
angle = exponentialDecay(drawn_angle, pred_angle, decay, dt)
x, y = relativePolarToPoint(origin, r, angle)
self.drawn_predictions[a][i] = int(x), int(y)
# self.drawn_predictions[i][0] = int(exponentialDecay(self.drawn_predictions[i][0], self.pred_coords[i][0], decay, dt))
# self.drawn_predictions[i][1] = int(exponentialDecay(self.drawn_predictions[i][1], self.pred_coords[i][1], decay, dt))
# 2. history as seen by predictor (Trajectron)
for i, pos in enumerate(self.drawn_pred_history):
if len(self.pred_history_coords) > i:
self.drawn_pred_history[i][0] = int_or_not(exponentialDecay(self.drawn_pred_history[i][0], self.pred_history_coords[i][0], 16, dt))
self.drawn_pred_history[i][1] = int_or_not(exponentialDecay(self.drawn_pred_history[i][1], self.pred_history_coords[i][1], 16, dt))
if len(self.pred_history_coords) > len(self.drawn_pred_history):
self.drawn_pred_history.extend(self.coords[len(self.drawn_pred_history):])
# 3. predictions
if len(self.pred_coords):
for a, drawn_prediction in enumerate(self.drawn_predictions):
for i, pos in enumerate(drawn_prediction):
# TODO: this should be done in polar space starting from origin (i.e. self.drawn_posision[-1])
decay = max(3, (18/i) if i else 10) # points further away move with more delay
decay = 16
origin = self.drawn_positions[-1]
drawn_r, drawn_angle = relativePointToPolar( origin, drawn_prediction[i])
pred_r, pred_angle = relativePointToPolar(origin, self.pred_coords[a][i])
r = exponentialDecay(drawn_r, pred_r, decay, dt)
angle = exponentialDecay(drawn_angle, pred_angle, decay, dt)
x, y = relativePolarToPoint(origin, r, angle)
self.drawn_predictions[a][i] = int_or_not(x), int_or_not(y)
# self.drawn_predictions[i][0] = int(exponentialDecay(self.drawn_predictions[i][0], self.pred_coords[i][0], decay, dt))
# self.drawn_predictions[i][1] = int(exponentialDecay(self.drawn_predictions[i][1], self.pred_coords[i][1], decay, dt))
if len(self.pred_coords) > len(self.drawn_predictions):
self.drawn_predictions.extend(self.pred_coords[len(self.drawn_predictions):])
@ -120,77 +167,131 @@ class DrawnTrack:
# self.drawn_predictions.extend(self.pred_coords[len(self.drawn_predictions):])
# self.drawn_positions = self.coords
# finally: update shapes from coordinates
self.update_shapes(dt)
return self.drawn_positions
def update_shapes(self, dt):
if len(self.shapes) > len(self.drawn_positions):
self.shapes = self.shapes[:len(self.drawn_positions)]
# for i, pos in self.drawn_positions.enumerate():
for ci in range(1, len(self.drawn_positions)):
x, y = [int(p) for p in self.drawn_positions[ci-1]]
x2, y2 = [int(p) for p in self.drawn_positions[ci]]
drawn_positions = convert_world_points_to_img_points(self.coords[:500]) # TODO)) Glitch in self.drawn_positions, now also capped
drawn_pred_history = convert_world_points_to_img_points(self.drawn_pred_history)
drawn_predictions = [convert_world_points_to_img_points(p) for p in self.drawn_predictions]
# positions = convert_world_points_to_img_points(self.drawn_predictions)
y, y2 = self.renderer.window.height - y, self.renderer.window.height - y2
color = [100+155*ci // len(self.drawn_positions)]*3
# print(x,y,x2,y2,color)
# print("drawn",
# drawn_positions,'self', self.drawn_positions
# )
if len(self.shapes) > len(drawn_positions):
self.shapes = self.shapes[:len(drawn_positions)]
# for i, pos in drawn_positions.enumerate():
draw_dot = False # if False, draw line
for_laser = True
if True:
for ci in range(1, len(drawn_positions)):
x, y = [int(p) for p in drawn_positions[ci-1]]
x2, y2 = [int(p) for p in drawn_positions[ci]]
y, y2 = self.renderer.window.height - y, self.renderer.window.height - y2
color = [100+155*ci // len(drawn_positions)]*3
# print(x,y,x2,y2,color)
if ci >= len(self.shapes):
# TODO: add color2
if draw_dot:
line = pyglet.shapes.Arc(x2, y2, 10, thickness=2, color=color, batch=self.renderer.batch_anim)
else:
line = self.renderer.gradientLine(x, y, x2, y2, 3, color, color, batch=self.renderer.batch_anim)
line.opacity = 20 if not for_laser else 255
self.shapes.append(line)
else:
line = self.shapes[ci-1]
line.x, line.y = x, y
if draw_dot:
line.radius = int(exponentialDecay(line.radius, 1.5, 3, dt))
else:
line.x2, line.y2 = x2, y2
line.color = color
if not for_laser:
line.opacity = int(exponentialDecay(line.opacity, 180, 8, dt))
if ci >= len(self.shapes):
# TODO: add color2
line = self.renderer.gradientLine(x, y, x2, y2, 3, color, color, batch=self.renderer.batch_anim)
line.opacity = 5
self.shapes.append(line)
else:
line = self.shapes[ci-1]
line.x, line.y = x, y
line.x2, line.y2 = x2, y2
line.color = color
line.opacity = int(exponentialDecay(line.opacity, 180, 3, dt))
# TODO: basically a duplication of the above, do this smarter?
# TODO: add intermediate segment
color = colorset[self.track_id % len(colorset)]
for a, drawn_predictions in enumerate(self.drawn_predictions):
if len(self.pred_shapes) <= a:
self.pred_shapes.append([])
if False:
if len(self.pred_history_shapes) > len(drawn_pred_history):
self.pred_history_shapes = self.pred_history_shapes[:len(drawn_pred_history)]
if len(self.pred_shapes[a]) > (len(drawn_predictions) +1):
self.pred_shapes[a] = self.pred_shapes[a][:len(drawn_predictions)]
# for i, pos in drawn_predictions.enumerate():
for ci in range(0, len(drawn_predictions)):
if ci == 0:
x, y = [int(p) for p in self.drawn_positions[-1]]
else:
x, y = [int(p) for p in drawn_predictions[ci-1]]
x2, y2 = [int(p) for p in drawn_predictions[ci]]
# for i, pos in drawn_pred_history.enumerate():
for ci in range(1, len(drawn_pred_history)):
x, y = [int(p) for p in drawn_pred_history[ci-1]]
x2, y2 = [int(p) for p in drawn_pred_history[ci]]
y, y2 = self.renderer.window.height - y, self.renderer.window.height - y2
# color = [255,0,0]
# print(x,y,x2,y2,color)
if ci >= len(self.pred_shapes[a]):
# TODO: add color2
line = self.renderer.gradientLine(x, y, x2, y2, 3, color, color, batch=self.renderer.batch_anim)
line.opacity = 5
self.pred_shapes[a].append(line)
if ci >= len(self.pred_history_shapes):
# line = self.renderer.gradientLine(x, y, x2, y2, 3, color, color, batch=self.renderer.batch_anim)
line = pyglet.shapes.Line(x,y ,x2, y2, 2.5, color, batch=self.renderer.batch_anim)
# line = pyglet.shapes.Arc(x2, y2, 10, thickness=2, color=color, batch=self.renderer.batch_anim)
line.opacity = 120
self.pred_history_shapes.append(line)
else:
line = self.pred_shapes[a][ci-1]
line = self.pred_history_shapes[ci-1]
line.x, line.y = x, y
line.x2, line.y2 = x2, y2
# line.radius = int(exponentialDecay(line.radius, 1.5, 3, dt))
line.color = color
decay = (16/ci) if ci else 16
half = len(drawn_predictions) / 2
if ci < half:
target_opacity = 180
line.opacity = int(exponentialDecay(line.opacity, 180, 8, dt))
if True:
for a, drawn_prediction in enumerate(drawn_predictions):
if len(self.pred_shapes) <= a:
self.pred_shapes.append([])
if len(self.pred_shapes[a]) > (len(drawn_prediction) +1):
self.pred_shapes[a] = self.pred_shapes[a][:len(drawn_prediction)]
# for i, pos in drawn_predictions.enumerate():
for ci in range(0, len(drawn_prediction)):
if ci == 0:
continue
# x, y = [int(p) for p in drawn_positions[-1]]
else:
target_opacity = (1 - ((ci - half) / half)) * 180
line.opacity = int(exponentialDecay(line.opacity, target_opacity, decay, dt))
x, y = [int(p) for p in drawn_prediction[ci-1]]
x2, y2 = [int(p) for p in drawn_prediction[ci]]
y, y2 = self.renderer.window.height - y, self.renderer.window.height - y2
# color = [255,0,0]
# print(x,y,x2,y2,color)
if ci >= len(self.pred_shapes[a]):
# TODO: add color2
# line = self.renderer.gradientLine(x, y, x2, y2, 3, color, color, batch=self.renderer.batch_anim)
line = pyglet.shapes.Line(x,y ,x2, y2, 1.5, color, batch=self.renderer.batch_anim)
# line = pyglet.shapes.Arc(x,y ,1.5, thickness=1.5, color=color, batch=self.renderer.batch_anim)
line.opacity = 5
self.pred_shapes[a].append(line)
else:
line = self.pred_shapes[a][ci-1]
line.x, line.y = x, y
line.x2, line.y2 = x2, y2
line.color = color
decay = (16/ci) if ci else 16
half = len(drawn_prediction) / 2
if ci < half:
target_opacity = 60
else:
target_opacity = (1 - ((ci - half) / half)) * 60
line.opacity = int(exponentialDecay(line.opacity, target_opacity, decay, dt))
class FrameWriter:
@ -232,7 +333,7 @@ class FrameWriter:
class Renderer:
class PreviewRenderer:
def __init__(self, config: Namespace, is_running: BaseEvent):
self.config = config
self.is_running = is_running
@ -241,7 +342,8 @@ class Renderer:
self.prediction_sock = context.socket(zmq.SUB)
self.prediction_sock.setsockopt(zmq.CONFLATE, 1) # only keep latest frame. NB. make sure this comes BEFORE connect, otherwise it's ignored!!
self.prediction_sock.setsockopt(zmq.SUBSCRIBE, b'')
self.prediction_sock.connect(config.zmq_prediction_addr if not self.config.bypass_prediction else config.zmq_trajectory_addr)
# self.prediction_sock.connect(config.zmq_prediction_addr if not self.config.bypass_prediction else config.zmq_trajectory_addr)
self.prediction_sock.connect(config.zmq_prediction_addr)
self.tracker_sock = context.socket(zmq.SUB)
self.tracker_sock.setsockopt(zmq.CONFLATE, 1) # only keep latest frame. NB. make sure this comes BEFORE connect, otherwise it's ignored!!
@ -253,14 +355,23 @@ class Renderer:
self.frame_sock.setsockopt(zmq.SUBSCRIBE, b'')
self.frame_sock.connect(config.zmq_frame_addr)
self.H = np.loadtxt(self.config.homography, delimiter=',')
# TODO)) Move loading H to config.py
# if self.config.homography.suffix == '.json':
# with self.config.homography.open('r') as fp:
# self.H = np.array(json.load(fp))
# else:
# self.H = np.loadtxt(self.config.homography, delimiter=',')
# print('h', self.config.H)
self.H = self.config.H
self.inv_H = np.linalg.pinv(self.H)
# TODO: get FPS from frame_emitter
# self.out = cv2.VideoWriter(str(filename), fourcc, 23.97, (1280,720))
self.fps = 60
self.frame_size = (1280,720)
self.frame_size = (self.config.camera.w,self.config.camera.h)
self.hide_stats = False
self.out_writer = self.start_writer() if self.config.render_file else None
self.streaming_process = self.start_streaming() if self.config.render_url else None
@ -649,17 +760,27 @@ class Renderer:
self.out_writer.release()
if self.streaming_process:
# oddly wrapped, because both close and release() take time.
logger.info('wait for closing stream')
self.streaming_process.wait()
logger.info('stopped')
# colorset = itertools.product([0,255], repeat=3) # but remove white
colorset = [(0, 0, 0),
(0, 0, 255),
(0, 255, 0),
(0, 255, 255),
(255, 0, 0),
(255, 0, 255),
(255, 255, 0)
# colorset = [(0, 0, 0),
# (0, 0, 255),
# (0, 255, 0),
# (0, 255, 255),
# (255, 0, 0),
# (255, 0, 255),
# (255, 255, 0)
# ]
colorset = [
(255,255,100),
(255,100,255),
(100,255,255),
]
# colorset = [
# (0,0,0),
# ]
# Deprecated
def decorate_frame(frame: Frame, prediction_frame: Frame, first_time: float, config: Namespace) -> np.array:
@ -772,6 +893,6 @@ def decorate_frame(frame: Frame, prediction_frame: Frame, first_time: float, con
return img
def run_renderer(config: Namespace, is_running: BaseEvent):
renderer = Renderer(config, is_running)
def run_preview_renderer(config: Namespace, is_running: BaseEvent):
renderer = PreviewRenderer(config, is_running)
renderer.run()

290
trap/process_data.py Normal file
View file

@ -0,0 +1,290 @@
from collections import defaultdict
import datetime
from pathlib import Path
from random import shuffle
import sys
import os
import time
from attr import dataclass
import numpy as np
import pandas as pd
import dill
import tqdm
import argparse
from typing import List
from trap.config import CameraAction, HomographyAction
from trap.frame_emitter import Camera
from trap.tracker import FinalDisplacementFilter, Smoother, TrackReader
#sys.path.append("../../")
from trajectron.environment import Environment, Scene, Node
from trajectron.utils import maybe_makedirs
from trajectron.environment import derivative_of
FPS = 12
desired_max_time = 100
pred_indices = [2, 3]
state_dim = 6
frame_diff = 10
desired_frame_diff = 1
dt = 1/FPS # dt per frame (e.g. 1/FPS)
smooth_window = FPS # see also tracker.py
min_track_length = 20
standardization = {
'PEDESTRIAN': {
'position': {
'x': {'mean': 0, 'std': 1},
'y': {'mean': 0, 'std': 1}
},
'velocity': {
'x': {'mean': 0, 'std': 2},
'y': {'mean': 0, 'std': 2}
},
'acceleration': {
'x': {'mean': 0, 'std': 1},
'y': {'mean': 0, 'std': 1}
}
}
}
class RollingAverage():
def __init__(self):
self.v = 0
self.n = 0
def add(self, v):
self.v = (self.v * self.n + v) / (self.n +1)
self.n += 1
return self.v
@dataclass
class TrackIteration:
smooth: bool
step_size: int
step_offset: int
@classmethod
def iteration_variations(cls, smooth = True, toggle_smooth=True, sample_step_size=1):
iterations: List[TrackIteration] = []
for i in range(sample_step_size):
iterations.append(TrackIteration(smooth, sample_step_size, i))
if toggle_smooth:
iterations.append(TrackIteration(not smooth, sample_step_size, i))
return iterations
# maybe_makedirs('trajectron-data')
# for desired_source in [ 'hof2', ]:# ,'hof-maskrcnn', 'hof-yolov8', 'VIRAT-0102-parsed', 'virat-resnet-keypoints-full']:
def process_data(src_dir: Path, dst_dir: Path, name: str, smooth_tracks: bool, cm_to_m: bool, center_data: bool, bin_positions: bool, camera: Camera, step_size: int, filter_displacement:float):
name += f"-{datetime.date.today()}"
print(f"Process data in {src_dir}, to {dst_dir}, identified by {name}")
nl = 0
l = 0
data_columns = pd.MultiIndex.from_product([['position', 'velocity', 'acceleration'], ['x', 'y']])
skipped_for_error = 0
created = 0
smoother = Smoother(window_len=smooth_window, convolution=True) if smooth_tracks else None
reader = TrackReader(src_dir, camera.fps)
tracks = [t for t in reader]
if filter_displacement > 0:
filter = FinalDisplacementFilter(filter_displacement)
tracks = filter.apply(tracks, camera)
total = len(tracks)
bar = tqdm.tqdm(total=total)
destinations = {
'train': int(total * .8),
'val': int(total * .12),
'test': int(total * .08),
}
max_track = reader.get(str(max([int(k) for k in reader._tracks.keys()])))
max_frame_nr = max_track.history[-1].frame_nr
print(max_frame_nr)
# separate call so cursor is kept during multiple loops
shuffle(tracks)
dt1 = RollingAverage()
dt2 = RollingAverage()
dt3 = RollingAverage()
dt4 = RollingAverage()
sets = {}
offset = 0
for data_class, nr in destinations.items():
# TODO)) think of a way to shuffle while keeping scenes
sets[data_class] = tracks[offset : offset+nr]
offset += nr
print(f"Camera FPS: {camera.fps}, actual fps: {camera.fps/step_size} (or {(1/camera.fps)*step_size})")
for data_class, nr_of_items in destinations.items():
env = Environment(node_type_list=['PEDESTRIAN'], standardization=standardization)
attention_radius = dict()
attention_radius[(env.NodeType.PEDESTRIAN, env.NodeType.PEDESTRIAN)] = 2.0
env.attention_radius = attention_radius
scenes = []
split_id = f"{name}_{data_class}"
data_dict_path = dst_dir / (split_id + '.pkl')
# subpath = src_dir / data_class
# prev_src_file = None
# scene = None
scene_nodes = defaultdict(lambda: [])
iterations = TrackIteration.iteration_variations(smooth_tracks, False, step_size)
for i, track in enumerate(sets[data_class]):
bar.update()
track_source = track.source
# if track.source != prev_src_file:
# scene =
tot = (dt1.v+dt2.v+dt3.v+dt4.v)
if tot:
bar.set_description(f"{data_dict_path.name} {track_source} ({dt1.v/tot:.4f}, {dt2.v/tot:.4f}, {dt3.v/tot:.4f}, {dt4.v/tot:.4f}) - {len(scene_nodes)}")
# for file in subpath.glob("*.txt"):]
input_data_dict = dict()
if len(track.history) < min_track_length:
continue
a = time.time()
interpolated_track = track.get_with_interpolated_history()
b = time.time()
for i_nr, iteration_settings in enumerate(iterations):
if iteration_settings.smooth:
track = smoother.smooth_track(interpolated_track)
# track = Smoother(smooth_window, False).smooth_track(track)
else:
track = interpolated_track # TODO)) Copy & move smooth outside iter loop
c = time.time()
if iteration_settings.step_size > 1:
track = track.get_sampled(iteration_settings.step_size, iteration_settings.step_offset)
# redo test, it might fall out again
if len(track.history) < min_track_length:
continue
# track.get_projected_history(H=None, camera=self.config.camera)
node = track.to_trajectron_node(camera, env)
data_class = time.time()
# if center_data:
# data['pos_x'] -= cx
# data['pos_y'] -= cy
# if bin_positions:
# data['pos_x'] =np.digitize(data['pos_x'], bins=space_x)
# data['pos_y'] =np.digitize(data['pos_y'], bins=space_y)
# print(data['pos_x'])
scene_nodes[f"{track_source}_{i_nr}"].append(node)
created+=1
e = time.time()
dt1.add(b-a)
dt2.add(c-b)
dt3.add(data_class-c)
dt4.add(e-data_class)
for scene_nr, nodes in scene_nodes.items():
first_ts = min([n.first_timestep for n in nodes])
for node in nodes:
node.first_timestep -= (first_ts - 1)
last_ts = max([n.last_timestep for n in nodes])
# print(sorted([n.first_timestep for n in nodes]))
scene = Scene(timesteps=last_ts, dt=(1/camera.fps)*step_size, name=f'{split_id}_{scene_nr}', aug_func=None)
scene.nodes.extend(nodes)
scenes.append(scene)
# print(scene)
# print(scene.nodes[0].first_timestep)
print(f'Processed {len(scenes):.2f} scene for data class {data_class}')
env.scenes = scenes
# print(env.scenes)
if len(scenes) > 0:
with open(data_dict_path, 'wb') as f:
dill.dump(env, f, protocol=dill.HIGHEST_PROTOCOL)
# print(f"Linear: {l}")
# print(f"Non-Linear: {nl}")
print(f"error: {skipped_for_error}, used: {created}")
def main():
parser = argparse.ArgumentParser()
parser.add_argument("--src-dir", "-s", type=Path, required=True, help="Directory with tracker output in .txt files")
parser.add_argument("--dst-dir", "-d", type=Path, required=True, help="Destination directory to store parsed .pkl files (typically 'trajectron-data')")
parser.add_argument("--name", "-n", type=str, required=True, help="Identifier to prefix the output .pkl files with (result is NAME-train.pkl, NAME-test.pkl)")
parser.add_argument("--smooth-tracks", action='store_true', help=f"Enable smoother. Set to {smooth_window} frames")
parser.add_argument("--cm-to-m", action='store_true', help=f"If homography is in cm, convert tracked points to meter for beter results")
parser.add_argument("--center-data", action='store_true', help=f"Normalise around center")
parser.add_argument("--bin-positions", action='store_true', help=f"Experiment to put round positions to a grid")
parser.add_argument("--step-size", type=int, default=1, help=f"Take only every n-th point")
parser.add_argument("--camera-fps",
help="Camera FPS",
type=int,
default=12)
parser.add_argument("--homography",
help="File with homography params",
type=Path,
default='../DATASETS/VIRAT_subset_0102x/VIRAT_0102_homography_img2world.txt',
action=HomographyAction)
parser.add_argument("--calibration",
help="File with camera intrinsics and lens distortion params (calibration.json)",
# type=Path,
default=None,
action=CameraAction)
parser.add_argument("--filter-displacement",
help="Filter tracks with a final displacement less then the given value",
# type=Path,
default=0,
type=float)
args = parser.parse_args()
# process_data(**args.__dict__)
process_data(
args.src_dir,
args.dst_dir,
args.name,
args.smooth_tracks,
args.cm_to_m,
args.center_data,
args.bin_positions,
args.camera,
args.step_size,
filter_displacement=args.filter_displacement
)

View file

@ -1,6 +1,9 @@
import collections
from re import A
import time
from multiprocessing.sharedctypes import RawValue, Value, Array
from ctypes import c_double
from typing import MutableSequence
class Timer():
@ -8,47 +11,58 @@ class Timer():
Measure 2 independent things: the freuency of tic, and the duration of tic->toc
Note that indeed these don't need to be equal
"""
def __init__(self) -> None:
self.last_tic = RawValue(c_double, -1)
self.last_toc = RawValue(c_double, -1)
self.fps = RawValue(c_double, -1)
self.processing_duration = RawValue(c_double, -1)
self.smoothing = .1
def tic(self):
now = time.time()
if self.last_tic is None:
self.last_tic = now
return
def __init__(self, name = 'timer') -> None:
self.name = name
self.tocs: MutableSequence[(float, int)] = collections.deque(maxlen=5)
self.iterations = Value('i', 0)
duration = now - self.last_tic
self.last_tic = now
# def tic(self):
# now = time.time()
# if self.last_tic is None:
# self.last_tic = now
# return
# duration = now - self.last_tic
# self.last_tic = now
current_fps = 1 / duration
if not self.fps:
self.fps = current_fps
else:
self.fps = self.fps * (1-self.smoothing) + current_fps * self.smoothing
# current_fps = 1 / duration
# if not self.fps:
# self.fps = current_fps
# else:
# self.fps = self.fps * (1-self.smoothing) + current_fps * self.smoothing
def toc(self):
self.last_toc = time.time()
duration = self.last_toc - self.last_tic
self.processing_duration = self.processing_duration * (1-self.smoothing) + duration * self.smoothing
self.iterations += 1
def snapshot(self):
self.tocs.append((time.perf_counter(), self.iterations.value))
@property
def fps(self):
pass
fpses = []
if len(self.tocs) < 2:
return 0
dt = self.tocs[-1][0] - self.tocs[0][0]
di = self.tocs[-1][1] - self.tocs[0][1]
return di/dt
class TimerCollection():
def __init__(self) -> None:
self._timers = set()
def print(self)->str:
print('Update', end='\r')
def snapshot(self):
for timer in self._timers:
timer.snapshot()
def to_string(self)->str:
strs = [f"{t.name} {t.fps:.2f}" for t in self._timers]
return " ".join(strs)
def new(self, name='timer'):
t = Timer(name)
self._timers.add(t)
return t

568
trap/tools.py Normal file
View file

@ -0,0 +1,568 @@
from argparse import Namespace
import json
import math
from pathlib import Path
import pickle
from tempfile import mktemp
import jsonlines
import numpy as np
import pandas as pd
import trap.tracker
from trap.config import parser
from trap.frame_emitter import Camera, Detection, DetectionState, video_src_from_config, Frame
from trap.tracker import DETECTOR_YOLOv8, FinalDisplacementFilter, Smoother, TrackReader, _yolov8_track, Track, TrainingDataWriter, Tracker, read_tracks_json
from collections import defaultdict
import logging
import cv2
from typing import Callable, List, Iterable, Optional
from ultralytics import YOLO
from ultralytics.engine.results import Results as YOLOResult
import tqdm
from trap.utils import inv_lerp, lerp
logger = logging.getLogger('tools')
class FrameGenerator():
def __init__(self, config):
self.video_srcs = video_src_from_config(config)
self.config = config
if not hasattr(config, "H"):
raise RuntimeError("Set homography file with --homography param")
# store current position
self.video_path = None
self.video_nr = None
self.frame_count = None
self.frame_idx = None
self.n = 0
def __iter__(self):
for video_nr, video_path in enumerate(self.video_srcs):
self.video_path = video_path
self.video_nr = video_nr
logger.info(f"Play from '{str(video_path)}'")
video = cv2.VideoCapture(str(video_path))
fps = video.get(cv2.CAP_PROP_FPS)
self.frame_count = video.get(cv2.CAP_PROP_FRAME_COUNT)
if self.frame_count < 0:
self.frame_count = math.inf
self.frame_idx = 0
if self.config.video_offset:
logger.info(f"Start at frame {self.config.video_offset}")
video.set(cv2.CAP_PROP_POS_FRAMES, self.config.video_offset)
self.frame_idx = self.config.video_offset
while True:
ret, img = video.read()
self.frame_idx+=1
self.n+=1
# seek to 0 if video has finished. Infinite loop
if not ret:
# now loading multiple files
break
frame = Frame(index=self.n, img=img, H=self.config.H, camera=self.config.camera)
yield frame
def marquee_string(string: str, window: int, i: int):
if window > len(string):
return string
# too_much = len(string) - window
# offset = i % too_much
# return string[offset:offset+window]
too_much = len(string) - window
offset = i % (too_much*2)
if offset > too_much:
offset = too_much - (offset-too_much)
return string[offset:offset+window]
def tracker_preprocess():
config = parser.parse_args()
tracker = Tracker(config)
# model = YOLO('EXPERIMENTS/yolov8x.pt')
with TrainingDataWriter(config.save_for_training) as writer:
bar = tqdm.tqdm()
tracks = defaultdict(lambda: Track())
total = 0
frames = FrameGenerator(config)
total_tracks = set()
for frame in frames:
bar.update()
detections = tracker.track_frame(frame)
total += len(detections)
# detections = _yolov8_track(frame, model, imgsz=1440, classes=[0])
for detection in detections:
track = tracks[detection.track_id]
track.track_id = detection.track_id # for new tracks
track.history.append(detection) # add to history
active_track_ids = [d.track_id for d in detections]
active_tracks = {t.track_id: t for t in tracks.values() if t.track_id in active_track_ids}
total_tracks.update(active_track_ids)
bar.set_description(f"{frames.video_nr}/{len(frames.video_srcs)} [{frames.frame_idx}/{frames.frame_count}] {marquee_string(str(frames.video_path), 10, frames.n//2)} | dets {len(detections)}: {[d.track_id for d in detections]} (∑{total}{len(total_tracks)})")
writer.add(frame, active_tracks.values())
logger.info("Done!")
bgr_colors = [
(255, 0, 0),
(0, 255, 0),
# (0, 0, 255),# red used for missing waypoints
(0, 255, 255),
]
def detection_color(detection: Detection, i, prev_detection: Optional[Detection] = None):
vague = detection.state == DetectionState.Lost or (prev_detection and detection.frame_nr - prev_detection.frame_nr > 1)
return bgr_colors[i % len(bgr_colors)] if not vague else (0,0,255)
def to_point(coord):
return (int(coord[0]), int(coord[1]))
def tracker_compare():
config = parser.parse_args()
trackers: List[Tracker] = []
# TODO, support all tracker.DETECTORS
for tracker_id in [
trap.tracker.DETECTOR_YOLOv8,
# trap.tracker.DETECTOR_MASKRCNN,
# trap.tracker.DETECTOR_RETINANET,
trap.tracker.DETECTOR_FASTERRCNN,
]:
tracker_config = Namespace(**vars(config))
tracker_config.detector = tracker_id
trackers.append(Tracker(tracker_config))
frames = FrameGenerator(config)
bar = tqdm.tqdm(frames)
cv2.namedWindow("frame", cv2.WND_PROP_FULLSCREEN)
cv2.setWindowProperty("frame",cv2.WND_PROP_FULLSCREEN,cv2.WINDOW_FULLSCREEN)
for frame in bar:
# frame.img = cv2.undistort(frame.img, config.camera.mtx, config.camera.dist, None, config.camera.newcameramtx) # try to undistort for better detections, seems not to matter at all
trackers_detections = [(t, t.track_frame(frame)) for t in trackers]
for i, tracker in enumerate(trackers):
cv2.putText(frame.img, tracker.config.detector, (10,30*(i+1)), cv2.FONT_HERSHEY_DUPLEX, 1, color=bgr_colors[i % len(bgr_colors)])
for i, (tracker, detections) in enumerate(trackers_detections):
for track_id in tracker.tracks:
draw_track(frame.img, tracker.tracks[track_id], i)
for detection in detections:
color = color = detection_color(detection, i)
l, t, r, b = detection.to_ltrb()
cv2.rectangle(frame.img, (l, t), (r,b), color)
cv2.putText(frame.img, f"{detection.track_id}", (l, b+10), cv2.FONT_HERSHEY_DUPLEX, 1, color=color)
conf = f"{detection.conf:.3f}" if detection.conf is not None else "None"
cv2.putText(frame.img, f"{detection.det_class} - {conf}", (l, t), cv2.FONT_HERSHEY_DUPLEX, .7, color=color)
cv2.imshow('frame',cv2.resize(frame.img, (1920, 1080)))
cv2.waitKey(1)
bar.set_description(f"[{frames.video_nr}/{len(frames.video_srcs)}] [{frames.frame_idx}/{frames.frame_count}] {str(frames.video_path)}")
def transition_path_points(path: np.array, t: float):
"""
"""
if t >= 1:
return path
if t <= 0:
return np.array([path[0]])
# new_path = np.array([])
lengths = np.sqrt(np.sum(np.diff(path, axis=0)**2, axis=1))
cum_lenghts = np.cumsum(lengths)
# distance = cum_lenghts[-1] * t
ts = np.concatenate((np.array([0.]), cum_lenghts / cum_lenghts[-1]))
new_path = [path[0]]
for a, b, t_a, t_b in zip(path[:-1], path[1:], ts[:-1], ts[1:]):
if t_b < t:
new_path.append(b)
continue
# interpolate
relative_t = inv_lerp(t_a, t_b, t)
x = lerp(a[0], b[0], relative_t)
y = lerp(a[1], b[1], relative_t)
print(relative_t, a , b, x, y)
new_path.append([x,y])
break
return np.array(new_path)
def draw_track_predictions(img: cv2.Mat, track: Track, color_index: int, camera:Camera, convert_points: Optional[Callable], anim_position=.8):
"""
anim_position: 0-1
"""
if not track.predictions:
return
current_point = track.get_projected_history(camera=camera)[-1]
opacity = 1-min(1, max(0, inv_lerp(0.8, 1, anim_position))) # fade out
slide_t = min(1, max(0, inv_lerp(0, 0.8, anim_position))) # slide_position
# if convert_points:
# current_point = convert_points([current_point])[0]
for pred_i, pred in enumerate(track.predictions):
pred_coords = pred #cv2.perspectiveTransform(np.array([pred]), inv_H)[0].tolist()
# line_points = np.concatenate(([current_point], pred_coords)) # 'current point' is amoving target
line_points = pred_coords
# print(pred_coords, current_point, line_points)
line_points = transition_path_points(line_points, slide_t)
if convert_points:
line_points = convert_points(line_points)
# color = (128,0,128) if pred_i else (128,128,0)
color = bgr_colors[color_index % len(bgr_colors)]
color = tuple([int(c*opacity) for c in color])
for start, end in zip(line_points[:-1], line_points[1:]):
# for ci in range(0, len(pred_coords)):
# if ci == 0:
# # TODO)) prev point
# # continue
# start = [int(p) for p in current_point]
# # start = [int(p) for p in coords[-1]]
# # start = [0,0]?
# # print(start)
# else:
# start = [int(p) for p in pred_coords[ci-1]]
# end = [int(p) for p in pred_coords[ci]]
# print(np.rint(start),np.rint(end).tolist())
cv2.line(img, np.rint(start).astype(int), np.rint(end).astype(int), color, 1, lineType=cv2.LINE_AA)
# cv2.circle(img, end, 2, color, 1, lineType=cv2.LINE_AA)
def draw_trackjectron_history(img: cv2.Mat, track: Track, color_index: int, convert_points: Optional[Callable]):
if not track.predictor_history:
return
coords = track.predictor_history #cv2.perspectiveTransform(np.array([track.predictor_history]), inv_H)[0].tolist()
if convert_points:
coords = convert_points(coords)
# color = (128,0,128) if pred_i else (128,128,0)
color = tuple(b/2 for b in bgr_colors[color_index % len(bgr_colors)])
for ci in range(0, len(coords)):
if ci == 0:
# TODO)) prev point
continue
# start = [int(p) for p in coords[-1]]
# start = [0,0]?
# print(start)
else:
start = [int(p) for p in coords[ci-1]]
end = [int(p) for p in coords[ci]]
cv2.line(img, start, end, color, 1, lineType=cv2.LINE_AA)
cv2.circle(img, end, 4, color, 1, lineType=cv2.LINE_AA)
def draw_track_projected(img: cv2.Mat, track: Track, color_index: int, camera: Camera, convert_points: Optional[Callable]):
history = track.get_projected_history(camera=camera)
if convert_points:
history = convert_points(history)
cv2.putText(img, f"{track.track_id} ({len(history)})", to_point(history[0]), cv2.FONT_HERSHEY_DUPLEX, 1, color=bgr_colors[color_index % len(bgr_colors)])
point_color = bgr_colors[color_index % len(bgr_colors)]
cv2.circle(img, to_point(history[0]), 3, point_color, 2)
for j in range(len(history)-1):
a = history[j]
b = history[j+1]
cv2.line(img, to_point(a), to_point(b), point_color, 1)
cv2.circle(img, to_point(b), 3, point_color, 2)
def draw_track(img: cv2.Mat, track: Track, color_index: int):
history = track.history
cv2.putText(img, f"{track.track_id} ({len(history)})", to_point(history[0].get_foot_coords()), cv2.FONT_HERSHEY_DUPLEX, 1, color=bgr_colors[color_index % len(bgr_colors)])
point_color = detection_color(history[0], color_index)
cv2.circle(img, to_point(history[0].get_foot_coords()), 3, point_color, 2)
for j in range(len(history)-1):
a = history[j]
b = history[j+1]
# TODO)) replace with Track.get_with_interpolated_history()
# gap = b.frame_nr - a.frame_nr - 1
# if gap < 0:
# print(f"WARNING, gap between frames {a.frame_nr} -> {b.frame_nr} is negative?")
# if gap > 0:
# for g in range(gap):
# p1 = a.get_foot_coords()
# p2 = b.get_foot_coords()
# point = (lerp(p1[0], p2[0], g/gap), lerp(p1[1], p2[1], g/gap))
# cv2.circle(img, to_point(point), 3, (0,0,255), 1)
color = detection_color(b, color_index, a)
cv2.line(img, to_point(a.get_foot_coords()), to_point(b.get_foot_coords()), color, 1)
point_color = detection_color(b, color_index)
cv2.circle(img, to_point(b.get_foot_coords()), 3, point_color, 2)
def blacklist_tracks():
config = parser.parse_args()
cv2.namedWindow("frame", cv2.WND_PROP_FULLSCREEN)
cv2.setWindowProperty("frame",cv2.WND_PROP_FULLSCREEN,cv2.WINDOW_FULLSCREEN)
backdrop = cv2.imread('../DATASETS/hof3/output.png')
blacklist = []
path: Path = config.save_for_training
reader = TrackReader(path, config.camera.fps, exclude_whitelisted = True)
tracks = [t for t in reader]
filter = FinalDisplacementFilter(2.0)
tracks = filter.apply(tracks, config.camera)
# blacklist_file = path / "blacklist.jsonl"
# whitelist_file = path / "whitelist.jsonl" # for skipping
# tracks_file = path / "tracks.json"
# FPS = 12 # TODO)) From config
# if whitelist_file.exists():
# # with whitelist_file.open('r') as fp:
# with jsonlines.open(whitelist_file, 'r') as reader:
# whitelist = [l for l in reader.iter(type=str)]
# else:
# whitelist = []
smoother = Smoother()
try:
for track in tqdm.tqdm(tracks):
if len(track.history) < 5:
continue
img = backdrop.copy()
draw_track(img, track.get_with_interpolated_history(), 0)
draw_track(img, smoother.smooth_track(track.get_with_interpolated_history()).get_sampled(5), 1)
imgS = cv2.resize(img, (1920, 1080))
cv2.imshow('frame', imgS)
while True:
k = cv2.waitKey(0)
if k==27: # Esc key to stop
raise StopIteration
elif k == ord('s'):
break # skip for now
elif k == ord('y'):
print('whitelist', track.track_id)
with jsonlines.open(reader.whitelist_file, mode='a') as writer:
# skip next time around
writer.write(track.track_id)
break
elif k == ord('n'):
print('blacklist', track.track_id)
# logger.info(f"Append {len(track)} items to {str(reader.blacklist_file)}")
with jsonlines.open(reader.blacklist_file, mode='a') as writer:
writer.write(track.track_id)
break
else:
# ignore all other keypresses
print(k) # else print its value
continue
except StopIteration as e:
pass
def rewrite_raw_track_files():
logging.basicConfig(level=logging.DEBUG)
config = parser.parse_args()
trap.tracker.rewrite_raw_track_files(config.save_for_training)
def interpolate_missing_frames(data: pd.DataFrame):
missing=0
old_size=len(data)
# slow way to append missing steps to the dataset
for ind, row in tqdm.tqdm(data.iterrows()):
if row['diff'] > 1:
for s in range(1, int(row['diff'])):
# add as many entries as missing
missing += 1
data.loc[len(data)] = [row['frame_id']-s, row['track_id'], np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 1, 1]
# new_frame = [data.loc[ind-1]['frame_id']+s, row['track_id'], np.nan, np.nan, np.nan, np.nan, np.nan]
# data.loc[len(data)] = new_frame
logger.info(f'was:{old_size} added:{missing}, new length: {len(data)}')
# now sort, so that the added data is in the right place
data.sort_values(by=['track_id', 'frame_id'], inplace=True)
df=data.copy()
df = df.groupby('track_id').apply(lambda group: group.interpolate(method='linear'))
df.reset_index(drop=True, inplace=True)
# update diff, shouldnow be 1 | NaN
data['diff'] = data.groupby(['track_id'])['frame_id'].diff()
# data = df
return df
def smooth(data: pd.DataFrame):
df=data.copy()
if 'x_raw' not in df:
df['x_raw'] = df['x']
if 'y_raw' not in df:
df['y_raw'] = df['y']
print("Running smoother")
# print(df)
# from tsmoothie.smoother import KalmanSmoother, ConvolutionSmoother
smoother = Smoother(convolution=False)
def smoothing(data):
# smoother = ConvolutionSmoother(window_len=SMOOTHING_WINDOW, window_type='ones', copy=None)
return smoother.smooth(data).tolist()
# df=df.assign(smooth_data=smoother.smooth_data[0])
# return smoother.smooth_data[0].tolist()
# operate smoothing per axis
print("smooth x")
df['x'] = df.groupby('track_id')['x_raw'].transform(smoothing)
print("smooth y")
df['y'] = df.groupby('track_id')['y_raw'].transform(smoothing)
return df
def load_tracks_from_csv(file: Path, fps: float, grid_size: Optional[int] = None, sample: Optional[int] = None):
cache_file = Path('/tmp/load_tracks-smooth-' + file.name)
if cache_file.exists():
data = pd.read_pickle(cache_file)
else:
# grid_size is in points per meter
# sample: sample to every n-th point. Thus sample=5 converts 12fps to 2.4fps, and 4 to 3fps
data = pd.read_csv(file, delimiter="\t", index_col=False, header=None)
# l,t,w,h: image space (pixels)
# x,y: world space (meters or cm depending on homography)
data.columns = ['frame_id', 'track_id', 'l', 't', 'w', 'h', 'x', 'y', 'state']
data['frame_id'] = pd.to_numeric(data['frame_id'], downcast='integer')
data['frame_id'] = data['frame_id'] // 10 # compatibility with Trajectron++
data.sort_values(by=['track_id', 'frame_id'],inplace=True)
data.set_index(['track_id', 'frame_id'])
# cm to meter
data['x'] = data['x']/100
data['y'] = data['y']/100
if grid_size is not None:
data['x'] = (data['x']*grid_size).round() / grid_size
data['y'] = (data['y']*grid_size).round() / grid_size
data['diff'] = data.groupby(['track_id'])['frame_id'].diff() #.fillna(0)
data['diff'] = pd.to_numeric(data['diff'], downcast='integer')
data = interpolate_missing_frames(data)
data = smooth(data)
data.to_pickle(cache_file)
if sample is not None:
print(f"Samping 1/{sample}, of {data.shape[0]} items")
data["idx_in_track"] = data.groupby(['track_id']).cumcount() # create index in group
groups = data.groupby(['track_id'])
# print(groups, data)
# selection = groups['idx_in_track'].apply(lambda x: x % sample == 0)
# print(selection)
selection = data["idx_in_track"].apply(lambda x: x % sample == 0)
# data = data[selection]
data = data.loc[selection].copy() # avoid errors
# # convert from e.g. 12Hz, to 2.4Hz (1/5)
# sampled_groups = []
# for name, group in data.groupby('track_id'):
# sampled_groups.append(group.iloc[::sample])
# print(f"Sampled {len(sampled_groups)} groups")
# data = pd.concat(sampled_groups, axis=1).T
print(f"Done sampling kept {data.shape[0]} items")
# String ot int
data['track_id'] = pd.to_numeric(data['track_id'], downcast='integer')
# redo diff after possible sampling:
data['diff'] = data.groupby(['track_id'])['frame_id'].diff()
# timestep to seconds
data['dt'] = data['diff'] * (1/fps)
# "Deriving displacement, velocity and accelation from x and y")
data['dx'] = data.groupby(['track_id'])['x'].diff()
data['dy'] = data.groupby(['track_id'])['y'].diff()
data['vx'] = data['dx'].div(data['dt'], axis=0)
data['vy'] = data['dy'].div(data['dt'], axis=0)
data['ax'] = data.groupby(['track_id'])['vx'].diff().div(data['dt'], axis=0)
data['ay'] = data.groupby(['track_id'])['vy'].diff().div(data['dt'], axis=0)
# then we need the velocity itself
data['v'] = np.sqrt(data['vx'].pow(2) + data['vy'].pow(2))
# and derive acceleration
data['a'] = data.groupby(['track_id'])['v'].diff().div(data['dt'], axis=0)
# we can calculate heading based on the velocity components
data['heading'] = (np.arctan2(data['vy'], data['vx']) * 180 / np.pi) % 360
# and derive it to get the rate of change of the heading
data['d_heading'] = data.groupby(['track_id'])['heading'].diff().div(data['dt'], axis=0)
# we can backfill the derived parameters (v and a), assuming they were constant when entering the frame
# so that our model can make estimations, based on these assumed values
group = data.groupby(['track_id'])
for field in ['dx', 'dy', 'vx', 'vy', 'ax', 'ay', 'v', 'a', 'heading', 'd_heading']:
data[field] = group[field].bfill()
data.set_index(['track_id', 'frame_id'], inplace=True) # use for quick access
return data
def filter_short_tracks(data: pd.DataFrame, n):
return data.groupby(['track_id']).filter(lambda group: len(group) >= n) # a lenght of 3 is neccessary to have all relevant derivatives of position
# print(filtered_data.shape[0], "items in filtered set, out of", data.shape[0], "in total set")
def normalise_position(data: pd.DataFrame):
mu = data[['x','y']].mean(axis=0)
std = data[['x','y']].std(axis=0)
data[['x_norm','y_norm']] = (data[['x','y']] - mu) / std
return data, mu, std

View file

@ -4,17 +4,20 @@ import csv
from dataclasses import dataclass, field
import json
import logging
from math import nan
from multiprocessing import Event
from pathlib import Path
import pickle
import time
from typing import Optional
from typing import Dict, Optional, List
import jsonlines
import numpy as np
import torch
import torchvision
import zmq
import cv2
from torchvision.models.detection import retinanet_resnet50_fpn_v2, RetinaNet_ResNet50_FPN_V2_Weights, keypointrcnn_resnet50_fpn, KeypointRCNN_ResNet50_FPN_Weights, maskrcnn_resnet50_fpn_v2, MaskRCNN_ResNet50_FPN_V2_Weights
from torchvision.models.detection import retinanet_resnet50_fpn_v2, RetinaNet_ResNet50_FPN_V2_Weights, keypointrcnn_resnet50_fpn, KeypointRCNN_ResNet50_FPN_Weights, maskrcnn_resnet50_fpn_v2, MaskRCNN_ResNet50_FPN_V2_Weights, FasterRCNN_ResNet50_FPN_V2_Weights, fasterrcnn_resnet50_fpn_v2
from deep_sort_realtime.deepsort_tracker import DeepSort
from torchvision.models import ResNet50_Weights
from deep_sort_realtime.deep_sort.track import Track as DeepsortTrack
@ -22,11 +25,12 @@ from deep_sort_realtime.deep_sort.track import Track as DeepsortTrack
from ultralytics import YOLO
from ultralytics.engine.results import Results as YOLOResult
from trap.frame_emitter import DetectionState, Frame, Detection, Track
from trap.frame_emitter import Camera, DataclassJSONEncoder, DetectionState, Frame, Detection, Track
from bytetracker import BYTETracker
from tsmoothie.smoother import KalmanSmoother, ConvolutionSmoother
import tsmoothie.smoother
from datetime import datetime, timedelta
# Detection = [int, int, int, int, float, int]
# Detections = [Detection]
@ -44,26 +48,341 @@ DETECTOR_MASKRCNN = 'maskrcnn'
DETECTOR_FASTERRCNN = 'fasterrcnn'
DETECTOR_YOLOv8 = 'ultralytics'
TRACKER_DEEPSORT = 'deepsort'
TRACKER_BYTETRACK = 'bytetrack'
DETECTORS = [DETECTOR_RETINANET, DETECTOR_MASKRCNN, DETECTOR_FASTERRCNN, DETECTOR_YOLOv8]
TRACKERS =[TRACKER_DEEPSORT, TRACKER_BYTETRACK]
TRACKER_CONFIDENCE_MINIMUM = .2
TRACKER_BYTETRACK_MINIMUM = .1 # bytetrack can track items iwth lower thershold
NON_MAXIMUM_SUPRESSION = 1
RCNN_SCALE = .4 # seems to have no impact on detections in the corners
def _yolov8_track(frame: Frame, model: YOLO, **kwargs) -> List[Detection]:
results: List[YOLOResult] = list(model.track(frame.img, persist=True, tracker="custom_bytetrack.yaml", verbose=False, conf=0.00001, **kwargs))
if results[0].boxes is None or results[0].boxes.id is None:
# work around https://github.com/ultralytics/ultralytics/issues/5968
return []
boxes = results[0].boxes.xywh.cpu()
track_ids = results[0].boxes.id.int().cpu().tolist()
classes = results[0].boxes.cls.int().cpu().tolist()
return [Detection(track_id, bbox[0]-.5*bbox[2], bbox[1]-.5*bbox[3], bbox[2], bbox[3], 1, DetectionState.Confirmed, frame.index, class_id) for bbox, track_id, class_id in zip(boxes, track_ids, classes)]
class Multifile():
def __init__(self, srcs: List[Path]):
self.srcs = srcs
self.g = self.__iter__()
self.current_file = None
@property
def name(self):
return ", ".join([s.name for s in self.srcs])
def __iter__(self):
for path in self.srcs:
self.current_file = path
with path.open('r') as fp:
for l in fp:
yield l
def readline(self):
return self.g.__next__()
FIELDNAMES = ['frame_id', 'track_id', 'l', 't', 'w', 'h', 'x', 'y', 'state', 'source']
class TrackFilter:
pass
def apply(self, tracks: List[Track], camera: Camera):
return [t for t in tracks if self.filter(t, camera)]
def apply_to_dict(self, tracks: Dict[str, Track], camera: Camera):
tracks = self.apply(tracks.values(), camera)
return {t.track_id: t for t in tracks}
class FinalDisplacementFilter(TrackFilter):
def __init__(self, min_displacement):
self.min_displacement = min_displacement
def filter(self, track: Track, camera: Camera):
history = track.get_projected_history(H=None, camera=camera)
displacement = np.linalg.norm(history[0]-history[-1])
return displacement > self.min_displacement
class TrackReader:
def __init__(self, path: Path, fps: int, include_blacklisted = False, exclude_whitelisted = False):
self.blacklist_file = path / "blacklist.jsonl"
self.whitelist_file = path / "whitelist.jsonl" # for skipping
self.tracks_file = path / "tracks.pkl"
# with self.tracks_file.open('r') as fp:
# tracks_dict: dict = json.load(fp)
with self.tracks_file.open('rb') as fp:
tracks: dict = pickle.load(fp)
if self.blacklist_file.exists():
with jsonlines.open(self.blacklist_file, 'r') as reader:
blacklist = [track_id for track_id in reader.iter(type=str)]
else:
blacklist = []
if self.whitelist_file.exists():
with jsonlines.open(self.whitelist_file, 'r') as reader:
whitelist = [track_id for track_id in reader.iter(type=str)]
else:
whitelist = []
self._tracks = { track_id: detection_values
for track_id, detection_values in tracks.items()
if (include_blacklisted or track_id not in blacklist) and
(not exclude_whitelisted or track_id not in whitelist)
}
self.fps = fps
def __len__(self):
return len(self._tracks)
def get(self, track_id):
return self._tracks[track_id]
# detection_values = self._tracks[track_id]
# history = []
# # for detection_values in
# source = None
# for detection_items in detection_values:
# d = dict(zip(FIELDNAMES, detection_items))
# history.append(Detection(
# d['track_id'],
# d['l'],
# d['t'],
# d['w'],
# d['h'],
# nan,
# d['state'],
# d['frame_id'],
# 1,
# ))
# source = int(d['source'])
# return Track(track_id, history, fps=self.fps, source=source)
def __iter__(self):
for track_id in self._tracks:
yield self.get(track_id)
def read_tracks_json(path: Path, fps):
"""
Reader for tracks.json produced by TrainingDataWriter
"""
reader = TrackReader(path, fps)
for t in reader:
yield t
class TrainingDataWriter:
def __init__(self, training_path: Optional[Path]):
if training_path is None:
self.path = None
return
if not isinstance(training_path, Path):
raise ValueError("save-for-training should be a path")
if not training_path.exists():
logger.info(f"Making path for training data: {training_path}")
training_path.mkdir(parents=True, exist_ok=False)
else:
logger.warning(f"Path for training-data exists: {training_path}. Continuing assuming that's ok.")
# following https://github.com/StanfordASL/Trajectron-plus-plus/blob/master/experiments/pedestrians/process_data.py
self.path = training_path
def __enter__(self):
if self.path:
d = datetime.now().isoformat(timespec="minutes")
self.training_fp = open(self.path / f'all-{d}.txt', 'w')
logger.debug(f"Writing tracker data to {self.training_fp.name}")
# following https://github.com/StanfordASL/Trajectron-plus-plus/blob/master/experiments/pedestrians/process_data.py
self.csv = csv.DictWriter(self.training_fp, fieldnames=FIELDNAMES, delimiter='\t', quoting=csv.QUOTE_NONE)
self.count = 0
return self
def add(self, frame: Frame, tracks: List[Track]):
if not self.path:
# skip if disabled
return
self.csv.writerows([{
'frame_id': round(frame.index * 10., 1), # not really time
'track_id': t.track_id,
'l': float(t.history[-1].l), # to float, so we're sure it's not a torch.tensor()
't': float(t.history[-1].t),
'w': float(t.history[-1].w),
'h': float(t.history[-1].h),
'x': t.get_projected_history(frame.H, frame.camera)[-1][0],
'y': t.get_projected_history(frame.H, frame.camera)[-1][1],
'state': t.history[-1].state.value
# only keep _actual_detections, no lost entries
} for t in tracks
# if t.history[-1].state != DetectionState.Lost
])
self.count += len(tracks)
def __exit__(self, exc_type, exc_value, exc_tb):
# ... ignore exception (type, value, traceback)
if not self.path:
return
self.training_fp.close()
rewrite_raw_track_files(self.path)
def rewrite_raw_track_files(path: Path):
source_files = list(sorted(path.glob("*.txt"))) # we loop twice, so need a list instead of generator
total = 0
sources = Multifile(source_files)
for line in sources:
if len(line) > 3: # make sure not to count empty lines
total += 1
destinations = {
'train': int(total * .8),
'val': int(total * .12),
'test': int(total * .08),
}
logger.info(f"Splitting gathered data from {source_files}")
# for source_file in source_files:
tracks_file = path / 'tracks.json'
tracks_pkl = path / 'tracks.pkl'
tracks = defaultdict(lambda: Track())
offset = 0
max_track_id = 0
prev_file = None
# all-2024-11-12T13:30.txt
file_date = None
src_file_nr = 0
for name, line_nrs in destinations.items():
dir_path = path / name
dir_path.mkdir(exist_ok=True)
file = dir_path / 'tracked.txt'
logger.debug(f"- Write {line_nrs} lines to {file}")
with file.open('w') as target_fp:
for i in range(line_nrs):
line = sources.readline()
current_file = sources.current_file
if prev_file != current_file:
offset: int = max_track_id
logger.info(f'{name} - update offset {offset} ({sources.current_file})')
prev_file = current_file
src_file_nr += 1
try:
file_date = datetime.strptime(current_file.name, 'all-%Y-%m-%dT%H:%M.txt')
except ValueError as e:
logger.error(str(e))
file_date = None
parts = line.split('\t')
track_id = int(parts[1]) + offset
if file_date:
frame_date = file_date + timedelta(seconds = int(float(parts[0]))//10)
else:
frame_date = None
if track_id > max_track_id:
max_track_id = track_id
track_id = str(track_id)
target_fp.write("\t".join(parts))
parts = [float(p) for p in parts]
# ['frame_id', 'track_id', 'l', 't', 'w', 'h', 'x', 'y', 'state', 'source']
point = Detection(track_id, parts[2], parts[3], parts[4], parts[5], 1, DetectionState(int(parts[8])), int(parts[0]/10), 1)
# history = [
# for d in parts]
tracks[track_id].track_id = track_id
tracks[track_id].source = src_file_nr
tracks[track_id].history.append(point)
# tracks[track_id].append([
# int(parts[0] / 10),
# track_id,
# ] + parts[2:8] + [int(parts[8]), src_file_nr])
with tracks_file.open('w') as fp:
logger.info(f"Write {len(tracks)} tracks to {str(tracks_file)}")
json.dump(tracks, fp, cls=DataclassJSONEncoder, indent=2)
with tracks_pkl.open('wb') as fp:
logger.info(f"Write {len(tracks)} tracks to {str(tracks_pkl)}")
pickle.dump(dict(tracks), fp)
class TrackerWrapper():
def __init__(self, tracker):
self.tracker = tracker
def track_detections():
raise RuntimeError("Not implemented")
@classmethod
def init_type(cls, tracker_type: str):
if tracker_type == TRACKER_BYTETRACK:
return ByteTrackWrapper(BYTETracker(track_thresh=TRACKER_BYTETRACK_MINIMUM, match_thresh=TRACKER_CONFIDENCE_MINIMUM, frame_rate=12)) # TODO)) Framerate from emitter
else:
return DeepSortWrapper(DeepSort(n_init=5, max_age=30, nms_max_overlap=NON_MAXIMUM_SUPRESSION,
embedder='torchreid', embedder_wts="../MODELS/osnet_x1_0_imagenet.pth"
))
class DeepSortWrapper(TrackerWrapper):
def track_detections(self, detections, img: cv2.Mat, frame_idx: int):
detections = Tracker.detect_persons_deepsort_wrapper(detections)
tracks: List[DeepsortTrack] = self.tracker.update_tracks(detections, frame=img)
active_tracks = [t for t in tracks if t.is_confirmed()]
return [Detection.from_deepsort(t, frame_idx) for t in active_tracks]
# raw_detections, embeds=None, frame=None, today=None, others=None, instance_masks=Non
class ByteTrackWrapper(TrackerWrapper):
def __init__(self, tracker: BYTETracker):
self.tracker = tracker
def track_detections(self, detections: tuple[list[float,float,float,float], float, float], img: cv2.Mat, frame_idx: int):
# detections
if detections.shape[0] == 0:
detections = np.ndarray((0,0)) # needs to be 2-D
_ = self.tracker.update(detections)
active_tracks = [track for track in self.tracker.tracked_stracks if track.is_activated]
active_tracks = [track for track in active_tracks if track.start_frame < (self.tracker.frame_id - 5)]
return [Detection.from_bytetrack(track, frame_idx) for track in active_tracks]
class Tracker:
def __init__(self, config: Namespace, is_running: Event):
def __init__(self, config: Namespace):
self.config = config
self.is_running = is_running
context = zmq.Context()
self.frame_sock = context.socket(zmq.SUB)
self.frame_sock.setsockopt(zmq.CONFLATE, 1) # only keep latest frame. NB. make sure this comes BEFORE connect, otherwise it's ignored!!
self.frame_sock.setsockopt(zmq.SUBSCRIBE, b'')
self.frame_sock.connect(config.zmq_frame_addr)
self.trajectory_socket = context.socket(zmq.PUB)
self.trajectory_socket.setsockopt(zmq.CONFLATE, 1) # only keep latest frame
self.trajectory_socket.bind(config.zmq_trajectory_addr)
# # TODO: config device
self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
@ -73,212 +392,243 @@ class Tracker:
logger.debug(f"Load tracker: {self.config.detector}")
conf = TRACKER_BYTETRACK_MINIMUM if self.config.tracker == TRACKER_BYTETRACK else TRACKER_CONFIDENCE_MINIMUM
if self.config.detector == DETECTOR_RETINANET:
# weights = RetinaNet_ResNet50_FPN_V2_Weights.DEFAULT
# self.model = retinanet_resnet50_fpn_v2(weights=weights, score_thresh=0.2)
weights = KeypointRCNN_ResNet50_FPN_Weights.DEFAULT
self.model = keypointrcnn_resnet50_fpn(weights=weights, box_score_thresh=0.35)
weights = KeypointRCNN_ResNet50_FPN_Weights.COCO_V1
self.model = keypointrcnn_resnet50_fpn(weights=weights, box_score_thresh=conf)
self.model.to(self.device)
# Put the model in inference mode
self.model.eval()
# Get the transforms for the model's weights
self.preprocess = weights.transforms().to(self.device)
self.mot_tracker = DeepSort(max_iou_distance=1, max_cosine_distance=0.5, max_age=15, nms_max_overlap=0.9,
# embedder='torchreid', embedder_wts="../MODELS/osnet_x1_0_imagenet.pth"
)
self.mot_tracker = TrackerWrapper.init_type(self.config.tracker)
elif self.config.detector == DETECTOR_FASTERRCNN:
# weights = RetinaNet_ResNet50_FPN_V2_Weights.DEFAULT
# self.model = retinanet_resnet50_fpn_v2(weights=weights, score_thresh=0.2)
weights = FasterRCNN_ResNet50_FPN_V2_Weights.COCO_V1
self.model = fasterrcnn_resnet50_fpn_v2(weights=weights, box_score_thresh=conf)
self.model.to(self.device)
# Put the model in inference mode
self.model.eval()
# Get the transforms for the model's weights
self.preprocess = weights.transforms().to(self.device)
self.mot_tracker = TrackerWrapper.init_type(self.config.tracker)
elif self.config.detector == DETECTOR_MASKRCNN:
weights = MaskRCNN_ResNet50_FPN_V2_Weights.COCO_V1
self.model = maskrcnn_resnet50_fpn_v2(weights=weights, box_score_thresh=0.7)
self.model = maskrcnn_resnet50_fpn_v2(weights=weights, box_score_thresh=conf) # if we use ByteTrack we can work with low probablity!
self.model.to(self.device)
# Put the model in inference mode
self.model.eval()
# Get the transforms for the model's weights
self.preprocess = weights.transforms().to(self.device)
self.mot_tracker = DeepSort(n_init=5, max_iou_distance=1, max_cosine_distance=0.5, max_age=15, nms_max_overlap=0.9,
# embedder='torchreid', embedder_wts="../MODELS/osnet_x1_0_imagenet.pth"
)
# self.mot_tracker = DeepSort(n_init=5, max_iou_distance=1, max_cosine_distance=0.2, max_age=15, nms_max_overlap=0.9,
self.mot_tracker = TrackerWrapper.init_type(self.config.tracker)
elif self.config.detector == DETECTOR_YOLOv8:
self.model = YOLO('EXPERIMENTS/yolov8x.pt')
# self.model = YOLO('EXPERIMENTS/yolov8x.pt')
self.model = YOLO('yolo11x.pt')
else:
raise RuntimeError(f"{self.config.detector} is not implemented yet. See --help")
# homography = list(source.glob('*img2world.txt'))[0]
self.H = np.loadtxt(self.config.homography, delimiter=',')
self.H = self.config.H
if self.config.smooth_tracks:
logger.info("Smoother enabled")
self.smoother = Smoother()
fps = 12 # TODO)) make configurable, or get from cam
self.smoother = Smoother(window_len=fps*5, convolution=False)
else:
logger.info("Smoother Disabled (enable with --smooth-tracks)")
logger.debug("Set up tracker")
def track_frame(self, frame: Frame):
if self.config.detector == DETECTOR_YOLOv8:
detections: List[Detection] = _yolov8_track(frame, self.model, classes=[0, 15, 16], imgsz=[1152, 640])
else :
detections: List[Detection] = self._resnet_track(frame, scale = RCNN_SCALE)
for detection in detections:
track = self.tracks[detection.track_id]
track.track_id = detection.track_id # for new tracks
track.fps = self.config.camera.fps # for new tracks
track.history.append(detection) # add to history
return detections
def track(self):
def track(self, is_running: Event, timer_counter: int = 0):
"""
Live tracking of frames coming in over zmq
"""
self.is_running = is_running
context = zmq.Context()
self.frame_sock = context.socket(zmq.SUB)
self.frame_sock.setsockopt(zmq.CONFLATE, 1) # only keep latest frame. NB. make sure this comes BEFORE connect, otherwise it's ignored!!
self.frame_sock.setsockopt(zmq.SUBSCRIBE, b'')
self.frame_sock.connect(self.config.zmq_frame_addr)
self.trajectory_socket = context.socket(zmq.PUB)
self.trajectory_socket.setsockopt(zmq.CONFLATE, 1) # only keep latest frame
self.trajectory_socket.bind(self.config.zmq_trajectory_addr)
prev_run_time = 0
training_fp = None
training_csv = None
training_frames = 0
# training_fp = None
# training_csv = None
# training_frames = 0
if self.config.save_for_training is not None:
if not isinstance(self.config.save_for_training, Path):
raise ValueError("save-for-training should be a path")
if not self.config.save_for_training.exists():
logger.info(f"Making path for training data: {self.config.save_for_training}")
self.config.save_for_training.mkdir(parents=True, exist_ok=False)
else:
logger.warning(f"Path for training-data exists: {self.config.save_for_training}. Continuing assuming that's ok.")
training_fp = open(self.config.save_for_training / 'all.txt', 'w')
# following https://github.com/StanfordASL/Trajectron-plus-plus/blob/master/experiments/pedestrians/process_data.py
training_csv = csv.DictWriter(training_fp, fieldnames=['frame_id', 'track_id', 'l', 't', 'w', 'h', 'x', 'y', 'state'], delimiter='\t', quoting=csv.QUOTE_NONE)
# if self.config.save_for_training is not None:
# if not isinstance(self.config.save_for_training, Path):
# raise ValueError("save-for-training should be a path")
# if not self.config.save_for_training.exists():
# logger.info(f"Making path for training data: {self.config.save_for_training}")
# self.config.save_for_training.mkdir(parents=True, exist_ok=False)
# else:
# logger.warning(f"Path for training-data exists: {self.config.save_for_training}. Continuing assuming that's ok.")
# training_fp = open(self.config.save_for_training / 'all.txt', 'w')
# # following https://github.com/StanfordASL/Trajectron-plus-plus/blob/master/experiments/pedestrians/process_data.py
# training_csv = csv.DictWriter(training_fp, fieldnames=['frame_id', 'track_id', 'l', 't', 'w', 'h', 'x', 'y', 'state'], delimiter='\t', quoting=csv.QUOTE_NONE)
prev_frame_i = -1
while self.is_running.is_set():
# this waiting for target_dt causes frame loss. E.g. with target_dt at .1, it
# skips exactly 1 frame on a 10 fps video (which, it obviously should not do)
# so for now, timing should move to emitter
# this_run_time = time.time()
# # logger.debug(f'test {prev_run_time - this_run_time}')
# time.sleep(max(0, prev_run_time - this_run_time + TARGET_DT))
# prev_run_time = time.time()
zmq_ev = self.frame_sock.poll(timeout=2000)
if not zmq_ev:
logger.warn('skip poll after 2000ms')
# when there's no data after timeout, loop so that is_running is checked
continue
start_time = time.time()
frame: Frame = self.frame_sock.recv_pyobj() # frame delivery in current setup: 0.012-0.03s
if frame.index > (prev_frame_i+1):
logger.warn(f"Dropped {frame.index - prev_frame_i - 1} frames ({frame.index=}, {prev_frame_i=})")
prev_frame_i = frame.index
# load homography into frame (TODO: should this be done in emitter?)
if frame.H is None:
# logger.warning('Falling back to default H')
# fallback: load configured H
frame.H = self.H
# logger.info(f"Frame delivery delay = {time.time()-frame.time}s")
if self.config.detector == DETECTOR_YOLOv8:
detections: [Detection] = self._yolov8_track(frame)
else :
detections: [Detection] = self._resnet_track(frame.img, scale = 1)
with TrainingDataWriter(self.config.save_for_training) as writer:
end_time = None
tracker_dt = None
w_time = None
displacement_filter = FinalDisplacementFilter(.2)
while self.is_running.is_set():
# Store detections into tracklets
projected_coordinates = []
for detection in detections:
track = self.tracks[detection.track_id]
track.track_id = detection.track_id # for new tracks
with timer_counter.get_lock():
timer_counter.value += 1
# this waiting for target_dt causes frame loss. E.g. with target_dt at .1, it
# skips exactly 1 frame on a 10 fps video (which, it obviously should not do)
# so for now, timing should move to emitter
# this_run_time = time.time()
# # logger.debug(f'test {prev_run_time - this_run_time}')
# time.sleep(max(0, prev_run_time - this_run_time + TARGET_DT))
# prev_run_time = time.time()
poll_time = time.time()
zmq_ev = self.frame_sock.poll(timeout=2000)
if not zmq_ev:
logger.warning('skip poll after 2000ms')
# when there's no data after timeout, loop so that is_running is checked
continue
start_time = time.time()
frame: Frame = self.frame_sock.recv_pyobj() # frame delivery in current setup: 0.012-0.03s
if frame.index > (prev_frame_i+1):
logger.warning(f"Dropped {frame.index - prev_frame_i - 1} frames ({frame.index=}, {prev_frame_i=}) -- poll time {start_time-poll_time:.5f}")
if tracker_dt:
logger.warning(f"last loop took {tracker_dt} (finished {start_time - end_time:0.5f} ago, writing took {w_time-end_time} and finshed {start_time - w_time} ago).. {writer.path}")
track.history.append(detection) # add to history
# projected_coordinates.append(track.get_projected_history(self.H)) # then get full history
# TODO: hadle occlusions, and dissappearance
# if len(track.history) > 30: # retain 90 tracks for 90 frames
# track.history.pop(0)
prev_frame_i = frame.index
# load homography into frame (TODO: should this be done in emitter?)
if frame.H is None:
# logger.warning('Falling back to default H')
# fallback: load configured H
frame.H = self.H
# trajectories = {}
# for detection in detections:
# tid = str(detection.track_id)
# track = self.tracks[detection.track_id]
# coords = track.get_projected_history(self.H) # get full history
# trajectories[tid] = {
# "id": tid,
# "det_conf": detection.conf,
# "bbox": detection.to_ltwh(),
# "history": [{"x":c[0], "y":c[1]} for c in coords[0]] if not self.config.bypass_prediction else coords[0].tolist() # already doubles nested, fine for test
# }
active_track_ids = [d.track_id for d in detections]
active_tracks = {t.track_id: t for t in self.tracks.values() if t.track_id in active_track_ids}
# logger.info(f"{trajectories}")
frame.tracks = active_tracks
# logger.info(f"Frame delivery delay = {time.time()-frame.time}s")
# if self.config.bypass_prediction:
# self.trajectory_socket.send_string(json.dumps(trajectories))
# else:
# self.trajectory_socket.send(pickle.dumps(frame))
if self.config.smooth_tracks:
frame = self.smoother.smooth_frame_tracks(frame)
self.trajectory_socket.send_pyobj(frame)
current_time = time.time()
logger.debug(f"Trajectories: {len(active_tracks)}. Current frame delay = {current_time-frame.time}s (trajectories: {current_time - start_time}s)")
# self.trajectory_socket.send_string(json.dumps(trajectories))
# provide a {ID: {id: ID, history: [[x,y],[x,y],...]}}
# TODO: provide a track object that actually keeps history (unlike tracker)
#TODO calculate fps (also for other loops to see asynchonity)
# fpsfilter=fpsfilter*.9+(1/dt)*.1 #trust value in order to stabilize fps display
if training_csv:
training_csv.writerows([{
'frame_id': round(frame.index * 10., 1), # not really time
'track_id': t.track_id,
'l': t.history[-1].l,
't': t.history[-1].t,
'w': t.history[-1].w,
'h': t.history[-1].h,
'x': t.get_projected_history(frame.H)[-1][0],
'y': t.get_projected_history(frame.H)[-1][1],
'state': t.history[-1].state.value
# only keep _actual_detections, no lost entries
} for t in active_tracks.values()
# if t.history[-1].state != DetectionState.Lost
])
training_frames += len(active_tracks)
# print(time.time() - start_time)
if training_fp:
training_fp.close()
lines = {
'train': int(training_frames * .8),
'val': int(training_frames * .12),
'test': int(training_frames * .08),
}
logger.info(f"Splitting gathered data from {training_fp.name}")
with open(training_fp.name, 'r') as source_fp:
for name, line_nrs in lines.items():
dir_path = self.config.save_for_training / name
dir_path.mkdir(exist_ok=True)
file = dir_path / 'tracked.txt'
logger.debug(f"- Write {line_nrs} lines to {file}")
with file.open('w') as target_fp:
for i in range(line_nrs):
target_fp.write(source_fp.readline())
detections: List[Detection] = self.track_frame(frame)
# Store detections into tracklets
projected_coordinates = []
# now in track_frame()
# for detection in detections:
# track = self.tracks[detection.track_id]
# track.track_id = detection.track_id # for new tracks
# track.history.append(detection) # add to history
# projected_coordinates.append(track.get_projected_history(self.H)) # then get full history
# TODO: hadle occlusions, and dissappearance
# if len(track.history) > 30: # retain 90 tracks for 90 frames
# track.history.pop(0)
# trajectories = {}
# for detection in detections:
# tid = str(detection.track_id)
# track = self.tracks[detection.track_id]
# coords = track.get_projected_history(self.H) # get full history
# trajectories[tid] = {
# "id": tid,
# "det_conf": detection.conf,
# "bbox": detection.to_ltwh(),
# "history": [{"x":c[0], "y":c[1]} for c in coords[0]] if not self.config.bypass_prediction else coords[0].tolist() # already doubles nested, fine for test
# }
active_track_ids = [d.track_id for d in detections]
active_tracks = {t.track_id: t.get_with_interpolated_history() for t in self.tracks.values() if t.track_id in active_track_ids}
active_tracks = displacement_filter.apply_to_dict(active_tracks, frame.camera)# a filter to remove just detecting static objects
# active_tracks = {t.track_id: t for t in self.tracks.values() if t.track_id in active_track_ids}
# active_tracks = {t.track_id: t for t in self.tracks.values() if t.track_id in active_track_ids}
# logger.info(f"{trajectories}")
frame.tracks = active_tracks
# if self.config.bypass_prediction:
# self.trajectory_socket.send_string(json.dumps(trajectories))
# else:
# self.trajectory_socket.send(pickle.dumps(frame))
if self.config.smooth_tracks:
frame = self.smoother.smooth_frame_tracks(frame)
# print(f"send to {self.trajectory_socket}, {self.config.zmq_trajectory_addr}")
self.trajectory_socket.send_pyobj(frame.without_img()) # ditch image for faster passthrough
end_time = time.time()
tracker_dt = end_time - start_time
# having {end_time-frame.time} creates incidental delay... don't know why, maybe because of send?. So add n/a for now
# or is it {len(active_tracks)} or {tracker_dt}
# logger.debug(f"Trajectories: n/a. Current frame delay = n/a s (trajectories:s)")
# self.trajectory_socket.send_string(json.dumps(trajectories))
# provide a {ID: {id: ID, history: [[x,y],[x,y],...]}}
# TODO: provide a track object that actually keeps history (unlike tracker)
# TODO calculate fps (also for other loops to see asynchonity)
# fpsfilter=fpsfilter*.9+(1/dt)*.1 #trust value in order to stabilize fps display
writer.add(frame, active_tracks.values())
w_time = time.time()
logger.info('Stopping')
def _yolov8_track(self, frame: Frame,) -> [Detection]:
results: [YOLOResult] = self.model.track(frame.img, persist=True, tracker="bytetrack.yaml", verbose=False)
if results[0].boxes is None or results[0].boxes.id is None:
# work around https://github.com/ultralytics/ultralytics/issues/5968
return []
return [Detection(track_id, bbox[0]-.5*bbox[2], bbox[1]-.5*bbox[3], bbox[2], bbox[3], 1, DetectionState.Confirmed, frame.index) for bbox, track_id in zip(results[0].boxes.xywh.cpu(), results[0].boxes.id.int().cpu().tolist())]
def _resnet_track(self, img, scale: float = 1) -> [Detection]:
def _resnet_track(self, frame: Frame, scale: float = 1) -> List[Detection]:
img = frame.img
if scale != 1:
dsize = (int(img.shape[1] * scale), int(img.shape[0] * scale))
img = cv2.resize(img, dsize)
detections = self._resnet_detect_persons(img)
tracks: [DeepsortTrack] = self.mot_tracker.update_tracks(detections, frame=img)
return [Detection.from_deepsort(t).get_scaled(1/scale) for t in tracks]
tracks: List[Detection] = self.mot_tracker.track_detections(detections, img, frame.index)
# active_tracks = [t for t in tracks if t.is_confirmed()]
return [d.get_scaled(1/scale) for d in tracks]
def _resnet_detect_persons(self, frame) -> [Detection]:
def _resnet_detect_persons(self, frame) -> List[Detection]:
t = torch.from_numpy(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
# change axes of image loaded image to be compatilbe with torch.io.read_image (which has C,W,H format instead of W,H,C)
t = t.permute(2, 0, 1)
@ -293,8 +643,19 @@ class Tracker:
mask = prediction['labels'] == 1 # if we want more than one label: np.isin(prediction['labels'], [1,86])
scores = prediction['scores'][mask]
# print(scores, prediction['labels'])
labels = prediction['labels'][mask]
boxes = prediction['boxes'][mask]
# print(prediction['scores'])
if NON_MAXIMUM_SUPRESSION < 1:
nms_mask = torch.zeros(scores.shape[0]).bool()
nms_keep_ids = torchvision.ops.nms(boxes, scores, NON_MAXIMUM_SUPRESSION)
nms_mask[nms_keep_ids] = True
print(scores.shape[0], nms_keep_ids, nms_mask)
scores = scores[nms_mask]
labels = labels[nms_mask]
boxes = boxes[nms_mask]
# TODO: introduce confidence and NMS supression: https://github.com/cfotache/pytorch_objectdetecttrack/blob/master/PyTorch_Object_Tracking.ipynb
# (which I _think_ we better do after filtering)
@ -302,7 +663,7 @@ class Tracker:
# dets - a numpy array of detections in the format [[x1,y1,x2,y2,score, label],[x1,y1,x2,y2,score, label],...]
detections = np.array([np.append(bbox, [score, label]) for bbox, score, label in zip(boxes.cpu(), scores.cpu(), labels.cpu())])
detections = self.detect_persons_deepsort_wrapper(detections)
return detections
@ -315,35 +676,49 @@ class Tracker:
return [([d[0], d[1], d[2]-d[0], d[3]-d[1]], d[4], d[5]) for d in detections]
def run_tracker(config: Namespace, is_running: Event):
router = Tracker(config, is_running)
router.track()
def run_tracker(config: Namespace, is_running: Event, timer_counter):
router = Tracker(config)
router.track(is_running, timer_counter)
class Smoother:
def __init__(self, window_len=2):
self.smoother = ConvolutionSmoother(window_len=window_len, window_type='ones', copy=None)
def __init__(self, window_len=6, convolution=False):
# for some reason this smoother messes the predictions. Probably skews the points too much??
if convolution:
self.smoother = ConvolutionSmoother(window_len=window_len, window_type='hanning', copy=None)
else:
# "Unlike Kalman filtering, which focuses on predicting and updating the current state using historical measurements, Kalman smoothing enhances the accuracy of past state values"
# see https://medium.com/@shahalkp1/kalman-smoothing-using-tsmoothie-0175260464e5
self.smoother = KalmanSmoother(component='level_trend', component_noise={'level':0.03, 'season': .02, 'trend':0.04},n_seasons = 2, copy=None)
def smooth(self, points: List[float]):
self.smoother.smooth(points)
return self.smoother.smooth_data[0]
def smooth_track(self, track: Track) -> Track:
ls = [d.l for d in track.history]
ts = [d.t for d in track.history]
ws = [d.w for d in track.history]
hs = [d.h for d in track.history]
self.smoother.smooth(ls)
ls = self.smoother.smooth_data[0]
self.smoother.smooth(ts)
ts = self.smoother.smooth_data[0]
self.smoother.smooth(ws)
ws = self.smoother.smooth_data[0]
self.smoother.smooth(hs)
hs = self.smoother.smooth_data[0]
new_history = [Detection(d.track_id, l, t, w, h, d.conf, d.state, d.frame_nr, d.det_class) for l, t, w, h, d in zip(ls,ts,ws,hs, track.history)]
return Track(track.track_id, new_history, track.predictor_history, track.predictions, track.fps)
def smooth_frame_tracks(self, frame: Frame) -> Frame:
new_tracks = []
for track in frame.tracks.values():
ls = [d.l for d in track.history]
ts = [d.t for d in track.history]
ws = [d.w for d in track.history]
hs = [d.h for d in track.history]
self.smoother.smooth(ls)
ls = self.smoother.smooth_data[0]
self.smoother.smooth(ts)
ts = self.smoother.smooth_data[0]
self.smoother.smooth(ws)
ws = self.smoother.smooth_data[0]
self.smoother.smooth(hs)
hs = self.smoother.smooth_data[0]
new_history = [Detection(d.track_id, l, t, w, h, d.conf, d.state, d.frame_nr) for l, t, w, h, d in zip(ls,ts,ws,hs, track.history)]
new_track = Track(track.track_id, new_history, track.predictor_history, track.predictions)
new_track = self.smooth_track(track)
new_tracks.append(new_track)
frame.tracks = {t.track_id: t for t in new_tracks}
return frame

71
trap/utils.py Normal file
View file

@ -0,0 +1,71 @@
# lerp & inverse lerp from https://gist.github.com/laundmo/b224b1f4c8ef6ca5fe47e132c8deab56
import linecache
import os
import tracemalloc
from typing import Iterable
import cv2
import numpy as np
def lerp(a: float, b: float, t: float) -> float:
"""Linear interpolate on the scale given by a to b, using t as the point on that scale.
Examples
--------
50 == lerp(0, 100, 0.5)
4.2 == lerp(1, 5, 0.8)
"""
return (1 - t) * a + t * b
def inv_lerp(a: float, b: float, v: float) -> float:
"""Inverse Linar Interpolation, get the fraction between a and b on which v resides.
Examples
--------
0.5 == inv_lerp(0, 100, 50)
0.8 == inv_lerp(1, 5, 4.2)
"""
return (v - a) / (b - a)
def get_bins(bin_size: float):
return [[bin_size, 0], [bin_size, bin_size], [0, bin_size], [-bin_size, bin_size], [-bin_size, 0], [-bin_size, -bin_size], [0, -bin_size], [bin_size, -bin_size]]
def convert_world_space_to_img_space(H: cv2.Mat):
"""Transform the given matrix so that it immediately converts
the points to img space"""
new_H = H.copy()
new_H[:2] = H[:2] * 100
return new_H
def convert_world_points_to_img_points(points: Iterable):
"""Transform the given matrix so that it immediately converts
the points to img space"""
if isinstance(points, np.ndarray):
return np.array(points) * 100
return [[p[0]*100, p[1]*100] for p in points]
def display_top(snapshot: tracemalloc.Snapshot, key_type='lineno', limit=5):
snapshot = snapshot.filter_traces((
tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
tracemalloc.Filter(False, "<unknown>"),
))
top_stats = snapshot.statistics(key_type)
print("Top %s lines" % limit)
for index, stat in enumerate(top_stats[:limit], 1):
frame = stat.traceback[0]
# replace "/path/to/module/file.py" with "module/file.py"
filename = os.sep.join(frame.filename.split(os.sep)[-2:])
print("#%s: %s:%s: %.1f KiB"
% (index, filename, frame.lineno, stat.size / 1024))
line = linecache.getline(frame.filename, frame.lineno).strip()
if line:
print(' %s' % line)
other = top_stats[limit:]
if other:
size = sum(stat.size for stat in other)
print("%s other: %.1f KiB" % (len(other), size / 1024))
total = sum(stat.size for stat in top_stats)
print("Total allocated size: %.1f KiB" % (total / 1024))