Cops Detection and Tracking

2020

This project started in 2020.

I built this project for academic and experimental purposes. My goal was to detect and classify a worker by uniform and other available data. I chose law enforcement personnel because data on them was convenient to gather.
The construction of this project included:

Utilizing different object detection models trained on the COCO dataset to gather new data and classify it

Building and using a simple C# program for fast labelling

Building a large labelled dataset of more than 3500 images in less than 3 hours

Designing, modifying and training various CNNs for classification on this dataset

Integration of tracking and filtering algorithms to the detection and classification processes

Improving the end classification result of a person by using previous data gathered on that person

I tested the program on different online videos and on a remote camera feed that was set up on a Raspberry Pi. I got very good results on the limited data I managed to test on, but there is much room for improvement.
Ideas for future improvements:

Apply a combination of RNNs and unsupervised learning methods to the statistical data for more accurate classification

Use GANs to extend the dataset and improve the CNN classification model

Add a version in TF Lite for the Raspberry Pi

Use multiple feeds and combine the data

Cops Detection and Tracking - Pipeline Overview

This project inspired the TrackEverything package and now has an example using it in this repository. You can find an old part of this project in one of my repositories here. That repository contains only the small implementation part for webcams, not the whole project.

The Pipeline

The pipeline receives a series of images (frames) and outputs a list of tracker objects, each holding a detected person and the probability of that person being a cop.

Breaking It Down into 5 Steps

1st Step - Get All Detections in Current Frame

First, we take the frame and pass it through an object detection model. I use the base of this Object Detection repository (I modified the TF1 version, since the TF2 version only came out 10 days ago). The model is trained on the COCO dataset and detects around 90 different object classes. I tried models with different architectures, such as Faster R-CNN InceptionV2 and MobileNetV2. I use the model to get all the persons detected in a frame, then filter out redundant overlapping detections using the Non-maximum Suppression (NMS) method.
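The NMS filtering mentioned above can be sketched in a few lines of NumPy. This is a minimal greedy version written for illustration, not the code used in the project; the function name `non_max_suppression`, the `[x1, y1, x2, y2]` box format and the 0.5 threshold are all assumptions.

```python
import numpy as np


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS. Returns the indices of the boxes that are kept.

    boxes:  (N, 4) array-like of [x1, y1, x2, y2] (assumed format)
    scores: (N,) array-like of detection scores
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    if boxes.size == 0:
        return []

    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # highest score first

    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Overlap of the best box with every remaining box
        iw = np.clip(np.minimum(x2[best], x2[rest]) - np.maximum(x1[best], x1[rest]), 0, None)
        ih = np.clip(np.minimum(y2[best], y2[rest]) - np.maximum(y1[best], y1[rest]), 0, None)
        inter = iw * ih
        iou = inter / (areas[best] + areas[rest] - inter)
        # Drop boxes that overlap the best box too much
        order = rest[iou <= iou_threshold]
    return keep


# Two heavily overlapping boxes and one separate box
boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))  # [0, 2]
```

The second box is dropped because its IOU with the higher-scoring first box is about 0.68, above the assumed threshold.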

2nd Step - Get Classification Probabilities for the Detected Persons

After we have the persons from step 1, we put them through a classification model to determine the probability of them being a cop. I used the Xception CNN architecture with some added layers to train this model. I chose this architecture for its low parameter count, since my GPU’s capacity is limited.

Then we create our Detections object list, which contains the position boxes and the classification data.
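As a rough sketch of this step, the list could be built by cropping each person out of the frame and handing the crop to the classifier. Everything here is hypothetical: the `Detection` dataclass, the `build_detections` helper, the `[x1, y1, x2, y2]` box format and the `classify` callable (which stands in for the trained CNN) are illustrative names, not the project's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

import numpy as np


@dataclass
class Detection:
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels, assumed format
    class_probs: List[float]        # e.g. [p_cop, p_not_cop]


def build_detections(
    frame: np.ndarray,
    boxes: Sequence[Sequence[int]],
    classify: Callable[[np.ndarray], Sequence[float]],
) -> List[Detection]:
    """Crop each detected person and attach the classifier's probabilities."""
    detections = []
    for x1, y1, x2, y2 in boxes:
        crop = frame[y1:y2, x1:x2]  # rows are y, columns are x
        probs = [float(p) for p in classify(crop)]
        detections.append(Detection(box=(x1, y1, x2, y2), class_probs=probs))
    return detections


# Usage with a stand-in classifier that ignores the image content
frame = np.zeros((100, 200, 3), dtype=np.uint8)
dets = build_detections(frame, [(10, 20, 50, 90)], lambda crop: [0.8, 0.2])
print(dets[0].box, dets[0].class_probs)  # (10, 20, 50, 90) [0.8, 0.2]
```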

3rd Step - Update the Trackers Object List

We keep a list of Tracker objects. Each is an instance of a class that wraps an OpenCV CSRT tracker (a Discriminative Correlation Filter tracker with Channel and Spatial Reliability).

Overview of the CSR-DCF approach. An automatically estimated spatial reliability map restricts the correlation filter to the parts suitable for tracking (top) improving localization within a larger search region and performance for irregularly shaped objects. Channel reliability weights calculated in the constrained optimization step of the correlation filter learning reduce the noise of the weight-averaged filter response (bottom).

My tracker class also contains a unique ID, previous statistics about this ID, and indicators of the tracker’s accuracy. In the first frame the Trackers list is empty; it is filled in step 4 with new trackers matching the detected objects. If the Trackers list is not empty, in this step we update the trackers’ positions using the current frame and dispose of failed trackers.

4th Step - Matching Detections with Trackers
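The update-and-dispose logic might look roughly like the sketch below. The class name `PersonTracker`, the `max_failures` limit and the `update_trackers` helper are assumptions made for illustration. The wrapped tracker is passed in from outside; it only needs an `update(frame)` method returning `(ok, box)`, which is the interface OpenCV trackers expose (for example one created with `cv2.TrackerCSRT_create()` and initialised with `init(frame, box)`, in OpenCV builds that include CSRT).

```python
import itertools


class PersonTracker:
    """Wraps a visual tracker with an ID and a simple failure counter."""

    _ids = itertools.count()  # source of unique IDs

    def __init__(self, cv_tracker, box, max_failures=5):
        self.tracker_id = next(PersonTracker._ids)
        self.cv_tracker = cv_tracker      # object with update(frame) -> (ok, box)
        self.box = tuple(box)             # last known position
        self.failures = 0                 # consecutive failed updates
        self.max_failures = max_failures  # assumed disposal limit

    def update(self, frame):
        ok, box = self.cv_tracker.update(frame)
        if ok:
            self.box = tuple(box)
            self.failures = 0
        else:
            self.failures += 1
        return ok

    @property
    def is_dead(self):
        return self.failures > self.max_failures


def update_trackers(trackers, frame):
    """Update every tracker on the current frame and drop the failed ones."""
    for tracker in trackers:
        tracker.update(frame)
    return [t for t in trackers if not t.is_dead]
```

Counting consecutive failures, rather than dropping a tracker on its first miss, is one simple way to tolerate short occlusions; the project's actual accuracy indicators may differ.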

We use the intersection over union (IOU) of a tracker bounding box and a detection bounding box as the matching metric. We solve the linear sum assignment problem (also known as minimum weight matching in bipartite graphs) for the IOU matrix using the Hungarian algorithm (also known as the Munkres algorithm). The scientific computing package SciPy has a built-in utility function, linear_sum_assignment in scipy.optimize, for solving this assignment problem.

matched_idx = linear_sum_assignment(-IOU_mat)

The linear_sum_assignment function minimizes the total cost by default, so we negate the IOU matrix to maximize the total IOU instead.
The result will look like this:
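A small worked example with made-up boxes sketches how the IOU matrix and the assignment fit together. The helper names `iou` and `match_trackers_to_detections`, the `[x1, y1, x2, y2]` box format and the 0.3 minimum IOU are illustrative assumptions, not the project's actual code.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_trackers_to_detections(tracker_boxes, detection_boxes, min_iou=0.3):
    """Return (matches, unmatched_trackers, unmatched_detections) as index lists."""
    iou_mat = np.zeros((len(tracker_boxes), len(detection_boxes)))
    for t, t_box in enumerate(tracker_boxes):
        for d, d_box in enumerate(detection_boxes):
            iou_mat[t, d] = iou(t_box, d_box)

    # Negate because linear_sum_assignment minimizes the total cost
    rows, cols = linear_sum_assignment(-iou_mat)

    # Discard assigned pairs whose overlap is too small to be a real match
    matches = [(int(t), int(d)) for t, d in zip(rows, cols) if iou_mat[t, d] >= min_iou]
    matched_t = {t for t, _ in matches}
    matched_d = {d for _, d in matches}
    unmatched_trackers = [t for t in range(len(tracker_boxes)) if t not in matched_t]
    unmatched_detections = [d for d in range(len(detection_boxes)) if d not in matched_d]
    return matches, unmatched_trackers, unmatched_detections


trackers = [[0, 0, 10, 10], [20, 20, 30, 30]]
detections = [[21, 21, 31, 31], [1, 1, 11, 11], [50, 50, 60, 60]]
matches, lost, new = match_trackers_to_detections(trackers, detections)
print(matches)  # [(0, 1), (1, 0)]
print(lost)     # []
print(new)      # [2]
```

Here tracker 0 pairs with detection 1 and tracker 1 with detection 0, while detection 2 overlaps nothing and is left unmatched. The minimum-IOU check matters because the assignment step alone will pair boxes even when they do not overlap at all.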

For each unmatched detection, we create a new tracker with the detection’s data. For the unmatched trackers, we update the tracker’s accuracy indicators and remove any that are way off. For the matched pairs, we update the tracker position to the more accurate detection box, take the class data, and average it with the tracker’s previous 15 data points.
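The averaging over recent data points can be sketched with a fixed-length deque. This is one reading of the description (the current point plus up to 15 previous ones, equally weighted); the class name `ClassHistory` and its interface are hypothetical.

```python
from collections import deque


class ClassHistory:
    """Keeps the latest class probabilities and returns their running mean."""

    def __init__(self, n_previous=15):
        # Holds the current data point plus up to n_previous older ones;
        # the oldest point is discarded automatically when the deque is full.
        self.points = deque(maxlen=n_previous + 1)

    def add(self, probs):
        """Store a new probability vector and return the per-class average."""
        self.points.append(list(probs))
        n = len(self.points)
        return [sum(column) / n for column in zip(*self.points)]


history = ClassHistory(n_previous=15)
history.add([1.0, 0.0])         # first frame: [p_cop, p_not_cop]
print(history.add([0.0, 1.0]))  # [0.5, 0.5]
```

Averaging over a window smooths out single-frame misclassifications, at the cost of reacting more slowly when the evidence really changes.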

5th Step - Decide What to Do

After step 4, the Trackers list is up to date with all the statistical and current data. The tracker class has a method that returns the current classifications and the confidence of those scores; we then update the detections and iterate through them. A detection with a low confidence score probably came from a tracker without enough data, or from a poor detection; we mark those in orange. A detection with a high enough confidence score is shown in green if it’s not a cop and in red/blue if it is. Unmatched trackers that are not dead are shown in cyan.
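The colour rules above reduce to a small decision function. The sketch below assumes hypothetical thresholds (0.6 for confidence, 0.5 for the cop probability) and a hypothetical function name `box_color`; the source does not state the actual values.

```python
def box_color(is_matched, confidence, p_cop, min_confidence=0.6, cop_threshold=0.5):
    """Pick a display colour for a tracked person.

    is_matched: whether the tracker was matched to a detection this frame
    confidence: confidence in the averaged classification (0..1)
    p_cop:      averaged probability of the person being a cop (0..1)
    The two thresholds are illustrative assumptions.
    """
    if not is_matched:
        return "cyan"      # live tracker with no detection this frame
    if confidence < min_confidence:
        return "orange"    # not enough data, or a poor detection
    if p_cop >= cop_threshold:
        return "red/blue"  # confidently classified as a cop
    return "green"         # confidently classified as not a cop


print(box_color(True, 0.9, 0.8))   # red/blue
print(box_color(True, 0.9, 0.1))   # green
print(box_color(True, 0.3, 0.8))   # orange
print(box_color(False, 0.9, 0.8))  # cyan
```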

Programmed in Python.
Over 2.8K lines of code. This figure may include comment lines and some modified library files.
TensorFlow, OpenCV, NumPy, Pandas, Pillow, Requests, SciPy & MultiProcessing used in Python.

Lang/Lib/Pro	Version
Python	3.8.1
TensorFlow	2.2.0
OpenCV	4.2.0.34
Jupyter	1.0.0
Matplotlib	3.2.1
NumPy	1.18.4
Pandas	1.0.4
Pillow	7.1.2
Requests	2.23.0
SciPy	1.4.1
TF-Slim	1.1.0
MultiProcessing	in Python 3.8