Multiple object detection and tracking with a stereo camera from a drone's perspective for collision avoidance
1. Project description
This project is part of my internship at the National Institute of Informatics (NII), Japan. Its goal is to detect and track multiple objects from a drone's perspective in order to avoid collisions.
At a time when drones are becoming more popular and affordable, the number of accidents caused by drones is also increasing, and one of the main reasons is the lack of collision avoidance systems. In this project, we target a scenario where a drone is flying with multiple objects in its path; the drone needs to detect and track these objects to avoid collisions. Given the cost and size constraints of a drone, we use a stereo camera mounted on the drone to capture the scene in front of it.
Challenges:
- Objects at long range appear very small in the drone's view, making them difficult to detect and track.
- The onboard camera itself moves aggressively, which complicates frame-to-frame association.
- No dataset from a drone's perspective was available for model training and evaluation.
2. Responsibilities
- Investigated possible stereo camera setups on the drone to meet the collision avoidance requirements at high flight speeds (around 80–100 km/h); a back-of-the-envelope check is sketched after this list.
- Developed a detection-based tracking algorithm to track multiple objects in the air from the drone’s perspective.
- Developed a collision avoidance system based on the tracking algorithm.
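For intuition, the detection range can be related to the time the drone has to react. The following is my own illustrative sketch, not a result from the paper; the closing speed assumes a worst-case head-on encounter:

```python
# Illustrative back-of-the-envelope check (my own sketch, not the paper's
# numbers): how long does the drone have to react once an object enters
# the detectable range?

def time_to_react(detection_range_m: float, closing_speed_kmh: float) -> float:
    """Time between first detection and potential impact."""
    return detection_range_m / (closing_speed_kmh / 3.6)

# Worst case: two drones flying head-on at 100 km/h each (200 km/h closing).
budget = time_to_react(detection_range_m=80.0, closing_speed_kmh=200.0)
print(f"reaction budget: {budget:.2f} s")   # ~1.44 s at an 80 m range
```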
3. Contributions
- Designed a stereo-camera-based collision avoidance system that detects and tracks objects from 2D to 3D to avoid collisions.
- Proposed a simple yet efficient multi-modal variant of YOLOX that effectively leverages depth information to boost detection performance.
- Proposed a lightweight motion compensation strategy based on optical flow (sketched after this list).
- Designed a depth-aware scaling strategy that markedly improves the tracking of distant objects.
- Proposed a new stereo synthetic dataset for MOT from the drone's perspective.
- Preparing a paper for submission to IEEE Transactions on Intelligent Transportation Systems.
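The motion compensation idea can be illustrated with a minimal OpenCV sketch: estimate the dominant (camera-induced) motion between consecutive frames from sparse optical flow, then warp the previous frame's boxes into the current frame before association. The function below is my own illustration; the paper's exact strategy may differ:

```python
import cv2
import numpy as np

def compensate_camera_motion(prev_gray, curr_gray, boxes_prev):
    """Warp previous-frame boxes (N x 4, [x1, y1, x2, y2]) into the current
    frame using a global homography fitted to sparse optical flow.
    Illustrative sketch only; the paper's strategy may differ."""
    # Track sparse corner features between consecutive frames.
    pts_prev = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                       qualityLevel=0.01, minDistance=8)
    pts_curr, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                   pts_prev, None)
    good = status.ravel() == 1
    # RANSAC keeps the dominant (camera-induced) motion and rejects
    # independently moving objects as outliers.
    H, _ = cv2.findHomography(pts_prev[good], pts_curr[good],
                              cv2.RANSAC, 3.0)
    # Apply the homography to the box corners before data association.
    corners = boxes_prev.astype(np.float32).reshape(-1, 1, 2)
    warped = cv2.perspectiveTransform(corners, H)
    return warped.reshape(boxes_prev.shape)
```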
Tools: Python, PyTorch, OpenCV, AirSim
4. Some Results
(I have not included many details here because the paper is still being written.)
- Setup:
- Stereo Baseline = 0.25 m
- Maximum Distance ≈ 80 m
- Resolution = 720 × 1280
- Frame Rate ≥ 30 fps
- Simulated Environment: AirSim
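To see why the baseline choice dominates depth accuracy at long range (cf. the baseline study below), a quick calculation with this setup is helpful. The focal length is not listed above, so a 90° horizontal FOV (AirSim's default) is assumed here purely for illustration:

```python
import math

# Depth from stereo: Z = f * B / d  (f: focal length in pixels, B: baseline,
# d: disparity). The focal length is not listed above, so a 90-degree
# horizontal FOV (AirSim's default) is assumed purely for illustration.
width_px = 1280
f_px = width_px / (2 * math.tan(math.radians(90.0) / 2))   # ~640 px

B = 0.25    # baseline (m), from the setup above
Z = 80.0    # maximum distance (m), from the setup above

d = f_px * B / Z              # disparity at max range: ~2 px
dZ = Z ** 2 / (f_px * B)      # depth error per 1 px of disparity error
print(f"disparity at {Z:.0f} m: {d:.1f} px; depth sensitivity: ~{dZ:.0f} m/px")
```

Since the depth error scales as Z²/(fB), doubling the baseline halves the depth sensitivity at a given range, which is why the baseline study below matters.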
- The proposed 2D-to-3D collision avoidance system.
- Definition of the detection and tracking area, and the spatial distribution of other airborne objects.
- Depth accuracy under different baseline setups.
- A toy demo showing collision avoidance with the proposed method.
In the demo, the ego drone is located at the center of the yellow sphere, which we define as the safe zone. If the predicted trajectory of an object intersects the yellow sphere, the ego drone takes evasive action to avoid a collision.
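The safe-zone check itself reduces to testing whether a predicted trajectory enters a sphere around the ego drone. A minimal sketch, assuming a constant-velocity prediction in the ego drone's coordinate frame (the function and numbers are illustrative, not the demo's actual code):

```python
import numpy as np

def trajectory_enters_safe_zone(p0, v, radius, horizon_s, dt=0.1):
    """Return True if a constant-velocity trajectory enters the safe-zone
    sphere centred on the ego drone (taken as the origin). Illustrative
    sketch; the demo's actual prediction model may differ."""
    for t in np.arange(0.0, horizon_s, dt):
        if np.linalg.norm(p0 + v * t) < radius:
            return True
    return False

# Hypothetical object 30 m ahead, closing head-on at 15 m/s.
if trajectory_enters_safe_zone(p0=np.array([30.0, 0.0, 0.0]),
                               v=np.array([-15.0, 0.0, 0.0]),
                               radius=5.0, horizon_s=3.0):
    print("predicted intersection with safe zone -> take evasive action")
```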