Robust Detection for Traffic Monitoring in Poor Visibility Conditions

Authors: Lukáš Maršík and Bořek Reich (CAMEA and BUT)

As we increasingly rely on smart systems in traffic, we need to find ways to make these systems as robust as possible. In traffic environments, it is essential to detect vehicles and vulnerable road users (VRUs) together, above all to protect the VRUs, but also to assist drivers or to determine the number of traffic participants in potentially dangerous situations.

The robustness of detection systems can be achieved in multiple ways. We may improve detection systems that use only visual data by training them on larger datasets, improving their architecture, and so on. We can choose sensors better suited to collecting environmental data for a given use case. We can also use multiple sensors that aid each other in different conditions and provide complementary data. If we use more than one sensor, we need to combine the received data, i.e., perform sensor data fusion.

In our work, we used sensor fusion of camera and FMCW radar data. This combination should yield a more robust detection system thanks to the radar's ability to see in poor visibility conditions and the rich visual features provided by the camera. We used a proven one-stage detection architecture, which we modified to accept multi-channel input consisting of camera image data and radar point-cloud data. To verify the system's ability to detect vehicles in poor visibility, we simulated darkness, lens glare, noise, and similar degradations.
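To illustrate what such simulated degradations can look like, the following sketch applies darkness, lens glare, and noise to a camera frame. The operations and parameter values are illustrative assumptions, not the exact augmentations used in our experiments.

```python
# Illustrative poor-visibility augmentations for the camera channel.
# Parameters (gamma, glare position, noise sigma) are example values only.
import numpy as np
import cv2


def simulate_darkness(img: np.ndarray, gamma: float = 3.0) -> np.ndarray:
    """Darken an 8-bit image with a gamma curve (gamma > 1 darkens)."""
    lut = ((np.arange(256) / 255.0) ** gamma * 255.0).astype(np.uint8)
    return cv2.LUT(img, lut)


def simulate_lens_glare(img: np.ndarray, center=(0.7, 0.3),
                        radius_frac: float = 0.4, strength: float = 0.8) -> np.ndarray:
    """Blend a bright radial gradient into the image to mimic lens glare."""
    h, w = img.shape[:2]
    cy, cx = int(center[1] * h), int(center[0] * w)
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.sqrt((yy - cy) ** 2 + (xx - cx) ** 2)
    mask = np.clip(1.0 - dist / (radius_frac * max(h, w)), 0.0, 1.0)
    glare = (mask * 255.0 * strength)[..., None]
    return np.clip(img.astype(np.float32) + glare, 0, 255).astype(np.uint8)


def simulate_noise(img: np.ndarray, sigma: float = 25.0) -> np.ndarray:
    """Add zero-mean Gaussian noise to every channel."""
    noise = np.random.normal(0.0, sigma, img.shape)
    return np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)
```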

The following figure presents examples of results from our fusion detection method. The first row shows unmodified footage, and the second row shows examples with simulated poor visibility for the camera. Green points represent the calibrated point cloud from the FMCW radar.

As the detection method, we used the latest version of the YOLO one-stage detector, which we modified so that its input consists of four channels: three camera channels (RGB) and a grayscale image representing the point cloud. The network was trained on a dataset of 2,800 images (with both radar and camera data). The following diagram shows the individual steps of the fusion detection method. Radar point clouds are transformed into the camera coordinate system and rendered as grayscale images. The images from both sensors are combined, and the detection is performed with our modified YOLOv8.
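A minimal sketch of the fusion input preparation is shown below: radar points are projected into the image plane with a pinhole model and rasterized into a grayscale channel that is stacked with the RGB image. The calibration matrices, intensity encoding, and helper names are assumptions made for illustration; they are not taken from our actual implementation.

```python
# Sketch: project a radar point cloud into the camera image and build the
# 4-channel detector input (RGB + rasterized radar channel).
import numpy as np
import cv2


def radar_to_grayscale(points_radar: np.ndarray,    # (N, 3) XYZ in radar frame
                       T_radar_to_cam: np.ndarray,  # (4, 4) extrinsic calibration
                       K: np.ndarray,               # (3, 3) camera intrinsics
                       image_shape: tuple,          # (H, W)
                       point_radius: int = 3) -> np.ndarray:
    """Rasterize a radar point cloud into a grayscale image in camera coordinates."""
    h, w = image_shape
    gray = np.zeros((h, w), dtype=np.uint8)

    # Transform points from the radar frame into the camera frame.
    pts_h = np.hstack([points_radar, np.ones((len(points_radar), 1))])
    pts_cam = (T_radar_to_cam @ pts_h.T).T[:, :3]
    pts_cam = pts_cam[pts_cam[:, 2] > 0]  # keep only points in front of the camera

    # Project with the pinhole model and draw each point.
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    for (u, v), z in zip(uv, pts_cam[:, 2]):
        if 0 <= u < w and 0 <= v < h:
            # Encode distance as intensity (closer = brighter); the scale is illustrative.
            intensity = int(np.clip(255.0 - 2.0 * z, 0, 255))
            cv2.circle(gray, (int(u), int(v)), point_radius, intensity, -1)
    return gray


def make_fusion_input(rgb: np.ndarray, radar_gray: np.ndarray) -> np.ndarray:
    """Stack the RGB image and the radar channel into a (H, W, 4) detector input."""
    return np.dstack([rgb, radar_gray])
```

The resulting four-channel array then replaces the usual three-channel image at the detector's input; the only change required on the network side is that its first convolution accepts four input channels instead of three.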

We also demonstrated VRU detection during the demo days in Eindhoven. Due to the size of our equipment, we brought only the CAMEA-designed FMCW radar. We showed a pedestrian detection demo that relies solely on radar data and ensures the safety of people on a crosswalk.