Towards more robust vulnerable road user detection and tracking for smart infrastructure

Writer: Ljubomir Jovanov (imec)

Vulnerable road users (VRUs) still account for the largest share of road traffic deaths, according to a study by the World Health Organization. Numerous initiatives have been launched to address this problem. One of the most important road safety goals set by the European Union is reaching zero deaths on EU roads by 2050.

Road traffic safety can be improved in multiple ways:

  1. in-cabin driver state monitoring systems, which actively monitor the physiological state of the driver and issue warnings if the driver is unfit to drive
  2. Advanced Driver Assistance Systems (ADAS), which predict potentially unsafe situations and warn the driver, or intervene when there is no time for the driver to react (e.g. emergency braking when a potential collision is predicted)
  3. high-level self-driving functions in cars, which perform the driving instead of the driver
  4. smart infrastructure, e.g. smart intersections, junctions and roads, which analyzes the situation based on input from the different sensors installed on it and takes action to prevent dangerous situations
  5. communication between vehicles, and between vehicles and infrastructure (V2X), to share information that is not otherwise available to the receiving side


Most of these aspects were investigated in NextPerception. Imec focused on VRU detection for smart infrastructure, together with Flir and Macq. Together with all UC3 partners, imec also worked on the definition of the V2X protocol for exchanging information about road users that would otherwise not be visible. The main results of this research were demonstrated at the project final event in Eindhoven.

Sensor fusion for VRU detection and tracking

Object detection based solely on visible-light sensors, such as RGB cameras, does not provide reliable and robust VRU detection across varying visibility conditions. To improve the robustness of the detection, it is necessary to use additional sensors that perform better in challenging conditions.

In NextPerception, imec has developed a new sensing system that captures data from the whole intersection and covers all road users, using different sensors. To achieve full coverage of the intersection, imec has deployed multiple sensor boxes, each monitoring a certain area. Each of these sensor boxes contains radar, RGB and thermal cameras, with the possibility of adding new sensors to the system.

Figure 1: (left) imec multi-sensor boxes for VRU detection, (right) the output of the sensor fusion and VRU detection.

Moreover, our sensor boxes are capable of communicating with each other, so that each box can receive information about areas outside its own field of view.

Image analysis is performed locally at the intersection, relying on the data from radar, thermal and RGB cameras. The results of this analysis provide fine-grained detection and identification of types of road users, such as cyclists and pedestrians, which can be further used for traffic control and prioritizing various traffic modalities appropriately.
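To make the idea of combining detections from several sensors concrete, here is a minimal sketch of one common approach, late fusion of per-sensor detections. It is not imec's actual pipeline, which the article does not describe in detail. The sketch assumes that RGB and thermal detections have already been mapped into a shared image frame, leaves radar out for brevity, and uses hypothetical names (`Detection`, `iou`, `fuse`) and an illustrative overlap threshold.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in a shared frame


@dataclass
class Detection:
    box: Box
    score: float  # detector confidence in [0, 1]
    label: str    # e.g. "pedestrian" or "cyclist"


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse(rgb: List[Detection], thermal: List[Detection],
         iou_threshold: float = 0.5) -> List[Detection]:
    """Merge two detection lists (hypothetical late-fusion rule).

    Overlapping detections with the same label are merged and their
    confidences combined with a noisy-OR; unmatched detections from
    either sensor are kept unchanged.
    """
    fused: List[Detection] = []
    unmatched = list(thermal)
    for r in rgb:
        best, best_iou = None, iou_threshold
        for t in unmatched:
            overlap = iou(r.box, t.box)
            if t.label == r.label and overlap >= best_iou:
                best, best_iou = t, overlap
        if best is None:
            fused.append(r)
            continue
        unmatched.remove(best)
        # Keep the box of the more confident sensor, boost the score.
        keep = r if r.score >= best.score else best
        score = 1.0 - (1.0 - r.score) * (1.0 - best.score)
        fused.append(Detection(keep.box, score, r.label))
    fused.extend(unmatched)
    return fused
```

With this rule, a pedestrian seen weakly by both cameras ends up with a higher fused confidence than either sensor gives alone, while a road user visible only in the thermal image (for example at night) is still reported.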

Since we process data locally, there is no need to store or communicate privacy-sensitive data. Our VRU detection system provides much more accurate flow control than traffic counting via smartphones. The main advantages of the proposed system are:

  • Flexibility: it is possible to capture data from any type of RGB or thermal camera, radar or lidar, at different frame rates and resolutions
  • Scalability: seamless extension of the system with new sensor boxes
  • High performance: real-time detection and tracking, supported by communication between sensors
  • Real-time operation: VRU detection based on RGB-thermal-radar fusion at 0 frames per second


The detection performance of the proposed system, measured in terms of average precision (AP), exceeds the current state of the art in VRU detection and tracking.
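For readers unfamiliar with the metric, the sketch below shows one standard way to compute average precision as the area under the interpolated precision-recall curve. The article does not state which AP variant or matching rule was used in the evaluation, so the function name `average_precision`, its inputs and the all-point interpolation are assumptions made for illustration. Each detection is assumed to be already marked as a true or false positive, for example by overlap with a ground-truth box.

```python
from typing import List, Tuple


def average_precision(scored: List[Tuple[float, bool]],
                      num_ground_truth: int) -> float:
    """Area under the interpolated precision-recall curve.

    scored: (confidence, is_true_positive) for every detection.
    num_ground_truth: number of annotated objects of this class.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")

    # Rank detections from most to least confident.
    ranked = sorted(scored, key=lambda d: d[0], reverse=True)

    tp = fp = 0
    recalls: List[float] = []
    precisions: List[float] = []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))

    # Interpolate: precision at a given recall is the best precision
    # achieved at that recall or any higher one.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangles under the resulting step curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

A detector that ranks every true positive above every false positive and finds all annotated objects scores 1.0; missed objects and highly ranked false alarms both lower the value.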