Labeling Autonomous Vehicle Data in Low Light and Infrared

Labeling data for autonomous vehicles in low-light conditions and with infrared imagery is a crucial step in developing safe autonomous driving systems. Night scenes present unique challenges for sensors: traditional cameras often lose contrast and detail, and objects can remain poorly visible or distorted due to glare and shadows. Infrared sensors partially compensate for these limitations by providing additional information about the thermal contours of objects and people. However, to train autonomous driving models, it is necessary not only to collect such data but also to annotate it correctly, using infrared annotation techniques that account for the peculiarities of the nighttime environment, incomplete visibility, and edge cases that affect the system's reliability in real-world conditions.

Key Takeaways

  • LED projection reduces depth RMS error across architectures.
  • Robust benchmarks should include depth, segmentation, and edge-case taxonomies adapted for limited visibility.
  • Sensor fusion and camera improvements are changing the cost and reliability of these systems for real-world use.
  • The ethics and design of urban lighting impact safety and environmental outcomes.

Low light, glare, and sensor performance

Night changes how both humans and technical systems perceive the environment. Low light reduces the amount of available visual information, glare from artificial light sources distorts the contours of objects, and shadows hide details that seem obvious during the day. For sensors and computer vision systems, this means reduced accuracy, increased noise, and the need for different approaches to data processing. Night conditions expose the limits of individual sensors most clearly, and at the same time show why their correct combination and adaptation matters.

| Sensor | Strength | Typical failure modes | Mitigation |
|---|---|---|---|
| RGB cameras | Cost-effective, high-res | Glare, noise, contrast loss | HDR, denoising, domain-specific data |
| LiDAR | Accurate depth | Scattering in fog/rain | Filtering, sensor fusion |
| Thermal | Works in full dark | Low texture for detection | Fusion with RGB, specialized models |

Low-light annotation

Low-light data annotation is a challenging part of preparing datasets for computer vision and sensor systems, especially for low-light detection tasks where objects are barely visible or obscured by shadows and glare. In low light, it becomes harder for humans to interpret a scene unambiguously; therefore, the annotation process requires clear rules, additional context, and thorough checks. The main task of annotation is not only to label objects, but also to capture the degree of uncertainty with which these objects are perceived in the dark.

Label types in low-light scenarios usually go beyond standard bounding boxes or segmentation masks. In addition to object localization, it is essential to indicate their visibility, partial occlusion, level of blur or illumination, and light sources affecting the scene.

Attribute labels are often used to describe the shooting conditions, such as night, dusk, rain, fog, artificial lighting, headlights, or lanterns. This enables models to distinguish between real-world object properties and artifacts caused by lighting conditions.
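
As a concrete illustration, a single frame might carry both per-object labels and scene-level attribute labels. The sketch below is a hypothetical schema, not a standard format; all field names (visibility, occlusion, lighting, and so on) are assumptions chosen for this example.

```python
from dataclasses import dataclass, field

@dataclass
class LowLightLabel:
    """One object annotation in a low-light frame (illustrative schema)."""
    category: str                 # e.g. "pedestrian", "vehicle"
    bbox_xyxy: tuple              # pixel coordinates (x1, y1, x2, y2)
    visibility: float             # 0.0 (invisible) .. 1.0 (fully visible)
    occlusion: str                # "none" | "partial" | "heavy"
    lighting: str                 # "streetlight" | "headlight_glare" | "unlit"
    uncertain: bool = False       # annotator could not classify confidently

@dataclass
class FrameMeta:
    """Scene-level attribute labels describing capture conditions."""
    time_of_day: str              # "night" | "dusk"
    weather: str                  # "clear" | "rain" | "fog"
    light_sources: list = field(default_factory=list)  # e.g. ["headlights"]

# Example record for a pedestrian at the edge of a lit area
label = LowLightLabel("pedestrian", (412, 230, 455, 340),
                      visibility=0.4, occlusion="partial",
                      lighting="streetlight", uncertain=True)
```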

Edge cases are significant because they determine the model's reliability in real-world scenarios. These include objects that are only partially visible, pedestrians in dark clothing, objects on the edge of the lighting zone, and scenes with sharp contrast between bright and dark areas.

In such situations, annotation often requires a compromise between accuracy and consistency. Sometimes an object must be labeled even with minimal signs of its presence, and sometimes the uncertainty must be recorded explicitly or a rigid classification abandoned.

Correct handling of these edge cases is what makes a dataset suitable for training systems that operate reliably at night and in other complex conditions.
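
Continuing the hypothetical schema above, one common way to encode such compromises downstream is to route uncertain or barely visible objects into an "ignore" set rather than forcing a hard positive or negative decision; the policy and threshold below are purely illustrative.

```python
def split_targets(labels, min_visibility=0.2):
    """Separate confident training targets from ignore regions.

    Uncertain or barely visible objects are not treated as positives,
    but their boxes can also be excluded from negative mining and loss
    computation (an illustrative policy, not a fixed rule).
    """
    positives, ignore = [], []
    for lab in labels:
        if lab.uncertain or lab.visibility < min_visibility:
            ignore.append(lab)
        else:
            positives.append(lab)
    return positives, ignore

# Usage: positives, ignore = split_targets(frame_labels)
```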

Nighttime benchmarking

Benchmarking progress requires robust datasets, repeatable metrics, and protocols that reflect real-world road performance. Let's review and compare them:

| Benchmark | Size / Labels | Primary use | Limitations |
|---|---|---|---|
| ExDark | 7,363 images; boxes | Detection under limited lighting | Small scale; limited geometry |
| NightOwls | ~115k images; ~279k pedestrian boxes | Pedestrian safety & detection | Pedestrian-focused; fewer depth labels |
| Nighttime Synthetic Drive | ~49,995 images; depth & seg | Depth estimation and domain transfer | Synthetic gap; realism constraints |

Metrics and cross-domain protocols

Standard computer vision metrics, such as precision, recall, mAP, or IoU, do not fully reflect real-world performance in low-light conditions. Therefore, in such scenarios, extended or conditional metrics are used that analyze the results separately for well-lit and poorly lit areas, for partially visible objects, or for different degrees of noise and contrast.

It is also important to assess the stability of the model, not just its average accuracy. Robustness metrics show how recognition quality changes with the gradual deterioration of lighting, the appearance of glare, or local overexposure. To achieve this, performance curves are plotted as a function of scene brightness, errors are analyzed in edge cases, and false positives and false negatives are counted separately in night scenes. This approach provides a practical understanding of the system's behavior in a real environment.
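
A minimal sketch of such a stratified evaluation is shown below; it assumes that per-frame true-positive, false-negative, and false-positive counts and a mean scene brightness have already been computed upstream, and the brightness bins are arbitrary.

```python
import numpy as np

def stratified_recall(results, bins=(0, 30, 60, 120, 255)):
    """Recall and false positives per scene-brightness bin.

    `results` is a list of per-frame dicts such as
    {"brightness": 22.5, "tp": 3, "fn": 1, "fp": 2},
    where tp/fn/fp are counts after matching detections to ground truth.
    """
    edges = np.asarray(bins, dtype=float)
    report = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        frames = [r for r in results if lo <= r["brightness"] < hi]
        tp = sum(r["tp"] for r in frames)
        fn = sum(r["fn"] for r in frames)
        fp = sum(r["fp"] for r in frames)
        recall = tp / (tp + fn) if (tp + fn) else float("nan")
        report[f"{int(lo)}-{int(hi)}"] = {
            "recall": recall, "false_positives": fp, "frames": len(frames)}
    return report

# Plotting recall against the bin midpoints gives the performance-vs-
# brightness curve described above.
```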

Cross-domain evaluation protocols are designed to assess the model's ability to generalize knowledge across various conditions and data sources. This means training on daytime or well-lit scenes and testing on night scenes, or vice versa.

These protocols enable you to uncover the hidden dependencies of the model on specific shooting conditions and assess its robustness to domain shifts. If performance drops sharply when switching between domains, this signals the need for domain adaptation or dataset expansion.
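
In practice, a cross-domain protocol is mostly an evaluation-loop discipline: train on one domain, test on both, and report the gap. The sketch below uses placeholder callables (`model_factory`, `train_fn`, `eval_fn`) and dataset objects rather than any specific framework API; only the protocol structure is the point.

```python
def cross_domain_report(model_factory, train_fn, eval_fn, day_data, night_data):
    """Train on one domain, evaluate on both, and report the domain gap.

    All callables and datasets are placeholders for whatever framework
    is actually in use.
    """
    data = {"day": day_data, "night": night_data}
    report = {}
    for source, target in [("day", "night"), ("night", "day")]:
        model = model_factory()
        train_fn(model, data[source])
        in_domain = eval_fn(model, data[source])    # e.g. mAP or depth RMSE
        out_domain = eval_fn(model, data[target])
        report[f"{source}->{target}"] = {
            "in_domain": in_domain,
            "out_domain": out_domain,
            "gap": in_domain - out_domain,          # large gap => domain shift
        }
    return report
```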

Infrared, thermal, and RGB fusion

Thermal and infrared imaging detect heat signatures independently of visible detail. They reliably detect pedestrians and animals in complete darkness and are not affected by the glare that degrades camera images.

| Sensor | Strength | Trade-off |
|---|---|---|
| RGB cameras | High detail for classification | Suffers from glare; needs lighting |
| Thermal / IR | Detects heat; reduces false negatives | Lower texture; higher cost per pixel |
| LiDAR | Accurate range | Reflective noise in weather |

We recommend combining RGB, thermal, and LiDAR to balance range, resolution, and reliability.
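
As a minimal illustration of the fusion idea for the two camera modalities, the sketch below concatenates features from separate RGB and thermal encoders before a shared head; the layer sizes are arbitrary placeholders, and LiDAR fusion (often done at the detection or bird's-eye-view level) is omitted for brevity.

```python
import torch
import torch.nn as nn

class RGBThermalFusion(nn.Module):
    """Toy mid-level fusion: separate encoders, concatenated features.

    Channel counts and depths are placeholders; a real system would use
    deeper backbones and also fuse LiDAR.
    """
    def __init__(self, out_classes=4):
        super().__init__()
        self.rgb_enc = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1),
                                     nn.ReLU())
        self.ir_enc = nn.Sequential(nn.Conv2d(1, 16, 3, stride=2, padding=1),
                                    nn.ReLU())
        self.head = nn.Conv2d(32, out_classes, 1)  # per-pixel class logits

    def forward(self, rgb, thermal):
        feats = torch.cat([self.rgb_enc(rgb), self.ir_enc(thermal)], dim=1)
        return self.head(feats)

# rgb: (B, 3, H, W), thermal: (B, 1, H, W), spatially aligned beforehand
logits = RGBThermalFusion()(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64))
```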

Depth estimation at night

Traditional supervised learning methods that rely on precisely labeled LiDAR or stereo data remain an important benchmark for quality. Still, their application at night is limited by the high cost of data acquisition and the difficulty of obtaining reliable annotations.

Self-supervised approaches that utilize the internal structure of the data, rather than manual labeling, are actively being developed. Such methods learn from the photometric consistency between consecutive frames or between stereo pairs, enabling unsupervised depth estimation.
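
The core training signal in such methods is typically a photometric reconstruction loss between a target frame and a neighboring frame warped into its view using the predicted depth and camera motion. The PyTorch-style sketch below assumes the warping has already been done elsewhere and shows only a simplified SSIM-plus-L1 loss term.

```python
import torch
import torch.nn.functional as F

def photometric_loss(target, warped, alpha=0.85):
    """Photometric consistency loss between a frame and its reconstruction.

    `warped` is the neighboring frame re-synthesized into the target view
    from predicted depth and pose (view synthesis assumed upstream).
    """
    l1 = (target - warped).abs().mean(1, keepdim=True)

    # Simplified SSIM over 3x3 neighborhoods
    mu_x = F.avg_pool2d(target, 3, 1, 1)
    mu_y = F.avg_pool2d(warped, 3, 1, 1)
    sigma_x = F.avg_pool2d(target ** 2, 3, 1, 1) - mu_x ** 2
    sigma_y = F.avg_pool2d(warped ** 2, 3, 1, 1) - mu_y ** 2
    sigma_xy = F.avg_pool2d(target * warped, 3, 1, 1) - mu_x * mu_y
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    ssim = ((2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2))
    ssim_loss = ((1 - ssim) / 2).clamp(0, 1).mean(1, keepdim=True)

    return (alpha * ssim_loss + (1 - alpha) * l1).mean()
```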

Domain adaptation is a key trend in depth estimation at night. Instead of training models from scratch on limited nighttime data, daytime datasets are used as a starting point, and models are adapted to the nighttime domain using stylistic transformations, feature alignment, or joint training across multiple domains. This approach enables the preservation of geometric knowledge acquired under favorable conditions and enhances robustness to changes in illumination. Combined with self-supervised signals, domain adaptation forms hybrid solutions that are better suited to stable depth estimation in real-world nighttime scenarios.
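
One lightweight way to implement the feature-alignment idea is a CORAL-style loss that pulls the second-order statistics of day and night feature batches together during training; the sketch below is one possible alignment term, not the specific method used by any particular system.

```python
import torch

def coral_loss(day_feats, night_feats):
    """CORAL-style second-order feature alignment (illustrative).

    Aligns the covariance of day-domain and night-domain feature batches
    so that a depth or detection head trained on day data transfers
    better to night imagery. Inputs are (batch, feature_dim) tensors,
    e.g. pooled backbone activations.
    """
    def covariance(x):
        x = x - x.mean(0, keepdim=True)
        return x.t() @ x / (x.shape[0] - 1)

    d = day_feats.shape[1]
    diff = covariance(day_feats) - covariance(night_feats)
    return (diff ** 2).sum() / (4 * d * d)

# Typically added to the task loss with a small weight, e.g.:
# loss = task_loss + 0.1 * coral_loss(day_feats, night_feats)
```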

| Approach | Strength | Trade-offs |
|---|---|---|
| Self-supervised | Scales with video; no dense labels | Scale ambiguity; sensitive to specularities |
| Supervised (synthetic + real) | Strong RMSE reductions; sharper edges | Synthetic gap; labeling cost |
| Domain adaptation | Feature alignment; better transfer | Complex training; needs paired domains |

Application and impact

The development of methods for perceiving and analyzing scenes in complex lighting conditions has a significant impact on practical applications, where errors can have severe consequences. In driver assistance systems (ADAS), object recognition and depth estimation at night form the basis for timely collision warnings, pedestrian detection, and the correct operation of adaptive lighting. The risk of accidents rises after dark, so even a slight improvement in night vision accuracy can significantly improve traffic safety and reduce the number of accidents.

Intelligent traffic signals and traffic management systems benefit from improved night perception models. Cameras and sensors that can operate stably in low light enable a more accurate assessment of traffic intensity, allowing for the detection of congestion or hazardous situations. At night, such systems help compensate for reduced driver attention and improve the overall predictability of the road infrastructure, making it more "sensitive" to the real situation on the road.

For law enforcement agencies, night vision and data analytics open up new opportunities. Automatic detection of suspicious behavior, traffic violations, or dangerous crowding at night reduces the workload on operators and speeds up decision-making. At the same time, algorithms help reduce false positives, which is particularly important in conditions of limited visibility and complex scenes.

FAQ

What physical factors complicate perception after dark?

After dark, perception is complicated by low light, high contrast between light and dark areas, glare, shadows, and increased visual noise.

What annotation tasks are important for low-visibility scenarios?

Important annotation tasks include accurately labeling objects while taking into account partial visibility, noise, glare, and the degree of uncertainty of their contours.

How do depth estimation approaches handle the distribution shift from day to night?

Depth estimation approaches compensate for the distribution shift from day to night through domain adaptation, style transformations, and supervised or self-supervised training, allowing the model to generalize geometric information regardless of illumination.

What are the practices for collecting and labeling images and videos after dark?

Nighttime image and video capture and labeling practices include the use of infrared and low-light cameras, multi-sensor setups for precise scene capture, and custom annotation protocols that capture object visibility, glare, and light levels.

What applications benefit most from improved perception after dark?

Autonomous driving systems, smart traffic signals, and security and law enforcement platforms benefit the most from these advancements.