SOTIF scenario annotation: Triggering conditions and functional deficiencies tagging for ISO 21448

Many real-world incidents occur even when all system components are functioning as intended. The problem lies in the system’s limitations in perceiving or responding correctly to complex environmental conditions.

To address these risks, the automotive industry introduced ISO 21448, known as Safety of the Intended Functionality (SOTIF). SOTIF addresses the risks arising from functional insufficiencies and reasonably foreseeable misuse. This has created a growing demand for specialized SOTIF annotation workflows capable of capturing hazardous conditions in driving data.

Modern SOTIF-oriented datasets rely on triggering-condition tagging, functional-insufficiency data, misuse-scenario tagging, and residual-risk classification to support the development and validation of perception and decision-making systems.

Key Takeaways

SOTIF focuses on safety risks caused by system limitations rather than component failures.
SOTIF annotation helps identify safety-relevant operational scenarios.
Triggering condition labeling captures environmental factors that expose system weaknesses.
Functional insufficiency data documents limitations in perception and decision-making capabilities.
Misuse scenario tagging addresses foreseeable human-system interaction risks.
Residual risk classification supports safety validation and compliance activities.
ISO 21448 datasets are essential for ADAS and autonomous vehicle development.

Understanding SOTIF and ISO 21448

ISO 21448 was developed to address safety issues that arise without any component fault. In many cases, ADAS and autonomous systems behave unsafely because of limitations in sensor perception, environmental understanding, coverage of training data, or algorithm performance.

For example:

Poor object detection in unusual lighting conditions.
Incorrect classification of rare road users.
Incorrect interpretation of road markings.
Ambiguous road situations.
Environmental conditions not represented in the training data.

These scenarios may not be considered system failures under conventional functional safety analysis, but they can still lead to dangerous consequences.

The goal of SOTIF is to identify, assess, and mitigate these risks through systematic analysis, testing, and validation.

Core components of SOTIF annotation

Building reliable SOTIF datasets requires several specialized annotation categories designed to capture safety-relevant operational conditions.

Triggering condition tagging

Triggering condition tagging focuses on identifying operational and contextual factors that can reveal weaknesses in perception or decision-making systems. These conditions do not cause system failures, but they increase the likelihood of unsafe behavior when the system encounters challenging situations. Examples include low-sun glare, heavy rain, fog, unusual object viewpoints, road construction zones, pedestrian crossings, complex intersections, and heavy traffic.

By annotating these conditions, engineers can better understand how environmental factors affect system performance and identify edge cases that may be missed during routine testing.
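A triggering-condition tag can be thought of as a small structured record attached to a time interval in a recording. The following is a minimal sketch under stated assumptions: the category names, field names, and identifiers are hypothetical and chosen only for illustration, and they do not represent a taxonomy defined by ISO 21448.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class TriggeringCondition(Enum):
    # Illustrative categories based on the examples in the text; not an official list.
    SUN_GLARE = "sun_glare"
    HEAVY_RAIN = "heavy_rain"
    FOG = "fog"
    CONSTRUCTION_ZONE = "construction_zone"
    COMPLEX_INTERSECTION = "complex_intersection"
    HEAVY_TRAFFIC = "heavy_traffic"


@dataclass
class TriggeringConditionTag:
    clip_id: str      # hypothetical identifier of the recorded clip
    start_s: float    # start of the tagged interval, in seconds
    end_s: float      # end of the tagged interval, in seconds
    conditions: Set[TriggeringCondition] = field(default_factory=set)
    annotator: str = "unknown"

    def __post_init__(self) -> None:
        # Reject empty or reversed intervals early so they never reach the dataset.
        if self.end_s <= self.start_s:
            raise ValueError("end_s must be greater than start_s")

    def duration_s(self) -> float:
        return self.end_s - self.start_s


tag = TriggeringConditionTag(
    clip_id="clip_0001",
    start_s=12.0,
    end_s=19.5,
    conditions={TriggeringCondition.SUN_GLARE, TriggeringCondition.HEAVY_TRAFFIC},
    annotator="annotator_a",
)
print(tag.duration_s())  # 7.5
```

Allowing several conditions per interval reflects the fact that factors such as glare and dense traffic often occur together.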

Functional insufficiency data

Functional insufficiency data captures situations where a system operates as designed but is unable to handle a particular scenario safely. These limitations arise from inadequate perception capabilities, gaps in data coverage, or limitations in model design. Examples include incomplete object recognition, reduced perception range, misclassification, delayed hazard detection, and inadequate path prediction.

Annotating these scenarios helps organizations identify vulnerabilities before deployment, improve model robustness, and reduce safety risks associated with normal system operation in challenging environments.
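One way to make such annotations actionable is to link each functional insufficiency to the triggering condition under which it was observed, then count the pairs. The sketch below assumes a hypothetical flat record format; the field names and label strings are illustrative only.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InsufficiencyRecord:
    scenario_id: str           # hypothetical scenario identifier
    insufficiency: str         # e.g. "misclassification", "delayed_hazard_detection"
    triggering_condition: str  # e.g. "fog", "sun_glare"


def count_pairs(records: List[InsufficiencyRecord]) -> Counter:
    """Count how often each (triggering condition, insufficiency) pair occurs."""
    return Counter((r.triggering_condition, r.insufficiency) for r in records)


records = [
    InsufficiencyRecord("s1", "reduced_perception_range", "fog"),
    InsufficiencyRecord("s2", "reduced_perception_range", "fog"),
    InsufficiencyRecord("s3", "misclassification", "sun_glare"),
]
pair_counts = count_pairs(records)
print(pair_counts.most_common(1))  # [(('fog', 'reduced_perception_range'), 2)]
```

Pairs that occur most often are natural candidates for targeted data collection or model improvement.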

ISO 21448 dataset development

A high-quality ISO 21448 dataset combines common driving scenarios with selected edge cases and complex environmental conditions.

These datasets are designed to support:

Safety verification.
Hazard analysis.
Scenario coverage assessments.
Performance limitation identification.
Regulatory compliance activities.

Dataset development requires collaboration between safety engineers, perception specialists, annotation teams, and validation experts.

To maximize effectiveness, ISO 21448 datasets should include diverse geographic regions, weather conditions, road environments, and road user behavior.

Broad scenario diversity improves the chance of detecting previously unknown safety risks.
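A scenario coverage assessment can start as a simple check for combinations of conditions that have no annotated scenario at all. The sketch below assumes just two hypothetical dimensions (weather and road type) with made-up values; a real operational design domain would have many more.

```python
from itertools import product
from typing import Iterable, List, Tuple

# Hypothetical coverage dimensions; a real ODD description would be far richer.
WEATHER = ["clear", "rain", "fog"]
ROAD_TYPE = ["urban", "highway"]


def missing_combinations(observed: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (weather, road type) combinations with no annotated scenario."""
    seen = set(observed)
    return [combo for combo in product(WEATHER, ROAD_TYPE) if combo not in seen]


def coverage_ratio(observed: List[Tuple[str, str]]) -> float:
    """Fraction of all combinations that appear at least once in the dataset."""
    total = len(WEATHER) * len(ROAD_TYPE)
    return (total - len(missing_combinations(observed))) / total


observed = [
    ("clear", "urban"),
    ("clear", "highway"),
    ("rain", "urban"),
    ("fog", "highway"),
]
print(missing_combinations(observed))  # [('rain', 'highway'), ('fog', 'urban')]
```

A presence check like this says nothing about how many examples each combination has, so it is a lower bound on the coverage work, not a substitute for it.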

Misuse scenario tagging

Misuse scenario tagging focuses on identifying predictable situations in which drivers interact incorrectly with ADAS systems or use them beyond their intended operational limitations. Many real-world safety incidents involve human behavior, making misuse analysis an important part of SOTIF validation.

Examples include inadequate driver supervision, overreliance on automation, ignoring system warnings, activating systems under inappropriate conditions, or operating outside the defined operational design domain. Annotating these behavior patterns helps organizations understand the risks of human-system interaction and develop measures to mitigate them.
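Some misuse patterns, such as inadequate driver supervision, can be pre-labeled from driver monitoring signals before a human reviews them. The following sketch assumes a hypothetical sample format and a 2-second gaze-off-road threshold; both are placeholders for illustration, not values from any standard.

```python
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    t: float            # timestamp in seconds
    adas_active: bool   # whether the assistance function is engaged
    eyes_on_road: bool  # gaze state reported by a driver monitoring system


def inattention_intervals(
    samples: List[Sample], min_duration_s: float = 2.0
) -> List[Tuple[float, float]]:
    """Return (start, end) intervals where the function is active and gaze is off
    the road for at least min_duration_s (an assumed, illustrative threshold)."""
    intervals = []
    start = None
    last_t = None
    for s in samples:
        inattentive = s.adas_active and not s.eyes_on_road
        if inattentive and start is None:
            start = s.t  # an inattentive stretch begins
        elif not inattentive and start is not None:
            if s.t - start >= min_duration_s:
                intervals.append((start, s.t))
            start = None  # the stretch ended
        last_t = s.t
    # Close a stretch that is still open at the end of the recording.
    if start is not None and last_t - start >= min_duration_s:
        intervals.append((start, last_t))
    return intervals


samples = [
    Sample(0.0, True, True),
    Sample(1.0, True, False),
    Sample(2.0, True, False),
    Sample(3.0, True, False),
    Sample(4.0, True, True),
    Sample(5.0, True, False),
    Sample(6.0, True, True),
]
print(inattention_intervals(samples))  # [(1.0, 4.0)]
```

The short glance away starting at 5.0 s is below the threshold and is not flagged. Flagged intervals are candidates for annotation, not final misuse labels.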

Residual risk classification and assessment

Residual risk classification identifies, categorizes, and prioritizes the risks that remain after mitigation measures have been applied, to determine whether they are acceptable within the defined safety objectives.

A residual risk assessment considers the severity, likelihood of occurrence, detectability, operational impact, and effectiveness of mitigation. These classifications support the development of safety cases, risk management processes, and regulatory documentation, helping organizations make informed deployment decisions.
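To illustrate how such factors might be combined, the sketch below scores a scenario from severity, likelihood, and mitigation effectiveness and maps the score to a class. The scales, the scoring rule, and the thresholds are all assumptions made for this example; ISO 21448 does not prescribe this formula, and detectability and operational impact are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class ResidualRisk:
    scenario_id: str
    severity: int                    # 1 (minor) .. 4 (life-threatening); illustrative scale
    likelihood: int                  # 1 (rare) .. 4 (frequent); illustrative scale
    mitigation_effectiveness: float  # 0.0 (none) .. 1.0 (fully mitigated); an estimate

    def score(self) -> float:
        # Hypothetical rule: unmitigated risk scaled by the share mitigation leaves over.
        return self.severity * self.likelihood * (1.0 - self.mitigation_effectiveness)


def classify(risk: ResidualRisk, review_at: float = 2.0, mitigate_at: float = 6.0) -> str:
    """Map a score to a class. Thresholds are placeholders, not normative values."""
    s = risk.score()
    if s >= mitigate_at:
        return "needs_mitigation"
    if s >= review_at:
        return "needs_review"
    return "acceptable"


risks = [
    ResidualRisk("fog_cut_in", severity=4, likelihood=3, mitigation_effectiveness=0.25),
    ResidualRisk("glare_sign_miss", severity=2, likelihood=2, mitigation_effectiveness=0.25),
    ResidualRisk("rare_debris", severity=2, likelihood=1, mitigation_effectiveness=0.75),
]
# Review the highest-scoring scenarios first.
for r in sorted(risks, key=lambda item: item.score(), reverse=True):
    print(r.scenario_id, r.score(), classify(r))
```

In practice the acceptance criteria would come from the project's safety objectives and be reviewed by safety engineers, not hard-coded.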

Data sources for SOTIF datasets

Creating robust SOTIF datasets requires collecting information from multiple sources.

Data source	Information provided	Role in SOTIF analysis
Camera video streams	Visual scene information, road users, traffic signs, road conditions	Supports perception validation and triggering condition identification
LiDAR point clouds	3D environmental geometry and object positioning	Helps evaluate perception accuracy and object detection limitations
Radar detections	Object distance, speed, and motion information	Supports sensor fusion analysis and tracking validation
Vehicle telemetry	Vehicle speed, steering, braking, and control inputs	Provides context for system behavior during safety-critical events
Driver monitoring systems	Driver attention, gaze direction, and engagement levels	Enables analysis of foreseeable misuse and human-system interaction
GPS and localization data	Vehicle position, route information, and map alignment	Supports scenario reconstruction and operational context analysis
Simulation environments	Synthetic safety scenarios and rare edge cases	Expands coverage of hazardous situations that are difficult to collect in real-world driving

Annotation workflows for SOTIF scenarios

SOTIF annotation follows a structured workflow designed to identify, analyze, and evaluate safety-related scenarios.

1. The process begins with scenario identification, where hazardous situations are selected from real driving data, simulation environments, or fleet records.

2. These scenarios are then investigated to identify environmental, operational, or contextual factors that contribute to the hazardous behavior.

3. The next step is triggering condition detection, where annotators identify the circumstances that expose weaknesses in perception, prediction, or decision-making systems. Examples include poor visibility, unusual object appearances, occlusions, complex traffic situations, or difficult weather conditions.

4. Next, a functional insufficiency analysis is performed to determine whether the system is unable to handle the scenario safely despite operating as designed.

5. The workflow also includes a misuse behavior assessment that focuses on foreseeable situations in which drivers or operators may interact with the system incorrectly or use it outside its operational design domain.

6. Once potential hazards and limitations have been identified, a risk classification is performed to assess the severity, likelihood, and potential consequences of each scenario. This step helps determine which risks require additional mitigation measures.

7. Finally, all annotations are validated and reviewed to ensure consistency and compliance with safety requirements.

Human-in-the-loop validation remains a key component throughout the process, as many SOTIF scenarios require expert interpretation and safety engineering knowledge. While automated tools can assist with data filtering, scenario discovery, and pre-labeling, the final validation typically relies on domain experts who can accurately assess the safety implications and functional limitations.
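The split between automated pre-labeling and expert validation can be expressed as a routing rule. The sketch below assumes a hypothetical pre-labeling tool that outputs a confidence value between 0 and 1; the 0.9 threshold and the queue names are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PreLabel:
    scenario_id: str
    label: str
    confidence: float      # from a hypothetical automated pre-labeling tool, 0..1
    safety_critical: bool  # set when the scenario involves a potential hazard


def route(prelabels: List[PreLabel], spot_check_threshold: float = 0.9) -> Dict[str, List[str]]:
    """Send safety-critical or low-confidence pre-labels to full expert review;
    everything else goes to a lighter human spot check."""
    queues: Dict[str, List[str]] = {"expert_review": [], "spot_check": []}
    for p in prelabels:
        if p.safety_critical or p.confidence < spot_check_threshold:
            queues["expert_review"].append(p.scenario_id)
        else:
            queues["spot_check"].append(p.scenario_id)
    return queues


prelabels = [
    PreLabel("s1", "fog", 0.97, safety_critical=False),
    PreLabel("s2", "occluded_pedestrian", 0.95, safety_critical=True),
    PreLabel("s3", "sun_glare", 0.60, safety_critical=False),
]
queues = route(prelabels)
print(queues)  # {'expert_review': ['s2', 's3'], 'spot_check': ['s1']}
```

Both queues still end with a human; the rule only decides how much expert attention each scenario receives.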

Challenges in creating SOTIF datasets

Creating robust SOTIF validation datasets is more challenging than creating traditional perception datasets. The datasets must capture rare, complex, and highly contextual scenarios.

Challenge	Description	Impact
Rare event collection	Safety scenarios occur infrequently in real-world driving	Limits availability of valuable training and evaluation data
Annotation subjectivity	Identifying triggering conditions and functional insufficiencies often requires expert interpretation	Can reduce labeling consistency and reproducibility
Scenario complexity	Multiple environmental and behavioral factors interact simultaneously	Makes scenario analysis and annotation more difficult
Dataset coverage	Operational design domains must be represented comprehensively	Insufficient coverage may leave safety gaps undetected
Evolving system capabilities	New AI capabilities continuously introduce new edge cases and limitations	Requires ongoing dataset updates and maintenance
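Annotation subjectivity can be monitored by measuring agreement between annotators who label the same scenarios. The sketch below computes Cohen's kappa for two annotators from first principles; the label values are made up for illustration.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labeled independently at their own rates.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


annotator_a = ["glare", "glare", "fog", "fog", "rain", "rain"]
annotator_b = ["glare", "fog", "fog", "fog", "rain", "rain"]
kappa = cohen_kappa(annotator_a, annotator_b)
print(round(kappa, 2))  # 0.75
```

Here the annotators agree on 5 of 6 items, chance agreement is 1/3, and kappa is (5/6 - 1/3) / (1 - 1/3) = 0.75. A low value on a given label category is a signal that the annotation rules for that category need tightening.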

Applications of SOTIF datasets

SOTIF datasets are used throughout the automotive development lifecycle.

Applications include:

ADAS validation.
Autonomous vehicle testing.
Safety case development.
Hazard analysis.
Edge case discovery.
Simulation scenario generation.
Regulatory compliance support.

These datasets help teams understand not only when systems fail, but also when they may be insufficient despite functioning correctly.

SOTIF dataset creation practices

Focus on real-world edge cases.

Rare but safety-critical scenarios provide the most value for SOTIF analysis.

Combine real and simulated data.

Simulation can supplement limited real-world data for hazardous or unusual situations.

Establish clear annotation rules.

Harmonized labeling standards improve the quality and reproducibility of datasets.

Account for human factors.

Driver behavior and expected misuse should be considered when developing scenarios.

Continuously update datasets.

New risks and operating conditions emerge as autonomous systems evolve.

FAQ

What is SOTIF annotation?

SOTIF annotation is the process of labeling safety-relevant scenarios, triggering conditions, and functional limitations for ISO 21448 compliance and validation.

What are triggering conditions?

Triggering conditions are environmental or operational factors that expose weaknesses in an otherwise correctly functioning system.

What is functional insufficiency data?

Functional insufficiency data captures situations where system capabilities are inadequate despite the absence of hardware or software failures.

Why is misuse scenario tagging important?

It helps identify foreseeable ways users may interact with systems incorrectly, creating additional safety risks.

What is residual risk classification?

Residual risk classification evaluates and categorizes remaining safety risks after mitigation measures have been applied.
