Annotating Fabric Actions Datasets for Textile Robots

The manipulation of textile products is among the most complex challenges in modern robotics because of the fundamental difference between rigid objects and deformable materials. In classic industrial automation, a robot interacts with parts that have a fixed shape and well-defined geometry; fabric, by contrast, changes its state at the slightest physical contact. This property turns even the simple act of grasping an object into a multi-stage computational task, where every movement of the manipulator causes an unpredictable change in the configuration of the entire product.

This is exactly why textile robotics requires the development of specialized datasets that teach artificial intelligence to perceive material as a dynamic physical system with flexible properties. The transition from simple object recognition to understanding the "physics of softness" is a key stage on the path to creating robots capable of assisting in the household or at garment manufacturing facilities.

Quick Take

  • A high-quality dataset combines video, movement trajectories, and metrics from force and touch sensors.
  • The use of tags for fold lines and wrinkle classification is critical for precise manipulations.
  • Simulators allow for the creation of perfectly annotated ground truth data without the risk of damaging real equipment.
  • Sensor alignment guarantees exact correspondence between the visual sequence and the physical metrics of the sensors.
  • A unified labeling standard is mandatory to avoid errors during neural network training.

Data Fundamentals for Textile Robots

To teach a machine to work with soft materials, developers need a specialized fabric manipulation dataset: a large array of digital information that describes in detail every fold, movement, and physical property of the fabric during handling. Such data serves as a kind of textbook that explains to artificial intelligence how textiles behave in the real world and exactly how the robot should act on them to achieve the desired result.

Fabric Data Structure

A high-quality training dataset is not limited only to images; it covers a comprehensive understanding of physical processes. Each file in such a data array helps the system perform garment state estimation, meaning it correctly assesses the current state of the clothing: whether it is unfolded, crumpled, or folded in half.

The main types of data collected for training:

  • Manipulation videos. Recordings of how human hands or mechanical claws interact with textiles under various conditions.
  • Sensor metrics. Data on pressing force and tactile sensations help the robot avoid tearing thin fabric.
  • Movement trajectories. Precise mathematical paths that manipulators must move along to successfully complete a task.
  • Interaction states. A description of how the object's shape changes after every touch or displacement.
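These four streams are typically bundled into a single training sample per manipulation episode. A minimal sketch of one possible record layout (the field names are illustrative, not taken from any published dataset):

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class FabricSample:
    """One annotated manipulation episode (illustrative field names)."""
    video_frames: List[str]                        # paths to extracted RGB frames
    trajectory: List[Tuple[float, float, float]]   # gripper x, y, z per frame
    force_readings: List[float]                    # normal force (N) per frame
    state_labels: List[str] = field(default_factory=list)  # e.g. "flat", "folded"

    def is_consistent(self) -> bool:
        """All per-frame streams must have the same length."""
        n = len(self.video_frames)
        return len(self.trajectory) == n and len(self.force_readings) == n

sample = FabricSample(
    video_frames=["ep0/f0.png", "ep0/f1.png"],
    trajectory=[(0.10, 0.20, 0.05), (0.12, 0.20, 0.05)],
    force_readings=[0.0, 1.3],
    state_labels=["flat", "flat"],
)
```

A consistency check like `is_consistent` is worth running at ingestion time, because mismatched stream lengths are the most common defect in multi-sensor recordings.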

The Role of Labeling in System Training

The process of preparing textile robot training data includes applying special digital tags that suggest to the algorithm exactly what to pay attention to. Without these prompts, the robot would see only a chaotic pile of material, but thanks to annotations, it begins to distinguish the structure of the product and plan its actions. This allows a regular video to be transformed into a clear instruction for performing complex operations.

Labeling Type          | What the System Describes                        | Key Task
Fold line annotation   | The lines along which the fabric must be folded  | Correct folding of clothes or napkins
Wrinkle classification | The type and depth of each fold                  | Smoothing fabric with a manipulator
Stitch path labeling   | The precise path the needle must follow          | Automating sewing in production

Through the combination of these labeling methods, the robot learns to predict the material's reaction to its actions. For example, by analyzing the data, the system understands that pulling one edge of a shirt will cause the entire back to move. Such a deep understanding of physics allows for the creation of autonomous systems that can independently prepare clothes for sale or assist with household chores.
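In practice, these tags are often stored as per-frame records in a structured format such as JSON. A hypothetical example of what one frame's record might look like (the key names are invented for illustration, not a published standard):

```python
import json

# Hypothetical annotation record for a single video frame;
# key names are illustrative, not an established schema.
annotation = {
    "frame": 412,
    "fold_lines": [
        {"start": [120, 340], "end": [480, 335]}        # pixel coordinates
    ],
    "wrinkles": [
        {"polygon": [[200, 100], [230, 110], [210, 140]],
         "depth_class": "shallow"}
    ],
    "stitch_path": [[50, 60], [55, 62], [60, 64]],      # ordered needle waypoints
}

serialized = json.dumps(annotation, sort_keys=True)
restored = json.loads(serialized)
```

Keeping all three label types in one record per frame makes it easy to train models that need to reason about folds, wrinkles, and stitch paths jointly.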

Core Textile Manipulations

For successful training, textile robot training data must include a clear list of actions the robot performs with the material. Each such action requires specific prompts and labeling, as the movements of a manipulator for unfolding a towel differ fundamentally from those used when threading a needle. The entire spectrum of operations can be divided into three large categories depending on the goal and complexity of execution.

Manipulation actions

These actions are aimed at changing the general configuration of the fabric and giving it a certain appearance. Here, the main task for the system is understanding the object's geometry.

  • Folding – bending the product along defined lines so the robot knows exactly where the crease should occur.
  • Unfolding – carefully opening out crumpled or folded fabric for further processing.
  • Draping – placing fabric on a mannequin or surface so that it falls naturally under its own weight.
  • Stretching – tensioning the material to eliminate wrinkles, which often requires wrinkle classification to determine the zones of greatest tension.

Precision tasks

This is the most complex level, where the robot must work with millimeter precision. Such tasks are the foundation for the full automation of sewing factories.

  • Sewing – the direct process of joining parts of fabric, indicating the exact needle route to the machine.
  • Aligning edges – matching the edges of two pieces of fabric before sewing so that the seams come out straight.
  • Inserting fabric – pulling fabric into narrow openings or guiding elements of a sewing machine.

Movement and Sorting Tasks

Before beginning to sew or fold, the robot must find the object and pick it up correctly. This is the base without which further supply chain automation in the light industry is impossible.

  • Grasping – determining the optimal grip point so the fabric does not slip or get damaged.
  • Picking from a pile – the hardest task, where AI must recognize an individual product in a chaotic pile of textiles and pull it out.
  • Sorting textiles – distributing products by material type, color, or size based on computer vision data.
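Downstream, these three groups usually become the top level of the dataset's action taxonomy, so that samples can be filtered by task family. A sketch using Python enums (the names and mapping are invented for illustration):

```python
from enum import Enum

class ActionCategory(Enum):
    MANIPULATION = "manipulation"   # folding, unfolding, draping, stretching
    PRECISION = "precision"         # sewing, aligning edges, inserting fabric
    MOVEMENT = "movement"           # grasping, picking from a pile, sorting

# Illustrative mapping used to filter samples by task family.
ACTION_TO_CATEGORY = {
    "folding": ActionCategory.MANIPULATION,
    "sewing": ActionCategory.PRECISION,
    "grasping": ActionCategory.MOVEMENT,
}
```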

Annotation Methods for Textile Datasets

Creating a high-quality fabric manipulation dataset requires a combination of different approaches to marking. Since fabric constantly changes shape, a single annotation method cannot cover all the nuances of its behavior.

Manual annotation

Manual marking remains the gold standard of accuracy, especially for complex scenarios.

  • Frame-by-frame marking. Annotators highlight edges, corners, and critical fold zones, creating a base for garment state estimation.
  • Event detection. People mark the exact moments an action begins and ends, such as exactly when the fabric was grasped.
  • Subjective evaluation. Only a human can accurately determine the quality of smoothing or the correctness of the final folding of a product for training the evaluation system.

Semi-automatic

To speed up the process, computer vision tools are used to assist the human.

  • CV-assisted labeling. An algorithm automatically tracks the movement of a marked point in subsequent video frames, and a human only corrects errors.
  • Pose estimation. Specialized neural networks recognize the general "pose" of the clothing, automatically overlaying a skeletal model on the image of a shirt or trousers.
  • Intelligent masks. The system helps to quickly highlight fabric contours, separating them from the background and the operator's hands.
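At its simplest, CV-assisted labeling propagates an annotated point forward and lets the human correct drift. A toy constant-velocity propagation sketch (a real pipeline would track the point with optical flow rather than extrapolation):

```python
def propagate_point(p_prev2, p_prev1):
    """Predict the next 2D keypoint position assuming constant velocity
    between the two most recent annotated frames."""
    vx = p_prev1[0] - p_prev2[0]
    vy = p_prev1[1] - p_prev2[1]
    return (p_prev1[0] + vx, p_prev1[1] + vy)

# Two human-annotated frames seed the tracker; later frames are predicted
# automatically and only corrected when the annotator disagrees.
track = [(100.0, 200.0), (103.0, 198.0)]
track.append(propagate_point(track[-2], track[-1]))
```

The gain comes from the correction loop: the human touches only the frames where the prediction is visibly wrong, which on slow, smooth motions is a small fraction of the video.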

Simulation-based

The most scalable method is the use of physical simulators.

  • Automatic ground truth labels. The simulator generates fold line annotations and the coordinates of every wrinkle automatically, since it created the virtual fabric in the first place.
  • Variability. In a simulation, one can instantly change the fabric type from silk to denim, creating thousands of scenarios in seconds.
  • Training without risk. A robot can make mistakes millions of times in virtual space, gaining experience without damaging real manipulators or textiles.
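The appeal of simulation is that labels come for free: the simulator knows every vertex position, so a fold annotation can be computed exactly rather than drawn by hand. A toy sketch, assuming the fabric is a set of 2D vertices and the fold is a reflection across the line y = c:

```python
from typing import List, Tuple

Point = Tuple[float, float]

def fold_ground_truth(vertices: List[Point], fold_y: float):
    """Reflect every vertex above the fold line y = fold_y downward,
    returning the new positions plus an exact per-vertex 'moved' label."""
    folded, moved = [], []
    for x, y in vertices:
        if y > fold_y:
            folded.append((x, 2 * fold_y - y))  # mirror across the line
            moved.append(True)
        else:
            folded.append((x, y))
            moved.append(False)
    return folded, moved

# A 2x2 cloth patch folded along y = 0.5: the top row lands on the bottom row.
verts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
folded, moved = fold_ground_truth(verts, 0.5)
```

A real cloth simulator would add mass-spring or finite-element dynamics on top, but the principle is the same: the label is a by-product of the state the engine already maintains.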

Sensor-based Annotation

This method allows for the annotation of data that a person cannot see with their eyes, but a robot can "feel".

  • Force/tactile labeling. Data from pressure sensors on manipulator tips is automatically added to the video sequence as tags.
  • Tension and friction. The system records with what force a certain type of fabric needs to be pulled so it doesn't tear, creating a base for stretching and sewing tasks.
  • Tactile maps. Creating annotations that describe material texture and its slipperiness, which is critical for precise grasping.
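A simple way to turn raw force readings into tags is thresholding: frames where the gripper's normal force rises above a contact threshold are labeled automatically. A sketch with an invented threshold value (a real system would calibrate it per sensor and per fabric type):

```python
def tag_contact_events(forces, threshold=0.5):
    """Label each force sample as 'contact' or 'free'.
    The threshold is illustrative and would be calibrated per sensor."""
    tags = []
    for f in forces:
        tags.append("contact" if f >= threshold else "free")
    return tags

# Toy force trace (newtons): the gripper touches the fabric mid-sequence.
readings = [0.0, 0.1, 0.7, 1.2, 0.3]
tags = tag_contact_events(readings)
```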

Data Collection Pipelines

The process of gathering data for training textile robots begins with the careful preparation of the workspace and the recording of manipulations. At this stage, it is important to use a multi-camera setup to cover the fabric from different angles and solve the problem of layer self-occlusion. A high frame rate allows for the capture of the slightest fluctuations and dynamic changes in material shape that occur instantly upon contact with a manipulator. Simultaneously with the video, data streams from tactile sensors and force sensors are recorded, creating a complete physical picture of the interaction.

The next step is the synchronization and alignment of all received information streams. Sensor alignment guarantees that the visual frame at a specific millisecond corresponds exactly to the fabric tension metrics and the robot's position in space. This allows for the transformation of disparate data into a holistic fabric manipulation dataset, where every pixel change on the screen is logically backed by physical parameters. Without perfect synchronization, AI will not be able to learn the correct link between its action and the soft object's reaction.
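In practice, the alignment step often reduces to nearest-timestamp matching: for each video frame, pick the sensor sample closest in time. A stdlib sketch (timestamps in seconds; variable names are illustrative):

```python
import bisect

def align_nearest(frame_times, sensor_times):
    """For each frame timestamp, return the index of the closest
    sensor timestamp. Assumes sensor_times is sorted ascending."""
    indices = []
    for t in frame_times:
        i = bisect.bisect_left(sensor_times, t)
        if i == 0:
            indices.append(0)
        elif i == len(sensor_times):
            indices.append(len(sensor_times) - 1)
        else:
            # Compare the neighbour on each side and keep the closer one.
            before, after = sensor_times[i - 1], sensor_times[i]
            indices.append(i if after - t < t - before else i - 1)
    return indices

# A 30 fps camera against a 100 Hz force sensor (toy timestamps).
frames = [0.000, 0.033, 0.066]
sensor = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07]
idx = align_nearest(frames, sensor)
```

Production rigs usually go further and use hardware triggers or a shared clock, since software timestamps drift; nearest-neighbour matching is the fallback when only logged timestamps are available.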

After technical preparation, the stage of annotation and validation begins. At this point, specialists or automated systems are involved to apply tags such as fold line annotation or stitch path labeling. It is vital to ensure consistent labeling – a unified marking standard for all process participants to avoid errors in neural network training. Validation checks the accuracy of these tags, filtering out frames with incorrect recognition or technical defects, which guarantees the high quality of the final dataset.
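Consistency is easiest to enforce mechanically: every record is validated against a shared label vocabulary before it enters the dataset. A minimal sketch (the vocabulary below is invented for illustration):

```python
# Shared vocabulary all annotators must use; the values are illustrative.
ALLOWED_STATES = {"flat", "crumpled", "folded", "draped"}
ALLOWED_WRINKLE_CLASSES = {"shallow", "medium", "deep"}

def validate_record(record):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    if record.get("state") not in ALLOWED_STATES:
        problems.append(f"unknown state: {record.get('state')!r}")
    for w in record.get("wrinkles", []):
        if w.get("class") not in ALLOWED_WRINKLE_CLASSES:
            problems.append(f"unknown wrinkle class: {w.get('class')!r}")
    return problems

good = {"state": "folded", "wrinkles": [{"class": "deep"}]}
bad = {"state": "Folded", "wrinkles": [{"class": "huge"}]}
```

Even a capitalization slip ("Folded" vs. "folded") silently splits one class into two during training, which is why the check is exact-match rather than fuzzy.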

The pipeline concludes with a stage of systematized storage and preparation for use in training cycles. The resulting data is structured to provide fast access to specific action types, for example, separately for wrinkle classification or garment state estimation. Proper storage organization allows for easy scaling of textile robot training data, adding new materials or work scenarios. Such a sequential approach transforms a raw recording of movements into a powerful tool capable of teaching a machine to work with textiles at the level of a professional seamstress.

FAQ

How does fabric type affect annotation complexity?

Different materials, like silk or denim, have different levels of elasticity and friction, requiring additional tags regarding the physical properties of the material. Annotators often have to indicate draping specifics for each individual type of textile in the dataset.

Why is it so hard for robots to recognize a fabric edge in a pile of clothes?

In a chaotic pile, severe self-occlusion occurs, where edges merge with other folds of the same color or texture. To solve this, special "edge segmentation maps" are annotated to help the AI see the boundaries of individual layers. 

What is "self-occlusion" and how does it affect annotation?

This is a state where one part of the fabric hides another part under it, making it invisible to the camera. Annotators use multi-camera shoots or simulations to mark the positions of "hidden" points that the robot must learn to predict.

How is the wrinkle-smoothing process annotated?

Depth maps and direction vectors are created, indicating to the robot exactly where it needs to pull the fabric to level a specific wrinkle. This allows for the transformation of visual observation into a clear mechanical action.

What is the role of 5G or high-speed connectivity in data collection?

When using multiple high-frame-rate cameras, the data volume becomes enormous. Fast connectivity is necessary for the instant transfer and synchronization of video with sensor data from the robot in real-time.

How is the success of an action's execution marked?

Annotators compare the robot's result with an "ideal template" and assign a score or mark deviation zones. This is used for reinforcement learning, where the robot tries to minimize the number of errors. 

What difficulties arise when annotating seams?

The hardest part is marking an ideal seam line on an uneven surface that is constantly moving under the sewing machine foot. A combination of micro-cameras and thread tension sensors is often used for this.