auto_labeling_3d

The auto-labeling pipeline for 3D detection.

```mermaid
graph LR
    NADATA[(non-annotated T4Dataset)]

    subgraph "Model A inference"
        INFERENCE_A[create_info]
    end

    subgraph "Model B inference"
        INFERENCE_B[create_info]
    end

    subgraph "Model C inference"
        INFERENCE_C[create_info]
    end

    subgraph "Ensemble"
        ENSEMBLE[filter_objects]
    end

    subgraph "Temporal ID Consistency"
        TRACKING[attach_tracking_id]
    end

    subgraph "Convert to T4Dataset"
        CONVERT[create_pseudo_dataset]
    end

    DATA[(auto-labeled T4Dataset)]

    NADATA --> INFERENCE_A
    NADATA --> INFERENCE_B
    NADATA --> INFERENCE_C

    INFERENCE_A --> ENSEMBLE
    INFERENCE_B --> ENSEMBLE
    INFERENCE_C --> ENSEMBLE

    ENSEMBLE --> TRACKING
    TRACKING --> CONVERT
    CONVERT --> DATA

    click INFERENCE_A "https://github.com/tier4/AWML/tree/main/tools/auto_labeling_3d#step-31-create-info-file-from-non-annotated-t4dataset"
    click INFERENCE_B "https://github.com/tier4/AWML/tree/main/tools/auto_labeling_3d#step-31-create-info-file-from-non-annotated-t4dataset"
    click INFERENCE_C "https://github.com/tier4/AWML/tree/main/tools/auto_labeling_3d#step-31-create-info-file-from-non-annotated-t4dataset"
    click ENSEMBLE "https://github.com/tier4/AWML/tree/main/tools/auto_labeling_3d#step-32-filter-and-ensemble-results"
    click TRACKING "https://github.com/tier4/AWML/tree/main/tools/auto_labeling_3d#step-33-attach-tracking-ids"
    click CONVERT "https://github.com/tier4/AWML/tree/main/tools/auto_labeling_3d#step-34-create-the-auto-labeled-t4dataset"
```

Auto Labeling 3D Process Flow

1. Setup Environment

  • Please follow the installation tutorial to set up the environment.
  • In addition, follow the setup procedure below.

Build docker image

  • Build the docker image.
    • If you built the AWML image locally, add `--build-arg BASE_IMAGE=awml` or `--build-arg BASE_IMAGE=awml-ros2` to the build command.

```sh
DOCKER_BUILDKIT=1 docker build -t auto_labeling_3d -f tools/auto_labeling_3d/Dockerfile .
```

  • Run the docker container.

```sh
docker run -it --gpus '"device=0"' --name auto_labeling_3d --shm-size=64g -d -v {path to autoware-ml}:/workspace -v {path to data}:/workspace/data auto_labeling_3d bash
```
  • To use models for auto labeling, follow the setup procedure in the README of each model.

2. Prepare Dataset

Prepare your non-annotated T4dataset in the following structure:

- data/t4dataset/
  - pseudo_xx1/
    - scene_0/
      - annotation/
        - ..
      - data/
        - LIDAR_CONCAT/
        - CAM_*/
        - ..
      - ...
    - scene_1/
      - ..
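Before running the pipeline, it can help to sanity-check this layout with a short script. The following is a minimal sketch (not part of the tool); the directory names match the structure above:

```python
from pathlib import Path


def check_scene(scene_dir: Path) -> list:
    """Return a list of problems found in one non-annotated scene directory."""
    problems = []
    if not (scene_dir / "annotation").is_dir():
        problems.append(f"{scene_dir.name}: missing annotation/")
    if not (scene_dir / "data" / "LIDAR_CONCAT").is_dir():
        problems.append(f"{scene_dir.name}: missing data/LIDAR_CONCAT/")
    return problems


def check_dataset(root: str) -> list:
    """Check every scene directory under e.g. data/t4dataset/pseudo_xx1/."""
    problems = []
    for scene in sorted(Path(root).iterdir()):
        if scene.is_dir():
            problems.extend(check_scene(scene))
    return problems
```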

3. Run Auto Labeling Pipeline

You have two options to run the pipeline:

Option A: Quick Start with launch.py

For most users, use launch.py to run the entire pipeline in one command:

```sh
python tools/auto_labeling_3d/entrypoint/launch.py tools/auto_labeling_3d/entrypoint/configs/example.yaml
```

This executes all steps automatically:

  1. Download model checkpoints from Model Zoo
  2. Run inference and create info files with pseudo labels
  3. Ensemble/filter results from multiple models
  4. Attach consistent tracking IDs across frames
  5. Generate final auto-labeled T4Dataset
  6. Restructure directory format

See example.yaml and update paths for your workspace.

Option B: Run Individual Modules

For advanced users who need granular control or want to customize the pipeline, you can run each step separately:

Step 3.1: Create info file from non-annotated T4dataset

Run inference with a 3D detection model to generate info files:

```sh
python tools/auto_labeling_3d/create_info_data/create_info_data.py --root-path {path to directory of non-annotated T4dataset} --out-dir {path to output} --config {model config file to use auto labeling} --ckpt {checkpoint file}
```

  • For example, run the following command:

```sh
python tools/auto_labeling_3d/create_info_data/create_info_data.py --root-path ./data/t4dataset/pseudo_xx1 --out-dir ./data/t4dataset/info --config projects/BEVFusion/configs/t4dataset/bevfusion_lidar_voxel_second_secfpn_1xb1_t4offline.py --ckpt ./work_dirs/bevfusion_offline/epoch_20.pth
```
  • If you want to ensemble multiple models, create an info file for each model.
  • As a result, the directory structure is as follows:
- data/t4dataset/
  - pseudo_xx1/
    - scene_0/
      - annotation/
        - ..
      - data/
      - ...
    - scene_1/
      - ..
  - info/
    - pseudo_infos_raw_centerpoint.pkl
    - pseudo_infos_raw_bevfusion.pkl
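Each raw info file is a pickle, so a quick way to inspect how many pseudo labels a model produced is a short script like the one below. This is only a sketch: it assumes an mmdetection3d-style layout with `data_list`/`instances`/`bbox_label_3d` keys, so adjust the keys if your info files differ.

```python
import pickle
from collections import Counter


def summarize_info(info_path: str) -> Counter:
    """Count pseudo-label instances per class index in an info .pkl.

    Assumed (not guaranteed) layout:
    {"data_list": [{"instances": [{"bbox_label_3d": int, ...}, ...]}, ...]}
    """
    with open(info_path, "rb") as f:
        info = pickle.load(f)
    counts = Counter()
    for sample in info.get("data_list", []):
        for instance in sample.get("instances", []):
            counts[instance.get("bbox_label_3d", -1)] += 1
    return counts
```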
Step 3.2: Filter and ensemble results

Filter

  • Set a config that decides what you want to filter.
    • Set confidence thresholds to drop low-confidence objects:

```python
centerpoint_pipeline = [
    dict(
        type="ThresholdFilter",
        confidence_thresholds={
            "car": 0.35,
            "truck": 0.35,
            "bus": 0.35,
            "bicycle": 0.35,
            "pedestrian": 0.35,
        },
        use_label=["car", "truck", "bus", "bicycle", "pedestrian"],
    ),
]

filter_pipelines = dict(
    type="Filter",
    input=dict(
        name="centerpoint",
        info_path="./data/t4dataset/info/pseudo_infos_raw_centerpoint.pkl",
        filter_pipeline=centerpoint_pipeline,
    ),
)
```
  • Create the info file, filtering out the objects that will not be used in the auto-labeled T4Dataset:

```sh
python tools/auto_labeling_3d/filter_objects/filter_objects.py --config {config_file} --work-dir {path to output}
```
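Conceptually, a ThresholdFilter keeps only the labels listed in `use_label` whose score clears the per-class threshold. A minimal sketch of that idea (not the tool's actual implementation; the object dict format here is an illustrative assumption):

```python
def threshold_filter(objects, confidence_thresholds, use_label):
    """Keep objects whose label is in use_label and whose confidence
    score meets that label's threshold.

    objects: list of dicts such as {"label": "car", "score": 0.8}
    (an assumed format for illustration).
    """
    return [
        obj for obj in objects
        if obj["label"] in use_label
        and obj["score"] >= confidence_thresholds[obj["label"]]
    ]
```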
Ensemble

  • If you want to ensemble multiple models, set a config as below:
```python
centerpoint_pipeline = [
    dict(
        type="ThresholdFilter",
        confidence_thresholds={
            "car": 0.35,
            "truck": 0.35,
            "bus": 0.35,
            "bicycle": 0.35,
            "pedestrian": 0.35,
        },
        use_label=["car", "truck", "bus", "bicycle", "pedestrian"],
    ),
]

bevfusion_pipeline = [
    dict(
        type="ThresholdFilter",
        confidence_thresholds={
            "bicycle": 0.35,
            "pedestrian": 0.35,
        },
        use_label=["bicycle", "pedestrian"],
    ),
]

filter_pipelines = dict(
    type="Ensemble",
    config=dict(
        type="NMSEnsembleModel",
        ensemble_setting=dict(
            weights=[1.0, 1.0],
            iou_threshold=0.55,
        ),
    ),
    inputs=[
        dict(
            name="centerpoint",
            info_path="./data/t4dataset/info/pseudo_infos_raw_centerpoint.pkl",
            filter_pipeline=centerpoint_pipeline,
        ),
        dict(
            name="bevfusion",
            info_path="./data/t4dataset/info/pseudo_infos_raw_bevfusion.pkl",
            filter_pipeline=bevfusion_pipeline,
        ),
    ],
)
```
  • Create the info file by filtering the objects that will not be used in the auto-labeled T4Dataset and ensembling the filtered results:

```sh
python tools/auto_labeling_3d/filter_objects/ensemble_infos.py --config {config_file} --work-dir {path to output}
```

  • As a result, the directory structure is as follows:
- data/t4dataset/
  - pseudo_xx1/
    - scene_0/
      - annotation/
        - ..
      - data/
      - ...
    - scene_1/
      - ..
  - info/
    - pseudo_infos_raw_centerpoint.pkl
    - pseudo_infos_raw_bevfusion.pkl
    - pseudo_infos_filtered.pkl
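The NMS ensemble step merges the per-model detections and suppresses overlapping duplicates by IoU. The idea can be sketched as below; this is a simplification using axis-aligned BEV boxes (real 3D NMS uses rotated-box IoU), and the box format and score weighting are illustrative assumptions, not the NMSEnsembleModel implementation:

```python
def iou_bev(a, b):
    """Axis-aligned BEV IoU between boxes given as (x, y, width, length)."""
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms_ensemble(model_outputs, weights, iou_threshold):
    """Merge detections from several models, then greedily keep the
    highest-scoring box and drop any later box overlapping it too much.

    model_outputs: one list per model of dicts {"box": (x, y, w, l), "score": float}.
    """
    pool = [
        {"box": det["box"], "score": det["score"] * weight}
        for dets, weight in zip(model_outputs, weights)
        for det in dets
    ]
    pool.sort(key=lambda det: det["score"], reverse=True)
    kept = []
    for det in pool:
        if all(iou_bev(det["box"], k["box"]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept
```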
Step 3.3: Attach tracking IDs

  • Attach tracking IDs to maintain temporal consistency:
    • If you do not need tracking IDs for your target annotation, you can skip this step.

```sh
python tools/auto_labeling_3d/attach_tracking_id/attach_tracking_id.py --input {info file} --output {info file}
```

  • As a result, an info file is created as below:
- data/t4dataset/
  - pseudo_xx1/
    - scene_0/
      - annotation/
        - ..
      - data/
      - ...
    - scene_1/
      - ..
  - info/
    - pseudo_infos_raw_centerpoint.pkl
    - pseudo_infos_raw_bevfusion.pkl
    - pseudo_infos_filtered.pkl
    - pseudo_infos_tracked.pkl
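Conceptually, attaching tracking IDs means associating each detection with the nearest detection in the previous frame and reusing its ID, or minting a new ID when nothing is close enough. A greedy sketch of that idea (not the tool's actual tracker; `max_dist` is an illustrative gating parameter):

```python
import math
from itertools import count


def attach_ids(frames, max_dist=2.0):
    """Greedy nearest-neighbor ID assignment across frames.

    frames: list of frames, each a list of (x, y) object centers.
    Returns a parallel list of per-frame ID lists.
    """
    next_id = count()
    prev = []  # (id, center) pairs from the previous frame
    all_ids = []
    for centers in frames:
        ids = []
        used = set()
        for center in centers:
            best, best_dist = None, max_dist
            for tid, prev_center in prev:
                if tid in used:
                    continue
                dist = math.dist(center, prev_center)
                if dist < best_dist:
                    best, best_dist = tid, dist
            tid = best if best is not None else next(next_id)
            used.add(tid)
            ids.append(tid)
        all_ids.append(ids)
        prev = list(zip(ids, centers))
    return all_ids
```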
Step 3.4: Create the auto-labeled T4Dataset

Generate the auto-labeled T4Dataset:

```sh
python tools/auto_labeling_3d/create_pseudo_t4dataset/create_pseudo_t4dataset.py {yaml config file about T4dataset data} --root-path {path to directory of non-annotated T4dataset} --input {path to pkl file}
```

  • As a result, the auto-labeled T4Dataset is created as below:
- data/t4dataset/
  - pseudo_xx1/
    - scene_0/
      - annotation/
        - sample.json
        - ..
    - scene_1/
      - ..
    - ..

4. Use for training

Verify the auto-labeled T4Dataset

Before using the auto-labeled T4Dataset for training, you can visualize and verify the generated labels using t4-devkit.

Please refer to t4-devkit render tutorial for visualization instructions.

Upload to WebAuto

Please upload the auto-labeled T4Dataset to WebAuto so it can easily be shared with other users.

Please check the Web.Auto documentation for details.

Use in local PC

To align the T4dataset directory structure, run the following script:

```sh
python tools/auto_labeling_3d/change_directory_structure/change_directory_structure.py --dataset_dir data/t4dataset/pseudo_xx1/
```

The resulting structure of the auto-labeled T4Dataset is as follows:

- data/t4dataset/
  - pseudo_xx1/
    - scene_0/
      - 0/
        - annotation/
          - sample.json
          - ..
    - scene_1/
      - 0/
        - ..
    - ..
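The restructuring moves each scene's contents into a `0/` subdirectory, as shown in the trees above. A sketch of that transformation (not the actual `change_directory_structure.py` script):

```python
import shutil
from pathlib import Path


def nest_scene_dirs(dataset_dir: str) -> None:
    """Move each scene's contents into a '0/' subdirectory
    (scene_0/annotation -> scene_0/0/annotation)."""
    for scene in sorted(Path(dataset_dir).iterdir()):
        if not scene.is_dir():
            continue
        target = scene / "0"
        target.mkdir(exist_ok=True)
        for item in list(scene.iterdir()):
            if item.name != "0":
                shutil.move(str(item), str(target / item.name))
```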

How to train with the auto-labeled T4Dataset

1. Add a YAML config for your auto-labeled T4Dataset

Create a YAML under autoware_ml/configs/t4dataset/ describing your auto-labeled T4Dataset.

Example: autoware_ml/configs/t4dataset/pseudo_j6_v1.yaml

2. Run docker and mount the auto-labeled T4Dataset

Ensure your auto-labeled T4Dataset is mounted under ./data/t4dataset inside the container.

Example:

```sh
docker run -it --gpus '"device=0"' --name auto_labeling_3d --shm-size=64g -d -v {path to autoware-ml}:/workspace -v {path to auto-labeled T4Dataset}:/workspace/data auto_labeling_3d bash
```

3. Update dataset.py used in training config

Add the name of your auto-labeled T4Dataset directory to the dataset_version_list in the dataset config file used by your training configuration.

Example Case

```python
dataset_version_list = [
    "db_j6gen2_v1",
    "db_j6gen2_v2",
    "db_j6gen2_v3",
    "db_j6gen2_v4",
    "db_j6gen2_v5",
    "db_largebus_v1",
    "db_largebus_v2",
    "pseudo_x2",
]
```

4. Prepare T4Dataset info and train

Follow the dataset preparation steps to generate info files, then start training with your chosen model and the YAML you added in Step 1.

(Optional) Add downsampling for your dataset

Auto labeling runs at 10 Hz, while manually annotated datasets are usually 1 Hz. Depending on your data distribution, you may want to downsample your pseudo-labeled dataset. This is currently done at the info-file creation stage, as seen here. Add your dataset name to the conditional check and set `sample_steps = 10`.
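The downsampling described above amounts to keeping every tenth frame for pseudo-labeled datasets. A sketch of the idea (the names in `pseudo_versions` are placeholders; the actual conditional check lives in the info-creation code):

```python
def maybe_downsample(samples, dataset_version,
                     pseudo_versions=("pseudo_xx1",), sample_steps=10):
    """Keep every sample_steps-th frame for 10 Hz pseudo-labeled datasets,
    so their frame rate roughly matches 1 Hz manually annotated data.

    pseudo_versions is an illustrative placeholder for your dataset names.
    """
    if dataset_version in pseudo_versions:
        return samples[::sample_steps]
    return samples
```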