This project is an implementation of our Loddis framework, published in the proceedings of the 40th AAAI Conference on Artificial Intelligence (AAAI’26).
Currently, almost all traditional infrared small target detection methods assume that the training and test sets belong to the same domain and that training samples are sufficient. In real applications, however, a new detection task often lacks sufficient training samples from its particular domain. In this situation, adopting auxiliary data from big-sample domains is widely believed to be one of the most promising solutions. Unexpectedly, however, simply adding auxiliary samples is not always effective and can even cause performance decline, due to the infrared domain shift. To overcome this problem, we propose the first infrared moving small target detection framework with domain-auxiliary support, by Learning to Overlook Domain Discrepancy (Loddis).
- Datasets are available at DAUB-R (code: jya7) and IRDST-H (code: c4ar). DAUB-R is a reconstructed version of DAUB, split into training, validation, and test sets. IRDST-H is a hard version of IRDST.
- You need to reorganize these datasets in a format similar to the provided `Loddis/train_3IR+7DA.txt` and `IRDST-H/val.txt` files (the `.txt` files are used in training). We provide the `.txt` files for ITSDT-15K, DAUB-R, and IRDST-H. For example:

```python
train_annotation_path = '/home/chenshengjia/Loddis/train_3IR+7DA.txt'
val_annotation_path = '/home/public/IRDST-H/val.txt'
```

- Or you can generate new `.txt` files based on the paths of your datasets. The `.txt` files (e.g., `IRDST-H/train.txt`) can be generated from the `.json` files (e.g., `train.json`). We also provide all `.json` files for DAUB-R (code: jya7) and IRDST-H (code: c4ar):
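For reference, the conversion can be sketched roughly as below. This is a minimal, hypothetical sketch, not the repo's actual script: it assumes standard COCO-format `.json` annotations and a YOLOX-style annotation line of `<image_path> <x1>,<y1>,<x2>,<y2>,<class_id> ...`, which may differ from the real output format.

```python
# Hypothetical sketch of a COCO-json -> txt conversion.
# Assumes standard COCO fields; emits one line per image:
#   <image_path> <x1>,<y1>,<x2>,<y2>,<class_id> ...
import json
import os

def coco_to_txt(json_path, img_root, out_txt):
    with open(json_path) as f:
        coco = json.load(f)
    # map image id -> file name
    images = {im["id"]: im["file_name"] for im in coco["images"]}
    # collect boxes per image; COCO bbox is [x, y, w, h]
    boxes = {}
    for ann in coco["annotations"]:
        x, y, w, h = ann["bbox"]
        box = f"{x:.0f},{y:.0f},{x + w:.0f},{y + h:.0f},{ann['category_id']}"
        boxes.setdefault(ann["image_id"], []).append(box)
    with open(out_txt, "w") as f:
        for img_id, name in images.items():
            line = " ".join([os.path.join(img_root, name)] + boxes.get(img_id, []))
            f.write(line + "\n")
```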
```shell
python utils_coco/coco_to_txt.py
```

- The folder structure should look like this:
```
IRDST-H
├─train.json
├─val.json
├─test.json
├─train.txt
├─val.txt
├─test.txt
├─1
│ ├─0.bmp
│ ├─1.bmp
│ ├─2.bmp
│ ├─ ...
├─2
│ ├─0.bmp
│ ├─1.bmp
│ ├─2.bmp
│ ├─ ...
├─3
│ ├─ ...
```
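Before training, a quick sanity check of the layout above can save debugging time. The helper below is an illustrative sketch (not part of the repo); it only verifies that the six annotation files exist and counts `.bmp` frames per sequence folder:

```python
# Sanity-check sketch for an IRDST-H-style dataset folder (illustrative only).
import os

def check_layout(root):
    """Return (missing annotation files, frame counts per sequence folder)."""
    needed = ["train.json", "val.json", "test.json",
              "train.txt", "val.txt", "test.txt"]
    missing = [f for f in needed if not os.path.isfile(os.path.join(root, f))]
    seq_dirs = [d for d in sorted(os.listdir(root))
                if os.path.isdir(os.path.join(root, d))]
    frames = {d: sum(f.endswith(".bmp") for f in os.listdir(os.path.join(root, d)))
              for d in seq_dirs}
    return missing, frames
```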
- python==3.11.8
- pytorch==2.1.1
- torchvision==0.16.1
- numpy==1.26.4
- opencv-python==4.9.0.80
- scipy==1.13
- Tested on Ubuntu 20.04 with CUDA 11.8 and 1× NVIDIA RTX 3090.
- It is necessary to predefine the domain IDs for the different dataset settings during training. Specifically, the ID of the smaller dataset is set to 0, and the ID of the larger dataset is set to 1.
- Take the `3IR+7DA` setting as an example; modify the dataset ID assignments in `dataloader_for_all_domain.py`:
```python
# Loddis/utils/dataloader_for_all_domain.py
domain_key = os.path.normpath(file_name).split(os.sep)[-3]  # dataset folder name
domain_map = {
    'IRDST': 0,
    'DAUB': 1,
    'TSIRMT': 1,
}
domain_id = None
for key, did in domain_map.items():
    if key in file_name:
        domain_id = did
        break
if domain_id is None:
    raise ValueError(f"Unrecognized domain (no mapping substring found): {file_name}")
```

- Take the `3DA+7IR` setting as an example:
```python
domain_map = {
    'IRDST': 1,
    'DAUB': 0,
    'TSIRMT': 1,
}
```

- Note: please use a different `dataloader` for different datasets. For example, to train the model on the ITSDT dataset, enter the following command:
```shell
python train.py
```

- Note that `model_best.pth` is not necessarily the best model; the best model may have a lower val_loss or a higher AP50 during validation.
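To locate a better checkpoint automatically, a small helper like the one below can scan the logs directory. This is a hypothetical sketch: it assumes checkpoint filenames embed the validation loss, e.g. `ep050-loss2.345-val_loss3.456.pth`, which may not match this repo's actual naming.

```python
# Hypothetical helper: pick the checkpoint with the lowest val_loss.
# Assumes filenames like 'ep050-loss2.345-val_loss3.456.pth' (an assumption,
# not guaranteed by the repo).
import os
import re

def pick_best_checkpoint(logs_dir):
    pattern = re.compile(r"ep(\d+)-loss([\d.]+)-val_loss([\d.]+)\.pth$")
    best, best_val = None, float("inf")
    for name in os.listdir(logs_dir):
        m = pattern.search(name)
        if m and float(m.group(3)) < best_val:
            best_val = float(m.group(3))
            best = os.path.join(logs_dir, name)
    return best
```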
```python
"model_path": '/home/MoPKL/logs/model.pth'
```

- You need to change the path of the `.json` file of the test set. For example:
```python
# Use the IRDST-H dataset for testing
cocoGt_path = '/home/public/IRDST-H/test.json'
dataset_img_path = '/home/public/IRDST-H/'
```

```shell
python test.py
```

- We also provide a batch-testing script, `test_all.py`, which takes the weights directory (e.g., `models_dir = '/home/chenshengjia/Loddis/logs/loss_********'`) as input. It also allows specifying the test epoch range, e.g., `start_epoch = 15`, `end_epoch = 50`.
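Conceptually, such a batch-testing loop might look like the sketch below. Here `evaluate` is a hypothetical stand-in for the repo's actual test routine, and the `ep<N>-` filename prefix is an assumption:

```python
# Illustrative sketch of batch-testing checkpoints in an epoch range.
# 'models_dir', 'start_epoch', 'end_epoch' mirror the variables mentioned
# above; 'evaluate' is a hypothetical callback standing in for the real
# test routine.
import os

def batch_test(models_dir, start_epoch, end_epoch, evaluate):
    results = {}
    for name in sorted(os.listdir(models_dir)):
        if not name.endswith(".pth"):
            continue
        # assume an 'ep<N>-...' prefix encodes the epoch number
        if name.startswith("ep"):
            epoch = int(name[2:].split("-")[0])
            if start_epoch <= epoch <= end_epoch:
                results[epoch] = evaluate(os.path.join(models_dir, name))
    return results
```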
```shell
python test_all.py
```

- We support `video` and `single-frame image` prediction:

```python
# mode = "video"   # predict a video sequence
mode = "predict"   # predict a single-frame image
```

```shell
python predict.py
```

```shell
python summary.py
```

- Negative-shift degradation comparisons on six mixed-domain datasets.
`P`: primary data, `A`: auxiliary data. This figure presents negative-shift degradation comparisons for representative methods, revealing two notable observations. One is that our Loddis establishes a new SOTA benchmark in terms of F1 score, consistently securing the top performance on all six mixed-domain datasets. The other is that our method consistently enhances the performance of a weaker baseline across all six datasets by using auxiliary data, whereas other methods may suffer performance degradation.
If you have any questions, kindly contact Shengjia Chen via e-mail: [email protected].
- S. Chen, L. Ji, J. Zhu, M. Ye and X. Yao, "SSTNet: Sliced Spatio-Temporal Network With Cross-Slice ConvLSTM for Moving Infrared Dim-Small Target Detection," in IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-12, 2024, Art no. 5000912, doi: 10.1109/TGRS.2024.3350024.
- Bingwei Hui, Zhiyong Song, Hongqi Fan, et al. A dataset for infrared image dim-small aircraft target detection and tracking under ground / air background[DS/OL]. V1. Science Data Bank, 2019[2024-12-10]. https://cstr.cn/31253.11.sciencedb.902. CSTR:31253.11.sciencedb.902.
- Z. Ge, S. Liu, F. Wang, Z. Li, and J. Sun, “YOLOX: Exceeding YOLO series in 2021,” arXiv preprint arXiv:2107.08430, 2021.
Coming soon.
