RangeBEV

Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition

1) Introduction

This repository contains the code for our proposed method "RangeBEV". The paper is Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition (https://arxiv.org/abs/2502.11742).

We propose an innovative initial retrieval + re-rank method that effectively combines information from range (or RGB) images and Bird's Eye View (BEV) images. Our approach relies solely on a computationally efficient global descriptor similarity search process to achieve re-ranking. Additionally, we introduce a novel similarity label supervision technique to maximize the utility of limited training data.
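
As a rough, hypothetical illustration of the initial retrieval + re-rank idea (the descriptor names, dimensions, and similarity measure below are placeholders, not the repo's actual pipeline), a query's range/RGB global descriptor first retrieves the top-k database candidates, which are then re-ranked by BEV global descriptor similarity:

# Hypothetical sketch of "initial retrieval + re-rank" with global descriptors.
# Shapes and variable names are illustrative, not this repo's actual API.
import numpy as np

def l2_normalize(x):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-12)

def retrieve_and_rerank(q_range_desc, q_bev_desc, db_range_desc, db_bev_desc, topk=20):
    # Stage 1: initial retrieval by range/RGB global descriptor similarity.
    sim = l2_normalize(db_range_desc) @ l2_normalize(q_range_desc)   # (N,)
    cand = np.argsort(-sim)[:topk]                                   # top-k candidate indices
    # Stage 2: re-rank only the candidates by BEV global descriptor similarity.
    bev_sim = l2_normalize(db_bev_desc[cand]) @ l2_normalize(q_bev_desc)
    return cand[np.argsort(-bev_sim)]                                # final ranking

# Toy example with random 256-D descriptors and 1000 database entries.
rng = np.random.default_rng(0)
db_r, db_b = rng.normal(size=(1000, 256)), rng.normal(size=(1000, 256))
q_r, q_b = rng.normal(size=256), rng.normal(size=256)
print(retrieve_and_rerank(q_r, q_b, db_r, db_b)[:5])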

Experimental results on the KITTI dataset demonstrate that our method significantly outperforms state-of-the-art approaches.

1、Pipeline

[pipeline figure]

2、Quantitative Results

[quantitative results figure]

3、Qualitative Results

[qualitative results figure]

2) Getting Started

1、Environment Setup

We use PyTorch and the MMSegmentation Library. We acknowledge their great contributions!

conda create -yn RangeBEV python=3.8
conda activate RangeBEV
pip install -r requirements.txt

If you encounter problems installing mmsegmentation, MinkowskiEngine, faiss, open3d, vision3d, etc., please refer to their official websites for help.

2、Download Datasets

KITTI Odometry dataset

First, log in to the KITTI official website and download the odometry dataset: the "color", "velodyne laser data", "calibration files" and "ground truth poses" .zip files. Unzip them into the folder structure described in the official guide. Then manually create a folder named "my_tool" for later use.

SemanticKITTI dataset

Next, download the SemanticKITTI label data from the official website; it is used as the ground truth for training the model on sequences 11~21.

After this step, you should have the following folder structure in a data root folder:

KITTI/
├── dataset/
│   ├── poses/
│   ├── sequences/
├── my_tool/

semanticKITTI/
├── dataset/
│   ├── sequences/
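
If helpful, here is a small hypothetical sanity check (not part of this repo; adjust DATA_ROOT to your own path) to confirm the layout before preprocessing:

# Hypothetical helper: verify the expected KITTI / SemanticKITTI layout.
from pathlib import Path

DATA_ROOT = Path("/path/to/your/data_root")   # adjust to your data root
expected = [
    "KITTI/dataset/poses",
    "KITTI/dataset/sequences",
    "KITTI/my_tool",
    "semanticKITTI/dataset/sequences",
]
for rel in expected:
    status = "ok" if (DATA_ROOT / rel).is_dir() else "MISSING"
    print(f"{status:7s} {rel}")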

Boreas dataset

Additionally, if you want to run the model on the Boreas dataset, you can download it from the official website. The required sequences are listed in datasets/Boreas_dp/mv_boreas_minuse.py, and only the LiDAR and camera sensor data are needed. We only run the single-modal and RGB-to-point-cloud cross-modal experiments on Boreas; if you want to run our proposed full method on it, you may need further data preprocessing.

3、Prepare Models

Prepare the pretrained model weights for the RGB image branch from here; the required file is fpn_r50_512x512_160k_ade20k_20200718_131734-5b5a6ab9.pth. You should first modify it to change the key names in its state_dict to match our model architecture, then save it as /path/to/your/fpn_r50_512x512_160k_ade20k_20200718_131734-5b5a6ab9_v2.pth, as sketched below.
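
A minimal sketch of that conversion, assuming a simple prefix rename is sufficient (the actual old_prefix/new_prefix mapping depends on the RGB-branch model definition in this repo):

# Sketch only: load the MMSegmentation FPN checkpoint, remap state_dict keys,
# and save the result as the *_v2.pth file expected by this repo.
# The old_prefix/new_prefix values below are placeholders, not the real mapping.
import torch

ckpt = torch.load("fpn_r50_512x512_160k_ade20k_20200718_131734-5b5a6ab9.pth",
                  map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)

old_prefix, new_prefix = "backbone.", "image_encoder."   # placeholder mapping
new_state_dict = {
    (new_prefix + k[len(old_prefix):]) if k.startswith(old_prefix) else k: v
    for k, v in state_dict.items()
}

torch.save({"state_dict": new_state_dict},
           "/path/to/your/fpn_r50_512x512_160k_ade20k_20200718_131734-5b5a6ab9_v2.pth")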

Prepare the KITTI pretrained weights for the monocular metric depth estimation model from here; the file depth_anything_metric_depth_outdoor.pt is needed. Next, clone the Depth-Anything repository and install it according to the official guide. After that, move the datasets/Kitti_Odometry_dp/generate_bev_v2_sobel.py and datasets/Kitti_Odometry_dp/my_pykitti_odometry.py files into the yourfolder/Depth-Anything/metric_depth/ folder, and set the pretrained weights path in them to the correct location.

4、Data Preprocessing

Run the following commands in order to preprocess the data.

cd datasets/Kitti_Odometry_dp
python rgb_image_pre_process_v8.py
python pointcloud_pre_process_v1.py
python generate_range_image.py
python generate_bev_v2_pc_nonground.py
python generate_UTM_coords_v1.py
cd ../../../Depth-Anything/metric_depth
python generate_bev_v2_sobel.py

3) Training

Configure script.sh correctly: set the data_path, weight_path, config_path, need_eval, seed, and local_rank parameters. seed is fixed to 3407 and config_path is configs/model_5/phase_5_standard.py. Then run the following command to train the model.

bash script.sh

4) Inference

You can train from scratch to obtain your own model weights or just use our pretrained model weights; then run the following command to evaluate the model.

python Kitti_evaluate.py

5) Citation

If you find this work useful in your research, please consider citing:

@misc{peng2025rangebirdseyeview,
      title={Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition}, 
      author={Jianyi Peng and Fan Lu and Bin Li and Yuan Huang and Sanqing Qu and Guang Chen},
      year={2025},
      eprint={2502.11742},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.11742}, 
}
