RangeBEV

Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition

1) Introduction

This repository contains the code for our proposed method "RangeBEV". The paper is Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition (https://arxiv.org/abs/2502.11742).

We propose an innovative initial retrieval + re-rank method that effectively combines information from range (or RGB) images and Bird's Eye View (BEV) images. Our approach relies solely on a computationally efficient global descriptor similarity search process to achieve re-ranking. Additionally, we introduce a novel similarity label supervision technique to maximize the utility of limited training data.
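
As a rough, hypothetical illustration of the initial retrieval + re-rank idea (the descriptor names, dimensions, and similarity measure below are placeholders, not the repo's actual pipeline), a query's range/RGB global descriptor first retrieves the top-k database candidates, which are then re-ranked by BEV global descriptor similarity:

# Hypothetical sketch of "initial retrieval + re-rank" with global descriptors.
# Shapes and variable names are illustrative, not this repo's actual API.
import numpy as np

def l2_normalize(x):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-12)

def retrieve_and_rerank(q_range_desc, q_bev_desc, db_range_desc, db_bev_desc, topk=20):
    # Stage 1: initial retrieval by range/RGB global descriptor similarity.
    sim = l2_normalize(db_range_desc) @ l2_normalize(q_range_desc)   # (N,)
    cand = np.argsort(-sim)[:topk]                                   # top-k candidate indices
    # Stage 2: re-rank only the candidates by BEV global descriptor similarity.
    bev_sim = l2_normalize(db_bev_desc[cand]) @ l2_normalize(q_bev_desc)
    return cand[np.argsort(-bev_sim)]                                # final ranking

# Toy example with random 256-D descriptors and 1000 database entries.
rng = np.random.default_rng(0)
db_r, db_b = rng.normal(size=(1000, 256)), rng.normal(size=(1000, 256))
q_r, q_b = rng.normal(size=256), rng.normal(size=256)
print(retrieve_and_rerank(q_r, q_b, db_r, db_b)[:5])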

Experimental results on the KITTI dataset demonstrate that our method significantly outperforms state-of-the-art approaches.

1、Pipeline

[pipeline figure]

2、Quantitative Results

[quantitative results figure]

3、Qualitative Results

[qualitative results figure]

2) Getting Started

1、Environment Setup

We use PyTorch and the MMSegmentation Library. We acknowledge their great contributions!

conda create -yn RangeBEV python=3.8
conda activate RangeBEV
pip install -r requirements.txt

If you encounter problems installing mmsegmentation, MinkowskiEngine, faiss, open3d, vision3d, etc., please refer to their official websites for help.

2、Download Datasets

KITTI Odometry dataset

First, log in to the KITTI official website and download the odometry dataset: the "color", "velodyne laser data", "calibration files" and "ground truth poses" .zip files. Unzip them into the folder structure described in the official guide. Then manually create a folder named "my_tool" for later use.

SemanticKITTI dataset

Next, download the SemanticKITTI label data from the official website; it is used as the ground truth for training the model on sequences 11~21.

After this step, you should have the following folder structure in a data root folder:

KITTI/
├── dataset/
│   ├── poses/
│   ├── sequences/
├── my_tool/

semanticKITTI/
├── dataset/
│   ├── sequences/
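
If helpful, here is a small hypothetical sanity check (not part of this repo; adjust DATA_ROOT to your own path) to confirm the layout before preprocessing:

# Hypothetical helper: verify the expected KITTI / SemanticKITTI layout.
from pathlib import Path

DATA_ROOT = Path("/path/to/your/data_root")   # adjust to your data root
expected = [
    "KITTI/dataset/poses",
    "KITTI/dataset/sequences",
    "KITTI/my_tool",
    "semanticKITTI/dataset/sequences",
]
for rel in expected:
    status = "ok" if (DATA_ROOT / rel).is_dir() else "MISSING"
    print(f"{status:7s} {rel}")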

Boreas dataset

Additionally, if you want to run the model on the Boreas dataset, you can download it from the official website. The required sequences are listed in datasets/Boreas_dp/mv_boreas_minuse.py, and only the LiDAR and camera sensor data are needed. We only run the single-modal and RGB-to-point-cloud cross-modal experiments on Boreas; if you want to run our proposed full method on it, you may need further data preprocessing.

3、Prepare Models

Prepare the pretrained model weights for the RGB image branch from here; the required file is fpn_r50_512x512_160k_ade20k_20200718_131734-5b5a6ab9.pth. You should first modify it to change the key names in its state_dict to match our model architecture, then save it as /path/to/your/fpn_r50_512x512_160k_ade20k_20200718_131734-5b5a6ab9_v2.pth, as sketched below.
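
A minimal sketch of that conversion, assuming a simple prefix rename is sufficient (the actual old_prefix/new_prefix mapping depends on the RGB-branch model definition in this repo):

# Sketch only: load the MMSegmentation FPN checkpoint, remap state_dict keys,
# and save the result as the *_v2.pth file expected by this repo.
# The old_prefix/new_prefix values below are placeholders, not the real mapping.
import torch

ckpt = torch.load("fpn_r50_512x512_160k_ade20k_20200718_131734-5b5a6ab9.pth",
                  map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)

old_prefix, new_prefix = "backbone.", "image_encoder."   # placeholder mapping
new_state_dict = {
    (new_prefix + k[len(old_prefix):]) if k.startswith(old_prefix) else k: v
    for k, v in state_dict.items()
}

torch.save({"state_dict": new_state_dict},
           "/path/to/your/fpn_r50_512x512_160k_ade20k_20200718_131734-5b5a6ab9_v2.pth")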

Prepare the KITTI pretrained weights for the monocular metric depth estimation model from here; the file depth_anything_metric_depth_outdoor.pt is needed. Next, clone the Depth-Anything repository and install it according to the official guide. After that, move the datasets/Kitti_Odometry_dp/generate_bev_v2_sobel.py and datasets/Kitti_Odometry_dp/my_pykitti_odometry.py files into the yourfolder/Depth-Anything/metric_depth/ folder, and set the pretrained weights path in them to the correct location.

4、Data Preprocessing

Run the following commands in order to preprocess the data.

cd datasets/Kitti_Odometry_dp
python rgb_image_pre_process_v8.py
python pointcloud_pre_process_v1.py
python generate_range_image.py
python generate_bev_v2_pc_nonground.py
python generate_UTM_coords_v1.py
cd ../../../Depth-Anything/metric_depth
python generate_bev_v2_sobel.py

3) Training

Configure script.sh correctly: set the data_path, weight_path, config_path, need_eval, seed, and local_rank parameters. seed is fixed to 3407 and config_path is configs/model_5/phase_5_standard.py. Then run the following command to train the model.

bash script.sh

4) Inference

You can train from scratch to obtain your own model weights or just use our pretrained model weights; then run the following command to evaluate the model.

python Kitti_evaluate.py

5) Citation

If you find this work useful in your research, please consider citing:

@misc{peng2025rangebirdseyeview,
      title={Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition}, 
      author={Jianyi Peng and Fan Lu and Bin Li and Yuan Huang and Sanqing Qu and Guang Chen},
      year={2025},
      eprint={2502.11742},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.11742}, 
}
