
Generative Point Tracking with Flow Matching

An example of generative point tracking

GenPT is the first generative point tracker. Where conventional discriminative models fall short in capturing multi-modality, GenPT directly models the multi-modality inherent to point tracking through Flow Matching.

By modifying flow matching with iterative refinement, a window-dependent prior, and a specialized variance schedule, GenPT effectively captures uncertainty, particularly behind occlusions. Because GenPT can sample trajectories from plausible modes of the solution space, a best-first search over generated samples, guided by the model's confidence, can be used to improve tracking accuracy. GenPT achieves state-of-the-art performance while using few parameters and running at a fast inference speed compared to baselines.

News

  • [Oct 21, 2025] Initial release. We are actively adding to this repo, so please ping or open an issue if you notice something missing or broken!

Installation instructions

Install miniconda:

mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh
source ~/miniconda3/bin/activate
conda init

Clone the repo and set up a new conda environment for GenPT:

git clone https://github.com/tesfaldet/genpt
cd genpt
conda env create -f environment.yaml
conda activate genpt

Data preparation

Datasets used

Dataset Version Link Notes
PointOdyssey 1.4 download v1.4 contains some bug fixes but it shouldn't dramatically change results.
CoTracker3_Kubric N/A website For training only.
Dynamic Replica N/A website For testing only.
TAP-Vid DAVIS N/A website For testing only.
TAP-Vid Kinetics N/A website For testing only.
TAP-Vid RGB-S N/A website For testing only.
TAP-Vid RoboTAP N/A website For testing only.

Datasets should be downloaded to ./data/datasets. Please make sure the paths match those in ./configs/local/default.yaml.

Make sure to set the PROJECT_SCRATCH environment variable in the .env file to the appropriate path.
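
For reference, a minimal .env sketch (the path below is a placeholder, not a value from this repo; adjust it to your own scratch location):

# hypothetical example: replace with your actual scratch directory
PROJECT_SCRATCH=/path/to/your/scratch

The CLUSTER_PARTITION and CLUSTER_GPU_TYPE variables used for remote SLURM training (see the Training section) live in the same file.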

Evaluation

Available models

Model Version Trained on Checkpoint
GenPT 1.0.0 PointOdyssey download
GenPT 1.0.0 CoTracker3_Kubric download

Checkpoints should be downloaded to ./data/checkpoints.

Model checkpoints for ablated models and the discriminative variant will be made available at a later time.

Single GPU example

To evaluate GenPT on PointOdyssey with a single GPU, execute the following:

python src/evaluate.py trainer=gpu trainer.devices=1 experiment=evaluate_genpt_fm_pointodyssey ckpt_path=<PATH>

where <PATH> is set to a downloaded GenPT checkpoint, e.g., ./data/checkpoints/genpt_fm_podv14.ckpt.

Local DDP example

To evaluate GenPT on PointOdyssey with 4 GPUs, execute the following:

python src/evaluate.py trainer=ddp trainer.devices=4 experiment=evaluate_genpt_fm_pointodyssey ckpt_path=<PATH>

where <PATH> is set to a downloaded GenPT checkpoint, e.g., ./data/checkpoints/genpt_fm_podv14.ckpt.

PointOdyssey and Dynamic Replica benchmarks

To evaluate on PointOdyssey, set experiment=evaluate_genpt_fm_pointodyssey. To evaluate on Dynamic Replica, set experiment=evaluate_genpt_fm_dynamic_replica.
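
For example, a hypothetical single-GPU run on Dynamic Replica, following the same command pattern as above (with <PATH> again set to a downloaded checkpoint):

python src/evaluate.py trainer=gpu trainer.devices=1 experiment=evaluate_genpt_fm_dynamic_replica ckpt_path=<PATH>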

TAP-Vid benchmark

To evaluate on DAVIS, Kinetics, RGB-S, or RoboTAP, set experiment=evaluate_genpt_fm_tapvid_<NAME>, where <NAME> is one of [davis, kinetics, rgbs, robotap].
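
For example, a hypothetical single-GPU run on TAP-Vid DAVIS (with <PATH> set to a downloaded checkpoint):

python src/evaluate.py trainer=gpu trainer.devices=1 experiment=evaluate_genpt_fm_tapvid_davis ckpt_path=<PATH>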

TAP-Vid Sliding Occluder benchmark

To enable the sliding occluder while evaluating on any of the TAP-Vid datasets mentioned above, set data.test_datasets.0.occluder_direction=<DIRECTION>, where <DIRECTION> is one of [lr, rl, tb, bt] (corresponding to left-to-right, right-to-left, top-to-bottom, and bottom-to-top, respectively).
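
For example, a hypothetical single-GPU run on TAP-Vid DAVIS with a left-to-right sliding occluder (with <PATH> set to a downloaded checkpoint):

python src/evaluate.py trainer=gpu trainer.devices=1 experiment=evaluate_genpt_fm_tapvid_davis data.test_datasets.0.occluder_direction=lr ckpt_path=<PATH>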

Improving results with best-first search

To pick the best of N generated trajectories (within each window, in a causal fashion), guided by GenPT's confidence estimates, you will need to modify the test_step() function in ./src/models/genpt_fm_lightning_module.py. Specifically, set the following variables:

  • num_samples = <N> (<N> is some integer value, like 5)
  • search_mode = "greedy"

Beam search can be used instead of a simple greedy search:

  • num_samples = <N> (<N> is some integer value, like 5)
  • search_mode = "beam"
  • beam_width = <W> (respect the inequality 1 <= beam_width <= num_samples)

The user-friendliness of this functionality will eventually be improved.

Training

Below are examples of training GenPT on PointOdyssey in a variety of compute setups. For the paper, we used 4 GPUs in total with everything else set to default. To train on CoTracker3_Kubric, set experiment=train_trainer_tapvid_kubric.
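
For example, a hypothetical single-GPU run on CoTracker3_Kubric, assuming the experiment name given above:

python src/train.py experiment=train_trainer_tapvid_kubric trainer=gpu model=genpt_fm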

Single GPU, single node example

To train GenPT on PointOdyssey with a single GPU on a single node, execute the following:

python src/train.py experiment=train_tracker_pointodyssey trainer=gpu model=genpt_fm

Local DDP example

To train with local DDP on a single node and 4 GPUs with Tensorboard logging, execute the following:

python src/train.py experiment=train_tracker_pointodyssey trainer=ddp trainer.devices=4 trainer.num_nodes=1 logger=tensorboard model=genpt_fm

Remote DDP example

To train on a remote SLURM cluster using 2 nodes and 2 GPUs each, with Tensorboard logging, execute the following:

python src/train.py --multirun experiment=train_tracker_pointodyssey trainer=ddp trainer.devices=2 trainer.num_nodes=2 logger=tensorboard hydra=submitit_remote_launcher model=genpt_fm

This requires the CLUSTER_PARTITION and CLUSTER_GPU_TYPE env vars in .env to be modified accordingly.

Disabling logging

An example of local single GPU training of GenPT with no logging:

python src/train.py experiment=train_tracker_pointodyssey trainer=gpu logger=null callbacks.learning_rate_monitor=null model=genpt_fm

Contributing and issue reporting

We welcome contributions to this project. If you would like to contribute, please open a pull request and Mattie will take a look as soon as possible.

For reporting issues, such as bugs or missing information, please create a new issue.

Citing GenPT

If you use this code for your research, please consider giving it a star ⭐️ and citing its research paper:

Mattie Tesfaldet, Adam W. Harley, Konstantinos G. Derpanis, Derek Nowrouzezahrai, and Christopher Pal. Generative Point Tracking with Flow Matching. arXiv preprint, 2025.

Bibtex format:

@article{tesfaldet2025,
  title = {{Generative Point Tracking with Flow Matching}},
  author = {Tesfaldet, Mattie and Harley, Adam W and Derpanis, Konstantinos G and Nowrouzezahrai, Derek and Pal, Christopher},
  journal = {arXiv preprint},
  year = {2025}
}
