| Feature | Description |
|---|---|
| 📡 Streaming Framework | Autoregressive multi-modal hand forecasting |
| ✋ Full-State Predictions | Hand type, 2D box, 3D pose, and trajectory |
| 🧠 ROI-Enhanced Memory | Temporal hand awareness |
| 🗣️ Language-guided | Follows natural language instructions |
💡 SFHand is the first streaming architecture for language-guided 3D hand forecasting.
SFHand predicts future hand dynamics from continuous egocentric video and natural language instructions. At each step, the model autoregressively outputs the full hand state: hand type, 2D bounding box, 3D hand pose, and 3D trajectory.

Key components: a streaming autoregressive transformer and an ROI-enhanced memory.
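For orientation, here is a minimal sketch of what one forecasted hand state could look like as a data structure. The field names and shapes are illustrative assumptions, not the repository's actual prediction API:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class HandState:
    """One forecasted hand state (illustrative; field names and
    shapes are assumptions, not SFHand's actual output format)."""
    hand_type: str          # "left" or "right"
    box_2d: np.ndarray      # (4,) 2D bounding box [x1, y1, x2, y2]
    pose_3d: np.ndarray     # e.g. (21, 3) 3D hand joint positions
    trajectory: np.ndarray  # (T, 3) future 3D wrist trajectory

# The model emits one such state per future timestep, conditioning
# each step on previously predicted states (autoregressive decoding).
```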
| Component | Status |
|---|---|
| EgoHaFL Dataset | ✅ |
| Pretraining Code | ✅ |
| Pretrained Weights | ✅ |
| Evaluation Code | ✅ |
| Embodied Evaluation (Franka Kitchen) | 🔜 Coming soon |
| 3D Hand Annotation Code | 🔜 Coming soon |
The project is developed and tested with torch 2.8.0+cu129.
```bash
git clone [email protected]:ut-vision/SFHand.git
conda env create -f environment.yml
conda activate sfhand
pip install -r requirements.txt
conda install -c conda-forge libgl
```

Download the MANO model and put MANO_LEFT.pkl and MANO_RIGHT.pkl under data/mano.
Download base_best.pt from the EgoHOD checkpoint and place it at ./pre_ckpt/base_best.pt.
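A quick, optional sanity check that the environment and assets are in place (a minimal sketch; the paths mirror the setup steps above):

```python
# check_setup.py -- verify torch/CUDA and the downloaded assets (illustrative)
import os
import torch

print(f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")

for path in [
    "data/mano/MANO_LEFT.pkl",
    "data/mano/MANO_RIGHT.pkl",
    "pre_ckpt/base_best.pt",
]:
    print(f"{path}: {'ok' if os.path.exists(path) else 'MISSING'}")
```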
EgoHaFL Dataset (annotations): 👉 https://huggingface.co/datasets/ut-vision/EgoHaFL
Videos originate from Ego4D V1: https://ego4d-data.org/. We use 224p compressed clips.
Directory structure:

```
EgoHaFL
├── EgoHaFL_lmdb
│   ├── data.mdb
│   └── lock.mdb
├── EgoHaFL_train.csv
├── EgoHaFL_test.csv
└── v1
    └── videos_224p
```
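To inspect the LMDB annotations before wiring up a dataloader, the standard `lmdb` package works (a sketch; the key/value encoding is dataset-specific, so look at the keys first):

```python
import lmdb

# Open read-only; lock=False avoids writing lock files next to the data.
env = lmdb.open("EgoHaFL/EgoHaFL_lmdb", readonly=True, lock=False)
with env.begin() as txn:
    print("entries:", txn.stat()["entries"])
    # Peek at the first few records to discover the key format.
    for i, (key, value) in enumerate(txn.cursor()):
        print(key, f"({len(value)} bytes)")
        if i >= 4:
            break
env.close()
```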
```bash
bash ./exps/pretrain.sh
```

⚠️ Before training, edit the configs in ./configs.
```bash
python main.py --config_file configs/config/clip_base_eval.yml --eval --vis
```

Output visualizations → ./render_results/
Download here:
👉 https://huggingface.co/ut-vision/SFHand
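To script the download instead of fetching files manually, `huggingface_hub` can mirror the whole repo (a sketch; the local directory is your choice):

```python
from huggingface_hub import snapshot_download

# Download all files from the model repo into a local checkpoint directory.
snapshot_download(repo_id="ut-vision/SFHand", local_dir="ckpt/sfhand")
```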
⏳ Coming soon — code will be added once finalized.
⏳ Coming soon — detailed annotation tools, formats, and processing scripts will be released once finalized.
```bibtex
@article{liu2025sfhand,
  title={SFHand: A Streaming Framework for Language-guided 3D Hand Forecasting and Embodied Manipulation},
  author={Liu, Ruicong and Huang, Yifei and Ouyang, Liangyang and Kang, Caixin and Sato, Yoichi},
  journal={arXiv preprint arXiv:2511.18127},
  year={2025}
}
```

SFHand builds on EgoHOD. Thanks to all contributors of the original codebase.

