FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter

The core idea in one line of code:

model_input = (mel_spec @ mel_filter.pinverse()).abs().clamp_min(1e-5)
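The line above multiplies the mel spectrogram by the pseudo-inverse of the mel filter to get a rough amplitude spectrum, then takes the absolute value and clamps it to a small floor. Below is a minimal sketch of the same idea, written in NumPy rather than with the PyTorch calls used above. The helper names `triangular_filterbank` and `approx_amplitude` are hypothetical, the linearly spaced triangular filterbank is only a stand-in for a real mel filter, and the sizes (513 frequency bins, 80 bands, 4 frames) are illustrative assumptions.

```python
import numpy as np


def triangular_filterbank(n_freq: int, n_bands: int) -> np.ndarray:
    """Stand-in for a mel filter: linearly spaced triangles, shape (n_freq, n_bands)."""
    centers = np.linspace(0, n_freq - 1, n_bands + 2)
    bins = np.arange(n_freq)[:, None]  # (n_freq, 1), broadcast against the bands
    left, mid, right = centers[:-2], centers[1:-1], centers[2:]
    rising = (bins - left) / (mid - left)
    falling = (right - bins) / (right - mid)
    # Each column is one triangle; values outside the triangle are clipped to 0.
    return np.clip(np.minimum(rising, falling), 0.0, None)


def approx_amplitude(mel_spec: np.ndarray, mel_filter: np.ndarray,
                     floor: float = 1e-5) -> np.ndarray:
    """NumPy analogue of (mel_spec @ mel_filter.pinverse()).abs().clamp_min(floor)."""
    pinv = np.linalg.pinv(mel_filter)  # (n_bands, n_freq)
    return np.clip(np.abs(mel_spec @ pinv), floor, None)


rng = np.random.default_rng(0)
mel_filter = triangular_filterbank(n_freq=513, n_bands=80)
spec = rng.random((4, 513)) + 0.1           # fake amplitude spectrum, (frames, n_freq)
mel_spec = spec @ mel_filter                # (frames, n_bands)
model_input = approx_amplitude(mel_spec, mel_filter)
print(model_input.shape)  # (4, 513)
```

The estimate is not the original spectrum, because the filterbank discards information. Before the absolute value and clamp are applied, however, projecting it back through the filter reproduces the mel spectrogram, which is why it makes a reasonable starting point for the model.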

Official Repository of the paper: FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter

Audio samples at: https://bakerbunker.github.io/FreeV/

Model checkpoints and TensorBoard training logs are available on Hugging Face.

Requirements

git clone https://github.com/BakerBunker/FreeV.git
cd FreeV
pip install -r requirements.txt

Configs

I tried using PGHI (Phase Gradient Heap Integration) to initialize the phase spectrum, but sadly it didn't work.

The config files and training scripts for each setting are listed below. Run diff <train-script> <train-script> on any two scripts to see how the settings differ.

Model	Config File	Train Script
APNet2	config.json	train.py
APNet2 w/pghi	config_pghi.json	train_pghi.py
FreeV	config2.json	train2.py
FreeV w/pghi	config2_pghi.json	train2_pghi.py
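Where the diff command is not available, the same comparison can be sketched with Python's standard difflib module. The helper name `script_diff` is hypothetical. The example only runs the comparison if train.py and train2.py from the table are present in the current directory.

```python
import difflib
from pathlib import Path


def script_diff(path_a: str, path_b: str) -> str:
    """Return a unified diff between two text files (empty string if identical)."""
    a = Path(path_a).read_text(encoding="utf-8").splitlines(keepends=True)
    b = Path(path_b).read_text(encoding="utf-8").splitlines(keepends=True)
    return "".join(difflib.unified_diff(a, b, fromfile=path_a, tofile=path_b))


# Compare the APNet2 and FreeV training scripts when run from the repository root.
if Path("train.py").exists() and Path("train2.py").exists():
    print(script_diff("train.py", "train2.py"))
```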

Training

python <train-script>

Checkpoints and a copy of the configuration file are saved in the directory given by checkpoint_path in config.json.

To change the training or inference configuration, edit the parameters in config.json.
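Editing the file by hand is enough, but for repeated experiments it can help to script the change. The following is a minimal sketch that assumes config.json is a JSON object with checkpoint_path as a top-level key. The helper name `update_config`, the output file name config_local.json, and the directory checkpoints/my_run are hypothetical.

```python
import json
from pathlib import Path


def update_config(src: str, dst: str, **overrides) -> dict:
    """Copy a JSON config to dst, replacing only keys that already exist in it."""
    config = json.loads(Path(src).read_text(encoding="utf-8"))
    unknown = [key for key in overrides if key not in config]
    if unknown:
        # Refuse to add new keys, so a typo does not silently create a new setting.
        raise KeyError(f"keys not present in {src}: {unknown}")
    config.update(overrides)
    Path(dst).write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config


# Write a modified copy with a different checkpoint directory (assumed key name).
if Path("config.json").exists():
    update_config("config.json", "config_local.json",
                  checkpoint_path="checkpoints/my_run")
```

This sketch writes a separate copy instead of overwriting config.json. Which config file each training script reads is not stated here, so check the script before relying on the copy.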

Inference

Download the model pretrained on the LJSpeech dataset from Hugging Face.

Edit inference.py as needed, then run it to perform inference.

Model Structure

Comparison with other models

Acknowledgements

We referred to APNet2 to implement this.

See the code changes at this commit

Citation

@misc{lv2024freevfreelunchvocoders,
      title={FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter}, 
      author={Yuanjun Lv and Hai Li and Ying Yan and Junhui Liu and Danming Xie and Lei Xie},
      year={2024},
      eprint={2406.08196},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2406.08196}, 
}
