Code, results, and an interactive demo for a Spectrogram U-Net that suppresses mains hum in music recordings.
The repo trains a U-Net that predicts a soft mask over magnitude spectrograms and removes the hum while preserving the instrument sound.
- Input: single-channel 16 kHz WAV
- Output: denoised WAV (same length)
- Evaluation metrics: SI-SDR ↑ / PSNR ↑
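For reference, SI-SDR can be computed from a pair of waveforms as in the minimal NumPy sketch below; the `eps` guard and mean-removal details are illustrative assumptions, not necessarily identical to the evaluation script:

```python
import numpy as np

def si_sdr(est: np.ndarray, ref: np.ndarray, eps: float = 1e-8) -> float:
    """Scale-invariant SDR in dB (higher is better)."""
    ref = ref - ref.mean()
    est = est - est.mean()
    # Project the estimate onto the reference to find the optimal scaling,
    # so a simple gain mismatch does not count as distortion.
    alpha = np.dot(est, ref) / (np.dot(ref, ref) + eps)
    target = alpha * ref
    noise = est - target
    return 10 * np.log10((target @ target + eps) / (noise @ noise + eps))
```

Scale invariance is the point: `si_sdr(2 * ref, ref)` is just as high as `si_sdr(ref, ref)`.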
| Split | Files | Duration |
|---|---|---|
| train | 2 621 | ≈ 72 h |
| val | 562 | ≈ 15 h |
| test | 188 | ≈ 5 h |
Each example contains:
- Clean music excerpts (from Neo Scholars Take Home Google Drive)
- Synthetic 60 Hz + harmonics + random phase + slowly-varying amplitude
- Weighted mix at SNR ∈ {0, 5, 10, 15} dB
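The hum synthesis and mixing steps above can be sketched roughly as follows. The harmonic count, the ~0.3 Hz amplitude modulation, the 16 kHz default rate, and the helper names (`make_hum`, `mix_at_snr`) are illustrative assumptions, not the exact code in `dataset/build_dataset.py`:

```python
import numpy as np

def make_hum(n, sr=16000, f0=60.0, n_harm=5, rng=None):
    """60 Hz fundamental plus harmonics, random phase, slowly varying amplitude."""
    rng = rng or np.random.default_rng()
    t = np.arange(n) / sr
    hum = np.zeros(n)
    for k in range(1, n_harm + 1):
        phase = rng.uniform(0, 2 * np.pi)
        # Slow (~0.3 Hz) amplitude modulation so the hum level drifts over time.
        am = 1.0 + 0.3 * np.sin(2 * np.pi * 0.3 * t + rng.uniform(0, 2 * np.pi))
        hum += am * np.sin(2 * np.pi * f0 * k * t + phase) / k  # 1/k harmonic roll-off
    return hum

def mix_at_snr(clean, noise, snr_db):
    """Scale the noise so that 10*log10(P_clean / P_noise) == snr_db, then add it."""
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + gain * noise
```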
Scripts live in `dataset/`:

```bash
# Download music, synthesise hum, build CSV metadata
python dataset/build_dataset.py --root dataset
```

The model is a Spectrogram U-Net (Oh, Kim, & Yun, 2018) operating on STFT magnitudes (513 bins):
```
x ──▶ mag (B,1,513,T) ── inc ── down1 ··· down4 ── up1 ··· up4 ── out ── sigmoid ──▶ soft mask ∈ [0,1]
                                   └──────── skip connections ─────┘
```
- Base channels = 32 (≈ 1.7 M params)
- Bilinear up-sampling; sigmoid on the logits yields the mask
- Estimated complex STFT: mask × noisy STFT, then iSTFT back to the waveform
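The masking step can be sketched in PyTorch as below, assuming `n_fft = 1024` (which gives the 513 bins mentioned above) and a hop of 256; `apply_mask` and the exact STFT settings are assumptions for illustration, not the repo's inference code:

```python
import torch

def apply_mask(noisy_wav, model, n_fft=1024, hop=256):
    """Denoise by masking the magnitude STFT and reusing the noisy phase.
    `model` is assumed to map (B, 1, 513, T) magnitudes to a mask in [0, 1]."""
    window = torch.hann_window(n_fft)
    spec = torch.stft(noisy_wav, n_fft, hop_length=hop, window=window,
                      return_complex=True)                 # (B, 513, T) complex
    mag = spec.abs().unsqueeze(1)                          # (B, 1, 513, T)
    mask = model(mag).squeeze(1)                           # (B, 513, T), sigmoid output
    est_spec = mask * spec                                 # real mask scales the complex STFT
    return torch.istft(est_spec, n_fft, hop_length=hop, window=window,
                       length=noisy_wav.shape[-1])
```

A sanity check of the plumbing: with an all-ones mask the round trip should reconstruct the input waveform.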
Hyper-parameters (see `train_spectrogram_unet.py`):

| argument | value | note |
|---|---|---|
| `--epochs` | 50 | cosine LR decay |
| `--lr` | 3 × 10⁻⁴ | AdamW |
| `--batch` | 16 | 1-s crops (44 100 samples) |
| Loss | 0.7 · L1(spec) + 0.3 · L1(time) | |
| Aug. | random crops, amplitude normalisation | |
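The loss row can be read as the sketch below; only the 0.7/0.3 weighting comes from the table, while the STFT settings (`n_fft = 1024`, hop 256) are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def combined_loss(est_wav, clean_wav, n_fft=1024, hop=256):
    """0.7 · L1 on magnitude spectrograms + 0.3 · L1 on waveforms."""
    window = torch.hann_window(n_fft, device=est_wav.device)
    est_mag = torch.stft(est_wav, n_fft, hop_length=hop, window=window,
                         return_complex=True).abs()
    ref_mag = torch.stft(clean_wav, n_fft, hop_length=hop, window=window,
                         return_complex=True).abs()
    # Spectral term shapes the harmonic content; time term keeps phase/alignment honest.
    return 0.7 * F.l1_loss(est_mag, ref_mag) + 0.3 * F.l1_loss(est_wav, clean_wav)
```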
Training vs. validation loss and SI-SDR stay closely aligned, indicating the model converges without over-fitting.
| method | split | SI-SDR [dB] | PSNR [dB] |
|---|---|---|---|
| identity | test | 7.50 | 44.24 |
| notch | test | 7.04 | 43.93 |
| unet (`best.pt`) | test | 26.30 | 62.79 |

- +18.8 dB SI-SDR over the raw noisy input ⇒ ≈ 75× noise-power reduction
- A fixed notch filter barely helps because music energy also sits at 60 Hz and its harmonics.
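For context, a fixed-notch baseline along these lines can be built with SciPy's `iirnotch` (the Q factor and harmonic count here are assumed, not the repo's exact baseline). It removes narrow bands at 60 Hz and multiples, and necessarily removes any musical energy sitting there too:

```python
import numpy as np
from scipy.signal import iirnotch, filtfilt

def notch_baseline(wav, sr=16000, f0=60.0, n_harm=5, q=30.0):
    """Cascade of fixed IIR notches at 60 Hz and its harmonics."""
    out = wav
    for k in range(1, n_harm + 1):
        freq = f0 * k
        if freq >= sr / 2:          # skip harmonics above Nyquist
            break
        b, a = iirnotch(freq, q, fs=sr)
        out = filtfilt(b, a, out)   # zero-phase filtering, no group delay
    return out
```

A 60 Hz tone is almost entirely removed while a 1 kHz tone passes through nearly untouched; the catch is that a musical bass note at 60 Hz is removed just as completely.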
Generate CSV & figures:

```bash
python scripts/evaluate_models.py \
    --dataset dataset --split test --checkpoint checkpoints/best.pt \
    --out results/results.csv
python scripts/plot_results.py results/results.csv
```

| noisy | denoised |
|---|---|
| sax – noisy 🎧 | sax – denoised 🎧 |
```bash
# create env (conda or venv) and install deps
pip install -r requirements.txt

# Train (≈ 5 h on one RTX 3060; CPU works but is slower)
python train_spectrogram_unet.py --dataset dataset --epochs 50 --batch 16

# Evaluate best checkpoint
python scripts/evaluate_models.py --dataset dataset --split test \
    --checkpoint checkpoints/best.pt --out results/results.csv

# Launch the interactive demo
python scripts/denoise_demo.py
```

Then open http://127.0.0.1:7860 and drag-and-drop a WAV.
- Set random seeds in `train_spectrogram_unet.py` (disabled by default).
- All checkpoints, CSVs and figures are version-controlled – see `checkpoints/best.pt`.
- Oh, J., Kim, D., & Yun, S.-Y. (2018). Spectrogram-channels U-Net: A source separation model viewing each channel as the spectrogram of each source. arXiv preprint arXiv:1810.11520.
- Zhang, Y., & Li, J. (2023). BirdSoundsDenoising: Deep Visual Audio Denoising for Bird Sounds. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 3164–3173.
- Babaev, N., Tamogashev, K., Saginbaev, A., Shchekotov, I., Bae, H., Sung, H., Lee, W., Cho, H.-Y., & Andreev, P. (2024). FINALLY: Fast and Universal Speech Enhancement with Studio-like Quality. Advances in Neural Information Processing Systems (NeurIPS).
- Zhang, Y., & Li, J. (2023). Complex Image Generation SwinTransformer Network for Audio Denoising. Proceedings of INTERSPEECH 2023, Dublin, Ireland.
Made with ❤️ & PyTorch 2.7 ✨
