Commit acf8ac8

Update README to reflect corrections in processed_audio and target_audio tensor shapes
1 parent: 20811a7

1 file changed, +5 -5 lines changed

README.md

Lines changed: 5 additions & 5 deletions
@@ -39,22 +39,22 @@ log_wmse = LogWMSE(
 # Generate random inputs (scale between -1 and 1)
 audio_lengths_samples = int(audio_length * sample_rate)
 unprocessed_audio = 2 * torch.rand(batch, audio_channels, audio_lengths_samples) - 1
-processed_audio = unprocessed_audio.unsqueeze(1).expand(-1, audio_stems, -1, -1) * 0.1
-target_audio = torch.zeros(batch, audio_stems, audio_channels, audio_lengths_samples)
+processed_audio = 2 * torch.rand(batch, audio_channels, audio_stems, audio_lengths_samples) - 1
+target_audio = torch.zeros(batch, audio_channels, audio_stems, audio_lengths_samples)
 
 log_wmse = log_wmse(unprocessed_audio, processed_audio, target_audio)
 print(log_wmse) # Expected output: approx. -18.42
 ```
 
 logWMSE accepts three torch tensors of the following shapes:
 - unprocessed_audio: `[batch, audio_channels, samples]`
-- processed_audio: `[batch, audio_stems, audio_channels, samples]`
-- target_audio: `[batch, audio_stems, audio_channels, samples]`
+- processed_audio: `[batch, audio_channels, audio_stems, samples]`
+- target_audio: `[batch, audio_channels, audio_stems, samples]`
 
 Each dimension being:
 - `batch`: Number of audio files in a batch (i.e. batch size).
-- `audio_stems`: Number of separate audio sources. For source separation, this could be multiple different instruments, vocals, etc. For denoising audio, this will be 1.
 - `audio_channels`: Number of channels (i.e. 1 for mono and 2 for stereo).
+- `audio_stems`: Number of separate audio sources. For source separation, this could be multiple different instruments, vocals, etc. For denoising audio, this will be 1.
 - `samples`: Number of audio samples (e.g. 1 second of audio @ 44.1kHz is 44100 samples).
 
 ## Motivation

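The layout this commit documents, with `audio_channels` ahead of `audio_stems`, can be checked before the metric is called. The following is a minimal sketch under stated assumptions: `check_logwmse_shapes` is a hypothetical helper, not part of the library, and it works on plain shape tuples so it runs without torch installed.

```python
from typing import Sequence


def check_logwmse_shapes(
    unprocessed_shape: Sequence[int],
    processed_shape: Sequence[int],
    target_shape: Sequence[int],
) -> int:
    """Check the README layouts and return the number of stems.

    Hypothetical helper for illustration only; not part of the library.
    Expected layouts:
      unprocessed_audio: [batch, audio_channels, samples]
      processed_audio:   [batch, audio_channels, audio_stems, samples]
      target_audio:      [batch, audio_channels, audio_stems, samples]
    """
    if len(unprocessed_shape) != 3:
        raise ValueError("unprocessed_audio must be [batch, audio_channels, samples]")
    if len(processed_shape) != 4:
        raise ValueError(
            "processed_audio must be [batch, audio_channels, audio_stems, samples]"
        )
    if tuple(processed_shape) != tuple(target_shape):
        raise ValueError("processed_audio and target_audio must have identical shapes")

    batch, audio_channels, samples = unprocessed_shape
    p_batch, p_channels, audio_stems, p_samples = processed_shape

    # Every axis except audio_stems must agree with unprocessed_audio.
    if (p_batch, p_channels, p_samples) != (batch, audio_channels, samples):
        raise ValueError(
            "processed_audio does not match unprocessed_audio; "
            "expected [batch, audio_channels, audio_stems, samples]"
        )
    return audio_stems


# Example values chosen for illustration: 4 stereo clips, 3 stems, 1 s at 44.1 kHz.
batch, audio_channels, audio_stems, samples = 4, 2, 3, 44100
stems = check_logwmse_shapes(
    (batch, audio_channels, samples),
    (batch, audio_channels, audio_stems, samples),
    (batch, audio_channels, audio_stems, samples),
)
print(stems)  # 3
```

With torch tensors, `tensor.shape` can be passed directly, since `torch.Size` is a tuple subclass. A tensor still in the earlier `[batch, audio_stems, audio_channels, samples]` layout could be rearranged with `tensor.permute(0, 2, 1, 3)`. The check cannot detect a swapped layout when `audio_channels` equals `audio_stems`, because both orderings then have the same shape.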