@@ -179,30 +186,27 @@ Using `torch.compile` in addition to the accelerated transformers implementation
 |V100|8||43.95|43.37|42.25|3.87||
 |V100|16||84.99|84.73|82.55|2.87||
 ||||||||
-|3090|1|7.09|6.78|6.11|6.03|11.06|14.95|
-|3090|4|22.69|21.45|18.67|18.09|15.66|20.27|
-|3090|8||42.59|36.75|35.59|16.44||
-|3090|16||85.35|72.37|70.25|17.69||
-|3090|32(1)||162.05|138.99|134.53|16.98||
-|3090|48||241.91|207.75||14.12||
+|3090|1|7.09|6.78|5.34|5.35|21.09|24.54|
+|3090|4|22.69|21.45|18.56|18.18|15.24|19.88|
+|3090|8||42.59|36.68|35.61|16.39||
+|3090|16||85.35|72.93|70.18|17.77||
+|3090|32(1)||162.05|143.46|138.67|14.43||
 ||||||||
-|3090Ti|1|6.45|6.19|5.64|5.49|11.31|14.88|
-|3090Ti|4|20.32|19.31|16.9|16.37|15.23|19.44|
-|3090Ti|8(2)||37.93|33.05|31.99|15.66||
-|3090Ti|16||75.37|65.25|64.32|14.66||
-|3090Ti|32(1)||142.55|124.44|120.74|15.30||
-|3090Ti|48||213.19|186.55||12.50||
+|3090Ti|1|6.45|6.19|4.99|4.89|21.00|24.19|
+|3090Ti|4|20.32|19.31|17.02|16.48|14.66|18.90|
+|3090Ti|8||37.93|33.21|32.24|15.00||
+|3090Ti|16||75.37|66.63|64.5|14.42||
+|3090Ti|32(1)||142.55|128.89|124.92|12.37||
 ||||||||
-|4090|1|5.54|4.99|4.51|4.44|11.02|19.86|
-|4090|4|13.67|11.4|10.3|9.84|13.68|28.02|
-|4090|8||19.79|17.13|16.19|18.19||
-|4090|16||38.62|33.14|32.31|16.34||
-|4090|32(1)||76.57|65.96|62.05|18.96||
-|4090|48||114.44|98.78||13.68||
-
+|4090|1|5.54|4.99|2.66|2.58|48.30|53.43|
+|4090|4|13.67|11.4|8.81|8.46|25.79|38.11|
+|4090|8||19.79|17.55|16.62|16.02||
+|4090|16||38.62|35.65|34.07|11.78||
+|4090|32(1)||76.57|69.48|65.35|14.65||
+|4090|48||114.44|106.3||7.11||
 
 
-(1)BatchSize >= 32 requires enable_vae_slicing() because of https://github.com/pytorch/pytorch/issues/81665.
-This is required for PyTorch 1.13.1, and also for PyTorch 2.0 and batch size of 64.
+(1)BatchSize >= 32 requires enable_vae_slicing() because of https://github.com/pytorch/pytorch/issues/81665.
+This is required for PyTorch 1.13.1, and also for PyTorch 2.0 and large batch sizes.
 
-For more details about how this benchmark was run, please refer to [this PR](https://github.com/huggingface/diffusers/pull/2303).
+For more details about how this benchmark was run, please refer to [this PR](https://github.com/huggingface/diffusers/pull/2303) and to [the blog post](https://pytorch.org/blog/accelerated-diffusers-pt-20/).
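
A note on reading the updated rows: the table header is outside this hunk, so the column meanings here are an assumption. If columns 3 to 6 are timings for vanilla attention, xFormers, PyTorch 2.0 SDPA, and SDPA with `torch.compile`, then the last two columns look like the relative time saved by the fastest PyTorch 2.0 timing against xFormers and against vanilla attention. A minimal sketch that checks this against two of the added rows (the function name `speedup_percent` is hypothetical):

```python
def speedup_percent(baseline: float, optimized: float) -> float:
    """Relative time saved versus the baseline, as a percentage."""
    return (baseline - optimized) / baseline * 100.0


# (vanilla, xFormers, SDPA + torch.compile) copied from the added batch-size-1 rows.
# The column interpretation is an assumption; the header is not part of this hunk.
rows = {
    "3090": (7.09, 6.78, 5.35),
    "4090": (5.54, 4.99, 2.58),
}

for gpu, (vanilla, xformers, compiled) in rows.items():
    over_xformers = speedup_percent(xformers, compiled)
    over_vanilla = speedup_percent(vanilla, compiled)
    print(f"{gpu}: {over_xformers:.2f}% over xFormers, {over_vanilla:.2f}% over vanilla")
```

Under that reading the recomputed values match the table to two decimals (21.09 and 24.54 for the 3090 row, 48.30 and 53.43 for the 4090 row). For rows without a `torch.compile` timing, the reported figure appears to use the plain SDPA timing instead.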