
Commit ae535a2

Merge pull request #5 from gtamer2/torchscript
Torchscript
2 parents 93d16e2 + 5534bbf commit ae535a2

11 files changed: +548 -38 lines changed
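The six benchmark output files shown below sweep the data-loader configuration over batch_size in {1, 2, 4} and num_workers in {0, 1}, as their file names indicate. A minimal sketch of how such a sweep could build its loaders is given here; build_data_loader, dataset, and the pin_memory/shuffle settings are illustrative assumptions, not the repository's actual code.

from torch.utils.data import DataLoader

# Hypothetical sketch of the DataLoader sweep implied by the file names
# batch_size_{1,2,4}_num_workers_{0,1}; "dataset" is an assumed placeholder.
def build_data_loader(dataset, batch_size: int, num_workers: int) -> DataLoader:
    return DataLoader(
        dataset,
        batch_size=batch_size,
        num_workers=num_workers,   # 0 = load in the main process, 1 = one worker process
        pin_memory=True,           # assumption: common choice for GPU inference
        shuffle=False,
    )

# for batch_size in (1, 2, 4):
#     for num_workers in (0, 1):
#         loader = build_data_loader(dataset, batch_size, num_workers)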
benchmark_outputs/batch_size_1_num_workers_0.txt
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
Starting up...
Building data loaders...
Initializing Model...
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loaded in 12.05 seconds
Running inference benchmark...

Working on device: cuda
Starting BATCH 1 of 5
Finished Batch 1 of 5
Batch load time: 0.0010531009902479127
Batch inference time: 5.044861934002256
Batch total time: 5.045921023993287
Starting BATCH 2 of 5
Finished Batch 2 of 5
Batch load time: 0.000634292999166064
Batch inference time: 4.504916821009829
Batch total time: 4.505557062002481
Starting BATCH 3 of 5
Finished Batch 3 of 5
Batch load time: 0.0007395810098387301
Batch inference time: 4.533521624005516
Batch total time: 4.534268278002855
Starting BATCH 4 of 5
Finished Batch 4 of 5
Batch load time: 0.0006648830021731555
Batch inference time: 4.495368515010341
Batch total time: 4.496039069999824
Starting BATCH 5 of 5
Finished Batch 5 of 5
Batch load time: 0.0006519919988932088
Batch inference time: 4.496985145000508
Batch total time: 4.497643199007143

Manual Profile Results...
Data-loading times
> per epoch: tensor([0.0011, 0.0006, 0.0007, 0.0007, 0.0007])
> average: tensor(0.0007)

Inference time for each epoch
> per epoch tensor([5.0430, 4.5039, 4.5352, 4.4961, 4.4961])
> average tensor(4.6133)

Total time for each epoch
> per epoch tensor([5.0469, 4.5039, 4.5352, 4.4961, 4.4961])
> average tensor(4.6172)

Profiling sorted by CUDA time total
------------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
Name Self CPU % Self CPU CPU total % CPU total CPU time avg Self CUDA Self CUDA % CUDA total CUDA time avg CPU Mem Self CPU Mem CUDA Mem Self CUDA Mem # of Calls
------------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
run_benchmark 27.07% 6.249s 100.00% 23.083s 23.083s 0.000us 0.00% 15.070s 15.070s 0 b -2.66 Kb 8.13 Mb -4.18 Gb 1
benchmark_outputs/batch_size_1_num_workers_1.txt
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
Starting up...
Building data loaders...
Initializing Model...
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loaded in 65.44 seconds
Running inference benchmark...

Working on device: cuda
Starting BATCH 1 of 5
Finished Batch 1 of 5
Batch load time: 0.17542113500530832
Batch inference time: 5.1042240419919835
Batch total time: 5.27965795599448
Starting BATCH 2 of 5
Finished Batch 2 of 5
Batch load time: 0.11129688000073656
Batch inference time: 4.523343688008026
Batch total time: 4.634657599002821
Starting BATCH 3 of 5
Finished Batch 3 of 5
Batch load time: 0.0730158430087613
Batch inference time: 4.516634412007988
Batch total time: 4.589664141007233
Starting BATCH 4 of 5
Finished Batch 4 of 5
Batch load time: 0.07771697499265429
Batch inference time: 4.432252533995779
Batch total time: 4.509983689000364
Starting BATCH 5 of 5
Finished Batch 5 of 5
Batch load time: 0.0820326890097931
Batch inference time: 4.4701670890062815
Batch total time: 4.552215193005395

Manual Profile Results...
Data-loading times
> per epoch: tensor([0.1754, 0.1113, 0.0730, 0.0777, 0.0820])
> average: tensor(0.1039)

Inference time for each epoch
> per epoch tensor([5.1055, 4.5234, 4.5156, 4.4336, 4.4688])
> average tensor(4.6094)

Total time for each epoch
> per epoch tensor([5.2812, 4.6328, 4.5898, 4.5117, 4.5508])
> average tensor(4.7148)

Profiling sorted by CUDA time total
------------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
Name Self CPU % Self CPU CPU total % CPU total CPU time avg Self CUDA Self CUDA % CUDA total CUDA time avg CPU Mem Self CPU Mem CUDA Mem Self CUDA Mem # of Calls
------------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
run_benchmark 27.94% 6.585s 100.00% 23.570s 23.570s 0.000us 0.00% 15.065s 15.065s 0 b -2.66 Kb 8.13 Mb -4.16 Gb 1
benchmark_outputs/batch_size_2_num_workers_0.txt
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
Starting up...
Building data loaders...
Initializing Model...
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loaded in 67.71 seconds
Running inference benchmark...

Working on device: cuda
Starting BATCH 1 of 5
Finished Batch 1 of 5
Batch load time: 0.0012337989901425317
Batch inference time: 4.979560280000442
Batch total time: 4.980800386008923
Starting BATCH 2 of 5
Finished Batch 2 of 5
Batch load time: 0.0006314070051303133
Batch inference time: 4.436402372986777
Batch total time: 4.4370395870064385
Starting BATCH 3 of 5
Finished Batch 3 of 5
Batch load time: 0.0006100949976826087
Batch inference time: 4.489002016998711
Batch total time: 4.489618256004178
Starting BATCH 4 of 5
Finished Batch 4 of 5
Batch load time: 0.000657053999020718
Batch inference time: 4.481591136995121
Batch total time: 4.482254397997167
Starting BATCH 5 of 5
Finished Batch 5 of 5
Batch load time: 0.0006241549999685958
Batch inference time: 4.433049211991602
Batch total time: 4.43367860399303

Manual Profile Results...
Data-loading times
> per epoch: tensor([0.0012, 0.0006, 0.0006, 0.0007, 0.0006])
> average: tensor(0.0008)

Inference time for each epoch
> per epoch tensor([4.9805, 4.4375, 4.4883, 4.4805, 4.4336])
> average tensor(4.5625)

Total time for each epoch
> per epoch tensor([4.9805, 4.4375, 4.4883, 4.4805, 4.4336])
> average tensor(4.5625)

Profiling sorted by CUDA time total
------------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
Name Self CPU % Self CPU CPU total % CPU total CPU time avg Self CUDA Self CUDA % CUDA total CUDA time avg CPU Mem Self CPU Mem CUDA Mem Self CUDA Mem # of Calls
------------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
run_benchmark 26.66% 6.085s 100.00% 22.827s 22.827s 0.000us 0.00% 15.085s 15.085s 0 b -2.66 Kb 8.13 Mb -4.18 Gb 1
benchmark_outputs/batch_size_2_num_workers_1.txt
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
Starting up...
Building data loaders...
Initializing Model...
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loaded in 111.55 seconds
Running inference benchmark...

Working on device: cuda
Starting BATCH 1 of 5
Finished Batch 1 of 5
Batch load time: 0.0553153660002863
Batch inference time: 4.988452916993992
Batch total time: 5.043779415995232
Starting BATCH 2 of 5
Finished Batch 2 of 5
Batch load time: 0.06645661000220571
Batch inference time: 4.431401344001642
Batch total time: 4.49787032698805
Starting BATCH 3 of 5
Finished Batch 3 of 5
Batch load time: 0.06894606399873737
Batch inference time: 4.5093786460056435
Batch total time: 4.5783342299982905
Starting BATCH 4 of 5
Finished Batch 4 of 5
Batch load time: 0.10248679800133687
Batch inference time: 4.488342932003434
Batch total time: 4.590840438992018
Starting BATCH 5 of 5
Finished Batch 5 of 5
Batch load time: 0.07949267400545068
Batch inference time: 4.5397761540079955
Batch total time: 4.619280054001138

Manual Profile Results...
Data-loading times
> per epoch: tensor([0.0553, 0.0665, 0.0690, 0.1025, 0.0795])
> average: tensor(0.0745)

Inference time for each epoch
> per epoch tensor([4.9883, 4.4297, 4.5078, 4.4883, 4.5391])
> average tensor(4.5898)

Total time for each epoch
> per epoch tensor([5.0430, 4.4961, 4.5781, 4.5898, 4.6211])
> average tensor(4.6641)

Profiling sorted by CUDA time total
------------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
Name Self CPU % Self CPU CPU total % CPU total CPU time avg Self CUDA Self CUDA % CUDA total CUDA time avg CPU Mem Self CPU Mem CUDA Mem Self CUDA Mem # of Calls
------------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
run_benchmark 27.47% 6.410s 100.00% 23.334s 23.334s 0.000us 0.00% 15.079s 15.079s 0 b -2.66 Kb 8.13 Mb -4.18 Gb 1
benchmark_outputs/batch_size_4_num_workers_0.txt
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
Starting up...
Building data loaders...
Initializing Model...
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loaded in 182.83 seconds
Running inference benchmark...

Working on device: cuda
Starting BATCH 1 of 5
Finished Batch 1 of 5
Batch load time: 0.0010882980132009834
Batch inference time: 5.046935585996835
Batch total time: 5.048030566002126
Starting BATCH 2 of 5
Finished Batch 2 of 5
Batch load time: 0.0006488210055977106
Batch inference time: 4.47928033999051
Batch total time: 4.479962162004085
Starting BATCH 3 of 5
Finished Batch 3 of 5
Batch load time: 0.0006499989976873621
Batch inference time: 4.4514494110044325
Batch total time: 4.452105590986321
Starting BATCH 4 of 5
Finished Batch 4 of 5
Batch load time: 0.0006679049984086305
Batch inference time: 4.445740676994319
Batch total time: 4.446414493009797
Starting BATCH 5 of 5
Finished Batch 5 of 5
Batch load time: 0.0006229520076885819
Batch inference time: 4.457714663003571
Batch total time: 4.458343616002821

Manual Profile Results...
Data-loading times
> per epoch: tensor([0.0011, 0.0006, 0.0006, 0.0007, 0.0006])
> average: tensor(0.0007)

Inference time for each epoch
> per epoch tensor([5.0469, 4.4805, 4.4531, 4.4453, 4.4570])
> average tensor(4.5781)

Total time for each epoch
> per epoch tensor([5.0469, 4.4805, 4.4531, 4.4453, 4.4570])
> average tensor(4.5781)

Profiling sorted by CUDA time total
------------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
Name Self CPU % Self CPU CPU total % CPU total CPU time avg Self CUDA Self CUDA % CUDA total CUDA time avg CPU Mem Self CPU Mem CUDA Mem Self CUDA Mem # of Calls
------------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
run_benchmark 26.98% 6.176s 100.00% 22.888s 22.888s 0.000us 0.00% 15.065s 15.065s 0 b -2.66 Kb 8.13 Mb -4.19 Gb 1
benchmark_outputs/batch_size_4_num_workers_1.txt
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
Starting up...
Building data loaders...
Initializing Model...
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loaded in 100.42 seconds
Running inference benchmark...

Working on device: cuda
Starting BATCH 1 of 5
Finished Batch 1 of 5
Batch load time: 0.1019776799948886
Batch inference time: 5.030931850997149
Batch total time: 5.132919580995804
Starting BATCH 2 of 5
Finished Batch 2 of 5
Batch load time: 0.0671316090010805
Batch inference time: 4.438084541005082
Batch total time: 4.505228757989244
Starting BATCH 3 of 5
Finished Batch 3 of 5
Batch load time: 0.06837264500791207
Batch inference time: 4.474854444008088
Batch total time: 4.543237587000476
Starting BATCH 4 of 5
Finished Batch 4 of 5
Batch load time: 0.07436333999794442
Batch inference time: 4.4623387989995535
Batch total time: 4.53671289801423
Starting BATCH 5 of 5
Finished Batch 5 of 5
Batch load time: 0.07757725499686785
Batch inference time: 4.4232901810028125
Batch total time: 4.500878610997461

Manual Profile Results...
Data-loading times
> per epoch: tensor([0.1020, 0.0671, 0.0684, 0.0743, 0.0776])
> average: tensor(0.0779)

Inference time for each epoch
> per epoch tensor([5.0312, 4.4375, 4.4766, 4.4609, 4.4219])
> average tensor(4.5664)

Total time for each epoch
> per epoch tensor([5.1328, 4.5039, 4.5430, 4.5352, 4.5000])
> average tensor(4.6445)

Profiling sorted by CUDA time total
------------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
Name Self CPU % Self CPU CPU total % CPU total CPU time avg Self CUDA Self CUDA % CUDA total CUDA time avg CPU Mem Self CPU Mem CUDA Mem Self CUDA Mem # of Calls
------------------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
run_benchmark 27.55% 6.399s 100.00% 23.222s 23.222s 0.000us 0.00% 15.063s 15.063s 0 b -3.18 Kb 8.13 Mb -4.17 Gb 1
