Build performance analysis #643
Description
Using the hugo --stepAnalysis
option we'll find that, as we'd expect, the biggest amount of time is being spent in the "render and write pages" steps (~60 seconds overall compared to millseconds for the other steps).
We can use hugo --templateMetrics
to break this down. Running this a couple times to make sure any disk cache is warmed up (although it didn't really make a difference), I get the following:
$ hugo --templateMetrics
Building sites …
Template Metrics:
cumulative average maximum
duration duration duration count template
---------- -------- -------- ----- --------
1m17.995022823s 20.364235ms 87.373313ms 3830 partials/footer.html
1m17.27805573s 59.171558ms 172.15785ms 1306 talk/single.html
1m4.596311572s 50.505325ms 124.3805ms 1279 speaker/single.html
40.976022719s 602.588569ms 2.238049589s 68 program/single.html
31.163888904s 32.8733ms 98.139365ms 948 event/single.html
16.358431432s 4.27113ms 39.009423ms 3830 partials/head.html
13.780003964s 3.597912ms 31.973227ms 3830 partials/head/seo.html
7.063846619s 1.910696ms 19.516799ms 3697 partials/sponsors.html
6.793879509s 1.773858ms 20.901307ms 3830 partials/head/seo/open_graph.html
5.630256291s 1.47004ms 18.66299ms 3830 partials/head/seo/twitter_cards.html
3.759348706s 55.284539ms 100.522296ms 68 speakers/single.html
3.204683581s 33.38212ms 80.29072ms 96 welcome/single.html
2.599846197s 678.811µs 11.047846ms 3830 partials/footer_scripts.html
1.893856439s 1.814038ms 10.184713ms 1044 partials/events/event_metadata.html
1.481349819s 386.775µs 14.390457ms 3830 partials/head_includes.html
1.283811567s 338.647µs 14.35677ms 3791 partials/events/event_navbar.html
1.204243451s 314.423µs 22.983662ms 3830 partials/head/seo/schema.html
1.015476969s 42.31154ms 67.800444ms 24 blog/single.html
959.264588ms 28.213664ms 59.11527ms 34 _default/single.html
569.009769ms 3.979089ms 10.815179ms 143 partials/welcome.html
517.457233ms 13.617295ms 32.827341ms 38 partials/future.html
506.543895ms 132.256µs 13.611213ms 3830 partials/global_navbar.html
361.180423ms 94.302µs 8.99108ms 3830 partials/google_analytics.html
241.433581ms 241.433581ms 241.433581ms 1 events/single.html
212.039645ms 212.039645ms 212.039645ms 1 partials/past.html
167.433621ms 1.418928ms 4.282184ms 118 shortcodes/event_logo.html
141.419853ms 70.709926ms 139.768148ms 2 _internal/_default/rss.xml
98.010874ms 25.59µs 401.257µs 3830 partials/meta.html
96.959016ms 96.959016ms 96.959016ms 1 index.html
91.01287ms 30.337623ms 31.888345ms 3 section/blog.html
69.631276ms 69.631276ms 69.631276ms 1 _internal/_default/sitemap.xml
69.443485ms 69.443485ms 69.443485ms 1 speaking/single.html
62.680491ms 435.281µs 3.336465ms 144 shortcodes/list_organizers.html
59.129445ms 69.4µs 735.991µs 852 shortcodes/event_link.html
56.235845ms 390.526µs 20.92267ms 144 shortcodes/list_core.html
55.062137ms 152.95µs 13.932607ms 360 shortcodes/email_organizers.html
20.76961ms 145.242µs 1.219671ms 143 partials/events/cta.html
19.230417ms 19.230417ms 19.230417ms 1 404.html
17.544956ms 130.932µs 587.166µs 134 shortcodes/cfp_dates.html
15.964269ms 15.964269ms 15.964269ms 1 partials/speaking.html
13.784635ms 119.866µs 1.187868ms 115 shortcodes/event_map.html
12.359809ms 66.809µs 356.611µs 185 shortcodes/event_start.html
11.896248ms 11.896248ms 11.896248ms 1 section/events.rss.xml
8.102662ms 8.102662ms 8.102662ms 1 partials/map.html
7.876521ms 52.162µs 316.568µs 151 shortcodes/event_location.html
6.605939ms 40.28µs 249.934µs 164 shortcodes/event_end.html
5.913856ms 41.355µs 258.694µs 143 shortcodes/event_twitter.html
5.261609ms 5.261609ms 5.261609ms 1 section/speaking.rss.xml
3.993664ms 50.552µs 258.007µs 79 shortcodes/email_proposals.html
3.361318ms 3.361318ms 3.361318ms 1 section/blog.rss.xml
1.360909ms 56.704µs 652.551µs 24 blog/summary.html
796.904µs 265.634µs 724.5µs 3 shortcodes/google_form.html
636.562µs 159.14µs 462.615µs 4 shortcodes/registration_start.html
516.528µs 129.132µs 397.251µs 4 shortcodes/registration_end.html
511.324µs 170.441µs 508.691µs 3 _default/list.html
435.761µs 217.88µs 383.803µs 2 _internal/shortcodes/figure.html
425.574µs 141.858µs 266.418µs 3 partials/blog_pagination.html
420.5µs 420.5µs 420.5µs 1 shortcodes/privacy_policy.html
328.436µs 328.436µs 328.436µs 1 _internal/shortcodes/youtube.html
| EN
+------------------+------+
Pages | 3847
Paginator pages | 2
Non-page files | 67
Static files | 6266
Processed images | 0
Aliases | 347
Sitemaps | 1
Cleaned | 0
Total in 57247 ms
The immediate low-hanging fruit here is the partials/footer.html
which is taking up ~60s of CPU time all by itself. The footer has nothing specific to a page on it, so we can use the partialCached
function on it. Because this is called in every page this ends up saving us almost half our rendering time right off the bat.
After caching the footer:
$ hugo --templateMetrics
Building sites …
Template Metrics:
cumulative average maximum
duration duration duration count template
---------- -------- -------- ----- --------
41.411188753s 31.708414ms 189.413536ms 1306 talk/single.html
40.120164067s 31.368384ms 77.390703ms 1279 speaker/single.html
34.756380122s 511.123237ms 1.586236629s 68 program/single.html
13.11547194s 3.424405ms 28.981152ms 3830 partials/head.html
11.028678939s 2.87955ms 23.307328ms 3830 partials/head/seo.html
6.70176324s 7.06937ms 34.614808ms 948 event/single.html
6.05451722s 1.637683ms 14.983533ms 3697 partials/sponsors.html
5.387492342s 1.406655ms 20.829883ms 3830 partials/head/seo/open_graph.html
4.629631978s 1.208781ms 21.779831ms 3830 partials/head/seo/twitter_cards.html
2.158141125s 31.737369ms 56.709983ms 68 speakers/single.html
1.779862497s 464.716µs 8.390858ms 3830 partials/footer_scripts.html
1.266605005s 1.213223ms 7.715693ms 1044 partials/events/event_metadata.html
1.216056198s 317.508µs 8.950679ms 3830 partials/head_includes.html
1.094646464s 288.748µs 6.59618ms 3791 partials/events/event_navbar.html
1.043537285s 10.87018ms 26.050307ms 96 welcome/single.html
...
(many more)
...
25.201791ms 25.201791ms 25.201791ms 1 partials/footer.html
Total in 32700 ms
That saves us a lot!
Some other areas of improvement to be made:
cumulative average maximum
duration duration duration count template
---------- -------- -------- ----- --------
41.411188753s 31.708414ms 189.413536ms 1306 talk/single.html
40.120164067s 31.368384ms 77.390703ms 1279 speaker/single.html
The talk and speaker pages are unique so we can't avoid rendering these a lot. But they are relatively expensive so if we can make some improvements on these it'll bring down overall time.
A look at the talk/single.html
template shows that in order to find data about the speaker we're looping over all the speakers that have ever spoken! If instead we use the .GetPage
(with proper nil-testing), we can avoid doing all that work.
After making that fix:
cumulative average maximum
duration duration duration count template
---------- -------- -------- ----- --------
9.401234875s 7.198495ms 34.449072ms 1306 talk/single.html
...
Total in 29238 ms
We do the same thing for the speaker page, where we iterate over all the talks to find the just the ones they're giving at that event. After making that fix for speaker pages too:
cumulative average maximum
duration duration duration count template
---------- -------- -------- ----- --------
7.668514759s 5.995711ms 32.592151ms 1279 speaker/single.html
...
Total in 16277 ms
I've opened a PR with these three big-ticket items.
A future area to examine is the program page:
cumulative average maximum
duration duration duration count template
---------- -------- -------- ----- --------
33.957641921s 499.377087ms 1.547919927s 68 program/single.html
The program page is very expensive on an individual basis. So optmizing that would have a big impact.
Also for future work, smaller but harder-fought wins could be found in some of the other partials:
cumulative average maximum
duration duration duration count template
---------- -------- -------- ----- --------
13.076714781s 3.414285ms 36.539806ms 3830 partials/head.html
11.006667232s 2.873803ms 35.556077ms 3830 partials/head/seo.html
8.106079242s 6.206798ms 24.68941ms 1306 talk/single.html
7.668514759s 5.995711ms 32.592151ms 1279 speaker/single.html
6.906285184s 7.28511ms 33.678155ms 948 event/single.html
6.228540789s 1.684755ms 15.531651ms 3697 partials/sponsors.html
5.261766705s 1.373829ms 25.380798ms 3830 partials/head/seo/open_graph.html
4.725021227s 1.233687ms 33.282696ms 3830 partials/head/seo/twitter_cards.html
2.555655355s 37.583166ms 68.366637ms 68 speakers/single.html
1.794808041s 468.618µs 12.21502ms 3830 partials/footer_scripts.html
1.290601801s 1.236208ms 5.400415ms 1044 partials/events/event_metadata.html
1.224713451s 319.768µs 11.675193ms 3830 partials/head_includes.html
Probably no way around rendering most of these for each page, but each renders very quickly. If we could extract cacheable partials from these that might reduce some time here but we'd want to test each change.