| 1 | +--- |
| 2 | +layout: blog-detail |
| 3 | +post-type: blog |
| 4 | +by: Julien Richard-Foy, Wojciech Mazur |
| 5 | +title: "Sustainable Scala" |
| 6 | +--- |
| 7 | + |
| 8 | +To what extent do programming languages affect the energy consumption of |
| 9 | +programs? How does Scala compare to the other programming languages? Does that |
| 10 | +matter for building an environmentally sustainable digital world? |
| 11 | + |
| 12 | +We found out that the most important lever to reduce the energy |
| 13 | +consumption of the IT sector is to **extend the lifetime of hardware**. |
| 14 | +Nevertheless, the energy consumption due to **running software** is |
| 15 | +important, and it **varies significantly between programming languages**. |
| 16 | +Scala is well positioned for a high-level language, and **writing low-level |
| 17 | +imperative code** can make a difference. |
| 18 | + |
| 19 | +## The world is warming, what about the digital world? |
| 20 | + |
| 21 | +According to [climate experts], it is urgent to assess the environmental |
| 22 | +impact of our decisions (in all areas) to significantly reduce the |
| 23 | +greenhouse-gas (GHG) emissions of human activities. |
| 24 | + |
| 25 | +In 2019, the IT sector was responsible for 3.5% of the worldwide GHG |
| 26 | +emissions [^1] [^2], which is similar to the aviation sector. More |
| 27 | +importantly, this number is growing exponentially, and it is expected to |
| 28 | +double between now and 2025 [^1]. This trajectory is obviously not |
| 29 | +sustainable. The GHG emissions of the IT sector must decrease. |
| 30 | + |
| 31 | +In that regard, the good news is that more and more IT companies are committing |
| 32 | +to carbon neutrality, including Scala companies such as [Netflix], |
| 33 | +[Zalando], and [Stripe]. |
| 34 | + |
| 35 | +Where do we start? If we break down the GHG emissions within the IT sector, |
| 36 | +we observe that most of them happen during the manufacturing process of |
| 37 | +hardware rather than during the usage of hardware [^1] [^3]. Therefore, the |
| 38 | +first action point is to **extend the lifetime of hardware**. |
| 39 | + |
| 40 | +Then, we can have a look at the emissions due to using hardware, namely |
| 41 | +the emissions due to running software. |
| 42 | + |
| 43 | +## Do programming languages affect energy consumption? |
| 44 | + |
| 45 | +The GHG emissions due to running software are caused by the energy |
| 46 | +consumption of computers. Obviously, the more computations are performed, the |
| 47 | +more energy is consumed. In other words, the nature of the program is more |
| 48 | +important than the language in which it is written. |
| 49 | + |
| 50 | +That being said, if we implemented the exact same program in various |
| 51 | +languages, would we observe significant differences in energy consumption, |
| 52 | +solely due to intrinsic _language overhead_? This is the question [Rui |
| 53 | +Pereira _et al._ try to answer][energy-efficiency-languages]. |
| 54 | + |
| 55 | +They compared the execution time, energy consumption, and memory consumption |
| 56 | +of running 10 different programs each implemented in 27 programming languages |
| 57 | +[^4]. Caution needs to be taken when interpreting benchmark results, but |
| 58 | +some general trends emerged. |
| 59 | + |
| 60 | +Notably, **the energy consumption varies by a factor of up to 80 between |
| 61 | +programming languages**. On average, C and Rust programs are the most energy |
| 62 | +efficient. Java programs consume about twice as much energy as C programs. |
| 63 | +JavaScript/TypeScript programs consume between 4 and 20 times more energy |
| 64 | +than C programs. Finally, Python programs are the outliers, with an energy |
| 65 | +consumption 75 times higher than that of C programs. |
| 66 | + |
| 67 | +Where does Scala stand in this picture? Unfortunately, Scala was not |
| 68 | +included in this study. |
| 69 | + |
| 70 | +## Including Scala in the energy benchmarks |
| 71 | + |
| 72 | +The code used by the study is [open source][energy-language-repo]. It reuses |
| 73 | +the benchmarks of the [Computer Language Benchmarks Game], and measures not |
| 74 | +only the execution time and memory consumption, but also the energy |
| 75 | +consumption (via [perf tools]). |
| 76 | + |
| 77 | +In order to include Scala, we had the following plan: |
| 78 | + |
| 79 | +- take [old Scala implementations] of the benchmarks (which were written at |
| 80 | + a time when Scala was part of the Computer Language Benchmarks Game, and |
| 81 | + which are now sitting in an archived Git repository), |
| 82 | +- adapt the infrastructure created by the study to run Java benchmarks to |
| 83 | + also run Scala benchmarks. |
| 84 | + |
| 85 | +However, things were not that simple in practice. First, we discovered that the |
| 86 | +existing infrastructure was not properly warming up the JVM before running the |
| 87 | +benchmarks, leading to an overestimation of about 30% of their energy |
| 88 | +consumption. Second, and more importantly, we noticed that the different |
| 89 | +implementations of the same benchmarks (one per programming language) were |
| 90 | +sometimes using different algorithms, making the results less comparable. |
| 91 | + |
| 92 | +We solved the first issue by implementing a [JVM-based |
| 93 | +runner] for the benchmarks. Our runner implements the classic scheme of |
| 94 | +running several warm-up iterations before doing measurements. It measures |
| 95 | +the energy consumption by using [jRAPL]. |
| 96 | + |
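The warm-up scheme described above can be sketched as follows. This is only an illustration, not the authors' runner: `WarmupRunner`, `Measurements`, and `run` are hypothetical names, and wall-clock time from `System.nanoTime` stands in for the energy readings that the actual runner obtains through jRAPL.

~~~ scala
object WarmupRunner {
  /** Samples collected during the measured iterations, in nanoseconds.
    * Assumption: time is used here as a stand-in for energy readings. */
  final case class Measurements(nanos: Vector[Long]) {
    def mean: Double = nanos.sum.toDouble / nanos.size
  }

  /** Runs `warmups` unmeasured iterations, giving the JIT compiler a chance
    * to optimize the hot code, then runs `iterations` measured iterations. */
  def run(warmups: Int, iterations: Int)(benchmark: => Unit): Measurements = {
    require(iterations > 0, "at least one measured iteration is needed")

    // Warm-up phase: results are discarded.
    var i = 0
    while (i < warmups) {
      benchmark
      i += 1
    }

    // Measurement phase: one sample per iteration.
    val samples = Vector.newBuilder[Long]
    i = 0
    while (i < iterations) {
      val start = System.nanoTime()
      benchmark
      samples += System.nanoTime() - start
      i += 1
    }
    Measurements(samples.result())
  }
}
~~~

Calling `WarmupRunner.run(warmups = 5, iterations = 10) { ... }` evaluates the block 15 times in total and keeps only the last 10 samples.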
| 97 | +Dealing with the second issue was a bit less satisfactory. We did not have the |
| 98 | +capacity to check and fix the 50 benchmark implementations to use |
| 99 | +comparable algorithms. Consequently, we only focused on the Java and Scala |
| 100 | +benchmarks whose performance was too far behind that of their C counterpart. We |
| 101 | +modified them, when necessary, to use the same algorithm as the C |
| 102 | +implementation of the benchmark. |
| 103 | + |
| 104 | +After several days of work, we had 10 Scala implementations of the |
| 105 | +benchmarks as well as an infrastructure that properly warms up the JVM before |
| 106 | +measuring their execution time, memory consumption, and energy consumption. |
| 107 | + |
| 108 | +## What about day-to-day code? |
| 109 | + |
| 110 | +Heavily optimized code may differ from the idiomatic code |
| 111 | +that we write day to day, and the results we would get from running the |
| 112 | +benchmarks may not be applicable to idiomatic Scala code. |
| 113 | + |
| 114 | +For this reason, we also created "more idiomatic" versions of the Scala |
| 115 | +benchmarks. The quotation marks are justified because 1) what |
| 116 | +qualifies as idiomatic is fairly subjective, and 2) due to our limited |
| 117 | +resources, we did not completely overhaul the implementations, but |
| 118 | +only changed some patterns that we believe are typically non-idiomatic. |
| 119 | + |
| 120 | +For instance, the "optimized" version of the `binary-trees` benchmark models |
| 121 | +an empty tree as a tree whose branches are `null`: |
| 122 | + |
| 123 | +~~~ scala |
| 124 | +final case class Tree(left: Tree, right: Tree) { |
| 125 | +  def checkSum: Int = |
| 126 | +    left match { |
| 127 | +      case null => 1 |
| 128 | +      case tl => 1 + tl.checkSum + right.checkSum |
| 129 | +    } |
| 130 | +} |
| 131 | + |
| 132 | +object Tree { |
| 133 | +  final val EmptyTree = Tree(null, null) |
| 134 | +} |
| 135 | +~~~ |
| 136 | + |
| 137 | +Whereas the idiomatic one uses a class hierarchy: |
| 138 | + |
| 139 | +~~~ scala |
| 140 | +sealed trait Tree { |
| 141 | +  def checkSum: Int |
| 142 | +} |
| 143 | + |
| 144 | +case class NonEmptyTree(left: Tree, right: Tree) extends Tree { |
| 145 | +  def checkSum: Int = 1 + left.checkSum + right.checkSum |
| 146 | +} |
| 147 | + |
| 148 | +case object EmptyTree extends Tree { |
| 149 | +  def checkSum: Int = 1 |
| 150 | +} |
| 151 | +~~~ |
| 152 | + |
| 153 | +Does this change impact performance? Keep reading to find out. |
| 154 | + |
| 155 | +Another difference between the idiomatic and optimized versions is the |
| 156 | +usage of Scala collections instead of Java `Array`s, and the usage of `for` |
| 157 | +loops instead of `while` loops. |
| 158 | + |
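To make the contrast between collections with `for` loops and `Array`s with `while` loops concrete, here is a small sketch of the same computation in both styles. It is not taken from the benchmark sources; `LoopStyles` and both method names are hypothetical.

~~~ scala
object LoopStyles {
  /** "Idiomatic" style: immutable `List` and a `for` comprehension.
    * Builds an intermediate list of boxed `Long` values before summing. */
  def sumOfSquaresIdiomatic(xs: List[Int]): Long =
    (for (x <- xs) yield x.toLong * x).sum

  /** "Optimized" style: `Array` and a `while` loop with a mutable
    * accumulator. No intermediate collection is allocated. */
  def sumOfSquaresOptimized(xs: Array[Int]): Long = {
    var acc = 0L
    var i = 0
    while (i < xs.length) {
      acc += xs(i).toLong * xs(i)
      i += 1
    }
    acc
  }
}
~~~

Both methods return the same result; they differ only in the allocations and indirections they go through along the way.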
| 159 | +### Protocol |
| 160 | + |
| 161 | +We compared the energy consumption of 10 benchmarks, each written in C, Scala, |
| 162 | +Java, JavaScript, and Python. To achieve this, we executed each benchmark 10 |
| 163 | +times (after 5 warm-up iterations), on an Intel i9-7900X @ 3.30 GHz with 20 |
| 164 | +CPUs and 128 GB of memory, with OpenJDK 17, Node 10.19.0, and Python 3.8.10. |
| 165 | + |
| 166 | +You can find the source code of the benchmarks in the following Git repository: |
| 167 | +[https://github.com/WojciechMazur/Energy-Languages](https://github.com/WojciechMazur/Energy-Languages/tree/feature/scala-develop). |
| 168 | + |
| 169 | +### Results |
| 170 | + |
| 171 | +To better visualize how the languages compare (regardless of the nature of |
| 172 | +the benchmarks), we normalized the measurements, using C as a baseline. |
| 173 | + |
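As a minimal sketch of that normalization step, assuming one energy measurement per language for a given benchmark: each value is divided by the C value, so that C is always 1.0. `Normalization` and `normalize` are hypothetical names, and the numbers in the usage note are made up for illustration.

~~~ scala
object Normalization {
  /** Divides each language's measurement by the baseline language's
    * measurement. Assumption: `energy` holds one value per language
    * for a single benchmark, and contains the baseline language. */
  def normalize(energy: Map[String, Double], baseline: String): Map[String, Double] = {
    val base = energy(baseline)
    energy.map { case (language, value) => language -> value / base }
  }
}
~~~

For example, with the made-up measurements `Map("C" -> 10.0, "Java" -> 20.0, "Scala" -> 35.0)` and `"C"` as the baseline, the normalized values are 1.0, 2.0, and 3.5.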
| 174 | +The figure below shows the normalized average energy consumption for each |
| 175 | +benchmark, for the languages C, Java, and Scala (lower is better): |
| 176 | + |
| 177 | + |
| 178 | + |
| 179 | + |
| 180 | +We observe that the Scala benchmarks sometimes consumed a similar |
| 181 | +amount of energy as the C benchmarks (this is the case for `binary-trees`, |
| 182 | +`fannkuch-redux`, `fasta`, and `n-body`), and sometimes a significantly higher |
| 183 | +amount of energy (up to 12 times more energy for `regex-redux`). |
| 184 | + |
| 185 | +The figure below shows the same information, but it now includes JavaScript |
| 186 | +and Python: |
| 187 | + |
| 188 | + |
| 189 | + |
| 190 | + |
| 191 | +Compared to C, the Python benchmarks consumed between 4 and 339 times more |
| 192 | +energy, and the JavaScript benchmarks consumed between 2 and 12 times |
| 193 | +more energy. |
| 194 | + |
| 195 | +Last, the figure below compares C, Java, Scala, and "idiomatic" Scala |
| 196 | +benchmarks: |
| 197 | + |
| 198 | + |
| 199 | + |
| 200 | + |
| 201 | +For some benchmarks, the idiomatic version performs as well as the optimized |
| 202 | +one (for `binary-trees`, the idiomatic version performs even slightly better |
| 203 | +than the -- supposedly -- optimized one). However, for some other benchmarks, |
| 204 | +the idiomatic version consumes significantly more energy than the optimized |
| 205 | +one (up to 7 times more energy for `k-nucleotide`). |
| 206 | + |
| 207 | +## Discussion |
| 208 | + |
| 209 | +Within a language, we observe a high variability between benchmarks (e.g., |
| 210 | +Python consumed between 4 and 339 times more energy than C, depending on |
| 211 | +which benchmark we look at). This makes it hard to draw general conclusions |
| 212 | +like "language X consumes N times more energy than language Y". That being |
| 213 | +said, we believe that computing, for every language, its average energy |
| 214 | +consumption relative to C provides an order of magnitude of how the |
| 215 | +language may perform. The table below shows the average energy consumption |
| 216 | +relative to C, as well as the standard deviation: |
| 217 | + |
| 218 | +| Language | Average energy consumption (normalized) | Standard Deviation | |
| 219 | +|---|---|---| |
| 220 | +| C | 1.00 | 0.00 | |
| 221 | +| Java | 2.04 | 1.45 | |
| 222 | +| **Scala** | **3.71** | **3.67** | |
| 223 | +| **Scala (idiomatic)** | **6.99** | **4.87** | |
| 224 | +| JavaScript | 7.63 | 3.62 | |
| 225 | +| Python | 89.33 | 113.79 | |
| 226 | + |
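The two summary columns of the table can be computed from the per-benchmark normalized values along these lines. This is a sketch under an assumption: the article does not say whether the population or the sample standard deviation was used, so the population form is shown here. `Summary`, `mean`, and `stdDev` are hypothetical names.

~~~ scala
object Summary {
  /** Arithmetic mean of the normalized values of one language. */
  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  /** Population standard deviation (assumption: the article does not
    * specify which variant was used). */
  def stdDev(xs: Seq[Double]): Double = {
    val m = mean(xs)
    math.sqrt(xs.map(x => (x - m) * (x - m)).sum / xs.size)
  }
}
~~~

A language whose normalized values are all 1.0, as is the case for the C baseline, has a mean of 1.0 and a standard deviation of 0.0, which matches the first row of the table.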
| 227 | +A similar table was shown by Rui Pereira _et al._ [^4]. While we did not get |
| 228 | +exactly the same numbers, the orders of magnitude remain the same. |
| 229 | + |
| 230 | +We now have an answer to our initial question, “where does Scala |
| 231 | +stand in the picture?” According to these benchmarks, Scala is well positioned |
| 232 | +within high-level programming languages. |
| 233 | + |
| 234 | +Also, we see that in Scala, two implementations of the same benchmark can |
| 235 | +easily perform very differently, depending on the code style. For |
| 236 | +instance, the energy consumption of the `k-nucleotide` benchmark was between |
| 237 | +2 and 13 times higher than that of the C implementation, depending on the version. The differences between both |
| 238 | +versions are mainly the usage of immutable Scala collections and `for` loops in |
| 239 | +the idiomatic version, as opposed to `Array`s and `while` loops in the |
| 240 | +optimized version (this observation is consistent with the results of |
| 241 | +another study run by Rui Pereira _et al._ [^5]). |
| 242 | + |
| 243 | +We believe that the fact that Scala embraces several programming paradigms |
| 244 | +is a strength. It makes it easy to write high-level code that reads well, |
| 245 | +and it also makes it easy to write low-level code that performs well. |
| 246 | + |
| 247 | +## Next steps |
| 248 | + |
| 249 | +This work measured the performance of Scala on the JVM platform only. What |
| 250 | +about Scala.js and Scala Native? Including those platforms could be |
| 251 | +achieved in a follow-up study. |
| 252 | + |
| 253 | +The Scala Center started this work to get a first rough idea of what it |
| 254 | +entails to reduce the GHG emissions of running software written in Scala, |
| 255 | +but also, and more importantly, to see if there is any interest from |
| 256 | +companies that use Scala in consolidating the methodology and [tools] to |
| 257 | +reach their [sustainable development goals]. Please get in touch with |
| 258 | +[us](mailto:[email protected]?subject=Sustainable%20Scala) if you want to be |
| 259 | +part of it! |
| 260 | + |
| 261 | +[^1]: The Shift Project. (2021, March). Impact environnemental du numérique. https://theshiftproject.org/article/impact-environnemental-du-numerique-5g-nouvelle-etude-du-shift/ |
| 262 | +[^2]: Freitag, C., Berners-Lee, M., Widdicks, K., Knowles, B., Blair, G., & Friday, A. (2021). The climate impact of ICT: A review of estimates, trends and regulations. arXiv preprint arXiv:2102.02622. |
| 263 | +[^3]: Gupta, U., Kim, Y. G., Lee, S., Tse, J., Lee, H. H. S., Wei, G. Y., ... & Wu, C. J. (2021, February). Chasing Carbon: The Elusive Environmental Footprint of Computing. In 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (pp. 854-867). IEEE. |
| 264 | +[^4]: Rui Pereira, Marco Couto, Francisco Ribeiro, Rui Rua, Jácome Cunha, João Paulo Fernandes, and João Saraiva. 2017. Energy efficiency across programming languages: how do energy, time, and memory relate? In Proceedings of the 10th ACM SIGPLAN International Conference on Software Language Engineering (SLE 2017). Association for Computing Machinery, New York, NY, USA, 256–267. DOI:https://doi.org/10.1145/3136014.3136031 |
| 265 | +[^5]: Pereira, R., Couto, M., Ribeiro, F., Rua, R., Cunha, J., Fernandes, J. P., & Saraiva, J. (2021). Ranking programming languages by energy efficiency. Science of Computer Programming, 205, 102609. |
| 266 | + |
| 267 | +[climate experts]: https://www.ipcc.ch/sr15/chapter/spm/ |
| 268 | +[energy-efficiency-languages]: https://sites.google.com/view/energy-efficiency-languages |
| 269 | +[energy-language-repo]: https://github.com/greensoftwarelab/Energy-Languages |
| 270 | +[Computer Language Benchmarks Game]: https://benchmarksgame-team.pages.debian.net/benchmarksgame/ |
| 271 | +[old Scala implementations]: https://salsa.debian.org/benchmarksgame-team/archive-alioth-benchmarksgame/-/tree/master/contributed-source-code/benchmarksgame |
| 272 | +[perf tools]: https://en.wikipedia.org/wiki/Perf_(Linux) |
| 273 | +[Netflix]: https://about.netflix.com/en/news/net-zero-nature-our-climate-commitment |
| 274 | +[Zalando]: https://corporate.zalando.com/en/newsroom/news-stories/zalando-goes-carbon-neutral |
| 275 | +[Stripe]: https://stripe.com/blog/first-negative-emissions-purchases |
| 276 | +[JVM-based runner]: https://github.com/WojciechMazur/Energy-Languages/blob/cd9a9d6d8e40911af6f823b1ecc4f6173c51c36c/Scala/sRAPL/sRAPL.scala#L9-L26 |
| 277 | +[jRAPL]: https://github.com/WojciechMazur/jRAPL |
| 278 | +[tools]: https://github.com/WojciechMazur/Energy-Languages/blob/7cf02bffe6d3b9af39f23e1d0edc0cb24c1e450c/Scala/sRAPL/sRAPL.scala |
| 279 | +[sustainable development goals]: https://www.un.org/sustainabledevelopment/ |