Commit 60ab591

Merge pull request #1310 from scalacenter/blog/sustainable-scala
Add blog post “sustainable scala”
2 parents db9e5cb + 5f1f267 commit 60ab591

7 files changed

Lines changed: 279 additions & 0 deletions
@@ -0,0 +1,279 @@
---
layout: blog-detail
post-type: blog
by: Julien Richard-Foy, Wojciech Mazur
title: "Sustainable Scala"
---
7+
8+
To what extent do programming languages affect the energy consumption of
9+
programs? How does Scala compare to the other programming languages? Does that
10+
matter for building an environmentally sustainable digital world?
11+
12+
We found that the most important lever to reduce the energy
consumption of the IT sector is to **extend the lifetime of hardware**.
Nevertheless, the energy consumed by **running software** matters,
and it **varies significantly between programming languages**.
Scala is well positioned among high-level languages, and **writing low-level
imperative code** can further reduce the energy consumption of Scala programs.
18+
19+
## The world is warming, what about the digital world?
20+
21+
According to [climate experts], it is urgent to assess the environmental
22+
impact of our decisions (in all areas) to significantly reduce the
23+
greenhouse-gas (GHG) emissions of human activities.
24+
25+
In 2019, the IT sector was responsible for 3.5% of worldwide GHG
emissions [^1] [^2], a share similar to that of the aviation sector. More
importantly, this number is growing exponentially, and it is expected to
double between now and 2025 [^1]. This trajectory is obviously not
sustainable. The GHG emissions of the IT sector must decrease.
30+
31+
In that regard, the good news is that more and more IT companies are committing
to carbon neutrality, including Scala companies such as [Netflix],
[Zalando], and [Stripe].
34+
35+
Where do we start? If we break down the GHG emissions within the IT sector,
36+
we observe that most of them happen during the manufacturing process of
37+
hardware rather than during the usage of hardware [^1] [^3]. Therefore, the
38+
first action point is to **extend the lifetime of hardware**.
39+
40+
Then, we can have a look at the emissions due to using hardware, namely
41+
the emissions due to running software.
42+
43+
## Do programming languages affect energy consumption?
44+
45+
The GHG emissions due to running software are caused by the energy
consumption of computers. Obviously, the more computations are performed, the
more energy is consumed. In other words, the nature of the program matters
more than the language in which it is written.
49+
50+
That being said, if we implemented the exact same program in various
languages, would we observe significant differences in energy consumption,
solely due to intrinsic _language overhead_? This is the question [Rui
Pereira _et al._ tried to answer][energy-efficiency-languages].
54+
55+
They compared the execution time, energy consumption, and memory consumption
56+
of running 10 different programs each implemented in 27 programming languages
57+
[^4]. Caution needs to be taken when interpreting benchmark results, but
58+
some general trends emerged.
59+
60+
Notably, **the energy consumption varies by a factor of up to 80 between
programming languages**. On average, C and Rust programs are the most energy
efficient. Java programs consume about 2 times more energy than C programs.
JavaScript/TypeScript programs consume between 4 and 20 times more energy
than C programs. Finally, Python programs are the black sheep, with an energy
consumption 75 times higher than C programs.
66+
67+
Where does Scala stand in this picture? Unfortunately, Scala was not
68+
included in this study.
69+
70+
## Including Scala in the energy benchmarks
71+
72+
The code used by the study is [open source][energy-language-repo]. It reuses
73+
the benchmarks of the [Computer Language Benchmarks Game], and measures not
74+
only the execution time and memory consumption, but also the energy
75+
consumption (via [perf tools]).
76+
77+
In order to include Scala, we had the following plan:
78+
79+
- take [old Scala implementations] of the benchmarks (which were written at
  a time when Scala was part of the Computer Language Benchmarks Game, and
  which are now sitting in an archived Git repository),
- adapt the infrastructure that the study created to run Java benchmarks so
  that it also runs Scala benchmarks.
84+
85+
However, things were not that simple in practice. First, we discovered that the
existing infrastructure was not properly warming up the JVM before running the
benchmarks, leading to an overestimation of their energy consumption by about
30%. Second, and more importantly, we noticed that the different
implementations of the same benchmark (one per programming language) were
sometimes using different algorithms, making the results less comparable.
91+
92+
We solved the first issue by implementing a [JVM-based
runner] for the benchmarks. Our runner implements the classic scheme of
running several warm-up iterations before taking measurements. It measures
the energy consumption by using [jRAPL].
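
The warm-up scheme can be illustrated with a minimal sketch. This is not the code of our runner: the `WarmupHarness` object and its `measure` method are hypothetical names, and the sketch records wall-clock time with `System.nanoTime` as a stand-in for reading energy counters.

```scala
object WarmupHarness {
  /** Runs `benchmark` `warmups` times without measuring, then `runs` times
    * while measuring. Returns the duration of each measured run in nanoseconds.
    * (Hypothetical helper: a real runner would read energy counters instead.)
    */
  def measure(warmups: Int, runs: Int)(benchmark: () => Unit): Vector[Long] = {
    // Warm-up iterations: let the JIT compiler optimize the hot code paths.
    var i = 0
    while (i < warmups) {
      benchmark()
      i += 1
    }
    // Measured iterations: only these contribute to the reported samples.
    Vector.fill(runs) {
      val start = System.nanoTime()
      benchmark()
      System.nanoTime() - start
    }
  }
}
```

Calling `WarmupHarness.measure(warmups = 5, runs = 10)(() => runBenchmark())`, where `runBenchmark` is a placeholder for the code under test, would execute it 15 times and return 10 samples.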
96+
97+
Dealing with the second issue was a bit less satisfactory. We did not have the
capacity to check and fix all 50 benchmark implementations so that they use
comparable algorithms. Consequently, we focused only on the Java and Scala
benchmarks whose performance was too far behind their C counterparts. We
modified them, when necessary, to use the same algorithm as the C
implementation of the benchmark.
103+
104+
After several days of work, we had 10 Scala implementations of the
benchmarks as well as an infrastructure that properly warms up the JVM before
measuring their execution time, memory consumption, and energy consumption.
107+
108+
## What about day-to-day code?
109+
110+
Heavily optimized code may differ from the idiomatic code
that we write day to day, so the results we get from running the
benchmarks may not apply to idiomatic Scala code.
113+
114+
For this reason, we also created "more idiomatic" versions of the Scala
benchmarks. The quotation marks are justified because 1) what
qualifies as idiomatic is fairly subjective, and 2) due to our limited
resources, we did not completely overhaul the implementations, but
only changed some patterns that we believe are typically non-idiomatic.
119+
120+
For instance, the "optimized" version of the `binary-trees` benchmark models
an empty tree as a tree whose branches are `null`:
122+
123+
~~~ scala
final case class Tree(left: Tree, right: Tree) {
  def checkSum: Int =
    left match {
      case null => 1
      case tl => 1 + tl.checkSum + right.checkSum
    }
}

object Tree {
  final val EmptyTree = Tree(null, null)
}
~~~
136+
137+
Whereas the idiomatic one uses a class hierarchy:
138+
139+
~~~ scala
sealed trait Tree {
  def checkSum: Int
}

case class NonEmptyTree(left: Tree, right: Tree) extends Tree {
  def checkSum: Int = 1 + left.checkSum + right.checkSum
}

case object EmptyTree extends Tree {
  def checkSum: Int = 1
}
~~~
152+
153+
Does this change impact performance? Keep reading to find out.
154+
155+
Another difference between the idiomatic and optimized versions is the
usage of Scala collections instead of Java `Array`s, and the usage of `for`
loops instead of `while` loops.
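
To make the two styles concrete, here is a minimal sketch of the same computation written both ways. It is not taken from the benchmarks, and the `LoopStyles` object and its method names are hypothetical.

```scala
object LoopStyles {
  // Idiomatic style: an immutable Scala collection and a `for` comprehension,
  // which allocates an intermediate collection before summing.
  def sumOfSquaresIdiomatic(xs: Vector[Int]): Long =
    (for (x <- xs) yield x.toLong * x).sum

  // Low-level style: a primitive array, a mutable accumulator, and a `while` loop.
  def sumOfSquaresImperative(xs: Array[Int]): Long = {
    var acc = 0L
    var i = 0
    while (i < xs.length) {
      acc += xs(i).toLong * xs(i)
      i += 1
    }
    acc
  }
}
```

Both methods return the same result; they differ only in the allocations and abstractions involved, which is the kind of difference the idiomatic and optimized benchmark versions exhibit.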
158+
159+
### Protocol
160+
161+
We compared the energy consumption of 10 benchmarks, each written in C, Scala,
Java, JavaScript, and Python. To achieve this, we executed each benchmark 10
times (after 5 warm-up iterations) on an Intel i9-7900X @ 3.30 GHz with 20
logical CPUs and 128 GB of memory, with OpenJDK 17, Node.js 10.19.0, and
Python 3.8.10.
165+
166+
You can find the source code of the benchmarks in the following Git repository:
167+
[https://github.com/WojciechMazur/Energy-Languages](https://github.com/WojciechMazur/Energy-Languages/tree/feature/scala-develop).
168+
169+
### Results
170+
171+
To visualize better how the languages compare (regardless of the nature of
172+
the benchmarks), we normalized the measurements, using C as a baseline.
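
The normalization amounts to dividing each language's measurement by the C measurement for the same benchmark. Here is a minimal sketch, in which the `Normalize.relativeTo` helper and the sample figures are hypothetical rather than taken from our tooling or results:

```scala
object Normalize {
  /** Divides every measurement by the measurement of the baseline language,
    * so that the baseline itself is always 1.0.
    */
  def relativeTo(baseline: String, joules: Map[String, Double]): Map[String, Double] = {
    val base = joules(baseline)
    joules.map { case (language, value) => language -> value / base }
  }
}

// Made-up energy measurements (in joules) for a single benchmark.
val sample = Map("C" -> 10.0, "Scala" -> 25.0)
val normalized = Normalize.relativeTo("C", sample)
```

With these made-up figures, `normalized("C")` is `1.0` and `normalized("Scala")` is `2.5`.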
173+
174+
The figure below shows the normalized average energy consumption for each
175+
benchmark, for the languages C, Java, and Scala (lower is better):
176+
177+
![](/resources/img/blog/sustainable-scala/c-java-scala-1.png)
178+
![](/resources/img/blog/sustainable-scala/c-java-scala-2.png)
179+
180+
We observe that the Scala benchmarks sometimes consumed a similar
amount of energy as the C benchmarks (this is the case for `binary-trees`,
`fannkuch-redux`, `fasta`, and `n-body`), and sometimes a significantly higher
amount of energy (up to 12 times more energy for `regex-redux`).
184+
185+
The figure below shows the same information, but it now includes JavaScript
186+
and Python:
187+
188+
![](/resources/img/blog/sustainable-scala/c-java-scala-js-python-1.png)
189+
![](/resources/img/blog/sustainable-scala/c-java-scala-js-python-2.png)
190+
191+
Compared to C, the Python benchmarks consumed between 4 and 339 times more
energy, and the JavaScript benchmarks consumed between 2 and 12 times
more energy.
194+
195+
Last, the figure below compares C, Java, Scala, and "idiomatic" Scala
196+
benchmarks:
197+
198+
![](/resources/img/blog/sustainable-scala/c-java-scala-idiomatic-1.png)
199+
![](/resources/img/blog/sustainable-scala/c-java-scala-idiomatic-2.png)
200+
201+
For some benchmarks, the idiomatic version performs as well as the optimized
202+
one (for `binary-trees`, the idiomatic version performs even slightly better
203+
than the -- supposedly -- optimized one). However, for some other benchmarks,
204+
the idiomatic version consumes significantly more energy than the optimized
205+
one (up to 7 times more energy for `k-nucleotide`).
206+
207+
## Discussion
208+
209+
Within a language, we observe a high variability between benchmarks (e.g.,
Python consumed between 4 and 339 times more energy than C, depending on
which benchmark we look at). This makes it hard to draw general conclusions
like "language X consumes N times more energy than language Y". That being
said, we believe that computing, for every language, its average energy
consumption relative to C provides an order of magnitude of how the
language may perform. The table below shows the average energy consumption
relative to C, as well as the standard deviation:
217+
218+
| Language | Average energy consumption (normalized) | Standard Deviation |
|---|---|---|
| C | 1.00 | 0.00 |
| Java | 2.04 | 1.45 |
| **Scala** | **3.71** | **3.67** |
| **Scala (idiomatic)** | **6.99** | **4.87** |
| JavaScript | 7.63 | 3.62 |
| Python | 89.33 | 113.79 |
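
For reference, an average and a standard deviation of this kind can be computed from the per-benchmark normalized values as sketched below. The sketch assumes a population standard deviation (dividing by the number of benchmarks), which may not be the variant used for the table, and the `Stats` object is a hypothetical name.

```scala
object Stats {
  /** Arithmetic mean of the normalized values of one language. */
  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  /** Population standard deviation (assumption: divides by n, not n - 1). */
  def stdDev(xs: Seq[Double]): Double = {
    val m = mean(xs)
    math.sqrt(xs.map(x => (x - m) * (x - m)).sum / xs.size)
  }
}
```

Since every C measurement normalizes to 1.0, this computation yields a mean of 1.00 and a standard deviation of 0.00 for C.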
226+
227+
A similar table was shown by Rui Pereira _et al._ [^4]. While we did not get
exactly the same numbers, the orders of magnitude remain the same.
229+
230+
We now have an answer to our initial question, “where does Scala
231+
stand in the picture?” According to these benchmarks, Scala is well positioned
232+
within high-level programming languages.
233+
234+
Also, we see that in Scala, two implementations of the same benchmark can
easily perform very differently, depending on the coding style. For
instance, the energy consumption of the `k-nucleotide` benchmark ranged from
2 times (optimized version) to 13 times (idiomatic version) that of the C
implementation. The differences between both
versions are mainly the usage of immutable Scala collections and `for` loops in
the idiomatic version, as opposed to `Array`s and `while` loops in the
optimized version (this observation is consistent with the results of
another study run by Rui Pereira _et al._ [^5]).
242+
243+
We believe that the fact that Scala embraces several programming paradigms
244+
is a strength. It makes it easy to write high-level code that reads well,
245+
and it also makes it easy to write low-level code that performs well.
246+
247+
## Next steps
248+
249+
This work measured the performance of Scala on the JVM platform only. What
250+
about Scala.js and Scala Native? Including those platforms could be
251+
achieved in a follow-up study.
252+
253+
The Scala Center started this work to get a first rough idea of what it
takes to reduce the GHG emissions of running software written in Scala,
but also, and more importantly, to see if there is any interest from
companies that use Scala in consolidating the methodology and [tools] to
reach their [sustainable development goals]. Please get in touch with
258+
[us](mailto:[email protected]?subject=Sustainable%20Scala) if you want to be
259+
part of it!
260+
261+
[^1]: The Shift Project. (2021, March). Impact environnemental du numérique. https://theshiftproject.org/article/impact-environnemental-du-numerique-5g-nouvelle-etude-du-shift/
262+
[^2]: Freitag, C., Berners-Lee, M., Widdicks, K., Knowles, B., Blair, G., & Friday, A. (2021). The climate impact of ICT: A review of estimates, trends and regulations. arXiv preprint arXiv:2102.02622.
263+
[^3]: Gupta, U., Kim, Y. G., Lee, S., Tse, J., Lee, H. H. S., Wei, G. Y., ... & Wu, C. J. (2021, February). Chasing Carbon: The Elusive Environmental Footprint of Computing. In 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (pp. 854-867). IEEE.
264+
[^4]: Rui Pereira, Marco Couto, Francisco Ribeiro, Rui Rua, Jácome Cunha, João Paulo Fernandes, and João Saraiva. 2017. Energy efficiency across programming languages: how do energy, time, and memory relate? In Proceedings of the 10th ACM SIGPLAN International Conference on Software Language Engineering (SLE 2017). Association for Computing Machinery, New York, NY, USA, 256–267. DOI:https://doi.org/10.1145/3136014.3136031
265+
[^5]: Pereira, R., Couto, M., Ribeiro, F., Rua, R., Cunha, J., Fernandes, J. P., & Saraiva, J. (2021). Ranking programming languages by energy efficiency. Science of Computer Programming, 205, 102609.
266+
267+
[climate experts]: https://www.ipcc.ch/sr15/chapter/spm/
268+
[energy-efficiency-languages]: https://sites.google.com/view/energy-efficiency-languages
269+
[energy-language-repo]: https://github.com/greensoftwarelab/Energy-Languages
270+
[Computer Language Benchmarks Game]: https://benchmarksgame-team.pages.debian.net/benchmarksgame/
271+
[old Scala implementations]: https://salsa.debian.org/benchmarksgame-team/archive-alioth-benchmarksgame/-/tree/master/contributed-source-code/benchmarksgame
272+
[perf tools]: https://en.wikipedia.org/wiki/Perf_(Linux)
273+
[Netflix]: https://about.netflix.com/en/news/net-zero-nature-our-climate-commitment
274+
[Zalando]: https://corporate.zalando.com/en/newsroom/news-stories/zalando-goes-carbon-neutral
275+
[Stripe]: https://stripe.com/blog/first-negative-emissions-purchases
276+
[JVM-based runner]: https://github.com/WojciechMazur/Energy-Languages/blob/cd9a9d6d8e40911af6f823b1ecc4f6173c51c36c/Scala/sRAPL/sRAPL.scala#L9-L26
277+
[jRAPL]: https://github.com/WojciechMazur/jRAPL
278+
[tools]: https://github.com/WojciechMazur/Energy-Languages/blob/7cf02bffe6d3b9af39f23e1d0edc0cb24c1e450c/Scala/sRAPL/sRAPL.scala
279+
[sustainable development goals]: https://www.un.org/sustainabledevelopment/