
The massive size of datasets and the sheer number of calculations required to train Machine Learning algorithms result in heavy cloud-server workloads with a significant carbon footprint. The European SustainML project aims to devise an innovative development framework that will help AI designers reduce the power consumption of their applications, says French research institute Inria in a press release.

According to a study published in the journal Nature, a run-of-the-mill training model used for natural language processing in 2019 emitted 300,000 kg of CO2, the equivalent of 125 round-trip flights between New York and Beijing. Five years down the line, every sector of society is eagerly embracing deep neural networks, and as Artificial Intelligence grows to unprecedented proportions, so does its toll on the planet.

In light of this, the main objective of the European project SustainML is to create a framework that will make it easier for AI designers to consider the energy consumption of their Machine Learning applications as they develop them. Janin Koch, a research scientist in the Ex-Situ project-team at the Inria Saclay Centre, focuses more specifically on how Human-Computer Interaction (HCI) can be leveraged to assist AI designers in making more sustainable decisions across the entire ML life-cycle, and to raise awareness of the cost-benefit trade-off behind each of these choices. The project, launched in October 2022, brings together Inria and several other partners.

Quantifying the Carbon Impact Behind Machine Learning Models

The project encompasses several areas of research. One cornerstone is the quantification of the environmental impact of algorithms and, more precisely, of the consequences of each decision taken throughout the ML life-cycle. For instance, choosing to train an ML model in a cloud facility running on renewable hydroelectricity, as opposed to a data center powered by a coal-fired plant, will evidently make quite a difference in how much carbon is emitted.
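To put rough numbers on that difference, the back-of-the-envelope Python sketch below compares the same hypothetical 1,000 kWh training run on two grids. The energy figure is invented for illustration, and the grid intensities are approximate life-cycle values; none of it comes from the SustainML project itself.

```python
# Back-of-the-envelope comparison of one training run on two grids.
# The 1,000 kWh figure is a hypothetical training-run energy budget,
# and the intensities are rough life-cycle values in gCO2eq per kWh.

TRAINING_ENERGY_KWH = 1_000

GRID_INTENSITY_G_PER_KWH = {
    "hydroelectric": 24,   # approximate life-cycle value
    "coal": 820,           # approximate life-cycle value
}

for source, intensity in GRID_INTENSITY_G_PER_KWH.items():
    emissions_kg = TRAINING_ENERGY_KWH * intensity / 1_000
    print(f"{source:>13}: {emissions_kg:7.1f} kg CO2eq")

# hydroelectric:    24.0 kg CO2eq
#          coal:   820.0 kg CO2eq  -> roughly 34x more for the same run
```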

However, that is not the end of the story. “It’s a much broader problem than just picking a clean cloud,” says Janin Koch; it is really a matter of rethinking what we actually need. “The trend in the AI community is to say: the more data, the more complex the model, the better the final results. Which, to some extent, is not entirely unfounded, especially when it comes to complex problems. However, many applications do not necessarily require this level of accuracy or this volume of data.” So, before even beginning an AI project, scientists should ask themselves: ‘What do I really need?’

Are there more sustainable alternatives that need less data or require less run time? Instead of collecting large amounts of data, couldn’t I reuse or repurpose existing datasets that are already out there? Should I create and train a model from scratch, or can I reuse one that is already available in code repositories? Is it really necessary to run my model for a long time? “In the end, it’s not only about improving an algorithm, but also about improving the application’s entire life cycle.”
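As a concrete illustration of the reuse question above, the sketch below loads a ready-made sentiment classifier from a public model repository instead of training one from scratch. The Hugging Face transformers library and the DistilBERT checkpoint are picked here as familiar examples, not as tools prescribed by SustainML.

```python
# Reusing a pretrained model from a public repository: the download
# happens once, and no training energy is spent on our side.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

print(classifier("Reusing this model spared us a full training run."))
# [{'label': 'POSITIVE', 'score': 0.99...}]
```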

Human-Centric Interactive Framework

Aside from raising awareness about sustainability trade-offs, the project intends to create an interactive tool to help developers make more sustainable decisions at every stage of the development process. And this is where Koch’s contribution comes in. “My research area is Human-Computer Interaction. I am interested in how humans and systems can collaborate to explore new ideas. HCI includes both how users express their goals to a system and how systems make suggestions and explain them iteratively.”

“In the context of this project, this means: What do developers know before starting a project, and how could they describe the overall goal to a system? This can be quite vague at times. Hence, we consider how systems can assist in determining what is required to achieve a given goal and which approaches would be suitable for doing so.”
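Purely as a toy illustration of that interaction pattern (nothing here comes from the actual SustainML tool), the sketch below lets a developer state a goal in plain language and answers with candidate approaches and a rough idea of their resource cost; every name and label in it is invented.

```python
# Toy sketch of a goal-to-suggestions loop: the developer describes a
# goal, the system replies with candidate approaches and a rough cost.
# The catalogue and its labels are invented for this illustration.

CATALOGUE = {
    "text classification": [
        ("reuse a pretrained small model", "low energy"),
        ("fine-tune a medium-sized model", "moderate energy"),
        ("train a large model from scratch", "high energy"),
    ],
}

def suggest_approaches(goal: str) -> list[tuple[str, str]]:
    """Match a vaguely stated goal against known tasks and suggest options."""
    for task, options in CATALOGUE.items():
        if task in goal.lower():
            return options
    return [("ask the developer to refine the goal", "unknown")]

for approach, cost in suggest_approaches("Text classification for user reviews"):
    print(f"- {approach} ({cost})")
```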

For such a tool to work, it must be able to explain to the user how a decision is reached, how a conclusion is drawn, and how a constraint is enforced. “This process is actually quite challenging. If an algorithm claims that one particular decision is 80% better, what does that mean to the user? That’s not the way people understand things.” Instead, she suggests contextualizing explanations within the project’s goals and the user’s own process, to make them more meaningful.
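To make that idea concrete, here is a small hypothetical sketch of such a contextualized explanation: rather than reporting a bare percentage, it reframes the energy difference between two design options in everyday terms. The function, the grid intensity, and the driving-distance factor are all illustrative assumptions, not part of the project.

```python
# Hypothetical contextualized explanation: translate an energy saving
# into CO2 and an everyday equivalent instead of a bare "80% better".
# Conversion factors are rough illustrative values.

GRID_G_PER_KWH = 300          # assumed grid intensity, gCO2eq per kWh
KM_DRIVEN_PER_KG_CO2 = 6.0    # ~0.17 kg CO2eq per km for a petrol car

def explain_saving(option_a_kwh: float, option_b_kwh: float) -> str:
    """Frame the difference between two design options in project terms."""
    saved_kwh = option_a_kwh - option_b_kwh
    saved_kg = saved_kwh * GRID_G_PER_KWH / 1_000
    return (f"Option B uses {saved_kwh:.0f} kWh less than option A: "
            f"about {saved_kg:.0f} kg CO2eq on this grid, "
            f"or roughly {saved_kg * KM_DRIVEN_PER_KG_CO2:.0f} km of driving.")

print(explain_saving(option_a_kwh=1_000, option_b_kwh=200))
# Option B uses 800 kWh less than option A: about 240 kg CO2eq ...
```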

The SustainML project is expected to have a significant impact on the so-called “democratisation of Green AI”, allowing not just tech giants but also SMEs, private enthusiasts, NGOs, and individual innovators to develop AI in a more sustainable way.