
On Monday, 4 December 2023 12:09:43 CET Joseph P. De Veaugh-Geiss wrote:

> I agree with the concerns Josh raises about the energy consumption of

> training LLMs (see, e.g., [1]). A benefit of satisfying the above

> characteristics is it is then possible for us to measure the energy

> consumption for training/using the LLMs. This would enable KDE to be

> transparent about what these tools consume in terms of energy and

> present this information to users.

To make this argument more complete: it's not only the training of such models that matters, but also their later usage ("inference"). For popular general-purpose models, the cumulative impact of inference can quickly exceed the impact of the training itself:

https://www.technologyreview.com/2023/12/01/1084189/making-an-image-with-generative-ai-uses-as-much-energy-as-charging-your-phone/


https://arxiv.org/abs/2311.16863
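To see why cumulative inference can overtake a one-off training cost, here is a back-of-envelope sketch. All numbers are illustrative placeholders I made up for the example, not measurements from the articles above:

```python
# Back-of-envelope comparison: one-off training energy vs. cumulative
# inference energy. Every constant below is a hypothetical placeholder,
# not a figure from the cited studies.

TRAINING_ENERGY_KWH = 1_000_000    # hypothetical one-off training cost
ENERGY_PER_QUERY_KWH = 0.002       # hypothetical cost of a single inference
QUERIES_PER_DAY = 10_000_000       # hypothetical usage of a popular model

def days_until_inference_exceeds_training(training_kwh, per_query_kwh,
                                          queries_per_day):
    """Days of usage after which cumulative inference energy passes
    the one-off training energy."""
    daily_inference_kwh = per_query_kwh * queries_per_day
    return training_kwh / daily_inference_kwh

days = days_until_inference_exceeds_training(
    TRAINING_ENERGY_KWH, ENERGY_PER_QUERY_KWH, QUERIES_PER_DAY)
print(f"Inference overtakes training after ~{days:.0f} days")
```

With these made-up numbers, inference energy surpasses training energy after about 50 days; the point is only that for a heavily used model the crossover can come quickly.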


That said, these and similar arguments mostly ignore the fact that the hyperscalers have committed to net-zero initiatives, are investing heavily in renewables for their data centers, and are also trying to shift heavy workloads into more sustainable time windows.



--

Alexander
