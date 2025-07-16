Artificial Intelligence

Small Changes in AI Models Can Produce Big Energy Savings

Small changes in the large language models (LLMs) at the heart of AI applications can result in substantial energy savings, according to a report released by the United Nations Educational, Scientific and Cultural Organization (UNESCO) on Monday.

The 35-page report titled “Smarter, smaller, stronger: resource-efficient generative AI & the future of digital transformation” outlines three ways AI developers and users can reduce the power gluttony of the technology.

1. Use smaller models.

Smaller models are just as smart and accurate as large ones, according to the report. Small models tailored to specific tasks can cut energy use by up to 90%, the report maintained.

Currently, users rely on large, general-purpose models for all their needs, it explained. Research shows that using smaller models tailored to specific tasks — like translation or summarization — can cut energy use significantly without losing performance. It’s a smarter, more cost- and resource-efficient approach, it continued, matching the right model to the right job, rather than relying on one large, all-purpose system for everything.

What’s more, energy-efficient, small models are more accessible in low-resource environments with limited connectivity, offer faster response times, and are more cost-effective.

2. Use shorter prompts and responses.

Streamlining input queries and response lengths can reduce energy use by over 50%, the report noted. It added that shortening inputs and outputs also reduces the cost of running LLMs.

3. Use compression to shrink the size of the model.

Model compression techniques, such as quantization, can achieve energy savings of up to 44% by reducing computational complexity, the report explained. Compression also reduces the cost of running LLMs by shrinking them and making them faster.

Why Smaller Models Use Less Energy

Smaller AI models consume less energy because they have less work to do. “Smaller AI models — what we call small language models — require fewer parameters, less memory, and significantly less GPU throughput,” explained Jim Olsen, CTO of ModelOp, a governance software company, in Chicago.

“That means lower power consumption during both training and inference,” he told TechNewsWorld. “You’re not running billions of operations per token. You’re optimizing for precision in a tighter domain, which leads to more sustainable compute costs.”

Larger models have many times more parameters than smaller models, and each time a model is asked a question, it has to perform mathematical calculations across those parameters to generate an answer.

“More parameters mean more calculations, which require more processing power from the GPUs and, therefore, consume more energy,” said Wyatt Mayham, head of AI consulting at Northwest AI Consulting (NAIC), a global provider of AI consulting services.

“It’s the digital equivalent of a V8 engine burning more gas than a four-cylinder, even when just idling,” he told TechNewsWorld. “A smaller, more specialized model simply has less computational overhead for each task.”
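The link between parameter count and compute can be sketched with a common rule of thumb: a dense transformer's forward pass costs roughly two floating-point operations per parameter for each generated token. That rule is an assumption here, not a figure from the UNESCO report, and the two model sizes below are hypothetical. Energy use does not track operation counts exactly, so the result is only a rough indicator.

```python
# Rule-of-thumb sketch: forward-pass compute for a dense transformer is
# roughly 2 FLOPs per parameter per token (an approximation, not a measurement).

def flops_per_token(num_parameters: int) -> int:
    """Approximate floating-point operations needed to generate one token."""
    return 2 * num_parameters


def relative_compute(small_params: int, large_params: int) -> float:
    """How many times more compute the larger model needs per token."""
    return flops_per_token(large_params) / flops_per_token(small_params)


# Hypothetical sizes: a 3-billion-parameter task model vs. a 70-billion one.
small = 3_000_000_000
large = 70_000_000_000
ratio = relative_compute(small, large)
print(f"Large model needs about {ratio:.1f}x the compute per token")
```

Under these assumptions the gap scales directly with the ratio of parameter counts, which is the point Mayham's engine analogy makes.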

Sagar Indurkhya, chief scientist at Virtualitics, an AI-powered analytics company in Pasadena, Calif., contended that smaller LLMs typically do not perform as well as larger or frontier models. It is possible, though, to fine-tune small LLMs on specific relevant data, such as proprietary data that cannot be shared outside a company, so that the tuned model’s performance on very specific tasks is competitive with that of frontier models, he said.

“If the goal is reducing power consumption for AI agents, use and adaptation of smaller LLMs is a path forward any company should carefully consider,” he told TechNewsWorld.

Cutting Chatty Prompts Saves Energy

Although AI models are often referred to as chatbots, it doesn’t pay to be chatty with the AI. “The model understands your intent,” said Mel Morris, CEO of Corpora.ai, maker of an AI search engine, in Derby, England.

“It doesn’t need pleasantries,” he told TechNewsWorld. “It doesn’t really want them. It doesn’t do it any good, but it has to pass those additional words to its model, and that costs compute time.”

Ian Holmes, director and global lead for enterprise fraud solutions at SAS, a software company that specializes in analytics, artificial intelligence, and data management solutions, in Cary, N.C., agreed that prompt brevity can be an energy saver. “It can be potentially quite impactful in reducing the overall energy footprint of AI interactions,” he told TechNewsWorld. “The more unnecessarily complex a prompt is, the more computational power will be required for the LLM to interpret and respond.”

“It’s easy to treat an LLM like a knowledgeable friend, engaging in long, chatty exchanges, but this can unintentionally increase the model’s workload,” he said. “Keeping prompts concise and focused helps reduce the amount of data the model needs to process. That, in turn, can lower the compute power required to generate a response.”

Shorter prompts, however, are not always practical. “Many prompts contain unnecessary context or examples that could be trimmed,” acknowledged Charles Yeomans, CEO and co-founder of AutoBeam, a data compaction and transmission optimization company, in Moraga, Calif.

“However, some tasks inherently require detailed prompts for accuracy,” he told TechNewsWorld. “The key is eliminating redundancy, not sacrificing necessary information.”

There can be a trade-off when it comes to shorter prompts, added Axel Abulafia, chief business officer with CloudX, a software engineering and AI solutions company in Manalapan, N.J. “Smaller prompts are better on paper, but if the error rate of these prompts is double or triple versus a prompt that is only 50% larger, then the equation is clear,” he told TechNewsWorld. “I’d say that smarter prompts can save much more energy than only smaller ones.”
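Abulafia's trade-off can be worked through as simple arithmetic. The sketch below assumes that a failed answer is always retried, that attempts are independent, and that tokens processed are a fair proxy for energy. None of these assumptions come from the article, and the token counts and error rates are made up for illustration.

```python
def expected_tokens_per_success(tokens_per_attempt: float, error_rate: float) -> float:
    """Expected tokens spent until one acceptable answer is produced.

    Assumes independent attempts with a retry after every failure, so the
    expected number of attempts is 1 / (1 - error_rate).
    """
    if not 0 <= error_rate < 1:
        raise ValueError("error_rate must be in the range [0, 1)")
    return tokens_per_attempt / (1 - error_rate)


# Hypothetical numbers: a terse prompt that fails 60% of the time versus
# a prompt 50% longer that fails 20% of the time (one third the error rate).
short_cost = expected_tokens_per_success(100, 0.60)
longer_cost = expected_tokens_per_success(150, 0.20)

print(f"Short prompt:  {short_cost:.1f} expected tokens per good answer")
print(f"Longer prompt: {longer_cost:.1f} expected tokens per good answer")
```

With these illustrative figures the longer prompt comes out cheaper per usable answer, because the short prompt's retries outweigh its smaller size.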

The challenge lies in maintaining quality, added NAIC’s Mayham. “A prompt that is too brief may lack the necessary context for the model to provide a useful or accurate response,” he said. “Likewise, forcing a response to be artificially short might strip it of important nuance.”

“It becomes a balancing act for developers,” he continued. “They need to design prompts that are concise yet contextually rich enough to get the job done. For many routine tasks, this is achievable, but for complex problem-solving, longer and more detailed interactions are often unavoidable.”

Risks and Rewards of Model Compression

UNESCO’s call for shrinking models can have drawbacks, too. “The primary risk is that you can compress a model too much and harm its performance,” Mayham noted. “Overly aggressive pruning or quantization can lead to a drop in accuracy, logical reasoning ability, or nuance, which might make the model unsuitable for its intended purpose. There’s a delicate balance between efficiency and capability.”

In addition, he continued, implementing compression techniques effectively requires deep technical expertise and significant experimentation. “It’s not a one-size-fits-all solution,” he said. “The right compression strategy depends on the specific model architecture and the target application. This can be a high barrier for teams without specialized AI/ML engineering talent.”
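The accuracy risk Mayham describes can be illustrated with a toy version of quantization. The sketch below applies symmetric uniform quantization to a handful of made-up weights at 8, 4, and 2 bits and reports how far the restored values drift from the originals. It is a simplified illustration, not the method used in the UNESCO report or in any particular library.

```python
def quantize(weights: list[float], bits: int) -> tuple[list[int], float]:
    """Symmetric uniform quantization: map floats to signed integer levels."""
    max_level = 2 ** (bits - 1) - 1            # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / max_level if max_abs else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(levels: list[int], scale: float) -> list[float]:
    """Map integer levels back to approximate float values."""
    return [q * scale for q in levels]


def mean_abs_error(weights: list[float], bits: int) -> float:
    """Average distance between original and restored weights."""
    levels, scale = quantize(weights, bits)
    restored = dequantize(levels, scale)
    return sum(abs(a - b) for a, b in zip(weights, restored)) / len(weights)


# Made-up weights, for illustration only.
weights = [0.82, -0.41, 0.05, -0.97, 0.33, 0.61]
for bits in (8, 4, 2):
    print(f"{bits}-bit: mean absolute error {mean_abs_error(weights, bits):.4f}")
```

Fewer bits mean less memory per weight but a larger rounding error, which is the balance between efficiency and capability that the quote refers to.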

The key to reducing AI energy consumption is combining multiple optimizations — smaller models, compression, efficient prompting, better hardware utilization — to multiply savings, maintained AutoBeam’s Yeomans.

“Also consider caching common responses and using specialized models for specific tasks,” he said, “rather than general-purpose LLMs for everything.”
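A minimal sketch of response caching follows. The class name, the normalization rule, and the `fake_model` stand-in are all hypothetical; a real deployment would call an actual model and would need a policy for expiring stale answers.

```python
class CachedResponder:
    """Answers repeated prompts from memory instead of re-running the model."""

    def __init__(self, model_call):
        self._model_call = model_call   # any function: prompt -> response
        self._cache = {}
        self.model_calls = 0            # how often the model actually ran

    @staticmethod
    def _normalize(prompt: str) -> str:
        # Treat prompts that differ only in case or spacing as the same.
        return " ".join(prompt.lower().split())

    def ask(self, prompt: str) -> str:
        key = self._normalize(prompt)
        if key not in self._cache:
            self.model_calls += 1
            self._cache[key] = self._model_call(prompt)
        return self._cache[key]


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an expensive LLM call."""
    return f"answer to: {prompt.strip()}"


responder = CachedResponder(fake_model)
first = responder.ask("What is quantization?")
second = responder.ask("  what is   QUANTIZATION? ")
print(responder.model_calls)  # prints 1: the second request was served from cache
```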

“Even if it is tempting to always throw LLMs at every problem, a good rule of thumb is that solutions should go from simple to complex,” added CloudX’s Abulafia. “There are many problems that can be solved using tried-and-true algorithms. You can use those as baselines and grow in complexity from there. First to smaller fine-tuned models, and only then to large models. Always working smart and realizing that bigger is not always better.”
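The simple-to-complex rule of thumb can be sketched as a cascade that tries the cheapest option first and escalates only when that option declines. The three handlers below are hypothetical stand-ins: a tiny rule-based adder, a placeholder for a small fine-tuned model, and a placeholder for a large model.

```python
from typing import Callable, Optional

Handler = Callable[[str], Optional[str]]


def rule_based(query: str) -> Optional[str]:
    """Tried-and-true algorithm: handles only sums of the form 'a + b'."""
    parts = query.split("+")
    if len(parts) != 2:
        return None
    try:
        return str(int(parts[0]) + int(parts[1]))
    except ValueError:
        return None


def small_model(query: str) -> Optional[str]:
    """Hypothetical small model that accepts only short requests."""
    if len(query.split()) <= 5:
        return f"[small model] {query}"
    return None


def large_model(query: str) -> Optional[str]:
    """Hypothetical large model used as the last resort."""
    return f"[large model] {query}"


def answer(query: str, tiers: list[tuple[str, Handler]]) -> tuple[str, str]:
    """Return (tier name, result) from the first tier that can answer."""
    for name, handler in tiers:
        result = handler(query)
        if result is not None:
            return name, result
    raise RuntimeError("no tier could handle the query")


TIERS = [("rules", rule_based), ("small", small_model), ("large", large_model)]

print(answer("2 + 3", TIERS))
print(answer("Translate hello to French", TIERS))
print(answer("Compare three strategies for migrating a legacy billing system", TIERS))
```

In practice, deciding when a cheaper tier should decline is the hard part, and the length check used here is only a placeholder for that decision.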

John P. Mello Jr.

John P. Mello Jr. has been an ECT News Network reporter since 2003. His areas of focus include cybersecurity, IT issues, privacy, e-commerce, social media, artificial intelligence, big data and consumer electronics. He has written and edited for numerous publications, including the Boston Business Journal, the Boston Phoenix, Megapixel.Net and Government Security News.
