Tactic: Offload LLM Content Generation to Remote Server
Tactic sort: Awesome Tactic
Type: Architectural Tactic
Category: green-ml-enabled-systems
Title
Offload LLM Content Generation to Remote Server
Description
Shift LLM content generation from client devices to remote servers with high-performance hardware. This reduces energy use on client devices, shortens execution time, and provides a smoother user experience when privacy and offline access are not strict requirements.
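A minimal client-side sketch of the pattern, in Python with the requests library. The endpoint URL, request schema, and the privacy/offline fallback logic are illustrative assumptions, not details taken from the tactic itself:

```python
import requests

REMOTE_ENDPOINT = "https://inference.example.com/v1/generate"  # hypothetical server URL

def generate_remote(prompt: str, max_words: int = 500) -> str:
    """Offload generation to a high-performance remote server."""
    resp = requests.post(
        REMOTE_ENDPOINT,
        json={"prompt": prompt, "max_words": max_words},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["text"]

def generate_on_device(prompt: str) -> str:
    # Placeholder for a local, energy-intensive model call on the client.
    raise NotImplementedError("wire up a local model here")

def generate(prompt: str, privacy_sensitive: bool = False, online: bool = True) -> str:
    # Prefer the energy-efficient remote path; fall back to on-device
    # inference only when privacy or offline access is a hard requirement.
    if online and not privacy_sensitive:
        return generate_remote(prompt)
    return generate_on_device(prompt)
```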
Participant
Software architects and developers
Related software artifact
AI-powered applications
Context
Applications integrating LLMs on client devices
Software feature
Content generation using LLMs
Tactic intent
To reduce energy consumption on client devices by performing LLM inference on remote servers
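For completeness, a sketch of the server-side counterpart that receives the offloaded work. FastAPI, the /v1/generate route, and the request schema are assumptions chosen to match the client sketch above; any HTTP framework would do:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class GenerateRequest(BaseModel):
    prompt: str
    max_words: int = 500

@app.post("/v1/generate")
def generate(req: GenerateRequest) -> dict:
    # Run inference on the server's high-performance hardware and
    # return only the generated text to the client.
    text = run_server_model(req.prompt, req.max_words)
    return {"text": text}

def run_server_model(prompt: str, max_words: int) -> str:
    # Placeholder for the server-hosted (e.g. GPU-backed) LLM call.
    raise NotImplementedError("wire up the server-hosted model here")
```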
Target quality attribute
Energy efficiency
Other related quality attributes
Performance, User experience, Privacy
Measured impact
Fetching content from a remote server consumed 3.5×–8.9× less energy than on-device generation across 100-, 500-, and 1,000-word outputs
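To gauge the impact in your own setting, one rough replication sketch is to compare the client-side carbon cost of the remote fetch against a local run using the codecarbon library. This is an assumption about methodology, not the original study's measurement setup, and the endpoint is hypothetical:

```python
from codecarbon import EmissionsTracker
import requests

def measure(label: str, workload) -> float:
    """Estimate the client-side carbon cost of running `workload`."""
    tracker = EmissionsTracker(project_name=label)
    tracker.start()
    workload()
    emissions_kg = tracker.stop()  # estimated kg CO2-eq for this run
    print(f"{label}: {emissions_kg:.6f} kg CO2-eq")
    return emissions_kg

measure("remote-fetch", lambda: requests.post(
    "https://inference.example.com/v1/generate",  # hypothetical endpoint
    json={"prompt": "Write 500 words about solar power."},
    timeout=120,
))
# measure("on-device", lambda: generate_on_device("..."))  # your local LLM call
```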
