Tactic: RAG Context Filtering and Compression
Tactic sort: Awesome Tactic
Type: Architectural Tactic
Category: green-ml-enabled-systems
Title
RAG Context Filtering and Compression
Description
Filtering and compressing the retrieved context greatly reduces context length and resource use, making this an energy-efficient tactic. For example, Provence (Pruning and Reranking Of Retrieved Relevant Contexts) [11] dynamically determines the optimal pruning level, ensuring that only the necessary context is passed to the model for more energy-efficient processing.
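The sketch below illustrates the general shape of this tactic: score each retrieved passage against the query and drop anything below a relevance threshold before building the prompt. It is a minimal, self-contained illustration only; the lexical-overlap scorer, the `score_relevance`/`prune_context` names, and the threshold values are all assumptions, and real systems such as Provence use a learned pruning/reranking model instead.

```python
# Minimal sketch of relevance-based context pruning for RAG.
# The scoring function is a simple lexical-overlap stand-in; real
# systems such as Provence use a learned pruning/reranking model.

def score_relevance(query: str, passage: str) -> float:
    """Toy relevance score: fraction of query terms present in the passage."""
    query_terms = set(query.lower().split())
    passage_terms = set(passage.lower().split())
    return len(query_terms & passage_terms) / max(len(query_terms), 1)

def prune_context(query: str, passages: list[str],
                  threshold: float = 0.3, max_passages: int = 3) -> list[str]:
    """Keep only passages relevant enough to the query, capped at max_passages.

    A shorter prompt means fewer tokens processed by the LLM, and thus
    less compute and energy per request.
    """
    scored = sorted(((score_relevance(query, p), p) for p in passages),
                    reverse=True)
    return [p for s, p in scored[:max_passages] if s >= threshold]

if __name__ == "__main__":
    query = "How does context pruning save energy in RAG?"
    passages = [
        "Context pruning removes irrelevant passages before generation.",
        "Shorter context means fewer tokens and less compute per query.",
        "The weather in Lisbon is mild in spring.",
    ]
    for p in prune_context(query, passages):
        print("KEEP:", p)
```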
Participant
AI and RAG Practitioners.
Related software artifact
RAG-Based Systems.
Context
RAG. Unsustainable RAG. Green AI.
Software feature
COCOM. EasyRAG. FiD-Light. Filtering Context. PISCO. Provence. Recomp.
Tactic intent
Environmentally sustainable RAG through energy efficiency and reduction of computational waste.
Target quality attribute
Energy Efficiency.
Other related quality attributes
Reduced Context Length.
Measured impact
COCOM [40] compresses context embeddings, achieving a 5.69× inference speed-up and a 22× reduction in computational operations (GFLOPs), which translates into significant energy savings.
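To make the compression idea concrete, the toy sketch below replaces many token embeddings with a few mean-pooled summary vectors, so a decoder would attend over far fewer context positions. This is not COCOM's actual method, which trains a compressor jointly with the decoder; the `compress_context` name, the mean-pooling choice, and the sizes are assumptions for illustration only.

```python
# Toy illustration of context-embedding compression: replace many
# token embeddings with a few summary vectors. (COCOM itself trains
# a compressor jointly with the decoder; mean pooling here is just a
# stand-in to show the shape of the technique.)

import numpy as np

def compress_context(token_embeddings: np.ndarray, num_vectors: int) -> np.ndarray:
    """Mean-pool contiguous chunks of token embeddings into summary vectors.

    token_embeddings: (num_tokens, dim) array.
    Returns a (num_vectors, dim) array; the attention cost over the
    context shrinks roughly by a factor of num_tokens / num_vectors.
    """
    chunks = np.array_split(token_embeddings, num_vectors, axis=0)
    return np.stack([chunk.mean(axis=0) for chunk in chunks])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = rng.normal(size=(512, 768))          # e.g., 512 context tokens
    compressed = compress_context(tokens, num_vectors=32)
    print(tokens.shape, "->", compressed.shape)   # (512, 768) -> (32, 768)
```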
