6x More Cache Capacity: Compressing Redis on a 1GB Server
The Problem: Running Out of Cache Space
Our e-commerce platform runs on a 1GB Redis instance on DigitalOcean ($15/month). It worked fine initially, but as traffic grew and we cached more endpoints, we started hitting memory limits.
The symptoms were clear:
- Redis memory usage consistently above 90%
- LRU evictions happening too frequently
- SEO-critical pages getting evicted before they could be re-requested
- Cold cache misses increasing during peak traffic
We had two options: upgrade to a 2GB instance ($30/month) or optimize what we had.
I chose optimization. Here's how we achieved 6x more cache capacity without changing our infrastructure.
Understanding the Cache Structure
Our caching layer wraps Nuxt 3 API endpoints with a custom handler that stores responses in Redis. The structure looked like this:
```ts
type CacheEntry<T> = {
  data: T;           // The actual API response
  createdAt: number; // Timestamp for cache invalidation
};
```
Every cached response gets wrapped in this envelope and stored as JSON in Redis. The problem? JSON is verbose, especially when your responses contain:
- Multi-language translations (11 languages: PL, EN, DE, UK, RU, HU, RO, FR, SL, ES, IT)
- Nested product data with repeated field names
- Large arrays of similar objects
A typical product listing response might look like this (simplified):
```json
{
  "data": {
    "products": [
      {
        "id": 12345,
        "code": "QUATRO-860",
        "nameCore": {
          "translations": {
            "pl": "Zlewozmywak granitowy",
            "en": "Granite sink",
            "de": "Granitspüle",
            "uk": "Гранітна мийка"
          }
        },
        "price": { "gross": 1299.00, "net": 1056.10 },
        "mainPhotoFullPath": "/products/quatro-860/main.jpg"
      }
    ],
    "count": 1847,
    "pagesCount": 93
  },
  "createdAt": 1735405200000
}
```
Multiply this by 50 products per page, 93 pages, and 11 language variants... you get the picture.
Why Compression Works So Well for JSON
JSON has characteristics that make it highly compressible:
- Repeated keys: Every object in an array has the same keys (`"id"`, `"code"`, `"nameCore"`)
- Repeated values: Translation objects have the same structure across all entries
- Text-heavy content: Product names and descriptions are natural language text
- Predictable patterns: JSON syntax itself (`{`, `}`, `"`, `:`) is repetitive
Gzip compression exploits these patterns using the DEFLATE algorithm, which combines LZ77 (finding repeated sequences) and Huffman coding (using shorter codes for common patterns).
For our JSON data, we consistently see 70-85% compression ratios.
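If you want to sanity-check those numbers against your own payloads, a minimal standalone sketch like this (not part of our codebase) gzips a synthetic response shaped like the product listing above and prints the savings:

```ts
// Standalone sketch: measure gzip savings on a synthetic, repetitive payload.
// The payload shape mimics the product listing example; exact numbers will vary.
import pako from 'pako';

const products = Array.from({ length: 50 }, (_, i) => ({
  id: 10000 + i,
  code: `QUATRO-${800 + i}`,
  nameCore: {
    translations: {
      pl: 'Zlewozmywak granitowy',
      en: 'Granite sink',
      de: 'Granitspüle',
      uk: 'Гранітна мийка',
    },
  },
  price: { gross: 1299.0, net: 1056.1 },
  mainPhotoFullPath: `/products/quatro-${800 + i}/main.jpg`,
}));

const json = JSON.stringify({ data: { products, count: 1847, pagesCount: 93 }, createdAt: Date.now() });
const gzipped = pako.gzip(json, { level: 6 });

const saved = 1 - gzipped.length / Buffer.byteLength(json);
console.log(`raw ${Buffer.byteLength(json)} B -> gzip ${gzipped.length} B (${(saved * 100).toFixed(1)}% smaller)`);
```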
The Implementation
Here's the complete compression layer we added to our cache utility:
```ts
import { defineEventHandler as originalDefineEventHandler, type H3Event } from 'h3';
import { getCurrentUTC } from '~/utils/timezone.utils';
// eslint-disable-next-line import/default
import pako from 'pako';

const TTL = 259200; // 3 days in seconds
const FULL_RESET_FLAG_KEY = 'cache:flag:full';
const PARTIAL_RESET_FLAG_KEY = 'cache:flag:partial';

type CacheEntry<T> = {
  data: T;
  createdAt: number;
};

function compressData<T>(entry: CacheEntry<T>): string {
  const json = JSON.stringify(entry);
  const compressed = pako.gzip(json, { level: 6 });
  return Buffer.from(compressed).toString('base64');
}

function decompressData<T>(compressed: string): CacheEntry<T> {
  const buffer = Buffer.from(compressed, 'base64');
  const decompressed = pako.ungzip(buffer, { to: 'string' });
  return JSON.parse(decompressed) as CacheEntry<T>;
}
```
Why These Specific Choices?
Compression level 6: This is the sweet spot between compression ratio and CPU usage. Level 9 gives marginally better compression but takes significantly longer. Level 1 is fast but leaves size on the table.
- Level 1: Fast, ~60% compression
- Level 6: Balanced, ~78% compression (our choice)
- Level 9: Slow, ~82% compression
Base64 encoding: Redis stores strings efficiently, and base64 ensures our binary gzip output is safely serializable. The ~33% overhead from base64 is more than offset by the 75%+ compression.
pako library: It's the standard JavaScript implementation of zlib, used by JSZip and many other libraries. Production-tested, fast, and already in our dependency tree.
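The ratio figures above come from our production payloads; to check the level tradeoff on your own data, a rough benchmark sketch like this (illustrative only, assuming Node 16+ for the global `performance`) compares the levels side by side:

```ts
// Illustrative benchmark: compression ratio and time for gzip levels 1, 6, and 9.
import pako from 'pako';

function measure(json: string, level: 1 | 6 | 9): void {
  const start = performance.now();
  const out = pako.gzip(json, { level });
  const ms = performance.now() - start;
  const saved = 1 - out.length / Buffer.byteLength(json);
  console.log(`level ${level}: ${(saved * 100).toFixed(1)}% smaller in ${ms.toFixed(1)} ms`);
}

// A repetitive JSON payload stands in for a real cached response.
const payload = JSON.stringify({ items: Array(2000).fill({ name: 'Granite sink', price: 1299 }) });
for (const level of [1, 6, 9] as const) measure(payload, level);
```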
Integrating with the Cache Handler
The cache handler needed minimal changes. Here's the core flow:
```ts
export const defineCustomCacheEventHandler = <T>(
  handler: (event: H3Event) => T | Promise<T>,
) => {
  return originalDefineEventHandler(async (event: H3Event) => {
    // Skip cache in development
    if (process.env.NODE_ENV === 'development') {
      return await handler(event);
    }

    const url = getRequestURL(event);
    const cacheKey = encodeURIComponent(url.hostname + url.pathname + url.search);
    const storage = useStorage('cache');

    // Fetch cache entry and invalidation flags in parallel
    const [compressed, fullResetRaw, partialResetRaw] = await Promise.all([
      storage.getItem<string>(cacheKey),
      storage.getItem<number>(FULL_RESET_FLAG_KEY),
      storage.getItem<number>(PARTIAL_RESET_FLAG_KEY),
    ]);

    // Decompress with error handling
    let cachedEntry: CacheEntry<T> | null = null;
    if (compressed) {
      try {
        cachedEntry = decompressData<T>(compressed);
      } catch (error) {
        console.error(`[Cache] Decompression failed - ${cacheKey}:`, error);
        // Treat as cache miss, will regenerate
      }
    }

    // ... cache hit/miss logic ...

    // On cache miss: generate, compress, store
    const result = await handler(event);
    try {
      const entry: CacheEntry<T> = {
        data: result,
        createdAt: getCurrentUTC().getTime(),
      };
      const compressedData = compressData(entry);
      await storage.setItem(cacheKey, compressedData, { ttl: TTL });
    } catch (error) {
      console.error(`[Cache] Compression/storage failed - ${cacheKey}:`, error);
      // Continue to return result even if caching failed
    }
    return result;
  });
};
```
Error Handling is Critical
Notice the try-catch blocks around both compression and decompression. This is essential because:
- Corrupted cache entries: If Redis data gets corrupted, decompression will fail
- Migration period: Old uncompressed entries will fail to decompress (we treat this as a cache miss)
- Out of memory: Compression of very large objects could theoretically fail
- Graceful degradation: The API always returns data, even if caching fails
The key insight: a failed cache operation should never break the API. Users get their data; we just log the error and move on.
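One way to keep that policy in a single place is a small helper that turns any decompression failure into a miss; this is a sketch of the idea (reusing `decompressData` and `CacheEntry` from above), not code we shipped:

```ts
// Sketch: any failure while reading the cache degrades to a miss (returns null).
function safeDecompress<T>(compressed: string | null, cacheKey: string): CacheEntry<T> | null {
  if (!compressed) return null;
  try {
    return decompressData<T>(compressed);
  } catch (error) {
    // Corrupted or legacy entry: log it and let the handler regenerate the data.
    console.error(`[Cache] Decompression failed - ${cacheKey}:`, error);
    return null;
  }
}
```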
The Results
After deploying to production and letting the cache rebuild with compressed entries:
| Metric | Before | After | Improvement |
|---|---|---|---|
| Avg entry size | ~200KB | ~40KB | 80% smaller |
| Entries in 1GB | ~5,000 | ~25,000 | 5x more |
| Memory usage | 92% | 45% | 47 points lower |
| LRU evictions/hour | ~150 | ~10 | 93% fewer |
Real-World Entry Sizes
Here's what we observed for different endpoint types:
| Endpoint | Uncompressed | Compressed | Ratio |
|---|---|---|---|
| `/api/public/products` (50 items) | 180KB | 28KB | 84% |
| `/api/public/global` (navigation) | 520KB | 78KB | 85% |
| `/api/public/search` (complex) | 340KB | 45KB | 87% |
| `/api/public/categories/*/details/*` | 8KB | 1.2KB | 85% |
| `/api/public/landings/*` (heavy) | 2.1MB | 310KB | 85% |
The landing pages were our biggest win. These contain 15 product collections with 50+ products each, all with multi-language translations. Compressing a 2MB response down to 310KB is significant.
Performance Overhead
Compression isn't free. Here's what we measured:
| Operation | Time | Impact |
|---|---|---|
| Compression (200KB JSON) | 3-5ms | Added to cache writes |
| Decompression (40KB gzip) | 1-2ms | Added to cache reads |
| Base64 encode/decode | ~1ms | Negligible |
For a cache hit, we're adding ~2ms of decompression time. But consider:
- A database query for the same data takes 50-200ms
- Network latency to the database is 5-10ms
- The compressed response transfers faster from Redis to the app server
Net effect: Cache hits are still dramatically faster than cache misses, and the reduced memory pressure means fewer cache misses overall.
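The timings above are from our own measurements; a quick way to reproduce them on your payloads is to loop the helpers defined earlier, as in this rough sketch (numbers will vary with hardware and payload shape):

```ts
// Rough timing sketch, reusing the compressData/decompressData helpers defined above.
const entry = {
  data: { rows: Array(1500).fill({ name: 'Granite sink', price: 1299 }) },
  createdAt: Date.now(),
};

const compressed = compressData(entry);

const start = performance.now();
for (let i = 0; i < 100; i++) decompressData(compressed);
const elapsed = performance.now() - start;

console.log(`avg decompression: ${(elapsed / 100).toFixed(2)} ms for a ${compressed.length}-char entry`);
```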
Type Safety Throughout
One concern with compression is losing TypeScript's type safety. Our approach preserves it completely:
```ts
// The generic type T flows through the entire chain
function compressData<T>(entry: CacheEntry<T>): string
function decompressData<T>(compressed: string): CacheEntry<T>

// Usage in the handler maintains type inference
const cachedEntry = decompressData<T>(compressed);
// cachedEntry.data is typed as T

const entry: CacheEntry<T> = { data: result, createdAt: ... };
const compressedData = compressData(entry);
// compressedData is string, entry.data is T
```
The compression layer is transparent to the rest of the application. Handlers return typed data, consumers receive typed data, and the compression/decompression happens invisibly in between.
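To make that concrete, here's a hypothetical endpoint using the wrapper (in a Nuxt project the utility is auto-imported from `server/utils`); the file path, `ProductListResponse` shape, and `fetchProducts` helper are illustrative, not from our codebase:

```ts
// Hypothetical server/api/public/products.get.ts
type ProductListResponse = {
  products: Array<{ id: number; code: string }>;
  count: number;
};

export default defineCustomCacheEventHandler(async (event): Promise<ProductListResponse> => {
  // fetchProducts is a placeholder for whatever loads the data from the database.
  return await fetchProducts(event);
});
// The stored entry is CacheEntry<ProductListResponse>; callers still receive ProductListResponse.
```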
Handling the Migration
When we deployed this change, we had thousands of existing cache entries in the old uncompressed format. We had two choices:
- Flush Redis and start fresh: Simple, but causes a thundering herd of cache misses
- Backward compatibility: Detect and handle both formats
We chose option 1 (flush), but with our error handling, option 2 would have worked automatically:
```ts
// Old format: { data: {...}, createdAt: 123 } (object)
// New format: "H4sIAAAAAAAA..." (base64 gzip string)
// When decompressData receives an object instead of a string,
// Buffer.from() throws, the catch block fires, and it's treated as a cache miss.
```
The old entries naturally expire (3-day TTL) or get overwritten with compressed versions on the next request. The error logging helped us monitor the migration progress.
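For reference, option 2 would only have needed a small format check in front of decompression; here's a sketch of that approach (we didn't ship it), reusing `decompressData` and `CacheEntry` from above:

```ts
// Sketch of backward-compatible reads during a migration (not what we deployed).
function readEntry<T>(raw: unknown): CacheEntry<T> | null {
  if (raw == null) return null;
  // Legacy format: the storage layer hands back the old uncompressed object.
  if (typeof raw === 'object') return raw as CacheEntry<T>;
  // New format: base64-encoded gzip string.
  if (typeof raw === 'string') {
    try {
      return decompressData<T>(raw);
    } catch {
      return null; // corrupted entry: treat as a cache miss
    }
  }
  return null;
}
```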
Redis Configuration Tips
To maximize the benefits of compression, ensure your Redis is configured correctly:
1. Eviction Policy
```bash
# Check current policy
CONFIG GET maxmemory-policy

# Set LRU eviction (recommended)
CONFIG SET maxmemory-policy allkeys-lru
```
With allkeys-lru, Redis evicts the least recently used keys when memory is full. This keeps your hot (frequently accessed) pages in cache while evicting rarely-used filter combinations.
2. Memory Limit
On DigitalOcean managed Redis, memory is pre-configured. But if you're self-hosting:
```bash
# Leave ~10% headroom for Redis overhead
CONFIG SET maxmemory 950mb  # for a 1GB instance
```
3. Monitor Memory
```bash
# Check memory usage
INFO memory

# Key metrics to watch:
# - used_memory_human
# - used_memory_peak_human
# - evicted_keys
```
When NOT to Compress
Compression isn't always the right choice. Skip it when:
- Data is already compressed: Images, PDFs, pre-compressed assets
- Entries are tiny: Entries under 1KB might grow after base64 encoding
- CPU is the bottleneck: If you're already CPU-bound, compression adds load
- Ultra-low latency required: Sub-millisecond requirements might not tolerate 2ms overhead
For our use case (API responses averaging 50-500KB, memory-constrained, latency tolerance of 50ms+), compression is a clear win.
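If tiny entries ever dominate your workload, a size guard can skip compression below a threshold and tag compressed values so reads know which path to take; the 1 KB cutoff and the `gz:` prefix in this sketch are assumptions, and it reuses the `pako` import and `CacheEntry` type from earlier:

```ts
// Sketch: compress only when it pays off; a prefix marks which format was stored.
const MIN_COMPRESS_BYTES = 1024; // illustrative threshold

function maybeCompress<T>(entry: CacheEntry<T>): string {
  const json = JSON.stringify(entry);
  if (Buffer.byteLength(json) < MIN_COMPRESS_BYTES) return json; // small: store as plain JSON
  return 'gz:' + Buffer.from(pako.gzip(json, { level: 6 })).toString('base64');
}

function maybeDecompress<T>(stored: string): CacheEntry<T> {
  if (!stored.startsWith('gz:')) return JSON.parse(stored) as CacheEntry<T>;
  const buffer = Buffer.from(stored.slice(3), 'base64');
  return JSON.parse(pako.ungzip(buffer, { to: 'string' })) as CacheEntry<T>;
}
```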
The Bigger Picture: Compression vs. Other Optimizations
Before implementing compression, we considered alternatives:
Alternative 1: Upgrade Redis to 2GB (+$15/month)
- Pro: Zero code changes
- Con: Doesn't solve the underlying inefficiency, just delays it
- Con: Doubles infrastructure cost
Alternative 2: Cache fewer endpoints
- Pro: Reduces cache size
- Con: More database load, slower responses
- Con: SEO pages need caching
Alternative 3: Shorter TTLs
- Pro: Entries expire faster, less memory used
- Con: More cache misses, more database load
- Con: Stale-while-revalidate becomes less effective
Alternative 4: Per-language caching
- Pro: Cache only the language user needs (90% smaller per entry)
- Con: 11x more cache keys
- Con: Requires cache key refactoring
We chose compression because it:
- Is transparent to the application
- Maintains all existing behavior
- Provides a 5-6x capacity improvement
- Requires minimal code changes
- Requires no infrastructure changes
Code: The Complete Implementation
For reference, here's the complete cache utility with compression:
```ts
import { defineEventHandler as originalDefineEventHandler, type H3Event } from 'h3';
import type { InternalApi } from 'nitropack';
import { getServerSession } from '#auth';
import { getCurrentUTC } from '~/utils/timezone.utils';
// eslint-disable-next-line import/default
import pako from 'pako';

const TTL = 259200;
const FULL_RESET_FLAG_KEY = 'cache:flag:full';
const PARTIAL_RESET_FLAG_KEY = 'cache:flag:partial';

type CacheEntry<T> = {
  data: T;
  createdAt: number;
};

function compressData<T>(entry: CacheEntry<T>): string {
  const json = JSON.stringify(entry);
  const compressed = pako.gzip(json, { level: 6 });
  return Buffer.from(compressed).toString('base64');
}

function decompressData<T>(compressed: string): CacheEntry<T> {
  const buffer = Buffer.from(compressed, 'base64');
  const decompressed = pako.ungzip(buffer, { to: 'string' });
  return JSON.parse(decompressed) as CacheEntry<T>;
}

export const defineCustomCacheEventHandler = <T>(
  handler: (event: H3Event) => T | Promise<T>,
) => {
  return originalDefineEventHandler(async (event: H3Event) => {
    if (process.env.NODE_ENV === 'development') {
      return await handler(event);
    }

    const url = getRequestURL(event);
    const cacheKey = encodeURIComponent(url.hostname + url.pathname + url.search);
    const storage = useStorage('cache');

    const [compressed, fullResetRaw, partialResetRaw] = await Promise.all([
      storage.getItem<string>(cacheKey),
      storage.getItem<number>(FULL_RESET_FLAG_KEY),
      storage.getItem<number>(PARTIAL_RESET_FLAG_KEY),
    ]);

    let cachedEntry: CacheEntry<T> | null = null;
    if (compressed) {
      try {
        cachedEntry = decompressData<T>(compressed);
      } catch (error) {
        console.error(`[Cache] Decompression failed - ${cacheKey}:`, error);
      }
    }

    const fullReset = fullResetRaw || 0;
    const partialReset = partialResetRaw || 0;
    const entryCreatedAt = cachedEntry?.createdAt || 0;

    // Fresh cache hit
    if (cachedEntry && entryCreatedAt > fullReset && entryCreatedAt > partialReset) {
      return cachedEntry.data;
    }

    const session = await getServerSession(event);

    // Stale-while-revalidate for non-authenticated users
    if (!session?.user && cachedEntry && entryCreatedAt > fullReset) {
      const waitUntil = event.context.waitUntil || event.context.cloudflare?.ctx?.waitUntil;

      const revalidate = async () => {
        try {
          const result = await handler(event);
          const entry: CacheEntry<T> = { data: result, createdAt: getCurrentUTC().getTime() };
          const compressedData = compressData(entry);
          await storage.setItem(cacheKey, compressedData, { ttl: TTL });
        } catch (error) {
          console.error(`[Cache] Revalidation failed - ${cacheKey}:`, error);
        }
      };

      if (waitUntil) {
        waitUntil(revalidate());
      } else {
        revalidate().catch(() => {});
      }

      return cachedEntry.data;
    }

    // Cache miss - generate fresh data
    const result = await handler(event);
    try {
      const entry: CacheEntry<T> = { data: result, createdAt: getCurrentUTC().getTime() };
      const compressedData = compressData(entry);
      await storage.setItem(cacheKey, compressedData, { ttl: TTL });
    } catch (error) {
      console.error(`[Cache] Compression/storage failed - ${cacheKey}:`, error);
    }
    return result;
  });
};

type PublicKeys<T> = {
  [K in keyof T]: K extends `${string}public${string}` ? K : never;
}[keyof T];

export const cleanCachePartially = async (_partsOfKeyRaw: PublicKeys<InternalApi>[]) => {
  const storage = useStorage('cache');
  await storage.setItem(PARTIAL_RESET_FLAG_KEY, getCurrentUTC().getTime());
};

export const cleanCacheCompletely = async () => {
  const storage = useStorage('cache');
  await storage.setItem(FULL_RESET_FLAG_KEY, getCurrentUTC().getTime());
};
```
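As a usage note, the invalidation helpers are meant to be called from privileged code paths; a hypothetical admin route might look like the sketch below (the path, request body shape, and the key passed to `cleanCachePartially` are illustrative, not from our codebase):

```ts
// Hypothetical server/api/admin/cache/flush.post.ts
export default defineEventHandler(async (event) => {
  const { scope } = await readBody<{ scope: 'full' | 'partial' }>(event);

  if (scope === 'full') {
    await cleanCacheCompletely();
  } else {
    // The key must be a public route name from InternalApi; this one is illustrative.
    await cleanCachePartially(['/api/public/products']);
  }

  return { ok: true };
});
```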
Key Takeaways
- JSON compresses extremely well (70-85% for typical API responses)
- Compression overhead is minimal (~2-5ms) compared to cache miss cost (~100-500ms)
- Error handling is essential - never let cache failures break your API
- Type safety can be preserved through generics
- Start with compression level 6 - it's the sweet spot for most use cases
- Configure Redis eviction policy to maximize cache efficiency
Final Thoughts
This optimization took about 2 hours to implement and test. The result? Our $15/month instance now holds what we would otherwise have needed a $30/month instance for. More importantly, our SEO-critical pages stay cached, and users get faster responses.
Sometimes the best infrastructure optimization isn't adding more resources—it's using what you have more efficiently.
The compression approach is particularly valuable for:
- Multi-language applications (lots of repeated translation structures)
- E-commerce catalogs (similar product objects repeated)
- Any API returning arrays of similar objects
- Memory-constrained environments (serverless, small VPS)
If you're running into Redis memory limits and your cached data is JSON, give compression a try. The implementation is straightforward, the benefits are immediate, and the risks are minimal with proper error handling.