6x More Cache Capacity: Compressing Redis on a 1GB Server

Kamil Owczarek

The Problem: Running Out of Cache Space

Our e-commerce platform runs on a 1GB Redis instance on DigitalOcean ($15/month). It worked fine initially, but as traffic grew and we cached more endpoints, we started hitting memory limits.

The symptoms were clear:

  • Redis memory usage consistently above 90%
  • LRU evictions happening too frequently
  • SEO-critical pages getting evicted before they could be re-requested
  • Cold cache misses increasing during peak traffic

We had two options: upgrade to a 2GB instance ($30/month) or optimize what we had.

I chose optimization. Here's how we achieved 6x more cache capacity without changing our infrastructure.

Understanding the Cache Structure

Our caching layer wraps Nuxt 3 API endpoints with a custom handler that stores responses in Redis. The structure looked like this:

type CacheEntry<T> = {
    data: T;           // The actual API response
    createdAt: number; // Timestamp for cache invalidation
};

Every cached response gets wrapped in this envelope and stored as JSON in Redis. The problem? JSON is verbose, especially when your responses contain:

  • Multi-language translations (11 languages: PL, EN, DE, UK, RU, HU, RO, FR, SL, ES, IT)
  • Nested product data with repeated field names
  • Large arrays of similar objects

A typical product listing response might look like this (simplified):

{
  "data": {
    "products": [
      {
        "id": 12345,
        "code": "QUATRO-860",
        "nameCore": {
          "translations": {
            "pl": "Zlewozmywak granitowy",
            "en": "Granite sink",
            "de": "Granitspüle",
            "uk": "Гранітна мийка"
          }
        },
        "price": { "gross": 1299.00, "net": 1056.10 },
        "mainPhotoFullPath": "/products/quatro-860/main.jpg"
      }
    ],
    "count": 1847,
    "pagesCount": 93
  },
  "createdAt": 1735405200000
}

Multiply this by 50 products per page, 93 pages, and 11 language variants and you get the picture: at roughly 180KB per listing page, that one endpoint family alone approaches 200MB of cache.

Why Compression Works So Well for JSON

JSON has characteristics that make it highly compressible:

  1. Repeated keys: Every object in an array has the same keys ("id", "code", "nameCore")
  2. Repeated values: Translation objects have the same structure across all entries
  3. Text-heavy content: Product names, descriptions are natural language text
  4. Predictable patterns: JSON syntax itself ({, }, ", :) is repetitive

Gzip compression exploits these patterns using the DEFLATE algorithm, which combines LZ77 (finding repeated sequences) and Huffman coding (using shorter codes for common patterns).

For our JSON data, we consistently see 70-85% compression ratios.
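
Before touching production code, it's worth confirming that claim against your own payloads. Here's a minimal sketch; the sample object is made up to mimic our product listings, so substitute one of your real cached responses:

// Quick ratio check: compress a representative payload and compare sizes.
import pako from 'pako';

const sample = {
    products: Array.from({ length: 50 }, (_, i) => ({
        id: i,
        code: `ITEM-${i}`,
        nameCore: { translations: { pl: 'Zlewozmywak granitowy', en: 'Granite sink', de: 'Granitspüle' } },
        price: { gross: 1299.0, net: 1056.1 },
    })),
};

const json = JSON.stringify(sample);
const gzipped = pako.gzip(json, { level: 6 });
const stored = Buffer.from(gzipped).toString('base64');

console.log('raw JSON bytes:', Buffer.byteLength(json));
console.log('stored (gzip + base64) bytes:', stored.length);
console.log('reduction:', `${(100 * (1 - stored.length / Buffer.byteLength(json))).toFixed(1)}%`);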

The Implementation

Here's the complete compression layer we added to our cache utility:

import { defineEventHandler as originalDefineEventHandler, type H3Event } from 'h3';
import { getCurrentUTC } from '~/utils/timezone.utils';
// eslint-disable-next-line import/default
import pako from 'pako';

const TTL = 259200; // 3 days in seconds
const FULL_RESET_FLAG_KEY = 'cache:flag:full';
const PARTIAL_RESET_FLAG_KEY = 'cache:flag:partial';

type CacheEntry<T> = {
    data: T;
    createdAt: number;
};

function compressData<T>(entry: CacheEntry<T>): string {
    const json = JSON.stringify(entry);
    const compressed = pako.gzip(json, { level: 6 });
    return Buffer.from(compressed).toString('base64');
}

function decompressData<T>(compressed: string): CacheEntry<T> {
    const buffer = Buffer.from(compressed, 'base64');
    const decompressed = pako.ungzip(buffer, { to: 'string' });
    return JSON.parse(decompressed) as CacheEntry<T>;
}
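
A quick round-trip check of the two helpers (assuming they're in scope, e.g. in the same utility file) shows the envelope survives compression unchanged:

// Round-trip sanity check: the envelope must come back identical.
const original: CacheEntry<{ hello: string }> = { data: { hello: 'world' }, createdAt: Date.now() };
const stored = compressData(original);                   // base64 string, safe for Redis
const restored = decompressData<{ hello: string }>(stored);
console.log(restored.data.hello);                        // "world"
console.log(restored.createdAt === original.createdAt);  // true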

Why These Specific Choices?

Compression level 6: This is the sweet spot between compression ratio and CPU usage. Level 9 gives marginally better compression but takes significantly longer. Level 1 is fast but leaves size on the table.

Level 1: Fast, ~60% compression
Level 6: Balanced, ~78% compression  <-- Our choice
Level 9: Slow, ~82% compression
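
Those numbers come from our payloads; the trade-off shifts with data shape, so it's worth re-measuring. A rough sketch for doing that (the sample payload is illustrative):

// Compare compression levels on a representative payload.
import pako from 'pako';

const json = JSON.stringify({
    items: Array.from({ length: 500 }, (_, i) => ({ id: i, name: 'Granite sink', price: 1299 })),
});

for (const level of [1, 6, 9] as const) {
    const start = performance.now();
    const out = pako.gzip(json, { level });
    const ms = (performance.now() - start).toFixed(1);
    const saved = (100 * (1 - out.length / Buffer.byteLength(json))).toFixed(1);
    console.log(`level ${level}: ${saved}% smaller, ${ms}ms`);
}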

Base64 encoding: Redis itself can hold binary values, but our storage layer serializes entries as strings, so base64 keeps the binary gzip output intact on the way in and out. The ~33% size overhead from base64 is more than offset by the 75%+ compression.

pako library: It's the standard JavaScript implementation of zlib, used by JSZip and many other libraries. Production-tested, fast, and already in our dependency tree.
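
If pulling in a dependency isn't appealing, Node's built-in zlib exposes the same gzip levels. A rough equivalent of the two helpers, shown for comparison only (not what we ship):

// Alternative sketch using Node's built-in zlib instead of pako.
import { gzipSync, gunzipSync } from 'node:zlib';

function compressWithZlib<T>(entry: CacheEntry<T>): string {
    return gzipSync(JSON.stringify(entry), { level: 6 }).toString('base64');
}

function decompressWithZlib<T>(compressed: string): CacheEntry<T> {
    const json = gunzipSync(Buffer.from(compressed, 'base64')).toString('utf8');
    return JSON.parse(json) as CacheEntry<T>;
}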

Integrating with the Cache Handler

The cache handler needed minimal changes. Here's the core flow:

export const defineCustomCacheEventHandler = <T>(
    handler: (event: H3Event) => T | Promise<T>,
) => {
    return originalDefineEventHandler(async (event: H3Event) => {
        // Skip cache in development
        if (process.env.NODE_ENV === 'development') {
            return await handler(event);
        }

        const url = getRequestURL(event);
        const cacheKey = encodeURIComponent(url.hostname + url.pathname + url.search);
        const storage = useStorage('cache');

        // Fetch cache entry and invalidation flags in parallel
        const [compressed, fullResetRaw, partialResetRaw] = await Promise.all([
            storage.getItem<string>(cacheKey),
            storage.getItem<number>(FULL_RESET_FLAG_KEY),
            storage.getItem<number>(PARTIAL_RESET_FLAG_KEY),
        ]);

        // Decompress with error handling
        let cachedEntry: CacheEntry<T> | null = null;
        if (compressed) {
            try {
                cachedEntry = decompressData<T>(compressed);
            } catch (error) {
                console.error(`[Cache] Decompression failed - ${cacheKey}:`, error);
                // Treat as cache miss, will regenerate
            }
        }

        // ... cache hit/miss logic ...

        // On cache miss: generate, compress, store
        const result = await handler(event);

        try {
            const entry: CacheEntry<T> = {
                data: result,
                createdAt: getCurrentUTC().getTime()
            };
            const compressedData = compressData(entry);
            await storage.setItem(cacheKey, compressedData, { ttl: TTL });
        } catch (error) {
            console.error(`[Cache] Compression/storage failed - ${cacheKey}:`, error);
            // Continue to return result even if caching failed
        }

        return result;
    });
};
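
For context, here's roughly how an endpoint uses the wrapper (assuming the utility lives in server/utils so Nitro auto-imports it). The file name and fetchProductsFromDb are hypothetical stand-ins for the real data layer:

// server/api/public/products.get.ts (hypothetical example)
export default defineCustomCacheEventHandler(async (event) => {
    const query = getQuery(event);                 // h3 helper, auto-imported by Nitro
    const page = Number(query.page) || 1;

    // fetchProductsFromDb is a placeholder for the real data-access call
    const products = await fetchProductsFromDb(page);

    return { products, page };                     // compressed and cached transparently
});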

Error Handling is Critical

Notice the try-catch blocks around both compression and decompression. This is essential because:

  1. Corrupted cache entries: If Redis data gets corrupted, decompression will fail
  2. Migration period: Old uncompressed entries will fail to decompress (we treat this as a cache miss)
  3. Out of memory: Compression of very large objects could theoretically fail
  4. Graceful degradation: The API always returns data, even if caching fails

The key insight: a failed cache operation should never break the API. Users get their data; we just log the error and move on.
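
A quick way to convince yourself the fallback behaves: feed decompressData something that isn't valid gzip and confirm it throws, so the handler's catch turns it into a cache miss. A throwaway sketch:

// Invalid input must raise, never return garbage data.
try {
    decompressData<unknown>('definitely-not-gzip');
    console.log('unexpected: decompression succeeded');
} catch {
    console.log('corrupted entry rejected -> treated as a cache miss');
}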

The Results

After deploying to production and letting the cache rebuild with compressed entries:

Metric                 Before     After       Improvement
Avg entry size         ~200KB     ~40KB       80% smaller
Entries in 1GB         ~5,000     ~25,000     5x more
Memory usage           92%        45%         47 points lower
LRU evictions/hour     ~150       ~10         93% fewer

Real-World Entry Sizes

Here's what we observed for different endpoint types:

Endpoint                              Uncompressed   Compressed   Reduction
/api/public/products (50 items)       180KB          28KB         84%
/api/public/global (navigation)       520KB          78KB         85%
/api/public/search (complex)          340KB          45KB         87%
/api/public/categories/*/details/*    8KB            1.2KB        85%
/api/public/landings/* (heavy)        2.1MB          310KB        85%

The landing pages were our biggest win. These contain 15 product collections with 50+ products each, all with multi-language translations. Compressing a 2MB response down to 310KB is significant.

Performance Overhead

Compression isn't free. Here's what we measured:

Operation                     Time      Impact
Compression (200KB JSON)      3-5ms     Added to cache writes
Decompression (40KB gzip)     1-2ms     Added to cache reads
Base64 encode/decode          ~1ms      Negligible

For a cache hit, we're adding ~2ms of decompression time. But consider:

  • A database query for the same data takes 50-200ms
  • Network latency to the database is 5-10ms
  • The compressed response transfers faster from Redis to the app server

Net effect: Cache hits are still dramatically faster than cache misses, and the reduced memory pressure means fewer cache misses overall.
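
If you want to verify these numbers in your own environment, the cheapest option is to time the hot path and log only outliers. A sketch (the 10ms threshold and log format are our own choices, not part of the handler above):

// Wrap decompression with timing and log only slow outliers.
function timedDecompress<T>(compressed: string, cacheKey: string): CacheEntry<T> {
    const start = performance.now();
    const entry = decompressData<T>(compressed);
    const ms = performance.now() - start;
    if (ms > 10) {
        // 10ms is an arbitrary alert threshold; tune it to your latency budget
        console.warn(`[Cache] Slow decompression (${ms.toFixed(1)}ms) - ${cacheKey}`);
    }
    return entry;
}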

Type Safety Throughout

One concern with compression is losing TypeScript's type safety. Our approach preserves it completely:

// The generic type T flows through the entire chain
function compressData<T>(entry: CacheEntry<T>): string
function decompressData<T>(compressed: string): CacheEntry<T>

// Usage in the handler maintains type inference
const cachedEntry = decompressData<T>(compressed);
// cachedEntry.data is typed as T

const entry: CacheEntry<T> = { data: result, createdAt: ... };
const compressedData = compressData(entry);
// compressedData is string, entry.data is T

The compression layer is transparent to the rest of the application. Handlers return typed data, consumers receive typed data, and the compression/decompression happens invisibly in between.

Handling the Migration

When we deployed this change, we had thousands of existing cache entries in the old uncompressed format. We had two choices:

  1. Flush Redis and start fresh: Simple, but causes a thundering herd of cache misses
  2. Backward compatibility: Detect and handle both formats

We chose option 1 (flush), but with our error handling, option 2 would have worked automatically:

// Old format: { data: {...}, createdAt: 123 } (object)
// New format: "H4sIAAAAAAAA..." (base64 gzip string)

// When decompressData receives an object instead of string,
// Buffer.from() throws, catch block fires, treated as cache miss

The old entries naturally expire (3-day TTL) or get overwritten with compressed versions on the next request. The error logging helped us monitor the migration progress.
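
For the record, option 2 wouldn't have needed much more than a type check before decompressing. A sketch of what that might have looked like (we never shipped this):

// Sketch of option 2: accept both legacy objects and compressed strings.
function readEntry<T>(stored: unknown): CacheEntry<T> | null {
    if (typeof stored === 'string') {
        try {
            return decompressData<T>(stored);        // new format: base64 gzip
        } catch {
            return null;                             // corrupted entry -> cache miss
        }
    }
    if (stored && typeof stored === 'object' && 'data' in stored && 'createdAt' in stored) {
        return stored as CacheEntry<T>;              // old format: plain object
    }
    return null;                                     // anything else -> cache miss
}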

Redis Configuration Tips

To maximize the benefits of compression, ensure your Redis is configured correctly:

1. Eviction Policy

# Check current policy
CONFIG GET maxmemory-policy

# Set LRU eviction (recommended)
CONFIG SET maxmemory-policy allkeys-lru

With allkeys-lru, Redis evicts the least recently used keys when memory is full. This keeps your hot (frequently accessed) pages in cache while evicting rarely-used filter combinations.

2. Memory Limit

On DigitalOcean managed Redis, memory is pre-configured. But if you're self-hosting:

# Leave headroom for Redis overhead (e.g. ~950mb on a 1GB instance)
CONFIG SET maxmemory 950mb

3. Monitor Memory

# Check memory usage
INFO memory

# Key metrics to watch:
# - used_memory_human
# - used_memory_peak_human
# - evicted_keys
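
The same metrics can be pulled from code if you'd rather alert on them than eyeball redis-cli. A sketch using ioredis, which is an assumption on our part (the app itself only talks to Redis through Nitro's storage layer):

// Standalone monitoring sketch using ioredis (not part of the cache utility).
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL ?? 'redis://localhost:6379');

const pick = (text: string, key: string) => text.match(new RegExp(`${key}:(.*)`))?.[1]?.trim();

async function logMemoryStats(): Promise<void> {
    const memory = await redis.info('memory');   // same fields as INFO memory
    const stats = await redis.info('stats');     // evicted_keys lives in the stats section

    console.log('used_memory_human:', pick(memory, 'used_memory_human'));
    console.log('used_memory_peak_human:', pick(memory, 'used_memory_peak_human'));
    console.log('evicted_keys:', pick(stats, 'evicted_keys'));
}

logMemoryStats().finally(() => redis.disconnect());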

When NOT to Compress

Compression isn't always the right choice. Skip it when:

  1. Data is already compressed: Images, PDFs, pre-compressed assets
  2. Entries are tiny: values under ~1KB can come back larger once gzip headers and base64 overhead are added (a simple size guard is sketched below)
  3. CPU is the bottleneck: If you're already CPU-bound, compression adds load
  4. Ultra-low latency required: Sub-millisecond requirements might not tolerate 2ms overhead

For our use case (API responses averaging 50-500KB, memory-constrained, latency tolerance of 50ms+), compression is a clear win.
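
For the tiny-entry case specifically, a small guard keeps the benefit without paying overhead on values that don't need it. A sketch; the 1KB threshold and the raw:/gz: prefixes are our own convention, not something the handler above uses:

// Skip compression for small entries and mark the stored format with a prefix.
const MIN_COMPRESS_BYTES = 1024; // below this, gzip headers + base64 can outweigh the savings

function maybeCompress<T>(entry: CacheEntry<T>): string {
    const json = JSON.stringify(entry);
    if (Buffer.byteLength(json) < MIN_COMPRESS_BYTES) {
        return `raw:${json}`;                        // store small entries as-is
    }
    return `gz:${Buffer.from(pako.gzip(json, { level: 6 })).toString('base64')}`;
}

function maybeDecompress<T>(stored: string): CacheEntry<T> {
    if (stored.startsWith('raw:')) {
        return JSON.parse(stored.slice(4)) as CacheEntry<T>;
    }
    const buffer = Buffer.from(stored.slice(3), 'base64');   // strip "gz:"
    return JSON.parse(pako.ungzip(buffer, { to: 'string' })) as CacheEntry<T>;
}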

The Bigger Picture: Compression vs. Other Optimizations

Before implementing compression, we considered alternatives:

Alternative 1: Upgrade Redis ($30/month instead of $15/month)

  • Pro: Zero code changes
  • Con: Doesn't solve the underlying inefficiency, just delays it
  • Con: Doubles infrastructure cost

Alternative 2: Cache fewer endpoints

  • Pro: Reduces cache size
  • Con: More database load, slower responses
  • Con: SEO pages need caching

Alternative 3: Shorter TTLs

  • Pro: Entries expire faster, less memory used
  • Con: More cache misses, more database load
  • Con: Stale-while-revalidate becomes less effective

Alternative 4: Per-language caching

  • Pro: Cache only the language user needs (90% smaller per entry)
  • Con: 11x more cache keys
  • Con: Requires cache key refactoring

We chose compression because it:

  • Is transparent to the application
  • Maintains all existing behavior
  • Provides a 5-6x capacity improvement
  • Requires minimal code changes
  • Requires no infrastructure changes

Code: The Complete Implementation

For reference, here's the complete cache utility with compression:

import { defineEventHandler as originalDefineEventHandler, getRequestURL, type H3Event } from 'h3';
import type { InternalApi } from 'nitropack';
import { getServerSession } from '#auth';
import { getCurrentUTC } from '~/utils/timezone.utils';
// eslint-disable-next-line import/default
import pako from 'pako';

const TTL = 259200;
const FULL_RESET_FLAG_KEY = 'cache:flag:full';
const PARTIAL_RESET_FLAG_KEY = 'cache:flag:partial';

type CacheEntry<T> = {
    data: T;
    createdAt: number;
};

function compressData<T>(entry: CacheEntry<T>): string {
    const json = JSON.stringify(entry);
    const compressed = pako.gzip(json, { level: 6 });
    return Buffer.from(compressed).toString('base64');
}

function decompressData<T>(compressed: string): CacheEntry<T> {
    const buffer = Buffer.from(compressed, 'base64');
    const decompressed = pako.ungzip(buffer, { to: 'string' });
    return JSON.parse(decompressed) as CacheEntry<T>;
}

export const defineCustomCacheEventHandler = <T>(
    handler: (event: H3Event) => T | Promise<T>,
) => {
    return originalDefineEventHandler(async (event: H3Event) => {
        if (process.env.NODE_ENV === 'development') {
            return await handler(event);
        }

        const url = getRequestURL(event);
        const cacheKey = encodeURIComponent(url.hostname + url.pathname + url.search);
        const storage = useStorage('cache');

        const [compressed, fullResetRaw, partialResetRaw] = await Promise.all([
            storage.getItem<string>(cacheKey),
            storage.getItem<number>(FULL_RESET_FLAG_KEY),
            storage.getItem<number>(PARTIAL_RESET_FLAG_KEY),
        ]);

        let cachedEntry: CacheEntry<T> | null = null;
        if (compressed) {
            try {
                cachedEntry = decompressData<T>(compressed);
            } catch (error) {
                console.error(`[Cache] Decompression failed - ${cacheKey}:`, error);
            }
        }

        const fullReset = fullResetRaw || 0;
        const partialReset = partialResetRaw || 0;
        const entryCreatedAt = cachedEntry?.createdAt || 0;

        // Fresh cache hit
        if (cachedEntry && entryCreatedAt > fullReset && entryCreatedAt > partialReset) {
            return cachedEntry.data;
        }

        const session = await getServerSession(event);

        // Stale-while-revalidate for non-authenticated users
        if (!session?.user && cachedEntry && entryCreatedAt > fullReset) {
            const waitUntil = event.context.waitUntil || event.context.cloudflare?.ctx?.waitUntil;
            const revalidate = async () => {
                try {
                    const result = await handler(event);
                    const entry: CacheEntry<T> = { data: result, createdAt: getCurrentUTC().getTime() };
                    const compressedData = compressData(entry);
                    await storage.setItem(cacheKey, compressedData, { ttl: TTL });
                } catch (error) {
                    console.error(`[Cache] Revalidation failed - ${cacheKey}:`, error);
                }
            };

            if (waitUntil) {
                waitUntil(revalidate());
            } else {
                revalidate().catch(() => {});
            }

            return cachedEntry.data;
        }

        // Cache miss - generate fresh data
        const result = await handler(event);

        try {
            const entry: CacheEntry<T> = { data: result, createdAt: getCurrentUTC().getTime() };
            const compressedData = compressData(entry);
            await storage.setItem(cacheKey, compressedData, { ttl: TTL });
        } catch (error) {
            console.error(`[Cache] Compression/storage failed - ${cacheKey}:`, error);
        }

        return result;
    });
};

type PublicKeys<T> = {
    [K in keyof T]: K extends `${string}public${string}` ? K : never;
}[keyof T];

export const cleanCachePartially = async (_partsOfKeyRaw: PublicKeys<InternalApi>[]) => {
    const storage = useStorage('cache');
    await storage.setItem(PARTIAL_RESET_FLAG_KEY, getCurrentUTC().getTime());
};

export const cleanCacheCompletely = async () => {
    const storage = useStorage('cache');
    await storage.setItem(FULL_RESET_FLAG_KEY, getCurrentUTC().getTime());
};
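
The two reset helpers are what admin-side code calls after catalog changes. A hypothetical example of wiring them up (the route path, request body, and the '/api/public/products' key are illustrative, not from the real codebase):

// server/api/admin/cache/invalidate.post.ts (hypothetical admin endpoint)
export default defineEventHandler(async (event) => {
    const { scope } = await readBody<{ scope: 'full' | 'partial' }>(event);

    if (scope === 'full') {
        await cleanCacheCompletely();                          // bumps the full-reset flag
    } else {
        await cleanCachePartially(['/api/public/products']);   // bumps the partial-reset flag
    }

    return { invalidatedAt: getCurrentUTC().getTime() };
});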

Key Takeaways

  1. JSON compresses extremely well (70-85% for typical API responses)
  2. Compression overhead is minimal (~2-5ms) compared to cache miss cost (~100-500ms)
  3. Error handling is essential - never let cache failures break your API
  4. Type safety can be preserved through generics
  5. Start with compression level 6 - it's the sweet spot for most use cases
  6. Configure Redis eviction policy to maximize cache efficiency

Final Thoughts

This optimization took about 2 hours to implement and test. The result? Our $15/month Redis now handles traffic that would have required a $30/month instance. More importantly, our SEO-critical pages stay cached, and users get faster responses.

Sometimes the best infrastructure optimization isn't adding more resources—it's using what you have more efficiently.

The compression approach is particularly valuable for:

  • Multi-language applications (lots of repeated translation structures)
  • E-commerce catalogs (similar product objects repeated)
  • Any API returning arrays of similar objects
  • Memory-constrained environments (serverless, small VPS)

If you're running into Redis memory limits and your cached data is JSON, give compression a try. The implementation is straightforward, the benefits are immediate, and the risks are minimal with proper error handling.