Documents Compression



Overview

  • As a document database, RavenDB is schema-less, which has many advantages.
    However, it also means that the data structure must be stored with each document.
    In extreme cases, most of the data you store is the documents' structure.

  • The Zstd compression algorithm is used to learn your data model, identify common patterns,
    and create dictionaries that represent the redundant structural data across documents in a collection.

  • The algorithm is trained by each compression operation and continuously improves its compression ratio
    to maintain the most efficient compression model.
    In many datasets, this can reduce the storage space by more than 50%.

  • Compression and decompression are fully transparent to the user.
    Reading and querying large compressed datasets is usually as fast as reading and querying
    their uncompressed versions, because the compressed data is loaded much faster.

  • Compression is not applied to attachments, counters, or time series data.
    It applies only to the content of documents and revisions.

  • Detailed information about the database's physical storage is visible in the Storage Report view.
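
To get a feel for how much of a document can be structure rather than data, the following minimal sketch measures the share of a JSON document's bytes that is taken up by property names. It is an illustration only: the names `StructureOverhead` and `PropertyNameShare` are hypothetical, and it measures plain JSON text, whereas RavenDB's on-disk format and its Zstd dictionaries work differently.

```csharp
using System;
using System.Text;
using System.Text.Json;

public static class StructureOverhead
{
    // Returns the fraction of the JSON text's UTF-8 bytes that are property names.
    // Assumption: plain JSON text is used as a stand-in for a stored document.
    public static double PropertyNameShare(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        long nameBytes = CountNameBytes(doc.RootElement);
        return (double)nameBytes / Encoding.UTF8.GetByteCount(json);
    }

    // Recursively sums the UTF-8 byte length of every property name.
    private static long CountNameBytes(JsonElement element)
    {
        long total = 0;
        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                foreach (JsonProperty property in element.EnumerateObject())
                {
                    total += Encoding.UTF8.GetByteCount(property.Name);
                    total += CountNameBytes(property.Value);
                }
                break;

            case JsonValueKind.Array:
                foreach (JsonElement item in element.EnumerateArray())
                {
                    total += CountNameBytes(item);
                }
                break;
        }
        return total;
    }
}
```

For a small document such as {"FirstName":"Al","LastName":"Ng","IsActive":true}, the property names alone account for 25 of the 50 bytes. This kind of repeated structure is what a dictionary shared across a collection can represent once instead of in every document.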

Compression -vs- Compaction

  • The following table summarizes the differences between Compression and Compaction:

Compression
    Action: Reduce storage space using the Zstd compression algorithm
    Items that can be compressed:
       - Documents in collections that are configured for compression
       - Revisions for all collections
    Triggered by: The server
    Triggered when: The compression feature is configured,
    and any of the following occurs for the configured collections:
       - New documents are stored
       - Existing documents are modified and saved
       - A compact operation is triggered (existing documents are then compressed)

Compaction
    Action: Remove empty gaps on disk that still occupy space after deletes
    Items that can be compacted: Documents and/or indexes of the specified database
    Triggered by: Client API code
    Triggered when: The compact database operation is called explicitly

Set compression for all collections

Sync:

// Compression is configured by updating the database record

// Retrieve the database record
var dbrecord = store.Maintenance.Server.Send(new GetDatabaseRecordOperation(store.Database));

// Set compression on ALL collections
dbrecord.DocumentsCompression.CompressAllCollections = true;

// Update the database record
store.Maintenance.Server.Send(new UpdateDatabaseOperation(dbrecord, dbrecord.Etag));

Async:

// Compression is configured by updating the database record

// Retrieve the database record
var dbrecord = await store.Maintenance.Server.SendAsync(new GetDatabaseRecordOperation(store.Database));

// Set compression on ALL collections
dbrecord.DocumentsCompression.CompressAllCollections = true;

// Update the database record
await store.Maintenance.Server.SendAsync(new UpdateDatabaseOperation(dbrecord, dbrecord.Etag));

Set compression for selected collections

Sync:

// Retrieve the database record
var dbrecord = store.Maintenance.Server.Send(new GetDatabaseRecordOperation(store.Database));

// Turn on compression for specific collections
dbrecord.DocumentsCompression.Collections = new[] { "Orders", "Employees" };

// Turn off compression for all revisions, on all collections
dbrecord.DocumentsCompression.CompressRevisions = false;

// Update the database record
store.Maintenance.Server.Send(new UpdateDatabaseOperation(dbrecord, dbrecord.Etag));

Async:

// Retrieve the database record
var dbrecord = await store.Maintenance.Server.SendAsync(new GetDatabaseRecordOperation(store.Database));

// Turn on compression for specific collections
dbrecord.DocumentsCompression.Collections = new[] { "Orders", "Employees" };

// Turn off compression for all revisions, on all collections
dbrecord.DocumentsCompression.CompressRevisions = false;

// Update the database record
await store.Maintenance.Server.SendAsync(new UpdateDatabaseOperation(dbrecord, dbrecord.Etag));
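
Assigning a new array to Collections replaces whatever list was configured before. If the intent is to add collections to an existing configuration, the names can be merged first. The helper below is a minimal sketch of that idea; `CompressionCollections.Merge` is a hypothetical name, not part of the RavenDB client, and treating collection names as case-insensitive is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CompressionCollections
{
    // Returns the existing collection names plus the new ones, without duplicates.
    // Assumption: collection names are compared case-insensitively.
    public static string[] Merge(string[] existing, params string[] toAdd)
    {
        // Treat a missing list as an empty one
        IEnumerable<string> current = existing ?? Array.Empty<string>();

        return current
            .Concat(toAdd ?? Array.Empty<string>())
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToArray();
    }
}
```

With the database record retrieved as in the samples above, the merged list could then be assigned with `dbrecord.DocumentsCompression.Collections = CompressionCollections.Merge(dbrecord.DocumentsCompression.Collections, "Orders", "Employees");` before sending the update.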

Syntax

  • Documents compression is configured using the DocumentsCompressionConfiguration class in the database record.

public class DocumentsCompressionConfiguration 
{
    public string[] Collections { get; set; }
    public bool CompressRevisions { get; set; }
    public bool CompressAllCollections { get; set; }
}
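
The way these properties combine can be sketched as a small predicate. This is an illustration based on the descriptions on this page, not the server's actual implementation. `CompressionRules` and `AppliesToCollection` are hypothetical names, the class is repeated locally so that the sketch compiles without the RavenDB client, and the case-insensitive comparison of collection names is an assumption.

```csharp
using System;
using System.Linq;

// Local copy of the configuration class, so the sketch is self-contained
public class DocumentsCompressionConfiguration
{
    public string[] Collections { get; set; }
    public bool CompressRevisions { get; set; }
    public bool CompressAllCollections { get; set; }
}

public static class CompressionRules
{
    // True when documents of the given collection are configured for compression.
    public static bool AppliesToCollection(DocumentsCompressionConfiguration config, string collection)
    {
        // No configuration means no compression
        if (config == null)
            return false;

        // CompressAllCollections covers every collection, whatever Collections contains
        if (config.CompressAllCollections)
            return true;

        // Otherwise only the listed collections are covered
        // (assumption: names are compared case-insensitively)
        return config.Collections != null &&
               config.Collections.Contains(collection, StringComparer.OrdinalIgnoreCase);
    }
}
```

CompressRevisions is deliberately left out of the predicate: according to the table above, it controls revisions for all collections and is independent of the per-collection document settings.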