The Challenges of Local-First Applications: Forward Compatibility, Backward Compatibility, and Schema Evolution
Introduction
In software development, local-first applications have emerged as a compelling solution that emphasizes total data ownership and offline-first capabilities. These applications run directly on the user’s device, providing enhanced performance and privacy. However, this approach also introduces significant challenges, particularly in versioning, forward compatibility, backward compatibility, and schema evolution. This article delves into these challenges and explores how developers can navigate them.
The Problem with Versioning
Versioning is a critical aspect of software development, ensuring that different software versions can coexist and function correctly. In microservices, developers often grapple with maintaining and deprecating old versions while introducing new ones. However, versioning in local-first applications and SDKs (Software Development Kits) presents even more complexity. Unlike microservices, where the service provider can control who uses the service, SDKs are distributed and integrated into various applications, making it difficult to enforce upgrades or track usage.
Forward Compatibility and Backward Compatibility
The following roles illustrate the two directions of compatibility:
- Writer v1: Produces data using schema version 1.
- Writer v2: Produces data using schema version 2.
- Reader v2 (Backward): Built for schema version 2, but can still read data written with the older schema version 1.
- Reader v1 (Forward): Built for schema version 1, but can still read data written with the newer schema version 2.
- Reader (Full): Reads data from both older and newer versions, ensuring full compatibility with all schema versions.
Forward Compatibility
Forward compatibility refers to a system's ability to accept input intended for a future version of itself. This means that the current software version should be able to handle data or requests from newer versions. Achieving forward compatibility is challenging because it requires anticipating future changes and ensuring the current system can process them without errors.
Forward compatibility is important because:
- For the case of input parameters: you can upgrade clients without having to upgrade servers
- For return types: you can upgrade servers without having to upgrade clients
- For databases: you can run your schema migrations before deploying the new code to read it
For JSON, here is an incomplete list of forward-compatible changes:
- Adding a new required field. Older readers will simply ignore it, provided they tolerate unknown fields.
- Narrowing a numerical type (e.g. float to int). Older readers expect floats, and ints are a subset of floats.
- Removing a value from an enum string. Older readers can already handle the full breadth of enum values, so a smaller set is no problem.
- Adding a value to an enum string if and only if the reader has implemented a proper “else” case, that is, it handles unrecognized values instead of failing on them.
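The changes listed above can be illustrated with a small sketch of an older reader. This is a minimal example under stated assumptions, not a definitive implementation: the task document, its field names (`title`, `priority`, `status`, `tags`), the enum values, and the `read_task_v1` function are all hypothetical.

```python
import json

# Hypothetical v1 schema: the only enum values reader v1 knows about.
KNOWN_STATUSES_V1 = {"open", "done"}


def read_task_v1(payload: str) -> dict:
    """Parse a task the way a v1 reader would, tolerating data from newer writers."""
    raw = json.loads(payload)
    status = raw["status"]
    if status not in KNOWN_STATUSES_V1:
        # The "else" case: map enum values from a future version to a safe fallback.
        status = "unknown"
    # Only the fields v1 knows about are picked; any newer field is ignored.
    return {
        "title": raw["title"],
        # v1 expects a float; an int written by a newer writer still converts cleanly.
        "priority": float(raw["priority"]),
        "status": status,
    }


# Data from a hypothetical v2 writer: new field "tags", int priority, new enum value.
v2_payload = json.dumps(
    {"title": "Sync notes", "priority": 2, "status": "archived", "tags": ["work"]}
)
task = read_task_v1(v2_payload)
print(task)
```

The important design choice is that the reader selects the fields it understands rather than rejecting documents that contain anything unexpected. A reader that validated strictly against the v1 schema would fail on the same payload.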
Backward Compatibility
Backward compatibility ensures that newer software versions can still operate with data or requests from older versions. This is more commonly discussed and implemented, as it allows users to upgrade their software without losing functionality or data compatibility.
Backward compatibility is important because:
- For the case of input parameters: you can upgrade servers without having to upgrade clients
- For return types: you can upgrade clients without having to upgrade servers
- For databases: you don’t encounter any data loss (without backward compatibility you wouldn’t be able to read any data written by an older version)
For JSON, here is an incomplete list of backward-compatible changes:
- Adding a field with a default value. Older writers will be unaware of that field so the default value will be used instead.
- Adding an optional field. Older writers will be unaware of that field so null will be used instead.
- Widening a numerical type (e.g. int to float). Older writers will always use ints, which are a subset of floats.
- Adding a value to an enum string. Older writers will just use one of the existing enum strings.
- Removing a field. Newer readers will ignore whatever was previously written in this field. (Note: this is not true of many binary serialization formats!)
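As a rough illustration of the list above, the sketch below shows a newer reader consuming data from an older writer. The settings document, its field names, the default value `"en"`, and the `read_settings_v2` function are assumptions made for this example only.

```python
import json


def read_settings_v2(payload: str) -> dict:
    """Parse settings the way a v2 reader would, accepting data from a v1 writer."""
    raw = json.loads(payload)
    return {
        "theme": raw["theme"],
        # Widened from int (v1) to float (v2): v1 ints are valid floats.
        "font_size": float(raw["font_size"]),
        # New field with a default: absent in v1 data, so the default applies.
        "language": raw.get("language", "en"),
        # New optional field: absent in v1 data, so it becomes None (JSON null).
        "nickname": raw.get("nickname"),
        # "legacy_flag" was removed in v2, so the reader never looks at it.
    }


# Data from a hypothetical v1 writer, including a field that v2 has dropped.
v1_payload = json.dumps({"theme": "dark", "font_size": 12, "legacy_flag": True})
settings = read_settings_v2(v1_payload)
print(settings)
```

Every new field is read with a fallback, so the reader never assumes that the writer knew about it.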
Full Compatibility
Full compatibility, which encompasses forward and backward compatibility, is an ideal but often unattainable goal. The complexity arises from the need to seamlessly support old and new features. Changes in data structures, functionalities, and protocols must be carefully managed to avoid breaking the system.
For JSON, here is an incomplete list of fully compatible changes (some are repeated from above):
- Adding a field with a default value
- Adding an optional field
- Adding a value to an enum string if and only if the reader has implemented a proper “else” case, that is, it handles unrecognized values instead of failing on them.
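One way to check that a change is fully compatible is to run every writer version against every reader version. The sketch below does this for a change that adds a field with a default value and an optional field. The profile document, the field names, and all four functions are hypothetical.

```python
import json


def write_v1(name: str) -> str:
    """Hypothetical v1 writer: only knows the "name" field."""
    return json.dumps({"name": name})


def write_v2(name: str, nickname=None) -> str:
    """Hypothetical v2 writer: adds "locale" (has a default) and optional "nickname"."""
    doc = {"name": name, "locale": "en"}
    if nickname is not None:
        doc["nickname"] = nickname
    return json.dumps(doc)


def read_v1(payload: str) -> dict:
    """v1 reader: picks the field it knows and ignores everything else."""
    raw = json.loads(payload)
    return {"name": raw["name"]}


def read_v2(payload: str) -> dict:
    """v2 reader: falls back to a default or None when a field is missing."""
    raw = json.loads(payload)
    return {
        "name": raw["name"],
        "locale": raw.get("locale", "en"),
        "nickname": raw.get("nickname"),
    }


payloads = {"writer_v1": write_v1("Ada"), "writer_v2": write_v2("Ada", "ada99")}
readers = {"reader_v1": read_v1, "reader_v2": read_v2}

# Compatibility matrix: every reader must succeed on every writer's output.
results = {
    (writer, reader): read(payload)["name"]
    for writer, payload in payloads.items()
    for reader, read in readers.items()
}
for pair, name in results.items():
    print(pair, name)
```

A matrix like this is easy to turn into an automated test that runs whenever the schema changes.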
Schema Evolution
Schema evolution is the process of modifying the schema (structure) of the data over time. This is crucial for maintaining compatibility as software evolves. Avro and Protocol Buffers (Protobuf) are two common formats that support schema evolution.
Strategies for Schema Evolution
1. Versioned Schemas: Maintain multiple versions of the schema and ensure that both the writer and reader can handle the appropriate version.
2. Flexible Data Formats: Use data formats that support optional fields and default values, allowing for gradual changes without breaking compatibility.
3. Gradual Protocol Updates: Implement protocols that can gradually evolve, similar to how languages evolve with new words but retain core comprehensibility.
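The first strategy, versioned schemas, can be sketched as a version number stored in each document plus a chain of upgrade steps applied on read. This is one possible approach under stated assumptions, not the only way to do it: the `schema_version` field, the individual migrations, and the `load_document` function are all hypothetical.

```python
import json

CURRENT_VERSION = 3  # the newest schema version this build understands


def _v1_to_v2(doc: dict) -> dict:
    # Assumed change in v2: an optional "tags" list was added with a default.
    return {**doc, "tags": doc.get("tags", []), "schema_version": 2}


def _v2_to_v3(doc: dict) -> dict:
    # Assumed change in v3: "title" was renamed to "name".
    upgraded = {key: value for key, value in doc.items() if key != "title"}
    upgraded["name"] = doc["title"]
    upgraded["schema_version"] = 3
    return upgraded


# Each entry upgrades a document from the given version to the next one.
MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}


def load_document(payload: str) -> dict:
    """Read a stored document and upgrade it step by step to CURRENT_VERSION."""
    doc = json.loads(payload)
    # Documents written before versioning existed are treated as version 1.
    version = doc.get("schema_version", 1)
    if version > CURRENT_VERSION:
        # Written by a newer build: refuse rather than silently misread it.
        raise ValueError(f"unsupported schema version {version}")
    while version < CURRENT_VERSION:
        doc = MIGRATIONS[version](doc)
        version = doc["schema_version"]
    return doc


old_payload = json.dumps({"schema_version": 1, "title": "Groceries"})
doc = load_document(old_payload)
print(doc)
```

Because each step only knows about two adjacent versions, supporting a new version means adding one function instead of touching every existing reader path. Refusing documents from a newer version is the conservative choice here. A local-first application might instead keep such documents read-only until the user upgrades.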
Conclusion
Local-first applications offer significant benefits in terms of data ownership and performance. However, they also introduce unique challenges in versioning, compatibility, and schema evolution. Developers must adopt strategies to ensure forward and backward compatibility, leveraging tools and practices that support schema evolution. By doing so, they can create robust local-first applications that remain functional and relevant as they evolve.
Additional Resources
For those interested in diving deeper into these topics, consider exploring the following resources:
- Avro: A data serialization system that provides rich data structures and a compact, fast binary data format.
- Protocol Buffers (Protobuf): A language-neutral, platform-neutral extensible mechanism for serializing structured data.
- JSON Schema: A vocabulary that allows you to annotate and validate JSON documents.