Repositories in GoatDB
Repositories in GoatDB are the fundamental unit of data organization, similar to tables in SQL databases or document collections in NoSQL databases. Each repository manages a collection of related data items and provides synchronized, durable storage with efficient read and write operations.
Core Concepts
Storage Architecture
Each repository is backed by a single .jsonl file that stores a log of commits. This design takes advantage of modern SSD characteristics:
- Sequential I/O: Optimized for SSD performance with sequential writes, enabling efficient batching of operations
- Hardware Parallelization: Leverages SSD internal parallelization through large sequential I/O operations, enabling multiple NAND chips and controllers to work in parallel
- Write Amplification: Minimized through append-only design
The JSON Lines format provides several benefits:
- Human Readable: Commits are stored in readable JSON format
- Append-Only: New commits are always appended to the end
- Atomic Writes: Each line is written atomically for consistency
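The append-only JSON Lines layout described above can be illustrated with a minimal sketch. It is not GoatDB's actual storage code: the `LogEntry` shape, the helper names, and the temporary file path are all assumptions made for illustration, and only Node's standard `fs`, `os`, and `path` modules are used.

```typescript
import { appendFileSync, mkdtempSync, readFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical, simplified commit record; not GoatDB's on-disk schema.
interface LogEntry {
  id: string;
  key: string;
  data: unknown;
}

// Append one commit as a single JSON line. JSON.stringify without an indent
// argument never emits raw newlines, so each entry occupies exactly one line.
function appendEntry(logPath: string, entry: LogEntry): void {
  appendFileSync(logPath, JSON.stringify(entry) + '\n');
}

// Read the whole log back, one parsed entry per non-empty line.
function readEntries(logPath: string): LogEntry[] {
  return readFileSync(logPath, 'utf8')
    .split('\n')
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as LogEntry);
}

// Usage: write two commits to a throwaway log file and read them back.
const logPath = join(mkdtempSync(join(tmpdir(), 'jsonl-demo-')), 'repo.jsonl');
appendEntry(logPath, { id: 'c1', key: 'foo', data: { title: 'first' } });
appendEntry(logPath, { id: 'c2', key: 'foo', data: { title: 'second' } });
const entries = readEntries(logPath);
```

Because existing lines are never rewritten, the file can be inspected with any text tool while the log grows.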
Commit Graphs
Repositories are collections of distinct commit graphs - one per item. Each item (identified by its key) has its own independent commit history, allowing for parallel evolution of different items without interference. This design enables efficient concurrent operations and makes it possible to track the complete history of each item separately.
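To make the "one commit graph per item" idea concrete, here is a small sketch that splits a flat commit log into per-key graphs and finds each graph's heads (commits that no other commit lists as a parent). The `CommitNode` type and both helper functions are hypothetical names for illustration and say nothing about how GoatDB represents graphs internally.

```typescript
// Hypothetical, minimal commit shape: just enough to build a graph.
interface CommitNode {
  id: string;
  key: string;
  parents: string[];
}

// Split a flat list of commits into one independent graph per item key.
function groupByKey(commits: CommitNode[]): Map<string, CommitNode[]> {
  const graphs = new Map<string, CommitNode[]>();
  for (const commit of commits) {
    const graph = graphs.get(commit.key) ?? [];
    graph.push(commit);
    graphs.set(commit.key, graph);
  }
  return graphs;
}

// Heads are commits that no other commit in the same graph names as a parent.
function headsOf(graph: CommitNode[]): string[] {
  const referenced = new Set<string>();
  for (const commit of graph) {
    for (const parent of commit.parents) referenced.add(parent);
  }
  return graph.filter((c) => !referenced.has(c.id)).map((c) => c.id);
}

// Usage: 'foo' has two concurrent edits on top of a1; 'bar' has one commit.
const graphs = groupByKey([
  { id: 'a1', key: 'foo', parents: [] },
  { id: 'a2', key: 'foo', parents: ['a1'] },
  { id: 'a3', key: 'foo', parents: ['a1'] },
  { id: 'b1', key: 'bar', parents: [] },
]);
const fooHeads = headsOf(graphs.get('foo') ?? []);
const barHeads = headsOf(graphs.get('bar') ?? []);
```

In this example the history of `foo` branches without affecting `bar`, which is the independence property the paragraph above describes.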
Commit Structure
Each commit in the repository log contains:
- ID: Unique identifier for the commit
- Key: The item key being modified
- Data: The actual data being written
- Parents: References to parent commits (for version history)
- Timestamp: When the commit was created
- Metadata: Additional information like organization ID and build version
- Session: The ID of the session that created the commit
- Signature: Cryptographic signature of the commit, generated using the session’s private key
The signature is particularly important as it provides cryptographic proof that:
- The commit was created by an authorized session
- The commit data hasn’t been tampered with
- The commit is part of a verifiable chain of changes
This security model ensures that all operations in GoatDB are cryptographically signed, creating a tamper-evident commit graph where each change can be traced back to its authorized source.
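The commit fields and the signing step listed above can be sketched as follows. This is an illustration under stated assumptions, not GoatDB's implementation: the field names and types, the serialization used for signing, and the choice of Ed25519 are all assumptions, since the text does not specify them. The cryptographic calls are Node's standard `node:crypto` functions.

```typescript
import { Buffer } from 'node:buffer';
import { generateKeyPairSync, KeyObject, sign, verify } from 'node:crypto';

// Hypothetical commit shape mirroring the fields listed above.
interface Commit {
  id: string;
  key: string;
  data: unknown;
  parents: string[];
  timestamp: number;
  metadata: Record<string, string>;
  session: string;
  signature?: string; // base64; assumed encoding
}
type UnsignedCommit = Omit<Commit, 'signature'>;

// Serialize every signed field in a fixed order (assumed format).
function signingPayload(c: UnsignedCommit): Buffer {
  return Buffer.from(
    JSON.stringify([c.id, c.key, c.data, c.parents, c.timestamp, c.metadata, c.session]),
  );
}

// Sign with the session's private key (Ed25519 takes a null algorithm).
function signCommit(c: UnsignedCommit, privateKey: KeyObject): Commit {
  const signature = sign(null, signingPayload(c), privateKey).toString('base64');
  return { ...c, signature };
}

// Verify that the commit matches its signature under the session's public key.
function verifyCommit(c: Commit, publicKey: KeyObject): boolean {
  if (!c.signature) return false;
  return verify(null, signingPayload(c), publicKey, Buffer.from(c.signature, 'base64'));
}

// Usage: sign a commit, then show that altering its data breaks verification.
const { publicKey, privateKey } = generateKeyPairSync('ed25519');
const signed = signCommit(
  {
    id: 'c1',
    key: 'foo',
    data: { title: 'hello' },
    parents: [],
    timestamp: 1700000000000,
    metadata: { orgId: 'demo' },
    session: 'session-1',
  },
  privateKey,
);
const valid = verifyCommit(signed, publicKey);
const tampered = verifyCommit({ ...signed, data: { title: 'changed' } }, publicKey);
```

Because the parent IDs are part of the signed payload in this sketch, each signature also covers the commit's position in its history.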
For performance-critical applications or trusted environments (like backend services), GoatDB offers a trusted mode that bypasses cryptographic verification. This mode can significantly improve performance by skipping commit signing and verification. However, it should only be used in controlled, trusted environments where security is handled at a different layer.
Basic Operations
Creating and Writing
Repositories are created implicitly when you first open them:
// Create a new repository by opening it
const userRepo = await db.open('/users/john');
Reading Data
GoatDB provides several ways to read data from repositories:
// Get a specific item by its full path
// Returns the current value of the item or undefined if it doesn't exist
const user = db.item('/users/john/foo');
// Query items using a schema filter
// This example gets all notes in john's repository that match the note schema
const allNotes = db.query({
source: '/users/john', // Repository path to search in
schema: kSchemaNote, // Schema to filter by
});
// Get all keys in a repository
// This is useful for listing all items or checking what exists
const allKeys = db.keys('/users/john');
For more details on reading and writing data, see Reading and Writing Data.
Durability
GoatDB provides strong durability through:
- Atomic Commits: Each commit is written atomically - if a crash occurs mid-write, the half-written commit is simply trimmed from the log
- Parallel Writes: Changes are written to local storage and replicated to other peers at the same time
- Automatic Recovery: After a crash, the system automatically recovers missing commits through the synchronization protocol, ensuring all peers converge to the same state. The P2P design enables both clients and servers to act as active replicas, providing redundancy and resilience
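The "trim the half-written commit" behavior can be sketched as below. How GoatDB actually detects a torn write is not stated here, so this example assumes one simple rule: a commit counts as complete only if its line ends with a newline. The `recoverLog` function is a hypothetical helper that works on the log's text rather than on a real file.

```typescript
// Result of scanning a log after a crash.
interface RecoveredLog {
  entries: unknown[]; // commits that were fully written
  cleaned: string; // log text with any torn tail removed
}

// Keep everything up to and including the last newline; anything after it is
// treated as an incomplete write and dropped (assumed completion rule).
function recoverLog(raw: string): RecoveredLog {
  const lastNewline = raw.lastIndexOf('\n');
  const cleaned = lastNewline === -1 ? '' : raw.slice(0, lastNewline + 1);
  const entries: unknown[] = [];
  for (const line of cleaned.split('\n')) {
    if (line.length === 0) continue;
    entries.push(JSON.parse(line));
  }
  return { entries, cleaned };
}

// Usage: the third commit was cut off mid-write by a simulated crash.
const crashed = '{"id":"c1"}\n{"id":"c2"}\n{"id":"c3","da';
const recovered = recoverLog(crashed);

// A log that ended cleanly is returned unchanged.
const intact = recoverLog('{"id":"c1"}\n');
```

Any commit lost this way is expected to come back through synchronization with peers, as the recovery bullet above describes.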
Traditional database durability often focuses on server-side guarantees - ensuring data survives server crashes. But in GoatDB, we recognize that client durability is fundamentally different. Modern SSDs in laptops and phones rarely fail, and when they do, it’s typically due to physical damage rather than data corruption. More importantly, user expectations differ between client and server operations - if your phone dies mid-click, you wouldn’t expect that click’s effect to be saved, but when a server acknowledges an API call, you rightfully expect that operation to be durable.
Advanced Usage
The following methods are low-level APIs typically used for advanced scenarios or debugging:
// Get a repository instance directly
const repo = db.repository('/users/john');
// Low-level repository methods
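// Get the current value and commit for a specific key
// Returns a [value, commit] tuple, or undefined if the key doesn't exist,
// so check the result before destructuring it
const entry = repo.valueForKey('/users/john');
if (entry !== undefined) {
const [value, commit] = entry;
}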
// Set a new value for a key with an optional parent commit
// Returns a Promise that resolves to the new commit or undefined if no change
await repo.setValueForKey('/users/john', newItem, parentCommit);
// Get all keys in the repository
// Returns an iterable of all keys
const keys = repo.keys();
// Get all full paths in the repository
// Returns an iterable of paths (repository path + key)
const paths = repo.paths();
// Get the commit graph for a specific key
// Returns an array of CommitGraph objects showing the commit history
const graph = repo.graphForKey('/users/john/foo');
// Get a Cytoscape-compatible JSON representation of the commit network for a key
// This can be used to visualize the commit history in Cytoscape (https://cytoscape.org/)
const network = repo.debugNetworkForKey('/users/john/foo');
These low-level APIs are primarily useful for:
- Debugging and troubleshooting
- Building specialized tools that need direct access to the commit graph
- Contributing to GoatDB’s core functionality
For normal application development, the higher-level APIs (db.item() and db.create()) provide a safer and more convenient interface. However, if you’re interested in contributing to GoatDB’s development, these low-level APIs give you direct access to the core functionality.