# DbEngine

A lightweight, git-inspired database engine for Python that maintains complete history of all modifications.

## Overview

DbEngine is a personal implementation of a versioned database engine that stores snapshots of data changes over time. Each modification creates a new immutable snapshot, allowing you to track the complete history of your data.

## Key Features

- **Version Control**: Every change creates a new snapshot with a unique digest (SHA-256 hash)
- **History Tracking**: Access any previous version of your data
- **Multi-tenant Support**: Isolated data storage per tenant
- **Thread-safe**: Built-in locking mechanism for concurrent access
- **Git-inspired Architecture**: Objects are stored in a content-addressable format
- **Efficient Storage**: Identical objects are stored only once

## Architecture

The engine uses a file-based storage system with the following structure:

```
.mytools_db/
├── {tenant_id}/
│   ├── head                    # Points to latest version of each entry
│   └── objects/
│       └── {digest_prefix}/
│           └── {full_digest}   # Actual object data
└── refs/                       # Shared references
```

## Installation

```python
from db_engine import DbEngine

# Initialize with default root
db = DbEngine()

# Or specify custom root directory
db = DbEngine(root="/path/to/database")
```

## Basic Usage

### Initialize Database for a Tenant

```python
tenant_id = "my_company"
db.init(tenant_id)
```

### Save Data

```python
# Save a complete object
user_id = "john_doe"
entry = "users"
data = {"name": "John", "age": 30}

digest = db.save(tenant_id, user_id, entry, data)
```

### Load Data

```python
# Load latest version
data = db.load(tenant_id, entry="users")

# Load specific version by digest
data = db.load(tenant_id, entry="users", digest="abc123...")
```

### Work with Individual Records

```python
# Add or update a single record
db.put(tenant_id, user_id, entry="users", key="john", value={"name": "John", "age": 30})

# Add or update multiple records at once
items = {
    "john": {"name": "John", "age": 30},
    "jane": {"name": "Jane", "age": 25}
}
db.put_many(tenant_id, user_id, entry="users", items=items)

# Get a specific record
user = db.get(tenant_id, entry="users", key="john")

# Get all records
all_users = db.get(tenant_id, entry="users")
```

### Check Existence

```python
if db.exists(tenant_id, entry="users"):
    print("Entry exists")
```

### Access History

```python
# Get history of an entry (returns list of digests)
history = db.history(tenant_id, entry="users", max_items=10)

# Load a previous version
old_data = db.load(tenant_id, entry="users", digest=history[1])
```

## Metadata

Each snapshot automatically includes metadata:

- `__parent__`: Digest of the previous version
- `__user__`: User ID who made the change
- `__date__`: Timestamp of the change (format: `YYYYMMDD HH:MM:SS`)

## API Reference

### Core Methods

#### `init(tenant_id: str)`

Initialize database structure for a tenant.

#### `save(tenant_id: str, user_id: str, entry: str, obj: object) -> str`

Save a complete snapshot. Returns the digest of the saved object.

#### `load(tenant_id: str, entry: str, digest: str = None) -> object`

Load a snapshot. If digest is None, loads the latest version.

#### `put(tenant_id: str, user_id: str, entry: str, key: str, value: object) -> bool`

Add or update a single record. Returns True if a new snapshot was created.

#### `put_many(tenant_id: str, user_id: str, entry: str, items: list | dict) -> bool`

Add or update multiple records. Returns True if a new snapshot was created.

#### `get(tenant_id: str, entry: str, key: str = None, digest: str = None) -> object`

Retrieve record(s). If key is None, returns all records as a list.

#### `exists(tenant_id: str, entry: str) -> bool`

Check if an entry exists.

#### `history(tenant_id: str, entry: str, digest: str = None, max_items: int = 1000) -> list`

Get the history chain of digests for an entry.

#### `get_digest(tenant_id: str, entry: str) -> str`

Get the current digest for an entry.
## Usage Patterns

### Pattern 1: Snapshot-based (using `save()`)

Best for saving complete states of complex objects.

```python
config = {"theme": "dark", "language": "en"}
db.save(tenant_id, user_id, "config", config)
```

### Pattern 2: Record-based (using `put()` / `put_many()`)

Best for managing collections of items incrementally.

```python
db.put(tenant_id, user_id, "settings", "theme", "dark")
db.put(tenant_id, user_id, "settings", "language", "en")
```

**Note**: Don't mix these patterns for the same entry, as they use different data structures.

## Thread Safety

DbEngine uses `RLock` internally, making it safe for multi-threaded applications.

## Exceptions

- `DbException`: Raised for database-related errors (missing entries, invalid parameters, etc.)

## Performance Considerations

- Objects are stored as JSON files
- Identical objects (same SHA-256) are stored only once
- History chains can become long; use `max_items` parameter to limit traversal
- File system performance impacts overall speed

## License

This is a personal implementation. Please check with the author for licensing terms.