DbEngine
A lightweight, git-inspired database engine for Python that maintains a complete history of all modifications.
Overview
DbEngine is a personal implementation of a versioned database engine that stores snapshots of data changes over time. Each modification creates a new immutable snapshot, allowing you to track the complete history of your data.
Key Features
- Version Control: Every change creates a new snapshot with a unique digest (SHA-256 hash)
- History Tracking: Access any previous version of your data
- Multi-tenant Support: Isolated data storage per tenant
- Thread-safe: Built-in locking mechanism for concurrent access
- Git-inspired Architecture: Objects are stored in a content-addressable format
- Efficient Storage: Identical objects are stored only once
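To make the content-addressing idea concrete, here is a minimal sketch of how a SHA-256 digest could be derived from an object. The helper name `compute_digest` and the choice of a key-sorted JSON encoding are assumptions for illustration; the engine's actual serialization may differ.

```python
import hashlib
import json


def compute_digest(obj) -> str:
    """Return the SHA-256 hex digest of a canonical JSON encoding of obj.

    Hypothetical helper: not part of the DbEngine API.
    """
    # sort_keys and fixed separators make the encoding deterministic,
    # so objects with equal content always hash to the same digest.
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


a = compute_digest({"name": "John", "age": 30})
b = compute_digest({"age": 30, "name": "John"})  # same content, different key order
c = compute_digest({"name": "John", "age": 31})  # different content

print(a == b)  # identical content yields one digest, so one stored object
print(a == c)  # any change yields a different digest
```

Because the digest depends only on content, two saves of the same data can share one stored object, which is what makes the deduplication above possible.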
Architecture
The engine uses a file-based storage system with the following structure:
.mytools_db/
├── {tenant_id}/
│   ├── head                  # Points to latest version of each entry
│   └── objects/
│       └── {digest_prefix}/
│           └── {full_digest} # Actual object data
└── refs/                     # Shared references
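The layout above can be sketched as a path-building function plus a write-once store. This is only an illustration: the helpers `object_path` and `write_object` are hypothetical, and the two-character prefix length is an assumption borrowed from git, since the README does not state how long `{digest_prefix}` is.

```python
import json
import tempfile
from pathlib import Path


def object_path(root: Path, tenant_id: str, digest: str, prefix_len: int = 2) -> Path:
    """Build the on-disk path of an object (prefix_len is an assumption)."""
    return root / tenant_id / "objects" / digest[:prefix_len] / digest


def write_object(root: Path, tenant_id: str, digest: str, obj) -> bool:
    """Write obj as JSON unless a file for this digest already exists."""
    path = object_path(root, tenant_id, digest)
    if path.exists():
        return False  # identical content is already stored
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(obj, sort_keys=True), encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / ".mytools_db"
    digest = "ab" + "0" * 62  # placeholder, not a real hash
    first = write_object(root, "my_company", digest, {"name": "John"})
    second = write_object(root, "my_company", digest, {"name": "John"})
    relative = object_path(root, "my_company", digest).relative_to(root).as_posix()
```

Splitting objects into prefix directories keeps any single directory from accumulating a very large number of files.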
Installation
from db_engine import DbEngine
# Initialize with default root
db = DbEngine()
# Or specify custom root directory
db = DbEngine(root="/path/to/database")
Basic Usage
Initialize Database for a Tenant
tenant_id = "my_company"
db.init(tenant_id)
Save Data
# Save a complete object
user_id = "john_doe"
entry = "users"
data = {"name": "John", "age": 30}
digest = db.save(tenant_id, user_id, entry, data)
Load Data
# Load latest version
data = db.load(tenant_id, entry="users")
# Load specific version by digest
data = db.load(tenant_id, entry="users", digest="abc123...")
Work with Individual Records
# Add or update a single record
db.put(tenant_id, user_id, entry="users", key="john", value={"name": "John", "age": 30})
# Add or update multiple records at once
items = {
    "john": {"name": "John", "age": 30},
    "jane": {"name": "Jane", "age": 25},
}
db.put_many(tenant_id, user_id, entry="users", items=items)
# Get a specific record
user = db.get(tenant_id, entry="users", key="john")
# Get all records
all_users = db.get(tenant_id, entry="users")
Check Existence
if db.exists(tenant_id, entry="users"):
    print("Entry exists")
Access History
# Get history of an entry (returns list of digests)
history = db.history(tenant_id, entry="users", max_items=10)
# Load a previous version (indexing history[1] requires at least two versions)
old_data = db.load(tenant_id, entry="users", digest=history[1])
Metadata
Each snapshot automatically includes metadata:
- __parent__: Digest of the previous version
- __user__: User ID who made the change
- __date__: Timestamp of the change (format: YYYYMMDD HH:MM:SS)
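As a rough illustration of these fields, the sketch below attaches them to a snapshot. It assumes the metadata lives as top-level keys in the stored dict and that the first snapshot has no parent (`None`); neither detail is confirmed by this README, and `with_metadata` is a hypothetical helper.

```python
from datetime import datetime


def with_metadata(data: dict, user_id: str, parent_digest, now=None) -> dict:
    """Return a copy of data with the three metadata keys added.

    Hypothetical helper: the engine adds these fields automatically.
    """
    now = now or datetime.now()
    snapshot = dict(data)  # copy so the caller's dict is not mutated
    snapshot["__parent__"] = parent_digest
    snapshot["__user__"] = user_id
    # YYYYMMDD HH:MM:SS, matching the documented format
    snapshot["__date__"] = now.strftime("%Y%m%d %H:%M:%S")
    return snapshot


snap = with_metadata(
    {"name": "John"},
    user_id="john_doe",
    parent_digest=None,  # assumed value for a first snapshot
    now=datetime(2024, 1, 31, 9, 5, 0),
)
print(snap["__date__"])  # 20240131 09:05:00
```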
API Reference
Core Methods
init(tenant_id: str)
Initialize database structure for a tenant.
save(tenant_id: str, user_id: str, entry: str, obj: object) -> str
Save a complete snapshot. Returns the digest of the saved object.
load(tenant_id: str, entry: str, digest: str = None) -> object
Load a snapshot. If digest is None, loads the latest version.
put(tenant_id: str, user_id: str, entry: str, key: str, value: object) -> bool
Add or update a single record. Returns True if a new snapshot was created.
put_many(tenant_id: str, user_id: str, entry: str, items: list | dict) -> bool
Add or update multiple records. Returns True if a new snapshot was created.
get(tenant_id: str, entry: str, key: str = None, digest: str = None) -> object
Retrieve record(s). If key is None, returns all records as a list.
exists(tenant_id: str, entry: str) -> bool
Check if an entry exists.
history(tenant_id: str, entry: str, digest: str = None, max_items: int = 1000) -> list
Get the history chain of digests for an entry.
get_digest(tenant_id: str, entry: str) -> str
Get the current digest for an entry.
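The history chain returned by history() can be pictured as a walk along `__parent__` links, capped by `max_items`. The sketch below uses an in-memory dict in place of the object store. The function `walk_history` is hypothetical, and the newest-first ordering is an assumption suggested by the usage example that loads `history[1]` as a previous version.

```python
def walk_history(objects: dict, start_digest, max_items: int = 1000) -> list:
    """Follow __parent__ links from start_digest, newest first.

    objects maps digest -> snapshot dict; stands in for the on-disk store.
    """
    chain = []
    current = start_digest
    # Stop at the root snapshot (no parent) or when the cap is reached.
    while current is not None and len(chain) < max_items:
        chain.append(current)
        current = objects[current].get("__parent__")
    return chain


objects = {
    "d1": {"__parent__": None},   # first version
    "d2": {"__parent__": "d1"},
    "d3": {"__parent__": "d2"},   # latest version
}
full = walk_history(objects, "d3")
limited = walk_history(objects, "d3", max_items=2)
print(full)     # ['d3', 'd2', 'd1']
print(limited)  # ['d3', 'd2']
```

Each step reads one object, so a small `max_items` bounds the number of file reads on long chains.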
Usage Patterns
Pattern 1: Snapshot-based (using save())
Best for saving complete states of complex objects.
config = {"theme": "dark", "language": "en"}
db.save(tenant_id, user_id, "config", config)
Pattern 2: Record-based (using put() / put_many())
Best for managing collections of items incrementally.
db.put(tenant_id, user_id, "settings", "theme", "dark")
db.put(tenant_id, user_id, "settings", "language", "en")
Note: Don't mix these patterns for the same entry, as they use different data structures.
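To illustrate the record-based pattern, here is an in-memory sketch of how put() might decide whether a new snapshot is needed. The class `RecordStore` is hypothetical, and the rule of skipping the snapshot when the content digest is unchanged is an assumption inferred from the documented boolean return value. Metadata is left out for brevity.

```python
import hashlib
import json


class RecordStore:
    """Hypothetical in-memory stand-in for one record-based entry."""

    def __init__(self):
        self.snapshots = {}  # digest -> dict of records
        self.head = None     # digest of the latest snapshot

    @staticmethod
    def _digest(records: dict) -> str:
        payload = json.dumps(records, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def put(self, key: str, value) -> bool:
        """Add or update one record; return True if a snapshot was created."""
        current = self.snapshots.get(self.head, {})
        updated = {**current, key: value}  # copy, never mutate old snapshots
        digest = self._digest(updated)
        if digest == self.head:
            return False  # nothing changed, no new snapshot
        self.snapshots[digest] = updated
        self.head = digest
        return True


store = RecordStore()
created_first = store.put("theme", "dark")
created_again = store.put("theme", "dark")    # same value, no new snapshot
created_other = store.put("language", "en")
```

In this sketch every snapshot is a dict keyed by record key, whereas save() stores whatever object it is given, which is one way the two patterns can end up with incompatible structures.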
Thread Safety
DbEngine uses RLock internally, making it safe for multi-threaded applications.
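A reentrant lock matters when one locked method calls another on the same thread. The sketch below shows the general technique with Python's `threading.RLock`. The class `LockedStore` is hypothetical and is not a description of how DbEngine structures its own locking.

```python
import threading


class LockedStore:
    """Hypothetical store showing nested locking with a reentrant lock."""

    def __init__(self):
        self._lock = threading.RLock()
        self.items = {}

    def put(self, key, value):
        with self._lock:
            self.items[key] = value

    def put_many(self, items: dict):
        # The lock is held here and re-acquired inside put().
        # A plain threading.Lock would deadlock on that nested acquire.
        with self._lock:
            for key, value in items.items():
                self.put(key, value)


store = LockedStore()
threads = [
    threading.Thread(
        target=store.put_many,
        args=({f"k{i}-{j}": j for j in range(100)},),
    )
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = len(store.items)  # 4 threads x 100 distinct keys
```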
Exceptions
DbException: Raised for database-related errors (missing entries, invalid parameters, etc.)
Performance Considerations
- Objects are stored as JSON files
- Identical objects (same SHA-256) are stored only once
- History chains can become long; use the max_items parameter to limit traversal
- File system performance impacts overall speed
License
This is a personal implementation. Please check with the author for licensing terms.