# DbEngine
A lightweight, git-inspired database engine for Python that maintains a complete history of all modifications.
## Overview
DbEngine is a personal implementation of a versioned database engine that stores snapshots of data changes over time. Each modification creates a new immutable snapshot, allowing you to track the complete history of your data.
## Key Features
- **Version Control**: Every change creates a new snapshot with a unique digest (SHA-256 hash)
- **History Tracking**: Access any previous version of your data
- **Multi-tenant Support**: Isolated data storage per tenant
- **Thread-safe**: Built-in locking mechanism for concurrent access
- **Git-inspired Architecture**: Objects are stored in a content-addressable format
- **Efficient Storage**: Identical objects are stored only once
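To make the content-addressable idea concrete, here is a minimal sketch of how identical objects can end up stored only once. It is not DbEngine's code: `compute_digest`, `ObjectStore`, and the canonical JSON encoding fed to SHA-256 are assumptions made for illustration.

```python
import hashlib
import json


def compute_digest(obj) -> str:
    """Return the SHA-256 hex digest of a canonical JSON encoding of obj."""
    # Assumption: sorted keys and fixed separators, so key order does not matter
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class ObjectStore:
    """Hypothetical in-memory stand-in for the on-disk objects/ directory."""

    def __init__(self):
        self.objects = {}  # digest -> object

    def write(self, obj) -> str:
        digest = compute_digest(obj)
        # Identical content yields the same digest, so it is kept only once
        self.objects.setdefault(digest, obj)
        return digest


store = ObjectStore()
d1 = store.write({"name": "John", "age": 30})
d2 = store.write({"age": 30, "name": "John"})  # same content, different key order
d3 = store.write({"name": "Jane", "age": 25})
```

In this sketch `d1` and `d2` are equal, so the store ends up holding two objects rather than three.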
## Architecture
The engine uses a file-based storage system with the following structure:
```
.mytools_db/
├── {tenant_id}/
│   ├── head                  # Points to latest version of each entry
│   └── objects/
│       └── {digest_prefix}/
│           └── {full_digest} # Actual object data
└── refs/                     # Shared references
```
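The layout above can be sketched as a pair of path helpers. The function names are hypothetical, and the prefix length of 2 characters is an assumption borrowed from git; this README does not state how long `{digest_prefix}` actually is.

```python
from pathlib import PurePosixPath


def head_path(root: str, tenant_id: str) -> PurePosixPath:
    """Path of the per-tenant head file (hypothetical helper)."""
    return PurePosixPath(root) / tenant_id / "head"


def object_path(root: str, tenant_id: str, digest: str, prefix_len: int = 2) -> PurePosixPath:
    """Path of a stored object (hypothetical helper).

    prefix_len=2 is an assumption, not a documented DbEngine value.
    """
    return PurePosixPath(root) / tenant_id / "objects" / digest[:prefix_len] / digest


example = object_path(".mytools_db", "my_company", "abcdef0123")
```

`PurePosixPath` is used only so the sketch produces the same result on every platform.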
## Installation
```python
from db_engine import DbEngine
# Initialize with default root
db = DbEngine()
# Or specify custom root directory
db = DbEngine(root="/path/to/database")
```
## Basic Usage
### Initialize Database for a Tenant
```python
tenant_id = "my_company"
db.init(tenant_id)
```
### Save Data
```python
# Save a complete object
user_id = "john_doe"
entry = "users"
data = {"name": "John", "age": 30}
digest = db.save(tenant_id, user_id, entry, data)
```
### Load Data
```python
# Load latest version
data = db.load(tenant_id, entry="users")
# Load specific version by digest
data = db.load(tenant_id, entry="users", digest="abc123...")
```
### Work with Individual Records
```python
# Add or update a single record
db.put(tenant_id, user_id, entry="users", key="john", value={"name": "John", "age": 30})
# Add or update multiple records at once
items = {
    "john": {"name": "John", "age": 30},
    "jane": {"name": "Jane", "age": 25},
}
db.put_many(tenant_id, user_id, entry="users", items=items)
# Get a specific record
user = db.get(tenant_id, entry="users", key="john")
# Get all records
all_users = db.get(tenant_id, entry="users")
```
### Check Existence
```python
if db.exists(tenant_id, entry="users"):
    print("Entry exists")
```
### Access History
```python
# Get history of an entry (returns list of digests)
history = db.history(tenant_id, entry="users", max_items=10)
# Load the previous version (only possible if the history holds at least two digests)
if len(history) > 1:
    old_data = db.load(tenant_id, entry="users", digest=history[1])
```
## Metadata
Each snapshot automatically includes metadata:
- `__parent__`: Digest of the previous version
- `__user__`: User ID who made the change
- `__date__`: Timestamp of the change (format: `YYYYMMDD HH:MM:SS`)
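As a rough illustration of these fields, the sketch below attaches them to a plain dictionary. The helper `with_metadata` is hypothetical; storing the metadata as keys next to the data, and using `None` as the parent of a first snapshot, are assumptions. Only the key names and the date format come from this README.

```python
from datetime import datetime


def with_metadata(data: dict, user_id: str, parent_digest, now=None) -> dict:
    """Return a copy of data with the three metadata keys added (hypothetical helper)."""
    now = now or datetime.now()
    snapshot = dict(data)  # copy, so the caller's dict is left untouched
    snapshot["__parent__"] = parent_digest  # assumption: None for a first snapshot
    snapshot["__user__"] = user_id
    snapshot["__date__"] = now.strftime("%Y%m%d %H:%M:%S")  # YYYYMMDD HH:MM:SS
    return snapshot


original = {"name": "John"}
snap = with_metadata(original, "john_doe", None, now=datetime(2025, 10, 17, 22, 24, 19))
```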
## API Reference
### Core Methods
#### `init(tenant_id: str)`
Initialize database structure for a tenant.
#### `save(tenant_id: str, user_id: str, entry: str, obj: object) -> str`
Save a complete snapshot. Returns the digest of the saved object.
#### `load(tenant_id: str, entry: str, digest: str = None) -> object`
Load a snapshot. If digest is None, loads the latest version.
#### `put(tenant_id: str, user_id: str, entry: str, key: str, value: object) -> bool`
Add or update a single record. Returns True if a new snapshot was created.
#### `put_many(tenant_id: str, user_id: str, entry: str, items: list | dict) -> bool`
Add or update multiple records. Returns True if a new snapshot was created.
#### `get(tenant_id: str, entry: str, key: str = None, digest: str = None) -> object`
Retrieve record(s). If key is None, returns all records as a list.
#### `exists(tenant_id: str, entry: str) -> bool`
Check if an entry exists.
#### `history(tenant_id: str, entry: str, digest: str = None, max_items: int = 1000) -> list`
Get the history chain of digests for an entry.
#### `get_digest(tenant_id: str, entry: str) -> str`
Get the current digest for an entry.
## Usage Patterns
### Pattern 1: Snapshot-based (using `save()`)
Best for saving complete states of complex objects.
```python
config = {"theme": "dark", "language": "en"}
db.save(tenant_id, user_id, "config", config)
```
### Pattern 2: Record-based (using `put()` / `put_many()`)
Best for managing collections of items incrementally.
```python
db.put(tenant_id, user_id, "settings", "theme", "dark")
db.put(tenant_id, user_id, "settings", "language", "en")
```
**Note**: Don't mix these patterns for the same entry, as they use different data structures.
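One way to picture the record-based pattern is as a load-modify-save cycle that writes a new snapshot only when a record actually changed, which would explain the `bool` returned by `put()`. The class below is a hypothetical in-memory model under that assumption, not DbEngine's implementation.

```python
import hashlib
import json


def digest_of(obj) -> str:
    """SHA-256 of a sorted-key JSON encoding (assumed encoding)."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


class RecordEntry:
    """Hypothetical model of a single entry managed through put()."""

    def __init__(self):
        self.snapshots = {}  # digest -> dict of records
        self.head = None     # digest of the latest snapshot

    def put(self, key, value) -> bool:
        current = self.snapshots.get(self.head, {})
        if key in current and current[key] == value:
            return False  # nothing changed, so no new snapshot
        updated = {**current, key: value}  # earlier snapshots stay immutable
        self.head = digest_of(updated)
        self.snapshots[self.head] = updated
        return True


settings = RecordEntry()
first = settings.put("theme", "dark")      # creates a snapshot
repeat = settings.put("theme", "dark")     # same value, no snapshot
second = settings.put("language", "en")    # creates another snapshot
```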
## Thread Safety
DbEngine uses `RLock` internally, making it safe for multi-threaded applications.
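The sketch below shows why a re-entrant lock suits this kind of API, assuming the `RLock` mentioned here is Python's `threading.RLock`. The `Counter` class is purely illustrative: a method that already holds the lock can call another locked method without deadlocking, much as a `put()`-style operation might call a `save()`-style one internally.

```python
import threading


class Counter:
    """Illustrative class guarded by a re-entrant lock."""

    def __init__(self):
        self._lock = threading.RLock()
        self.value = 0

    def increment(self):
        with self._lock:
            self.value += 1

    def increment_twice(self):
        # The same thread re-acquires the lock inside increment();
        # a plain threading.Lock would deadlock here.
        with self._lock:
            self.increment()
            self.increment()


def worker(counter: Counter, rounds: int) -> None:
    for _ in range(rounds):
        counter.increment_twice()


counter = Counter()
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```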
## Exceptions
- `DbException`: Raised for database-related errors (missing entries, invalid parameters, etc.)
## Performance Considerations
- Objects are stored as JSON files
- Identical objects (same SHA-256) are stored only once
- History chains can become long; use `max_items` parameter to limit traversal
- File system performance impacts overall speed
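The effect of `max_items` on long chains can be sketched as a bounded walk along `__parent__` links. This is an assumption about how such a traversal might look; the function name `walk_history`, the in-memory `objects` dictionary, the placeholder digests, and the newest-first ordering are all illustrative.

```python
def walk_history(objects: dict, head_digest, max_items: int = 1000) -> list:
    """Follow __parent__ links from head_digest, newest first, up to max_items."""
    digests = []
    current = head_digest
    while current is not None and len(digests) < max_items:
        digests.append(current)
        current = objects[current].get("__parent__")
    return digests


# Placeholder digests standing in for real SHA-256 values
objects = {
    "d1": {"value": 1, "__parent__": None},
    "d2": {"value": 2, "__parent__": "d1"},
    "d3": {"value": 3, "__parent__": "d2"},
}

full_chain = walk_history(objects, "d3")
limited = walk_history(objects, "d3", max_items=2)
```

Each step of the walk reads one stored object, so capping `max_items` caps the number of reads.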
## License
This is a personal implementation. Please check with the author for licensing terms.