Adding Profiler module
271
docs/Profiler.md
Normal file
@@ -0,0 +1,271 @@
# Profiler: Design & Implementation Plan

## Context

Performance issues were identified during keyboard navigation in the DataGrid (173 ms server-side
per command call). The HTMX debug traces (via `htmx_debug.js`) confirmed the bottleneck is
server-side. A persistent, in-application profiling system is needed for continuous analysis
across sessions and future investigations.

## Design Decisions

### Data Collection Strategy

Two complementary levels:

- **Level A** (route handler): One trace per `/myfasthtml/commands` call. Captures total
  server-side duration including lookup, execution, and HTMX swap overhead.
- **Level B** (granular spans): Decomposition of each trace into named phases. Activated
  by placing probes in the code.

Both levels are active simultaneously. Level A gives the global picture; Level B gives the
breakdown.

### Probe Mechanisms

Five complementary mechanisms, chosen based on the context:

#### 1. Context manager: partial block instrumentation

```python
with profiler.span("oob_swap"):
    # only this block is timed
    result = build_oob_elements(...)
```

Metadata can be attached during execution:

```python
with profiler.span("query") as span:
    rows = db.query(...)
    span.set("row_count", len(rows))
```
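To make the timing behaviour concrete, here is a minimal sketch of how such a context manager could work. It is not the module's actual implementation: the standalone `Span` class and `span()` function are hypothetical stand-ins for `profiler.span`, and the list comprehension stands in for `db.query(...)`.

```python
import time
from contextlib import contextmanager


class Span:
    """Hypothetical minimal span: a name, a duration, and free-form metadata."""

    def __init__(self, name):
        self.name = name
        self.duration_ms = None
        self.data = {}

    def set(self, key, value):
        self.data[key] = value


@contextmanager
def span(name):
    current = Span(name)
    start = time.perf_counter()
    try:
        yield current
    finally:
        # The duration is recorded even if the timed block raises.
        current.duration_ms = (time.perf_counter() - start) * 1000.0


with span("query") as s:
    rows = [1, 2, 3]  # stand-in for db.query(...)
    s.set("row_count", len(rows))
```

Using `try`/`finally` means a failing block still yields a usable timing, which matters when the slow path is also the error path.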

#### 2. Decorator: full function instrumentation

```python
@profiler.span("callback")
def execute_callback(self, client_response):
    ...
```

Function arguments are captured automatically. Metadata can be attached via `current_span()`:

```python
@profiler.span("process")
def process(self, rows):
    result = do_work(rows)
    profiler.current_span().set("row_count", len(result))
    return result
```
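Automatic argument capture could plausibly be built on `inspect.signature`. The sketch below is an assumption about the approach, not the project's code: the `Span` class is hypothetical, and the `last_span` attribute exists only so the example can be inspected.

```python
import functools
import inspect
import time


class Span:
    """Hypothetical span usable as a context manager and as a decorator."""

    def __init__(self, name):
        self.name = name
        self.args = {}
        self.duration_ms = None

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.duration_ms = (time.perf_counter() - self._start) * 1000.0
        return False  # never swallow exceptions

    def __call__(self, func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # A fresh span per call, so concurrent calls do not share state.
            call_span = Span(self.name)
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            call_span.args = {k: v for k, v in bound.arguments.items() if k != "self"}
            with call_span:
                result = func(*args, **kwargs)
            wrapper.last_span = call_span  # demonstration only
            return result

        return wrapper


@Span("process")
def process(rows, limit=10):
    return rows[:limit]


process([1, 2, 3], limit=2)
```

Binding against the signature records positional arguments under their parameter names and includes defaults, which tends to be more readable in a trace than raw `*args`.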

#### 3. Cumulative span: loop instrumentation

For loops with many iterations. Aggregates instead of creating one span per iteration.

```python
for row in rows:
    with profiler.cumulative_span("process_row"):
        process(row)

# or as a decorator
@profiler.cumulative_span("process_row")
def process_row(self, row):
    ...
```

Exposes: `count`, `total`, `min`, `max`, `avg`. Single entry in the trace tree regardless of
iteration count.

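The aggregation can be sketched as a small accumulator that is re-entered on every iteration. The class below is a hypothetical illustration: the field names follow the data model in this document, while the `record()` helper is an assumption. It is not re-entrant, so nested use of the same instance would need extra care.

```python
import time
from dataclasses import dataclass


@dataclass
class CumulativeSpan:
    name: str
    count: int = 0
    total_ms: float = 0.0
    min_ms: float = float("inf")
    max_ms: float = 0.0

    @property
    def avg_ms(self):
        return self.total_ms / self.count if self.count else 0.0

    def record(self, elapsed_ms):
        # Fold one measurement into the running aggregates.
        self.count += 1
        self.total_ms += elapsed_ms
        self.min_ms = min(self.min_ms, elapsed_ms)
        self.max_ms = max(self.max_ms, elapsed_ms)

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.record((time.perf_counter() - self._start) * 1000.0)
        return False


process_row = CumulativeSpan("process_row")
for row in range(100):
    with process_row:
        _ = row * row  # stand-in for process(row)
```

Memory use stays constant however many iterations run, which is the point of aggregating rather than storing one span per row.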

#### 4. `trace_all`: class-level static instrumentation

Wraps all methods of a class at definition time. No runtime overhead beyond the spans themselves.

```python
@profiler.trace_all
class DataGrid(MultipleInstance):
    def navigate_cell(self, ...):  # auto-spanned
        ...

# Exclude specific methods
@profiler.trace_all(exclude=["__ft__", "render"])
class DataGrid(MultipleInstance):
    ...
```

Implementation: uses `inspect` to iterate over methods and wraps each with `@profiler.span()`.
No `sys.settrace()` involved; this is pure static wrapping.

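A class decorator that supports both the bare and the parameterized form might look like the sketch below. It is an illustration under assumptions: the `calls` list stands in for the trace buffer, `_wrap` is a hypothetical helper that only records the span name where a real version would open `profiler.span(name)`, and only plain functions defined directly on the class are wrapped.

```python
import functools
import inspect

calls = []  # stand-in for the trace buffer; holds span names


def _wrap(name, func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append(name)  # a real version would open a span here
        return func(*args, **kwargs)

    return wrapper


def trace_all(cls=None, *, exclude=()):
    def decorate(target):
        # vars() lists only attributes defined on this class, not inherited ones.
        for attr, value in list(vars(target).items()):
            if attr in exclude or not inspect.isfunction(value):
                continue
            setattr(target, attr, _wrap(f"{target.__name__}.{attr}", value))
        return target

    # Support both @trace_all and @trace_all(exclude=[...]).
    return decorate if cls is None else decorate(cls)


@trace_all(exclude=["render"])
class Grid:
    def navigate(self):
        return "moved"

    def render(self):
        return "<div/>"


grid = Grid()
grid.navigate()
grid.render()
```

Because wrapping happens once when the class is defined, there is no per-call lookup cost beyond the wrapper itself.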

#### 5. `trace_calls`: sub-call exploration

Traces all function calls made within a single function, recursively. Used for exploration
when the bottleneck location is unknown.

```python
@profiler.trace_calls
def navigate_cell(self, ...):
    self._update_selection()  # auto-traced as child span
    self._compute_visible()   # auto-traced as child span
    db.query(...)             # auto-traced as child span
```

Implementation: uses `sys.setprofile()` scoped to the decorated function's execution only.
Overhead is localized to that function's call stack. This is an exploration tool: use it
to identify hotspots, then replace with explicit probes.

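The scoping described above can be sketched by installing a profile hook just before the call and restoring the previous hook afterwards. This is a simplified illustration, not the planned implementation: it only collects function names in a hypothetical `calls` attribute instead of building child spans, and `sys.setprofile()` affects the current thread only.

```python
import functools
import sys


def trace_calls(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        seen = []

        def profile_hook(frame, event, arg):
            # "call" fires for Python-level calls; C functions report
            # "c_call" and are ignored in this sketch.
            if event == "call":
                seen.append(frame.f_code.co_name)

        previous = sys.getprofile()
        sys.setprofile(profile_hook)
        try:
            return func(*args, **kwargs)
        finally:
            sys.setprofile(previous)  # always restore, even on exceptions
            wrapper.calls = seen      # demonstration only

    return wrapper


def _update_selection():
    return 1


def _compute_visible():
    return _update_selection() + 1


@trace_calls
def navigate_cell():
    _update_selection()
    return _compute_visible()


navigate_cell()
```

The hook runs on every Python call made while it is installed, which is why this mechanism suits short exploration sessions rather than permanent instrumentation.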

### Span Hierarchy

Hierarchy is determined by code nesting via a `ContextVar` stack (async-safe). No explicit
parent references required.

```python
with profiler.span("execute"):            # root
    with profiler.span("callback"):       # child of execute
        result = self.callback(...)
    with profiler.span("oob_swap"):       # sibling of callback
        ...
```

When a command calls another command, the second command's spans automatically become children
of the first command's active span.

`profiler.current_span()` provides access to the active span from anywhere in the call stack.

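One way to obtain this behaviour is to keep the active span in a `ContextVar` and restore the parent with the token returned by `set()`, which acts as an implicit stack. The names below (`_current`, `Span`, `span`, `current_span`) are hypothetical module-level stand-ins for the `profiler` methods.

```python
import contextvars
from contextlib import contextmanager

_current = contextvars.ContextVar("current_span", default=None)


class Span:
    def __init__(self, name):
        self.name = name
        self.children = []


@contextmanager
def span(name):
    node = Span(name)
    parent = _current.get()
    if parent is not None:
        parent.children.append(node)  # nesting in code becomes nesting in the tree
    token = _current.set(node)
    try:
        yield node
    finally:
        _current.reset(token)  # the parent becomes the active span again


def current_span():
    return _current.get()


with span("execute") as root:
    with span("callback"):
        inner = current_span().name
    with span("oob_swap"):
        pass
```

Each asyncio task runs in its own copy of the context, so concurrent requests would not see each other's active span.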

### Storage

- **Scope**: Global (all sessions). Profiling measures server behavior, not per-user state.
- **Structure**: `deque` with a configurable maximum size.
- **Default size**: 500 traces (constant `PROFILER_MAX_TRACES`).
- **Eviction**: Oldest traces are dropped when the buffer is full (FIFO).
- **Persistence**: In-memory only. Lost on server restart.

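The eviction policy above matches what `collections.deque` does when given `maxlen`: appending to a full deque silently discards the item at the opposite end. A minimal sketch, using strings as stand-ins for trace objects:

```python
from collections import deque

PROFILER_MAX_TRACES = 500

traces = deque(maxlen=PROFILER_MAX_TRACES)
for i in range(PROFILER_MAX_TRACES + 3):
    traces.append(f"trace-{i}")  # the three oldest entries are evicted
```

No explicit eviction code is needed, and appends stay constant-time.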

### Toggle and Clear

- `profiler.enabled`: boolean flag. When `False`, all probe mechanisms are no-ops, so the only remaining cost is the flag check.
- `profiler.clear()`: empties the trace buffer.
- Both can be controlled from the profiler UI.


### Overhead Measurement

The `ProfilingManager` self-profiles its own `span.__enter__` and `span.__exit__` calls.
Exposes:

- `overhead_per_span_ns`: average cost of one span boundary in nanoseconds
- `total_overhead_ms`: estimated total overhead across all active spans

Visible in the UI to verify the profiler does not bias measurements significantly.

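Self-profiling could be approximated by timing the bookkeeping done at each span boundary with `time.perf_counter_ns()`. The sketch below is an assumption about how the two metrics might be derived: `OverheadMeter` and the module-level `meter` are hypothetical, and the measurement ignores the cost of the timer calls themselves.

```python
import time


class OverheadMeter:
    def __init__(self):
        self.boundaries = 0  # number of __enter__ / __exit__ calls measured
        self.overhead_ns = 0

    def add(self, elapsed_ns):
        self.boundaries += 1
        self.overhead_ns += elapsed_ns

    @property
    def overhead_per_span_ns(self):
        return self.overhead_ns / self.boundaries if self.boundaries else 0.0

    @property
    def total_overhead_ms(self):
        return self.overhead_ns / 1_000_000


meter = OverheadMeter()


class Span:
    def __init__(self, name):
        self.name = name

    def __enter__(self):
        t0 = time.perf_counter_ns()
        self.children = []  # stand-in for the real entry bookkeeping
        self._start = time.perf_counter_ns()
        meter.add(self._start - t0)
        return self

    def __exit__(self, exc_type, exc, tb):
        end = time.perf_counter_ns()
        self.duration_ms = (end - self._start) / 1_000_000
        meter.add(time.perf_counter_ns() - end)
        return False


for _ in range(10):
    with Span("noop"):
        pass
```

Comparing `total_overhead_ms` with a trace's total duration gives a rough sense of how much the probes distort the measurement.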

---

## Data Model

```
ProfilingTrace
    command_name: str
    command_id: str
    kwargs: dict
    timestamp: datetime
    total_duration_ms: float
    root_span: ProfilingSpan

ProfilingSpan
    name: str
    start: float          (perf_counter)
    duration_ms: float
    data: dict            (attached via span.set())
    children: list[ProfilingSpan | CumulativeSpan]

CumulativeSpan
    name: str
    count: int
    total_ms: float
    min_ms: float
    max_ms: float
    avg_ms: float
```
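The model above maps directly onto dataclasses. The following is a sketch rather than the final code: the default values and the `set()` method body are assumptions, while the field names and types come from the listing.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class CumulativeSpan:
    name: str
    count: int = 0
    total_ms: float = 0.0
    min_ms: float = 0.0
    max_ms: float = 0.0
    avg_ms: float = 0.0


@dataclass
class ProfilingSpan:
    name: str
    start: float = 0.0  # time.perf_counter() value at entry
    duration_ms: float = 0.0
    data: dict = field(default_factory=dict)
    # String annotation so the forward reference to ProfilingSpan resolves lazily.
    children: "list[ProfilingSpan | CumulativeSpan]" = field(default_factory=list)

    def set(self, key, value):
        self.data[key] = value


@dataclass
class ProfilingTrace:
    command_name: str
    command_id: str
    kwargs: dict
    timestamp: datetime
    total_duration_ms: float
    root_span: ProfilingSpan


root = ProfilingSpan("command", duration_ms=173.0)
root.children.append(ProfilingSpan("callback", duration_ms=150.0))
root.children.append(CumulativeSpan("process_row", count=3, total_ms=12.0))
root.set("c_id", "abc")
trace = ProfilingTrace("navigate_cell", "abc", {}, datetime.now(), 173.0, root)
```

`default_factory` gives every span its own `data` dict and `children` list instead of sharing one mutable default.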

---

## Existing Code Hooks

### `src/myfasthtml/core/utils.py`: route handler (Level A)

```python
@utils_rt(Routes.Commands)
async def post(session, c_id: str, client_response: dict = None):
    with profiler.span("command", args={"c_id": c_id}):
        command = CommandsManager.get_command(c_id)
        # execute() is declared with a plain `def` in commands.py, so it is not awaited
        return command.execute(client_response)
```

### `src/myfasthtml/core/commands.py`: execution phases (Level B)

```python
def execute(self, client_response=None):
    with profiler.span("before_commands"):
        ...
    with profiler.span("callback"):
        result = self.callback(...)
    with profiler.span("after_commands"):
        ...
    with profiler.span("oob_swap"):
        ...
```

---

## Implementation Plan

### Phase 1: Core

**File**: `src/myfasthtml/core/profiler.py`

1. `ProfilingSpan` dataclass
2. `CumulativeSpan` dataclass
3. `ProfilingTrace` dataclass
4. `ProfilingManager` class with all probe mechanisms
5. `profiler` singleton
6. Hook into `utils.py` (Level A)
7. Hook into `commands.py` (Level B)

**Tests**: `tests/core/test_profiler.py`

| Test | Description |
|------|-------------|
| `test_i_can_create_a_span` | Basic span creation and timing |
| `test_i_can_nest_spans` | Child spans are correctly parented |
| `test_i_can_use_span_as_decorator` | Decorator captures args automatically |
| `test_i_can_use_cumulative_span` | Aggregates count/total/min/max/avg |
| `test_i_can_attach_data_to_span` | `span.set()` and `current_span().set()` |
| `test_i_can_clear_traces` | Buffer is emptied after `clear()` |
| `test_i_can_enable_disable_profiler` | Probes are no-ops when disabled |
| `test_i_can_measure_overhead` | Overhead metrics are exposed |
| `test_i_can_use_trace_all_on_class` | All methods of a class are wrapped |
| `test_i_can_use_trace_calls_on_function` | Sub-calls are traced via setprofile |

### Phase 2: Controls

**`src/myfasthtml/controls/ProfilerList.py`** (SingleInstance)
- Table of all traces: command name / total duration / timestamp
- Right panel: trace detail (kwargs, span breakdown)
- Buttons: enable/disable, clear
- Click on a trace → opens ProfilerDetail

**`src/myfasthtml/controls/ProfilerDetail.py`** (MultipleInstance)
- Hierarchical span tree for a single trace
- Two display modes: list and pie chart
- Click on a span → zooms into its children (if any)
- Displays cumulative spans with count/min/max/avg
- Shows overhead metrics

**`src/myfasthtml/controls/ProfilerPieChart.py`** (future)
- Pie chart visualization of span distribution at a given zoom level

---

## Naming Conventions

- Control files: `ProfilerXxx.py`
- CSS classes: `mf-profiler-xxx`
- Logger: `logging.getLogger("Profiler")`
- Constant: `PROFILER_MAX_TRACES = 500` in `src/myfasthtml/core/constants.py`