356 lines
11 KiB
Markdown
356 lines
11 KiB
Markdown
# Profiler — Design & Implementation Plan
|
||
|
||
## Context
|
||
|
||
Performance issues were identified during keyboard navigation in the DataGrid (173ms server-side
|
||
per command call). The HTMX debug traces (via `htmx_debug.js`) confirmed the bottleneck is
|
||
server-side. A persistent, in-application profiling system is needed for continuous analysis
|
||
across sessions and future investigations.
|
||
|
||
---
|
||
|
||
## Implementation Status
|
||
|
||
| Phase | Item | Status |
|
||
|-------|------|--------|
|
||
| **Phase 1 — Core** | `profiler.py` — data model + probe mechanisms | ✅ Done |
|
||
| **Phase 1 — Core** | `tests/core/test_profiler.py` — full test suite (7 classes) | ✅ Done |
|
||
| **Phase 1 — Core** | Hook `utils.py` — Level A `command_span` | ✅ Done |
|
||
| **Phase 1 — Core** | Hook `commands.py` — Level B phases | ⏳ Deferred |
|
||
| **Phase 2 — Controls** | `Profiler.py` — global layout (toolbar + list) | 🔄 In progress |
|
||
| **Phase 2 — Controls** | `Profiler.py` — detail panel (span tree + pie) | ⏳ Pending |
|
||
| **Phase 2 — Controls** | CSS `profiler.css` | 🔄 In progress |
|
||
| **Phase 2 — Controls** | `ProfilerPieChart.py` | ⏳ Future |
|
||
|
||
---
|
||
|
||
## Design Decisions
|
||
|
||
### Data Collection Strategy
|
||
|
||
Two complementary levels:
|
||
|
||
- **Level A** (route handler): One trace per `/myfasthtml/commands` call. Captures total
|
||
server-side duration including lookup, execution, and HTMX swap overhead.
|
||
- **Level B** (granular spans): Decomposition of each trace into named phases. Activated
|
||
by placing probes in the code.
|
||
|
||
Both levels are active simultaneously. Level A gives the global picture; Level B gives the
|
||
breakdown.
|
||
|
||
### Probe Mechanisms
|
||
|
||
Four complementary mechanisms, chosen based on the context:
|
||
|
||
#### 1. Context manager — partial block instrumentation
|
||
|
||
```python
|
||
with profiler.span("oob_swap"):
|
||
# only this block is timed
|
||
result = build_oob_elements(...)
|
||
```
|
||
|
||
Metadata can be attached during execution:
|
||
|
||
```python
|
||
with profiler.span("query") as span:
|
||
rows = db.query(...)
|
||
span.set("row_count", len(rows))
|
||
```
|
||
|
||
#### 2. Decorator — full function instrumentation
|
||
|
||
```python
|
||
@profiler.span("callback")
|
||
def execute_callback(self, client_response):
|
||
...
|
||
```
|
||
|
||
Function arguments are captured automatically. Metadata can be attached via `current_span()`:
|
||
|
||
```python
|
||
@profiler.span("process")
|
||
def process(self, rows):
|
||
result = do_work(rows)
|
||
profiler.current_span().set("row_count", len(result))
|
||
return result
|
||
```
|
||
|
||
#### 3. Cumulative span — loop instrumentation
|
||
|
||
For loops with many iterations. Aggregates instead of creating one span per iteration.
|
||
|
||
```python
|
||
for row in rows:
|
||
with profiler.cumulative_span("process_row"):
|
||
process(row)
|
||
|
||
# or as a decorator
|
||
@profiler.cumulative_span("process_row")
|
||
def process_row(self, row):
|
||
...
|
||
```
|
||
|
||
Exposes: `count`, `total`, `min`, `max`, `avg`. Single entry in the trace tree regardless of
|
||
iteration count.
|
||
|
||
#### 4. `trace_all` — class-level static instrumentation
|
||
|
||
Wraps all methods of a class at definition time. No runtime overhead beyond the spans themselves.
|
||
|
||
```python
|
||
@profiler.trace_all
|
||
class DataGrid(MultipleInstance):
|
||
def navigate_cell(self, ...): # auto-spanned
|
||
...
|
||
|
||
# Exclude specific methods
|
||
@profiler.trace_all(exclude=["__ft__", "render"])
|
||
class DataGrid(MultipleInstance):
|
||
...
|
||
```
|
||
|
||
Implementation: uses `inspect` to iterate over methods and wraps each with `@profiler.span()`.
|
||
No `sys.settrace()` involved — pure static wrapping.
|
||
|
||
#### 5. `trace_calls` — sub-call exploration
|
||
|
||
Traces all function calls made within a single function, recursively. Used for exploration
|
||
when the bottleneck location is unknown.
|
||
|
||
```python
|
||
@profiler.trace_calls
|
||
def navigate_cell(self, ...):
|
||
self._update_selection() # auto-traced as child span
|
||
self._compute_visible() # auto-traced as child span
|
||
db.query(...) # auto-traced as child span
|
||
```
|
||
|
||
Implementation: uses `sys.setprofile()` scoped to the decorated function's execution only.
|
||
Overhead is localized to that function's call stack. This is an exploration tool — use it
|
||
to identify hotspots, then replace with explicit probes.
|
||
|
||
### Span Hierarchy
|
||
|
||
Hierarchy is determined by code nesting via a `ContextVar` stack (async-safe). No explicit
|
||
parent references required.
|
||
|
||
```python
|
||
with profiler.span("execute"): # root
|
||
with profiler.span("callback"): # child of execute
|
||
result = self.callback(...)
|
||
with profiler.span("oob_swap"): # sibling of callback
|
||
...
|
||
```
|
||
|
||
When a command calls another command, the second command's spans automatically become children
|
||
of the first command's active span.
|
||
|
||
`profiler.current_span()` provides access to the active span from anywhere in the call stack.
|
||
|
||
### Storage
|
||
|
||
- **Scope**: Global (all sessions). Profiling measures server behavior, not per-user state.
|
||
- **Structure**: `deque` with a configurable maximum size.
|
||
- **Default size**: 500 traces (constant `PROFILER_MAX_TRACES`).
|
||
- **Eviction**: Oldest traces are dropped when the buffer is full (FIFO).
|
||
- **Persistence**: In-memory only. Lost on server restart.
|
||
|
||
### Toggle and Clear
|
||
|
||
- `profiler.enabled` — boolean flag. When `False`, all probe mechanisms are no-ops (zero overhead).
|
||
- `profiler.clear()` — empties the trace buffer.
|
||
- Both are controllable from the UI control.
|
||
|
||
### Overhead Measurement
|
||
|
||
The `ProfilingManager` self-profiles its own `span.__enter__` and `span.__exit__` calls.
|
||
Exposes:
|
||
|
||
- `overhead_per_span_us` — average cost of one span boundary in microseconds
|
||
- `total_overhead_ms` — estimated total overhead across all active spans
|
||
|
||
Visible in the UI to verify the profiler does not bias measurements significantly.
|
||
|
||
---
|
||
|
||
## Data Model
|
||
|
||
```
|
||
ProfilingTrace
|
||
command_name: str
|
||
command_id: str
|
||
kwargs: dict
|
||
timestamp: datetime
|
||
total_duration_ms: float
|
||
root_span: ProfilingSpan
|
||
|
||
ProfilingSpan
|
||
name: str
|
||
start: float (perf_counter)
|
||
duration_ms: float
|
||
data: dict (attached via span.set())
|
||
children: list[ProfilingSpan | CumulativeSpan]
|
||
|
||
CumulativeSpan
|
||
name: str
|
||
count: int
|
||
total_ms: float
|
||
min_ms: float
|
||
max_ms: float
|
||
avg_ms: float
|
||
```
|
||
|
||
---
|
||
|
||
## Code Hooks
|
||
|
||
### `src/myfasthtml/core/utils.py` — route handler (Level A) ✅
|
||
|
||
```python
|
||
command = CommandsManager.get_command(c_id)
|
||
if command:
|
||
with profiler.command_span(command.name, c_id, client_response or {}):
|
||
return command.execute(client_response)
|
||
```
|
||
|
||
### `src/myfasthtml/core/commands.py` — execution phases (Level B) ⏳ Deferred
|
||
|
||
Planned breakdown inside `Command.execute()`:
|
||
|
||
```python
|
||
def execute(self, client_response=None):
|
||
with profiler.span("before_commands"):
|
||
...
|
||
with profiler.span("callback"):
|
||
result = self.callback(...)
|
||
with profiler.span("after_commands"):
|
||
...
|
||
with profiler.span("oob_swap"):
|
||
...
|
||
```
|
||
|
||
Deferred: will be added once the UI control is functional to immediately observe the breakdown.
|
||
|
||
---
|
||
|
||
## UI Control Design
|
||
|
||
### Control name: `Profiler` (SingleInstance)
|
||
|
||
Single entry point. Replaces the earlier `ProfilerList` name.
|
||
|
||
**Files:**
|
||
- `src/myfasthtml/controls/Profiler.py`
|
||
- `src/myfasthtml/assets/core/profiler.css`
|
||
|
||
### Layout
|
||
|
||
Split view using `Panel`:
|
||
|
||
```
|
||
┌─────────────────────────────────────────────────────┐
|
||
│ [●] [🗑] Overhead/span: 1.2µs Traces: 8/500│ ← toolbar (icon-only)
|
||
├──────────────────────┬──────────────────────────────┤
|
||
│ Command Duration Time│ NavigateCell — 173.4ms [≡][◕]│
|
||
│ ──────────────────────│ ─────────────────────────────│
|
||
│ NavigateCell 173ms … │ [Metadata card] │
|
||
│ NavigateCell 168ms … │ [kwargs card] │
|
||
│ SelectRow 42ms … │ [Span breakdown / Pie chart] │
|
||
│ … │ │
|
||
└──────────────────────┴──────────────────────────────┘
|
||
```
|
||
|
||
### Toolbar
|
||
|
||
Icon-only buttons, no `Menu` control (Menu does not support toggle state).
|
||
Direct `mk.icon()` calls:
|
||
|
||
- **Enable/disable**: icon changes between "recording" and "stopped" states based on `profiler.enabled`
|
||
- **Clear**: delete icon, always red
|
||
- **Refresh**: manual refresh of the trace list (no auto-refresh yet — added in Step 2.1)
|
||
|
||
Overhead metrics displayed as plain text on the right side of the toolbar.
|
||
|
||
### Trace list (left panel)
|
||
|
||
Three columns: command name / duration (color-coded) / timestamp.
|
||
Click on a row → update right panel via HTMX.
|
||
|
||
**Duration color thresholds:**
|
||
- Green (`mf-profiler-fast`): < 20 ms
|
||
- Orange (`mf-profiler-medium`): 20–100 ms
|
||
- Red (`mf-profiler-slow`): > 100 ms
|
||
|
||
### Detail panel (right)
|
||
|
||
Two view modes, toggled by icons in the detail panel header:
|
||
|
||
1. **Tree view** (default): Properties-style cards (Metadata, kwargs) + span breakdown with
|
||
proportional bars and indentation. Cumulative spans show `×N · min/avg/max` badge.
|
||
2. **Pie view**: `ProfilerPieChart` control (future) — distribution of time across spans
|
||
at the current zoom level.
|
||
|
||
The `Properties` control is used as-is for Metadata and kwargs cards.
|
||
The span breakdown is custom rendering (not a `Properties` instance).
|
||
|
||
### Font conventions
|
||
|
||
- Labels, headings, command names: `--font-sans` (DaisyUI default)
|
||
- Values (durations, timestamps, kwargs values): `--font-mono`
|
||
- Consistent with `properties.css` (`mf-properties-value` uses `--default-mono-font-family`)
|
||
|
||
### Visual reference
|
||
|
||
Mockups available in `examples/`:
|
||
- `profiler_mockup.html` — first iteration (monospace font everywhere)
|
||
- `profiler_mockup_2.html` — **reference** (correct fonts, icon toolbar, tree/pie toggle)
|
||
|
||
---
|
||
|
||
## Implementation Plan
|
||
|
||
### Phase 1 — Core ✅ Complete
|
||
|
||
1. `ProfilingSpan`, `CumulativeSpan`, `ProfilingTrace` dataclasses
|
||
2. `ProfilingManager` with all probe mechanisms
|
||
3. `profiler` singleton
|
||
4. Hook into `utils.py` (Level A) ✅
|
||
5. Hook into `commands.py` (Level B) — deferred
|
||
|
||
**Tests**: `tests/core/test_profiler.py` — 7 classes, full coverage ✅
|
||
|
||
### Phase 2 — Controls
|
||
|
||
#### Step 2.1 — Global layout (current) 🔄
|
||
|
||
`src/myfasthtml/controls/Profiler.py`:
|
||
- `SingleInstance` inheriting
|
||
- Toolbar: `mk.icon()` for enable/disable and clear, overhead text
|
||
- `Panel` for split layout
|
||
- Left: trace list table (command / duration / timestamp), click → select_trace command
|
||
- Right: placeholder (empty until Step 2.2)
|
||
|
||
`src/myfasthtml/assets/core/profiler.css`:
|
||
- All `mf-profiler-*` classes
|
||
|
||
#### Step 2.2 — Detail panel ⏳
|
||
|
||
Right panel content:
|
||
- Metadata and kwargs via `Properties`
|
||
- Span tree: custom `_mk_span_tree()` with bars and cumulative badges
|
||
- View toggle (tree / pie) in detail header
|
||
|
||
#### Step 2.3 — Pie chart ⏳ Future
|
||
|
||
`src/myfasthtml/controls/ProfilerPieChart.py`
|
||
|
||
---
|
||
|
||
## Naming Conventions
|
||
|
||
- Control files: `ProfilerXxx.py`
|
||
- CSS classes: `mf-profiler-xxx`
|
||
- Logger: `logging.getLogger("Profiler")`
|
||
- Constant: `PROFILER_MAX_TRACES = 500` in `src/myfasthtml/core/constants.py`
|