11 KiB
Profiler — Design & Implementation Plan
Context
Performance issues were identified during keyboard navigation in the DataGrid (173ms server-side
per command call). The HTMX debug traces (via htmx_debug.js) confirmed the bottleneck is
server-side. A persistent, in-application profiling system is needed for continuous analysis
across sessions and future investigations.
Implementation Status
| Phase | Item | Status |
|---|---|---|
| Phase 1 — Core | profiler.py — data model + probe mechanisms |
✅ Done |
| Phase 1 — Core | tests/core/test_profiler.py — full test suite (7 classes) |
✅ Done |
| Phase 1 — Core | Hook utils.py — Level A command_span |
✅ Done |
| Phase 1 — Core | Hook commands.py — Level B phases |
⏳ Deferred |
| Phase 2 — Controls | Profiler.py — global layout (toolbar + list) |
🔄 In progress |
| Phase 2 — Controls | Profiler.py — detail panel (span tree + pie) |
⏳ Pending |
| Phase 2 — Controls | CSS profiler.css |
🔄 In progress |
| Phase 2 — Controls | ProfilerPieChart.py |
⏳ Future |
Design Decisions
Data Collection Strategy
Two complementary levels:
- Level A (route handler): One trace per
/myfasthtml/commandscall. Captures total server-side duration including lookup, execution, and HTMX swap overhead. - Level B (granular spans): Decomposition of each trace into named phases. Activated by placing probes in the code.
Both levels are active simultaneously. Level A gives the global picture; Level B gives the breakdown.
Probe Mechanisms
Four complementary mechanisms, chosen based on the context:
1. Context manager — partial block instrumentation
with profiler.span("oob_swap"):
# only this block is timed
result = build_oob_elements(...)
Metadata can be attached during execution:
with profiler.span("query") as span:
rows = db.query(...)
span.set("row_count", len(rows))
2. Decorator — full function instrumentation
@profiler.span("callback")
def execute_callback(self, client_response):
...
Function arguments are captured automatically. Metadata can be attached via current_span():
@profiler.span("process")
def process(self, rows):
result = do_work(rows)
profiler.current_span().set("row_count", len(result))
return result
3. Cumulative span — loop instrumentation
For loops with many iterations. Aggregates instead of creating one span per iteration.
for row in rows:
with profiler.cumulative_span("process_row"):
process(row)
# or as a decorator
@profiler.cumulative_span("process_row")
def process_row(self, row):
...
Exposes: count, total, min, max, avg. Single entry in the trace tree regardless of
iteration count.
4. trace_all — class-level static instrumentation
Wraps all methods of a class at definition time. No runtime overhead beyond the spans themselves.
@profiler.trace_all
class DataGrid(MultipleInstance):
def navigate_cell(self, ...): # auto-spanned
...
# Exclude specific methods
@profiler.trace_all(exclude=["__ft__", "render"])
class DataGrid(MultipleInstance):
...
Implementation: uses inspect to iterate over methods and wraps each with @profiler.span().
No sys.settrace() involved — pure static wrapping.
5. trace_calls — sub-call exploration
Traces all function calls made within a single function, recursively. Used for exploration when the bottleneck location is unknown.
@profiler.trace_calls
def navigate_cell(self, ...):
self._update_selection() # auto-traced as child span
self._compute_visible() # auto-traced as child span
db.query(...) # auto-traced as child span
Implementation: uses sys.setprofile() scoped to the decorated function's execution only.
Overhead is localized to that function's call stack. This is an exploration tool — use it
to identify hotspots, then replace with explicit probes.
Span Hierarchy
Hierarchy is determined by code nesting via a ContextVar stack (async-safe). No explicit
parent references required.
with profiler.span("execute"): # root
with profiler.span("callback"): # child of execute
result = self.callback(...)
with profiler.span("oob_swap"): # sibling of callback
...
When a command calls another command, the second command's spans automatically become children of the first command's active span.
profiler.current_span() provides access to the active span from anywhere in the call stack.
Storage
- Scope: Global (all sessions). Profiling measures server behavior, not per-user state.
- Structure:
dequewith a configurable maximum size. - Default size: 500 traces (constant
PROFILER_MAX_TRACES). - Eviction: Oldest traces are dropped when the buffer is full (FIFO).
- Persistence: In-memory only. Lost on server restart.
Toggle and Clear
profiler.enabled— boolean flag. WhenFalse, all probe mechanisms are no-ops (zero overhead).profiler.clear()— empties the trace buffer.- Both are controllable from the UI control.
Overhead Measurement
The ProfilingManager self-profiles its own span.__enter__ and span.__exit__ calls.
Exposes:
overhead_per_span_us— average cost of one span boundary in microsecondstotal_overhead_ms— estimated total overhead across all active spans
Visible in the UI to verify the profiler does not bias measurements significantly.
Data Model
ProfilingTrace
command_name: str
command_id: str
kwargs: dict
timestamp: datetime
total_duration_ms: float
root_span: ProfilingSpan
ProfilingSpan
name: str
start: float (perf_counter)
duration_ms: float
data: dict (attached via span.set())
children: list[ProfilingSpan | CumulativeSpan]
CumulativeSpan
name: str
count: int
total_ms: float
min_ms: float
max_ms: float
avg_ms: float
Code Hooks
src/myfasthtml/core/utils.py — route handler (Level A) ✅
command = CommandsManager.get_command(c_id)
if command:
with profiler.command_span(command.name, c_id, client_response or {}):
return command.execute(client_response)
src/myfasthtml/core/commands.py — execution phases (Level B) ⏳ Deferred
Planned breakdown inside Command.execute():
def execute(self, client_response=None):
with profiler.span("before_commands"):
...
with profiler.span("callback"):
result = self.callback(...)
with profiler.span("after_commands"):
...
with profiler.span("oob_swap"):
...
Deferred: will be added once the UI control is functional to immediately observe the breakdown.
UI Control Design
Control name: Profiler (SingleInstance)
Single entry point. Replaces the earlier ProfilerList name.
Files:
src/myfasthtml/controls/Profiler.pysrc/myfasthtml/assets/core/profiler.css
Layout
Split view using Panel:
┌─────────────────────────────────────────────────────┐
│ [●] [🗑] Overhead/span: 1.2µs Traces: 8/500│ ← toolbar (icon-only)
├──────────────────────┬──────────────────────────────┤
│ Command Duration Time│ NavigateCell — 173.4ms [≡][◕]│
│ ──────────────────────│ ─────────────────────────────│
│ NavigateCell 173ms … │ [Metadata card] │
│ NavigateCell 168ms … │ [kwargs card] │
│ SelectRow 42ms … │ [Span breakdown / Pie chart] │
│ … │ │
└──────────────────────┴──────────────────────────────┘
Toolbar
Icon-only buttons, no Menu control (Menu does not support toggle state).
Direct mk.icon() calls:
- Enable/disable: icon changes between "recording" and "stopped" states based on
profiler.enabled - Clear: delete icon, always red
- Refresh: manual refresh of the trace list (no auto-refresh yet — added in Step 2.1)
Overhead metrics displayed as plain text on the right side of the toolbar.
Trace list (left panel)
Three columns: command name / duration (color-coded) / timestamp. Click on a row → update right panel via HTMX.
Duration color thresholds:
- Green (
mf-profiler-fast): < 20 ms - Orange (
mf-profiler-medium): 20–100 ms - Red (
mf-profiler-slow): > 100 ms
Detail panel (right)
Two view modes, toggled by icons in the detail panel header:
- Tree view (default): Properties-style cards (Metadata, kwargs) + span breakdown with
proportional bars and indentation. Cumulative spans show
×N · min/avg/maxbadge. - Pie view:
ProfilerPieChartcontrol (future) — distribution of time across spans at the current zoom level.
The Properties control is used as-is for Metadata and kwargs cards.
The span breakdown is custom rendering (not a Properties instance).
Font conventions
- Labels, headings, command names:
--font-sans(DaisyUI default) - Values (durations, timestamps, kwargs values):
--font-mono - Consistent with
properties.css(mf-properties-valueuses--default-mono-font-family)
Visual reference
Mockups available in examples/:
profiler_mockup.html— first iteration (monospace font everywhere)profiler_mockup_2.html— reference (correct fonts, icon toolbar, tree/pie toggle)
Implementation Plan
Phase 1 — Core ✅ Complete
ProfilingSpan,CumulativeSpan,ProfilingTracedataclassesProfilingManagerwith all probe mechanismsprofilersingleton- Hook into
utils.py(Level A) ✅ - Hook into
commands.py(Level B) — deferred
Tests: tests/core/test_profiler.py — 7 classes, full coverage ✅
Phase 2 — Controls
Step 2.1 — Global layout (current) 🔄
src/myfasthtml/controls/Profiler.py:
SingleInstanceinheriting- Toolbar:
mk.icon()for enable/disable and clear, overhead text Panelfor split layout- Left: trace list table (command / duration / timestamp), click → select_trace command
- Right: placeholder (empty until Step 2.2)
src/myfasthtml/assets/core/profiler.css:
- All
mf-profiler-*classes
Step 2.2 — Detail panel ⏳
Right panel content:
- Metadata and kwargs via
Properties - Span tree: custom
_mk_span_tree()with bars and cumulative badges - View toggle (tree / pie) in detail header
Step 2.3 — Pie chart ⏳ Future
src/myfasthtml/controls/ProfilerPieChart.py
Naming Conventions
- Control files:
ProfilerXxx.py - CSS classes:
mf-profiler-xxx - Logger:
logging.getLogger("Profiler") - Constant:
PROFILER_MAX_TRACES = 500insrc/myfasthtml/core/constants.py