Files
MyFastHtml/docs/Profiler.md

11 KiB
Raw Blame History

Profiler — Design & Implementation Plan

Context

Performance issues were identified during keyboard navigation in the DataGrid (173ms server-side per command call). The HTMX debug traces (via htmx_debug.js) confirmed the bottleneck is server-side. A persistent, in-application profiling system is needed for continuous analysis across sessions and future investigations.


Implementation Status

Phase Item Status
Phase 1 — Core profiler.py — data model + probe mechanisms Done
Phase 1 — Core tests/core/test_profiler.py — full test suite (7 classes) Done
Phase 1 — Core Hook utils.py — Level A command_span Done
Phase 1 — Core Hook commands.py — Level B phases Deferred
Phase 2 — Controls Profiler.py — global layout (toolbar + list) 🔄 In progress
Phase 2 — Controls Profiler.py — detail panel (span tree + pie) Pending
Phase 2 — Controls CSS profiler.css 🔄 In progress
Phase 2 — Controls ProfilerPieChart.py Future

Design Decisions

Data Collection Strategy

Two complementary levels:

  • Level A (route handler): One trace per /myfasthtml/commands call. Captures total server-side duration including lookup, execution, and HTMX swap overhead.
  • Level B (granular spans): Decomposition of each trace into named phases. Activated by placing probes in the code.

Both levels are active simultaneously. Level A gives the global picture; Level B gives the breakdown.

Probe Mechanisms

Four complementary mechanisms, chosen based on the context:

1. Context manager — partial block instrumentation

with profiler.span("oob_swap"):
    # only this block is timed
    result = build_oob_elements(...)

Metadata can be attached during execution:

with profiler.span("query") as span:
    rows = db.query(...)
    span.set("row_count", len(rows))

2. Decorator — full function instrumentation

@profiler.span("callback")
def execute_callback(self, client_response):
    ...

Function arguments are captured automatically. Metadata can be attached via current_span():

@profiler.span("process")
def process(self, rows):
    result = do_work(rows)
    profiler.current_span().set("row_count", len(result))
    return result

3. Cumulative span — loop instrumentation

For loops with many iterations. Aggregates instead of creating one span per iteration.

for row in rows:
    with profiler.cumulative_span("process_row"):
        process(row)

# or as a decorator
@profiler.cumulative_span("process_row")
def process_row(self, row):
    ...

Exposes: count, total, min, max, avg. Single entry in the trace tree regardless of iteration count.

4. trace_all — class-level static instrumentation

Wraps all methods of a class at definition time. No runtime overhead beyond the spans themselves.

@profiler.trace_all
class DataGrid(MultipleInstance):
    def navigate_cell(self, ...):  # auto-spanned
        ...

# Exclude specific methods
@profiler.trace_all(exclude=["__ft__", "render"])
class DataGrid(MultipleInstance):
    ...

Implementation: uses inspect to iterate over methods and wraps each with @profiler.span(). No sys.settrace() involved — pure static wrapping.

5. trace_calls — sub-call exploration

Traces all function calls made within a single function, recursively. Used for exploration when the bottleneck location is unknown.

@profiler.trace_calls
def navigate_cell(self, ...):
    self._update_selection()   # auto-traced as child span
    self._compute_visible()    # auto-traced as child span
    db.query(...)              # auto-traced as child span

Implementation: uses sys.setprofile() scoped to the decorated function's execution only. Overhead is localized to that function's call stack. This is an exploration tool — use it to identify hotspots, then replace with explicit probes.

Span Hierarchy

Hierarchy is determined by code nesting via a ContextVar stack (async-safe). No explicit parent references required.

with profiler.span("execute"):           # root
    with profiler.span("callback"):      # child of execute
        result = self.callback(...)
    with profiler.span("oob_swap"):      # sibling of callback
        ...

When a command calls another command, the second command's spans automatically become children of the first command's active span.

profiler.current_span() provides access to the active span from anywhere in the call stack.

Storage

  • Scope: Global (all sessions). Profiling measures server behavior, not per-user state.
  • Structure: deque with a configurable maximum size.
  • Default size: 500 traces (constant PROFILER_MAX_TRACES).
  • Eviction: Oldest traces are dropped when the buffer is full (FIFO).
  • Persistence: In-memory only. Lost on server restart.

Toggle and Clear

  • profiler.enabled — boolean flag. When False, all probe mechanisms are no-ops (zero overhead).
  • profiler.clear() — empties the trace buffer.
  • Both are controllable from the UI control.

Overhead Measurement

The ProfilingManager self-profiles its own span.__enter__ and span.__exit__ calls. Exposes:

  • overhead_per_span_us — average cost of one span boundary in microseconds
  • total_overhead_ms — estimated total overhead across all active spans

Visible in the UI to verify the profiler does not bias measurements significantly.


Data Model

ProfilingTrace
  command_name: str
  command_id: str
  kwargs: dict
  timestamp: datetime
  total_duration_ms: float
  root_span: ProfilingSpan

ProfilingSpan
  name: str
  start: float          (perf_counter)
  duration_ms: float
  data: dict            (attached via span.set())
  children: list[ProfilingSpan | CumulativeSpan]

CumulativeSpan
  name: str
  count: int
  total_ms: float
  min_ms: float
  max_ms: float
  avg_ms: float

Code Hooks

src/myfasthtml/core/utils.py — route handler (Level A)

command = CommandsManager.get_command(c_id)
if command:
    with profiler.command_span(command.name, c_id, client_response or {}):
        return command.execute(client_response)

src/myfasthtml/core/commands.py — execution phases (Level B) Deferred

Planned breakdown inside Command.execute():

def execute(self, client_response=None):
    with profiler.span("before_commands"):
        ...
    with profiler.span("callback"):
        result = self.callback(...)
    with profiler.span("after_commands"):
        ...
    with profiler.span("oob_swap"):
        ...

Deferred: will be added once the UI control is functional to immediately observe the breakdown.


UI Control Design

Control name: Profiler (SingleInstance)

Single entry point. Replaces the earlier ProfilerList name.

Files:

  • src/myfasthtml/controls/Profiler.py
  • src/myfasthtml/assets/core/profiler.css

Layout

Split view using Panel:

┌─────────────────────────────────────────────────────┐
│ [●] [🗑]          Overhead/span: 1.2µs  Traces: 8/500│  ← toolbar (icon-only)
├──────────────────────┬──────────────────────────────┤
│ Command  Duration  Time│ NavigateCell — 173.4ms  [≡][◕]│
│ ──────────────────────│ ─────────────────────────────│
│ NavigateCell  173ms …  │ [Metadata card]              │
│ NavigateCell  168ms …  │ [kwargs card]                │
│ SelectRow      42ms …  │ [Span breakdown / Pie chart] │
│ …                      │                              │
└──────────────────────┴──────────────────────────────┘

Toolbar

Icon-only buttons, no Menu control (Menu does not support toggle state). Direct mk.icon() calls:

  • Enable/disable: icon changes between "recording" and "stopped" states based on profiler.enabled
  • Clear: delete icon, always red
  • Refresh: manual refresh of the trace list (no auto-refresh yet — added in Step 2.1)

Overhead metrics displayed as plain text on the right side of the toolbar.

Trace list (left panel)

Three columns: command name / duration (color-coded) / timestamp. Click on a row → update right panel via HTMX.

Duration color thresholds:

  • Green (mf-profiler-fast): < 20 ms
  • Orange (mf-profiler-medium): 20100 ms
  • Red (mf-profiler-slow): > 100 ms

Detail panel (right)

Two view modes, toggled by icons in the detail panel header:

  1. Tree view (default): Properties-style cards (Metadata, kwargs) + span breakdown with proportional bars and indentation. Cumulative spans show ×N · min/avg/max badge.
  2. Pie view: ProfilerPieChart control (future) — distribution of time across spans at the current zoom level.

The Properties control is used as-is for Metadata and kwargs cards. The span breakdown is custom rendering (not a Properties instance).

Font conventions

  • Labels, headings, command names: --font-sans (DaisyUI default)
  • Values (durations, timestamps, kwargs values): --font-mono
  • Consistent with properties.css (mf-properties-value uses --default-mono-font-family)

Visual reference

Mockups available in examples/:

  • profiler_mockup.html — first iteration (monospace font everywhere)
  • profiler_mockup_2.htmlreference (correct fonts, icon toolbar, tree/pie toggle)

Implementation Plan

Phase 1 — Core Complete

  1. ProfilingSpan, CumulativeSpan, ProfilingTrace dataclasses
  2. ProfilingManager with all probe mechanisms
  3. profiler singleton
  4. Hook into utils.py (Level A)
  5. Hook into commands.py (Level B) — deferred

Tests: tests/core/test_profiler.py — 7 classes, full coverage

Phase 2 — Controls

Step 2.1 — Global layout (current) 🔄

src/myfasthtml/controls/Profiler.py:

  • SingleInstance inheriting
  • Toolbar: mk.icon() for enable/disable and clear, overhead text
  • Panel for split layout
  • Left: trace list table (command / duration / timestamp), click → select_trace command
  • Right: placeholder (empty until Step 2.2)

src/myfasthtml/assets/core/profiler.css:

  • All mf-profiler-* classes

Step 2.2 — Detail panel

Right panel content:

  • Metadata and kwargs via Properties
  • Span tree: custom _mk_span_tree() with bars and cumulative badges
  • View toggle (tree / pie) in detail header

Step 2.3 — Pie chart Future

src/myfasthtml/controls/ProfilerPieChart.py


Naming Conventions

  • Control files: ProfilerXxx.py
  • CSS classes: mf-profiler-xxx
  • Logger: logging.getLogger("Profiler")
  • Constant: PROFILER_MAX_TRACES = 500 in src/myfasthtml/core/constants.py