DataGrid Formulas

Overview

The DataGrid formula system adds computed columns to the DataGrid. A formula column applies a single expression to every row, producing derived values from existing data — within the same table or across tables.

The system is designed for:

  • Column-level formulas: one formula per column, applied to all rows
  • Cross-table references: direct syntax to reference columns from other tables
  • Reactive recalculation: dirty flag propagation with page-aware computation
  • Cell-level overrides (planned): individual cells can override the column formula

Formula Language

Basic Syntax

A formula is an expression that references columns with {ColumnName} and produces a value for each row:

{Price} * {Quantity}

References use curly braces {} to distinguish column names from keywords and functions. Column names are matched by ID or title.
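As a rough illustration of matching a referenced name by ID or title, the sketch below resolves a name against a list of column descriptors. The dictionary layout and the choice to try IDs before titles are assumptions made for the example, not the documented behavior.

```python
from typing import Optional

def resolve_column(name: str, columns: list) -> Optional[str]:
    """Return the column ID for a referenced name, or None if nothing matches.

    Assumption: IDs are tried first, titles second.
    """
    for col in columns:
        if col["id"] == name:
            return col["id"]
    for col in columns:
        if col["title"] == name:
            return col["id"]
    return None

# Hypothetical column descriptors for illustration only.
columns = [
    {"id": "c1", "title": "Price"},
    {"id": "c2", "title": "Quantity"},
]

by_title = resolve_column("Price", columns)   # matched on title
by_id = resolve_column("c2", columns)         # matched on ID
missing = resolve_column("Discount", columns) # no such column
```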

Operators

Arithmetic

| Operator | Description | Example |
|---|---|---|
| + | Addition | {Price} + {Tax} |
| - | Subtraction | {Total} - {Discount} |
| * | Multiplication | {Price} * {Quantity} |
| / | Division | {Total} / {Count} |
| % | Modulo | {Value} % 2 |
| ^ | Power | {Base} ^ 2 |

Comparison

| Operator | Description | Example |
|---|---|---|
| == | Equal | {Status} == "active" |
| != | Not equal | {Status} != "deleted" |
| > | Greater than | {Price} > 100 |
| < | Less than | {Stock} < 10 |
| >= | Greater or equal | {Score} >= 80 |
| <= | Less or equal | {Age} <= 18 |
| contains | String contains | {Name} contains "Corp" |
| startswith | String starts with | {Code} startswith "ERR" |
| endswith | String ends with | {File} endswith ".csv" |
| in | Value in list | {Status} in ["active", "new"] |
| between | Value in range | {Age} between 18 and 65 |
| isempty | Value is empty | {Notes} isempty |
| isnotempty | Value is not empty | {Email} isnotempty |
| isnan | Value is NaN | {Score} isnan |

Logical

| Operator | Description | Example |
|---|---|---|
| and | Logical AND | {Age} > 18 and {Status} == "active" |
| or | Logical OR | {Type} == "A" or {Type} == "B" |
| not | Negation | not {Status} == "deleted" |

Parentheses control precedence: ({Type} == "A" or {Type} == "B") and {Active} == True

Conditions (suffix-if)

Conditions use a suffix-if syntax: the result expression comes first, then the condition. This keeps the focus on the output, not the branching logic.

Simple condition (no else — result is None when false)

{Price} * 0.8 if {Country} == "FR"

With else

{Price} * 0.8 if {Country} == "FR" else {Price}

Chained conditions

{Price} * 0.8 if {Country} == "FR" else {Price} * 0.9 if {Country} == "DE" else {Price}

With logical operators

{Price} * 0.8 if {Country} == "FR" and {Quantity} > 10 else {Price}

With grouping

{Price} * 0.8 if ({Country} == "FR" or {Country} == "DE") and {Quantity} > 10
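The suffix-if forms above read much like Python's conditional expression. The sketch below mirrors the simple and chained examples as Python functions, assuming that a missing else yields None and that chained else branches nest to the right, as they do in Python. The function names are illustrative only.

```python
from typing import Optional

def simple_condition(price: float, country: str) -> Optional[float]:
    # Mirrors: {Price} * 0.8 if {Country} == "FR"
    # No else branch, so the result is None when the condition is false.
    return price * 0.8 if country == "FR" else None

def chained_condition(price: float, country: str) -> float:
    # Mirrors: {Price} * 0.8 if {Country} == "FR"
    #          else {Price} * 0.9 if {Country} == "DE" else {Price}
    return price * 0.8 if country == "FR" else price * 0.9 if country == "DE" else price

no_match = simple_condition(100, "US")   # None: condition is false, no else
german = chained_condition(100, "DE")    # second branch applies
default = chained_condition(100, "US")   # final else applies
```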

Functions

Math

| Function | Description | Example |
|---|---|---|
| round(expr, n) | Round to n decimals | round({Price} * 1.2, 2) |
| abs(expr) | Absolute value | abs({Balance}) |
| min(expr, expr) | Minimum of two values | min({Price}, {MaxPrice}) |
| max(expr, expr) | Maximum of two values | max({Score}, 0) |
| sum(expr, ...) | Sum of values | sum({Q1}, {Q2}, {Q3}, {Q4}) |
| avg(expr, ...) | Average of values | avg({Q1}, {Q2}, {Q3}, {Q4}) |

Text

| Function | Description | Example |
|---|---|---|
| upper(expr) | Uppercase | upper({Name}) |
| lower(expr) | Lowercase | lower({Email}) |
| len(expr) | String length | len({Description}) |
| concat(expr, ...) | Concatenate strings | concat({First}, " ", {Last}) |
| trim(expr) | Remove whitespace | trim({Input}) |
| left(expr, n) | First n characters | left({Code}, 3) |
| right(expr, n) | Last n characters | right({Phone}, 4) |

Date

| Function | Description | Example |
|---|---|---|
| year(expr) | Extract year | year({CreatedAt}) |
| month(expr) | Extract month | month({CreatedAt}) |
| day(expr) | Extract day | day({CreatedAt}) |
| today() | Current date | datediff({DueDate}, today()) |
| datediff(expr, expr) | Difference in days | datediff({End}, {Start}) |
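To make the intended semantics of a few scalar functions concrete, here is a minimal Python sketch of left, right, and datediff. It assumes that datediff subtracts its second argument from its first (as the {End}, {Start} example suggests) and that right with n = 0 returns an empty string. Neither detail is confirmed by this document.

```python
from datetime import date

def left(text: str, n: int) -> str:
    """First n characters of text."""
    return text[:n]

def right(text: str, n: int) -> str:
    """Last n characters of text (text[-0:] would return everything, hence the guard)."""
    return text[-n:] if n > 0 else ""

def datediff(end: date, start: date) -> int:
    """Whole days from start to end (assumed argument order: end first)."""
    return (end - start).days

prefix = left("ERR-042", 3)
suffix = right("555-0199", 4)
days = datediff(date(2024, 3, 1), date(2024, 2, 1))  # February 2024 has 29 days
```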

Aggregation (for cross-table contexts)

| Function | Description | Example |
|---|---|---|
| sum(expr) | Sum values | sum({Orders.Amount WHERE Orders.ClientId = Id}) |
| count(expr) | Count values | count({Orders.Id WHERE Orders.ClientId = Id}) |
| avg(expr) | Average | avg({Reviews.Score WHERE Reviews.ProductId = Id}) |
| min(expr) | Minimum | min({Bids.Price WHERE Bids.ItemId = Id}) |
| max(expr) | Maximum | max({Bids.Price WHERE Bids.ItemId = Id}) |

Cross-Table References

Direct Reference

Reference a column from another table using {TableName.ColumnName}:

{Products.Price} * {Quantity}

Join Resolution (implicit)

When referencing another table without a WHERE clause, the join is resolved automatically:

  1. By id column: if both tables have a column named id, rows are matched on equal id values
  2. By row index: if either table lacks an id column, rows are matched by their internal row index (stable across sort/filter)
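The two resolution rules can be sketched as follows. Tables are modeled here as plain dictionaries of column lists, which is an assumption for the example and not the DataGrid's actual storage. Returning None when no row matches follows the "No join match" entry in the error handling table.

```python
def implicit_lookup(current: dict, other: dict, column: str, row_index: int):
    """Return other[column] for the row of `other` matched to `row_index` of `current`."""
    if "id" in current and "id" in other:
        # Rule 1: both tables have an id column, so match on equal id values.
        target = current["id"][row_index]
        try:
            match = other["id"].index(target)
        except ValueError:
            return None  # no row in the other table carries this id
        return other[column][match]
    # Rule 2: fall back to matching by row index.
    values = other[column]
    return values[row_index] if row_index < len(values) else None

orders = {"id": [2, 1, 3], "Quantity": [5, 1, 2]}
products = {"id": [1, 2], "Price": [10.0, 20.0]}

price_for_first_order = implicit_lookup(orders, products, "Price", 0)  # id 2
price_without_match = implicit_lookup(orders, products, "Price", 2)    # id 3 is absent

# Without id columns, rows are paired by position.
by_index = implicit_lookup({"Quantity": [1, 2]}, {"Price": [3.0, 4.0]}, "Price", 1)
```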

Explicit Join (WHERE clause)

For explicit control over which row of the other table to use:

{Products.Price WHERE Products.Code = ProductCode} * {Quantity}

Inside the WHERE clause:

  • Products.Code refers to a column in the referenced table
  • ProductCode (no Table. prefix) refers to a column in the current table

Aggregation with Cross-Table

When a cross-table reference matches multiple rows, use an aggregation function:

sum({OrderLines.Amount WHERE OrderLines.OrderId = Id})

Without aggregation, a multi-row match returns the first matching value.
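The difference between an aggregated and a non-aggregated multi-row match can be illustrated with a small sketch. The helper name where_matches and the dictionary-of-lists table layout are hypothetical.

```python
def where_matches(other: dict, value_column: str, key_column: str, key) -> list:
    """Values of value_column for every row of `other` whose key_column equals key."""
    return [
        value
        for value, candidate in zip(other[value_column], other[key_column])
        if candidate == key
    ]

order_lines = {"OrderId": [1, 1, 2], "Amount": [10, 15, 7]}

# sum({OrderLines.Amount WHERE OrderLines.OrderId = Id}) for the row where Id == 1
matches = where_matches(order_lines, "Amount", "OrderId", 1)
total = sum(matches)

# Without an aggregation function, only the first matching value is used.
first = matches[0] if matches else None
```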

Calculation Engine

Dependency Graph (DAG)

The formula system maintains a Directed Acyclic Graph of dependencies between columns:

  • Nodes: each formula column is a node, identified by table_name.column_id
  • Edges: if column A's formula references column B, an edge B → A exists ("A depends on B")
  • Both directions are tracked:
    • Precedents: columns that a formula reads from
    • Dependents: columns that need recalculation when this column changes

Cross-table references create edges that span DataGrid instances, managed at the DataGridsManager level.
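A minimal sketch of a graph that tracks both directions is shown below. The class and method names are hypothetical; only the table_name.column_id node identifiers and the precedents/dependents terminology come from the description above.

```python
from collections import defaultdict

class DependencyGraph:
    """Tracks, for each node, what it reads from and what reads from it."""

    def __init__(self):
        self.precedents = defaultdict(set)  # node -> columns its formula reads
        self.dependents = defaultdict(set)  # node -> columns to recalculate on change

    def set_formula_refs(self, node: str, refs: set) -> None:
        """Register (or replace) the references of a formula column."""
        # Drop edges left over from a previous formula on the same column.
        for old in self.precedents.pop(node, set()):
            self.dependents[old].discard(node)
        for ref in refs:
            self.precedents[node].add(ref)
            self.dependents[ref].add(node)

graph = DependencyGraph()
graph.set_formula_refs("orders.total", {"orders.price", "orders.quantity"})
# A cross-table reference is just another node ID with a different table prefix.
graph.set_formula_refs("orders.total", {"products.price", "orders.quantity"})
```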

Dirty Flag Propagation

When a source column's data changes:

  1. The source column is marked dirty
  2. All direct dependents are marked dirty
  3. Propagation continues recursively through the DAG
  4. Each dirty column maintains a dirty row set: the specific row indices that need recalculation

Propagation is immediate and cheap: it only marks flags and performs no computation.
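The propagation steps can be sketched as a breadth-first walk over the dependents map. This example assumes that the same row indices are dirtied in every downstream column, which holds for same-table formulas; cross-table edges would need a row mapping that is not shown here.

```python
from collections import deque

def propagate_dirty(dependents: dict, source: str, rows: set, dirty: dict) -> dict:
    """Mark `rows` dirty on `source` and on every column reachable through `dependents`."""
    queue = deque([source])
    visited = {source}
    while queue:
        node = queue.popleft()
        # Only flags are touched here; no formula is evaluated.
        dirty.setdefault(node, set()).update(rows)
        for child in dependents.get(node, ()):
            if child not in visited:
                visited.add(child)
                queue.append(child)
    return dirty

dependents = {
    "t.price": {"t.total"},
    "t.total": {"t.total_vat"},
}
dirty = propagate_dirty(dependents, "t.price", {3, 7}, {})
```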

Recalculation Strategy (Hybrid)

Actual computation is deferred to rendering time:

  1. On value change → dirty flags propagate instantly through the DAG
  2. On page render (mk_body_content_page) → only dirty rows within the visible page (up to 1000 rows) are recalculated
  3. Off-screen pages remain dirty until scrolled into view
  4. Calculation follows topological order of the DAG to ensure precedents are computed before dependents
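A simplified view of the render-time step: walk the formula columns in topological order and recompute only the dirty rows that fall inside the visible page. The function name and the compute callback are placeholders, not part of mk_body_content_page.

```python
def recalc_visible(topo_order: list, dirty: dict, page_start: int, page_size: int, compute) -> list:
    """Recompute dirty rows of the visible page; off-page rows stay dirty."""
    page = range(page_start, page_start + page_size)
    recalculated = []
    for column in topo_order:  # precedents come before dependents
        rows = dirty.get(column)
        if not rows:
            continue
        visible = sorted(r for r in rows if r in page)
        for row in visible:
            compute(column, row)
            recalculated.append((column, row))
        rows.difference_update(visible)  # these rows are clean again
    return recalculated

dirty = {"total": {1, 1500}, "total_vat": {1, 1500}}
calls = []
done = recalc_visible(
    ["total", "total_vat"], dirty, 0, 1000,
    lambda column, row: calls.append((column, row)),
)
```

Row 1500 stays in the dirty sets until a later render covers the page that contains it.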

Cycle Detection

Before accepting a formula, the engine checks the DAG for cycles by running a topological sort with Kahn's algorithm. If a cycle is detected:

  • The formula is rejected
  • The editor displays an error identifying the circular dependency chain
  • The previous formula (if any) remains unchanged
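For reference, a compact version of Kahn's algorithm over a precedents map is sketched below. Checking a candidate formula against a copy of the graph, so that a rejected formula leaves the existing state untouched, is an assumption about how the check could be staged; would_create_cycle is a hypothetical helper name.

```python
from collections import deque

def topological_order(precedents: dict) -> list:
    """Kahn's algorithm. precedents maps node -> set of nodes it reads from.

    Raises ValueError if some nodes can never be ordered, which means a cycle exists.
    """
    nodes = set(precedents)
    for refs in precedents.values():
        nodes |= refs
    indegree = {n: len(precedents.get(n, ())) for n in nodes}
    dependents = {n: [] for n in nodes}
    for node, refs in precedents.items():
        for ref in refs:
            dependents[ref].append(node)

    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in sorted(dependents[node]):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    if len(order) != len(nodes):
        # Nodes left over are on a cycle or downstream of one.
        stuck = sorted(n for n in nodes if indegree[n] > 0)
        raise ValueError(f"circular dependency involving: {stuck}")
    return order

def would_create_cycle(precedents: dict, node: str, refs: set) -> bool:
    """Test a candidate formula on a copy of the graph."""
    trial = {key: set(value) for key, value in precedents.items()}
    trial[node] = set(refs)
    try:
        topological_order(trial)
    except ValueError:
        return True
    return False

graph = {"t.total": {"t.price", "t.quantity"}}
rejected = would_create_cycle(graph, "t.price", {"t.total"})  # price <-> total
accepted = would_create_cycle(graph, "t.vat", {"t.total"})
```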

Caching

Each formula column caches its computed values:

  • Results are stored in ns_fast_access[col_id] alongside raw data columns
  • The dirty row set tracks which cached values are stale
  • Non-dirty rows return their cached value without re-evaluation
  • Cache is invalidated per-row when source data changes
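The interplay between cached values and the dirty row set might look like the sketch below. It uses a plain list as a stand-in for the array held in ns_fast_access, and the class name and compute callback are hypothetical.

```python
class FormulaColumnCache:
    """Per-column cache: clean rows are served as-is, dirty rows are recomputed on read."""

    def __init__(self, n_rows: int, compute):
        self.values = [None] * n_rows
        self.dirty = set(range(n_rows))  # nothing has been computed yet
        self._compute = compute

    def invalidate(self, row: int) -> None:
        """Called when source data for this row changes."""
        self.dirty.add(row)

    def get(self, row: int):
        if row in self.dirty:
            self.values[row] = self._compute(row)
            self.dirty.discard(row)
        return self.values[row]

prices = [10, 20]
evaluations = []

def double_price(row: int) -> int:
    evaluations.append(row)  # record each real evaluation
    return prices[row] * 2

cache = FormulaColumnCache(len(prices), double_price)
first_read = cache.get(0)
second_read = cache.get(0)   # served from the cache, no re-evaluation
prices[0] = 15
cache.invalidate(0)
after_change = cache.get(0)  # recomputed because the row was marked dirty
```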

Evaluation

Row-by-Row Execution

Formulas are evaluated row-by-row within the page being rendered. For each row:

  1. Resolve column references {ColumnName} to the cell value at the current row index
  2. Resolve cross-table references {Table.Column} via the join mechanism
  3. Evaluate the expression with resolved values
  4. Store the result in the cache (ns_fast_access)
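The per-row loop can be sketched as below, with a Python callable standing in for the evaluable representation produced by the parser and a dictionary of lists standing in for ns_fast_access. Cross-table resolution is omitted from this example.

```python
def evaluate_rows(formula_fn, ref_names: list, store: dict, out_column: str, rows) -> None:
    """Evaluate formula_fn for each row and write the result into store[out_column]."""
    for row in rows:
        # Step 1: resolve each reference to the cell value at this row index.
        values = {name: store[name][row] for name in ref_names}
        # Steps 3 and 4: evaluate with the resolved values, then cache the result.
        store[out_column][row] = formula_fn(values)

store = {
    "Price": [2.0, 3.0, 4.0],
    "Quantity": [5, 4, 1],
    "Total": [None, None, None],
}

# Stand-in for the compiled form of: {Price} * {Quantity}
evaluate_rows(lambda v: v["Price"] * v["Quantity"], ["Price", "Quantity"], store, "Total", [0, 1])
```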

Parser

The formula language uses a custom grammar parsed with Lark (consistent with the formatting DSL). The parser:

  1. Tokenizes the formula string
  2. Builds an AST (Abstract Syntax Tree)
  3. Transforms the AST into an evaluable representation
  4. Extracts column references for dependency graph registration
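To illustrate steps 1 and 4 without depending on Lark, the sketch below tokenizes a formula with a standard-library regular expression and collects the referenced names. It is a stand-in for the real grammar: the token categories are assumptions, and no AST is built.

```python
import re

# Hypothetical token categories; the actual Lark grammar may differ.
TOKEN = re.compile(r"""
    (?P<REF>\{[^{}]+\})
  | (?P<NUMBER>\d+(?:\.\d+)?)
  | (?P<STRING>"[^"]*")
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<OP>==|!=|>=|<=|[-+*/%^<>(),\[\]])
  | (?P<WS>\s+)
""", re.VERBOSE)

def tokenize(formula: str) -> list:
    """Split a formula into (kind, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(formula):
        match = TOKEN.match(formula, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {formula[pos]!r} at position {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

def references(tokens: list) -> list:
    """Names inside {...} tokens, used to register dependency edges."""
    return [text[1:-1] for kind, text in tokens if kind == "REF"]

tokens = tokenize('{Price} * 0.8 if {Country} == "FR"')
refs = references(tokens)
```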

Error Handling

| Error Type | Behavior |
|---|---|
| Syntax error | Editor highlights the error, formula not saved |
| Unknown column | Editor highlights, autocompletion suggests fixes |
| Type mismatch | Cell displays error indicator, other cells unaffected |
| Division by zero | Cell displays #DIV/0! or None |
| Circular dependency | Formula rejected, editor shows cycle chain |
| Cross-table not found | Editor highlights unknown table name |
| No join match | Cell displays None |
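Runtime errors are contained per cell. One way to picture this is a wrapper that converts exceptions into cell markers, as sketched below. The #DIV/0! marker appears in the table above; the #TYPE! marker and the safe_eval name are placeholders invented for this example.

```python
def safe_eval(formula_fn, values: dict):
    """Evaluate one cell; turn runtime errors into markers so other cells are unaffected."""
    try:
        return formula_fn(values)
    except ZeroDivisionError:
        return "#DIV/0!"
    except TypeError:
        return "#TYPE!"  # hypothetical indicator for a type mismatch

# Stand-in for the compiled form of: {Total} / {Count}
def average(values: dict):
    return values["Total"] / values["Count"]

rows = [
    {"Total": 10, "Count": 2},
    {"Total": 10, "Count": 0},      # division by zero
    {"Total": "abc", "Count": 2},   # type mismatch
]
results = [safe_eval(average, row) for row in rows]
```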

User Interface

Creating a Formula Column

Formula columns are created and edited through the DataGridColumnsManager:

  1. User opens the Columns Manager panel
  2. Adds a new column or edits an existing one
  3. Selects column type "Formula"
  4. A DslEditor (CodeMirror 5) opens for formula input
  5. The editor provides:
    • Syntax highlighting: keywords, column references, functions, operators
    • Autocompletion: column names (current table and other tables), function names, table names
    • Validation: real-time syntax checking and dependency cycle detection
    • Error markers: inline error indicators with descriptions

Formula Column Properties

A formula column extends DataGridColumnState with:

| Property | Type | Description |
|---|---|---|
| formula | str or None | The formula expression (None for data columns) |
| col_type | ColumnType | Set to ColumnType.Formula |

Other properties (title, visible, width, format) remain unchanged.

Formula columns are read-only in the grid body — cell values are computed, not editable. Formatting rules from the formatting DSL apply to formula columns like any other column.

Integration Points

| Component | Role |
|---|---|
| DataGridColumnState | Stores formula field and ColumnType.Formula type |
| DatagridStore | ns_fast_access caches formula results as numpy arrays |
| DataGridColumnsManager | UI for creating/editing formula columns |
| DataGridsManager | Hosts the global dependency DAG across all tables |
| DslEditor | CodeMirror 5 editor with highlighting and autocompletion |
| FormattingEngine | Applies formatting rules AFTER formula evaluation |
| mk_body_content_page() | Triggers formula computation for visible rows |
| mk_body_cell_content() | Reads computed values from ns_fast_access |

Syntax Summary

# Basic arithmetic
{Price} * {Quantity}

# Function call
round({Price} * 1.2, 2)

# Simple condition (None if false)
{Price} * 0.8 if {Country} == "FR"

# Condition with else
{Price} * 0.8 if {Country} == "FR" else {Price}

# Chained conditions
{Price} * 0.8 if {Country} == "FR" else {Price} * 0.9 if {Country} == "DE" else {Price}

# Logical operators
{Price} * 0.8 if {Country} == "FR" and {Quantity} > 10

# Grouping
{Price} * 0.8 if ({Country} == "FR" or {Country} == "DE") and {Quantity} > 10

# Cross-table (implicit join on id)
{Products.Price} * {Quantity}

# Cross-table (explicit join)
{Products.Price WHERE Products.Code = ProductCode} * {Quantity}

# Cross-table aggregation
sum({OrderLines.Amount WHERE OrderLines.OrderId = Id})

# Nested functions
round(avg({Q1}, {Q2}, {Q3}, {Q4}), 1)

# Text operations
concat(upper(left({FirstName}, 1)), ". ", {LastName})

Future: Cell-Level Overrides

The architecture is designed so that cell-level formula overrides can be added later, at an estimated 20-30% of additional work:

  • Storage: a sparse dict cell_formulas that maps (col_id, row_index) to a formula string (same pattern as cell_formats)
  • DAG: new node type table.column[row] alongside existing table.column nodes
  • Evaluation: "does this cell have an override? If yes, use it. Otherwise, use the column formula."
  • Node ID scheme: designed to be extensible from the start (table.column for columns, table.column[row] for cells)
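Since this feature is only planned, the following is a speculative sketch of the lookup rule and the node ID scheme described above. The function names are hypothetical.

```python
from typing import Optional

def formula_for_cell(col_id: str, row_index: int, column_formulas: dict, cell_formulas: dict) -> Optional[str]:
    """Use the cell override if one exists, otherwise the column formula."""
    return cell_formulas.get((col_id, row_index), column_formulas.get(col_id))

def node_id(table: str, column: str, row: Optional[int] = None) -> str:
    """table.column for column nodes, table.column[row] for cell nodes."""
    return f"{table}.{column}" if row is None else f"{table}.{column}[{row}]"

column_formulas = {"total": "{Price} * {Quantity}"}
cell_formulas = {("total", 4): "{Price} * {Quantity} * 0.5"}  # sparse: one override

regular = formula_for_cell("total", 0, column_formulas, cell_formulas)
overridden = formula_for_cell("total", 4, column_formulas, cell_formulas)
column_node = node_id("orders", "total")
cell_node = node_id("orders", "total", 4)
```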