Kodjo Sossouvi e8443f07f9 Introducing columns formulas

2026-02-13 21:38:00 +01:00

DataGrid Formulas

Overview

The DataGrid formula system adds computed columns to the DataGrid. A formula column applies a single expression to every row, producing derived values from existing data — within the same table or across tables.

The system is designed for:

Column-level formulas: one formula per column, applied to all rows
Cross-table references: direct syntax to reference columns from other tables
Reactive recalculation: dirty flag propagation with page-aware computation
Cell-level overrides (planned): individual cells will be able to override the column formula

Formula Language

Basic Syntax

A formula is an expression that references columns with {ColumnName} and produces a value for each row:

{Price} * {Quantity}

References use curly braces {} to distinguish column names from keywords and functions. Column names are matched by ID or title.
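As an illustration of matching a reference by ID or title, here is a minimal sketch. The `Column` class and `resolve_column` helper are hypothetical, and the rule that an ID match takes priority over a title match is an assumption; the source only states that both are accepted.

```python
from dataclasses import dataclass


@dataclass
class Column:
    """Hypothetical stand-in for a grid column (not the project's class)."""
    col_id: str
    title: str


def resolve_column(name, columns):
    """Return the column whose ID or title equals `name`, or None."""
    # Assumption: an ID match wins over a title match.
    for col in columns:
        if col.col_id == name:
            return col
    for col in columns:
        if col.title == name:
            return col
    return None


columns = [Column("price", "Unit Price"), Column("qty", "Quantity")]
by_id = resolve_column("price", columns)
by_title = resolve_column("Quantity", columns)
```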

Operators

Arithmetic

Operator	Description	Example
`+`	Addition	`{Price} + {Tax}`
`-`	Subtraction	`{Total} - {Discount}`
`*`	Multiplication	`{Price} * {Quantity}`
`/`	Division	`{Total} / {Count}`
`%`	Modulo	`{Value} % 2`
`^`	Power	`{Base} ^ 2`

Comparison

Operator	Description	Example
`==`	Equal	`{Status} == "active"`
`!=`	Not equal	`{Status} != "deleted"`
`>`	Greater than	`{Price} > 100`
`<`	Less than	`{Stock} < 10`
`>=`	Greater or equal	`{Score} >= 80`
`<=`	Less or equal	`{Age} <= 18`
`contains`	String contains	`{Name} contains "Corp"`
`startswith`	String starts with	`{Code} startswith "ERR"`
`endswith`	String ends with	`{File} endswith ".csv"`
`in`	Value in list	`{Status} in ["active", "new"]`
`between`	Value in range	`{Age} between 18 and 65`
`isempty`	Value is empty	`{Notes} isempty`
`isnotempty`	Value is not empty	`{Email} isnotempty`
`isnan`	Value is NaN	`{Score} isnan`

Logical

Operator	Description	Example
`and`	Logical AND	`{Age} > 18 and {Status} == "active"`
`or`	Logical OR	`{Type} == "A" or {Type} == "B"`
`not`	Negation	`not {Status} == "deleted"`

Parentheses control precedence: ({Type} == "A" or {Type} == "B") and {Active} == True

Conditions (suffix-if)

Conditions use a suffix-if syntax: the result expression comes first, then the condition. This keeps the focus on the output, not the branching logic.

Simple condition (no else — result is None when false)

{Price} * 0.8 if {Country} == "FR"

With else

{Price} * 0.8 if {Country} == "FR" else {Price}

Chained conditions

{Price} * 0.8 if {Country} == "FR" else {Price} * 0.9 if {Country} == "DE" else {Price}

With logical operators

{Price} * 0.8 if {Country} == "FR" and {Quantity} > 10 else {Price}

With grouping

{Price} * 0.8 if ({Country} == "FR" or {Country} == "DE") and {Quantity} > 10
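The suffix-if form has the same shape as a Python conditional expression, so its intended semantics can be sketched in Python. This assumes chained conditions group the way Python does (each `else` branch holds the next condition); the function names are illustrative only.

```python
def discounted_price(price, country):
    # Mirrors the chained formula:
    # {Price} * 0.8 if {Country} == "FR" else {Price} * 0.9 if {Country} == "DE" else {Price}
    return price * 0.8 if country == "FR" else price * 0.9 if country == "DE" else price


def discount_without_else(price, country):
    # Mirrors: {Price} * 0.8 if {Country} == "FR"
    # With no else branch, the result is None when the condition is false.
    return price * 0.8 if country == "FR" else None
```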

Functions

Math

Function	Description	Example
`round(expr, n)`	Round to n decimals	`round({Price} * 1.2, 2)`
`abs(expr)`	Absolute value	`abs({Balance})`
`min(expr, expr)`	Minimum of two values	`min({Price}, {MaxPrice})`
`max(expr, expr)`	Maximum of two values	`max({Score}, 0)`
`sum(expr, ...)`	Sum of values	`sum({Q1}, {Q2}, {Q3}, {Q4})`
`avg(expr, ...)`	Average of values	`avg({Q1}, {Q2}, {Q3}, {Q4})`

Text

Function	Description	Example
`upper(expr)`	Uppercase	`upper({Name})`
`lower(expr)`	Lowercase	`lower({Email})`
`len(expr)`	String length	`len({Description})`
`concat(expr, ...)`	Concatenate strings	`concat({First}, " ", {Last})`
`trim(expr)`	Remove whitespace	`trim({Input})`
`left(expr, n)`	First n characters	`left({Code}, 3)`
`right(expr, n)`	Last n characters	`right({Phone}, 4)`

Date

Function	Description	Example
`year(expr)`	Extract year	`year({CreatedAt})`
`month(expr)`	Extract month	`month({CreatedAt})`
`day(expr)`	Extract day	`day({CreatedAt})`
`today()`	Current date	`datediff({DueDate}, today())`
`datediff(expr, expr)`	Difference in days	`datediff({End}, {Start})`

Aggregation (for cross-table contexts)

Function	Description	Example
`sum(expr)`	Sum values	`sum({Orders.Amount WHERE Orders.ClientId = Id})`
`count(expr)`	Count values	`count({Orders.Id WHERE Orders.ClientId = Id})`
`avg(expr)`	Average	`avg({Reviews.Score WHERE Reviews.ProductId = Id})`
`min(expr)`	Minimum	`min({Bids.Price WHERE Bids.ItemId = Id})`
`max(expr)`	Maximum	`max({Bids.Price WHERE Bids.ItemId = Id})`
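One possible way to back the scalar functions in the tables above is a plain name-to-callable registry. The sketch below is hypothetical: the `FUNCTIONS` dict and `call` helper are not from the source, only a subset of functions is shown, and the argument order of `datediff` (first minus second, in days) is an assumption.

```python
from datetime import date

# Hypothetical registry: formula function name -> Python callable.
FUNCTIONS = {
    # Note: Python's round() rounds halves to even; the formula language may differ.
    "round": round,
    "abs": abs,
    "upper": lambda s: s.upper(),
    "lower": lambda s: s.lower(),
    "len": len,
    "trim": lambda s: s.strip(),
    "left": lambda s, n: s[:n],
    "right": lambda s, n: s[-n:] if n > 0 else "",
    "concat": lambda *parts: "".join(str(p) for p in parts),
    "avg": lambda *values: sum(values) / len(values),
    "year": lambda d: d.year,
    # Assumption: datediff(a, b) is a - b, expressed in whole days.
    "datediff": lambda a, b: (a - b).days,
}


def call(name, *args):
    """Look up a formula function by name and apply it."""
    return FUNCTIONS[name](*args)
```

For example, `call("concat", call("upper", call("left", "ada", 1)), ". ", "Lovelace")` follows the same nesting as the text example in the syntax summary.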

Cross-Table References

Direct Reference

Reference a column from another table using {TableName.ColumnName}:

{Products.Price} * {Quantity}

Join Resolution (implicit)

When referencing another table without a WHERE clause, the join is resolved automatically:

By id column: if both tables have a column named id, rows are matched on equal id values
By row index: if either table lacks an id column, rows are matched by their internal row index (stable across sort/filter)
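The two implicit join rules can be sketched as follows. Tables are modelled here as plain dicts mapping a column name to a list of values, which is an assumption made for illustration; `implicit_join_value` is a hypothetical helper, and returning None when nothing matches follows the "No join match" row of the error table.

```python
def implicit_join_value(current, other, column, row_index):
    """Fetch other[column] for the row of `current` at `row_index`."""
    if "id" in current and "id" in other:
        # Rule 1: both tables have an id column, so match on equal id values.
        wanted = current["id"][row_index]
        for i, other_id in enumerate(other["id"]):
            if other_id == wanted:
                return other[column][i]
        return None  # no join match
    # Rule 2: fall back to matching by row index.
    values = other[column]
    return values[row_index] if row_index < len(values) else None


orders = {"id": [2, 1], "Quantity": [3, 5]}
products = {"id": [1, 2], "Price": [10.0, 4.0]}
price_for_first_order = implicit_join_value(orders, products, "Price", 0)
```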

Explicit Join (WHERE clause)

For explicit control over which row of the other table to use:

{Products.Price WHERE Products.Code = ProductCode} * {Quantity}

Inside the WHERE clause:

Products.Code refers to a column in the referenced table
ProductCode (no Table. prefix) refers to a column in the current table

Aggregation with Cross-Table

When a cross-table reference matches multiple rows, use an aggregation function:

sum({OrderLines.Amount WHERE OrderLines.OrderId = Id})

Without aggregation, a multi-row match returns the first matching value.
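To make the difference between an aggregated and a non-aggregated multi-row match concrete, here is a small sketch. The dict-of-lists table layout and the helper names are assumptions for illustration, not the engine's data model.

```python
def matching_values(other, value_column, key_column, key):
    """Values of `value_column` for rows of `other` where `key_column` equals `key`."""
    return [v for v, k in zip(other[value_column], other[key_column]) if k == key]


def first_match(values):
    """Without aggregation, a multi-row match yields the first value (None if empty)."""
    return values[0] if values else None


order_lines = {"OrderId": [1, 1, 2], "Amount": [10.0, 5.0, 7.5]}

# Equivalent of {OrderLines.Amount WHERE OrderLines.OrderId = Id} for a row with Id == 1
matches = matching_values(order_lines, "Amount", "OrderId", 1)
total = sum(matches)          # with sum(...): all matching rows contribute
first = first_match(matches)  # without aggregation: first matching value only
```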

Calculation Engine

Dependency Graph (DAG)

The formula system maintains a Directed Acyclic Graph of dependencies between columns:

Nodes: each formula column is a node, identified by table_name.column_id
Edges: if column A's formula references column B, an edge B → A exists ("A depends on B")
Both directions are tracked:
- Precedents: columns that a formula reads from
- Dependents: columns that need recalculation when this column changes

Cross-table references create edges that span DataGrid instances, managed at the DataGridsManager level.
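A minimal sketch of a graph that tracks both directions is shown below. The `DependencyGraph` class and its method name are hypothetical; only the node ID format (`table_name.column_id`) and the precedents/dependents terminology come from the description above.

```python
from collections import defaultdict


class DependencyGraph:
    """Hypothetical two-way dependency index keyed by 'table.column' node IDs."""

    def __init__(self):
        self.precedents = defaultdict(set)  # node -> columns its formula reads
        self.dependents = defaultdict(set)  # node -> columns that read it

    def set_formula_refs(self, node, refs):
        """Register (or replace) the set of columns that `node` reads from."""
        # Drop edges left over from a previous formula on this node.
        for old in self.precedents[node]:
            self.dependents[old].discard(node)
        self.precedents[node] = set(refs)
        for ref in refs:
            self.dependents[ref].add(node)


graph = DependencyGraph()
graph.set_formula_refs("orders.total", {"orders.price", "orders.qty"})
```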

Dirty Flag Propagation

When a source column's data changes:

The source column is marked dirty
All direct dependents are marked dirty
Propagation continues recursively through the DAG
Each dirty column maintains a dirty row set: the specific row indices that need recalculation

Propagation happens immediately and is cheap: it only marks flags and performs no computation.
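The propagation steps can be sketched as a breadth-first walk over the dependents map. This is an illustration under assumptions: `propagate_dirty` is a hypothetical helper, and it assumes dependents share row indices with the source column, which would not hold as-is for cross-table edges.

```python
from collections import deque


def propagate_dirty(dependents, source, rows, dirty_rows):
    """Mark `rows` dirty on `source` and on every column downstream of it.

    dependents: node -> iterable of nodes that read it
    dirty_rows: node -> set of dirty row indices (updated in place)
    """
    queue = deque([source])
    seen = {source}
    while queue:
        node = queue.popleft()
        # Flag marking only: no formula is evaluated here.
        dirty_rows.setdefault(node, set()).update(rows)
        for dep in dependents.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return dirty_rows


dependents = {"t.price": {"t.total"}, "t.total": {"t.total_tax"}}
dirty = propagate_dirty(dependents, "t.price", {3, 7}, {})
```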

Recalculation Strategy (Hybrid)

Actual computation is deferred to rendering time:

On value change → dirty flags propagate instantly through the DAG
On page render (mk_body_content_page) → only dirty rows within the visible page (up to 1000 rows) are recalculated
Off-screen pages remain dirty until scrolled into view
Calculation follows topological order of the DAG to ensure precedents are computed before dependents
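The hybrid strategy can be sketched as follows: at render time, only dirty rows that fall inside the requested page are computed, column by column in topological order. The function names, the zero-based `page_index`, and the `compute(column, row)` callback are assumptions for illustration; the page size of 1000 comes from the text above.

```python
PAGE_SIZE = 1000


def rows_to_recalculate(dirty, page_index, page_size=PAGE_SIZE):
    """Dirty row indices that fall inside the given (zero-based) page."""
    start = page_index * page_size
    stop = start + page_size
    return sorted(r for r in dirty if start <= r < stop)


def recalculate_page(columns_in_topo_order, dirty_rows, compute, page_index):
    """Recompute visible dirty rows; off-page rows stay dirty."""
    for col in columns_in_topo_order:  # precedents come before dependents
        dirty = dirty_rows.get(col, set())
        for row in rows_to_recalculate(dirty, page_index):
            compute(col, row)
            dirty.discard(row)


calls = []
dirty_rows = {"t.total": {5, 1500}}
recalculate_page(["t.total"], dirty_rows, lambda c, r: calls.append((c, r)), 0)
```

After this call, row 5 has been computed and row 1500 remains dirty until page 1 is rendered.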

Cycle Detection

Before a formula is added, the engine runs a topological sort (Kahn's algorithm) over the dependency graph to check that it stays acyclic. If a cycle is detected:

The formula is rejected
The editor displays an error identifying the circular dependency chain
The previous formula (if any) remains unchanged
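A sketch of cycle detection with Kahn's algorithm is shown below. The function names and the `precedents` dict layout are assumptions; the check is done on a trial copy of the graph so that a rejected formula leaves the existing graph untouched.

```python
from collections import deque


def topological_order(precedents):
    """Kahn's algorithm. precedents: node -> set of nodes it reads from.

    Returns the nodes with precedents before dependents, or raises
    ValueError if some nodes cannot be ordered (a cycle exists).
    """
    nodes = set(precedents)
    for refs in precedents.values():
        nodes |= set(refs)
    in_degree = {n: len(precedents.get(n, ())) for n in nodes}
    dependents = {n: [] for n in nodes}
    for node, refs in precedents.items():
        for ref in refs:
            dependents[ref].append(node)

    queue = deque(sorted(n for n in nodes if in_degree[n] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for dep in dependents[node]:
            in_degree[dep] -= 1
            if in_degree[dep] == 0:
                queue.append(dep)

    if len(order) != len(nodes):
        # Nodes left over are on a cycle or downstream of one.
        stuck = sorted(nodes - set(order))
        raise ValueError(f"circular dependency among: {stuck}")
    return order


def would_create_cycle(precedents, node, refs):
    """Check a candidate formula on a trial copy of the graph."""
    trial = dict(precedents)
    trial[node] = set(refs)
    try:
        topological_order(trial)
    except ValueError:
        return True
    return False


graph = {"t.total": {"t.price", "t.qty"}, "t.total_tax": {"t.total"}}
order = topological_order(graph)
```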

Caching

Each formula column caches its computed values:

Results are stored in ns_fast_access[col_id] alongside raw data columns
The dirty row set tracks which cached values are stale
Non-dirty rows return their cached value without re-evaluation
Cache is invalidated per-row when source data changes
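The caching behaviour can be sketched with a per-column cache that re-evaluates a row only when it is dirty. `FormulaColumnCache` is a hypothetical class, and it uses a plain list where the source describes numpy arrays in `ns_fast_access`, to keep the example dependency-free.

```python
class FormulaColumnCache:
    """Hypothetical per-column cache with a dirty row set."""

    def __init__(self, n_rows, compute):
        self.values = [None] * n_rows
        self.dirty = set(range(n_rows))  # nothing has been computed yet
        self.compute = compute           # callable: row index -> value

    def invalidate(self, rows):
        """Mark specific rows stale after their source data changed."""
        self.dirty.update(rows)

    def get(self, row):
        """Return the cached value, re-evaluating only if the row is dirty."""
        if row in self.dirty:
            self.values[row] = self.compute(row)
            self.dirty.discard(row)
        return self.values[row]
```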

Evaluation

Row-by-Row Execution

Formulas are evaluated row-by-row within the page being rendered. For each row:

Resolve column references {ColumnName} to the cell value at the current row index
Resolve cross-table references {Table.Column} via the join mechanism
Evaluate the expression with resolved values
Store the result in the cache (ns_fast_access)
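The per-row steps can be illustrated with a small evaluator for same-table arithmetic formulas. This sketch swaps in Python's `ast` module in place of the project's parser, supports only `+ - * / %` on numbers, and models a row as a dict; all names here are hypothetical.

```python
import ast
import operator
import re

_REF = re.compile(r"\{([^{}]+)\}")
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}


def _eval(node, names):
    """Evaluate a restricted arithmetic AST; anything else is rejected."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name):
        return names[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval(node.left, names), _eval(node.right, names))
    raise ValueError(f"unsupported expression: {ast.dump(node)}")


def evaluate_row(formula, row):
    """Resolve {Column} references against `row`, then evaluate the expression."""
    names = {}

    def bind(match):
        # Replace each reference with a placeholder name bound to the cell value.
        key = f"ref_{len(names)}"
        names[key] = row[match.group(1).strip()]
        return key

    expression = _REF.sub(bind, formula)
    return _eval(ast.parse(expression, mode="eval").body, names)


def evaluate_rows(formula, table_rows, row_indices, cache):
    """Evaluate the given rows and store each result in `cache`."""
    for i in row_indices:
        cache[i] = evaluate_row(formula, table_rows[i])
```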

Parser

The formula language uses a custom grammar parsed with Lark (consistent with the formatting DSL). The parser:

Tokenizes the formula string
Builds an AST (Abstract Syntax Tree)
Transforms the AST into an evaluable representation
Extracts column references for dependency graph registration
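The last step, collecting references for dependency registration, can be approximated without a full parser. The sketch below uses a regular expression as a stand-in for walking the Lark AST; `extract_references` is a hypothetical helper, it assumes column names contain no dots, and it would be confused by braces inside string literals.

```python
import re

_REF = re.compile(r"\{([^{}]+)\}")


def extract_references(formula, table):
    """Return the set of 'table.column' names a formula reads.

    Unqualified names are attributed to `table` (the current table).
    Columns used in a WHERE clause are included as dependencies too.
    """
    refs = set()
    for body in _REF.findall(formula):
        target, _, condition = body.partition(" WHERE ")
        names = [target]
        if condition:
            # WHERE uses a single '=' between the two column names.
            names += condition.split("=")
        for name in names:
            name = name.strip()
            refs.add(name if "." in name else f"{table}.{name}")
    return refs


refs = extract_references(
    "{Products.Price WHERE Products.Code = ProductCode} * {Quantity}", "Orders"
)
```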

Error Handling

Error Type	Behavior
Syntax error	Editor highlights the error, formula not saved
Unknown column	Editor highlights, autocompletion suggests fixes
Type mismatch	Cell displays error indicator, other cells unaffected
Division by zero	Cell displays `#DIV/0!` or None
Circular dependency	Formula rejected, editor shows cycle chain
Cross-table not found	Editor highlights unknown table name
No join match	Cell displays None
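The per-cell rows of this table (type mismatch, division by zero) imply that a failure in one row should not affect the others. A minimal sketch of that isolation follows; the `#ERROR` marker and the `evaluate_column` helper are hypothetical, since the source only specifies `#DIV/0!` and an unspecified "error indicator".

```python
DIV_ZERO = "#DIV/0!"
ERROR = "#ERROR"  # hypothetical marker; the source only says "error indicator"


def evaluate_column(compute, n_rows):
    """Evaluate every row, turning per-row failures into cell-level markers."""
    results = []
    for row in range(n_rows):
        try:
            results.append(compute(row))
        except ZeroDivisionError:
            results.append(DIV_ZERO)
        except TypeError:
            results.append(ERROR)  # type mismatch affects this cell only
    return results


totals = [10, 5, "n/a"]
counts = [2, 0, 1]
cells = evaluate_column(lambda r: totals[r] / counts[r], 3)
```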

User Interface

Creating a Formula Column

Formula columns are created and edited through the DataGridColumnsManager:

User opens the Columns Manager panel
Adds a new column or edits an existing one
Selects column type "Formula"
A DslEditor (CodeMirror 5) opens for formula input
The editor provides:
- Syntax highlighting: keywords, column references, functions, operators
- Autocompletion: column names (current table and other tables), function names, table names
- Validation: real-time syntax checking and dependency cycle detection
- Error markers: inline error indicators with descriptions

Formula Column Properties

A formula column extends DataGridColumnState with:

Property	Type	Description
`formula`	`str` or None	The formula expression (None for data columns)
`col_type`	`ColumnType`	Set to `ColumnType.Formula`
Other properties (`title`, `visible`, `width`, `format`) remain unchanged

Formula columns are read-only in the grid body — cell values are computed, not editable. Formatting rules from the formatting DSL apply to formula columns like any other column.
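For orientation, the two added properties could look like this on a column state object. This is a hypothetical sketch, not the project's actual `DataGridColumnState`: the `col_id` field, the defaults, the `ColumnType.Text` member, and the `is_formula` property are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ColumnType(Enum):
    Text = "text"        # hypothetical member, for contrast
    Formula = "formula"


@dataclass
class DataGridColumnState:
    col_id: str                           # assumed identifier field
    title: str
    col_type: ColumnType = ColumnType.Text
    formula: Optional[str] = None         # None for data columns
    visible: bool = True
    width: int = 150                      # assumed default

    @property
    def is_formula(self):
        """True when the column is computed rather than stored."""
        return self.col_type is ColumnType.Formula and self.formula is not None


total = DataGridColumnState(
    col_id="total",
    title="Total",
    col_type=ColumnType.Formula,
    formula="{Price} * {Quantity}",
)
```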

Integration Points

Component	Role
`DataGridColumnState`	Stores `formula` field and `ColumnType.Formula` type
`DatagridStore`	`ns_fast_access` caches formula results as numpy arrays
`DataGridColumnsManager`	UI for creating/editing formula columns
`DataGridsManager`	Hosts the global dependency DAG across all tables
`DslEditor`	CodeMirror 5 editor with highlighting and autocompletion
`FormattingEngine`	Applies formatting rules AFTER formula evaluation
`mk_body_content_page()`	Triggers formula computation for visible rows
`mk_body_cell_content()`	Reads computed values from `ns_fast_access`

Syntax Summary

# Basic arithmetic
{Price} * {Quantity}

# Function call
round({Price} * 1.2, 2)

# Simple condition (None if false)
{Price} * 0.8 if {Country} == "FR"

# Condition with else
{Price} * 0.8 if {Country} == "FR" else {Price}

# Chained conditions
{Price} * 0.8 if {Country} == "FR" else {Price} * 0.9 if {Country} == "DE" else {Price}

# Logical operators
{Price} * 0.8 if {Country} == "FR" and {Quantity} > 10

# Grouping
{Price} * 0.8 if ({Country} == "FR" or {Country} == "DE") and {Quantity} > 10

# Cross-table (implicit join on id)
{Products.Price} * {Quantity}

# Cross-table (explicit join)
{Products.Price WHERE Products.Code = ProductCode} * {Quantity}

# Cross-table aggregation
sum({OrderLines.Amount WHERE OrderLines.OrderId = Id})

# Nested functions
round(avg({Q1}, {Q2}, {Q3}, {Q4}), 1)

# Text operations
concat(upper(left({FirstName}, 1)), ". ", {LastName})

Future: Cell-Level Overrides

The architecture is designed so that cell-level formula overrides can be added later, at an estimated 20-30% of additional work:

Storage: sparse dict cell_formulas: dict[(col_id, row_index), str] (same pattern as cell_formats)
DAG: new node type table.column[row] alongside existing table.column nodes
Evaluation: "does this cell have an override? If yes, use it. Otherwise, use the column formula."
Node ID scheme: designed to be extensible from the start (table.column for columns, table.column[row] for cells)
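The planned lookup and node ID scheme can be sketched in a few lines. Both helpers are hypothetical and only illustrate the rules listed above: a sparse dict keyed by `(col_id, row_index)`, and `table.column[row]` IDs for cell nodes.

```python
def formula_for_cell(cell_formulas, column_formulas, col_id, row_index):
    """Use the cell override if one exists, otherwise the column formula."""
    override = cell_formulas.get((col_id, row_index))
    return override if override is not None else column_formulas.get(col_id)


def node_id(table, col_id, row_index=None):
    """'table.column' for a column node, 'table.column[row]' for a cell node."""
    base = f"{table}.{col_id}"
    return base if row_index is None else f"{base}[{row_index}]"


column_formulas = {"total": "{Price} * {Quantity}"}
cell_formulas = {("total", 3): "{Price} * {Quantity} * 0.5"}  # sparse: one override
```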
