DataGrid Formulas
Overview
The DataGrid formula system adds computed columns to the DataGrid. A formula column applies a single expression to every row, producing derived values from existing data — within the same table or across tables.
The system is designed for:
- Column-level formulas: one formula per column, applied to all rows
- Cross-table references: direct syntax to reference columns from other tables
- Reactive recalculation: dirty flag propagation with page-aware computation
- Cell-level overrides (planned): individual cells can override the column formula
Formula Language
Basic Syntax
A formula is an expression that references columns with {ColumnName} and produces a value for each row:
{Price} * {Quantity}
References use curly braces `{}` to distinguish column names from keywords and functions. Column names are matched by ID or title.
Operators
Arithmetic
| Operator | Description | Example |
|---|---|---|
| `+` | Addition | `{Price} + {Tax}` |
| `-` | Subtraction | `{Total} - {Discount}` |
| `*` | Multiplication | `{Price} * {Quantity}` |
| `/` | Division | `{Total} / {Count}` |
| `%` | Modulo | `{Value} % 2` |
| `^` | Power | `{Base} ^ 2` |
Comparison
| Operator | Description | Example |
|---|---|---|
| `==` | Equal | `{Status} == "active"` |
| `!=` | Not equal | `{Status} != "deleted"` |
| `>` | Greater than | `{Price} > 100` |
| `<` | Less than | `{Stock} < 10` |
| `>=` | Greater or equal | `{Score} >= 80` |
| `<=` | Less or equal | `{Age} <= 18` |
| `contains` | String contains | `{Name} contains "Corp"` |
| `startswith` | String starts with | `{Code} startswith "ERR"` |
| `endswith` | String ends with | `{File} endswith ".csv"` |
| `in` | Value in list | `{Status} in ["active", "new"]` |
| `between` | Value in range | `{Age} between 18 and 65` |
| `isempty` | Value is empty | `{Notes} isempty` |
| `isnotempty` | Value is not empty | `{Email} isnotempty` |
| `isnan` | Value is NaN | `{Score} isnan` |
Logical
| Operator | Description | Example |
|---|---|---|
| `and` | Logical AND | `{Age} > 18 and {Status} == "active"` |
| `or` | Logical OR | `{Type} == "A" or {Type} == "B"` |
| `not` | Negation | `not {Status} == "deleted"` |
Parentheses control precedence: ({Type} == "A" or {Type} == "B") and {Active} == True
Conditions (suffix-if)
Conditions use a suffix-if syntax: the result expression comes first, then the condition. This keeps the focus on the output, not the branching logic.
Simple condition (no else — result is None when false)
{Price} * 0.8 if {Country} == "FR"
With else
{Price} * 0.8 if {Country} == "FR" else {Price}
Chained conditions
{Price} * 0.8 if {Country} == "FR" else {Price} * 0.9 if {Country} == "DE" else {Price}
With logical operators
{Price} * 0.8 if {Country} == "FR" and {Quantity} > 10 else {Price}
With grouping
{Price} * 0.8 if ({Country} == "FR" or {Country} == "DE") and {Quantity} > 10
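As an illustration only, the suffix-if forms above can be read the same way Python reads its own conditional expressions. The sketch below mirrors two of the formulas as plain Python functions. The function names and the `row` dict are hypothetical, and the right-to-left grouping of the chained form is an assumption, since the grammar itself is not shown here.

```python
from typing import Optional


def french_only(row: dict) -> Optional[float]:
    # Mirrors: {Price} * 0.8 if {Country} == "FR"
    # No else branch, so the result is None when the condition is false.
    return row["Price"] * 0.8 if row["Country"] == "FR" else None


def tiered_discount(row: dict) -> float:
    # Mirrors: {Price} * 0.8 if {Country} == "FR"
    #          else {Price} * 0.9 if {Country} == "DE" else {Price}
    # Assumption: the chain groups right to left, like Python's own
    # conditional expression.
    price, country = row["Price"], row["Country"]
    return price * 0.8 if country == "FR" else price * 0.9 if country == "DE" else price
```

Read this way, a chained formula is a list of cases tested from left to right, with the final `else` as the default.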
Functions
Math
| Function | Description | Example |
|---|---|---|
| `round(expr, n)` | Round to n decimals | `round({Price} * 1.2, 2)` |
| `abs(expr)` | Absolute value | `abs({Balance})` |
| `min(expr, expr)` | Minimum of two values | `min({Price}, {MaxPrice})` |
| `max(expr, expr)` | Maximum of two values | `max({Score}, 0)` |
| `sum(expr, ...)` | Sum of values | `sum({Q1}, {Q2}, {Q3}, {Q4})` |
| `avg(expr, ...)` | Average of values | `avg({Q1}, {Q2}, {Q3}, {Q4})` |
Text
| Function | Description | Example |
|---|---|---|
| `upper(expr)` | Uppercase | `upper({Name})` |
| `lower(expr)` | Lowercase | `lower({Email})` |
| `len(expr)` | String length | `len({Description})` |
| `concat(expr, ...)` | Concatenate strings | `concat({First}, " ", {Last})` |
| `trim(expr)` | Remove whitespace | `trim({Input})` |
| `left(expr, n)` | First n characters | `left({Code}, 3)` |
| `right(expr, n)` | Last n characters | `right({Phone}, 4)` |
Date
| Function | Description | Example |
|---|---|---|
| `year(expr)` | Extract year | `year({CreatedAt})` |
| `month(expr)` | Extract month | `month({CreatedAt})` |
| `day(expr)` | Extract day | `day({CreatedAt})` |
| `today()` | Current date | `datediff({DueDate}, today())` |
| `datediff(expr, expr)` | Difference in days | `datediff({End}, {Start})` |
Aggregation (for cross-table contexts)
| Function | Description | Example |
|---|---|---|
| `sum(expr)` | Sum values | `sum({Orders.Amount WHERE Orders.ClientId = Id})` |
| `count(expr)` | Count values | `count({Orders.Id WHERE Orders.ClientId = Id})` |
| `avg(expr)` | Average | `avg({Reviews.Score WHERE Reviews.ProductId = Id})` |
| `min(expr)` | Minimum | `min({Bids.Price WHERE Bids.ItemId = Id})` |
| `max(expr)` | Maximum | `max({Bids.Price WHERE Bids.ItemId = Id})` |
Cross-Table References
Direct Reference
Reference a column from another table using {TableName.ColumnName}:
{Products.Price} * {Quantity}
Join Resolution (implicit)
When referencing another table without a WHERE clause, the join is resolved automatically:
- By `id` column: if both tables have a column named `id`, rows are matched on equal `id` values
- By row index: if no `id` column exists in both tables, rows are matched by their internal row index (stable across sort/filter)
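The two implicit rules can be sketched as a small lookup function. This is a minimal illustration, assuming each table is a plain dict of column lists and that list position stands in for the internal row index. The `Table` alias and `resolve_implicit` are hypothetical names, not part of the DataGrid API.

```python
from typing import Any, Optional

Table = dict[str, list[Any]]  # hypothetical shape: column name -> column values


def resolve_implicit(current: Table, other: Table, column: str, row_index: int) -> Optional[Any]:
    """Return other[column] for the row of `other` that matches `row_index` of `current`."""
    if "id" in current and "id" in other:
        # Rule 1: both tables have an id column, so match on equal id values.
        wanted = current["id"][row_index]
        for i, other_id in enumerate(other["id"]):
            if other_id == wanted:
                return other[column][i]
        return None  # no join match
    # Rule 2: fall back to the row index (list position in this sketch).
    values = other[column]
    return values[row_index] if row_index < len(values) else None
```

A linear scan keeps the sketch short. An index from `id` to row position would avoid rescanning the other table for every row.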
Explicit Join (WHERE clause)
For explicit control over which row of the other table to use:
{Products.Price WHERE Products.Code = ProductCode} * {Quantity}
Inside the WHERE clause:
- `Products.Code` refers to a column in the referenced table
- `ProductCode` (no `Table.` prefix) refers to a column in the current table
Aggregation with Cross-Table
When a cross-table reference matches multiple rows, use an aggregation function:
sum({OrderLines.Amount WHERE OrderLines.OrderId = Id})
Without aggregation, a multi-row match returns the first matching value.
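To make the difference concrete, here is a small sketch of a multi-row match, assuming the referenced table is a dict of column lists. `matching_values` is a hypothetical helper and the sample data is invented for the example.

```python
from typing import Any


def matching_values(other: dict[str, list[Any]], value_col: str, key_col: str, key: Any) -> list[Any]:
    """Collect value_col from every row of `other` whose key_col equals `key`."""
    return [v for v, k in zip(other[value_col], other[key_col]) if k == key]


# Sample data for illustration: two lines belong to order 1, one to order 2.
order_lines = {"OrderId": [1, 1, 2], "Amount": [5.0, 7.5, 3.0]}

matches = matching_values(order_lines, "Amount", "OrderId", 1)
total = sum(matches)                     # sum({OrderLines.Amount WHERE OrderLines.OrderId = Id})
first = matches[0] if matches else None  # {OrderLines.Amount WHERE OrderLines.OrderId = Id}
```

Here `total` covers both lines of order 1, while the non-aggregated form keeps only the first match.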
Calculation Engine
Dependency Graph (DAG)
The formula system maintains a Directed Acyclic Graph of dependencies between columns:
- Nodes: each formula column is a node, identified by `table_name.column_id`
- Edges: if column A's formula references column B, an edge B → A exists ("A depends on B")
- Both directions are tracked:
  - Precedents: columns that a formula reads from
  - Dependents: columns that need recalculation when this column changes
Cross-table references create edges that span DataGrid instances, managed at the DataGridsManager level.
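One way to track both directions is a pair of adjacency maps kept in sync. The class below is a sketch under that assumption. `DependencyGraph` and `set_formula_refs` are hypothetical names and do not describe the actual DataGridsManager code.

```python
from collections import defaultdict


class DependencyGraph:
    """Tracks both directions of the formula dependency graph."""

    def __init__(self) -> None:
        self.precedents: dict[str, set[str]] = defaultdict(set)  # node -> columns it reads
        self.dependents: dict[str, set[str]] = defaultdict(set)  # node -> columns that read it

    def set_formula_refs(self, node: str, refs: set[str]) -> None:
        """Register (or replace) the columns that the formula of `node` reads."""
        # Drop the edges of the previous formula so stale dependents disappear.
        for old in self.precedents[node]:
            self.dependents[old].discard(node)
        self.precedents[node] = set(refs)
        for ref in refs:
            self.dependents[ref].add(node)
```

Because node IDs carry the table name, a cross-table edge is just an edge between two IDs with different prefixes.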
Dirty Flag Propagation
When a source column's data changes:
- The source column is marked dirty
- All direct dependents are marked dirty
- Propagation continues recursively through the DAG
- Each dirty column maintains a dirty row set: the specific row indices that need recalculation
This propagation is immediate and cheap: it only marks flags and performs no computation.
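The propagation steps can be sketched as a breadth-first walk over the dependents map. This is an illustration with hypothetical names. It also assumes that a changed row index dirties the same row index in every dependent column, which only holds for same-table formulas. A cross-table dependent would need the row mapped through the join first.

```python
from collections import deque


def propagate_dirty(
    dependents: dict[str, set[str]],
    source: str,
    rows: set[int],
    dirty_rows: dict[str, set[int]],
) -> dict[str, set[int]]:
    """Mark `rows` dirty on `source` and on every column reachable through `dependents`."""
    queue = deque([source])
    seen = {source}
    while queue:
        node = queue.popleft()
        # Only flags are updated here; no formula is evaluated.
        dirty_rows.setdefault(node, set()).update(rows)
        for dep in dependents.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return dirty_rows
```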
Recalculation Strategy (Hybrid)
Actual computation is deferred to rendering time:
- On value change → dirty flags propagate instantly through the DAG
- On page render (`mk_body_content_page`) → only dirty rows within the visible page (up to 1000 rows) are recalculated
- Off-screen pages remain dirty until scrolled into view
- Calculation follows topological order of the DAG to ensure precedents are computed before dependents
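A sketch of the page-aware step for a single column, assuming the dirty row set is a plain `set[int]` and the cache is a dict keyed by row index. `rows_to_recalculate` and `render_page` are hypothetical helpers, not the body of `mk_body_content_page`.

```python
from typing import Any, Callable


def rows_to_recalculate(dirty_rows: set[int], page_start: int, page_size: int = 1000) -> list[int]:
    """Dirty rows that fall inside the visible page, in row order."""
    page_end = page_start + page_size
    return sorted(r for r in dirty_rows if page_start <= r < page_end)


def render_page(
    dirty_rows: set[int],
    page_start: int,
    compute: Callable[[int], Any],
    cache: dict[int, Any],
    page_size: int = 1000,
) -> None:
    """Recompute only the dirty rows of the visible page and clear their flags."""
    for row in rows_to_recalculate(dirty_rows, page_start, page_size):
        cache[row] = compute(row)
        dirty_rows.discard(row)  # rows outside the page keep their dirty flag
```

With several formula columns, the same loop would run once per column, visiting columns in topological order.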
Cycle Detection
Before adding a formula, the engine checks for cycles in the DAG using Kahn's algorithm during topological sort. If a cycle is detected:
- The formula is rejected
- The editor displays an error identifying the circular dependency chain
- The previous formula (if any) remains unchanged
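For reference, a compact version of Kahn's algorithm over a precedents map is sketched below. The function name and input shape are assumptions. On failure it reports the nodes that could not be ordered, a simpler diagnostic than the full dependency chain the editor is described as showing.

```python
from collections import deque


def topological_order(precedents: dict[str, set[str]]) -> list[str]:
    """Order nodes so that every precedent comes before its dependents.

    Raises ValueError when the graph contains a cycle.
    """
    nodes = set(precedents)
    for refs in precedents.values():
        nodes |= refs
    in_degree = {n: 0 for n in nodes}
    dependents: dict[str, list[str]] = {n: [] for n in nodes}
    for node, refs in precedents.items():
        for ref in refs:
            in_degree[node] += 1
            dependents[ref].append(node)

    # Start from the nodes that depend on nothing (sorted for a stable order).
    queue = deque(sorted(n for n in nodes if in_degree[n] == 0))
    order: list[str] = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for dep in dependents[node]:
            in_degree[dep] -= 1
            if in_degree[dep] == 0:
                queue.append(dep)

    if len(order) != len(nodes):
        # Whatever is left sits on a cycle or downstream of one.
        stuck = sorted(n for n in nodes if in_degree[n] > 0)
        raise ValueError(f"circular dependency among: {stuck}")
    return order
```

To validate a candidate formula, the engine could run this on a copy of the graph that includes the new edges and keep the previous formula if `ValueError` is raised.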
Caching
Each formula column caches its computed values:
- Results are stored in `ns_fast_access[col_id]` alongside raw data columns
- The dirty row set tracks which cached values are stale
- Non-dirty rows return their cached value without re-evaluation
- Cache is invalidated per-row when source data changes
Evaluation
Row-by-Row Execution
Formulas are evaluated row-by-row within the page being rendered. For each row:
- Resolve column references `{ColumnName}` to the cell value at the current row index
- Resolve cross-table references `{Table.Column}` via the join mechanism
- Evaluate the expression with resolved values
- Store the result in the cache (`ns_fast_access`)
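The per-row loop can be sketched as follows for same-table references. The compiled formula is modelled as a Python callable, and `store` is a dict of plain lists standing in for `ns_fast_access` (described elsewhere in this document as holding numpy arrays). All names are hypothetical, and the cross-table step is left out.

```python
from typing import Any, Callable


def evaluate_rows(
    formula: Callable[[dict[str, Any]], Any],
    refs: list[str],
    store: dict[str, list[Any]],
    target_col: str,
    rows: range,
) -> None:
    """Evaluate `formula` for each row in `rows` and write the result into `store`."""
    for row in rows:
        # Resolve each {ColumnName} reference to the cell value at this row.
        values = {name: store[name][row] for name in refs}
        # Evaluate with the resolved values and cache the result.
        store[target_col][row] = formula(values)


# Usage: {Price} * {Quantity} over the first two rows only.
store = {"Price": [2.0, 3.0, 4.0], "Quantity": [1, 2, 3], "Total": [None, None, None]}
evaluate_rows(lambda v: v["Price"] * v["Quantity"], ["Price", "Quantity"], store, "Total", range(0, 2))
```

The third row stays `None` in `store["Total"]` because it was not part of the evaluated range.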
Parser
The formula language uses a custom grammar parsed with Lark (consistent with the formatting DSL). The parser:
- Tokenizes the formula string
- Builds an AST (Abstract Syntax Tree)
- Transforms the AST into an evaluable representation
- Extracts column references for dependency graph registration
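The last step, extracting references for the dependency graph, can be approximated without a parser. The sketch below uses a regular expression from the standard library as a stand-in for the Lark-based AST walk. It is not the actual grammar. It only handles names made of word characters, and it ignores the columns named inside a WHERE clause, which a full implementation would also need to register.

```python
import re

# Matches {Column}, {Table.Column} and {Table.Column WHERE ...}.
REFERENCE = re.compile(
    r"\{\s*(?:(?P<table>\w+)\.)?(?P<column>\w+)(?:\s+WHERE\s+[^}]*)?\s*\}"
)


def extract_references(formula: str, current_table: str) -> set[str]:
    """Return the `table.column` node IDs that a formula reads from."""
    refs = set()
    for match in REFERENCE.finditer(formula):
        # References without a table prefix belong to the current table.
        table = match.group("table") or current_table
        refs.add(f"{table}.{match.group('column')}")
    return refs
```

For example, `extract_references('{Products.Price} * {Quantity}', 'Orders')` yields the two node IDs `Products.Price` and `Orders.Quantity`.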
Error Handling
| Error Type | Behavior |
|---|---|
| Syntax error | Editor highlights the error, formula not saved |
| Unknown column | Editor highlights, autocompletion suggests fixes |
| Type mismatch | Cell displays error indicator, other cells unaffected |
| Division by zero | Cell displays #DIV/0! or None |
| Circular dependency | Formula rejected, editor shows cycle chain |
| Cross-table not found | Editor highlights unknown table name |
| No join match | Cell displays None |
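The per-cell rows of this table imply that a failure in one row must not stop the others. A minimal sketch of that isolation is shown below, assuming the compiled formula is a Python callable. `safe_cell` and the `#TYPE!` marker are hypothetical, since the table only says that an error indicator is displayed.

```python
from typing import Any, Callable

DIV_ZERO = "#DIV/0!"
TYPE_ERROR = "#TYPE!"  # hypothetical marker; the actual indicator is not specified


def safe_cell(formula: Callable[[dict[str, Any]], Any], values: dict[str, Any]) -> Any:
    """Evaluate one cell and turn runtime errors into a display value."""
    try:
        return formula(values)
    except ZeroDivisionError:
        return DIV_ZERO
    except TypeError:
        return TYPE_ERROR


# Usage: {Total} / {Count} over three rows, two of which fail.
rows = [{"Total": 10, "Count": 2}, {"Total": 10, "Count": 0}, {"Total": "x", "Count": 2}]
results = [safe_cell(lambda v: v["Total"] / v["Count"], row) for row in rows]
```

Each row is evaluated independently, so the valid first row still produces its value.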
User Interface
Creating a Formula Column
Formula columns are created and edited through the DataGridColumnsManager:
- User opens the Columns Manager panel
- Adds a new column or edits an existing one
- Selects column type "Formula"
- A DslEditor (CodeMirror 5) opens for formula input
- The editor provides:
  - Syntax highlighting: keywords, column references, functions, operators
  - Autocompletion: column names (current table and other tables), function names, table names
  - Validation: real-time syntax checking and dependency cycle detection
  - Error markers: inline error indicators with descriptions
Formula Column Properties
A formula column extends DataGridColumnState with:
| Property | Type | Description |
|---|---|---|
| `formula` | `str` or `None` | The formula expression (`None` for data columns) |
| `col_type` | `ColumnType` | Set to `ColumnType.Formula` |

Other properties (title, visible, width, format) remain unchanged.
Formula columns are read-only in the grid body — cell values are computed, not editable. Formatting rules from the formatting DSL apply to formula columns like any other column.
Integration Points
| Component | Role |
|---|---|
| `DataGridColumnState` | Stores `formula` field and `ColumnType.Formula` type |
| `DatagridStore` | `ns_fast_access` caches formula results as numpy arrays |
| `DataGridColumnsManager` | UI for creating/editing formula columns |
| `DataGridsManager` | Hosts the global dependency DAG across all tables |
| `DslEditor` | CodeMirror 5 editor with highlighting and autocompletion |
| `FormattingEngine` | Applies formatting rules AFTER formula evaluation |
| `mk_body_content_page()` | Triggers formula computation for visible rows |
| `mk_body_cell_content()` | Reads computed values from `ns_fast_access` |
Syntax Summary
# Basic arithmetic
{Price} * {Quantity}
# Function call
round({Price} * 1.2, 2)
# Simple condition (None if false)
{Price} * 0.8 if {Country} == "FR"
# Condition with else
{Price} * 0.8 if {Country} == "FR" else {Price}
# Chained conditions
{Price} * 0.8 if {Country} == "FR" else {Price} * 0.9 if {Country} == "DE" else {Price}
# Logical operators
{Price} * 0.8 if {Country} == "FR" and {Quantity} > 10
# Grouping
{Price} * 0.8 if ({Country} == "FR" or {Country} == "DE") and {Quantity} > 10
# Cross-table (implicit join on id)
{Products.Price} * {Quantity}
# Cross-table (explicit join)
{Products.Price WHERE Products.Code = ProductCode} * {Quantity}
# Cross-table aggregation
sum({OrderLines.Amount WHERE OrderLines.OrderId = Id})
# Nested functions
round(avg({Q1}, {Q2}, {Q3}, {Q4}), 1)
# Text operations
concat(upper(left({FirstName}, 1)), ". ", {LastName})
Future: Cell-Level Overrides
The architecture supports adding cell-level formula overrides with ~20-30% additional work:
- Storage: sparse dict `cell_formulas: dict[(col_id, row_index), str]` (same pattern as `cell_formats`)
- DAG: new node type `table.column[row]` alongside existing `table.column` nodes
- Evaluation: "does this cell have an override? If yes, use it. Otherwise, use the column formula."
- Node ID scheme: designed to be extensible from the start (`table.column` for columns, `table.column[row]` for cells)
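The planned lookup rule and node ID scheme are small enough to sketch directly. Since the feature is not implemented, everything below is hypothetical and only illustrates the sparse-dict fallback described in the list.

```python
from typing import Optional


def formula_for_cell(
    col_id: str,
    row_index: int,
    column_formulas: dict[str, str],
    cell_formulas: dict[tuple[str, int], str],
) -> Optional[str]:
    """Use the cell override when one exists, otherwise the column formula."""
    return cell_formulas.get((col_id, row_index), column_formulas.get(col_id))


def node_id(table: str, column: str, row: Optional[int] = None) -> str:
    """Build `table.column` for a column node or `table.column[row]` for a cell node."""
    return f"{table}.{column}" if row is None else f"{table}.{column}[{row}]"
```

Because the override dict is sparse, a grid with no overrides pays only one failed dictionary lookup per cell.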