EraQL language reference

EraQL is a functional query language designed for data transformation and analysis. It combines SQL-like operations with functional programming concepts and a rich set of built-in functions.

Core Syntax

Data Sources

Table

Connect to existing database tables (must be followed by a Select):

Table(
  {
    name: 'users',
    schema: 'analytics'
  },
  {
    created_at: timestamp,
    user_id: string,
    username: string,
    email: string
  }
)
  |> Select({
    id: "user_id",
    name: "username",
    email: "email",
    created: "created_at"
  })

From

Alternative syntax for defining data sources:

From({
  name: 'orders',
  schema: 'sales',
  attributes: {
    customer_email: string,
    order_id: int,
    order_date: timestamp,
    status: string
  }
})

Pipeline Operator

Use the pipe operator |> to chain operations:

data |> Where(condition) |> Select(fields) |> Aggregate(grouping, calculations)

Data Types

Basic Types

string: Text data
int: Integer numbers
float: Floating-point numbers
boolean: True/false values
timestamp: Date and time values

Complex Types

ARRAY(type): Arrays of elements
Records: { field1: type1, field2: type2 }

Type Operations

null_of

Create null values of specific types:

null_of('int') // null integer
null_of('string') // null string
type_of(null_of('string')) // returns 'string'

type_of

Get the type of any value:

type_of(1) // 'RANGE(1, 1)!'
type_of('hello') // "ENUM('hello')!"
type_of(true) // 'boolean!'

tag

Tag values with specific types:

tag(1, 'float') // Tags integer as float type
type_of(tag(1, 'float')) // 'RANGE(1, 1)! @float'

Query Operations

Selection and Projection

Select

Transform and project columns:

// Basic selection
data |> Select({
  id: "order_id",
  name: "customer_name",
  created: "order_date"
})

// With transformations
data |> Select(|t| {
  ...t,  // Spread existing fields
  id: t"order_id" |> cast('string'),
  full_name: concat(t"first_name", ' ', t"last_name")
})

// With distinct option
data |> Select({ id: "user_id", name: "username" }, { distinct: true })

Derive

Add computed columns without removing existing ones:

data
  |> Derive({
    id: "user_id",
    metadata: {
      title: "post_title",
      content: "post_content",
      id: "post_id"
    }
  })

Where

Filter rows based on conditions:

data |> Where("item_count" == 1)
data |> Where("item_count" > 1 && ("is_premium" || "has_discount"))
data |> Where(one_of(lower("category"), ['electronics', 'books', 'clothing']))

Aggregation

Aggregate

Group data and perform calculations:

data
  |> Aggregate(
    ["customer_id"], // Group by columns
    {
      customer_id: "customer_id",
      unique_orders: count_distinct("order_id"),
      total_spent: sum("amount")
    }
  )

Summary

Perform calculations across all rows:

data
  |> Summary({
    total_count: count(1),
    items: `[{{string_agg(cast("item_data", string), ', ')}}]`
  })

Joins

LeftJoin

Perform left outer joins:

orders |> LeftJoin(customers, |ord, cust| {
  on: ord"customer_id" == cust"id",
  select: {
    ...ord,
    customer_name: cust"name",
    total_value: ord"quantity" |> mul(cust"unit_price")
  }
})

Set Operations

UnionAll

Combine multiple datasets:

UnionAll(active_users, inactive_users)

Built-in Functions

Mathematical Functions

Basic Arithmetic

add(1, 2) // 3
sub(5, 2) // 3
mul(2, 3) // 6
div(6, 2) // 3
to_the_power_of(2, 3) // 8

Mathematical Operations

abs(-5.5) // 5.5
floor(1.9) // 1
ceil(1.1) // 2
round(1.56, 1) // 1.6

Logarithmic Functions

ln(10) // 2.302585092994046
log_2(8) // 3
log_10(100) // 2

Advanced Math

cosine_distance([1, 0], [0, 1]) // 1
random() // Random number 0-1
nan() // NaN value
is_nan(div(0, 0)) // true

String Functions

Case Conversion

lower('HELLO') // 'hello'
upper('hello') // 'HELLO'

String Manipulation

concat('Hello', ' ', 'World') // 'Hello World'
length('hello') // 5
replace('hello world', 'world', 'there') // 'hello there'
substring('hello world', 1, 5) // 'hello'

String Analysis

starts_with('hello world', 'hello') // true
ends_with('hello world', 'world') // true
like('hello', 'h%') // true (SQL LIKE pattern)
is_numeric_string('123') // true

String Splitting

split_part('hello-world', '-', 1) // 'hello'
split_part('a,b,c', ',', 2) // 'b'

String Prefix/Suffix

remove_prefix('prefix-value', 'prefix-') // 'value'
remove_suffix('value-suffix', '-suffix') // 'value'

String Formatting

Template Strings

;`hello {{1+1}}` // 'hello 2'
format`hello {{1+1}}` // 'hello 2'
format_nullable`hello {{null_of(int)}}` // null

Comparison Functions

Equality

eq(1, 1) // true
neq(1, 2) // true
5 == 5 // true
5 != 3 // true

Ordering

gt(5, 3) // true
gte(5, 5) // true
lt(3, 5) // true
lte(5, 5) // true
5 > 3 // true
5 >= 5 // true
3 < 5 // true
5 <= 5 // true

Logical Functions

Boolean Operations

and(true, 2 > 1) // true
or(false, true) // true
not(true) // false

Null Handling

Null Checks

is_null(null_of(int)) // true
is_not_null(0) // true

Null Operations

coalesce(null_of(int), 3) // 3 (first non-null)
null_if(1, 1) // null (null if equal)

Array and Set Functions

Set Operations

one_of(1, [1, 2, 3]) // true
one_of('a', ['b', 'c']) // false

Array Creation

empty_array_of(int) // []

Date and Time Functions

Date Arithmetic

date_add(@2023-04-01, 1, 'days')      // Add 1 day
date_diff(@2023-04-01, @2023-04-15, 'days')  // 14 days difference

Date Manipulation

date_trunc(@2023-04-15, 'month')      // Truncate to month start
date_part(@2023-04-15, 'month')       // Extract month (4)
now()                                 // Current timestamp

Aggregation Functions

Basic Aggregations

count(1) // Count rows
count_distinct("field") // Count unique values
sum("amount") // Sum values
max("value") // Maximum value
min("value") // Minimum value
string_agg("field", ',') // Concatenate with delimiter

Window Functions

Ranking and Numbering

row_number_over({
  partition_by: ["group_id", "category"],
  order_by: ["created_date"]
})

first_value_over("message_text", {
  partition_by: ["conversation_id"],
  order_by: ["timestamp"]
})

count_over(true, {
  partition_by: ["user_id"]
})

Utility Functions

Data Generation

gen_random_uuid() // Generate UUID
impure(1 + 1) // Disable optimization

Type Conversion

cast('123', 'int') // Convert to integer
cast(123, 'string') // Convert to string

Record Operations

get_from_record({ a: 1, b: 2 }, 'a') // 1

Category Functions

category_concat('a', 'b') // 'a~>b'
category_at('a~>b~>c', 0) // 'a'

Conditional Logic

Conditional Expressions

if(condition, {then: value1, else: value2})

// Example in aggregation
sum(if("is_active", {then: "points", else: 0}))

Advanced Features

Lambda Expressions

Use lambda syntax for complex transformations:

data |> Select(|t| {
  ...t,
  computed_field: some_function(t"field1", t"field2")
})

Field References

Reference fields using quoted syntax:

t"field_name"        // Reference field in lambda
"field_name"         // Reference field in operations

Spread Operator

Use ... to include all existing fields:

Select(|t| {
  ...t,              // Include all existing fields
  new_field: "value" // Add new field
})

Type Casting

Convert between types explicitly:

"user_id" |> cast('string')
"amount" |> cast('float')

Best Practices

Query Organization

Use meaningful variable names with let bindings
Break complex queries into smaller, reusable parts
Comment complex logic for maintainability

Performance Optimization

Place Where clauses early in the pipeline
Use appropriate aggregation functions
Consider using distinct: true when needed

Type Safety

Explicitly cast types when necessary
Use null_of() for type-safe null values
Validate types with type_of() during development

Data Integrity

Handle null values explicitly with coalesce() and null checks
Use is_numeric_string() before string-to-number conversions
Validate data ranges and constraints in Where clauses

This reference covers the core EraQL language features. The language continues to evolve with additional functions and capabilities being added regularly.