Data Types
Laminar supports a rich set of SQL data types for stream processing. Examples use the sample data.
Primitive Types
Booleans
BOOLEANValues: TRUE, FALSE, NULL
SELECT
order_id,
status,
status = 'completed' as is_completed,
amount > 100 as is_high_value
FROM orders| order_id | status | is_completed | is_high_value |
|---|---|---|---|
| ord_1 | completed | true | true |
| ord_2 | completed | true | false |
| ord_3 | pending | false | true |
| ord_4 | cancelled | false | false |
| ord_5 | completed | true | true |
Integers
| Type | Range | Use Case |
|---|---|---|
INT / INTEGER | -2B to 2B | Counts, IDs, small numbers |
BIGINT | -9Q to 9Q | Large counts, timestamps as ms |
INT UNSIGNED | 0 to 4B | Always-positive integers |
BIGINT UNSIGNED | 0 to 18Q | Very large positive numbers |
SELECT
metric_id,
CAST(value as INTEGER) as value_int,
CAST(value as INTEGER) * 2 as doubled,
CAST(value as INTEGER) / 2 as halved
FROM metrics
WHERE name = 'requests'| metric_id | value_int | doubled | halved |
|---|---|---|---|
| m8 | 1250 | 2500 | 625 |
Note: Integer division truncates decimals. Cast to DOUBLE for decimal results.
Floating Point
| Type | Precision | Use Case |
|---|---|---|
FLOAT / REAL | ~7 digits | General decimals |
DOUBLE | ~15 digits | Scientific, financial |
DECIMAL(p, s) | Exact | Money, precise calculations |
SELECT
order_id,
amount,
(amount * 0.08)::DECIMAL(10, 2) as tax,
(amount * 1.08)::DECIMAL(10, 2) as total
FROM orders
WHERE status = 'completed'| order_id | amount | tax | total |
|---|---|---|---|
| ord_1 | 150.00 | 12.00 | 162.00 |
| ord_2 | 75.50 | 6.04 | 81.54 |
| ord_5 | 300.00 | 24.00 | 324.00 |
| ord_6 | 125.00 | 10.00 | 135.00 |
| ord_7 | 89.99 | 7.20 | 97.19 |
Text
VARCHAR -- Variable-length string (most common)
TEXT -- Alias for VARCHAR
STRING -- Alias for VARCHAR
CHAR -- Fixed-length (rarely needed)All text types are UTF-8 encoded with no length limit.
SELECT
log_id,
UPPER(level) as level_upper,
LOWER(service) as service_lower,
LENGTH(message) as message_length
FROM logs| log_id | level_upper | service_lower | message_length |
|---|---|---|---|
| l1 | INFO | auth | 47 |
| l2 | ERROR | payment | 54 |
| l3 | WARN | api | 54 |
| l4 | INFO | orders | 42 |
| l5 | ERROR | inventory | 40 |
| l6 | INFO | cache | 29 |
Timestamps
TIMESTAMPRepresents a point in time with nanosecond precision.
SELECT
order_id,
timestamp,
DATE_TRUNC('hour', timestamp) as order_hour,
EXTRACT(MINUTE FROM timestamp) as minute
FROM orders
WHERE status = 'completed'| order_id | timestamp | order_hour | minute |
|---|---|---|---|
| ord_1 | 2024-01-15 10:00:15 | 2024-01-15 10:00:00 | 0 |
| ord_2 | 2024-01-15 10:00:35 | 2024-01-15 10:00:00 | 0 |
| ord_5 | 2024-01-15 10:02:20 | 2024-01-15 10:00:00 | 2 |
| ord_6 | 2024-01-15 10:02:55 | 2024-01-15 10:00:00 | 2 |
| ord_7 | 2024-01-15 10:03:30 | 2024-01-15 10:00:00 | 3 |
Binary
BYTEARaw byte sequences for binary data.
SELECT
log_id,
message,
MD5(message) as message_hash
FROM logs| log_id | message | message_hash |
|---|---|---|
| l1 | User login successful user_id=u1 ip=192.168.1.1 | a1b2c3... |
| l2 | Payment failed order_id=ord_3 error=insufficient_funds | d4e5f6... |
Composite Types
Arrays
Group multiple values of the same type.
Declaration:
VARCHAR[] -- Array of strings
INT[] -- Array of integers
DOUBLE[] -- Array of floatsCreating arrays:
SELECT
customer_id,
ARRAY_AGG(order_id) as order_ids,
ARRAY_AGG(amount) as amounts
FROM orders
WHERE customer_id = 'c1'
GROUP BY customer_id| customer_id | order_ids | amounts |
|---|---|---|
| c1 | [ord_1, ord_3, ord_7] | [150.00, 200.00, 89.99] |
Array Operations
Access elements (1-indexed):
SELECT
customer_id,
ARRAY_AGG(order_id) as orders,
(ARRAY_AGG(order_id))[1] as first_order
FROM orders
GROUP BY customer_id| customer_id | orders | first_order |
|---|---|---|
| c1 | [ord_1, ord_3, ord_7] | ord_1 |
| c2 | [ord_2, ord_5] | ord_2 |
Get array length:
SELECT
customer_id,
ARRAY_AGG(order_id) as orders,
ARRAY_LENGTH(ARRAY_AGG(order_id)) as order_count
FROM orders
GROUP BY customer_id| customer_id | orders | order_count |
|---|---|---|
| c1 | [ord_1, ord_3, ord_7] | 3 |
| c2 | [ord_2, ord_5] | 2 |
Structs
Combine related fields into a single column.
Declaration:
STRUCT<
street VARCHAR,
city VARCHAR,
zip VARCHAR
>Access struct fields:
SELECT
order_id,
metadata.source,
metadata.channel
FROM orders_with_metadata| order_id | source | channel |
|---|---|---|
| ord_1 | web | organic |
| ord_2 | mobile | paid |
Type Casting
Convert between types with CAST or :: syntax.
SELECT
order_id,
amount,
CAST(amount as INTEGER) as amount_int,
amount::VARCHAR as amount_str,
timestamp::DATE as order_date
FROM orders| order_id | amount | amount_int | amount_str | order_date |
|---|---|---|---|---|
| ord_1 | 150.00 | 150 | 150.00 | 2024-01-15 |
| ord_2 | 75.50 | 75 | 75.50 | 2024-01-15 |
| ord_3 | 200.00 | 200 | 200.00 | 2024-01-15 |
NULL Handling
All types are nullable. Handle NULLs explicitly.
SELECT
order_id,
customer_id,
COALESCE(region, 'unknown') as region,
amount IS NOT NULL as has_amount
FROM orders| order_id | customer_id | region | has_amount |
|---|---|---|---|
| ord_1 | c1 | us-east | TRUE |
| ord_2 | c2 | us-west | TRUE |
| ord_3 | c1 | us-east | TRUE |
Type Inference
Laminar infers types from your connectors:
- JSON format: Types inferred from schema definition
- Avro/Protobuf: Types derived from schema
- Parquet/Iceberg: Types from table metadata
When types don't match, Laminar attempts automatic coercion. If coercion fails, you'll get a validation error before the pipeline starts.
Best Practices
- Use
BIGINTfor timestamps when storing as milliseconds - Use
DECIMALfor money to avoid floating-point errors - Prefer
VARCHARoverCHARfor text - Use arrays for variable-length lists of the same type
- Use structs for fixed, known fields