Delta Lake
The Delta Lake connector writes streaming data to Delta Lake tables. It supports S3, GCS, Azure, and the local filesystem.
Quick Example
apiVersion: laminar.io/v1
kind: Table
spec:
  name: events_delta
  connector: delta
  config:
    type: sink
    path: s3://my-bucket/delta-tables/events/
    storage_options:
      s3.region: us-east-1
      s3.access-key-id: AKIAIOSFODNN7EXAMPLE
      s3.secret-access-key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
    rolling_policy:
      file_size_bytes: 134217728
      interval_seconds: 300
    partitioning:
      fields:
        - name: event_date
          transform: identity
      shuffle_by_partition:
        enabled: true
  schema:
    format:
      parquet: {}
    fields:
      - field_name: event_id
        field_type:
          type:
            primitive: Utf8
        nullable: false
      - field_name: event_date
        field_type:
          type:
            primitive: Date32
        nullable: false
Configuration
Required
| Property | Type | Description |
|---|---|---|
| `type` | string | Must be `sink` |
| `path` | string | URI of the Delta Lake table (`s3://`, `gs://`, `az://`, or a local path) |
Optional
| Property | Type | Description |
|---|---|---|
| `storage_options` | object | Cloud storage credentials (see Storage Options tab) |
| `rolling_policy` | object | When to create new Parquet files |
| `partitioning` | object | Data partitioning configuration |
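The `rolling_policy` values in the Quick Example are plain integers in bytes and seconds, which are easy to mistype. The sketch below shows one way to derive them from friendlier units. The helper name `rolling_policy` and its parameters are hypothetical and not part of the connector; only the two output keys come from this page.

```python
def rolling_policy(file_size_mib: int, interval_minutes: int) -> dict:
    """Build a rolling_policy mapping from human-friendly units.

    Hypothetical helper for illustration only; the connector itself
    just reads the integer fields shown in the Quick Example.
    """
    return {
        "file_size_bytes": file_size_mib * 1024 * 1024,  # MiB -> bytes
        "interval_seconds": interval_minutes * 60,       # minutes -> seconds
    }


# 128 MiB and 5 minutes reproduce the values used in the Quick Example.
policy = rolling_policy(file_size_mib=128, interval_minutes=5)
print(policy)
```

With these inputs the helper yields `file_size_bytes: 134217728` and `interval_seconds: 300`, matching the example configuration.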
JSON Schema Reference
Connection Table Schema
{
  "type": "object",
  "properties": {
    "type": {"const": "sink"},
    "path": {"type": "string"},
    "storage_options": {
      "type": "object",
      "additionalProperties": {"type": "string"}
    },
    "rolling_policy": {
      "type": "object",
      "properties": {
        "file_size_bytes": {"type": "integer"},
        "interval_seconds": {"type": "integer"},
        "inactivity_seconds": {"type": "integer"}
      }
    },
    "partitioning": {
      "type": "object",
      "properties": {
        "time_pattern": {"type": "string"},
        "fields": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "name": {"type": "string"},
              "transform": {"enum": ["identity", "hour", "year", "month"]}
            },
            "required": ["name"]
          }
        },
        "shuffle_by_partition": {
          "type": "object",
          "properties": {
            "enabled": {"type": "boolean"}
          }
        }
      }
    }
  },
  "required": ["type", "path"]
}
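The schema's main constraints can be checked by hand before a table definition is submitted. The following is a minimal sketch using only the Python standard library. It is not a full JSON Schema validator, and the function name `check_delta_config` is hypothetical. It covers the required properties, the `sink` constant, string-valued `storage_options`, integer `rolling_policy` fields, and the `transform` enum as listed above.

```python
ALLOWED_TRANSFORMS = {"identity", "hour", "year", "month"}
ROLLING_KEYS = ("file_size_bytes", "interval_seconds", "inactivity_seconds")


def check_delta_config(config: dict) -> list:
    """Return a list of problems found in a `config` mapping (empty if none)."""
    errors = []

    # "required": ["type", "path"]
    for key in ("type", "path"):
        if key not in config:
            errors.append(f"missing required property: {key}")

    # "type": {"const": "sink"}
    if "type" in config and config["type"] != "sink":
        errors.append("type must be 'sink'")

    # "path": {"type": "string"}
    if "path" in config and not isinstance(config["path"], str):
        errors.append("path must be a string")

    # storage_options: every value must be a string
    for name, value in config.get("storage_options", {}).items():
        if not isinstance(value, str):
            errors.append(f"storage_options.{name} must be a string")

    # rolling_policy: listed fields must be integers (bool is excluded)
    rolling = config.get("rolling_policy", {})
    for name in ROLLING_KEYS:
        if name in rolling:
            value = rolling[name]
            if not isinstance(value, int) or isinstance(value, bool):
                errors.append(f"rolling_policy.{name} must be an integer")

    # partitioning.fields: each item needs a name; transform must be in the enum
    fields = config.get("partitioning", {}).get("fields", [])
    for i, field in enumerate(fields):
        if "name" not in field:
            errors.append(f"partitioning.fields[{i}] is missing name")
        if "transform" in field and field["transform"] not in ALLOWED_TRANSFORMS:
            errors.append(f"partitioning.fields[{i}].transform is not allowed")

    return errors


good = {
    "type": "sink",
    "path": "s3://my-bucket/delta-tables/events/",
    "rolling_policy": {"file_size_bytes": 134217728, "interval_seconds": 300},
    "partitioning": {"fields": [{"name": "event_date", "transform": "identity"}]},
}
bad = {"type": "source", "partitioning": {"fields": [{"transform": "day"}]}}

print(check_delta_config(good))
print(check_delta_config(bad))
```

The first call returns an empty list. The second reports the missing `path`, the wrong `type`, the field without a `name`, and the transform that is not in the enum.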