Work in Progress: This page is under development.

Pipeline Templates

Pipeline Templates provide quick-start configurations for common streaming use cases. Templates include pre-configured source schemas, SQL queries, and pipeline settings that you can customize before deployment.

Templates List

The Templates page displays all available pipeline templates organized by category. You can filter templates by category (Analytics, Developer, E-commerce, etc.) or search for specific templates.

Template Categories

  • Analytics - A/B testing, user analytics, retention tracking
  • Developer - API monitoring, CI/CD pipelines, error tracking, log aggregation, container monitoring, database performance
  • E-commerce - Cart abandonment, order processing, recommendations
  • Financial - Fraud detection, stock trading, transaction monitoring
  • Gaming - Player tracking, leaderboards, matchmaking, in-game purchases, server performance
  • Healthcare - Patient monitoring, appointments, medication tracking, hospital admissions
  • IoT & Sensors - Device monitoring, sensor data processing
  • Logistics - Fleet management, shipment tracking, warehouse operations, route optimization
  • Manufacturing - Production monitoring, equipment maintenance, supply chain tracking
  • Marketing - Campaign analytics, customer segmentation, email campaigns, social media analytics
  • Media - Content streaming, ad impressions, social engagement, content moderation
  • Retail - POS transactions, inventory management, loyalty programs, customer feedback
  • Security - Access control, intrusion detection, vulnerability scanning, firewall logs
  • Telecom - CDR processing, network monitoring, outage detection, subscriber activity
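
Category filtering and search on the Templates page can be pictured as a simple predicate over the template list. The sketch below is illustrative only: the `Template` class, the `filter_templates` function, and the choice to search names and tags are assumptions, not the product's actual implementation or API.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    # Hypothetical record; the real template model is not documented here.
    name: str
    category: str
    tags: list = field(default_factory=list)


def filter_templates(templates, category=None, search=""):
    """Keep templates that match the category (if given) and the search text."""
    needle = search.strip().lower()
    results = []
    for template in templates:
        if category is not None and template.category != category:
            continue
        # Assumption: search is a case-insensitive substring match on name and tags.
        haystack = " ".join([template.name, *template.tags]).lower()
        if needle and needle not in haystack:
            continue
        results.append(template)
    return results


# Sample data for illustration only.
templates = [
    Template("Cart Abandonment", "E-commerce", ["cart", "orders"]),
    Template("Fraud Detection", "Financial", ["fraud"]),
    Template("Order Processing", "E-commerce", ["orders"]),
]
ecommerce_cart = filter_templates(templates, category="E-commerce", search="cart")
```

Combining both filters narrows the list to templates that satisfy the category and the search text at once.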

Template Details

Click any template to view its details, including the schema fields, SQL query, and customizable parameters.

Template Information

Each template includes:

  • Schema Fields - The data fields that will be generated, with their types and generator configurations (uuid, faker, enum, range, datetime)
  • SQL Query - The transformation query that processes the streaming data
  • Events Per Second - Configurable data generation rate (50-5000 events per second)
  • Tags - Descriptive labels for categorization and discovery
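
To make the generator kinds more concrete, here is a minimal sketch of how a schema using uuid, faker, enum, range, and datetime generators might produce one mock event. The dictionary layout, field names, and helper functions are hypothetical. The faker generator is imitated with a small name list so the example needs no third-party packages. Whether the product clamps or rejects rates outside 50-5000 is not stated; clamping is assumed here.

```python
import random
import uuid
from datetime import datetime, timezone

# Hypothetical schema layout: field name -> generator configuration.
SCHEMA = {
    "event_id": {"generator": "uuid"},
    "user_name": {"generator": "faker", "provider": "first_name"},
    "status": {"generator": "enum", "values": ["ok", "warn", "error"]},
    "latency_ms": {"generator": "range", "min": 1, "max": 500},
    "created_at": {"generator": "datetime"},
}

# Stand-in for a faker-style provider (assumption, keeps the sketch dependency-free).
FAKE_FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret"]

MIN_EPS, MAX_EPS = 50, 5000  # Events Per Second range from the template details


def generate_value(config, rng):
    """Produce one value for a field based on its generator kind."""
    kind = config["generator"]
    if kind == "uuid":
        return str(uuid.UUID(int=rng.getrandbits(128), version=4))
    if kind == "faker":
        return rng.choice(FAKE_FIRST_NAMES)
    if kind == "enum":
        return rng.choice(config["values"])
    if kind == "range":
        return rng.randint(config["min"], config["max"])
    if kind == "datetime":
        return datetime.now(timezone.utc).isoformat()
    raise ValueError(f"unknown generator: {kind}")


def generate_event(schema, rng):
    """Build one event with a value for every schema field."""
    return {name: generate_value(cfg, rng) for name, cfg in schema.items()}


def clamp_events_per_second(requested):
    """Assumed behavior: pull the requested rate into the supported range."""
    return max(MIN_EPS, min(MAX_EPS, requested))


event = generate_event(SCHEMA, random.Random(0))
```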

Creating a Pipeline from a Template

Click Use Template to start the pipeline creation wizard with the template's configuration pre-filled.

Configuration Steps

The wizard guides you through:

  1. Source Profile - The connector type for input data (e.g., mock for testing)
  2. Source Table - The table that provides input data; its name is auto-generated
  3. Sink Profile - The connector type for output data (e.g., preview for testing)
  4. Sink Table - The table that receives processed data
  5. Pipeline Configuration - Name, SQL query, parallelism, and checkpoint interval

You can edit any of these settings before creating the pipeline.
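
The settings collected by the wizard can be thought of as a single draft object that is checked before the pipeline is created. The sketch below is an assumption for illustration: the `PipelineDraft` class, its field names, its default values, and the validation rules are hypothetical and do not describe the product's API.

```python
from dataclasses import dataclass


@dataclass
class PipelineDraft:
    # One attribute per wizard setting; names and defaults are assumptions.
    source_profile: str
    source_table: str
    sink_profile: str
    sink_table: str
    name: str
    sql_query: str
    parallelism: int = 1
    checkpoint_interval_secs: int = 10

    def validate(self):
        """Return a list of problems; an empty list means the draft looks usable."""
        problems = []
        text_fields = (
            "source_profile", "source_table", "sink_profile",
            "sink_table", "name", "sql_query",
        )
        for field_name in text_fields:
            if not getattr(self, field_name).strip():
                problems.append(f"{field_name} must not be empty")
        if self.parallelism < 1:
            problems.append("parallelism must be at least 1")
        if self.checkpoint_interval_secs <= 0:
            problems.append("checkpoint_interval_secs must be positive")
        return problems


# Example values only; the table and pipeline names are made up.
draft = PipelineDraft(
    source_profile="mock",
    source_table="orders_source",
    sink_profile="preview",
    sink_table="orders_sink",
    name="order-processing",
    sql_query="SELECT * FROM orders_source",
)
problems = draft.validate()
```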

Pipeline Created

After you click Create Pipeline, you are redirected to the Pipelines list, which shows your newly created pipeline.

The pipeline starts automatically. From the list, you can monitor its status, messages received and sent, data transferred, and backpressure.

Pipeline Details

Click on a pipeline name to view its details, including real-time metrics, pipeline graph, SQL query, and configuration.

Overview Tab

The Overview tab displays:

  • Pipeline Metrics - Messages received, messages sent, data transferred, and max backpressure
  • Messages Over Time - Chart showing received and sent message counts over time
  • Backpressure Over Time - Chart showing system backpressure percentage
  • Throughput Over Time - Chart showing bytes in and bytes out
  • Messages by Operator - Breakdown of messages by pipeline operator
  • Pipeline Graph - Visual representation of the pipeline topology with nodes and edges
  • SQL Query - The transformation query with copy functionality
  • Configuration - Checkpoint interval and other settings

Pipeline Tabs

  • Overview - Real-time metrics, charts, pipeline graph, and configuration
  • Jobs - History of pipeline runs with status and duration
  • Output - Real-time stream output preview
  • Checkpoints - Checkpoint history with trace details
  • Logs - Pipeline execution logs with filtering

Monitoring Jobs

The Jobs tab shows all pipeline runs with their status, start time, duration, and task count.

Job Information

| Column | Description |
|---|---|
| Run ID | Sequential run identifier (e.g., #1, #2) |
| State | Current job state (Running, Completed, Failed) |
| Start Time | When the job started |
| Duration | How long the job has been running |
| Tasks | Number of parallel tasks |
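
As a rough illustration of the Duration column, the sketch below computes elapsed time for a job that is still running and for one that has ended. The function names and the `HH:MM:SS` display format are assumptions; the UI may compute and format duration differently.

```python
from datetime import datetime, timezone


def job_duration(start_time, end_time=None, now=None):
    """Elapsed time: up to end_time if the job ended, otherwise up to now."""
    if end_time is None:
        end_time = now if now is not None else datetime.now(timezone.utc)
    return end_time - start_time


def format_duration(delta):
    """Render a timedelta as HH:MM:SS (assumed display format)."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Example timestamps for illustration only.
started = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
observed = datetime(2024, 1, 1, 12, 1, 30, tzinfo=timezone.utc)
running_for = format_duration(job_duration(started, now=observed))
```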

Monitoring Checkpoints

The Checkpoints tab shows the checkpoint history for the pipeline, allowing you to track state persistence.

Checkpoint Information

| Column | Description |
|---|---|
| Epoch | Checkpoint sequence number |
| Status | Checkpoint status (Completed, In Progress, Failed) |
| Backend | Storage backend (parquet) |
| Start Time | When the checkpoint started |
| Duration | How long the checkpoint took |

Checkpoint Trace Details

Click any checkpoint to view its trace timeline, which breaks the checkpoint down into phases for each operator.

The trace timeline shows:

  • Alignment - Time spent aligning checkpoint barriers
  • Sync - Synchronous state snapshot phase
  • Async - Asynchronous state upload phase
  • Metadata - Metadata writing phase
  • Committing - Final commit phase
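
One way to read a trace timeline is to look for the operator and phase that account for the most time. The sketch below assumes a trace can be represented as per-operator phase durations in milliseconds. That representation, the operator names, and the numbers are made up for illustration, since the product's trace format is not documented here.

```python
PHASES = ["alignment", "sync", "async", "metadata", "committing"]

# Hypothetical trace: operator -> phase -> duration in milliseconds.
trace = {
    "source": {"alignment": 0, "sync": 2, "async": 15, "metadata": 1, "committing": 3},
    "aggregate": {"alignment": 40, "sync": 5, "async": 120, "metadata": 1, "committing": 4},
    "sink": {"alignment": 12, "sync": 1, "async": 8, "metadata": 1, "committing": 6},
}


def phase_totals(trace):
    """Sum each phase across all operators."""
    totals = {phase: 0 for phase in PHASES}
    for phases in trace.values():
        for phase, millis in phases.items():
            totals[phase] += millis
    return totals


def slowest_step(trace):
    """Return the (operator, phase, milliseconds) entry with the longest duration."""
    return max(
        (
            (operator, phase, millis)
            for operator, phases in trace.items()
            for phase, millis in phases.items()
        ),
        key=lambda entry: entry[2],
    )


totals = phase_totals(trace)
bottleneck = slowest_step(trace)
```

With this sample data, the asynchronous upload phase of the `aggregate` operator dominates, which suggests starting any investigation with that operator's state upload.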

Viewing Logs

The Logs tab provides access to pipeline execution logs with filtering capabilities.

Log Filtering

  • Job Filter - Filter logs by specific job run or view all jobs
  • Log Level - Filter by severity (error, warn, info, debug, trace)
  • Time Range - Select time window (Last 1 hour, Last 24 hours, etc.)
  • Live Tail - Enable real-time log streaming
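
The job, level, and time range filters combine naturally, as the sketch below shows. It is an illustration under assumptions: the record layout and function name are hypothetical, and the level filter is treated as a minimum severity threshold, which may differ from how the Logs tab behaves.

```python
from datetime import datetime, timedelta, timezone

# Severity levels from the Log Level filter, ordered from least to most severe.
LEVELS = ["trace", "debug", "info", "warn", "error"]


def filter_logs(records, now, job_id=None, min_level="info", window=timedelta(hours=1)):
    """Keep records for the job (if given), at or above min_level, within the window."""
    threshold = LEVELS.index(min_level)
    cutoff = now - window
    return [
        record
        for record in records
        if (job_id is None or record["job_id"] == job_id)
        and LEVELS.index(record["level"]) >= threshold
        and record["timestamp"] >= cutoff
    ]


# Sample records for illustration only.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
records = [
    {"job_id": 1, "level": "info", "timestamp": now - timedelta(minutes=5), "message": "started"},
    {"job_id": 1, "level": "debug", "timestamp": now - timedelta(minutes=4), "message": "tick"},
    {"job_id": 2, "level": "error", "timestamp": now - timedelta(minutes=3), "message": "failed"},
    {"job_id": 2, "level": "warn", "timestamp": now - timedelta(hours=3), "message": "slow"},
]
recent_errors = filter_logs(records, now, min_level="error")
```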

Next Steps