Pipeline Templates
Pipeline Templates provide quick-start configurations for common streaming use cases. Templates include pre-configured source schemas, SQL queries, and pipeline settings that you can customize before deployment.
Templates List
The Templates page displays all available pipeline templates organized by category. You can filter templates by category (Analytics, Developer, E-commerce, etc.) or search for specific templates.
Template Categories
- Analytics - A/B testing, user analytics, retention tracking
- Developer - API monitoring, CI/CD pipelines, error tracking, log aggregation, container monitoring, database performance
- E-commerce - Cart abandonment, order processing, recommendations
- Financial - Fraud detection, stock trading, transaction monitoring
- Gaming - Player tracking, leaderboards, matchmaking, in-game purchases, server performance
- Healthcare - Patient monitoring, appointments, medication tracking, hospital admissions
- IoT & Sensors - Device monitoring, sensor data processing
- Logistics - Fleet management, shipment tracking, warehouse operations, route optimization
- Manufacturing - Production monitoring, equipment maintenance, supply chain tracking
- Marketing - Campaign analytics, customer segmentation, email campaigns, social media analytics
- Media - Content streaming, ad impressions, social engagement, content moderation
- Retail - POS transactions, inventory management, loyalty programs, customer feedback
- Security - Access control, intrusion detection, vulnerability scanning, firewall logs
- Telecom - CDR processing, network monitoring, outage detection, subscriber activity
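The category filter and search described above can be sketched in Python. The template records and field names below are invented for illustration; the actual data model is an assumption.

```python
# Hypothetical template records; the real product's fields may differ.
templates = [
    {"name": "Fraud Detection", "category": "Financial", "tags": ["fraud"]},
    {"name": "Cart Abandonment", "category": "E-commerce", "tags": ["cart"]},
]

def find_templates(items, category=None, query=""):
    """Filter by exact category and/or case-insensitive name/tag search."""
    q = query.lower()
    return [
        t for t in items
        if (category is None or t["category"] == category)
        and (q in t["name"].lower() or any(q in tag for tag in t["tags"]))
    ]
```

With an empty query, only the category filter applies; a non-empty query matches against both the template name and its tags.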
Template Details
Click on any template to view its details, including the schema fields, SQL query, and customizable parameters.

Template Information
Each template includes:
- Schema Fields - The data fields that will be generated, with their types and generator configurations (uuid, faker, enum, range, datetime)
- SQL Query - The transformation query that processes the streaming data
- Events Per Second - Configurable data generation rate (50-5000 events per second)
- Tags - Descriptive labels for categorization and discovery
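Putting the pieces above together, a template definition might look like the following sketch. The dictionary layout is an assumption made for illustration, but the generator types (`uuid`, `faker`, `enum`, `range`, `datetime`) and the 50-5000 events-per-second range come from the documentation above.

```python
# Hypothetical template definition; the exact structure is an assumption.
template = {
    "name": "cart-abandonment",
    "category": "E-commerce",
    "tags": ["e-commerce", "sessions"],
    "events_per_second": 500,  # documented range: 50-5000
    "schema_fields": [
        {"name": "event_id", "generator": "uuid"},
        {"name": "user_email", "generator": "faker"},
        {"name": "event_type", "generator": "enum",
         "values": ["view", "add_to_cart", "checkout"]},
        {"name": "cart_value", "generator": "range", "min": 1, "max": 500},
        {"name": "event_time", "generator": "datetime"},
    ],
    "sql": "SELECT user_email, COUNT(*) AS events FROM source GROUP BY user_email",
}

def validate_template(t):
    """Return a list of validation errors for a template definition."""
    errors = []
    if not 50 <= t["events_per_second"] <= 5000:
        errors.append("events_per_second must be between 50 and 5000")
    allowed = {"uuid", "faker", "enum", "range", "datetime"}
    for f in t["schema_fields"]:
        if f["generator"] not in allowed:
            errors.append("unknown generator: " + f["generator"])
    return errors
```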
Creating a Pipeline from a Template
Click Use Template to start the pipeline creation wizard with the template's configuration pre-filled.
Configuration Steps
The wizard guides you through:
- Source Profile - The connector type for input data (e.g., mock for testing)
- Source Table - The table that provides input data, with an auto-generated name
- Sink Profile - The connector type for output data (e.g., preview for testing)
- Sink Table - The table that receives processed data
- Pipeline Configuration - Name, SQL query, parallelism, and checkpoint interval
You can edit any of these settings before creating the pipeline.
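The configuration the wizard assembles can be sketched as a single structure. The key names and default values here are assumptions chosen to mirror the steps above, not the product's actual wire format.

```python
# Hypothetical shape of the wizard's output; names/defaults are assumptions.
def build_pipeline_config(name, sql):
    """Assemble a pipeline configuration with template-style defaults."""
    return {
        "source_profile": "mock",          # connector type for input data
        "source_table": name + "_source",  # auto-generated source table name
        "sink_profile": "preview",         # connector type for output data
        "sink_table": name + "_sink",
        "pipeline": {
            "name": name,
            "sql": sql,
            "parallelism": 1,
            "checkpoint_interval_secs": 10,
        },
    }
```

Each of these fields corresponds to a wizard step, so editing a setting before creation is just overriding one key.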
Pipeline Created
After clicking Create Pipeline, you're redirected to the Pipelines list showing your newly created pipeline.
The pipeline starts automatically, and you can monitor its status, messages received/sent, data transferred, and backpressure from the list.
Pipeline Details
Click on a pipeline name to view its details, including real-time metrics, pipeline graph, SQL query, and configuration.
Overview Tab
The Overview tab displays:
- Pipeline Metrics - Messages received, messages sent, data transferred, and max backpressure
- Messages Over Time - Chart showing received and sent message counts over time
- Backpressure Over Time - Chart showing system backpressure percentage
- Throughput Over Time - Chart showing bytes in and bytes out
- Messages by Operator - Breakdown of messages by pipeline operator
- Pipeline Graph - Visual representation of the pipeline topology with nodes and edges
- SQL Query - The transformation query with copy functionality
- Configuration - Checkpoint interval and other settings
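The "over time" charts above plot metric samples grouped into fixed intervals. A minimal sketch of that bucketing, with invented sample data (the real charting is an internal detail of the UI):

```python
# Group (timestamp_secs, count) samples into fixed-width time buckets,
# as a chart like "Messages Over Time" might do. Data is invented.
def bucket_counts(samples, bucket_secs=10):
    """Sum counts per fixed-width bucket, keyed by bucket start time."""
    buckets = {}
    for ts, n in samples:
        key = int(ts // bucket_secs) * bucket_secs
        buckets[key] = buckets.get(key, 0) + n
    return dict(sorted(buckets.items()))
```

Backpressure charts work the same way, except the aggregate per bucket is a maximum (peak percentage) rather than a sum.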
Pipeline Tabs
- Overview - Real-time metrics, charts, pipeline graph, and configuration
- Jobs - History of pipeline runs with status and duration
- Output - Real-time stream output preview
- Checkpoints - Checkpoint history with trace details
- Logs - Pipeline execution logs with filtering
Monitoring Jobs
The Jobs tab shows all pipeline runs with their status, start time, duration, and task count.
Job Information
| Column | Description |
|---|---|
| Run ID | Sequential run identifier (e.g., #1, #2) |
| State | Current job state (Running, Completed, Failed) |
| Start Time | When the job started |
| Duration | How long the job has been running (or ran, for completed jobs) |
| Tasks | Number of parallel tasks |
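The Duration column can be derived from the start time: for a finished job it is end minus start, and for a running job it is the elapsed time so far. A sketch of that calculation (the job record shape is an assumption):

```python
from datetime import datetime

def job_duration(start, end, now):
    """Duration in seconds: end - start for finished jobs, now - start otherwise."""
    return ((end if end is not None else now) - start).total_seconds()
```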
Monitoring Checkpoints
The Checkpoints tab shows the checkpoint history for the pipeline, allowing you to track state persistence.
Checkpoint Information
| Column | Description |
|---|---|
| Epoch | Checkpoint sequence number |
| Status | Checkpoint status (Completed, In Progress, Failed) |
| Backend | Storage backend (parquet) |
| Start Time | When the checkpoint started |
| Duration | How long the checkpoint took |
Checkpoint Trace Details
Click on any checkpoint to view its detailed trace timeline, showing the checkpoint phases for each operator.
The trace timeline shows:
- Alignment - Time spent aligning checkpoint barriers
- Sync - Synchronous state snapshot phase
- Async - Asynchronous state upload phase
- Metadata - Metadata writing phase
- Committing - Final commit phase
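Treating the five phases as running back-to-back for one operator (a simplification; in practice the async upload can overlap with normal processing), an operator's contribution to checkpoint duration is roughly the sum of its phase times. The millisecond values below are invented for illustration:

```python
# Invented per-operator phase timings (ms); real values come from the trace.
phases = {
    "alignment": 12.0,   # waiting for checkpoint barriers
    "sync": 3.5,         # synchronous state snapshot
    "async": 40.0,       # asynchronous state upload
    "metadata": 1.2,     # metadata write
    "committing": 0.8,   # final commit
}
total_ms = sum(phases.values())
```

Comparing phase breakdowns across operators is a quick way to spot where a slow checkpoint spends its time, e.g. long alignment suggests backpressure, a long async phase suggests large state or slow storage.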
Viewing Logs
The Logs tab provides access to pipeline execution logs with filtering capabilities.
Log Filtering
- Job Filter - Filter logs by specific job run or view all jobs
- Log Level - Filter by severity (error, warn, info, debug, trace)
- Time Range - Select time window (Last 1 hour, Last 24 hours, etc.)
- Live Tail - Enable real-time log streaming
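The three non-live filters above compose into one predicate per log record. The sketch below expresses that in Python; the record shape (`job_id`, `level`, `timestamp`, `message`) is an assumption, while the severity ordering matches the levels listed above.

```python
from datetime import datetime, timedelta, timezone

# Severity order from most to least severe, as listed in the Log Level filter.
LEVELS = ["error", "warn", "info", "debug", "trace"]

def filter_logs(logs, job_id=None, max_level="info", since_hours=1, now=None):
    """Keep records for the given job, at or above severity, inside the window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=since_hours)
    max_idx = LEVELS.index(max_level)
    return [
        r for r in logs
        if (job_id is None or r["job_id"] == job_id)
        and LEVELS.index(r["level"]) <= max_idx
        and r["timestamp"] >= cutoff
    ]
```

Selecting a level such as `info` keeps `info` and everything more severe (`warn`, `error`), mirroring how most log viewers treat severity filters.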
Next Steps
- Explore SQL functions for data transformation
- Set up production connectors for real data sources