API Reference

Complete API documentation for clpipe: SQL column lineage analysis and pipeline orchestration.

Quick Start

from clpipe import Pipeline

# Create pipeline from SQL queries
pipeline = Pipeline.from_sql_list([
    "CREATE TABLE staging AS SELECT id, name FROM raw",
    "CREATE TABLE output AS SELECT * FROM staging"
])

# Trace lineage
sources = pipeline.trace_column_backward("output", "id")
impacts = pipeline.trace_column_forward("raw", "name")

# Export
data = pipeline.to_json()
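
To make concrete what backward and forward tracing mean for the two Quick Start queries, here is a toy model in plain Python. It is a sketch only and does not use or reproduce clpipe's implementation: the `EDGES` list is written by hand, and `build_index` and `reachable` are hypothetical helpers, not part of the clpipe API. The shape of the values clpipe actually returns may differ.

```python
from collections import defaultdict

# Hand-written column-level edges for the two Quick Start queries
# (an assumption for illustration; clpipe derives these from the SQL).
# Each pair is (source "table.column", derived "table.column").
EDGES = [
    ("raw.id", "staging.id"),
    ("raw.name", "staging.name"),
    ("staging.id", "output.id"),      # SELECT * expands to id and name
    ("staging.name", "output.name"),
]


def build_index(edges):
    """Index edges both ways: target -> sources and source -> targets."""
    parents, children = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        parents[dst].add(src)
        children[src].add(dst)
    return parents, children


def reachable(start, neighbours):
    """Return every column reachable from `start` by following `neighbours`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


parents, children = build_index(EDGES)

# Backward: which upstream columns feed output.id?
backward = reachable("output.id", parents)

# Forward: which downstream columns are affected by raw.name?
forward = reachable("raw.name", children)

print(sorted(backward))
print(sorted(forward))
```

Backward tracing walks the edges from target to source, forward tracing walks the same edges in the other direction, so one edge list serves both questions.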

Core Classes

Pipeline

The main entry point for all lineage operations.

  • Pipeline - Create pipelines, trace lineage, manage metadata, execute queries

Lineage

Classes for understanding data flow.

Export

Export lineage data to various formats.

Comparison

Track changes between pipeline versions.
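
As a rough illustration of what comparing two pipeline versions involves, the sketch below diffs two hand-written sets of column-level edges. It assumes lineage can be represented as (source, target) pairs; `diff_edges` and the sample data are hypothetical and are not clpipe's comparison API.

```python
def diff_edges(old, new):
    """Return (added, removed) edge sets between two lineage versions."""
    old_set, new_set = set(old), set(new)
    return new_set - old_set, old_set - new_set


# Hypothetical lineage for two versions of the same pipeline.
v1 = [("raw.id", "staging.id"), ("raw.name", "staging.name")]
v2 = [("raw.id", "staging.id"), ("raw.email", "staging.email")]

added, removed = diff_edges(v1, v2)
print("added:", sorted(added))
print("removed:", sorted(removed))
```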

Common Imports

from clpipe import (
    # Main entry point
    Pipeline,

    # Single-query lineage
    SQLColumnTracer,

    # Export formats
    JSONExporter,
    CSVExporter,
    GraphVizExporter,
)

SQL Dialects

clpipe supports multiple SQL dialects via sqlglot:

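from clpipe import Pipeline

# Example input: the same two statements used in Quick Start
queries = [
    "CREATE TABLE staging AS SELECT id, name FROM raw",
    "CREATE TABLE output AS SELECT * FROM staging",
]

# BigQuery (default)
pipeline = Pipeline.from_sql_list(queries, dialect="bigquery")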

# Snowflake
pipeline = Pipeline.from_sql_list(queries, dialect="snowflake")

# PostgreSQL
pipeline = Pipeline.from_sql_list(queries, dialect="postgres")

# Other: mysql, redshift, spark, duckdb, etc.