Lineage
Data lineage helps you understand and trace the flow of data through Sundial's pipeline, from source tables to metrics and dimensions. It provides visibility into data transformations and dependencies, making it easier to debug issues and understand metric calculations.
Overview
Data lineage in Sundial automatically tracks:
- Source tables and their relationships
- Transformations and operations applied to data
- Dependencies between metrics and source data
- Derived metric relationships
Use Cases
Debugging Metric Issues
When investigating metric discrepancies, lineage helps you:
- Trace metrics back to source tables
- Identify transformation steps that may cause issues
- Verify data freshness at each stage
- Validate metric calculation logic
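The tracing step above amounts to walking a dependency graph upstream from a metric until you reach tables with no parents. The following is a minimal sketch of that idea, not Sundial's implementation: the `UPSTREAM` mapping, the node names, and the `trace_to_sources` helper are all hypothetical and only illustrate the traversal.

```python
from collections import deque

# Hypothetical lineage graph: each node maps to the nodes it reads from.
# These names are illustrative and do not come from Sundial.
UPSTREAM = {
    "weekly_active_users": ["user_activity_daily"],
    "user_activity_daily": ["raw_events", "users"],
    "raw_events": [],
    "users": [],
}


def trace_to_sources(node, upstream):
    """Walk upstream breadth-first from `node`.

    Returns a tuple: (all ancestors in visit order, ancestors with no parents).
    """
    ancestors = []
    visited = {node}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in upstream.get(current, []):
            if parent not in visited:
                visited.add(parent)
                ancestors.append(parent)
                queue.append(parent)
    # A source table is an ancestor that reads from nothing else.
    sources = [n for n in ancestors if not upstream.get(n)]
    return ancestors, sources


ancestors, sources = trace_to_sources("weekly_active_users", UPSTREAM)
```

Here `ancestors` lists every intermediate step between the metric and its inputs, which is the set of places to inspect when a value looks wrong, and `sources` narrows that to the original tables.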
Impact Analysis
Before making changes, use lineage to:
- Understand downstream dependencies
- Assess impact of schema changes
- Identify affected metrics and reports
- Plan testing and validation
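Impact analysis is the same traversal run in the opposite direction: start from the table you plan to change and collect everything that depends on it. The sketch below assumes lineage is available as a list of `(source, target)` edges. The `EDGES` data and the `downstream_of` helper are hypothetical and are not part of any Sundial API.

```python
from collections import defaultdict

# Hypothetical lineage edges, written as (source, target).
EDGES = [
    ("orders", "orders_cleaned"),
    ("orders_cleaned", "revenue"),
    ("orders_cleaned", "order_count"),
    ("revenue", "average_order_value"),
    ("order_count", "average_order_value"),
    ("customers", "customer_count"),
]


def downstream_of(node, edges):
    """Return every node that directly or indirectly depends on `node`."""
    children = defaultdict(list)
    for source, target in edges:
        children[source].append(target)

    affected = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in affected:
                affected.add(child)
                stack.append(child)
    return sorted(affected)


affected = downstream_of("orders", EDGES)
```

In this example a change to `orders` reaches four downstream nodes, while `customer_count` is untouched, so testing effort can be focused on the affected metrics.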
Data Governance
Lineage supports governance by:
- Documenting data flows and transformations
- Tracking data usage and dependencies
- Supporting audit and compliance needs
- Maintaining data quality standards
Viewing Lineage
The lineage UI provides four views:
- Table View: Shows relationships between tables and their refresh status
- Schema View: Shows table schemas and column definitions
- Metric View: Traces metric dependencies and refresh status
- Focused View: Highlights selected tables or metrics and their direct dependencies, helping you isolate and analyze a specific data flow
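Conceptually, a focused view restricts the graph to one node and its immediate neighbors. As a rough illustration only, assuming lineage is represented as `(source, target)` edges, the hypothetical `focus` helper below selects the direct upstream and downstream dependencies of a single node. It does not describe how the Sundial UI is built.

```python
# Hypothetical lineage edges, written as (source, target).
EDGES = [
    ("orders", "orders_cleaned"),
    ("orders_cleaned", "revenue"),
    ("orders_cleaned", "order_count"),
    ("revenue", "average_order_value"),
]


def focus(node, edges):
    """Return only the direct (one-hop) dependencies of `node`."""
    upstream = sorted({source for source, target in edges if target == node})
    downstream = sorted({target for source, target in edges if source == node})
    return {"node": node, "upstream": upstream, "downstream": downstream}


view = focus("orders_cleaned", EDGES)
```

Indirect dependencies such as `average_order_value` are deliberately left out, which is what keeps the focused result small enough to read at a glance.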
Best Practices
To get the most value from lineage:
- Review lineage before making schema changes
- Use lineage to validate metric definitions
- Document transformations and business logic
- Monitor lineage for unexpected changes
- Leverage lineage for impact assessment
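Monitoring for unexpected changes can be thought of as comparing two snapshots of the lineage graph. The sketch below assumes you have some way to obtain each snapshot as a list of `(source, target)` edges. How those snapshots are exported is not covered here, and the `diff_lineage` helper and sample data are hypothetical.

```python
def diff_lineage(previous, current):
    """Compare two collections of (source, target) edges."""
    previous, current = set(previous), set(current)
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


# Hypothetical snapshots taken on two consecutive days.
yesterday = [("orders", "revenue"), ("customers", "customer_count")]
today = [
    ("orders", "revenue"),
    ("orders_v2", "revenue"),
    ("customers", "customer_count"),
]

changes = diff_lineage(yesterday, today)
```

A new edge such as `("orders_v2", "revenue")` signals that a metric has gained an input, which is worth reviewing if nobody expected the definition to change.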
Technical Details
Sundial automatically generates lineage by:
- Analyzing source table schemas and relationships
- Tracking SQL transformations and operations
- Mapping metric definitions to source data
- Building dependency graphs for derived metrics
- Maintaining historical lineage information
The lineage system updates in real time as changes occur in your data pipeline.
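To make the dependency-graph step concrete, here is a small sketch of how derived metrics could be ordered so that each one comes after its inputs. It uses Python's standard `graphlib` module (Python 3.9 or later). The `METRIC_INPUTS` mapping and the `build_order` helper are assumptions made for illustration and do not reflect Sundial's internal data model.

```python
from graphlib import TopologicalSorter

# Hypothetical metric definitions: each metric maps to the inputs it reads.
METRIC_INPUTS = {
    "revenue": {"orders"},
    "order_count": {"orders"},
    "average_order_value": {"revenue", "order_count"},
}


def build_order(metric_inputs):
    """Return the nodes in an order where every input precedes its dependents.

    TopologicalSorter raises graphlib.CycleError if the definitions are circular.
    """
    return list(TopologicalSorter(metric_inputs).static_order())


order = build_order(METRIC_INPUTS)
```

The exact position of `revenue` relative to `order_count` is not guaranteed, since neither depends on the other. What the ordering does guarantee is that `orders` comes before both, and both come before `average_order_value`.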