Sources
A Source is a table-based representation of raw data from an external system that serves as an input to Sundial. It defines the schema and properties of the data as it first enters your ETL workflow, before any transformations are applied. A Source is always tied to a connection, from which it reads its data.
Source tables typically represent:
- Tables or table-like objects (such as views) in a data warehouse such as Snowflake or BigQuery
- Tables in a well-structured data lake format like Iceberg or Delta Lake
- Partitioned CSV/JSON/Parquet files in an object storage system such as S3 or GCS
Each Source table has:
- The connection-specific configuration that identifies the table. This includes details such as the name of the source table in the connection's database, or the prefix path in an object storage bucket.
- The expected schema (column names and data types) of the table that will be usable within Sundial. You can also select a subset of columns from the source table, so that only those columns are visible and usable.
- Materialization configuration, which (if set) defines how the table will be replicated into the Sundial workspace.
- Table tests, to ensure that the data in a source is valid. These tests run whenever the source is refreshed.
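To make the four parts above concrete, here is a minimal Python sketch of how a Source could be modeled. This is not Sundial's actual API or configuration format: the class `SourceSketch`, its field names, the `refresh` method, and the example connection and table names are all hypothetical and chosen only for illustration. It also assumes that table tests see only the selected columns, which the description above does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


@dataclass
class SourceSketch:
    """Hypothetical model of a Source; not Sundial's real API."""
    connection: str                       # name of the connection the Source reads from
    locator: str                          # table name, or prefix path in a bucket
    schema: Dict[str, str]                # expected column name -> data type
    selected_columns: Optional[List[str]] = None       # optional subset of columns
    materialization: Optional[Dict[str, str]] = None   # None means "not materialized"
    tests: List[Callable[[List[Row]], bool]] = field(default_factory=list)

    def visible_columns(self) -> List[str]:
        """Return the columns that are visible and usable downstream."""
        if self.selected_columns is None:
            return list(self.schema)
        unknown = [c for c in self.selected_columns if c not in self.schema]
        if unknown:
            raise ValueError(f"columns not in schema: {unknown}")
        return list(self.selected_columns)

    def refresh(self, rows: List[Row]) -> List[Row]:
        """Project rows onto the visible columns, then run every table test."""
        cols = self.visible_columns()
        projected = [{c: row.get(c) for c in cols} for row in rows]
        for test in self.tests:
            if not test(projected):
                raise ValueError(f"table test failed: {test.__name__}")
        return projected


def no_null_order_ids(rows: List[Row]) -> bool:
    """Example table test: every row must have an order_id."""
    return all(row["order_id"] is not None for row in rows)


# Hypothetical Source over a warehouse table, exposing two of three columns.
orders = SourceSketch(
    connection="warehouse",
    locator="analytics.orders",
    schema={"order_id": "int", "amount": "float", "note": "string"},
    selected_columns=["order_id", "amount"],
    materialization={"mode": "full_refresh"},
    tests=[no_null_order_ids],
)
```

Calling `orders.refresh(rows)` in this sketch drops the unselected `note` column and raises a `ValueError` if any row lacks an `order_id`, mirroring the idea that tests run whenever the source is refreshed.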