Supported sources

Column-level lineage is supported in all Synq supported data-warehouses and relational databases. In addition to that it also works down to Looker and Tableau.

Visibility into issues

There are two possible when column-level lineage might not be accurate and we clearly indicate that.

Parsing of SQL failed

In case our parser is unable to understand the SQL query, we indicate that both on the Code & Changes of an asset and on the lineage graph.

Extraction of dependencies failed

In case when SQL contains edge case constructs, we might not correctly resolve how data in the columns flows. To prevent such issues, we do post-validation of our SQL understanding and expose those validation results both in Code & Changes tab of an asset and on the lineage graph.

Limitations

We do our best to improve support for all different syntaxes and constructs available in different SQL dialects. If any feature needed is not supported please report that to us.

BigQuery STRUCT literal lineage

We track lineage of individual STRUCT fields and RECORD column itself, so lineage will work when using them in CTEs and record simplified lineage as part of single SQL. When going from one asset to another we expose only information about columns and not the RECORD fields.

Consider this example:

WITH c AS (
  SELECT STRUCT (c1 AS a, c2 AS b) AS field FROM `prj`.`ds`.`tbl`),
    d AS (SELECT STRUCT (field.a as d) double_nested
          FROM c)
SELECT  double_nested.d
FROM d

In this case we will correctly track simplified lineage from output column d to column c1 in prj.ds.tbl.

When we have two assets, one defined as

CREATE VIEW ds.rec_tbl(r) AS
  SELECT STRUCT (c1 AS a, c2 AS b) AS r FROM `prj`.`ds`.`tbl`)

Second defined by query

SELECT r.c1 AS r FROM `prj`.`ds`.`rec_tbl`

We show that column r depends on both c1 and c2. We might consider changing that behaviour later.