DWH Agent
Integrating with databases via an on-premise SYNQ agent
This guide explains how to configure the SYNQ Data Warehouse (DWH) agent when it is installed on-premise.
When possible, use the standard integration methods, as they are much simpler and provide full functionality.
Reach out to SYNQ for installation details.
Data we collect
For the automated data anomaly testing, we collect the following:
- Number of rows in every table in the monitored dataset(s)
- Timestamp of the last change of data in all tables in the monitored dataset(s)
To provide out-of-the-box monitors for data volume and freshness, SYNQ doesn’t require access to your actual data. For custom monitors, SYNQ requires access to query your raw data.
Set up DWH Agent integration
- In SYNQ, navigate to Data Sources and click “Add integration”
- Select “On-premise DWH Agent” from the list of available integrations
- Enter a title for your integration (e.g., “Production DWH Agent”)
- Click “Create”. You will receive:
  - client_id
  - client_secret

Save these credentials. You’ll need them to configure the agent in the next section.
Keep your client credentials secure. They allow the agent to authenticate with SYNQ.
Agent Configuration
The agent is configured via the agent.yaml file or through environment variables.
Example configuration:
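The following is a minimal sketch of an agent.yaml assembled from the schema in this guide, not an official sample. It assumes that connections is a map keyed by an identifier of your choosing; the key `warehouse-main`, the host names, and all other values are placeholders. The SYNQ endpoint values are left as placeholders because this guide does not state them.

```yaml
# agent.yaml: illustrative sketch only; every value below is a placeholder.
agent:
  name: example-dwh-agent          # hypothetical instance name
  tags:
    - production
  log_level: LOG_LEVEL_INFO

synq:
  client_id: "<client_id from the integration setup>"
  client_secret: "<client_secret from the integration setup>"
  endpoint: "<host:port>"          # SYNQ API agent endpoint
  ingest_endpoint: "<host:port>"   # SYNQ API ingest endpoint
  oauth_url: "<OAuth authentication URL>"

connections:
  warehouse-main:                  # hypothetical map key
    name: Main warehouse
    parallelism: 2                 # the documented default
    postgres:
      host: db.internal.example
      port: 5432
      database: analytics
      username: synq_agent
      password: "<password>"
```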
For details about the available options, see Config file schema.
Config file schema
Config
Config represents the main configuration for the DWH agent
Type: object
Property | Type | Required | Possible values | Description |
---|---|---|---|---|
agent | object | | Config.Agent | Agent configuration |
synq | object | | Config.SYNQ | SYNQ platform configuration |
connections | object | | Config.Connection | Map of connection configurations |
Config.Agent
Agent contains metadata about this agent instance
Type: object
Property | Type | Required | Possible values | Description |
---|---|---|---|---|
name | string | | string | Name of the agent instance |
tags | array | | string | Tags to categorize and organize the agent |
log_level | string | | LOG_LEVEL_UNSPECIFIED, LOG_LEVEL_TRACE, LOG_LEVEL_DEBUG, LOG_LEVEL_INFO, LOG_LEVEL_WARN, LOG_LEVEL_ERROR | |
log_json | boolean | | boolean | |
log_report_caller | boolean | | boolean | |
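As an illustration of the agent block, the sketch below sets every property from the table. The schema does not describe log_json and log_report_caller; judging by their names, they presumably switch log output to JSON and add the calling code location to each log entry, but treat that as an assumption. The name and tags are placeholders.

```yaml
agent:
  name: example-dwh-agent      # hypothetical instance name
  tags:                        # free-form labels; values here are placeholders
    - production
    - eu
  log_level: LOG_LEVEL_DEBUG   # one of the LOG_LEVEL_* values listed above
  log_json: true               # assumption: emit logs as JSON
  log_report_caller: false     # assumption: omit caller information from logs
```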
Config.Connection
Connection represents a database connection configuration
Type: object
Property | Type | Required | Possible values | Description |
---|---|---|---|---|
name | string | | string | Name of the connection |
disabled | boolean | | boolean | |
parallelism | integer | | integer | How many queries to DWH can be executed in parallel, defaults to 2 |
bigquery | object | | BigQueryConf | |
clickhouse | object | | ClickhouseConf | |
databricks | object | | DatabricksConf | |
mysql | object | | MySQLConf | |
postgres | object | | PostgresConf | |
redshift | object | | RedshiftConf | |
snowflake | object | | SnowflakeConf | |
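A sketch of two entries in the connections map, one raising parallelism and one switched off. The map keys `reporting` and `legacy` are hypothetical, and it is an assumption that each connection carries exactly one database-specific block and that disabled: true makes the agent skip the connection.

```yaml
connections:
  reporting:                   # hypothetical map key
    name: Reporting replica
    parallelism: 4             # allow 4 parallel queries instead of the default 2
    postgres:
      host: reporting.internal.example
      port: 5432
      database: reporting
      username: synq_agent
      password: "<password>"
  legacy:                      # hypothetical map key
    name: Legacy MySQL
    disabled: true             # assumption: the agent ignores this connection
    mysql:
      host: legacy.internal.example
      port: 3306
      database: app
      username: synq_agent
      password: "<password>"
```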
Config.SYNQ
SYNQ contains authentication and connection details for the SYNQ platform
Type: object
Property | Type | Required | Possible values | Description |
---|---|---|---|---|
client_id | string | | string | Client ID for OAuth authentication |
client_secret | string | | string | Client secret for OAuth authentication |
endpoint | string | | string | SYNQ API agent endpoint (host:port) |
ingest_endpoint | string | | string | SYNQ API ingest endpoint (host:port) |
oauth_url | string | | string | OAuth authentication URL |
BigQueryConf
BigQuery specific configuration
Type: object
Property | Type | Required | Possible values | Description |
---|---|---|---|---|
project_id | string | | string | GCP project ID |
service_account_key | string | | string | Service account key JSON |
service_account_key_file | string | | string | Location of service account key file |
region | string | | string | Region for BigQuery resources |
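A possible BigQuery connection, sketched under the assumption that service_account_key and service_account_key_file are alternatives and only one needs to be set. The map key, project ID, file path, and region are placeholders.

```yaml
connections:
  bq-analytics:                          # hypothetical map key
    name: BigQuery analytics
    bigquery:
      project_id: my-gcp-project         # placeholder GCP project ID
      # Path to a key file readable by the agent (placeholder path).
      service_account_key_file: /etc/synq/bigquery-key.json
      region: EU                         # placeholder; use your datasets' region
```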
ClickhouseConf
ClickHouse specific configuration
Type: object
Property | Type | Required | Possible values | Description |
---|---|---|---|---|
host | string | | string | Host address |
port | integer | | integer | Port number |
database | string | | string | Database name |
username | string | | string | Username for authentication |
password | string | | string | Password for authentication |
allow_insecure | boolean | | boolean | Whether to disable SSL for the connection |
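A ClickHouse connection might look like the sketch below. The map key and host are placeholders, and the port is an assumption (9440 is ClickHouse's default secure native port); use whichever port your server exposes for the protocol the agent speaks, which this guide does not specify.

```yaml
connections:
  ch-events:                     # hypothetical map key
    name: ClickHouse events
    clickhouse:
      host: clickhouse.internal.example
      port: 9440                 # assumption: default secure native port
      database: events
      username: synq_agent
      password: "<password>"
      allow_insecure: false      # keep SSL enabled; true disables it
```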
DatabricksConf
Databricks specific configuration
Type: object
Property | Type | Required | Possible values | Description |
---|---|---|---|---|
workspace_url | string | | string | |
auth_token | string | | string | |
auth_client | string | | string | |
auth_secret | string | | string | |
warehouse | string | | string | |
refresh_table_metrics | boolean | | boolean | |
refresh_table_metrics_use_scan | boolean | | boolean | |
fetch_table_tags | boolean | | boolean | |
use_show_create_table | boolean | | boolean | |
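The schema gives no descriptions for the Databricks properties, so the sketch below rests on readings of their names: auth_token is assumed to be an access token, auth_client and auth_secret an alternative client-credentials pair, and warehouse an identifier of the SQL warehouse to query. All values, including the map key, are placeholders; confirm the semantics with SYNQ before relying on them.

```yaml
connections:
  dbx-lakehouse:                   # hypothetical map key
    name: Databricks lakehouse
    databricks:
      workspace_url: "<your Databricks workspace URL>"
      auth_token: "<access token>" # assumption: token-based authentication
      # Assumed alternative to auth_token:
      # auth_client: "<client id>"
      # auth_secret: "<client secret>"
      warehouse: "<SQL warehouse identifier>"
      fetch_table_tags: true       # assumption: also collect table tags
```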
MySQLConf
MySQL specific configuration
Type: object
Property | Type | Required | Possible values | Description |
---|---|---|---|---|
host | string | | string | Host address |
port | integer | | integer | Port number |
database | string | | string | Database name |
username | string | | string | Username for authentication |
password | string | | string | Password for authentication |
allow_insecure | boolean | | boolean | Whether to allow insecure connections |
params | object | | object | Additional connection parameters |
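A sketch of a MySQL connection that also uses params. The schema only says that params holds additional connection parameters; the `charset` entry is a hypothetical example, and which keys are accepted depends on the driver the agent uses, which this guide does not name.

```yaml
connections:
  mysql-app:                     # hypothetical map key
    name: Application MySQL
    mysql:
      host: mysql.internal.example
      port: 3306                 # MySQL's default port
      database: app
      username: synq_agent
      password: "<password>"
      allow_insecure: false
      params:
        charset: utf8mb4         # hypothetical parameter, shown for shape only
```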
PostgresConf
Postgres specific configuration
Type: object
Property | Type | Required | Possible values | Description |
---|---|---|---|---|
host | string | | string | Host address |
port | integer | | integer | Port number |
database | string | | string | Database name |
username | string | | string | Username for authentication |
password | string | | string | Password for authentication |
allow_insecure | boolean | | boolean | Whether to allow insecure connections |
RedshiftConf
Redshift specific configuration
Type: object
Property | Type | Required | Possible values | Description |
---|---|---|---|---|
host | string | | string | Host address |
port | integer | | integer | Port number |
database | string | | string | Database name |
username | string | | string | Username for authentication |
password | string | | string | Password for authentication |
freshness_from_query_logs | boolean | | boolean | Estimate table freshness based on query logs |
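For Redshift, a connection that opts into estimating freshness from query logs could be sketched as follows. The map key and host are placeholders, and 5439 is simply Redshift's default port.

```yaml
connections:
  redshift-dwh:                        # hypothetical map key
    name: Redshift warehouse
    redshift:
      host: "<cluster endpoint host>"
      port: 5439                       # Redshift's default port
      database: dwh
      username: synq_agent
      password: "<password>"
      freshness_from_query_logs: true  # estimate table freshness from query logs
```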
SnowflakeConf
Snowflake specific configuration
Type: object
Property | Type | Required | Possible values | Description |
---|---|---|---|---|
account | string | | string | Snowflake account identifier |
warehouse | string | | string | Virtual warehouse to use |
role | string | | string | Role to assume |
username | string | | string | Username for authentication |
password | string | | string | Password for authentication |
private_key | string | | string | Content of the private key used for Snowflake authentication |
databases | array | | string | Databases to connect to |
use_get_ddl | boolean | | boolean | Use GET_DDL to determine queries used for table/view creation |
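A Snowflake connection using key-pair authentication might be written as in the sketch below. It assumes that password and private_key are alternatives, so only private_key is set; the map key, account identifier, warehouse, role, and database names are placeholders.

```yaml
connections:
  sf-analytics:                    # hypothetical map key
    name: Snowflake analytics
    snowflake:
      account: "<account identifier>"
      warehouse: SYNQ_WH           # placeholder virtual warehouse
      role: SYNQ_ROLE              # placeholder role
      username: SYNQ_AGENT
      # The key content itself, not a file path, per the schema description.
      private_key: |
        -----BEGIN PRIVATE KEY-----
        <key material>
        -----END PRIVATE KEY-----
      databases:                   # list of databases to monitor
        - ANALYTICS
        - RAW
      use_get_ddl: true            # use GET_DDL to recover table/view definitions
```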