Overview

The Coalesce Quality Google Cloud Storage integration automatically discovers all GCS buckets in your Google Cloud project and creates custom entities for centralized asset management and governance. This integration can also serve as a reference implementation for building custom integrations using the Coalesce Quality Public API. Source code: github.com/getsynq/synq-google-cloud-storage

What It Does

The integration:
  • Discovers all GCS buckets in your Google Cloud project
  • Creates custom entities for each bucket with rich metadata
  • Syncs comprehensive bucket information including:
    • Storage class and location
    • Versioning status
    • Lifecycle rules with detailed conditions
    • Uniform bucket-level access settings
    • User-defined labels
  • Filters buckets based on include/exclude patterns
  • Automatically removes entities when their buckets are deleted
Each bucket entity includes a detailed description with storage class, location, creation date, versioning status, and complete lifecycle rule configurations.

Use Cases

  • Data Catalog Management: Track all cloud storage resources in one place
  • Resource Discovery: Automatically maintain an up-to-date inventory of GCS buckets
  • Compliance Tracking: Monitor storage configurations and lifecycle policies
  • Cross-Platform Lineage: Connect storage buckets with data pipelines and tables

Installation

Download Pre-built Binaries

Download the latest release for your platform from the releases page.
macOS (Intel):
curl -LO https://github.com/getsynq/synq-google-cloud-storage/releases/latest/download/synq-google-cloud-storage_darwin_amd64.tar.gz
tar -xzf synq-google-cloud-storage_darwin_amd64.tar.gz
sudo mv synq-google-cloud-storage /usr/local/bin/
macOS (Apple Silicon):
curl -LO https://github.com/getsynq/synq-google-cloud-storage/releases/latest/download/synq-google-cloud-storage_darwin_arm64.tar.gz
tar -xzf synq-google-cloud-storage_darwin_arm64.tar.gz
sudo mv synq-google-cloud-storage /usr/local/bin/
Linux (AMD64):
curl -LO https://github.com/getsynq/synq-google-cloud-storage/releases/latest/download/synq-google-cloud-storage_linux_amd64.tar.gz
tar -xzf synq-google-cloud-storage_linux_amd64.tar.gz
sudo mv synq-google-cloud-storage /usr/local/bin/
Linux (ARM64):
curl -LO https://github.com/getsynq/synq-google-cloud-storage/releases/latest/download/synq-google-cloud-storage_linux_arm64.tar.gz
tar -xzf synq-google-cloud-storage_linux_arm64.tar.gz
sudo mv synq-google-cloud-storage /usr/local/bin/
Windows: download the .zip file from the releases page and extract it.
Build from source (requires Go 1.24 or later):
git clone https://github.com/getsynq/synq-google-cloud-storage.git
cd synq-google-cloud-storage
go build

Configuration

Required: API Credentials

Create a .env file with your Coalesce Quality API credentials:
SYNQ_CLIENT_ID=your_client_id_here
SYNQ_CLIENT_SECRET=your_client_secret_here

# GCP Project ID (optional if running on GCP or using gcloud)
GCP_PROJECT_ID=your-gcp-project-id
The GCP project ID can be auto-detected from:
  • Environment variables (GCP_PROJECT_ID, GOOGLE_CLOUD_PROJECT, GCLOUD_PROJECT)
  • gcloud CLI configuration
  • GCP metadata server (when running on GCP)
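As a rough illustration of the environment-variable step, the sketch below checks the three variables in the order they are listed above. The function name `detect_project_id` and the lookup order are assumptions made for illustration, not the integration's confirmed logic, and the gcloud and metadata-server fallbacks are left out.

```python
import os
from typing import Mapping, Optional

# Variable names come from the list above; the priority order is an assumption.
PROJECT_ENV_VARS = ("GCP_PROJECT_ID", "GOOGLE_CLOUD_PROJECT", "GCLOUD_PROJECT")


def detect_project_id(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-empty project ID found in the environment, else None."""
    env = os.environ if env is None else env
    for name in PROJECT_ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    # A real tool would fall back to gcloud or the metadata server here.
    return None
```

Passing a plain dictionary instead of `os.environ` makes the lookup easy to try out without touching the real environment.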

Optional: Advanced Configuration

For advanced customization, create a config.yaml file:
# Coalesce Quality API Configuration (defaults shown for EU region)
synq:
  endpoint: "developer.synq.io:443"  # EU region (default)
  # For US region, use: "api.us.synq.io:443"
  oauth_url: "https://developer.synq.io/oauth2/token"  # EU region (default)
  # For US region, use: "https://api.us.synq.io/oauth2/token"

# GCP Configuration
gcp:
  user_agent: "synq-gcs-client-v1.0.0"
  # project_id: "your-project-id"  # Alternative to GCP_PROJECT_ID env var
  # entity_group_id: "gcs::custom-group-id"  # Defaults to gcs::<project_id>

# Custom Entity Type IDs
types:
  bucket_type_id: 40
  # bucket_icon: "path/to/custom-bucket-icon.svg"

# Resource Filters
filter:
  buckets:
    include: []  # Empty means include all
    exclude: []  # Regex patterns to exclude
    # Examples:
    # include: ["prod-.*"]  # Only include buckets starting with prod-
    # exclude: ["test-.*"]  # Skip buckets starting with test-
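The include/exclude semantics described in the comments above can be sketched in Python as follows. This is an illustration only: the helper name `filter_buckets` is hypothetical, and two behaviors are assumptions rather than documented facts, namely that patterns are anchored at the start of the bucket name and that an exclude match wins over an include match.

```python
import re
from typing import Iterable, List


def filter_buckets(names: Iterable[str], include: List[str], exclude: List[str]) -> List[str]:
    """Keep names matching any include pattern (all names if include is empty),
    then drop names matching any exclude pattern."""
    inc = [re.compile(p) for p in include]
    exc = [re.compile(p) for p in exclude]
    kept = []
    for name in names:
        # Assumption: patterns are anchored at the start of the name (re.match).
        if inc and not any(p.match(name) for p in inc):
            continue
        # Assumption: exclude takes priority over include.
        if any(p.match(name) for p in exc):
            continue
        kept.append(name)
    return kept


buckets = ["prod-logs", "prod-data", "test-tmp", "archive"]
print(filter_buckets(buckets, include=["prod-.*"], exclude=[]))   # ['prod-logs', 'prod-data']
print(filter_buckets(buckets, include=[], exclude=["test-.*"]))   # ['prod-logs', 'prod-data', 'archive']
```

If the integration uses unanchored matching instead, a pattern such as `prod-.*` would also match a name like `nonprod-x`, so it is worth confirming the behavior with `--dry-run` before relying on a pattern.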
Configuration precedence: defaults → config.yaml → environment variables → CLI flags
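To make the precedence chain concrete, here is a minimal sketch that merges four configuration layers from lowest to highest priority. The `merge_layers` helper and the nested-dictionary representation of environment variables and CLI flags are assumptions for illustration; only the ordering itself comes from the line above.

```python
from typing import Any, Dict


def merge_layers(*layers: Dict[str, Any]) -> Dict[str, Any]:
    """Merge nested dicts left to right; later layers override earlier ones."""
    result: Dict[str, Any] = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(result.get(key), dict):
                # Both sides are sections: merge them key by key.
                result[key] = merge_layers(result[key], value)
            else:
                result[key] = value
    return result


defaults = {"synq": {"endpoint": "developer.synq.io:443"}, "types": {"bucket_type_id": 40}}
config_file = {"filter": {"buckets": {"exclude": ["test-.*"]}}}
env_vars = {"gcp": {"project_id": "env-project"}}
cli_flags = {"synq": {"endpoint": "api.us.synq.io:443"}, "gcp": {"project_id": "cli-project"}}

# Order of arguments mirrors: defaults, config.yaml, environment variables, CLI flags.
effective = merge_layers(defaults, config_file, env_vars, cli_flags)
print(effective["synq"]["endpoint"])   # api.us.synq.io:443
print(effective["gcp"]["project_id"])  # cli-project
```

In this example the CLI value for the project ID overrides the environment value, while the default bucket type ID survives because no later layer sets it.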

Network Requirements

If your GCP project has firewall rules that restrict outbound connections, allowlist the Coalesce Quality egress IP addresses. See IP Addresses for the latest information.

Running the Integration

Basic Usage

# Minimal setup: just create .env with credentials
cp .env.example .env
# Edit .env and add your credentials

# Run with defaults
./synq-google-cloud-storage

# Run with debug logging
LOG_LEVEL=DEBUG ./synq-google-cloud-storage

# Run with JSON logging for production
LOG_FORMAT=json ./synq-google-cloud-storage

Dry-Run Mode

Preview what entities would be created without making API calls:
# Dry-run mode (no API calls)
./synq-google-cloud-storage --dry-run

# Dry-run with debug logging
LOG_LEVEL=DEBUG ./synq-google-cloud-storage --dry-run
In dry-run mode, the integration scans GCS buckets and shows what would be created, but it does not call the Coalesce Quality API or require Coalesce Quality API credentials.
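For readers building their own integration, the dry-run pattern might look like the sketch below: the plan is always computed, and the dry-run flag only decides whether the write step runs. The `sync_buckets` function and the `upsert` callback are hypothetical names, not part of this tool or the Public API.

```python
from typing import Callable, Iterable, List


def sync_buckets(names: Iterable[str], upsert: Callable[[str], None], dry_run: bool) -> List[str]:
    """Plan one entity per bucket; only call `upsert` when dry_run is False."""
    planned = [f"gcs::{name}" for name in names]
    if dry_run:
        for identifier in planned:
            print(f"[dry-run] would create or update {identifier}")
        return planned
    for identifier in planned:
        upsert(identifier)  # stand-in for the real API call
    return planned
```

Keeping discovery and planning identical in both modes means the preview reflects what a real run would attempt.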

Command-Line Options

Common flags:
  • --dry-run - Preview changes without calling the API
  • --gcp.project-id - Specify GCP project ID
  • --filter.buckets.include - Bucket name patterns to include
  • --filter.buckets.exclude - Bucket name patterns to exclude
  • --types.bucket-type-id - Custom entity type ID (default: 40)
  • --synq.endpoint - API endpoint (EU or US region)
For US region:
./synq-google-cloud-storage \
  --synq.endpoint=api.us.synq.io:443 \
  --synq.oauth-url=https://api.us.synq.io/oauth2/token
Run ./synq-google-cloud-storage --help for all available options.
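When a wrapper script has to pick the region, the two endpoint pairs can be kept in a small lookup table. The sketch below is an assumption-laden illustration: the endpoint values and flag names are taken from this page, while the `"eu"`/`"us"` keys and the `region_flags` helper are hypothetical.

```python
from typing import Dict, List

# Values copied from the configuration section; the region keys are hypothetical.
REGIONS: Dict[str, Dict[str, str]] = {
    "eu": {
        "endpoint": "developer.synq.io:443",
        "oauth_url": "https://developer.synq.io/oauth2/token",
    },
    "us": {
        "endpoint": "api.us.synq.io:443",
        "oauth_url": "https://api.us.synq.io/oauth2/token",
    },
}


def region_flags(region: str) -> List[str]:
    """Build the --synq.* flags for a region; raises KeyError for unknown regions."""
    settings = REGIONS[region.lower()]
    return [
        f"--synq.endpoint={settings['endpoint']}",
        f"--synq.oauth-url={settings['oauth_url']}",
    ]
```

The returned list can be appended to the command line, for example when launching the binary with `subprocess.run`.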

How It Works

  1. Authenticates with the Coalesce Quality API using OAuth2 client credentials
  2. Creates or updates the custom entity type for GCS buckets
  3. Discovers all GCS buckets in your project
  4. Creates entities with comprehensive metadata
  5. Uses entity groups to enable automatic cleanup of deleted buckets
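Step 1 can be illustrated with a generic OAuth2 client-credentials token request built from the Python standard library. This is a sketch under assumptions: the form fields follow the standard client-credentials grant, but whether the Coalesce Quality token endpoint expects the credentials in the request body (as here) or in a Basic authorization header is not stated on this page. The request is only constructed, not sent.

```python
import urllib.parse
import urllib.request


def build_token_request(oauth_url: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        # Assumption: credentials are sent as form fields, not via Basic auth.
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        oauth_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


req = build_token_request(
    "https://developer.synq.io/oauth2/token",  # EU default from config.yaml
    "your_client_id_here",
    "your_client_secret_here",
)
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` would normally return a JSON document containing an `access_token` field, which is then attached to subsequent API calls.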

Entity Groups

The integration uses entity groups to track all entities created in each run. When the group is updated, Coalesce Quality automatically removes entities that were in the previous group but not in the current one, enabling automatic cleanup of deleted buckets.
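The cleanup rule amounts to a set difference between the previous and the current group membership. The sketch below models that rule locally for illustration; in practice the removal is performed by Coalesce Quality when the group is updated, and the `stale_entities` helper and sample identifiers are hypothetical.

```python
from typing import Iterable, Set


def stale_entities(previous_group: Iterable[str], current_group: Iterable[str]) -> Set[str]:
    """Entities present in the previous run's group but missing from the current one."""
    return set(previous_group) - set(current_group)


previous = {"gcs::prod-logs", "gcs::prod-data", "gcs::old-archive"}
current = {"gcs::prod-logs", "gcs::prod-data", "gcs::new-exports"}
print(sorted(stale_entities(previous, current)))  # ['gcs::old-archive']
```

Note that a bucket hidden by a new exclude filter would look the same as a deleted bucket under this rule, so filter changes can also cause entities to be removed.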

Custom Identifiers

All entities use custom identifiers with a gcs:: prefix for namespace isolation. Buckets use identifiers in the format gcs::<bucket_name>.
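A minimal sketch of the identifier scheme, assuming plain string concatenation: the bucket format comes from the paragraph above and the default entity group format (gcs::<project_id>) from the config.yaml comments, while the function names are hypothetical.

```python
PREFIX = "gcs::"


def bucket_identifier(bucket_name: str) -> str:
    """Identifier for a bucket entity, in the form gcs::<bucket_name>."""
    return f"{PREFIX}{bucket_name}"


def default_entity_group_id(project_id: str) -> str:
    """Default entity group ID, in the form gcs::<project_id>."""
    return f"{PREFIX}{project_id}"


print(bucket_identifier("prod-logs"))              # gcs::prod-logs
print(default_entity_group_id("your-project-id"))  # gcs::your-project-id
```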

Scheduling with Cron

To keep your bucket inventory up-to-date, schedule the integration to run periodically:
# Run every hour
0 * * * * cd /path/to/integration && ./synq-google-cloud-storage

# Run every 6 hours
0 */6 * * * cd /path/to/integration && ./synq-google-cloud-storage

Using as a Public API Example

This integration demonstrates best practices for using the Coalesce Quality Public API:
  • OAuth2 authentication with client credentials
  • Creating and managing custom entity types
  • Using entity groups for automatic cleanup
  • Implementing filtering and dry-run modes
  • Handling regional deployments
See the source code and API documentation for more details.

Troubleshooting

  • If authentication fails, verify that the credentials in your .env file are correct and that you are using the correct endpoint for your region (EU or US).
  • If the GCP project ID cannot be detected, set the GCP_PROJECT_ID environment variable explicitly or configure the gcloud CLI with your project:
gcloud config set project YOUR_PROJECT
  • If buckets cannot be listed, ensure your GCP credentials have the storage.buckets.list permission and can read bucket metadata.
  • If you are running behind a firewall, verify that outbound connections to the API endpoint are allowed, and check your network configuration and firewall rules.

Support

For issues or questions: