Overview
The Coalesce Quality Google Cloud Storage integration automatically discovers all GCS buckets in your Google Cloud project and creates custom entities for centralized asset management and governance. This integration can also serve as a reference implementation for building custom integrations using the Coalesce Quality Public API. Source code: github.com/getsynq/synq-google-cloud-storage
What It Does
The integration:
- Discovers all GCS buckets in your Google Cloud project
- Creates custom entities for each bucket with rich metadata
- Syncs comprehensive bucket information including:
  - Storage class and location
  - Versioning status
  - Lifecycle rules with detailed conditions
  - Uniform bucket-level access settings
  - User-defined labels
- Filters buckets based on include/exclude patterns
- Automatically removes entities when their buckets are deleted
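As a rough illustration of the metadata listed above, the following sketch models one bucket as a plain record. The class name, field names, and example values are hypothetical and are not taken from the integration's source; the lifecycle rule follows the general shape of a GCS lifecycle rule (an action plus a condition).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BucketMetadata:
    """Hypothetical record holding the bucket attributes listed above."""

    name: str
    storage_class: str
    location: str
    versioning_enabled: bool = False
    uniform_bucket_level_access: bool = False
    lifecycle_rules: List[dict] = field(default_factory=list)
    labels: Dict[str, str] = field(default_factory=dict)


# Example values are made up for illustration.
example = BucketMetadata(
    name="raw-events",
    storage_class="STANDARD",
    location="EU",
    versioning_enabled=True,
    lifecycle_rules=[{"action": {"type": "Delete"}, "condition": {"age": 30}}],
    labels={"team": "data-platform"},
)
```

Keeping the lifecycle rules as a list of dictionaries preserves their detailed conditions without committing to a fixed schema.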
Use Cases
- Data Catalog Management: Track all cloud storage resources in one place
- Resource Discovery: Automatically maintain an up-to-date inventory of GCS buckets
- Compliance Tracking: Monitor storage configurations and lifecycle policies
- Cross-Platform Lineage: Connect storage buckets with data pipelines and tables
Installation
Download Pre-built Binaries
Download the latest release for your platform from the releases page.
macOS
Builds are available for Intel and Apple Silicon.
Linux
Builds are available for AMD64 and ARM64.
Windows
Download the .zip file from the releases page and extract it.
Build from Source
Requires Go 1.24 or later.
Configuration
Required: API Credentials
Create a .env file with your Coalesce Quality API credentials.
The GCP project ID can be auto-detected from:
- Environment variables (GCP_PROJECT_ID, GOOGLE_CLOUD_PROJECT, GCLOUD_PROJECT)
- gcloud CLI configuration
- GCP metadata server (when running on GCP)
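The environment-variable step of this detection can be sketched as follows. The variable names come from the list above; checking them in the listed order and ignoring blank values are assumptions, and the function name is hypothetical.

```python
import os
from typing import Mapping, Optional

# Names taken from the list above; the lookup order is an assumption.
PROJECT_ENV_VARS = ("GCP_PROJECT_ID", "GOOGLE_CLOUD_PROJECT", "GCLOUD_PROJECT")


def project_id_from_env(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-blank project ID found in the environment, else None."""
    for name in PROJECT_ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    # A caller would fall back to the gcloud CLI configuration
    # or the GCP metadata server at this point.
    return None
```

Passing the environment as a parameter keeps the lookup easy to exercise without modifying the real process environment.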
Optional: Advanced Configuration
For advanced customization, create a config.yaml file.
Configuration precedence: defaults → config.yaml → environment variables → CLI flags
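One way to picture this precedence is as a left-to-right merge in which each later source overrides the earlier ones. The sketch below is illustrative only: the key names are hypothetical, and treating None as "not set" is an assumption. The value 40 is the documented default for the bucket entity type ID.

```python
from typing import Any, Dict, Mapping


def resolve_config(*layers: Mapping[str, Any]) -> Dict[str, Any]:
    """Merge layers left to right; later layers override earlier ones.

    A value of None is treated as "not set" and never overrides.
    """
    merged: Dict[str, Any] = {}
    for layer in layers:
        for key, value in layer.items():
            if value is not None:
                merged[key] = value
    return merged


# Hypothetical keys and values for each configuration source.
defaults = {"dry_run": False, "bucket_type_id": 40}
from_yaml = {"bucket_type_id": 41}
from_env = {"project_id": "env-project"}
from_flags = {"project_id": "flag-project", "dry_run": True}

config = resolve_config(defaults, from_yaml, from_env, from_flags)
```

Here the CLI flag wins for project_id, config.yaml wins over the default for bucket_type_id, and untouched defaults pass through unchanged.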
Network Requirements
If your GCP project has firewall rules that restrict outbound connections, whitelist the Coalesce Quality egress IP addresses:
- EU Region (Default)
  - App: https://app.synq.io
  - API: https://developer.synq.io
  - Egress IP: 34.105.135.39
- US Region
Running the Integration
Basic Usage
Dry-Run Mode
Preview which entities would be created without making API calls. In dry-run mode, the integration scans GCS buckets and shows what would be created, but does not call the API or require credentials.
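The dry-run behavior described above can be approximated with a single flag that decides whether planned changes are printed or sent. This is a minimal sketch, not the integration's code: the function name and the upsert callback (a stand-in for the real API call) are hypothetical.

```python
from typing import Callable, Iterable, List


def sync_buckets(
    bucket_names: Iterable[str],
    dry_run: bool,
    upsert: Callable[[str], None],
) -> List[str]:
    """Plan one entity per bucket; call upsert only when dry_run is False."""
    identifiers = [f"gcs::{name}" for name in bucket_names]
    for identifier in identifiers:
        if dry_run:
            print(f"[dry-run] would create or update {identifier}")
        else:
            upsert(identifier)
    return identifiers


# In dry-run mode the callback is never invoked.
calls: List[str] = []
planned = sync_buckets(["raw-events", "exports"], dry_run=True, upsert=calls.append)
```

Because the API call is injected as a callback, the dry-run path never needs a client or credentials to be constructed.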
Command-Line Options
Common flags:
- --dry-run: Preview changes without calling the API
- --gcp.project-id: Specify GCP project ID
- --filter.buckets.include: Bucket name patterns to include
- --filter.buckets.exclude: Bucket name patterns to exclude
- --types.bucket-type-id: Custom entity type ID (default: 40)
- --synq.endpoint: API endpoint (EU or US region)
Run ./synq-google-cloud-storage --help for all available options.
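The include and exclude flags can be thought of as a two-stage filter over bucket names. The sketch below assumes glob-style patterns, that an empty include list means "include everything", and that exclude takes priority over include; none of these details is confirmed here, so check the --help output for the actual pattern syntax.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence


def filter_buckets(
    names: Iterable[str],
    include: Sequence[str] = (),
    exclude: Sequence[str] = (),
) -> List[str]:
    """Keep names matching any include pattern and no exclude pattern."""
    kept = []
    for name in names:
        # An empty include list is assumed to match every bucket.
        included = not include or any(fnmatchcase(name, p) for p in include)
        excluded = any(fnmatchcase(name, p) for p in exclude)
        if included and not excluded:
            kept.append(name)
    return kept


selected = filter_buckets(
    ["prod-logs", "prod-data", "dev-data"],
    include=["prod-*"],
    exclude=["*-logs"],
)
```

With these example patterns, only prod-data survives: dev-data fails the include pattern and prod-logs is dropped by the exclude pattern.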
How It Works
- Authenticates with the Coalesce Quality API using OAuth2 client credentials
- Creates or updates the custom entity type for GCS buckets
- Discovers all GCS buckets in your project
- Creates entities with comprehensive metadata
- Uses entity groups to enable automatic cleanup of deleted buckets
Entity Groups
The integration uses entity groups to track all entities created in each run. When the group is updated, Coalesce Quality automatically removes entities that were in the previous group but not in the current one, enabling automatic cleanup of deleted buckets.
Custom Identifiers
All entities use custom identifiers with the gcs:: prefix for namespace isolation. Buckets use identifiers in the format: gcs::<bucket_name>.
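The identifier format and the group-based cleanup can be illustrated together. In the sketch below, the identifier helper follows the gcs::<bucket_name> format stated above, while the set difference only models the removal that Coalesce Quality performs when a group is updated; the function names and bucket names are hypothetical.

```python
from typing import Iterable, Set

ID_PREFIX = "gcs::"


def bucket_identifier(bucket_name: str) -> str:
    """Build the custom identifier for a bucket: gcs::<bucket_name>."""
    return f"{ID_PREFIX}{bucket_name}"


def stale_identifiers(
    previous_group: Iterable[str], current_group: Iterable[str]
) -> Set[str]:
    """Identifiers in the previous group that are missing from the current one."""
    return set(previous_group) - set(current_group)


# The tmp-scratch bucket was deleted between the two runs.
previous = {bucket_identifier(n) for n in ("raw-events", "exports", "tmp-scratch")}
current = {bucket_identifier(n) for n in ("raw-events", "exports")}
removed = stale_identifiers(previous, current)
```

Only the entity for the deleted bucket ends up in the removed set, which matches the cleanup behavior described for entity groups.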
Scheduling with Cron
To keep your bucket inventory up to date, schedule the integration to run periodically.
Using as a Public API Example
This integration demonstrates best practices for using the Coalesce Quality Public API:
- OAuth2 authentication with client credentials
- Creating and managing custom entity types
- Using entity groups for automatic cleanup
- Implementing filtering and dry-run modes
- Handling regional deployments
Troubleshooting
Authentication errors
Verify your credentials are correct in the .env file. Ensure you're using the correct endpoint for your region (EU or US).
GCP project not detected
Set the GCP_PROJECT_ID environment variable explicitly or configure the gcloud CLI with your project.
Permission denied errors
Ensure your GCP credentials have the storage.buckets.list permission and can read bucket metadata.
Network connectivity issues
If running behind a firewall, verify that outbound connections to the API endpoint are allowed. Check your network configuration and firewall rules.
Support
For issues or questions:
- GitHub Issues: github.com/getsynq/synq-google-cloud-storage/issues
- Coalesce Quality Support: support