BigLake tables for Apache Iceberg (hereafter BigLake Iceberg tables) are Iceberg tables that you create from open source engines and store in Cloud Storage. Like all tables that use BigLake metastore, they can be read by open source engines and BigQuery. However, the open source engine that created the table is the only engine that can write to it.
Architecture
A data lakehouse built with BigLake consists of the following components:
- Storage: Cloud Storage and BigQuery storage serve as the storage layer, with Apache Iceberg as the recommended open table format for Cloud Storage.
- Metastore: BigLake metastore provides a single source of truth for managing metadata across multiple engines.
- Query engine: BigQuery, Apache Spark, Apache Flink, Trino, and other open source engines are compatible with BigLake.
- Governance: Dataplex Universal Catalog provides centralized security and governance policies.
- Data writing and analytics tools: Engines and tools integrated with BigLake provide multiple paths for data ingestion and analysis.
Key capabilities
Managing BigLake Iceberg tables with BigLake metastore, a component of BigLake, provides several advantages for data management and analysis:
- Engine interoperability: BigLake metastore enables multiple engines, including Apache Spark, Apache Flink, and BigQuery, to share tables and metadata without copying files.
- Storage access delegation: BigLake metastore supports storage access delegation (credential vending), which improves security by removing the need for direct Cloud Storage bucket access.
- Unified governance: BigLake metastore integrates with Dataplex Universal Catalog for centralized governance, lineage, and data quality.
- High-performance analytics: When used with BigQuery, BigLake metastore supports high-performance analytics, streaming, and AI workloads.
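To make engine interoperability and storage access delegation more concrete, the following is a minimal sketch of how a Spark job might be configured to reach an Iceberg REST catalog and request vended credentials. The property names follow the Apache Iceberg Spark and REST catalog conventions. The helper `iceberg_rest_catalog_conf`, the catalog name, the URI, and the bucket are hypothetical placeholders, and authentication and any Google Cloud-specific settings are deliberately left out, so treat this as an illustration rather than a complete configuration.

```python
from typing import Dict


def iceberg_rest_catalog_conf(
    catalog: str,
    uri: str,
    warehouse: str,
    vended_credentials: bool = True,
) -> Dict[str, str]:
    """Build Spark properties for an Iceberg REST catalog (hypothetical helper)."""
    prefix = f"spark.sql.catalog.{catalog}"
    conf = {
        # Register the catalog name with Iceberg's Spark catalog implementation.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # Use the REST catalog client instead of a Hive or Hadoop catalog.
        f"{prefix}.type": "rest",
        f"{prefix}.uri": uri,
        f"{prefix}.warehouse": warehouse,
    }
    if vended_credentials:
        # Properties under "header." are sent as HTTP headers. This header asks
        # the catalog to vend storage credentials, so the engine does not need
        # its own direct access to the bucket.
        conf[f"{prefix}.header.X-Iceberg-Access-Delegation"] = "vended-credentials"
    return conf


# Placeholder values (assumptions): replace with your catalog endpoint and bucket.
conf = iceberg_rest_catalog_conf(
    catalog="lakehouse",
    uri="https://example.invalid/iceberg/rest",
    warehouse="gs://example-bucket/warehouse",
)
for key, value in sorted(conf.items()):
    print(f"{key}={value}")
```

Each resulting key-value pair could be passed to Spark, for example with `--conf` on `spark-submit` or through `SparkSession.builder.config(key, value)`. Because other engines pointed at the same catalog resolve the same table metadata, no files need to be copied between them.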
Pricing
For pricing details, see BigLake pricing.
What's next
- Manage BigLake Iceberg tables in BigQuery.
- Learn about the Iceberg REST catalog.