This module allows managing a single BigQuery dataset, including access configuration, tables and views.
- Simple dataset with access configuration
- IAM roles
- Authorized Views, Datasets, and Routines
- Dataset options
- Tables, views and routines
- Tag bindings
- TODO
- Variables
- Outputs
Access configuration defaults to using the separate google_bigquery_dataset_access
resource, so as to leave the default dataset access rules untouched.
You can choose to manage the google_bigquery_dataset
access rules instead via the dataset_access
variable, but be sure to always have at least one OWNER
access and to avoid duplicating accesses, or terraform apply
will fail.
The access variables are split into access
and access_identities
variables, so that dynamic values can be passed in for identities (eg a service account email generated by a different module or resource).
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
access = {
reader-group = { role = "READER", type = "group" }
owner = { role = "OWNER", type = "user" }
project_owners = { role = "OWNER", type = "special_group" }
view_1 = { role = "READER", type = "view" }
}
access_identities = {
reader-group = "[email protected]"
owner = "[email protected]"
project_owners = "projectOwners"
view_1 = "my-project|my_dataset|my-table"
}
}
# tftest modules=1 resources=5 inventory=simple.yaml
Access configuration can also be specified via IAM instead of basic roles via the iam
variable. When using IAM, basic roles cannot be used via the access
family variables.
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
iam = {
"roles/bigquery.dataOwner" = ["user:[email protected]"]
}
}
# tftest modules=1 resources=2 inventory=iam.yaml
You can specify authorized views, datasets, and routines via the authorized_views
, authorized_datasets
and authorized_routines
variables, respectively.
// Create private BigQuery dataset that will not be publicly accessible, except via the authorized BigQuery resources
module "bigquery-dataset-private" {
source = "./fabric/modules/bigquery-dataset"
project_id = "private_project"
id = "private_dataset"
authorized_views = [
{
project_id = "auth_view_project"
dataset_id = "auth_view_dataset"
table_id = "auth_view"
}
]
authorized_datasets = [
{
project_id = "auth_dataset_project"
dataset_id = "auth_dataset"
}
]
authorized_routines = [
{
project_id = "auth_routine_project"
dataset_id = "auth_routine_dataset"
routine_id = "auth_routine"
}
]
}
// Create authorized view in a public dataset
module "bigquery-authorized-views-dataset-public" {
source = "./fabric/modules/bigquery-dataset"
project_id = "auth_view_project"
id = "auth_view_dataset"
views = {
auth_view = {
friendly_name = "Public"
labels = {}
query = "SELECT * FROM `private_project.private_dataset.private_table`"
use_legacy_sql = false
deletion_protection = true
}
}
}
// Create public authorized dataset
module "bigquery-authorized-dataset-public" {
source = "./fabric/modules/bigquery-dataset"
project_id = "auth_dataset_project"
id = "auth_dataset"
}
// Create public authorized routine
module "bigquery-authorized-authorized-routine-dataset-public" {
source = "./fabric/modules/bigquery-dataset"
project_id = "auth_routine_project"
id = "auth_routine_dataset"
}
resource "google_bigquery_routine" "public-routine" {
project = "private_project"
dataset_id = module.bigquery-authorized-authorized-routine-dataset-public.dataset_id
routine_id = "auth_routine"
routine_type = "TABLE_VALUED_FUNCTION"
language = "SQL"
definition_body = <<-EOS
SELECT 1 + value AS value
EOS
arguments {
name = "value"
argument_kind = "FIXED_TYPE"
data_type = jsonencode({ "typeKind" = "INT64" })
}
return_table_type = jsonencode({ "columns" = [
{ "name" = "value", "type" = { "typeKind" = "INT64" } },
] })
}
# tftest modules=4 resources=9 inventory=authorized_resources.yaml
Authorized views can be specified both using the standard access
options and the authorized_views
blocks. The example configuration below uses both blocks, and will create a dataset with three authorized views view_id_1
, view_id_2
, and view_id_3
.
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
authorized_views = [
{
project_id = "view_project"
dataset_id = "view_dataset"
table_id = "view_id_1"
},
{
project_id = "view_project"
dataset_id = "view_dataset"
table_id = "view_id_2"
}
]
access = {
view_2 = { role = "READER", type = "view" }
view_3 = { role = "READER", type = "view" }
}
access_identities = {
view_2 = "view_project|view_dataset|view_id_2"
view_3 = "view_project|view_dataset|view_id_3"
}
}
# tftest modules=1 resources=4 inventory=authorized_resources_views.yaml
Dataset options are set via the options
variable. all options must be specified, but a null
value can be set to options that need to use defaults.
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
options = {
default_table_expiration_ms = 3600000
default_partition_expiration_ms = null
delete_contents_on_destroy = false
max_time_travel_hours = 168
}
}
# tftest modules=1 resources=1 inventory=options.yaml
Tables are created via the tables
variable. Support for external tables will be added in a future release.
locals {
countries_schema = jsonencode([
{ name = "country", type = "STRING" },
{ name = "population", type = "INT64" },
])
}
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
tables = {
countries = {
friendly_name = "Countries"
schema = local.countries_schema
deletion_protection = true
}
}
}
# tftest modules=1 resources=2 inventory=tables.yaml
If partitioning is needed, populate the partitioning
variable using either the time
or range
attribute.
locals {
countries_schema = jsonencode([
{ name = "country", type = "STRING" },
{ name = "population", type = "INT64" },
])
}
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
tables = {
table_a = {
deletion_protection = true
friendly_name = "Table a"
schema = local.countries_schema
partitioning = {
time = { type = "DAY", expiration_ms = null }
}
}
}
}
# tftest modules=1 resources=2 inventory=partitioning.yaml
To create views use the views
variable. If you're querying a table created by the same module terraform apply
will initially fail and eventually succeed once the underlying table has been created. You can probably also use the module's output in the view's query to create a dependency on the table.
locals {
countries_schema = jsonencode([
{ name = "country", type = "STRING" },
{ name = "population", type = "INT64" },
])
}
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
tables = {
countries = {
friendly_name = "Countries"
schema = local.countries_schema
deletion_protection = true
}
}
views = {
population = {
friendly_name = "Population"
query = "SELECT SUM(population) FROM my_dataset.countries"
use_legacy_sql = false
deletion_protection = true
}
}
}
# tftest modules=1 resources=3 inventory=views.yaml
To create routines use the routines
variable.
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
routines = {
custom_masking_routine = {
routine_type = "SCALAR_FUNCTION"
language = "SQL"
data_governance_type = "DATA_MASKING"
definition_body = "SAFE.REGEXP_REPLACE(ssn, '[0-9]', 'X')"
return_type = "{\"typeKind\" : \"STRING\"}"
arguments = {
ssn = {
data_type = "{\"typeKind\" : \"STRING\"}"
}
}
}
}
}
# tftest modules=1 resources=2 inventory=routines.yaml
Refer to the Creating and managing tags documentation for details on usage.
module "org" {
source = "./fabric/modules/organization"
organization_id = var.organization_id
tags = {
environment = {
description = "Environment specification."
values = {
dev = {}
prod = {}
sandbox = {}
}
}
}
}
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
tag_bindings = {
env-sandbox = module.org.tag_values["environment/sandbox"].id
}
}
# tftest modules=2 resources=6
- check for dynamic values in tables and views
- add support for external tables
name | description | type | required | default |
---|---|---|---|---|
id | Dataset id. | string |
✓ | |
project_id | Id of the project where datasets will be created. | string |
✓ | |
access | Map of access rules with role and identity type. Keys are arbitrary and must match those in the access_identities variable, types are domain , group , special_group , user , view . |
map(object({…})) |
{} |
|
access_identities | Map of access identities used for basic access roles. View identities have the format 'project_id|dataset_id|table_id'. | map(string) |
{} |
|
authorized_datasets | An array of datasets to be authorized on the dataset. | list(object({…})) |
[] |
|
authorized_routines | An array of routines to be authorized on the dataset. | list(object({…})) |
[] |
|
authorized_views | An array of views to be authorized on the dataset. | list(object({…})) |
[] |
|
dataset_access | Set access in the dataset resource instead of using separate resources. | bool |
false |
|
description | Optional description. | string |
"Terraform managed." |
|
encryption_key | Self link of the KMS key that will be used to protect destination table. | string |
null |
|
friendly_name | Dataset friendly name. | string |
null |
|
iam | IAM bindings in {ROLE => [MEMBERS]} format. Mutually exclusive with the access_* variables used for basic roles. | map(list(string)) |
{} |
|
labels | Dataset labels. | map(string) |
{} |
|
location | Dataset location. | string |
"EU" |
|
materialized_views | Materialized views definitions. | map(object({…})) |
{} |
|
options | Dataset options. | object({…}) |
{} |
|
routines | Routine definitions. | map(object({…})) |
{} |
|
tables | Table definitions. Options and partitioning default to null. Partitioning can only use range or time , set the unused one to null. |
map(object({…})) |
{} |
|
tag_bindings | Tag bindings for this dataset, in key => tag value id format. | map(string) |
{} |
|
views | View definitions. | map(object({…})) |
{} |
name | description | sensitive |
---|---|---|
dataset | Dataset resource. | |
dataset_id | Dataset id. | |
id | Fully qualified dataset id. | |
materialized_view_ids | Map of fully qualified materialized view ids keyed by view ids. | |
materialized_views | Materialized view resources. | |
routine_ids | Map of fully qualified routine ids keyed by routine ids. | |
routines | Routine resources. | |
self_link | Dataset self link. | |
table_ids | Map of fully qualified table ids keyed by table ids. | |
tables | Table resources. | |
view_ids | Map of fully qualified view ids keyed by view ids. | |
views | View resources. |