How to copy BigQuery tables between locations with PDI

Suppose you have accidentally created a dataset in US region instead of EU. Your dataset has few tables with large amount of data. How do you copy or move data from one region to another.

You could use command line tool bq cp:

bq cp source_project_id:dataset.table_name destination_project_id:dataset.table_name

Unfortunately the copy command does not support cross region copies. You can only copy within the same region. There are number of ways of achieving this, you can use:

  1. BigQuery Transfer Service
  2. Cloud Composer
  3. Cloud Dataflow
  4. Custom ETL tool like PDI

Using PDI

Assuming your PDI is configured to read data from BiqQuery and store data in Cloud Storage.

It took just over 20mins to retrieve 1.6 GB of data and save it as avro compressed (snappy) file in Google bucket. The output file size was 365MB. From Google bucket to BigQuery it took 80 secs.