Copy data from local MySQL database to Google Cloud MySQL

The main job performs two tasks:

  • insert the latest records into the target database
  • update the target database with delta records

PDI crashes if it tries to load more than a few million records in a single run. To overcome this limitation, the job processes the data in smaller chunks.
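To make the chunking idea concrete, here is a minimal sketch of a paged read in plain MySQL. The table name customers, the key column id, and the chunk size are hypothetical stand-ins; in the job they would come from PDI variables (${VAR} substitution):

    -- Read one chunk of rows instead of the whole table.
    -- In the job, 100000 and 0 would be PDI variables such as
    -- ${CHUNK_SIZE} and ${OFFSET}, advanced on every loop pass.
    SELECT *
    FROM customers        -- hypothetical source table
    ORDER BY id           -- stable ordering so pages do not overlap
    LIMIT 100000          -- chunk size
    OFFSET 0;             -- moves forward by the chunk size each pass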

How it works

  1. get the tables and their corresponding fields from the INFORMATION_SCHEMA database and copy the rows to the result (see the metadata query sketch after this list)
  2. get the data from the result and set variables
  3. loop
    • prepare paging
    • evaluate the load type
    • if it is a delta load, update the target tables (see the upsert sketch after this list)
    • otherwise, insert the data into the appropriate table
    • continue until all records are processed
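A sketch of the kind of metadata query step 1 could run, assuming a source schema named source_db; the schema name, selected columns, and any filters in the actual job may differ:

    -- List every table and its columns in the source schema,
    -- so the loop knows what to copy and in which column order.
    SELECT c.TABLE_NAME,
           c.COLUMN_NAME,
           c.ORDINAL_POSITION
    FROM INFORMATION_SCHEMA.COLUMNS AS c
    WHERE c.TABLE_SCHEMA = 'source_db'   -- hypothetical schema name
    ORDER BY c.TABLE_NAME, c.ORDINAL_POSITION;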
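The job itself evaluates the load type and runs the update inside PDI; purely as an illustration, one way to express the delta branch in MySQL is an upsert. The names below (target_db, source_db, customers, id, name, updated_at) and the watermark date are hypothetical:

    -- Copy delta rows into the target table: rows with new keys are
    -- inserted, rows with existing keys are updated in place.
    INSERT INTO target_db.customers (id, name, updated_at)
    SELECT id, name, updated_at
    FROM source_db.customers
    WHERE updated_at >= '2024-01-01'     -- hypothetical delta watermark
    ON DUPLICATE KEY UPDATE
        name       = VALUES(name),
        updated_at = VALUES(updated_at);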
Main PDI job

Get metadata from INFORMATION_SCHEMA

PDI job for running the load in smaller batches