.. _tap-snowflake: Tap Snowflake ------------- Connecting to Snowflake ''''''''''''''''''''''' .. warning:: This section of the documentation is work in progress. Configuring what to replicate ''''''''''''''''''''''''''''' PipelineWise configures every tap with a common structured YAML file format. A sample YAML for Snowflake replication can be generated into a project directory by following the steps in the :ref:`generating_pipelines` section. Example YAML for tap-snowflake: .. code-block:: yaml --- # ------------------------------------------------------------------------------ # General Properties # ------------------------------------------------------------------------------ id: "snowflake_sample" # Unique identifier of the tap name: "Sample Snowflake Database Tap" # Name of the tap type: "tap-snowflake" # !! THIS SHOULD NOT CHANGE !! owner: "somebody@foo.com" # Data owner to contact #send_alert: False # Optional: Disable all configured alerts on this tap #slack_alert_channel: "#tap-channel" # Optional: Sending a copy of specific tap alerts to this slack channel # ------------------------------------------------------------------------------ # Source (Tap) - Snowflake connection details # ------------------------------------------------------------------------------ db_conn: account: "" # Snowflake host dbname: "" # Snowflake database name user: "" # Snowflake user password: "" # Plain string or vault encrypted warehouse: "" # Snowflake warehouse # ------------------------------------------------------------------------------ # Destination (Target) - Target properties # Connection details should be in the relevant target YAML file # ------------------------------------------------------------------------------ target: "snowflake" # ID of the target connector where the data will be loaded batch_size_rows: 20000 # Batch size for the stream to optimise load performance stream_buffer_size: 0 # In-memory buffer size (MB) between taps and targets for asynchronous data pipes #batch_wait_limit_seconds: 3600 # Optional: Maximum time to wait for `batch_size_rows`. Available only for snowflake target. # Options only for Snowflake target #archive_load_files: False # Optional: when enabled, the files loaded to Snowflake will also be stored in `archive_load_files_s3_bucket` #archive_load_files_s3_prefix: "archive" # Optional: When `archive_load_files` is enabled, the archived files will be placed in the archive S3 bucket under this prefix. #archive_load_files_s3_bucket: "" # Optional: When `archive_load_files` is enabled, the archived files will be placed in this bucket. (Default: the value of `s3_bucket` in target snowflake YAML) # ------------------------------------------------------------------------------ # Source to target Schema mapping # ------------------------------------------------------------------------------ schemas: - source_schema: "SCHEMA_1" # Source schema (aka. database) in Snowflake with tables target_schema: "REPL_SCHEMA_1" # Target schema in the destination Data Warehouse # List of tables to replicate from Snowflake to a destination # # Please check the Replication Strategies section in the documentation to understand the differences. tables: - table_name: "TABLE_ONE" replication_method: "INCREMENTAL" # One of INCREMENTAL or FULL_TABLE replication_key: "last_update" # Important: Incremental load always needs replication key # OPTIONAL: Load time transformations #transformations: # - column: "last_name" # Column to transform # type: "SET-NULL" # Transformation type # You can add as many tables as you need... - table_name: "TABLE_TWO" replication_method: "FULL_TABLE" # You can add as many schemas as you need... # Uncomment this if you want replicate tables from multiple schemas #- source_schema: "SCHEMA_2" # target_schema: "REPL_SCHEMA_2"