Documentation

PipelineWise is a Data Pipeline Framework using the Singer.io specification to ingest and replicate data from various sources to various destinations.

Features

  • Built with ELT in mind: PipelineWise fits into the ELT landscape and does not do traditional ETL. It first loads data into the Data Warehouse (DWH) in its original format, and the “transformation” step shifts to the end of the data pipeline. Load-time transformations are still supported, but complex mappings and joins have to be done once the data has been replicated into the Data Warehouse.

  • Lightweight: No daemons or database setup are required

  • Replication Methods: CDC (Log Based), Key-Based Incremental and Full Table snapshots

  • Managed Schema Changes: When source data changes, PipelineWise detects the change and alters the schema in your DWH automatically

  • Load-time transformations: An ideal place to obfuscate, mask or filter sensitive data that should never be replicated into the Data Warehouse

  • YAML-based configuration: Data pipelines are defined as YAML files, so the entire configuration can be kept under version control

  • Integration with external tools: With built-in event handlers you can trigger external scripts automatically when a certain event occurs

  • Extensible: PipelineWise uses Singer.io-compatible tap and target connectors. New connectors can be added to PipelineWise with relatively little effort
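To make the Key-Based Incremental replication method from the list above more concrete, here is a minimal Python sketch of the general idea: only rows whose replication key is newer than a stored bookmark are selected, and the bookmark then advances. This is not PipelineWise's implementation; `incremental_sync`, `replication_key` and `bookmark` are hypothetical names chosen for illustration, and the strict `>` comparison is a simplification (a production connector might compare with `>=` and deduplicate on the primary key).

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple

Row = Dict[str, Any]


def incremental_sync(
    rows: Iterable[Row],
    replication_key: str,
    bookmark: Optional[Any] = None,
) -> Tuple[List[Row], Optional[Any]]:
    """Select rows newer than the bookmark and return them with the new bookmark.

    Hypothetical helper for illustration only, not a PipelineWise API.
    """
    # With no bookmark yet (first run) every row qualifies, which makes the
    # first incremental run behave like a full table snapshot.
    selected = [
        row for row in rows
        if bookmark is None or row[replication_key] > bookmark
    ]
    # Process rows in key order so the last row carries the highest key.
    selected.sort(key=lambda row: row[replication_key])
    # Keep the old bookmark when nothing new was found.
    new_bookmark = selected[-1][replication_key] if selected else bookmark
    return selected, new_bookmark
```

Calling `incremental_sync(rows, "updated_at")` on a first run returns every row; passing the returned bookmark into the next call returns only rows whose key has moved past it. A Full Table snapshot, by contrast, would copy all rows on every run regardless of any bookmark.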

Beyond the Horizon

PipelineWise is built on top of several Singer.io components. Each Singer.io component is responsible for a specific task, such as extracting data from a particular data source or loading data into a particular destination. To replicate data end to end, however, you need an extra layer on top of these components to run the jobs, create configurations, select streams to replicate, handle logging and more.

This is where PipelineWise comes in. PipelineWise combines a collection of pre-selected Singer taps and targets with the functionality required to create, run and maintain data pipelines in a production Data Warehouse environment without the extra hassle.
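As a rough illustration of how Singer components divide the work, the sketch below imitates a tap that emits SCHEMA, RECORD and STATE messages as JSON lines and a target that consumes them. The three message types come from the Singer specification; everything else is an assumption made for this example: `tap_messages` and `target_load` are hypothetical functions, the `bookmarks` layout inside the STATE value is made up, and the "target" only collects rows in memory. In a real pipeline the tap writes to stdout and the target reads from stdin as separate processes.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def tap_messages(stream: str, rows: List[Dict[str, Any]]) -> Iterator[str]:
    """Yield Singer-style JSON lines: one SCHEMA, the RECORDs, then a STATE."""
    yield json.dumps({
        "type": "SCHEMA",
        "stream": stream,
        "schema": {
            "properties": {
                "id": {"type": "integer"},
                "name": {"type": "string"},
            }
        },
        "key_properties": ["id"],
    })
    for row in rows:
        yield json.dumps({"type": "RECORD", "stream": stream, "record": row})
    # Hypothetical state layout: remember the last replicated id per stream.
    last_id = rows[-1]["id"] if rows else None
    yield json.dumps({
        "type": "STATE",
        "value": {"bookmarks": {stream: {"last_id": last_id}}},
    })


def target_load(lines: Iterable[str]) -> Dict[str, Any]:
    """Consume Singer-style JSON lines and collect records per stream."""
    tables: Dict[str, List[Dict[str, Any]]] = {}
    state = None
    for line in lines:
        message = json.loads(line)
        if message["type"] == "SCHEMA":
            # A real target would create or alter the destination table here.
            tables.setdefault(message["stream"], [])
        elif message["type"] == "RECORD":
            # Assumes the stream's SCHEMA message arrived earlier.
            tables[message["stream"]].append(message["record"])
        elif message["type"] == "STATE":
            # The latest state is what a runner would persist between runs.
            state = message["value"]
    return {"tables": tables, "state": state}
```

For example, `target_load(tap_messages("users", rows))` wires the two together in-process. Scheduling such runs, persisting the state between them and generating the configuration are the kind of tasks left to the extra layer described above.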

Taps (Data Source Connectors)

PipelineWise can replicate data from the following data sources:

Targets (Destination Connectors)

PipelineWise can replicate data into the following destinations:
