What is data transformation

Conditional Split Transformation

The conditional split transformation routes data rows to different streams based on matching conditions

The conditional split transformation is similar to a CASE decision structure in a programming language.

The transformation evaluates expressions, and based on the results, directs the data row to the specified stream

Exists Transformation

The exists transformation is a row filtering transformation that checks whether your data exists in another source or stream.

The output stream includes all rows in the left stream that either exists or don’t exist in the right stream.

Union Transformation

Union will combine multiple data streams into one, with the SQL Union of those streams as the new output from the Union transformation

All of the schema from each input stream will be combined inside of your data flow, without needing to have a join key

Join Transformation

Use the join transformation to combine data from two sources or streams in a mapping data flow.

The output stream will include all columns from both sources matched based on a join condition.

Aggregate Transformation

The Aggregate transformation defines aggrins in your data streams.

Using the Expression Builder, you can define different types of aggregations such as SUM, MIN, MAX, and COUNT grouped by existing or computed columns.

Surrogate Key Transformation

Use the surrogate key transformation to add an incrementing key value to each row of data. This is useful when designing dimension tables in a star schema analytical data model.

In a star schema, each member in your dimension tables requires a unique key that is a non-business key.

Select Transformation

Use the select transformation to rename, drop, or reorder columns.

This transformation doesn’t alter row data, but chooses which columns are propagated downstream.

Lookup Transformation

A lookup transformation is similar to a left outer join

All rows from the primary stream will exist in the output stream with additional columns from the lookup stream

Derived Column Transformation

Use the derived column transformation to generate new columns in your data flow or to modify existing fields

Pivot Transformation

Use the pivot transformation to create multiple columns from the unique row value of a single column.

Pivot is an aggregation transformation where you select group by columns and generate pivot columns using aggregate functions.

Unpivot Transformation

Use Unpivot in a mapping data flow as a way to turn an unnormalized dataset into a more normalized version by expanding values from multiple columns in a single record into multiple records with the same values in a single column.

Rank Transformation

Use the rank transformation to generate an ordered ranking based upon sort conditions specified by the user.

Window Transformation

The Window transformation is where you will define window-based aggregations of columns in your data stream.

In the Expression Builder, you can define different types of aggregations that are based on data or time windows (SQL, OVER clause) such as LEAD, LAG, NTILE, CUMEDIST, RANK, etc.

A new field will be generated in your output that includes these aggregations.

Flatten Transformation

Use the flatten transformation to take array values inside hierarchical structures such as JSON and unroll them into individual rows.

This procss is known as denormalization.

Filter Transformation

The Filter transforms allows row filtering based upon a condition.
The output stream includes all rows that matching the filter condition.

the filter transformation is similar to a WHERE clause in SQL.