Monday, 22 February 2016

Pushdown Optimization in Informatica Power Center and Big data edition



Pushdown Optimization in Informatica Power Center and Big data edition


Pushdown optimization is a feature in Informatica which enables the Informatica mapping to be converted into a SQL code and executed on the database. It is useful when you are dealing with huge volumes of data from the source and instead of reading all the data to the Informatica server it might be better to do some of the transformation logic such as filtering, aggregation, joins on the database itself. Also, useful in cases when you have powerful data warehouse appliance such as Netezza or Teradata.



It might be possible to push the complete mapping to the database such as in Informatica Big data (Hive mode) where the whole mapping is converted to SQL (select, insert, update) statements or to partially convert some of the code to SQL like in traditional powercenter and execute on the source/target database . Integration Service  can analyze the mapping and writes one or more SQL statements based on the mapping transformation logic. Pushdown Optimization Viewer or the session logs can be  used  to preview the SQL statements and mapping logic that the Integration Service can push to the source or target database.


Types of Pushdown Optimization in Traditional Powercenter


1.   Source Side Pushdown Optimization

2.   Target Side Pushdown Optimization

3.   Full Pushdown Optimization



1    1. Source-Side Pushdown Optimization


a.   Integration Service analyzes the mapping from the source to the target.

b.   Generates a SELECT statement based on the transformation logic for each transformation it can push to the database from the source or until it reaches a downstream transformation it cannot push to the database.

2    2. Target-Side Pushdown Optimization


a.   Integration Service analyzes the mapping from the target to the source 

b.   Generates a INSERT/UPDATE /DELETE statement based on the transformation logic for each transformation it can push to the database from the target or until it reaches a upstream transformation it cannot push to the database.

3    3. Full Pushdown Optimization


a.   Integration Service analyzes the mapping starting with the source and analyzes each transformation in the pipeline until it analyzes the target.

b.   Generates SQL statements that are executed against the source and target database based on the transformation logic it can push to the database.

c.   Source and Target must be on the same database

Pushdown optimization in Informatica Big Data Edition: 


In Informatica Big data edition, the whole logic is converted to HSQL statements and executed on Hive. The source and target need not be the same database. If the source and target are different database such as SQLserver or oracle, then it creates temporary Hive source and target tables on Hive and reads at the source and target data to those temporary hive tables and then performs rest of the transformation present in the mapping as HSQL statements.This feature is turn on by default in Hive mode of Informatica Big Data Edition.

Disadvantage of pushdown optimization:

1) It does not work for all database
2) You need a  pushdown optimization license
3)  It does not always generate optimal query
 

No comments:

Post a Comment