Pushdown Optimization in Informatica Power Center and Big data edition
Pushdown optimization is a feature in Informatica which
enables the Informatica mapping to be converted into a SQL code and executed on
the database. It is useful when you are dealing with huge volumes of data from
the source and instead of reading all the data to the Informatica server it
might be better to do some of the transformation logic such as filtering,
aggregation, joins on the database itself. Also, useful in cases when you have powerful data warehouse appliance such as Netezza or Teradata.
It might be possible to push the
complete mapping to the database such as in Informatica Big data (Hive mode)
where the whole mapping is converted to SQL (select, insert, update) statements
or to partially convert some of the code to SQL like in traditional powercenter
and execute on the source/target database . Integration Service can analyze the mapping and writes one or more
SQL statements based on the mapping transformation logic. Pushdown Optimization
Viewer or the session logs can be used to
preview the SQL statements and mapping logic that the Integration Service can
push to the source or target database.
Types of Pushdown Optimization in Traditional Powercenter
1. Source Side Pushdown Optimization
2. Target Side Pushdown Optimization
3. Full Pushdown Optimization
1 1. Source-Side Pushdown Optimization
a. Integration Service analyzes the mapping from
the source to the target.
b. Generates a SELECT statement based on the
transformation logic for each transformation it can push to the database from
the source or until it reaches a downstream transformation it cannot push to
the database.
2 2. Target-Side Pushdown Optimization
a.
Integration Service analyzes the mapping from
the target to the source
b.
Generates a INSERT/UPDATE /DELETE statement
based on the transformation logic for each transformation it can push to the
database from the target or until it reaches a upstream transformation it
cannot push to the database.
3 3. Full Pushdown Optimization
a. Integration Service analyzes the mapping
starting with the source and analyzes each transformation in the pipeline until
it analyzes the target.
b. Generates SQL statements that are executed
against the source and target database based on the transformation logic it can
push to the database.
c. Source and Target must be on the same database
Pushdown optimization in Informatica Big Data Edition:
In Informatica Big data edition, the whole logic is
converted to HSQL statements and executed on Hive. The source and target need
not be the same database. If the source and target are different database such
as SQLserver or oracle, then it creates temporary Hive source and target tables
on Hive and reads at the source and target data to those temporary hive tables
and then performs rest of the transformation present in the mapping as HSQL
statements.This feature is turn on by default in Hive mode of Informatica Big Data Edition.
Disadvantage of pushdown optimization:
1) It does not work for all database
2) You need a pushdown optimization license
3) It does not always generate optimal query
Disadvantage of pushdown optimization:
1) It does not work for all database
2) You need a pushdown optimization license
3) It does not always generate optimal query
No comments:
Post a Comment