Thursday, 5 February 2015

Informatica version 9.5.1 important features and enhancements for big data and social media

The enhancements listed in this post are taken from the Informatica version 9.5.1 hotfix3 new features guide. Only some important features for Hadoop and social media integration are listed here. For more details, check the original new features guide.


Enhancements for Informatica version 9.5.1 hotfix3:

PowerExchange for Hadoop: Supports IBM InfoSphere BigInsights version 2.1 to access Hadoop sources and targets.
PowerExchange for MongoDB for PowerCenter: Supports extracting data from and loading data to a MongoDB database.

Enhancements for Informatica version 9.5.1 hotfix2:

PowerExchange for Hadoop: Supports MapR 2.1.2 and EMC Greenplum PivotalHD 2.0.1 to access Hadoop sources and targets.

PowerCenter Big Data Edition: You can run mappings in a Hive environment with the MapR 2.1.2 and Hortonworks 1.1 distributions.

PowerExchange for Facebook: Uses the Facebook API to control the number of rows that you request when a mapping runs.

PowerExchange for LinkedIn: Uses the LinkedIn API to control the number of rows that you request when a mapping runs.

PowerExchange for Twitter: Uses the Twitter API to control the number of rows that you request when a mapping runs.

Data Masking transformation: Phone number masking can now be applied to integer and bigint data.
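To make the idea of masking a phone number stored as an integer more concrete, here is a minimal Python sketch. It is not Informatica's masking algorithm, which the new features guide does not describe; the function name `mask_phone_number` and the rule of keeping the first digit (so that the masked value keeps the same number of digits) are assumptions made for illustration.

```python
import random
from typing import Optional


def mask_phone_number(phone: int, seed: Optional[int] = None) -> int:
    """Replace the digits of an integer phone number with random digits.

    Hypothetical helper, not an Informatica API. The first digit is kept
    so the result has the same number of digits as the input and never
    starts with a zero (which an integer column could not store).
    """
    if phone < 0:
        raise ValueError("phone number must be non-negative")
    rng = random.Random(seed)  # a fixed seed makes the sketch repeatable
    digits = str(phone)
    masked_tail = "".join(str(rng.randint(0, 9)) for _ in digits[1:])
    return int(digits[0] + masked_tail)
```

Calling `mask_phone_number(4155550123)` returns another 10-digit integer that starts with 4, so the masked value still fits the same integer or bigint column.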

Enhancements for Informatica version 9.5.1 hotfix1:

Pushdown Optimization Enhancements: You can push transformation logic for an Aggregator transformation and a Sorter transformation to a relational source database.
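As a rough illustration of what pushing an Aggregator and a Sorter to the source database means, the sketch below expresses both steps as a single SQL statement with GROUP BY and ORDER BY, so the database does the work instead of the integration engine. It uses Python's built-in sqlite3 module as a stand-in for a relational source. The table `orders`, its columns, and both function names are made up for this example, and the SQL that Informatica actually generates will differ.

```python
import sqlite3


def build_demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with a small, made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("east", 10), ("west", 25), ("east", 5)],
    )
    return conn


def aggregate_in_database(conn: sqlite3.Connection) -> list:
    """Run the aggregation and the sort inside the database.

    GROUP BY plays the role of the Aggregator and ORDER BY the role of
    the Sorter; only the summarized rows travel back to the caller.
    """
    sql = """
        SELECT region, SUM(amount) AS total
        FROM orders
        GROUP BY region
        ORDER BY total DESC
    """
    return conn.execute(sql).fetchall()
```

With the demo data, `aggregate_in_database(build_demo_connection())` returns one summarized row per region, largest total first, rather than the three detail rows.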

Data Transformation with JSON Input: A Data Processor transformation can contain JSON input with an .xsd schema file that defines JSON input file hierarchies.

Recover Workflows: When you monitor workflows, you can recover aborted or canceled workflow instances that are enabled for recovery.

PowerExchange for Hadoop: Supports Cloudera 4.1.2 and Hortonworks 1.1 to access Hadoop sources and targets.

Enhancements for Informatica version 9.5.1:

PowerExchange for HDFS: Access data in a Hadoop Distributed File System (HDFS) cluster. You can read and write fixed-width and delimited file formats, as well as compressed files. You can read text files and binary file formats, such as a sequence file, from HDFS with a complex file data object. You can specify the compression format of the files. You can use the binary stream output of the complex file data object as input to a Data Processor transformation, which can parse the file.
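For readers who are new to these file formats, the following Python sketch shows the difference between a fixed-width record, a delimited record, and a compressed file. It works on in-memory data only and does not talk to HDFS or to any Informatica component. The function names, the column widths, and the choice of gzip as the compression format are assumptions for the example.

```python
import csv
import gzip
import io
from typing import List, Sequence


def parse_fixed_width(line: str, widths: Sequence[int]) -> List[str]:
    """Split a fixed-width record into fields by column width."""
    fields = []
    position = 0
    for width in widths:
        # Each field occupies exactly `width` characters; padding is stripped.
        fields.append(line[position:position + width].strip())
        position += width
    return fields


def read_delimited(data: bytes, delimiter: str = ",",
                   compressed: bool = False) -> List[List[str]]:
    """Parse delimited text, optionally decompressing gzip data first."""
    if compressed:
        data = gzip.decompress(data)
    text = data.decode("utf-8")
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))
```

For example, `parse_fixed_width("ab  123", [4, 3])` splits the line into a 4-character field and a 3-character field, while `read_delimited(gzip.compress(b"a|b\n1|2\n"), "|", compressed=True)` decompresses the bytes and splits each line on the pipe character.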

PowerExchange for Hive: Access data in a Hive data warehouse. You can read data from Hive in native or Hive run-time environments. You can write to Hive only if the run-time environment is Hive. You can create a Hive connection to run Informatica mappings in the Hadoop cluster. You can specify the Hive validation and run-time environment for Informatica mappings.

PowerExchange for Facebook: You can access Facebook through an HTTP proxy server. You can specify a list of access tokens that the Data Integration Service can use at run time to authenticate access to Facebook.

PowerExchange for LinkedIn: You can access LinkedIn through an HTTP proxy server. You can specify a list of access tokens that the Data Integration Service can use at run time to authenticate access to LinkedIn.

PowerExchange for Teradata Parallel Transporter API: You can use PowerExchange for Teradata Parallel Transporter API to load large volumes of data into Teradata tables by using Load or Stream system operators.

PowerExchange for Twitter: You can access Twitter through an HTTP proxy server. You can specify a list of access tokens that the Data Integration Service can use at run time to authenticate access to Twitter.
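The Facebook, LinkedIn, and Twitter entries above all mention a list of access tokens. The guide does not say how the Data Integration Service chooses among them, so the Python sketch below shows just one plausible strategy: try each token in order and move on when one is rejected. Everything here is hypothetical, including the names `call_with_tokens` and `AllTokensRejected` and the convention that a rejected token raises `PermissionError`.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class AllTokensRejected(Exception):
    """Raised when none of the configured access tokens is accepted."""


def call_with_tokens(tokens: Sequence[str],
                     request: Callable[[str], T]) -> T:
    """Call `request` with each token in turn until one succeeds.

    `request` is a caller-supplied function (an assumption of this
    sketch) that takes a token and raises PermissionError when the
    token is rejected.
    """
    last_error = None
    for token in tokens:
        try:
            return request(token)
        except PermissionError as error:
            last_error = error  # remember why this token failed, try the next
    raise AllTokensRejected("no access token was accepted") from last_error
```

A fallback list like this means an expired or rate-limited token does not have to stop the whole run, as long as at least one token in the list is still valid.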