Friday, 6 November 2015

How to identify last record in Informatica cloud or powercenter

How to identify last record in Informatica cloud or powercenter?

I had a scenario where I had to add comma (,) at the end of each line that I'm generating and loading to the target but however for the last line I should not add the comma. Now that is not as easy as it sounds. Informatica powercenter does not have any indicator on the last line that is read from the source. How do we accomplish this? Well if you have access to unix scripts then most likely you are write a script to remove that comma from the target file after it is generated. I work in windows environment these days and don't have the option of writing complex scripts. This is how I solved the problem using aggregator with LAST function and a joiner to join the last record indicator with rest of the records.

I created two source definition reading from the same file. The first source reads all the records but after the second source I have added a aggregator to pass the last line using the LAST function.

The last function in the aggregator sent the last record. Basically I have used a field called LAST_RECORD that has the value LAST(ID1 || ID2) in the aggregator. ID1 and ID2 are the two primary key fields of every record that is read from the source and the last function sends the last value of ID1 concatenated to ID2. There is one record out of this aggregator which is the last record read from the source.

For all the records read from the first source, I'm creating a field called FIELD_CONCAT that holds the value of ID1 || ID2 which is the concatenation of two value.

Now all the records coming from the two sources are joined on the FIELD_CONCAT field from the first source and LAST_RECORD field from the second source. This join is a full outer join, All the records coming out from this full outer join will have the LAST_RECORD field value as NULL except for the one record which is equal to the LAST(ID1|ID2) value. I hope you got the point. Now you can check which of those records have the LAST_RECORD value as not null, and remove the comma from the last record which was my original requirement to remove the comma from the last record.

Now one problem was the joiner was sending the records in the same order as it was reading from the first source. Hence I have to add sequence number field to each record and then sort using that field so that we can write the output in the same order as the input. Sample mapping for identifying the last line read is shown below. It is an Informatica cloud mapping.