DWBI-TECH BLOGS (Pradeep Kannadiga): November 2015

Thursday 12 November 2015

Indirect file processing in Informatica Cloud - Workaround

Indirect file processing (Filelist) in Informatica Cloud (ICS)

I had a scenario where I needed to read a list of files (a filelist) in Informatica Cloud. My research pointed to a couple of articles on the Informatica community site. Apparently there is a template that we can work with, or else we need to use a filelist connector. I could not find either, so I had to come up with a different strategy.

The strategy was:
1) Use PowerCenter to create a mapping and a workflow that reads from the file list and writes to an Oracle target.
2) Export the workflow from Repository Manager as an XML file.
3) Import the XML file into Informatica Cloud as a PowerCenter task. Map its connections to ICS connections and run it from Informatica Cloud.

In Informatica PowerCenter we read a filelist by setting the Source filetype property of the session's source file to Indirect, as shown in the screenshot below. Using this approach I created my mapping and wrote the records to the Oracle target. Once I had tested the workflow, I exported it, imported it into Informatica Cloud, and it worked.

When you import into Informatica Cloud, you can map the Oracle target connection to a connection defined in the cloud. Make sure your mapping uses only transformations that Informatica Cloud supports.

Also, the source filelist and the source files have to be in a location that is accessible to Informatica Cloud (ICS).
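To make the idea of an indirect source concrete, here is a minimal Python sketch of what "indirect" means: the file named in the session is not the data itself but a list of paths, one per line, and the rows of each listed file are read in turn. This only illustrates the concept and is not how PowerCenter or ICS implements it. The names read_indirect, demo, a.csv, b.csv and filelist.txt are made up for the example.

```python
import os
import tempfile


def read_indirect(filelist_path):
    """Yield every row from every file named in the file list, in list order."""
    with open(filelist_path, encoding="utf-8") as filelist:
        for entry in filelist:
            source_path = entry.strip()
            if not source_path:
                continue  # ignore blank lines in the file list
            with open(source_path, encoding="utf-8") as source:
                for row in source:
                    yield row.rstrip("\n")


def demo():
    # Build two small source files and a file list that names them (all hypothetical).
    workdir = tempfile.mkdtemp()
    paths = []
    for name, rows in [("a.csv", ["1,x", "2,y"]), ("b.csv", ["3,z"])]:
        path = os.path.join(workdir, name)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("\n".join(rows) + "\n")
        paths.append(path)
    filelist_path = os.path.join(workdir, "filelist.txt")
    with open(filelist_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(paths) + "\n")
    return list(read_indirect(filelist_path))
```

Calling demo() returns the rows of both files as one stream, which is the behaviour the target sees when the source filetype is Indirect.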

Tuesday 10 November 2015

User management - Creating groups, users, owners in Amazon Redshift

Below are some sample commands for creating users and user groups, adding users to groups, and managing permissions on tables in Amazon Redshift.

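1) To create a user, use the below command in Amazon Redshift:

create user my_user password 'Anaconda123';

In the above example, a user named my_user is created with the password Anaconda123. Redshift requires the password to be 8 to 64 characters long and to contain at least one uppercase letter, one lowercase letter, and one digit.

You can later change the password using the command:

alter user my_user password 'Mynewpassword123';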

2) To create a user group, use the below command in Amazon Redshift:

create group my_group with user my_user;

In the above command, a new group with name my_group is created and the user my_user is added to that group.

3) To create a database schema use the below command in Amazon redshift:

create schema myschema authorization my_user;

In the above example, the schema myschema is created with my_user as its owner, which gives my_user full control over the schema.

4) To grant a user group permissions on a schema, use the below command in Amazon Redshift:

grant all on schema myschema to group my_group;

5) To add users to a user group, drop users from it, or rename it, use the below commands in Amazon Redshift:

ALTER GROUP group_name ADD USER username;

ALTER GROUP group_name DROP USER username;

ALTER GROUP group_name RENAME TO new_name;

6) To check table and view owners, users, and groups when reviewing permissions in Amazon Redshift:

select * from pg_tables; /* lists tables with their owners */
select * from pg_views; /* lists views with their owners */
select * from pg_shadow; /* lists users */
select * from pg_group; /* lists user groups and their members */

7) To add permissions on the tables in Amazon Redshift:

Use the sample below to grant permissions on a table and change its owner in Redshift:

Create table MYTABLE (
COLUMN1 CHAR(1) NOT NULL,
COLUMN2 CHARACTER VARYING(100) NOT NULL)
distkey (COLUMN1);

GRANT ALL ON TABLE MYTABLE TO mstr_user;

ALTER TABLE MYTABLE owner TO my_user;
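If you set up several users this way, the order of the statements matters: the user has to exist before the group or the schema refers to it. As a rough sketch, the small Python helper below just assembles the statements from steps 1 to 4 in that order. The function name provisioning_statements is my own, and the values are interpolated without any quoting or escaping, so treat it as an illustration and not something to feed untrusted input.

```python
def provisioning_statements(user, password, group, schema):
    """Return the statements from steps 1 to 4 in the order they should run."""
    return [
        # Step 1: the user must exist before anything else refers to it.
        f"create user {user} password '{password}';",
        # Step 2: create the group and add the user to it.
        f"create group {group} with user {user};",
        # Step 3: create the schema owned by the user.
        f"create schema {schema} authorization {user};",
        # Step 4: give the whole group access to the schema.
        f"grant all on schema {schema} to group {group};",
    ]
```

For example, provisioning_statements("my_user", "Example123", "my_group", "myschema") gives the four commands used above, ready to be run one by one in your SQL client.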

Monday 9 November 2015

SQL transformation in Informatica for executing SQL scripts and dynamic SQLs

If you have some SQL statements in a file and need to execute them on a database using Informatica, or if you want a transformation that runs dynamic SQL, i.e. SQL that takes a parameter and returns output based on that input, then the SQL transformation is the way to go.

The SQL transformation operates in Script mode or Query mode. In Query mode you write a query in the SQL transformation and bind an input port into it as a parameter, so the result changes with the value of the input port. For example, if NAME is an input port of the SQL transformation, the query could be "SELECT DEPARTMENT FROM DEPARTMENT_TABLE WHERE FIRST_NAME = ?NAME?". This returns the department for every NAME that passes through the SQL transformation.
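The per-row parameter binding in Query mode behaves much like a parameterised query in any database API. The Python sketch below imitates it with the standard sqlite3 module: each incoming NAME is bound into the same query and the matching departments come back. The table contents and the function name departments_for are invented for the example, and sqlite3's :NAME placeholder stands in for Informatica's ?NAME? syntax.

```python
import sqlite3


def departments_for(names):
    """Run the same parameterised query once per incoming NAME value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE DEPARTMENT_TABLE (FIRST_NAME TEXT, DEPARTMENT TEXT)")
    conn.executemany(
        "INSERT INTO DEPARTMENT_TABLE VALUES (?, ?)",
        [("ANN", "SALES"), ("BOB", "FINANCE")],  # sample rows, made up
    )
    # :NAME plays the role of the ?NAME? port binding in the SQL transformation.
    query = "SELECT DEPARTMENT FROM DEPARTMENT_TABLE WHERE FIRST_NAME = :NAME"
    results = []
    for name in names:
        rows = conn.execute(query, {"NAME": name}).fetchall()
        results.extend((name, department) for (department,) in rows)
    conn.close()
    return results
```

A name with no match (for example "ZED") simply produces no output row in this sketch.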

If you want to execute a bunch of SQL statements that are in a file, create the SQL transformation in Script mode. Read the path of the script file from a flat file source and pass that field to the ScriptName port of the SQL transformation. The SQL transformation executes every statement in each script it receives.

For example, if the script file contains:

CREATE TABLE MYTABLE as SELECT * from MYSTAGING_TABLE;

INSERT INTO MYTABLE VALUES ('xxx', 1, 2, 3);

The SQL transformation will execute both statements one after another and return the script result and any error output. A sample mapping is shown in the screenshot below:
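As a rough picture of what Script mode does with such a script, the Python sketch below runs the statements in order against an in-memory SQLite database and reports PASSED or FAILED together with the error text. It is only an analogy: run_script and demo are names I made up, the staging table layout is assumed, and splitting on ";" is a simplification that would break on a semicolon inside a string literal.

```python
import sqlite3


def run_script(conn, script_text):
    """Run the script's statements in order and stop at the first failure."""
    for statement in script_text.split(";"):
        statement = statement.strip()
        if not statement:
            continue
        try:
            conn.execute(statement)
        except sqlite3.Error as exc:
            return ("FAILED", str(exc))
    conn.commit()
    return ("PASSED", None)


def demo():
    conn = sqlite3.connect(":memory:")
    # Assumed four-column staging table so that the INSERT below fits.
    conn.execute("CREATE TABLE MYSTAGING_TABLE (A TEXT, B INTEGER, C INTEGER, D INTEGER)")
    script = (
        "CREATE TABLE MYTABLE AS SELECT * FROM MYSTAGING_TABLE;\n"
        "INSERT INTO MYTABLE VALUES ('xxx', 1, 2, 3);\n"
    )
    first = run_script(conn, script)
    second = run_script(conn, script)  # fails because MYTABLE already exists
    count = conn.execute("SELECT COUNT(*) FROM MYTABLE").fetchone()[0]
    conn.close()
    return first, second[0], count
```

Running the same script twice shows both outcomes: the first run passes, the second fails on the CREATE TABLE and never reaches the INSERT.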

Friday 6 November 2015

How to identify last record in Informatica cloud or powercenter


I had a scenario where I had to add a comma (,) at the end of each line that I was generating and loading to the target, except for the last line. That is not as easy as it sounds: Informatica PowerCenter has no indicator for the last line read from the source. How do we accomplish this? If you have access to Unix scripts, you would most likely write a script to remove that comma from the target file after it is generated. I work in a Windows environment these days and don't have the option of writing complex scripts. This is how I solved the problem, using an aggregator with the LAST function and a joiner to join the last-record indicator with the rest of the records.

I created two source definitions reading from the same file. The first source reads all the records; after the second source I added an aggregator that passes only the last line, using the LAST function.

The LAST function in the aggregator returns the last record. I used a field called LAST_RECORD with the value LAST(ID1 || ID2) in the aggregator. ID1 and ID2 are the two primary key fields of every record read from the source, and LAST returns the last value of ID1 concatenated with ID2. Exactly one record comes out of this aggregator: the last record read from the source.

For all the records read from the first source, I create a field called FIELD_CONCAT that holds ID1 || ID2, the concatenation of the two values.

Now the records coming from the two sources are joined on the FIELD_CONCAT field from the first source and the LAST_RECORD field from the second source. This join is a full outer join, so every record coming out of it has a NULL LAST_RECORD value except the one record whose key equals LAST(ID1 || ID2). You can then check which record has a non-NULL LAST_RECORD value and leave the comma off that record, which was my original requirement.

One problem was that the joiner was not sending the records out in the same order as they were read from the first source. Hence I had to add a sequence number field to each record and then sort on that field so that the output is written in the same order as the input. A sample mapping for identifying the last line read is shown below. It is an Informatica Cloud mapping.
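For readers who think more easily in code than in mappings, the same logic can be sketched in a few lines of Python. Each step is labelled with the transformation it stands for. The function name mark_last_and_format and the record layout (ID1, ID2, line text, all strings) are assumptions for the example, and the sketch relies on ID1 || ID2 being unique per record, just as the mapping does.

```python
def mark_last_and_format(records):
    """records is a list of (id1, id2, text); returns output lines in input order."""
    # First source: add a sequence number and FIELD_CONCAT to every record.
    rows = [
        {"seq": seq, "field_concat": id1 + id2, "text": text}
        for seq, (id1, id2, text) in enumerate(records, start=1)
    ]
    # Second source + aggregator: LAST(ID1 || ID2) yields one value.
    last_record = records[-1][0] + records[-1][1] if records else None
    # Joiner: only the record whose key equals LAST_RECORD gets a non-null value.
    for row in rows:
        matched = row["field_concat"] == last_record
        row["last_record"] = last_record if matched else None
    # Sorter: restore the input order using the sequence number.
    rows.sort(key=lambda row: row["seq"])
    # Expression: append the comma unless LAST_RECORD is not null.
    return [
        row["text"] if row["last_record"] is not None else row["text"] + ","
        for row in rows
    ]
```

With three input records the first two lines come out with a trailing comma and the third without. If the concatenated key were not unique, more than one record would match and lose its comma, which is why the key fields need to identify a record uniquely.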

Monday 2 November 2015

Estimating database and table size in Netezza

Estimating database and table size and skew in Netezza

The nz_db_size utility in /nz/support/contrib/3.1/bin/ can be used to estimate the total number of bytes on disk occupied by a given database and the tables belonging to it:

Usage is as follows:

nz_db_size [ -s ] [ database [ table ] ]

E.g.:

[nz@nps12345 bin]$ ./nz_db_size MYTESTDB MYTABLE

  Object   |               Name               |        Bytes        |       KB        |     MB      |    GB     |   TB
-----------+----------------------------------+---------------------+-----------------+-------------+-----------+--------
 Appliance | nps1234                          |         134,934,002 |         136,083 |     132,894 |     192.8 |     .1
 Database  | MYTESDB                          |          123,245,56 |       12,345,67 |        1000 |      1230 |     .0
 Table     | MYTABLE                          |           1,206,720 |           1,178 |           1 |        .0 |     .0
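The KB, MB, GB and TB columns are the same byte count expressed in larger units. The short Python sketch below shows the arithmetic, assuming each unit is 1024 times the previous one and that the utility rounds the way the Table row suggests; both are my reading of the sample output, not something taken from the utility's documentation. The function name size_columns is made up.

```python
def size_columns(num_bytes):
    """Convert a byte count into the KB/MB/GB/TB columns, 1024 per step (assumed)."""
    kb = num_bytes / 1024
    mb = kb / 1024
    gb = mb / 1024
    tb = gb / 1024
    return {
        "KB": round(kb),      # whole kilobytes
        "MB": round(mb),      # whole megabytes
        "GB": round(gb, 1),   # one decimal place
        "TB": round(tb, 1),   # one decimal place
    }
```

For the MYTABLE row, size_columns(1206720) gives 1178 KB, 1 MB, 0.0 GB and 0.0 TB, which agrees with the figures shown for that table.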

To get summary statistics per database, use the below command:

[nz@mpc-nps]$ nz_db_size -summary -s

  Object   |               Name               |        Bytes         |        KB        |      MB      |     GB     |   TB
-----------+----------------------------------+----------------------+------------------+--------------+------------+--------
 Appliance | mpc-nps                          |      1,34343,2434,24 |1,232,32,23232,34 |    2,167,756 |     4,2267 |    2.1
 Database  | MYTESTDB                         |     123,2323,3232,33 |      123,345,343 |      456,334 |      686.0 |     .7

nzstats is another command that can be used to get the full statistics:

For example: nzstats show -type table

Using SQLCMD to execute SQL scripts from a file on SQL server machine in Informatica

Using SQLCMD to execute SQL scripts from a file on SQL server machine

I was wondering how to execute a SQL script on SQL Server using Informatica Cloud. The obvious approach that came to mind was using the SQL transformation in script mode. That approach works in PowerCenter but cannot be implemented in Informatica Cloud, since the SQL transformation is not supported in Informatica Cloud. The next approach that came to mind was to use a tool that can execute a SQL script on SQL Server. I happened to find out about a tool called SQLCMD that can do the task.
You can put all the commands you want to execute in a .sql or .txt file and then execute the script using the SQLCMD utility. In the below example, all the SQL commands are in C:\test_sql.txt

You will most likely find the tool in the below location on a machine that has SQL Server Management Studio installed.

cd C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\110\Tools\Bin

If you want to execute the SQLCMD command in Windows authentication mode, use the below command:

SQLCMD -S myservermachine -i C:\test_sql.txt -o C:\out.txt

If you want to execute the SQLCMD command in Windows authentication mode as a different user, use the below command:

RUNAS /user:Mydomain\myusername "SQLCMD -S myservermachine -i C:\test_sql.txt -o C:\out.txt"

If you want to execute the SQLCMD command in SQL Server authentication mode, use the below command:

SQLCMD -U myusername -P mypass -S myservermachine -i C:\test_sql.txt -o C:\out.txt
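If you end up calling SQLCMD from a scheduler or wrapper script, it can help to build the command line in one place. The Python sketch below assembles the same arguments as the commands above and runs them with the standard subprocess module. The function names build_sqlcmd_args and run_sqlcmd are my own, and the sketch assumes SQLCMD is on the PATH. It uses only the -S, -U, -P, -i and -o switches already shown in this post.

```python
import subprocess


def build_sqlcmd_args(server, script_path, output_path, username=None, password=None):
    """Build the SQLCMD argument list; without a username, Windows authentication is used."""
    args = ["SQLCMD", "-S", server]
    if username is not None:
        # SQL Server authentication: pass the login, and the password if one was given.
        args += ["-U", username]
        if password is not None:
            args += ["-P", password]
    args += ["-i", script_path, "-o", output_path]
    return args


def run_sqlcmd(server, script_path, output_path, username=None, password=None):
    """Run SQLCMD and return its exit code (assumes SQLCMD is on the PATH)."""
    args = build_sqlcmd_args(server, script_path, output_path, username, password)
    return subprocess.run(args, check=False).returncode
```

Passing the arguments as a list avoids quoting problems when a path contains spaces.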

Using NZSQL to execute SQL scripts from a file on Netezza machine

Similarly, the nzsql tool can be used to execute SQL scripts from a file on a Netezza machine. The command is:

nzsql -f <sql_file_name>