Informatica Best Practice: User Defined Join syntax in the Source Qualifier transformation

In any Business Intelligence environment, changing technology of data sources is a big challenge.

This is valid particularly in the case of mature, long-running BI platforms, where the overall ETL processing is likely to exceed three or four hundred single jobs.

A change of the data source database technology – for example, from SQL Server to Oracle – and related data migration means often a painstaking exercise of manually updating every single ETL step, unit and regression testing, QA and moving to Production.

In order to minimise the effort required it is recommended to avoid database-specific SQL wherever possible, and to make use of any automation your ETL tool offers in order to make the code portable across platforms.

An example of this automation is the Join syntax used in Informatica PowerCenter.

One of the perks of using Informatica mappings to run your BI ETL jobs is the capability to write the extraction SQL queries in database independent code, which the Informatica Integration Service then transforms in database-specific SQL, readable by the proper technology.

The meta-code – which is mainly a different way to enter joins in the SQL – can be either entered in the SQL Query override or in the User Defined Join property of the Source Qualifier transformation.

informatica

Fig.1 – SQL Query override

The best practice is to enter the syntax in the User Defined Join, since if the code is entered in the SQL Query override, any further changes to the transformation applied at a later stage will be ignored by the Integration Service.

informatica 2

Fig.2 – Informatica SQL in the User Defined Join property

The syntax is quite simple and makes use of curly brackets to encapsulate the main join logic.

For example, the following query:

SELECT

TABLE_1.FIELD1,

TABLE_2.FIELD2

FROM

TABLE_1 INNER JOIN TABLE_2

ON TABLE_1.KEY = TABLE_2.KEY

becomes, in Informatica SQL:

SELECT

TABLE_1.FIELD1,

TABLE_2.FIELD2

FROM

TABLE_1,

TABLE_2

WHERE

{

TABLE_1 INNER JOIN TABLE_2

ON TABLE_1.KEY = TABLE_2.KEY

}

You can paste either the whole code above in the SQL Query override or only the brackets-delimited part in the User Defined Join. As stated earlier, the latter is preferred.

The rule does not change for outer joins:

SELECT

TABLE_1.FIELD1,

TABLE_2.FIELD2

FROM

TABLE_1,

TABLE_2

WHERE

{

TABLE_1 LEFT OUTER JOIN TABLE_2

ON TABLE_1.KEY = TABLE_2.KEY

}

Filter conditions are also applicable in the SQL and can be entered in both the WHERE clause (if you have a SQL Query override anyway), in the User Defined Join and in the Source Filter (preferred option):

In the first case, you can write:

SELECT

TABLE_1.FIELD1,

TABLE_2.FIELD2

FROM

TABLE_1,

TABLE_2

WHERE

{

TABLE_1 LEFT OUTER JOIN TABLE_2

ON TABLE_1.KEY = TABLE_2.KEY

}

AND TABLE_1.FILTER = ‘VALUE’

In the second, enter in the User Defined Join property:

{

TABLE_1 LEFT OUTER JOIN TABLE_2

ON TABLE_1.KEY = TABLE_2.KEY

WHERE TABLE_1.FILTER = ‘VALUE’

}

Aliases are supported, so the above code can be rewritten as:

{

TABLE_1 T1 LEFT OUTER JOIN TABLE_2 T2

ON T1.KEY = T2.KEY

WHERE T1.FILTER = ‘VALUE’

}

Combining joins is straightforward and follow the same rules:

{

TABLE_1 INNER JOIN TABLE_2 ON TABLE_1.KEY = TABLE_2.KEY

LEFT OUTER JOIN TABLE_3 ON TABLE_3.KEY = TABLE_2.COLUMN

}

When combining joins, the best practice is to enter them in the following order:

1 – normal joins

2 – left outer joins

3 – right outer joins (note that some database technology may limit to one right outer join per query)

In conclusion, the code is simple to apply and makes your Informatica mappings portable and flexible. Combining this with proper parametrisation techniques (e.g. using variables and parameters to store database-specific code, like date conversions and specific functions), which is also supported by the Informatica SQL syntax, will allow you to be able to port your mappings from a database technology to another quite seamlessly. Try it!

Gianluca N
gianluca.nacci@clearpeaks.com