MS Excel spreadsheets as a data source in Informatica PowerCenter

Informatica Power Center can use different sources of data. The most common sources are tables, views or synonyms, but for storing small sets of data, companies usually use flat files, MS Excel files or MS Access. Informatica treat Excel and Access file sources as a relational database, not as a flat file.  Which steps do we have to follow to load this data from these files to our Data Warehouse? How are the connections created? Where do these files have to be located? How should we configure Informatica?  In this article I will explain the way to load data from Excel files using Informatica Power Center running on Windows.

 

 Steps to follow:

 

1. In Microsoft Excel file:

 

a)    Select the required rows to be read into PowerCenter.

 

 isabel

 

b)   Click on Insert> Name> Define option.

 

isabel 1

 

c)    Type the name of the range and select the range on “Refers to:” space.

 

isabel 2

 

d)    Save the spreadsheet, Normally this kind of sources are saved in a specific folder       (\Informatica\PowerCenter8.1.1\server\infa_shared\SrcFiles).

 

2. Create ODBC DSN: it has to be created in the location where PowerCenter client and server are installed

 

a)    Go to Start > Settings > Administrative Tools > Datasources (ODBC) (in Windows 64 bits you must use ODBC Data Source Administration tool located in  \Windows\SysWOW64\odbcad32).

b)    Click on System DSN

 

 isabel 3

 

c)    Click on Add and choose Microsoft Excel Driver (*.xls) and click on Finish button.

 

isabel 4

 

d)    Write a Data Source Name, select the Excel or Access version and define the localization of the file through ‘Select Workbook…’ button and click OK.

 

isabel 5

 

3. Create Relational connection in Informatica Workflow Manager: To import the excel file you need to configure connectivity with the PowerCenter Client:

 

a)    Open Informatica Power Center Workflow Manager. Go to Connection and select ‘Relational…’ and click on ‘New’.

 

isabel 6

 

b)    Choose ODBC from the list options.

isabel 7

 

c)    Fill in the blank spaces as follow:

i.    Connection name
ii.    Username = pmnulluser
iii.    Password = pmnullpasswd
iv.    Connect string = Excel Name defined in step 3.a

 

isabel 8

 

4. Include relational source in the Source Qualifier transformation of the mapping:

 

a)    Open Informatica PowerCenter Designer. Go to Repository and select ‘Import Objects…’.

 

isabel 9

 

b)    Open Informatica Power Center Designer. Go to Source Analyzer and select ‘Import from Database..,’.

 

isabel 10

 

c)    Select the ODBC data source name created in step 3.c and click on Re-connect (don’t write any Username and Password). Select the source file and click on OK.

 

isabel 11

 

d)    Create mapping.

e)    Create workflow. In Source connection define the new relational connection created.

 

isabel 12

 

Please bear in mind the following considerations: – Do not write sheet names or column names with spaces. It could cause problTo import a source definition, a read permission on the database object is required in the client machine. – If you don’t assign datatypes to columns of data in Excel, the Designer imports each column as VARCHAR.
If you require a specific datatype (numeric, date…), you need to assign the datatype in Excel before  importing the spreadsheet (you can also modify it in Informatica source definition). – The ranges you defined in the Excel file (step 1.a) appear as separate table names when you import them in Informatica PowerCenter. – You need to have Excel installed on Informatica Server.

Isabel B
isabel.barbosa@clearpeaks.com