MS Excel spreadsheets as a data source in Informatica PowerCenter

.

Informatica Power Center can use different sources of data. The most common sources are tables, views or synonyms, but for storing small sets of data, companies usually use flat files, MS Excel files or MS Access. Informatica treat Excel and Access file sources as a relational database, not as a flat file.

 Which steps do we have to follow to load this data from these files to our Data Warehouse? How are the connections created? Where do these files have to be located? How should we configure Informatica?

 In this article I will explain the way to load data from Excel files using Informatica Power Center running on Windows.

 

 Steps to follow:

1. In Microsoft Excel file:

a)    Select the required rows to be read into PowerCenter.

 isabel

b)   Click on Insert> Name> Define option.

isabel 1

c)    Type the name of the range and select the range on “Refers to:” space.

isabel 2

d)    Save the spreadsheet, Normally this kind of sources are saved in a specific folder

      (\Informatica\PowerCenter8.1.1\server\infa_shared\SrcFiles).

 

2. Create ODBC DSN: it has to be created in the location where PowerCenter client and server are installed

a)    Go to Start > Settings > Administrative Tools > Datasources (ODBC)

(in Windows 64 bits you must use ODBC Data Source Administration tool located in  \Windows\SysWOW64\odbcad32).

b)    Click on System DSN

 isabel 3

c)    Click on Add and choose Microsoft Excel Driver (*.xls) and click on Finish button.

isabel 4

d)    Write a Data Source Name, select the Excel or Access version and define the localization of the file through ‘Select Workbook…’ button and click OK.

isabel 5

 

3. Create Relational connection in Informatica Workflow Manager: To import the excel file you need to configure connectivity with the PowerCenter Client:

a)    Open Informatica Power Center Workflow Manager. Go to Connection and select ‘Relational…’ and click on ‘New’.

isabel 6

b)    Choose ODBC from the list options.

isabel 7

c)    Fill in the blank spaces as follow:

                                          i.    Connection name

                                         ii.    Username = pmnulluser

                                        iii.    Password = pmnullpasswd

                                       iv.    Connect string = Excel Name defined in step 3.a

isabel 8

4. Include relational source in the Source Qualifier transformation of the mapping:

a)    Open Informatica PowerCenter Designer. Go to Repository and select ‘Import Objects…’.

isabel 9

b)    Open Informatica Power Center Designer. Go to Source Analyzer and select ‘Import from Database..,’.

isabel 10

c)    Select the ODBC data source name created in step 3.c and click on Re-connect (don’t write any Username and Password). Select the source file and click on OK.

isabel 11

d)    Create mapping.

e)    Create workflow. In Source connection define the new relational connection created.

isabel 12

 

Please bear in mind the following considerations:

- Do not write sheet names or column names with spaces. It could cause problTo import a source definition, a read permission on the database object is required in the client machine.

- If you don’t assign datatypes to columns of data in Excel, the Designer imports each column as VARCHAR. If you require a specific datatype (numeric, date…), you need to assign the datatype in Excel before  importing the spreadsheet (you can also modify it in Informatica source definition).

- The ranges you defined in the Excel file (step 1.a) appear as separate table names when you import them in Informatica PowerCenter.

- You need to have Excel installed on Informatica Server.

29 Responses to “MS Excel spreadsheets as a data source in Informatica PowerCenter”

  1. Balaji says:

    ODBC Data source is created using a sample spreadsheet from local drive. While setting connection object, connection string is select as the ODBC data source created while import the database to create source. But where is the option to set the actual source file path (not the local drive copy)?

  2. Isabel says:

    Thanks Balaji for your comment!
    In order to connect the Excel file with Informatica you need to have the Excel file in the same server where Informatica is installed or in an accessible folder for that server (shared folder). Also you need to create the ODBC in that server and set up that ODBC connection in Informatica; if any of those steps are missing, the connection between Informatica and Excel won’t be created.
    Please, let me know if you need any further clarification.

    Best regards.
    Isabel

  3. sahar says:

    is it required to create ODBC connection on the server?
    As when i create the connection in my local machine and trying to use an excel file as a source it is giving error :
    [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified
    Database driver error…
    Function Name : Connect
    Database driver error…
    Function Name : Connect
    Database Error: Failed to connect to database using user [pmnulluser] and connection string [Targets].].
    Targets is the name of DSN file and connection string both.
    Please help me to resolve this.
    Thanks.

  4. Isabel says:

    Thanks Sahar for your comment.

    In the event that the client and server are in different machines, you also have to create the ODBC in the server SO.

    As per the example above, step 3 configures the connection in the PowerCenter Workflow Manager, therefore all the connections configured refer to ones accessible in the server – these connections will be used when the workflows are executed in the server.
    On the other hand, step 4 configures the connection in the PowerCenter Designer, therefore all the connections configured refer to the ones accessible in the client – these connections are used when importing source table schemas through the client.

    Please let me know whether this resolves your issue!

  5. Patrick says:

    is there any way to use excel as source if the Informatica server is running on HP-UX or Linux?

  6. Isabel says:

    Hi Patrick,

    Have you tried to install excel ODBC driver in the Linux machine? In this case, which error do you obtain when you try to run a workflow?

    There is an alternative way, changing the excel file to text file (.csv). There is a script (Perl script) allows you to convert excel files in to csv files which are easy to read format across all the platforms (you can obtain it free from link https://community.informatica.com/solutions/1727).

    I hope it helps,
    Isabel

  7. Patrick says:

    ok, thanks, we go for the .csv, we use that already

  8. Naresh says:

    Hi Isabel,

    I have multiple worksheets in excel and i want to import those sheets into oracle. The problem is i don’t have Informatica Server installed on my machine, so do know how can i create an ODBC data source for the server machine from client machine ?
    Converting the .xls file to .csv won’t work for multiple worksheets.. so is there any other alternative to import multiple worksheets into oracle ?

    Thanks,
    Naresh

  9. Ravikiran says:

    Thanks for your explanation

  10. Isabel says:

    Hi Naresh,

    You could try to create a range for each worksheet (defining those using different names) and import them in the Designer as different relational sources. Let me know if it works.

    Isabel

  11. Naresh says:

    Isabel.. I had already tried with that but still it is not working!

  12. Naresh says:

    As mentioned in the second point:
    2. Create ODBC DSN: it has to be created in the location where PowerCenter client and server are installed.

    I don’t have PowerCenter server installed on my machine.. So how do i create ODBC Datasource in the server ?

    Naresh

  13. Isabel says:

    Hi Naresh,
    As far as I know, you need to create the ODBC where Informatica PowerCenter server is installed, if not, Informatica can’t recognise the Excel file.
    Good luck with the research and if you find a way to achieve it, please, share it with us!

    Best Regards,

    Isabel

  14. SM says:

    Hi Isabel,

    I have done the above steps and its working fine .

    The Question is , Do We need to define the name of the tab everytime for file with same structure manually??

    Thanks,
    SM

  15. Isabel says:

    Hi!
    Thanks for your comment.
    To import data stored in a excel you need to define each time the range where the data is. Informatica creates source definitions based on these ranges.
    In case you have more than one file with the same structure, you could use the same template file, but you’d need to change the name of the range, in order Informatica to recognise which file is.
    Let me know if you have any other doubt.

    Regards,
    Isabel

  16. Mandar says:

    Thanks a lot Isabel !! really appreciate your help by posting it here….

    One more thing.. I went through whole help file, but no where they have mentioned about this.. from where did you came to know ?

  17. Hari says:

    Hi,
    If my server is residing in Unix and getting one excel file with multiple worksheets and each sheet as different columns then how can this scenario works.

    Regards,
    Hari

  18. Isabel says:

    Hi,
    Thanks for your comments.

    @Mandar: it’s true; there is not many information about how to do it in guides. I have done some research and testing in order to obtain the above steps.

    @Hari: You cannot use directly the excel file as a source; this is because you have Informatica Server installed on UNIX machine and it doesn’t recognise Excel format.
    My advice is convert the excel file to text file (.csv). You should create one text file for each excel page/format.
    Another option should be using an ODBC-ODBC Bridge (OOB) that allows an ODBC application to access an ODBC driver on a remote machine, but it should be much complicated, and I’m not sure if it works properly.

    Let me know if you have any other question.
    Regards,
    Isabel

  19. Edson says:

    Just found this topic now… great work!

  20. When some one searches for his essential thing,
    thus he/she wants to be available that in detail, thus that thing is maintained over here.

  21. kumar says:

    Hi,
    Nice article u had made, my concern is I want to export data to excel sheet from informatica since i am aware of importing from excel but can u provide screen shots for export s, normally we use cognos tool after informatica is done for getting data into excel .instead of that I want to use excel as my target instead of cognos tool.
    thanks in advance.

  22. Priya says:

    While creating ODBC DSN it is necessary to create DSN on System DSN?As am creating on user DSN While i am running the wlfw i am getting the error as [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified..I am unable 2 create DSN on System DSn as i dont ahve prevliages on it..

    Appreciate your help on this.

  23. Gaurav chaturvedi says:

    Hi Isabel,

    Thanks a lot for so much explaination. I still have two doubts, could anyone provide suggestions-
    1- How can I add DSN details for Excel in odbc.ini file on unix server. I can access odbc.ini file, but not sure exactly what details I need to add here to process excel file.
    2- Which driver I should have on unix server to be used in DSN for it.
    3- If no of records changes in excel file each time a new file comes with same structure, do I need to create (or recreate DSN) each time?

    Regards,
    Garuav Chaturvedi

  24. Hi there! I just want to give you a big thumbs up for your great info you have right here on this post.
    I am coming back to your website for more soon.

  25. Dhruuv says:

    Hi there!!! I have to deal with multiple Excel files everyday. How do I define the range if the number of records keep changing in the files.

    Thanks
    Dhruuv

  26. Mahesh says:

    Guys,Thanks.I was able to load multiple tabs from one excel file to informatica and able to use as a source.For Each sheet,we have to “define a name” and use it in ODBC But the tricky part is We will be having 200+ excel file and each will have 10 tabs/sheets.I cannot keep on “defining name” for each tab in each excel file.Is there anyway to automate the naming and load into informatica?

  27. Ankur says:

    Hi Isabel,

    I am having the same probles as Sahar with same error.
    Looks like I am doing something wrong when creating the ODBC conenction at server. I founf one thing that id server is on 64 bit then; (in Windows 64 bits you must use ODBC Data Source Administration tool located in \Windows\SysWOW64\odbcad32)

    But I cound not \Windows\SysWOW64\odbcad32.

    Please help me with this. I am trying to figure it out from one week

    Thank you
    Ankur

  28. Nidhi says:

    Hi All,

    I need to generate the output in excel file which contains multiple worksheet. Can anyone suggest me how can I achieve it. Tools used are – Unix, Informatica and teradata.

    Regards,
    Nidhi

  29. Rita says:

    Even My excel will have data which keeps on changing. Can i implement this logic?

Leave a Comment

privacy policy - Copyright © 2000-2010 ClearPeaks

topnav