EXTRACTING DATA FROM TALEO

Over the past few years we have seen companies focusing more and more on Human Resources / Human Capital activities. This is no surprise, considering that nowadays a large number of businesses depend more on people skills and creativity than on machinery or capital, so hiring the right people has become a critical process. As a consequence, more and more emphasis is put on having the right software to support HR/HC activities, and this in turn leads to the necessity of building a BI solution on top of those systems for a correct evaluation of processes and resources. One of the most commonly used HR systems is Taleo, an Oracle product that resides in the cloud, so there is no direct access to its underlying data. Nevertheless, most BI systems are still on-premise, so if we want to use Taleo data, we need to extract it from the cloud first.

 

1. Taleo data extraction methods

 

As mentioned before, Taleo data cannot be accessed directly; nevertheless, there are several ways to extract it, and once extracted it can be used in the BI solution:

 

Querying Taleo API
Using Cloud Connector
Using Taleo Connect Client

 

The API is very robust, but it is the most complex of the three methods, since it requires a separate application to be written. Usually, depending on the configuration of the BI system, either Oracle Cloud Connector, Taleo Connect Client or a combination of both is used.

 

Figure 1: Cloud connector in ODI objects tree

 

Oracle Cloud Connector is a component of OBI Apps; essentially, it’s Java code that replicates Taleo entities / tables. It’s also easy to use: creating any Load Plan in BIACM with Taleo as the source system generates a series of calls to Cloud Connector that effectively replicate the Taleo tables to a local schema. Although it works well, it has two significant disadvantages:

 

It’s only available as a component of BI Apps
It doesn’t extract Taleo UDFs

So even if we have BI Apps installed and we use Cloud Connector, there will be some columns (UDFs) that will not get extracted. This is why the use of Taleo Connect Client is often a must.

 

2. Taleo Connect Client

 

Taleo Connect Client (TCC) is a tool used to export data from, or import data to, Taleo. In this article we’re going to focus on extraction. It can extract any field, including UDFs, so it can be used in combination with BI Apps Cloud Connector or, if that’s not available, as the sole extraction tool. There are versions for both Windows and Linux operating systems. Let’s look at the Windows version first.

Part 1 – Installation & running: 

Taleo Connect Client can be downloaded from the Oracle e-delivery website; just type Taleo Connect Client into the search box and you will see it on the list. Choose the version you need, remembering that it must match the version of the Taleo application you will be working with, and select both the Application Installer and the Application Data Model (required!); then just download and install. Important: the Data Model must be installed before the application.

 

Figure 2: Downloading TCC for Windows

 

After TCC is installed, we run it, providing the necessary credentials in the initial screen:

 

Figure 3: Taleo Connect Client welcome screen

 

Then, after clicking on ‘ping’, we connect to Taleo. The window that we see is initially empty, but we can create or execute new extracts from it. Before going on to that step, let’s find out how to see the UDFs: in the right panel, go to the ‘Product integration pack’ tab and select the correct product and model. Then, in the Entities tab, we can see a list of entities / tables, and in fields / relations, we can see columns and relations with other entities / tables (through foreign keys). After the first run, you will probably have some UDFs that are not on the list of fields / relations available. Why is this? Because what we initially see in the field list are only the Taleo out-of-the-box fields, installed with the Data Model installer. But don’t worry, this can easily be fixed: use the ‘Synchronize custom fields’ icon (highlighted on the screenshot). After clicking on it you will be taken to a log-on screen where you’ll have to provide your credentials again, and after clicking on the ‘Synchronize’ button, the UDFs will be retrieved.

 

Figure 4: Synchronizing out-of-the-box model with User Defined Fields

 

Figure 5: Synchronized list of fields, including some UDFs (marked with ‘person’ icon)

 

Part 2 – Preparing the extract: 

Once we have all the required fields available, preparing the extract is pretty straightforward. Go to File->New->New Export Wizard, choose the desired Entity, and click on Finish. Now, in the General window, set Export Mode to ‘CSV-Entity’, and in the Projections tab, select the columns that you want to extract by dragging and dropping them from the Entity->Structure window on the right. You can also add filters or sort the result set. Finally, save the export file.

The other component needed to actually extract the data is the so-called configuration. To create it, select File->New->New Configuration Wizard, point to the export file created in the previous step and, in the subsequent step, select the endpoint (the Taleo instance that the data will be extracted from). The following screen offers more extract parameters, such as request format and encoding, extract file name format and encoding, and many more. In most cases the default values will let us extract the data successfully, so unless a change is clearly required, there is no need to modify anything. The configuration file can now be saved and the extraction started by clicking on the ‘Execute the configuration’ button (on the toolbar just below the main menu). If the extraction is successful, all the indicators in the Monitoring window on the right will turn green, as in the screenshot below.

 

Figure 6: Successful TCC extraction

 

Using the bat file created during the installation, you can schedule TCC jobs to run at regular intervals with Windows Task Scheduler. However, it’s much more common to have your OBI / BI Apps (or almost any other DBMS that your organization uses as a data warehouse) installed on a Linux / Unix server, so we’re now going to look at how to install and set up TCC in a Linux environment.

Part 3 – TCC in a Linux / Unix environment: 

 

 

TCC setup in a Linux / Unix environment is a bit more complex. To simplify it, we will reuse some of the components that were created when we worked with Windows TCC. Although the frontend of the application is totally different (to be precise, there is no frontend at all in the Linux version, as it’s strictly command-line), the way the data is extracted from Taleo is exactly the same (using extracts designed as XML files and Taleo APIs). So, after downloading the application installer and data model from edelivery.oracle.com, we install both components. Installation is actually just extracting the files, first from zip to tgz, and then from tgz to uncompressed content. But this time, unlike in Windows, we recommend installing (extracting) the application first, and then extracting the data model files to an application subfolder named ‘featurepacks’ (this must be created, as it doesn’t exist by default). It’s also necessary to create a subfolder named ‘system’ in the application directory. Once this is done, you can move some components of your Windows TCC instance to the Linux one (of course, if you have no Windows machine available, you can create any of these components manually):

 

Copy the file default.configuration_brd.xml from Windows TCC/system to Linux TCC/system
Copy the extract XML and configuration XML files, from wherever you created them, to the main Linux TCC directory
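
The folder preparation and file copying described above can be sketched in shell. This is only an illustrative sketch: the function names and every path passed to them are hypothetical, and only the ‘featurepacks’ and ‘system’ folder names and the default.configuration_brd.xml file name come from the steps above.

```shell
#!/bin/bash
# Sketch only: function names and argument paths are illustrative assumptions.

# Create the two subfolders that the Linux TCC application directory needs.
#   $1 = Linux TCC application directory (already extracted)
prepare_tcc_layout() {
    local tcc_home="$1"
    mkdir -p "${tcc_home}/featurepacks" "${tcc_home}/system"
}

# Copy reusable components taken from a Windows TCC instance.
#   $1 = folder holding the copied Windows "system" subfolder and the XML files
#   $2 = Linux TCC application directory
copy_windows_components() {
    local win_copy="$1"
    local tcc_home="$2"
    cp "${win_copy}/system/default.configuration_brd.xml" "${tcc_home}/system/"
    # Extract and configuration definitions go to the main TCC directory.
    cp "${win_copy}"/*.xml "${tcc_home}/"
}
```

After calling prepare_tcc_layout on the application directory, the data model archive would be extracted into the new featurepacks subfolder before the first extract is run.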

 

There are also some changes that need to be made in the TaleoConnectClient.sh file:

 

Set the JAVA_HOME variable at the top of the file (just below the shebang line), pointing it to the path of your Java SDK installation (for some reason, in our installation, the system variable JAVA_HOME wasn’t captured correctly by the script)
In the line below #Execute the client, after the TCC_PARAMETERS variable, add:

✓ parameters of the proxy server, if one is to be used:

-Dhttp.proxyHost=ipNumber -Dhttp.proxyPort=portNumber

✓ path of the Data Model:

-Dcom.taleo.integration.client.productpacks.dir=/u01/oracle/tcc-15A.2.0.20/featurepacks

 

So, in the end, the TaleoConnectClient.sh file in our environment has the following content (IP addresses were masked):

 

#!/bin/sh
JAVA_HOME=/u01/middleware/Oracle_BI1/jdk
# Make sure that the JAVA_HOME variable is defined
if [ ! "${JAVA_HOME}" ]
then
    echo +-----------------------------------------+
    echo "+ The JAVA_HOME variable is not defined. +"
    echo +-----------------------------------------+
    exit 1
fi

# Make sure the IC_HOME variable is defined
if [ ! "${IC_HOME}" ]
then
    IC_HOME=.
fi

# Check if the IC_HOME points to a valid taleo Connect Client folder
if [ -e "${IC_HOME}/lib/taleo-integrationclient.jar" ]
then
    # Define the class path for the client execution
    IC_CLASSPATH="${IC_HOME}/lib/taleo-integrationclient.jar":"${IC_HOME}/log"

    # Execute the client
    ${JAVA_HOME}/bin/java ${JAVA_OPTS} -Xmx256m ${TCC_PARAMETERS} \
        -Dhttp.proxyHost=10.10.10.10 -Dhttp.proxyPort=8080 \
        -Dcom.taleo.integration.client.productpacks.dir=/u01/tcc_linux/tcc-15A.2.0.20/featurepacks \
        -Dcom.taleo.integration.client.install.dir="${IC_HOME}" \
        -Djava.endorsed.dirs="${IC_HOME}/lib/endorsed" \
        -Djavax.xml.parsers.SAXParserFactory=org.apache.xerces.jaxp.SAXParserFactoryImpl \
        -Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl \
        -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.Log4JLogger \
        -Djavax.xml.xpath.XPathFactory:http://java.sun.com/jaxp/xpath/dom=net.sf.saxon.xpath.XPathFactoryImpl \
        -classpath ${IC_CLASSPATH} com.taleo.integration.client.Client ${@}
else
    echo +-----------------------------------------------------------------------------------------------
    echo "+ The IC_HOME variable is defined as (${IC_HOME}) but does not contain the Taleo Connect Client"
    echo "+ The library ${IC_HOME}/lib/taleo-integrationclient.jar cannot be found. "
    echo +-----------------------------------------------------------------------------------------------
    exit 2
fi
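
Since the script depends on three locations being right (the JDK, the TCC library and the data model folder), it can help to check them before the first run. The following is a minimal sketch of such a check; tcc_preflight is a hypothetical helper of our own, not part of TCC, and the paths are whatever you pass in.

```shell
#!/bin/bash
# Sketch only: tcc_preflight is a hypothetical helper, not part of TCC.
# Prints each problem found and returns 1 if any check fails.
#   $1 = JAVA_HOME candidate, $2 = TCC application directory,
#   $3 = data model (featurepacks) directory
tcc_preflight() {
    local java_home="$1"
    local ic_home="$2"
    local featurepacks="$3"
    local status=0
    if [ ! -x "${java_home}/bin/java" ]; then
        echo "No executable java found under ${java_home}/bin"
        status=1
    fi
    if [ ! -e "${ic_home}/lib/taleo-integrationclient.jar" ]; then
        echo "taleo-integrationclient.jar not found under ${ic_home}/lib"
        status=1
    fi
    if [ ! -d "${featurepacks}" ]; then
        echo "Data model folder ${featurepacks} does not exist"
        status=1
    fi
    return "${status}"
}
```

For the environment shown above, the call would take /u01/middleware/Oracle_BI1/jdk, the TCC directory and its featurepacks subfolder as arguments.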

 

Once this is ready, we can apply the necessary changes to the extract and configuration files, although there is no need to change anything in the extract definition (file blog_article_sq.xml). Let’s have a quick look at the content of this file:

 

<?xml version="1.0" encoding="UTF-8"?>
<quer:query productCode="RC1501" model="http://www.taleo.com/ws/tee800/2009/01" projectedClass="JobInformation" locale="en" mode="CSV-ENTITY" largegraph="true" preventDuplicates="false" xmlns:quer="http://www.taleo.com/ws/integration/query">
  <quer:subQueries/>
  <quer:projections>
    <quer:projection><quer:field path="BillRateMedian"/></quer:projection>
    <quer:projection><quer:field path="JobGrade"/></quer:projection>
    <quer:projection><quer:field path="NumberToHire"/></quer:projection>
    <quer:projection><quer:field path="JobInformationGroup,Description"/></quer:projection>
  </quer:projections>
  <quer:projectionFilterings/>
  <quer:filterings/>
  <quer:sortings/>
  <quer:sortingFilterings/>
  <quer:groupings/>
  <quer:joinings/>
</quer:query>

 

Just by looking at the file we can figure out how to add more columns manually: we just need to add more quer:projection elements inside quer:projections, like

 

<quer:projection><quer:field path="DesiredFieldPath"/></quer:projection>
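
If columns need to be added to several extract definitions, the manual edit can be scripted. The sketch below uses sed to place a new projection just before the closing quer:projections tag. The helper name add_projection is our own invention, and the sketch assumes that the file contains exactly one such closing tag and that the field path has no characters that are special to sed (|, & or a backslash).

```shell
#!/bin/bash
# Sketch only: add_projection is a hypothetical helper, not part of TCC.
# Assumes one closing </quer:projections> tag in the file and a field path
# without characters that are special to sed (|, &, backslash).
#   $1 = extract definition XML file, $2 = field path to add
add_projection() {
    local extract_file="$1"
    local field_path="$2"
    local new_tag="<quer:projection><quer:field path=\"${field_path}\"/></quer:projection>"
    # Write to a temporary file first so the original is replaced only on success.
    sed "s|</quer:projections>|${new_tag}</quer:projections>|" "${extract_file}" \
        > "${extract_file}.tmp" && mv "${extract_file}.tmp" "${extract_file}"
}
```

A call such as add_projection blog_article_sq.xml DesiredFieldPath would append one column; the field path still has to exist in the data model for the extract to succeed.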

 

With regard to the configuration file, we need to make some small changes: the cli:SpecificFile and cli:Folder tags contain absolute Windows paths, so once we move the files to Linux, we need to replace them with Linux filesystem paths, absolute or relative. Once the files are ready, the only remaining task is to run the extract, which is done by running:

 

./TaleoConnectClient.sh blog_article_cfg.xml

 

See the execution log:

 

[KKanicki@BIApps tcc-15A.2.0.20]$ ./TaleoConnectClient.sh blog_article_cfg.xml
2017-03-16 20:18:26,876 [INFO] Client - Using the following log file: /biapps/tcc_linux/tcc-15A.2.0.20/log/taleoconnectclient.log
2017-03-16 20:18:26,876 [INFO] Client - Using the following log file: /biapps/tcc_linux/tcc-15A.2.0.20/log/taleoconnectclient.log
2017-03-16 20:18:27,854 [INFO] Client - Taleo Connect Client invoked with configuration=blog_article_cfg.xml, request message=null, response message=null
2017-03-16 20:18:27,854 [INFO] Client - Taleo Connect Client invoked with configuration=blog_article_cfg.xml, request message=null, response message=null
2017-03-16 20:18:31,010 [INFO] WorkflowManager - Starting workflow execution
2017-03-16 20:18:31,010 [INFO] WorkflowManager - Starting workflow execution
2017-03-16 20:18:31,076 [INFO] WorkflowManager - Starting workflow step: Prepare Export
2017-03-16 20:18:31,076 [INFO] WorkflowManager - Starting workflow step: Prepare Export
2017-03-16 20:18:31,168 [INFO] WorkflowManager - Completed workflow step: Prepare Export
2017-03-16 20:18:31,168 [INFO] WorkflowManager - Completed workflow step: Prepare Export
2017-03-16 20:18:31,238 [INFO] WorkflowManager - Starting workflow step: Wrap SOAP
2017-03-16 20:18:31,238 [INFO] WorkflowManager - Starting workflow step: Wrap SOAP
2017-03-16 20:18:31,249 [INFO] WorkflowManager - Completed workflow step: Wrap SOAP
2017-03-16 20:18:31,249 [INFO] WorkflowManager - Completed workflow step: Wrap SOAP
2017-03-16 20:18:31,307 [INFO] WorkflowManager - Starting workflow step: Send
2017-03-16 20:18:31,307 [INFO] WorkflowManager - Starting workflow step: Send
2017-03-16 20:18:33,486 [INFO] WorkflowManager - Completed workflow step: Send
2017-03-16 20:18:33,486 [INFO] WorkflowManager - Completed workflow step: Send
2017-03-16 20:18:33,546 [INFO] WorkflowManager - Starting workflow step: Poll
2017-03-16 20:18:33,546 [INFO] WorkflowManager - Starting workflow step: Poll
2017-03-16 20:18:34,861 [INFO] Poller - Poll results: Request Message ID=Export-JobInformation-20170316T201829;Response Message Number=123952695;State=Completed;Record Count=1;Record Index=1;
2017-03-16 20:18:34,861 [INFO] Poller - Poll results: Request Message ID=Export-JobInformation-20170316T201829;Response Message Number=123952695;State=Completed;Record Count=1;Record Index=1;
2017-03-16 20:18:34,862 [INFO] WorkflowManager - Completed workflow step: Poll
2017-03-16 20:18:34,862 [INFO] WorkflowManager - Completed workflow step: Poll
2017-03-16 20:18:34,920 [INFO] WorkflowManager - Starting workflow step: Retrieve
2017-03-16 20:18:34,920 [INFO] WorkflowManager - Starting workflow step: Retrieve
2017-03-16 20:18:36,153 [INFO] WorkflowManager - Completed workflow step: Retrieve
2017-03-16 20:18:36,153 [INFO] WorkflowManager - Completed workflow step: Retrieve
2017-03-16 20:18:36,206 [INFO] WorkflowManager - Starting workflow step: Strip SOAP
2017-03-16 20:18:36,206 [INFO] WorkflowManager - Starting workflow step: Strip SOAP
2017-03-16 20:18:36,273 [INFO] WorkflowManager - Completed workflow step: Strip SOAP
2017-03-16 20:18:36,273 [INFO] WorkflowManager - Completed workflow step: Strip SOAP
2017-03-16 20:18:36,331 [INFO] WorkflowManager - Completed workflow execution
2017-03-16 20:18:36,331 [INFO] WorkflowManager - Completed workflow execution
2017-03-16 20:18:36,393 [INFO] Client - The workflow execution succeeded.
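
For unattended runs it is useful to detect failures automatically. The sketch below wraps the call in two hypothetical helpers of our own: one refuses to run if the configuration file still contains Windows-style paths (a drive letter followed by a backslash), the other treats the run as successful only if the final message from the log above appears in the output. We have not verified which exit codes TCC returns on failure, so the check relies on the message rather than on the exit status.

```shell
#!/bin/bash
# Sketch only: both helpers are hypothetical and not part of TCC.

# Succeeds if the file still contains a Windows-style path such as C:\...
has_windows_paths() {
    grep -qE '[A-Za-z]:\\' "$1"
}

# Runs one extract; returns 0 only if the success message appears in the output.
#   $1 = path to TaleoConnectClient.sh, $2 = configuration XML file
run_extract() {
    local client="$1"
    local config="$2"
    local output
    if has_windows_paths "${config}"; then
        echo "Windows paths still present in ${config}" >&2
        return 2
    fi
    output=$("${client}" "${config}" 2>&1)
    printf '%s\n' "${output}"
    printf '%s\n' "${output}" | grep -q 'The workflow execution succeeded\.'
}
```

From the TCC directory, the call would be run_extract ./TaleoConnectClient.sh blog_article_cfg.xml, and the return status can then drive alerting in whatever scheduler is used.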

 

And that’s it! Assuming our files were correctly prepared, the extract will be ready in the folder declared in the cli:Folder tag of the configuration file. As for scheduling, different approaches are available; the most basic is to use the Linux crontab as the scheduler, but you can just as easily use any ETL tool already used in your project. See the screenshot below for an ODI example:

 

Figure 7: TCC extracts placed into 1 ODI package

 

The file extract_candidate.sh contains a simple call to the TCC extraction:

 

[KKanicki@BIApps tcc-15A.2.0.20]$ cat extract_candidate.sh
#!/bin/bash
cd /u01/tcc_linux/tcc-15A.2.0.20/
./TaleoConnectClient.sh extracts_definitions/candidate_cfg.xml
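
For the crontab approach mentioned earlier, the same script can be scheduled directly. The sketch below only builds the crontab line; cron_entry is a hypothetical helper, and the schedule and log file location are example values rather than recommendations.

```shell
#!/bin/bash
# Sketch only: cron_entry is a hypothetical helper; schedule and paths are examples.
# Builds a crontab line that runs a script once a day at the given time
# and appends its standard output and error to a log file.
#   $1 = minute, $2 = hour, $3 = script to run, $4 = log file
cron_entry() {
    local minute="$1"
    local hour="$2"
    local script="$3"
    local logfile="$4"
    printf '%s %s * * * %s >> %s 2>&1\n' "${minute}" "${hour}" "${script}" "${logfile}"
}
```

Calling cron_entry 0 2 /u01/tcc_linux/tcc-15A.2.0.20/extract_candidate.sh /tmp/extract_candidate.log prints a line that runs the extract every day at 02:00. The line can be added to the current user’s schedule with crontab -e, or by piping the existing entries plus the new one into crontab -.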

 

If your extracts fail or you have any other issues with configuring Taleo Connect Client, feel free to ask us in the comments section below! In the last couple of years we have delivered several highly successful BI projects in the Human Resources / Human Capital space. Don’t hesitate to contact us if you would like to receive specific information about these solutions!

Karol K
karol.kanicki@clearpeaks.com