Oracle BI Cloud Service – Part III: Modeling Data


Data Modeler tool

In the last post of this BICS series, we reviewed how data can be loaded into the cloud. Once you have all your data loaded and ready, it is time to build the business model, or in other words, create the repository. For that purpose, a browser-based tool called Data Modeler is accessible from the home page of the BI Cloud Service.

Figure 1: Accessing the Data Modeler tool


Data Modeler is the equivalent of the Administration Tool in on-premise OBI, although much simplified and with a reduced set of capabilities, given that the main premise of the cloud version of OBI is precisely that it should be simple and have a fast deployment cycle. Still, Data Modeler allows you to build a repository from your source tables with all the basic features of the OBIEE Admin Tool, and all from your web browser.

The application is divided into panes: the left pane lists database objects, data model objects, variables, and users & roles, while the right pane shows in detail the fact tables, dimension tables, and joins in your data model.

Figure 2: Data Modeler


Since we are in a cloud environment, there has to be a way to coordinate multiple users modifying the same model. Data Modeler uses a Lock & Edit system for this: users must lock the data model before making any changes to the model or to database views (note that the lock only affects the Data Modeler tool and does not prevent other users from changing database views with other tools such as SQL Developer). Only one user at a time can hold the lock, and users with administrative privileges can override locks and discard changes made by other users.

After every change, the data model has to be validated (the equivalent of a consistency check in the OBIEE Admin Tool), and any validation errors will appear at the bottom of the right pane. All the changes made to the model need to be published in order to save them permanently and make them available to other users. You can retain the lock after publishing the changes, discard the changes and release the lock, or discard the changes and revert to the previously published state.

Data should be refreshed frequently to ensure that it reflects the most up-to-date information. You can refresh the source objects and the data model, and synchronize with the database to automatically detect new columns, obsolete objects, and other discrepancies. It is also advisable to clear the cache after loading new data, to ensure that the most recent data is displayed in analyses.

Data Modeler also offers users the possibility to create source views, i.e. saved data queries in the database. Source views are useful when using a single table as a source for more than one table, when creating a dimension table based on multiple source tables, and when performing pre-aggregation calculations. In general, source views facilitate the handling of model objects and should be used as a base when subsequent changes to these objects might be needed.
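
For illustration, the query behind a pre-aggregating source view could look like the one below; the table and column names (order_lines, line_amount, etc.) are hypothetical and only sketch the idea:

-- Hypothetical source view query: pre-aggregate order lines to one row per order
SELECT order_id,
       customer_id,
       SUM(line_amount) AS order_amount,
       COUNT(*)         AS line_count
FROM   order_lines
GROUP  BY order_id, customer_id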

You can partition a single denormalized source table into fact and dimension tables, or create fact and dimension tables directly from individual source tables.


Figure 3: Adding fact and dimension tables to the model (from the article Using Oracle BI Cloud Service)

Joins are created automatically between fact and dimension tables if the join references exist in the source tables, and of course you can manually create other joins in Data Modeler. Hierarchies and levels can be added to dimension tables joined to fact tables, and aggregation levels can be set for measures. As in the OBIEE Admin Tool, you can make use of variables (static and dynamic repository variables as well as session variables are supported). You can also secure the model, setting permissions on tables, columns, and even records using filters.
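
Dynamic variables in Data Modeler are populated by a SQL query that you provide when defining them; as a minimal sketch (the table and column names are hypothetical), a variable holding the latest loaded date could be backed by a query such as:

-- Hypothetical initialization query for a dynamic variable, e.g. LAST_LOAD_DATE
SELECT MAX(order_date) FROM orders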

A recently added feature of Data Modeler is the possibility to upload a full repository to the cloud service, so if you have already modeled your business data with OBIEE you don’t need to start from scratch in the cloud. This also allows a single model to be used both on-premise and in the cloud. Bear in mind, however, that repositories uploaded from on-premises OBIEE cannot be edited through the BICS Data Modeler, so changes should be made in the OBIEE Admin Tool and uploaded again (after a consistency check).

So once you have your data and repository in the cloud, you are ready to start reporting against it.

Check out our next post of the BICS series to learn about analyses, reports, and dashboards in the BI Cloud Service.


Oracle BI Cloud Service – Part II: Loading Data


In the first post of the Oracle BI Cloud Service (BICS) series we reviewed some of its key features. In the following posts, we will explain how the service actually works, starting with the data provisioning.

All the data used in the BI Cloud Service has to be stored in the Database Schema Service or Database as a Service, so uploading your data to the cloud is the first step towards a cloud OBI. There are several ways to do it, depending on your specific requirements:

Data Loader

The default method of uploading your data to the cloud is by using the built-in Data Loader accessible from the BICS homepage.


Figure 1: Accessing the Data Loader tool (from Oracle BI Cloud Service - A First Look)


Data Loader is an easy-to-use browser-based application that allows you to upload data from files (txt, csv, xls, xlsx) with a maximum of 500,000 rows, with columns formatted as number, date, or character (for the moment only UTF-8 encoding is supported). It also lets you perform simple transformations such as Uppercase, Lowercase, Trim Spaces, and Format as a Number.

You can use Data Loader to perform full manual refreshes, incremental data loads, and upserts (update/insert), add data to the model, view the load history, and correct errors.

SQL Developer

Those who have worked with other Oracle Database products will already know SQL Developer. The application now features a new cloud connection that you can configure to load data from files (with no row limit) and relational sources into your cloud service. It runs on your local machine, so you need to download it from the OTN downloads webpage and install it locally. Once configured, SQL Developer connects to the cloud service through a set of RESTful web service calls and can be used to load data via SFTP (SQL*Loader utility). You can also create “carts” with multiple objects to be uploaded, and schedule incremental loads.



Figure 2: Loading carts (from Loading Relational Tables Using SQL Developer)

BI Cloud Service (BICS) Data Sync

Data Sync is another application for loading your data into the cloud service, and it is available for download from the OTN downloads webpage. It is a wizard-driven tool supporting the load of data from CSV files and relational sources into the Database Schema Service. It also supports loading and merging data from different sources (DB2, SQL Server, MySQL, Teradata, and TimesTen) in addition to Oracle relational sources.

Use BICS Data Sync to perform incremental data loads, rolling deletes, and append load strategies. You can also schedule and monitor data loads (called “jobs”).

Be aware, however, that Data Sync is not officially released: Oracle does not support it and does not guarantee that it will be supported in future releases (future upgrades are not guaranteed either), so care should be taken when using it.


Figure 3: BICS Data Sync (from Loading On-Premises Relational Sources and Files to the Cloud Using Oracle BI Cloud Service Data Sync)

BICS & Database Schema Service REST APIs

You can also use the BI Cloud Service and Database Schema Service REST APIs to define a customized API and programmatically load data into the Database Schema Service. With the BICS REST API, you can update statistics, drop or create indices on tables, and insert, update, upsert, and delete records. You can also combine the API with any other application or script in any programming language, offering endless possibilities, such as invoking applications and scripts with an on-premises scheduler or integrating the load process with ETL tools.

Similarly, with the Database Schema Service REST API you can issue SQL queries to read data and return results, call PL/SQL scripts to read, write, modify or delete data, and define your own customized API to be invoked from your on-premises environment. Moreover, using the APEX_WEB_SERVICE package within a PL/SQL block, you can invoke any REST/SOAP API supported by cloud applications and retrieve data from external systems.
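
As a minimal sketch of the APEX_WEB_SERVICE approach (the endpoint URL and the staging table are hypothetical placeholders), a PL/SQL block could retrieve a REST payload and stage it for further processing:

-- Minimal sketch: call an external REST service and stage the raw response
-- The URL and the STG_ORDERS_RAW table are hypothetical placeholders
DECLARE
  l_response CLOB;
BEGIN
  l_response := apex_web_service.make_rest_request(
                  p_url         => 'https://example.com/api/orders',
                  p_http_method => 'GET');
  INSERT INTO stg_orders_raw (payload, loaded_at)
  VALUES (l_response, SYSDATE);
  COMMIT;
END;
/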

PL/SQL Scripts

Finally, you can use PL/SQL scripts to load data from external, generic web services. Scripts can be created using SQL Workshop, a robust browser-based tool that is part of APEX and allows you to create and execute SQL queries and database procedures against objects in the Database Schema Service.
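
As an illustration of the kind of database procedure you might create and run from SQL Workshop (for instance, upserting staged rows into a reporting table), here is a minimal sketch with hypothetical object names:

-- Hypothetical example: upsert staged rows (STG_SALES) into a reporting table (SALES_F)
CREATE OR REPLACE PROCEDURE load_sales AS
BEGIN
  MERGE INTO sales_f tgt
  USING stg_sales src
  ON (tgt.sale_id = src.sale_id)
  WHEN MATCHED THEN
    UPDATE SET tgt.amount = src.amount, tgt.sale_date = src.sale_date
  WHEN NOT MATCHED THEN
    INSERT (sale_id, amount, sale_date)
    VALUES (src.sale_id, src.amount, src.sale_date);
  COMMIT;
END load_sales;
/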


Check out part III of the BICS series in which we will cover how you can model your data in the cloud according to your business requirements.


Oracle BI Cloud Service – Part I: Introduction & Key Features


Last year Oracle released the Business Intelligence Cloud Service (BICS) as part of its Platform as a Service (PaaS) offering, and it has been releasing upgrades regularly since then, some with important new features. In this BICS post series we will be giving an overview of the BI cloud service and walking you through some of its key features and functions:

  • Part I: Introduction & Key Features
  • Part II: Loading Data
  • Part III: Modeling Data
  • Part IV: Creating analyses, reports and dashboards
  • Part V: Managing the Service & Conclusion

Cloud technologies are no longer something from the future but a reality, with some operating systems, like Google Chrome OS, relying almost completely on cloud applications. Trends indicate that the shift towards cloud environments will accelerate even more in the coming years, so companies are rushing to release cloud versions of all their software solutions. Oracle has been moving many of its software platforms to the cloud over the past years, and last year it was OBI’s turn with the Business Intelligence Cloud Service.

Figure 1: Business Intelligence Cloud Service Web Page



Below are some of its key features.

The service is 100% cloud-based, with almost no local installs needed (some data loading tools might be required locally), and it comes bundled with the Oracle Database Schema Service, as BICS can only report against data stored in the cloud. With the Database Schema Service you get one schema on Oracle Database 11g with 50GB of storage, full PL/SQL support, multiple tools for loading data, Oracle Exadata hardware running in the background, RESTful web services, and Oracle Application Express (APEX), but no SQL*Net access. Since the last release, there is also the possibility to connect the BI service to a Database as a Service (DBaaS), which includes a dedicated virtual machine running Oracle Database 11g or 12c with full SQL*Net access and all the storage capacity you may need.

One of the most interesting features of the service is that it is 100% mobile-ready: all the content you create in the cloud is immediately available on any iOS or Android device by downloading the Oracle BI Mobile App from the App Store or Google Play, with no extra programming required (touch gestures such as zoom and swipe are already built in).

Figure 2: 100% mobile ready (from the Oracle BI Cloud Service & Data Discovery webinar by Oracle Deutschland B.V. & Co. KG)

The service comes with two instances, PROD and pre-PROD, for development and testing, and provides automatic full system backup & restore, role-based fine-grained security, and simple self-administration. All the patching and system maintenance is handled by Oracle, so you don’t have to worry about it.

So what are the benefits of the BI Cloud Service compared to the traditional on-premise OBI? The first benefit that comes to mind is the huge cost reduction: you pay a monthly subscription of $250 per named user (10 users minimum) for the BI Cloud Service, plus $1000 per month for the 50GB of cloud storage in the Database Schema Service, so a minimum of $3500 per month, which is much cheaper than an OBIEE license. Not having to purchase annual licenses and being able to pay only for the storage capacity you require are big bonuses too. Moreover, working in the cloud means working with a standardised and consolidated infrastructure (there is no need for servers, storage, or any other network components), so you dramatically reduce your capital and operational expenditures compared to those you would incur with on-premises OBI.

As you need fewer database elements and you get standardised operating systems, servers, database versions, and configurations, the administrative burden is also dramatically reduced. Also, given that system maintenance is fully managed by Oracle, you get guaranteed system availability. Overall, the environment is more reliable and manageable, resulting in lower risk for the user.

Finally, another key benefit of the BI Cloud Service is agility. The rapid deployment of development and production environments and the shorter upgrade cycle make the system highly agile, and thanks to Oracle Real Application Clusters (RAC) the cloud databases adapt to your workload volumes.

Be sure to check out the rest of our BICS series posts for more detailed information on the basic features of the service.


Oracle Financial Analytics: General ledger to sub-ledgers navigation


When implementing Oracle Financial Analytics, a common requirement is to enable navigation from general ledger reports to sub-ledger reports in order to obtain information on specific transactions. One of the main reasons for this is to reconcile the general ledger with the individual sub-ledgers.

The general ledger is the master set of accounts that summarises the sub-ledger transactions and is used to prepare the financial statements and journal entries. A sub-ledger is a detailed record of the transactions for an individual account.

Sub-ledgers serve as support for amounts posted to the general ledger. It is important to reconcile the general ledger balances to the sub-ledger balances on a regular basis to spot possible discrepancies.

In this blog article I will describe how this process can be approached in OBI Apps: the OBI Apps data model, the configuration required, and the creation of a sample navigation from general ledger to sub-ledger using an intermediate analysis.

The expected navigation goes from a GL Account balance down to the Payables details.


Note: This information applies when the data source is Oracle EBS.

OBI Apps enables this navigation by default, using the ‘GL Journal ID’ column as a link between subject areas. It is available in the general ledger transactions subject area and in the following sub-ledgers (in some cases, ‘GL Journal ID’ has to be exposed in the presentation layer):

  • Accounts Payable
  • Accounts Receivable
  • Revenue
  • Purchase Orders
  • Fixed Assets (Available in OBIA 11g)

Understanding the OBIA model

‘GL Journal ID’ is a unique identifier of an EBS journal entry line and it is available in the linkage table W_GL_LINKAGE_INFORMATION_G.

Sub-ledger transaction tables are linked to W_GL_LINKAGE_INFORMATION_G at source distribution level.


For each sub-ledger transaction fact there is a logical “transaction detail” dimension comprising the transaction fact and the W_GL_LINKAGE_INFORMATION_G table; ‘GL Journal ID’ is available in this dimension. For example, the logical dimension “Dim – AP Transaction Details” includes the physical source “Fact_W_AP_XACT_F_GLJournalID”.


With this configuration it is possible to have ‘GL Journal ID’ available in the sub-ledger transaction subject areas. And since ‘GL Journal ID’ is also available in the “Financial - GL Detail Transactions” subject area, the navigation from general ledger to sub-ledger transactions can be configured using the OBIEE navigation functionality.
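
To make the underlying model more tangible, the physical join between a sub-ledger fact and the linkage table is conceptually similar to the SQL sketch below; the join and ID column names are illustrative only, so check the actual physical joins in your repository:

-- Conceptual sketch of the AP-to-GL linkage (column names are illustrative)
SELECT ap.*,
       lnk.gl_journal_id
FROM   w_ap_xact_f ap
JOIN   w_gl_linkage_information_g lnk
  ON   lnk.source_distribution_id = ap.source_distribution_id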

Creating GL to sub-ledger navigation

‘GL Journal ID’ is available in the two subject areas that we want to link, but the level of detail of the source and target reports is not the same, which is why an intermediate analysis is needed to link the two reports. This process is described in this section.

  1. Create the GL report and enable navigation.

Using the subject area “Financial - GL Detail Transactions”, create a new analysis that shows the Balance by Accounting Document Source of a selected account for a selected period. Accounting Document Source is used to identify the sub-ledger source.


  2. Create an intermediate analysis that contains the ‘GL Journal ID’.

Using the subject area “Financial - GL Detail Transactions”, create an analysis that contains the ‘GL Journal ID’ column. Account and period filters need to be created as well. This report will be used to filter the sub-ledger transactions.


  3. Create a detailed sub-ledger report

Using one of the sub-ledger subject areas, create a new analysis and filter the results by an existing report, using the analysis created in step 2 as the “Saved Analysis”.



  4. Enable navigation

In the source report (created in step 1), use Action Links to enable navigation to the different sub-ledger reports; for example, navigation to the AP Transactions report when the Accounting Document Source is Payables.


In conclusion, the OBI Apps data model is prepared to link general ledger and sub-ledger reports, but the link has to be configured at the analysis level. This feature can be used for the reconciliation process between the general ledger and the sub-ledgers.

If you want to go further, create a report with one column that shows the summaries sourced from the general ledger, a second column that shows the sub-ledger totals, and a third column that shows the difference between the two amounts; if the value of this last column is not 0, navigate to the details to discover the discrepancies.


Big Data Ecosystem – Spark and Tableau


In this article we give you the big picture of how Big Data fits into your current BI architecture, and how to connect Tableau to Spark to enrich your existing BI reports and dashboards with data that you were not able to analyse before. Give your reports and dashboards a 360º view, and understand what, when, why, who, where and how.

After reading this article you will understand what Big Data can offer you and you will be able to load your own data into HDFS and perform your analysis on it using Tableau powered by Apache Spark.

The Big Data ecosystem

When considering a Big Data solution, it is important to keep in mind the architecture of a traditional BI system and how Big Data comes into play.

Until now we have basically been working with structured data, coming mainly from RDBMSs and loaded into a DWH, ready to be analysed and shown to the end user. Before considering how this structure may change when bringing Big Data into the field, one could wonder how exactly Big Data technology benefits the current solution. It allows the system to process higher volumes of more diverse data much faster, and to efficiently and safely extract information from data that a traditional solution cannot handle, thanks to its high fault tolerance.

In addition, Big Data allows the hardware infrastructure to grow horizontally, which is more economical and flexible.

So, how does Big Data enter this ecosystem? The main architecture concepts remain much the same, but there are big changes: a whole new set of data sources, particularly non-structured ones, and a completely new environment for storing and processing data.

Figure: Traditional BI architecture (top) and the new Big Data architecture (bottom)

In the picture above, at the top we have our traditional BI architecture. Below it we can see how the new Big Data architecture still preserves the same concepts (data acquisition, data storage, etc.). We are showing a few of the Big Data tools available in the Apache Hadoop project.

What is important to point out is that reporting & visualization must combine data from traditional and Big Data storage to provide a 360º view, which is where the true value resides.

There are different ways to combine them. We could run our aggregation calculations over HDFS or Cassandra data to feed the data warehouse with information we were unable to compute before, or we could use a reporting & visualization tool capable of combining the traditional data warehouse with Big Data storage and engines, as Tableau does.

A Big Data implementation: Apache Spark + Tableau

When approaching a Big Data implementation, there are quite a lot of different options and possibilities available, from new data sources and connectors to the final visualization layer, passing through the cluster and its components for storing and processing data.

A good approach to a Big Data solution is the combination of Apache Spark for processing in Hadoop clusters, consuming data from storage systems such as HDFS, Cassandra, HBase or S3, with Tableau as the visualization software that makes the information available to the end users.

Spark has demonstrated a great improvement in performance compared to the original Hadoop MapReduce model. It also stands out as a one-component solution for Big Data processing, with support for ETL, interactive queries, advanced analytics and streaming.

The result is a unified engine for Big Data that stands out in low-latency applications where fast performance is required, such as iterative processing, interactive querying, large-scale batch computations, streaming or graph computations.

Tableau is growing really quickly and has already proven to be one of the most powerful data discovery and visualisation tools. It has connectors to nearly any data source, such as Excel, corporate data warehouses or SparkSQL. But where Tableau really stands out is in transforming data into compelling, interactive dashboards and visualizations through its intuitive user interface.

The combination of Apache Spark with Tableau stands out as a complete end-to-end Big Data solution, relying on Spark’s capabilities for processing the data and on Tableau’s expertise for visualisation. Integrating Tableau with Apache Spark makes it possible to visually analyse Big Data in an easy and business-friendly way; no Spark SQL code is needed.

Connecting Tableau with Apache Spark

Here at ClearPeaks, we are convinced that connecting Apache Spark to Tableau is one of the best approaches for processing and visualising Big Data. So, how does this solution work? We are already working with this technology, and are proud to show a demonstration of Tableau connected to Apache Spark. To follow it you will need:

  • Tableau Desktop, any version that supports the SparkSQL connector.
  • Apache Spark installed either on your machine or on an accessible cluster.


Tableau uses a specific SparkSQL connector, which communicates with the Spark Thrift Server, which in turn uses the Apache Spark engine.


Software components

  • Tableau Desktop
  • Apache Spark Driver for ODBC with SQL Connector
  • Apache Spark (includes Spark Thrift Server)

Set up the environment

Installing Tableau Desktop and Apache Spark is out of the scope of this article. We assume that you have already installed Tableau Desktop and Apache Spark.

Apache Spark needs to be built with Hive support, i.e. adding the -Phive and -Phive-thriftserver profiles to your build options (see the Spark build documentation for details).

Install Apache Spark Driver for ODBC with SQL Connector

Install the Apache Spark ODBC driver with SQL Connector from the Simba webpage. They offer a free trial period which can be used to follow this article.

It has an installation wizard which makes installation a straightforward process.

Configure and start Apache Spark Thrift Server

Configuration files

Spark Thrift Server uses the Hive Metastore by default unless another database is specified, so we need to copy the hive-site.xml config file from Hive to the Spark conf folder:

cp /etc/hive/hive-site.xml /usr/lib/spark/conf/

Spark needs access to the Hive libraries in order to connect to the Hive Metastore. If those libraries are not already in the Spark CLASSPATH variable, they need to be added.

Add the Hive libraries to the Spark classpath script under /usr/lib/spark/bin/.


Start Apache Spark Thrift Server

We can start Spark Thrift Server with the following command:

./sbin/start-thriftserver.sh --master <master-uri>

<master-uri> might be yarn-cluster if you are running yarn, or spark://host:7077 if you are running spark in standalone mode.

Additionally, you can specify the host and port using the following properties:

./sbin/start-thriftserver.sh \

  --hiveconf hive.server2.thrift.port=<listening-port> \

  --hiveconf hive.server2.thrift.bind.host=<listening-host> \

  --master <master-uri>

To check whether Spark Thrift Server has started successfully, you can look at the Thrift Server log file; <thriftserver-log-file> is shown in the console output after starting Spark Thrift Server.

tail -f <thriftserver-log-file>

Spark Thrift Server is ready to serve requests as soon as the log file shows the following lines:

INFO AbstractService: Service:ThriftBinaryCLIService is started.

INFO AbstractService: Service:HiveServer2 is started.

INFO HiveThriftServer2: HiveThriftServer2 started

INFO ThriftCLIService: ThriftBinaryCLIService listening on

Connect Tableau using SparkSQL connector

Start Tableau and select the option to connect to Spark SQL.

Select the appropriate Type depending on your Spark version and the appropriate Authentication depending on your security.


The next steps are selecting the schema, tables and desired relations, the same as with any other Tableau connector.

Now you are able to run your own analysis on Big Data powered by Spark!


The dashboard we built for this demonstration was created in Tableau 9.0 by following the instructions above; Tableau uses Apache Spark to transparently retrieve and perform calculations over our data stored in HDFS.

Show us a capture of your Spark-powered dashboards and reports, and share your impressions of the Apache Spark and Tableau tandem in the comments section at the bottom.

Happy analytics!


Eduard Gil & Pol Oliva

Bonus: Add data to Hive Metastore to consume it in Tableau

If you are not familiar with the process of loading data to Hive Metastore you will find this section very useful.

This section describes how to load a CSV file from your file system into the Hive Metastore. After this process you will be able to use it from Tableau, following the steps described earlier in this article.

For this example we are going to use the following file, my_employees.csv, which contains the well-known employees example:


123234877,Michael,Rogers, IT
546523478,John,Doe,Human Resources
654873219,Zacary,Efron,Human Resources
745685214,Eric,Goldsmith,Human Resources

As we can see, it follows the schema: Employee ID, Name, Last Name, and Department.

We are going to use Beeline, which is shipped with both Spark and Hive, to connect to the Thrift JDBC server.

Start Beeline from the command line:

./bin/beeline

Connect to Thrift JDBC Server

beeline> !connect jdbc:hive2://localhost:10000

Create the table and specify its schema:

beeline> CREATE TABLE employees (employee_id INT, name STRING, last_name STRING, department STRING)  ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

Now you are ready to load your my_employees.csv file into the previously created table

beeline> LOAD DATA LOCAL INPATH '/home/user/my_employees.csv' INTO TABLE employees;

We can even run operations over the employees table using Beeline:

beeline> SELECT COUNT(*) FROM employees;
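
Any other HiveQL statement works in the same way; for example, a simple aggregation by department:

beeline> SELECT department, COUNT(*) AS num_employees FROM employees GROUP BY department;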

