Data Quality Series – Part 3: Data Standardization with EDQ



 

In the first part of the Data Quality Series we introduced the main concepts of data quality. In the second article we explained how Oracle Enterprise Data Quality (EDQ) can help us profile our data.

The focus in this post is to review some of EDQ’s basic cleansing and standardization capabilities.

 

Introduction

EDQ provides a number of built-in transformations for data cleansing; some of the most commonly used are listed below (a rough Python sketch of what a few of them do follows the list):

Character Replace: replaces characters according to a map
Denoise: removes noise characters from text attributes
Lower Case: converts string values to lower case
Upper Case: converts string values to upper case
Proper Case: converts string values to proper case
Normalize No Data: normalizes attributes with no data to nulls
Normalize Whitespace: removes leading and trailing whitespace, and normalizes inter-word whitespace to a single space
RegEx Replace: replaces a string value matching a regular expression with a given value, or part of the matching expression
Replace: replaces values using a map
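As an illustration of the kind of work these processors do, here is a minimal sketch using pandas (hypothetical column names and mapping values; this is an analogue of the processors, not EDQ code):

import pandas as pd

# Sample data with the usual problems: stray whitespace, inconsistent case, empty values
df = pd.DataFrame({"Country": ["  usa", "U.S.A ", None, "   ", "canada"]})

# Normalize Whitespace: trim and collapse inter-word whitespace to a single space
df["Country"] = df["Country"].str.strip().str.replace(r"\s+", " ", regex=True)

# Normalize No Data: turn attributes left with no data into nulls
df["Country"] = df["Country"].replace({"": None})

# Upper Case: convert string values to upper case
df["Country"] = df["Country"].str.upper()

# Replace: map known values to a standard value using a map
replacements = {"USA": "United States of America",
                "U.S.A": "United States of America",
                "CANADA": "Canada"}
df["Country"] = df["Country"].replace(replacements)

print(df)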

EDQ also comes with a series of built-in processors for data type conversion that make the normalization of types in our data really easy. There are also transformations that can be used to enhance our data, for example by adding dates, concatenating or parsing fields, or adding initials from text values.

Maps can be very useful for data standardization: by keeping all the possible values representing an entity together with the corresponding standard value for that entity in map tables, we can standardize the data using EDQ in just a couple of simple steps. In the following example, we will see how we can improve the quality of two fields commonly found in data about people: Country and Gender.

 

1. Data Standardization with EDQ

Let’s suppose that we have a dataset with a Country field. The table should contain data mostly for the United States; however, a quick Frequency profile shows the following results:


Figure 1: Frequency profile of the Country field. We can see that some countries are represented by more than one value in the database.

We can see that the data quality of this field is very poor: the values ‘USA’, ‘US’, ‘U.S.A’, ‘United States’ and ‘U.S’ all refer to the single entity that we would like to call ‘United States of America’. Moreover, there are two different values for Canada (‘Canada’ and ‘CAN’), as well as some null values. With EDQ, we can easily fix issues like this using mapping tables.

The first transformation that we need is the Normalize Whitespace processor, which removes leading and trailing whitespace, and normalizes inter-word spaces to a single space. Simply create a new process, add the transformation, connect the Reader output to the normalizer input, and select the fields that need whitespace normalization:


Figure 2: Normalize Whitespace processor in EDQ. Before processing text fields for standardization, it is recommended to remove leading and trailing spaces and to normalize inter-word spaces to a single space.

We will derive values for the null records from another field containing the city, using an external mapping called City to Country Mappings. This table maps each city to its corresponding country:


Figure 3: City to Country Mappings

To use it in our project, we just need to add it to the Reference Data section:


Figure 4: Adding Reference Data

We will use the mapping to generate the name of the country in a new column called “Derived Country” for each city in our dataset. To do this, we add the processor Enhance from Map, connected to the output of the Normalize Whitespace processor, using City as the field to match, and the added mapping:


Figure 5: Process for normalizing whitespace and populating the Country field in EDQ


Figure 6: Enhance from Map processor in EDQ. We will populate the “Country” field using the “City to Country” mapping, matching by the field “City”.
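Conceptually, this enhancement is a lookup from the City field into the mapping. A minimal pandas sketch of the idea (hypothetical column names and a tiny subset of the mapping; not EDQ code):

import pandas as pd

records = pd.DataFrame({"City":    ["New York", "Toronto", "Springfield"],
                        "Country": [None, "CAN", "US"]})

# City to Country Mappings reference data (illustrative subset)
city_to_country = {"New York": "United States of America",
                   "Toronto":  "Canada"}

# Enhance from Map analogue: derive the country from the city; unmatched cities stay null
records["Derived Country"] = records["City"].map(city_to_country)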

After running this process, we can see in the Enhance from Map processor results that we were able to enhance the country for 3338 records:


Figure 7: Results of the country enhancement using the City to Country Mapping in EDQ

However, 2100 records remain with an unenhanced country, so we still have work to do on this field to make it fit for use. The next step is the normalization of the name, using another mapping called Country Variants. This mapping contains the most common variants for ‘United States of America’, plus variants for Canada:


Figure 8: Country Variants mapping. The Standardized Country column contains the standard that we want to use for each real world entity, in this case countries.

To normalize these different variants, we add a Replace processor to the process, calling it Standardize Country:


Figure 9: Country Standardization process in EDQ.

In the properties of the transformation, we need to define Country as the input field (this is the field that we are standardizing), and we have to specify the replacements in the Options tab. We will use the aforementioned Country Variants mapping, which contains the replacements.


Figure 10: Replace processor in EDQ. The processor will replace the original value of the Country field with the standard from the Country Variants mapping.
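The effect of this Replace step can be sketched in pandas as a simple map-based substitution (hypothetical variants subset; values not present in the map are left unchanged):

import pandas as pd

countries = pd.Series(["USA", "US", "U.S.A", "United States", "U.S", "CAN", "Canada"])

# Country Variants reference data: variant -> standardized country (illustrative subset)
variants = {"USA": "United States of America",
            "US": "United States of America",
            "U.S.A": "United States of America",
            "U.S": "United States of America",
            "United States": "United States of America",
            "CAN": "Canada"}

# Replace processor analogue: substitute each variant with its standard value
standardized = countries.replace(variants)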

Finally, we simply add a Merge Attributes processor to merge the country derived from the city and the standardized country into a new column, ‘Best Country’, and a Frequency Profile processor to check the results:


Figure 11: Complete Country Standardization process in EDQ. The Merge Attributes processor, called “Create Best Country Attribute”, merges the values of the country derived from the city and the country obtained after standardization using the Country Variants mapping.

The transformation will merge the derived country with the standardized country into a new column called “Best Country”:


Figure 12: Merging attributes in EDQ.
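The merge itself is essentially a coalesce of the two columns. A small sketch, assuming the derived country takes precedence when both are populated (the actual precedence depends on how the Merge Attributes processor is configured):

import pandas as pd

df = pd.DataFrame({
    "Derived Country":      ["United States of America", None, "Canada"],
    "Standardized Country": [None, "United States of America", "Canada"],
})

# Merge Attributes analogue: take the derived country where available,
# otherwise fall back to the standardized country
df["Best Country"] = df["Derived Country"].combine_first(df["Standardized Country"])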

With a Frequency profiler we can analyse the results of the new field:


Figure 13: Frequency profiler to verify the results of the standardization of the Country field in EDQ.

We can see how the quality of the field has improved thanks to the transformations applied:


Figure 14: Country Frequency profiler before standardization in EDQ, showing null values and many values for the same countries.

 


Figure 15: Country Frequency profiler after standardization in EDQ.

We were able to eliminate all the variants of United States of America, and we also identified other countries that were not appearing correctly before the standardization process.

Now let’s suppose we have a dataset with people including a field for Gender, which should contain either ‘M’ or ‘F’ (Male or Female) for each record. However, a quick profiling of the field shows the following results:


Figure 16: Gender frequency profiling in EDQ.

We can see that we have a lot of null values, and some records with ‘U’ in the gender field. We are going to improve the quality of this field with EDQ by deriving the gender from the title and first name fields, and then merging this derived gender with the Gender field already available in the data to get the best result.

We’ll begin by adding an Unstructured Name Parser processor. We will use it to parse the name field, which is therefore the input of the transformation, specified in the Input sub-processor of the parser:


Figure 17: Parsing text fields with EDQ.


Figure 18: Input sub-processor of the Unstructured Name Parser processor in EDQ. The input is specified here.

The name field will be mapped to the processor's single attribute, "FullName":


Figure 19: Map sub-processor of the Unstructured Name Parser processor in EDQ. The FullName attribute is mapped to the input field Name.

After running the process, the parser results show all the steps taken and the rules created to parse the name field into three new fields for each record: P1_Prefix with the title, P1_First with the first name, and P1_Last with the last name:


Figure 20: Results of the Unstructured Name Parser, with the fuzzy rules created to parse the string containing the name into three separate fields for the title, first name, and last name.
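EDQ builds these fuzzy parsing rules for us; a much simpler (and far less robust) sketch of the underlying idea, assuming names roughly follow a 'Title First Last' layout, might look like this:

import pandas as pd

TITLES = {"Mr", "Mrs", "Ms", "Miss", "Dr"}

def parse_name(full_name):
    """Naive split of a full name into (prefix, first, last)."""
    tokens = full_name.split()
    prefix = tokens[0] if tokens and tokens[0].rstrip(".") in TITLES else ""
    rest = tokens[1:] if prefix else tokens
    first = rest[0] if rest else ""
    last = " ".join(rest[1:]) if len(rest) > 1 else ""
    return prefix, first, last

names = pd.Series(["Mr John Smith", "Jane Doe", "Dr. Mary Ann Jones"])
parsed = names.apply(lambda n: pd.Series(parse_name(n),
                                         index=["P1_Prefix", "P1_First", "P1_Last"]))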

Now we can use these fields to derive the gender of each person in the dataset. To do this, we add the built-in transformation Add Gender to the process, and join it to the pass output of the parser transformation:


Figure 21: Enhancing a Gender field in EDQ. The process contains a parser transformation and EDQ's built-in transformation for adding the gender.

To derive the gender, we need to specify the fields that contain the title and the first name in the processor configuration, as well as the stated gender, which the processor will use in combination with the derived gender to create the output “Best Gender”.


Figure 22: Enhancing a Gender field in EDQ. The Add Gender processor requires three attributes: title, first name, and stated gender.
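The logic behind this step can be pictured as a lookup on the title and the first name, falling back to the stated gender. A hypothetical sketch (tiny reference data and simplified precedence rules, not the actual EDQ logic):

import pandas as pd

TITLE_GENDER = {"Mr": "M", "Mrs": "F", "Ms": "F", "Miss": "F"}
FIRST_NAME_GENDER = {"John": "M", "Mary": "F", "Jane": "F"}   # illustrative reference data

def best_gender(title, first_name, stated_gender):
    """Derive gender from the title, then the first name, then fall back to the stated gender."""
    derived = TITLE_GENDER.get(title) or FIRST_NAME_GENDER.get(first_name)
    return derived or stated_gender or None

df = pd.DataFrame({"Title":  ["Mr", "Dr", None],
                   "First":  ["John", "Mary", "Alex"],
                   "Gender": [None, "U", "M"]})
df["Best Gender"] = [best_gender(t, f, g)
                     for t, f, g in zip(df["Title"], df["First"], df["Gender"])]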

If we run the process now, we can see how we have improved the quality of this field in the results of the Add Gender processor:


Figure 23: Results of the Gender standardization in EDQ.

Now there are only 6 records with missing gender, and we have reduced the percentage of missing gender records from 19.5% to 0.1%.

 

Conclusion

EDQ provides some solid features that make data cleansing and standardization tasks much easier: we can apply built-in transformations, use external maps, and apply business rules to standardize, correct wrong values, normalize fields, and enhance our data.

You can try these and other EDQ standardization examples by downloading the latest EDQ virtual machine and hands-on lab from the following links:

Latest EDQ virtual machine from Oracle
Getting Started Hands-On Lab


Click here if you would like to know more about the Data Quality Services we offer!

Data Quality Series – Part 2: Data Profiling with EDQ



 

 

This is the second article in our blog series about Data Quality. In the first part of the series we introduced the main concepts of data quality.

The focus of this article is to introduce Data Profiling with EDQ.

 

Introduction

Enterprise Data Quality (EDQ) is Oracle’s total solution for data governance and data quality management. It is part of the Oracle Fusion Middleware product range and the current version, 12.2.1.2, can be downloaded here.

A virtual machine for testing EDQ with a sample dataset is also available for download at the following link.

You can also find extensive documentation and more resources on the Oracle website at the following links.

Oracle EDQ can be used to profile, audit, standardize, normalize, and de-duplicate our data. The list below summarizes the key features of EDQ:

Integrated data profiling, auditing, cleansing and matching
Browser-based access to the different client applications used to operate and configure the product
Ability to handle all types of data
Connection to files (text, XML, Office) and any JDBC and ODBC compliant data sources and targets
Multi-user project support
SOA support to create processes that may be exposed to external applications as a service
A single repository to hold data, gathered statistics, and project tracking information
Fully extensible architecture allowing the insertion of any required custom processing

EDQ is fully extensible, offering extensions for Address Verification and Watchlist Screening, for example, and can be integrated as batch processing or as real-time processing with other integration tools.

In this article we will review the basic profiling capabilities of EDQ and explain how we can profile our data using EDQ in just a few steps.

 

1. Data Profiling

The first step in a data quality project is profiling the data. Data profiling is the analysis of data to clarify its structure, content, relationships and derivation rules. Profiling helps not only to understand anomalies and to assess data quality, but also to discover, register, and assess enterprise metadata. The purpose of data profiling is therefore both to validate metadata when it is available and to discover it when it is not.

EDQ comes with several built-in transformations to profile our data quickly and to start discovering patterns and quality issues. Some of the most important transformations for profiling are:

Quickstats profiler: analyses high-level completeness, duplication, and value frequency across many attributes, and highlights possible issues. For example, we could use this transformation to identify duplicate values in columns that should contain unique values, or columns with more unique values than expected.


Figure 1: Example of the results of the Quickstats profiler in EDQ. We can start discovering issues in the data using this transformation, for example, fields with null values, or with more unique values than expected. 

Data types profiler: analyses attribute values for their data types, and assesses data type consistency. In the example below, we can see that some columns have inconsistent data types, and we can clearly identify wrong values (the Street column has 1 numeric value and the rest text, the Cell column has 6 Date/time values, the Active columns have 15 numeric values, etc.).


Figure 2: Example of the results of the Data types profiler in EDQ. This transformation is very useful to find data type consistency issues.

Max/Min profiler: finds minimum and maximum values – longest, shortest, lowest, and highest.


Figure 3: Example of the results of the Max/Min profiler in EDQ. One of the basic profiles that we can do on our data is an analysis of the maximum and minimum values of each field to find wrong values.

Frequency profiler: analyses value frequency across many attributes. In the image below, we can see how this transformation can be used to rapidly identify consistency problems in the values of the data.


Figure 4: Example of the results of the Frequency profiler in EDQ. We can already see data quality issues in the values.

Patterns profiler: analyses character patterns and pattern frequency across many attributes.


Figure 5: Example of the results of the Patterns profiler in EDQ, very useful to find data quality issues in fields such as zip codes, telephone numbers, or email addresses.

Record completeness profiler: analyses records for their completeness across many attributes.


Figure 6: Example of the results of the Record completeness profiler in EDQ.

Other profiling transformations include, but are not limited to, number profiler, character profiler, date profiler, and RegEx patterns profiler.
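To make the ideas behind these profilers concrete, here is a rough pandas sketch of the kind of statistics they compute (hypothetical columns; EDQ does all of this out of the box):

import pandas as pd

df = pd.DataFrame({"Zip":   ["08013", "8013", "ABC12", None],
                   "Email": ["a@b.com", "invalid", None, "c@d.org"]})

# Quickstats / Record completeness style measures: completeness and distinct values per attribute
completeness = df.notna().mean()     # fraction of populated values per column
distinct_values = df.nunique()       # number of distinct values per column

# Frequency profile of a single attribute
zip_frequencies = df["Zip"].value_counts(dropna=False)

# Patterns profile: map digits to N and letters to A, then count pattern frequency
patterns = (df["Zip"].str.replace(r"\d", "N", regex=True)
                     .str.replace(r"[A-Za-z]", "A", regex=True))
pattern_frequencies = patterns.value_counts(dropna=False)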

 

2. Data Profiling with Oracle EDQ

EDQ developments are called projects, which contain processes that run processors (transformations). Processes can also be included in jobs, to make them accessible to external applications. All these developments are created using the Director application, accessible from the EDQ Launchpad on the EDQ server:


Figure 7: The EDQ Launchpad. All the client applications are accessible from this page.

The Director is a Java application, and is the main application for the operation of EDQ.

It is divided into five different panes:


Figure 8: The Director client application, the main application for developing in EDQ.

The projects and processors, together with the data stores, staged data, reference datasets, and other files, will be shown in the Project Browser pane (1). The main section of the application is the Project Canvas (2), where the different processes and jobs are built. The Tool Palette (3) will show the different tools available for use in different data quality processes and jobs. The Results Browser (4) will show the results of the processor selected on the canvas. Finally, the Tasks window in the bottom left corner (5) shows the progress of the execution of the processes as they are running.

To start profiling our data, the first thing we need to do is create a new project in the Director:


Figure 9: Creating a project in EDQ.

Then we need to add a data store to connect to:


Figure 10: Creating a new data store in EDQ.

Once we have a valid data store we can connect to, we have to create a snapshot of the desired tables in the data store. All the processing done in EDQ will be done on this snapshot, so the original data source will remain unchanged.


Figure 11: Creating a data snapshot in EDQ.

Finally, we need to create the process that will contain all the profiling transformations or processors:


Figure 12: Creating a process in EDQ.

After selecting our created snapshot, we can select the processors that we need on the next screen of the New Process window by checking the ‘Add Profiling’ box, or we can leave it unchecked and manually drag and drop the processors one by one onto the process canvas from the tool palette.


Figure 13: A simple data profiling process, containing the Reader processor and six profiling processors.

If we add the processors to the process manually, we will have to connect the output of the reader transformation to the input of the profiling transformations.

To run the process, we simply need to save it then click on the run icon:


Figure 14: Executing a process in EDQ.

The execution progress will be shown in the Tasks window, and once all the processors have finished the results of each individual transformation will be shown in the Results Browser. All these results can be exported to Excel using the buttons above the results in the Results Browser.

 

3. Creating Issues

A lot of common problems that could jeopardize the quality of BI or Analytics solutions can be identified with data profiling. EDQ includes features to create a collaborative environment between the different users involved in the data governance processes. For example, we can create issues and assign them to DBAs, application developers, ETL developers, BI developers, and business owners as they are found. To do this, simply right-click on any value that might indicate a data quality issue in the Results Browser, and select Create Issue.


Figure 15: Creating issues in EDQ.

The issues can be reviewed via the Issue Management application, accessible from the EDQ Launchpad or from the Director:


Figure 16: Link to the Issue Manager application accessible from the Director in EDQ. The Issue Manager application can also be accessed from the EDQ Launchpad.

 

4. Creating Reference Datasets

The data profiling step is useful for creating datasets that can be used later on as reference for data auditing. To do this, simply right-click on the valid values for a given field and select Create Reference Data.


Figure 17: Creating reference data in EDQ.


Figure 18: Reviewing reference data in EDQ.

We can add extra rows if there are missing values in our data.

After saving, this dataset will be the single source of truth for the field, and it can be used in other processes for auditing the data.

We can also create reference data from patterns instead of from individual values. EDQ also comes with many different types of reference data out-of-the-box, which can speed up the creation of data auditing processes.

 

5. Data Auditing

There are a number of processors available to audit data, called data checks. The most important ones are listed below (a rough Python sketch of a few of them follows the list):

Data Type Check: checks that a string attribute contains data of the expected data type
Duplicate Check: checks for records that are duplicated across selected attributes
Email Check: checks email addresses are in a valid syntactic format
Invalid Character Check: checks a string attribute for invalid characters
Pattern Check: checks values for an attribute against reference data of valid and invalid character formats
No Data Check: checks whether or not values in an attribute contain any meaningful data
RegEx Check: checks values for a string attribute against reference data of valid and invalid regular expressions
Business Rules Check: checks data against a set of business rules
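As an illustration of what a few of these checks amount to, here is a simplified pandas sketch with hypothetical column names (EDQ's own checks are richer and configuration-driven):

import pandas as pd

df = pd.DataFrame({"CU_NO": [100, 101, 101, 102],
                   "Email": ["a@b.com", "not-an-email", None, "c@d.org"]})

# Email Check analogue: flag email addresses that are not in a valid syntactic format (simplified regex)
email_ok = df["Email"].str.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", na=False)

# Duplicate Check analogue: flag records duplicated across selected attributes
duplicated = df.duplicated(subset=["CU_NO"], keep=False)

# No Data Check analogue: flag values that contain no meaningful data
no_data = df["Email"].isna() | df["Email"].str.strip().eq("")

# Split the output into streams of valid and invalid records
valid_records = df[email_ok & ~duplicated & ~no_data]
invalid_records = df[~(email_ok & ~duplicated & ~no_data)]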

These processors divide their output into separate streams, so that valid and invalid records can be handled separately or together as they are audited:


Figure 19: Different streams of data from a data check transformation in EDQ. The records with populated email can be treated separately from the records without email using the different outputs.

Data can be audited using reference data created by users. For example, the List Check and Pattern Check processors allow us to specify reference datasets that determine which records are valid and which are invalid:


Figure 20: Using reference data for auditing in EDQ.

This way the processors will check whether the selected field of each record in the database matches the reference data. We can drill down into the results of these processors to see the matching and non-matching records, and export the records to Excel to share them with DBAs, business users, or other users.
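A minimal sketch of such a reference-data check (hypothetical field and reference values; the List Check idea reduced to a set membership test):

import pandas as pd

# Reference dataset of valid values for the State field (illustrative subset)
valid_states = {"CA", "NY", "TX"}

df = pd.DataFrame({"State": ["CA", "XX", "NY", None]})

# List Check analogue: split the records into matching and non-matching streams
matching = df[df["State"].isin(valid_states)]
non_matching = df[~df["State"].isin(valid_states)]

# The non-matching records could then be exported for review, for example:
# non_matching.to_excel("invalid_states.xlsx", index=False)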

 

6. Data Profiling with ODI

Oracle Data Integrator comes with some data profiling capabilities which, although not as complete as those of EDQ, are enough for small projects with basic profiling needs. The table below compares the data profiling features included in ODI with the more advanced features of Oracle EDQ:


Figure 21: Comparison of data profiling features in ODI and Oracle EDQ.


Conclusion

EDQ offers some great features to profile and audit data in a quick and simple way, and in a collaborative environment to raise issues with other users involved in the data quality assurance process. Tasks that we would normally do manually on the databases can be done with EDQ much faster, speeding up the process of finding quality issues in our data. A good profiling and auditing of our data is the first requisite for a successful quality assurance process, which in turn is a requisite for the success of any Analytics / BI project.

Stay tuned and read the next part of the Oracle EDQ series, in which we explain data standardization and cleansing using EDQ.

Click here if you would like to know more about the Data Quality Services we offer!

Oracle BI Cloud Service, Part V: Managing the Service


How Oracle BI Cloud Service is managed

Parts II, III and IV of our BICS series were dedicated to the basic functionalities of the service. In this post, the last one in the series, we will explain how the BI Cloud Service is managed.

As in the on-premise version of OBI, security of the BI Cloud Service is managed with users, roles and application roles. All the cloud services you buy are assigned a unique identity domain that allows you to manage which users and roles can access which services associated with that identity domain. Application roles allow you to set permissions for users and roles within each cloud service.

You can manage users and roles from the My Services page of Oracle Cloud with the Identity Domain Administrator credentials.

 


Figure 1: Oracle Cloud My Services webpage

Application roles are managed from the BI Cloud Service Console with BI Service Administrator credentials.


 Figure 2: Managing users and roles

From the Service Console you can easily add users, roles and application roles as members to one or multiple application roles, manage pre-defined application roles, and create your own application roles.

 


Figure 3: Oracle BI Cloud Service - Service Console

A useful feature included in BI Cloud Service is Snapshot: with one click you can capture the state of the service at a point in time, including the data model, catalog and application roles (but not database elements, which should be backed up separately). You can then save the snapshot in the cloud (up to 10 snapshots) or download it to a file system, and upload a saved snapshot to restore the service to a previous state. Each snapshot import is total, in that it overwrites everything that was in the instance beforehand. Apart from backups, snapshots are useful for moving data from the pre-prod to the prod environment.


Figure 4: Taking system snapshots

Other common administration tasks of OBIEE are also available in the BI Cloud Service, such as monitoring users currently signed in, analysing or testing SQL queries, and monitoring usage and other metrics.

 

Conclusion

The Oracle BI Cloud Service is a good option for small and medium companies looking for an easy-to-use BI framework with a fast deployment cycle and minimum capital and operating expenditures. The agility and availability of the system allow companies to start reporting against their business data in a couple of hours. It still lacks some features needed for us to consider it an option for large projects, but it remains useful for any company requiring its BI content to be available in the cloud and on any mobile device.

So what are your thoughts on the Oracle BI Cloud Service? Would you consider deploying your BI fully in the cloud using the service? Contact us at info@clearpeaks.com or using our web form if you would like more information about the Oracle BI Cloud Service, or leave your comments below!

Oracle BI Cloud Service, Part IV: Creating analyses, reports and dashboards


In Parts II and III of the BICS series, we covered how to upload data to the cloud and build the repository. In this post, we will be covering the basics of analyses, reports, and dashboards in the BI Cloud Service.

The OBI Cloud Service has almost the same reporting capabilities as OBIEE. You can create analyses combining different views and dashboards, limit the results using filters, prompt for values, and create dynamic reports using named and inline actions, just as you would in the on-premise version of OBI.


Figure 1: Creating reports, visualizations and dashboards (from the Oracle BI Cloud Service & Data Discovery Webinar, Oracle Deutschland B.V. & Co. KG)

All the results of analyses can be exported to the usual formats such as Excel or PDF, and you can also make use of variables: session, repository, presentation, request, and the recently added global variables are supported. There is the possibility to report against a mash-up of model data and external sources in the PROD environment.

A new presentation feature recently added to the BI Cloud Service (and that will be available in the on-premise version 12c of OBIEE) is Visual Analyzer, a web-based tool accessible from the BICS homepage that allows you to explore analytics data visually and on an individual basis in just a few clicks.


Figure 2: Accessing the Visual Analyzer tool (from the Sixty-Analytics blog)

Visual Analyzer simplifies the process of finding the right information and creating rich and powerful visualizations with that data. You can search the entire repository for dimension and measure attributes and drag and drop them to build graphs and charts in a Tableau-type single pane, allowing you to find correlations, discover patterns, and see trends in your content.


Figure 3: Creating reports in Visual Analyzer (from Oracle)


Figure 4: Creating compound visualizations in Visual Analyzer (from Oracle)

On the downside, there are still some presentation services lacking in the cloud version of OBI. Scorecards, a useful view for KPIs in OBIEE, are still not supported in the cloud version. Agents, alerts and BI Publisher are also not presently supported, although we will probably see these services supported in future releases within the next 12 months.

Be sure to check out the last part of our BICS post series explaining how the service is managed and our overall conclusion.

Oracle BI Cloud Service Part III: Modeling Data


Data Modeler tool

In the previous post of this BICS series, we reviewed how data can be loaded into the cloud. Once you have all your data loaded and ready, it is time to build the business model, or in other words, create the repository. For that purpose, a browser-based tool called Data Modeler is accessible from the home page of the BI Cloud Service.


Figure 1: Accessing the Data Modeler tool (from the Sixty Analytics blog)

 

Data Modeler is the equivalent of the on-premise OBI Administration Tool, although much simplified and with a reduced set of its capabilities, considering that the main premise of the cloud version of OBI is precisely that it should be simple and have a fast deployment cycle. Still, the Data Modeler tool allows you to build a repository from your source tables with all the basic features of the OBIEE Admin Tool, and all from your web browser.

The application is divided into panes, with the left pane listing database objects, data model objects, variables, and users & roles, and the right pane listing in detail the fact tables, dimension tables, and joins in your data model.


Figure 2: Data Modeler (from the Sixty Analytics blog)

 

Since we are in a cloud environment, there has to be a way to ensure that multiple users can modify the model concurrently. Data Modeler uses a Lock & Edit system for this, so that users must lock the data model before making any changes to the model and database views (note that the lock only affects the Data Modeler tool and does not prevent other users from changing database views using other tools such as SQL Developer). Only one user at a time can have the lock, with the possibility to override locks and discard changes made by other users if you have administrative privileges.

After every change, the data model has to be validated (equivalent to a consistency check in the OBIEE Admin Tool), and any validation errors will appear at the bottom of the right pane. All the changes made to the model need to be published in order to save them permanently and make them available to other users. You can retain the lock after publishing the changes, discard the changes and release the lock, or discard the changes and revert to the previously published state.

Data should be refreshed frequently to ensure that it reflects the most up-to-date information. You can refresh the source objects, the data model, and synchronize with the database to automatically detect new columns, obsolete objects, and other discrepancies. It is advisable to clear the cache after loading new data as well, to ensure that the most recent data is displayed in analyses.

Data Modeler also offers users the possibility to create source views, i.e., saved data queries in the database. Source views are useful when using a single table as a source for more than one table, when creating a dimension table based on multiple source tables, and when performing pre-aggregation calculations. In general source views facilitate the handling of model objects and should be used as a base when subsequent changes in these objects might be needed.

You can partition a single source table into fact tables and dimension tables, or create fact and dimension tables from single source tables.

 


Figure 3: Adding fact and dimension tables to the model (from the article Using Oracle BI Cloud Service)

Joins are automatically created between fact and dimension tables if the join reference exists in the source tables, and of course you can manually create other joins in Data Modeler. Hierarchies and levels can be added to dimension tables joined to fact tables, and aggregation levels can be set for measures. As in the OBIEE Admin Tool, you can make use of variables (static and dynamic repository variables and session variables are supported). You can also secure the model, setting permissions on tables, columns, and even records using filters.

A recently added feature of Data Modeler is the possibility to upload a full repository to the cloud service, so if you had already modeled your business data with OBIEE you don’t need to start from scratch in the cloud. This also allows a single model to be used both on-premise and in the cloud. Something to bear in mind however is that repositories uploaded from on-premises OBIEE cannot be edited through the BICS Data Modeler, so changes should be done in OBIEE Admin Tool and then uploaded again (after the consistency check).

So once you have your data and repository in the cloud, you are ready to start reporting against it.

Check out our next post of the BICS series to learn about analyses, reports, and dashboards in the BI Cloud Service.

 
