Data Discovery and Analysis: making the best of Oracle Endeca Information Discovery

With the acquisition of Endeca in 2011, Oracle enhanced its already powerful Business Analytics product portfolio with Information Discovery capabilities, potentially allowing customers to analyse structured and unstructured data within the same framework.

Version 3.1 of the tool, released in November 2013, introduced new features such as Data Mash-up from multiple sources, easier and deeper unstructured analysis tools available directly to the business users, a tighter integration with the Oracle BI platform, Enterprise Class Data Discovery and the Web Acquisition Toolkit, a crawler to harvest and analyse website content.

Where, therefore, is Endeca positioned now within the context of Business Intelligence, and how should it be used to make the best of its capabilities? Can it be used as an alternative to OBIEE or other traditional, established Business Intelligence tools? How does its web crawling tool fare against the existing competition? To answer these questions and more, we have put Endeca on the test bench and seen it in action.

In today’s business landscape, analysis of unstructured and large volume data (Big Data) is morphing from a nice-to-have task for cutting-edge, BI-savvy companies into an important driver for business processes in enterprises everywhere. Customer data stored and displayed in social media like Twitter, Facebook and LinkedIn is virtual gold for any marketing department, while sensors capture millions of snapshots on a daily basis that can be used to monitor manufacturing processes and improve their performance. It is not difficult, therefore, to see why Oracle considers Endeca a strategic investment and a key component of its Business Analytics product stack.

In the following paragraphs of this article you will find our honest, no-frills opinion on Endeca Information Discovery 3.1, its new features and our suggestions for making the best of it within your enterprise BI stack.

Integration with Oracle Business Intelligence

One of the most frequent complaints from early Endeca adopters was the lack of integration with the Oracle Common EIM (Enterprise Information Model). As often happens with recent acquisitions, the first Oracle-branded versions of Endeca (starting with Version 2.3 in April 2012) were mostly a revamp of the existing Latitude tool. Endeca felt like, and actually was, a stand-alone data discovery tool with its own data processing engine and front-end studio.

This has radically improved with Version 3.1. Oracle BI is now a native data source for Endeca, and users can create their Discovery Applications sourcing data from OBI with an easy two-step process. Moreover, the Integration Knowledge Module for Oracle Data Integrator now enables ODI to load data directly into the Endeca Server.

There is still room for improvement, of course. Administration tasks are still performed separately from the rest of the Oracle EIM architecture. Endeca Server does not interface with WebLogic and Enterprise Manager, the core of Oracle Middleware. We would also like to see CloverETL better integrated and possibly merged with ODI, to avoid splitting the overall data workflow and transformation logic across two separate tools. We see a lot of potential in using Endeca Server as a data source for the OBIEE repository, a capability that is currently limited to BI Publisher.

We like, however, the concept of e-BS Extensions for Endeca. Based on pre-defined views in Oracle e-Business Suite, the Extensions consist of a set of Studio applications with pre-built content for a broad range of horizontal functions, from Supply Chain Management (Discrete and Process Manufacturing, Cost Management, Warehouse Management, Inventory, etc.) to Human Capital Management, Asset Management and more. Their good level of integration with e-BS makes them a light-weight, easy-to-implement alternative to the Oracle BI Analytic Applications module. As with their bigger brother, however, how much customization the pre-built dashboard content will need before it can be used successfully remains a question mark.

Self Service Analysis (Data Mash-up, Provisioning, Applications)

These are the topics that most excited our team when testing the new Endeca capabilities. The range of sources available for data mash-up has been broadened, covering both databases and semi-structured data in JSON format, and the look and feel of Applications has been improved with new visualization options. In our opinion, however, the most compelling feature of Endeca is the new Provisioning and Application creation process.
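To make the idea of a mash-up concrete, here is a minimal sketch in plain Python. It does not use any Endeca API: the table, the JSON feed and the `mash_up` helper are all hypothetical names invented for illustration, and the sketch only shows the general principle of enriching semi-structured JSON records with attributes from a relational source through a shared key.

```python
import json
import sqlite3

# Hypothetical structured source: a small relational table of customers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Acme", "EMEA"), (2, "Globex", "APAC")],
)

# Hypothetical semi-structured source: customer feedback in JSON format.
feedback_json = """[
  {"customer_id": 1, "channel": "twitter", "text": "Great support"},
  {"customer_id": 1, "channel": "email", "text": "Late delivery"},
  {"customer_id": 3, "channel": "twitter", "text": "Unknown customer"}
]"""


def mash_up(connection, raw_json):
    """Enrich each JSON record with customer attributes, matched on customer_id."""
    customers = {
        row[0]: {"name": row[1], "region": row[2]}
        for row in connection.execute(
            "SELECT customer_id, name, region FROM customers"
        )
    }
    combined = []
    for record in json.loads(raw_json):
        # Records with no matching customer are kept, with empty attributes.
        attrs = customers.get(record["customer_id"], {"name": None, "region": None})
        combined.append({**record, **attrs})
    return combined


records = mash_up(conn, feedback_json)
for rec in records:
    print(rec)
```

Keeping the unmatched records, rather than dropping them, is a deliberate choice in this sketch: in a discovery context the gaps between sources are often as interesting as the matches.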

The workflow to create a new Discovery Application is now based on a wizard so user-friendly that we believe the classic buzz-phrase “Business users creating own applications! No more IT overhead!” is no longer a chimaera but a serious possibility. Yes, establishing the Provisioning Service, connecting to the data source (via JDBC, for example) and configuring it might require some hand-holding, but once that is done, the wizard simplifies and streamlines the actual Application creation tasks, allowing business users to perform their data discovery autonomously.

Also, it is a fact that Endeca Applications look good. They look better than OBIEE dashboards, in fact, and we can see why business users are usually more impressed by the former than by the latter during product demos.

Web Acquisition Toolkit

“A tool within a tool within a tool” is how our testing team has defined the new web crawling tool embedded in Endeca.

The toolkit looks and feels separate from the Endeca Server (it actually is) and features its own Design Studio where crawling rules and workflows can be defined, organized and scheduled, adding a third layer of data processing complexity: from Design Studio to CloverETL to ODI. In fact, the Web Acquisition Toolkit does not use Endeca Server as a target, so a third-party ETL tool is necessary to move the harvested data into it.

However, even if there are currently cheaper and more powerful options on the market, the tool does its job and, if Oracle continues investing in product integration (which we think is very likely), has the potential to become a very interesting feature of future Endeca versions.

Best fit for Endeca?

Wrapping up, we can safely say that Endeca is evolving into a compelling component of the Business Intelligence stack for enterprises looking to enable their users to perform rapid-fire data discovery (to a certain extent, of course: data management, especially in complex enterprise environments, will still be required).

The stand-alone nature of the Endeca architecture is a weakness but also a strength, allowing Endeca to be purchased and installed independently from the rest of the Oracle BI stack. However, we can see how e-BS Extensions make Endeca extremely appealing to existing Oracle ERP users.

Could Endeca, therefore, be considered as an alternative to OBIEE (and Oracle BI Applications) as the enterprise Business Intelligence tool? We do not think so. Although the visualization capabilities of its Applications are very powerful, the best fit for Endeca is to complement OBIEE. While the solid back-end of the latter (repository metadata layers, reports and dashboards catalog) provides corporate reporting in a structured and organised way, Endeca’s real power lies in enabling business users to analyse data patterns individually and on the fly: mixing and matching different data sources and quickly creating new applications to find the answers they need.

Self-service provisioning is what enables all of the above, and it is where the strength of Endeca shows. Web sources and unstructured information such as flat files can be mashed together, and setting up and configuring another provisioning service to add a further source to the mix is a very easy task.

We at ClearPeaks will keep on the lookout for future enhancements and features of Oracle Endeca Information Discovery. If in the meantime you want to know more about Endeca and how it could add value to your enterprise, contact us.

 

Informatica Best Practice: User Defined Join syntax in the Source Qualifier transformation

In any Business Intelligence environment, a change in the technology of the data sources is a big challenge.

This is particularly true of mature, long-running BI platforms, where the overall ETL processing is likely to exceed three or four hundred individual jobs.

A change of the data source database technology (for example, from SQL Server to Oracle) and the related data migration often mean a painstaking exercise of manually updating every single ETL step, followed by unit and regression testing, QA and the move to Production.

To minimise the effort required, we recommend avoiding database-specific SQL wherever possible and making use of any automation your ETL tool offers to keep the code portable across platforms.
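As a small illustration of what portable means here, the sketch below runs a standard ANSI outer join from Python against an in-memory SQLite database. The table and column names are hypothetical, and SQLite is used only so that the example is self-contained; the point is that the ANSI form does not need rewriting when the underlying database changes, whereas the vendor-specific forms shown in the comments do.

```python
import sqlite3

# Hypothetical source tables, created in memory so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    INSERT INTO orders VALUES (10, 1), (11, 2);
    INSERT INTO customers VALUES (1, 'Acme');
    """
)

# Database-specific outer joins, shown for contrast only (not executed here):
#   Oracle legacy syntax:      WHERE o.customer_id = c.customer_id (+)
#   SQL Server legacy syntax:  WHERE o.customer_id *= c.customer_id
# The ANSI form below avoids both.
PORTABLE_JOIN = """
    SELECT o.order_id, c.name
    FROM orders o
    LEFT OUTER JOIN customers c
        ON o.customer_id = c.customer_id
    ORDER BY o.order_id
"""

rows = conn.execute(PORTABLE_JOIN).fetchall()
for order_id, name in rows:
    print(order_id, name)
```

Order 11 has no matching customer, so it comes back with an empty name instead of being dropped, which is exactly the behaviour the vendor-specific operators were used to obtain.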

Informatica OBI Applications ETL: the Slowly Changing Dimensions logic

One of the greatest advantages of buying an OBI Application – Project, Supply Chain or any other of the many Analytics flavours – is the set of predefined ETL mappings, sessions and workflows that come with it.

Although there is a good chance that the OLTP data source is highly customised, the online Oracle documentation is full of information that can make the ETL developer’s life easier. That said, there are some important ETL tasks whose logic is not easy to find. They are like black boxes: you customise them a little bit (some fields behind the X_CUSTOM placeholder here, a small datatype change in the target table there) and they work their magic.

So let’s lift the veil on a very important piece of out-of-the-box ETL logic in OBI Apps: the Slowly Changing Dimension management mappings.

Copyright © 2000-2010 ClearPeaks
