Java Technology for Business Intelligence


Figure 1. Typical infrastructure required for developing and deploying
Business Intelligence applications

Business intelligence makes the enterprise “smart.” Although the idea is
not new, business intelligence can be seen as the process of
transforming data into information and ultimately into knowledge that
is valuable to the corporation. Applications such as data warehousing,
data mining, enterprise information portals (EIPs), and knowledge management
systems (any of which can form part of a business intelligence solution) can
provide insight into customer retention, purchasing patterns, and even
future behavior. They can also consolidate the presentation of and access
to data stored throughout the company. These applications can tell you not
only what has happened but also why, and what may happen given certain business
conditions – allowing for exploration of “what if” scenarios.

Business intelligence touches on every aspect of IT, such as enterprise
resource planning, supply chain, and customer relationship management.
By improving their ability to collect, interpret, and act on their information
assets, companies realize more-efficient operations and decision-making,
fueling top-line growth and rewarding shareholders.

Information is widely regarded as a valuable asset and a source of
competitive advantage, especially in e-business. However, extracting potentially
valuable information from the massive volumes of data collected by operational
systems is the biggest challenge most companies face in developing business
intelligence systems. The priorities for business intelligence system
designers and applications developers are interoperability, scalability,
and adaptability, but traditional IT practices have focused on corporate
or departmental solutions with their own internal standards for data interchange,
system access, and security. To bridge this gap, participants need to use
standards-based development and computing models. Such models make it
possible to create trading exchanges and robust, semantically rich business
intelligence and data warehousing applications that capitalize on lucrative
new business models such as business-to-business e-commerce. They
act as a blueprint for new applications and systems development
and provide a common set of standards and interfaces through which applications
can interact, explore, and exchange information.

The Data Interoperability Challenge

Demands on corporate data warehouses have been steadily accelerating
for the past several years, as businesses generate and collect more and
more information over the Web. As a result of this information growth,
people at all levels inside the enterprise, as well as suppliers, customers,
and others in the value chain, are clamoring for the subsets of these vast
stores of information (such as billing, shipping, and inventory information)
that can benefit them. Collecting and storing vast amounts of data is
one thing; utilizing and deploying that data throughout the organization
is another.

The technical challenges inherent in integrating disparate data formats,
platforms, and applications are significant. However, emerging standards
such as the application programming interfaces (APIs) that make up the
Java platform, as well as XML technologies, can facilitate the interchange
of data and the development of next-generation data warehousing and business
intelligence applications. Java technology has been used extensively for
client-side access and in the presentation layer, and it is emerging as
a significant force for developing scalable, mission-critical server-side
programs. The Java 2 Platform, Enterprise Edition (J2EE) provides the
object, transaction, and security support for building robust, adaptable
enterprise-class systems.

Incompatible Metadata

One of the key problems limiting data interoperability that business
intelligence developers must solve is incompatible metadata formats. Metadata
can be defined as information about data or simply “data about data.”
In practice, metadata is what most tools, databases, applications, and
other information processes use to define, relate, and manipulate data
objects within their own environments. It defines the structure and meaning
of data objects managed by an application, so that the application knows
how to process requests or jobs involving those data objects. The problem
is that most applications define metadata differently, using different
programming structures, syntaxes, and semantics as well as storing metadata
in different data management systems with different file formats.

An example of metadata is the schema or model that database programmers
create that defines the tables, fields in a table, and table relationships
in a database. The database management system uses this metadata to determine
which tables and rows to access in response to an end user transaction
or query. Developers can use this schema to create views for users. Also,
users can browse the schema to better understand the structure and function
of the database tables before launching a query.
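
To make the idea concrete, the sketch below models a schema as plain Java
objects and answers a structural question from the metadata alone, without
reading any rows. The class name SchemaSketch and the sample tables are
invented for illustration and are not part of any Java API; in a real
application, JDBC exposes comparable information through
java.sql.DatabaseMetaData.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory schema: table name -> ordered list of column names.
final class SchemaSketch {
    private final Map<String, List<String>> tables = new LinkedHashMap<>();

    void addTable(String name, String... columns) {
        tables.put(name, Arrays.asList(columns));
    }

    // Answers a structural question from the metadata alone; no rows are read.
    List<String> tablesWithColumn(String column) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, List<String>> table : tables.entrySet()) {
            if (table.getValue().contains(column)) {
                hits.add(table.getKey());
            }
        }
        return hits;
    }

    // Invented sample schema used only for this illustration.
    static SchemaSketch sample() {
        SchemaSketch schema = new SchemaSketch();
        schema.addTable("CUSTOMER", "CUSTOMER_ID", "NAME", "REGION");
        schema.addTable("ORDERS", "ORDER_ID", "CUSTOMER_ID", "AMOUNT");
        schema.addTable("PRODUCT", "PRODUCT_ID", "LABEL");
        return schema;
    }

    public static void main(String[] args) {
        // A column shared by two tables hints at a relationship between them.
        System.out.println(sample().tablesWithColumn("CUSTOMER_ID"));
    }
}
```

A user browsing this kind of structure can see that CUSTOMER and ORDERS
share a column before writing a query that joins them.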

Data warehousing and business intelligence developers in particular are
familiar with the problems incompatible metadata formats cause. A typical
data warehousing application requires the integration of many different
types of tools for the extraction and transformation of data, often from
different operating systems and software applications. Data must then
be transported in stages to the data warehouse, where it is merged with
data collected from other sources – each with its own set of metadata.
Query, reporting, and analysis tools likewise need to maintain common
metadata to ensure that the data views maintained by the tools are synchronized
with the associated database schemas. Without a common model for creating
metadata, developers must hard-wire discrete interfaces between applications
to allow for the exchange and synchronization of data. The cost of
building and maintaining such a system can be prohibitive. As a result,
companies have been limited in developing solutions that
require the exchange of data from multiple, heterogeneous applications.
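
A rough count shows why hard-wired interfaces become expensive. The sketch
below assumes the worst case, in which every tool exchanges metadata with
every other tool and each pair needs its own interface, and compares that
with one adapter per tool to a shared model. The class and method names
are illustrative only.

```java
// Back-of-the-envelope comparison under the stated worst-case assumption.
final class InterfaceCount {

    // One hard-wired interface for every pair of tools: n * (n - 1) / 2.
    static long pointToPoint(int tools) {
        return (long) tools * (tools - 1) / 2;
    }

    // One adapter per tool when all tools share a common metadata model.
    static long viaCommonModel(int tools) {
        return tools;
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 5, 10}) {
            System.out.println(n + " tools: " + pointToPoint(n)
                    + " pairwise interfaces vs " + viaCommonModel(n) + " adapters");
        }
    }
}
```

Under this assumption, ten tools need 45 pairwise interfaces but only ten
adapters, and the gap widens as more tools are added.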

To address the metadata issue, a group of companies – including Hyperion,
IBM, Inline Software, Oracle, SAS Institute, Sun, and Unisys – have joined
forces to develop the Java Metadata Interface (JMI) API, which permits
the access and manipulation of metadata in Java with standard metadata
services. JMI is based on the Meta Object Facility (MOF) specification
from the Object Management Group (OMG). The MOF provides a model and a
set of interface definition language (IDL) interfaces for the creation,
storage, access, and interchange of metadata and metamodels (higher-level
abstractions of metadata). Metamodel and metadata interchange is done
via XML and uses the XML Metadata Interchange (XMI) specification, also
from the OMG. JMI defines a Java mapping of the MOF IDL interfaces as
well as the contracts necessary to connect to a metadata repository (see
“J2EE Connector Architecture,” later in this white paper). The goal is
to overcome the limitations caused by proprietary systems’ use of different
and incompatible semantics, structures, and syntax for metadata. The lack
of metadata interoperability prevents the sharing of data between applications
and has limited the development of robust BI systems.
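
The principle of exchanging metadata as XML can be illustrated with the
XML APIs in the standard Java library. The sketch below is not JMI and
does not produce XMI; the element names (table, column) and the class
MetadataExchangeSketch are invented for this example. One method plays
the exporting tool and the other the importing tool, and the only thing
they share is the XML text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class MetadataExchangeSketch {

    // Exporting tool: describe one table and its columns as XML text.
    static String export(String table, String... columns) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element tableEl = doc.createElement("table"); // invented vocabulary, not XMI
            tableEl.setAttribute("name", table);
            doc.appendChild(tableEl);
            for (String column : columns) {
                Element columnEl = doc.createElement("column");
                columnEl.setAttribute("name", column);
                tableEl.appendChild(columnEl);
            }
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not write metadata", e);
        }
    }

    // Importing tool: recover the column names from the XML text alone.
    static List<String> importColumns(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("column");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(((Element) nodes.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read metadata", e);
        }
    }

    public static void main(String[] args) {
        String xml = export("ORDERS", "ORDER_ID", "CUSTOMER_ID");
        System.out.println(importColumns(xml));
    }
}
```

A shared, standardized vocabulary such as XMI plays the role of the
invented element names here, so that tools from different vendors can
agree on it without custom code for each pair.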

JMI is part of a larger strategy of utilizing Java technology to create
an end-to-end data warehousing and business intelligence solutions framework.
Through the Java Community Process, industry experts are extending the
functionality of J2EE in new areas relevant to independent software vendors
and users in the data warehousing and business intelligence market. Another
specification in the works is the Java OLAP (JOLAP) API, which will provide
Java-based access to online analytical processing (OLAP) servers and
multidimensional databases.

Business Intelligence and J2EE

Metadata management is just one aspect of creating a successful business
intelligence solution. As applications become more Web-centric and integrated
with operational systems such as ERP and CRM, it is important that new
applications be built on a scalable, robust, and secure
development and deployment framework. J2EE was specifically designed to
meet the rigorous needs of enterprise computing as well as those for data
interchange and interoperability.

Although the benefits of J2EE for building enterprise applications are
many, of specific interest to data warehousing and business intelligence
developers are scalability, multitier support, platform independence,
and security:

Scalability

Data warehousing and business intelligence applications typically involve
ad hoc combinations and transformations of large amounts of data, making
scalability of the underlying system critical. J2EE is designed to keep
managing transactions reliably as the number of users increases, including
the millions of transactions that major Web surges can generate.

Multitier Support

Multitier architectures are composed of tiers of application logic separated
from the data tier and the client user interface. Because each tier can be
scaled or modified independently, multitier architectures bring high levels
of scalability and reliability to Web applications: unpredictable demand
levels and changes in application code do not require rewriting the entire
application.

Platform Independence

Java Virtual Machines are available on a wide range of computing platforms,
from handheld PDAs to servers to mainframes. As users’ expectations for
information to facilitate decision making increase, there is a greater
need to distribute information on a variety of devices and platforms.

Security

Java has been designed from the ground up with security as a central
feature. The security architecture of J2EE defines simple, flexible relationships
between protected resources, the roles that have access to those resources,
components, and users.
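
The relationship between users, roles, and protected resources can be
sketched in a few lines of plain Java. This is only an illustration of the
idea, not the J2EE security API: the class RoleCheckSketch and its methods
are hypothetical. In J2EE itself, roles are typically declared in
deployment descriptors and can be checked with calls such as
HttpServletRequest.isUserInRole or EJBContext.isCallerInRole.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical role-based check: users hold roles, resources list allowed roles.
final class RoleCheckSketch {
    private final Map<String, Set<String>> rolesByUser = new HashMap<>();
    private final Map<String, Set<String>> rolesByResource = new HashMap<>();

    void grantRole(String user, String role) {
        rolesByUser.computeIfAbsent(user, key -> new HashSet<>()).add(role);
    }

    void protect(String resource, String... allowedRoles) {
        rolesByResource.put(resource, new HashSet<>(Arrays.asList(allowedRoles)));
    }

    // Access is granted when the user holds at least one allowed role.
    boolean mayAccess(String user, String resource) {
        Set<String> allowed = rolesByResource.get(resource);
        if (allowed == null) {
            return false; // unknown resources are denied in this sketch
        }
        Set<String> held = rolesByUser.getOrDefault(user, Collections.<String>emptySet());
        for (String role : held) {
            if (allowed.contains(role)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RoleCheckSketch security = new RoleCheckSketch();
        security.grantRole("alice", "analyst");
        security.protect("sales-report", "analyst", "manager");
        System.out.println(security.mayAccess("alice", "sales-report"));
        System.out.println(security.mayAccess("bob", "sales-report"));
    }
}
```

Because permissions attach to roles rather than to individual users,
adding a new analyst means granting one role instead of editing every
protected report.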

A key technology of J2EE is Enterprise JavaBeans (EJB), an architecture
for the development of component-based distributed business applications.
Applications written with the EJB architecture are scalable, transactional,
secure, and multiuser-aware. These applications may be written once and
then deployed on any server platform that supports J2EE. The EJB architecture
makes writing components easy for developers, who do not need to understand
or deal with complex, system-level details such as thread management,
resource pooling, and transaction and security management. These issues
are all taken care of by the EJB server, allowing developers to focus
on writing business logic. Applications are then composed by combining
EJB components, sometimes supplied by different vendors, into modules
that can be deployed, managed, and executed in any compliant J2EE implementation.
This allows for role-based development, in which component developers,
platform providers, and application assemblers can each focus on their own
area of responsibility, further simplifying application development.
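
The division of labor between component and server can be imitated with a
dynamic proxy from the standard Java library. The sketch below is not the
EJB API and involves no real transactions; ContainerSketch, ReportService,
and the logged "begin", "commit", and "rollback" steps are stand-ins
invented for illustration. It shows only that the business class contains
no system-level code, while a wrapper supplied at deployment adds it
around every call.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

final class ContainerSketch {

    // Business interface: the only contract the component developer writes against.
    interface ReportService {
        int totalSales(int[] amounts);
    }

    // Business logic only; no transaction, threading, or security code.
    static final class ReportServiceImpl implements ReportService {
        public int totalSales(int[] amounts) {
            int total = 0;
            for (int amount : amounts) {
                total += amount;
            }
            return total;
        }
    }

    // Stand-in for the server: wraps each call with simulated system-level steps.
    static ReportService deploy(ReportService target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add("begin " + method.getName());
            try {
                Object result = method.invoke(target, args);
                log.add("commit " + method.getName());
                return result;
            } catch (InvocationTargetException e) {
                log.add("rollback " + method.getName());
                throw e.getCause();
            }
        };
        return (ReportService) Proxy.newProxyInstance(
                ReportService.class.getClassLoader(),
                new Class<?>[] {ReportService.class},
                handler);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        ReportService service = deploy(new ReportServiceImpl(), log);
        System.out.println(service.totalSales(new int[] {10, 20, 30}));
        System.out.println(log);
    }
}
```

The caller sees an ordinary ReportService; the wrapping behavior could be
changed without touching ReportServiceImpl, which is the property that
lets developers concentrate on business logic.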

J2EE Connector Architecture

Although accessing data stored in relational databases is a relatively
trivial matter with the JDBC API, most applications will require access
to larger amounts of data stored in back-office applications as well as
legacy computing environments. J2EE defines the Connector Architecture,
which allows access to data within the Java environment by defining a
set of contracts that need to be fulfilled between the back-end system
and the J2EE platform to support security, transactions, and resource
management. The connector acts as an interface between the J2EE platform
and the targeted data source, which allows for transparent connectivity
between these two systems. This simplifies the integration of operational
systems, data warehouses, and mainframe-based systems, because only one
connector needs to be provided for any single back-end data source. Connectors
can be built that access metadata repositories, either locally or remotely
on other platforms. Utilizing the connector architecture to access a metadata
repository and using JMI to manipulate the metamodels and metadata stored
in that repository enhance interoperability between applications, tools,
services, and disparate data sources.
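
The "one connector per back-end" idea can be sketched with a single
interface that every back-end adapter fulfills. The names below
(ConnectorSketch, Connector, Platform, and the sample back-ends) are
hypothetical and greatly simplified; the actual Connector Architecture
contracts are defined in the javax.resource packages and also cover
security, transactions, and resource management, which this sketch omits.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ConnectorSketch {

    // Hypothetical contract that each back-end adapter fulfills once.
    interface Connector {
        List<String> fetch(String request);
    }

    // Stand-in for the platform: applications name a back-end, never a vendor API.
    static final class Platform {
        private final Map<String, Connector> connectors = new LinkedHashMap<>();

        void register(String backend, Connector connector) {
            connectors.put(backend, connector);
        }

        List<String> query(String backend, String request) {
            Connector connector = connectors.get(backend);
            if (connector == null) {
                throw new IllegalArgumentException("No connector for " + backend);
            }
            return connector.fetch(request);
        }
    }

    public static void main(String[] args) {
        Platform platform = new Platform();
        // Each lambda stands in for an adapter to a different back-end system.
        platform.register("erp", request -> Arrays.asList("erp:" + request));
        platform.register("mainframe", request -> Arrays.asList("mf:" + request));
        System.out.println(platform.query("erp", "inventory"));
        System.out.println(platform.query("mainframe", "billing"));
    }
}
```

Application code calls Platform.query in the same way for every back-end,
so supporting a new data source means writing one new adapter rather than
changing each application.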

Java Technology for Business Intelligence

As we have seen, the J2EE platform provides key benefits for building
data warehousing and business intelligence applications, tools, and services
by providing a solid architectural framework that simplifies complex development
and shortens product time to market. By leveraging the J2EE platform,
organizations can take advantage of the scalability, multitier architecture
support, and security of Java, which has become the de facto industry
standard for building transactional Web-based applications. The support
of industry-leading companies in developing extensions to J2EE for the
data warehousing and business intelligence marketplace makes J2EE a compelling
platform for deployment of such applications.