Read Oracle Essentials Oracle Database 11g Online
Authors: Rick Greenwald
Data Pump Fast Import/Export
Added in Oracle Database 10
g
and enabled via external table support, Data Pump is a newer import/export format. Parallel direct path loading and unloading are supported.
Each of these database features is typically used for high-performance data transfers and not (by themselves) for difficult transformations. Oracle Warehouse Builder (OWB) is the Oracle database’s ETL tool used for building maps from extraction sources, through predefined or custom transformations to target tables. OWB then can be used to automatically generate the scripts needed to perform the ETL. More than just an ETL tool, OWB also can be used as a data warehouse design tool and provides a metadata repository. Designs may also be imported from a variety of design tools such as Oracle Designer, CA’s ERwin, Sybase PowerDesigner, and Business Objects Designer.
In most warehouse building, metadata is first imported that describes source tables, including Oracle (via database links) and other RDBMS systems (through ODBC or gateways) and flat files. Target tables are designed or imported, and source metadata is mapped to target metadata, including transformations. OWB’s basic set of transformations include a name and address cleansing operator for use with Oracle
Other Software for the Data Warehouse
|
239
partners’ libraries and applications that perform “householding,” matching, and merging of data. Advanced features such as support for slowly changing dimensions and pluggable mappings are available in the OWB Enterprise Option. The OWB
Data Quality Option includes support for data profiling and data rules.
OWB can validate the source-to-target mappings (see
Figure 10-4).
Once validated, you can then generate any of the following:
• DDL if target tables are to be created
• SQL*Loader control files for the loading of flat files
• PL/SQL scripts for ETL from relational sources
Figure 10-4. Typical Oracle Warehouse Builder mapping validation
Scripts are deployed to and run at the target data warehouse, typically scheduled using the Enterprise Manager job scheduler. In this way, OWB is more of an “ELT” tool since the transformations leverage the target database engine. For more complex scheduling of ETL jobs where certain prerequisites must be met, OWB leverages Oracle Workflow components.
OWB provides access to a number of other non-Oracle sources. Connectors for the E-Business Suite and PeopleSoft provide access to technical and business metadata and enable inclusion of objects from those ERP applications into mappings and process flows. The SAP Connector is similar but also includes an ABAP code generator used to build access to any SAP table on any database, including cluster tables, through an RFC connection.
For high-speed loading of flat files, Oracle SQL*Loader’s
direct pathload
option provides rapid loading by bypassing the buffer cache and rollback mechanism and
240
|
Chapter 10: Oracle Data Warehousing and Business Intelligence
writing directly to the datafile. You can run SQL*Loader sessions in parallel to further speed the table-loading process (as many warehouses need to be loaded in a limited “window” of time). Many popular extraction tools, including OWB, generate SQL*Loader scripts.
Oracle9
i
first added key ETL functionality in the core database engine, including support for external tables, table functions, merge (i.e., insert or update depending on whether a data item exists), multitable inserts, change data capture, and resumable statements. Today, OWB can be used to leverage this functionality. Additionally, OWB can create trickle feeds through the use of Streams and Advanced Queues.
For ETL into both Oracle databases and non-Oracle targets, Oracle offers a product named Oracle Data Integrator (ODI). This product was acquired in 2007 and was formerly known as Sunopsis. ODI features Knowledge Modules that define integration capabilities including extraction with change data capture, loading and unloading utilities, SQL-based loading and unloading, and transformation logic SQL.
The Knowledge Modules are modifiable. The product architecture includes a development environment that makes use of the Knowledge Modules as templates in declarative design processes and an orchestration agent.
In addition to providing heterogeneous ETL, ODI can be used to deploy and integrate data and transformation services in a Service-Oriented Architecture (SOA) infrastructure. ODI is a key component in Oracle MDM solutions and in some of Oracle’s emerging business intelligence applications.
Reporting and Ad Hoc Query Tools
Marketing, financial, and other business analysts are rarely interested in the storage and schema that hold their information. Their interest level rises when the discussion turns to the tools they’ll be using. Business intelligence tools are often evaluated and purchased within individual business areas, sometimes without close IT coordination. For implementations leveraging Oracle databases, youhave a choice between suites of Oracle business intelligence tools or popular independent vendors’ products, such as Business Objects, Cognos, and MicroStrategy.
Oracle’s business intelligence tools are bundled in three suites today: Oracle Business Intelligence Enterprise Edition, Standard Edition One, and Standard Edition. In addition, Oracle has obtained through acquisition the Hyperion Intelligence Server and Client products now bundled in Oracle Business Intelligence Enterprise Edition Plus. Oracle’s most strategic offerings are the Enterprise Edition Plus and Standard Edition One, although development and support continues for all of the other products.
Oracle Business Intelligence Enterprise Edition (OBI EE) contains the former Siebel Analytics tools and Oracle BI Publisher (previously XML Publisher). It includes optimizations for Oracle and non-Oracle databases. Included in the suite are:
Other Software for the Data Warehouse
|
241
Interactive Dashboards
Provide interactive browser-based collection of content from other OBI EE
components such as Answers. This content can include guided analytics to help less-sophisticated business users explore the right additional information available.
Answers
Thin client (DHTML) interactive tool for generating ad hoc queries and analysis.
Answers can be used directly against relational databases and MOLAP data stores.
Generated reports can be posted to the dashboard or serve as input to BI Publisher.
Reporting and Publishing (BI Publisher)
Template-based publishing solution that incorporates XML data extracts and produces reports in various output formats including PDF, RTF, HTML, Excel, XML, and eText. Report editors include popular desktop tools such as Adobe Acrobat and Microsoft Word.
Delivers
Infrastructure built by defining “iBot” alerts that trigger based on user-specified conditions. Delivers can set up publish-and-subscribe mechanisms to email, dashboard alerts, SMS text messaging, and other such notifications. It can also be linked to business process flows generated using Oracle’s Business Process Execution Language (BPEL).
Disconnected Analytics
Enables a business analyst to leverage the suite of tools disconnected from the network by accessing local data on a laptop. Resynchronization occurs when the analyst reconnects to his network.
Office Plug-in
Supports access to the BI Server from popular Microsoft tools such as Excel.
BI Server
Middle-tier for the previously described components that provides a business model and extraction layer, caching services, calculation and integration engine, and optimized data access into supported sources. Supported databases include the Oracle database and Oracle OLAP Option (analytic workspaces), Microsoft SQL Server and Analysis Services, IBM DB2, Teradata, and other ODBC
sources. Other sources can include Oracle Business Intelligence (BI) Applications, PeopleSoft EPM, E-Business Suite, Siebel, Fusion Business Intelligence Applications, and SAP.
BI Server Administrator
Used in managing the presentation layer, business model and mapping, and physical layer defined in the BI Server. Business analyst and user access and groupings are configured through this tool.
In the OBI EE Plus packaging, Hyperion components were added; these include the Hyperion Foundation Services, Interactive Reporting, SQR production reporting, Financial Reporting, Smartview for Office, and Web Analysis.
242
|
Chapter 10: Oracle Data Warehousing and Business Intelligence
Figure 10-5 illustrates a typical query formulation using Answers in the OBI EE Suite
.
Figure 10-5. Typical query using Answers to produce ranked results
Figure 10-6 illustrates the query output as viewed in Answers.
Figure 10-6. Query results viewed in Answers
Other Software for the Data Warehouse
|
243
The Oracle Business Intelligence Standard Edition One packaging includes a subset of the above and is intended for small and medium-sized implementations (e.g., deployed on a maximum of two CPUs or four cores and supporting from 5 to 50
users). Components include Oracle Dashboards, Answers, BI Publisher, BI Server, and BI Server Administrator. In addition, OBI SE One includes the Oracle Database Standard Edition One and Oracle Warehouse Builder.
Oracle’s previous-generation business intelligence tools, targeting Oracle databases only, are bundled in the Oracle Business Intelligence Standard Edition (OBI EE SE) and in the Oracle Application Server. The tools include the following:
Discoverer Plus
Easy-to-use Java applet-based frontend for picking and choosing data items used in building queries by business users. Discoverer is designed to access Oracle relational databases and the Oracle OLAP Option (analytic workspaces). Users can generate their own reports and deploy them to the Web as HTML files. Discoverer has a query governor that can predict the amount of time a query will take based on comparisons in records of previous queries kept in the database server.
Discoverer Viewer
Thin client used most often to view Discoverer reports. It provides a subset of the functionality of Discoverer Plus.
Discoverer Portlet Provider
Used for embedding Discoverer reports into enterprise portal solutions such as Oracle Portal.
Discoverer Administration Edition
Used for managing the Discoverer End User Layer (EUL), maintaining business areas and mapping relevant database tables and views, and controlling tasks available to business analysts and users.
Reports
Wizard-based frontend for building reports that can then be deployed to the Web for access as Adobe Acrobat, plain text, or HTML files. With this tool, you can cache reports on a middle-tier server for better performance. The tool also provides some limited drill-down search capabilities, in which a user can ask for more detail about a particular portion of a report.