Augmenting the Data Warehouse with an Enterprise Data Hub

In the world of Data Warehousing, speed, cost and value are often of paramount importance.  Consider the following:

  • TDWI estimates it takes upwards of 8 weeks to add a column to a table
  • The average cost of an integration project runs between $250K and $1M, according to Gartner
  • Only 3 of 10 customers surveyed by Ventana Research trust the data in their data warehouse

Despite these constraints, the enterprise data warehouse (EDW) remains one of the most important tools at the center of any organization.  Unfortunately, as the dynamics of data rapidly change, the EDW is under constant pressure, causing budgets to swell and confidence to wane.

Oftentimes this is avoidable.  With the influx of new types and ever-increasing volumes of data, conventional wisdom suggests that the EDW must be retrofitted in response.  When the value of that data is not yet known, however, the effort and cost could likely have been avoided.  Instead, organizations try to load this data anyway: integration processes start to run long (or over!), scarce EDW resources are used up trying to shrink that time, reports and queries start to run longer than expected, and data is archived to make room for rapidly growing data sets.  In the end, the EDW becomes overburdened, costs more, and causes day-to-day business intelligence (BI) to suffer.

As we announced last week, an enterprise data hub (EDH) based on Apache Hadoop is emerging as the leading solution for big data.  In addition to its storage and processing benefits, an EDH is an ideal companion to an EDW, one that helps alleviate these challenges.  When deployed alongside common EDW infrastructure, an EDH helps streamline data integration by offloading data transformations and improving performance, thus reducing costs.  It also enables true self-service, exploratory BI – leveraging the tools you already have – reducing time to insight.  Ultimately, the combination of an EDH and an EDW allows you to store all your data in full fidelity and bring compute to the data, without the need to move data around the enterprise.

To help you get started, check out the Cloudera and Informatica reference architecture for data warehouse optimization.  We’ll be talking about this topic and more at our upcoming Cloudera Sessions tour. Check out our events page to learn more.



The post Augmenting the Data Warehouse with an Enterprise Data Hub appeared first on Cloudera VISION.
