The Rise of the Apache Hadoop Data Warehouse: Gartner Magic Quadrant 2016
The enterprise data warehouse has long been labeled the lifeblood of organizational analytics: the playground outside of transactional systems where analysts could correlate information that would fuel the entire business. For many years, this meant mainly periodic reporting and dashboards that were rigid and took months to develop. This model proved adequate until the rise of a web-driven culture put ever greater demands on data collection and distributed analysis. Many organizations have struggled to scale to meet the demand for more data in business decision-making. At roughly the same rate, we have seen a growing need for enterprise capabilities that ensure data is kept safe and can be governed across the business. While traditional solutions have led in providing these enterprise capabilities, scale and the need to leverage new types of data are still complex issues for legacy enterprise data warehouse solutions.
There are several institutions that track the key trends in data warehouse modernization, but perhaps few are as tenured and instantly familiar as Gartner. Since 1979, Gartner has observed and reported on some of the most exciting shifts in technology, from personal computers to open source to cloud. It has become a trusted advisor to many organizations wishing to learn how to strategically weather these fundamental shifts.
Gartner has been following Cloudera for many years and has previously evaluated us in categories like Operational Database. In 2014, Cloudera debuted on a new vector of comparison, the Magic Quadrant for Data Warehouse and Data Management Solutions, as the first Apache Hadoop vendor to be included. At the outset, it may not have been clear that a technology focused simply on processing vast amounts of data would begin to challenge the incumbent-dominated world of the data warehouse. But the big data ecosystem approached many of the complex issues of large-scale data collection and management with a tabula rasa mentality, and began to figure out how to incorporate new data types that simply could not be used for analysis in traditional data warehouse environments. While this is often a complementary exercise, run alongside an existing EDW, some organizations began to adopt an enterprise data hub approach: a single destination to process, analyze, and serve all data. This shift in architecture has downstream advantages, as it has shaped the world of advanced analytics and the agile, iterative model building that Hadoop implicitly enables.
In 2014, Cloudera first appeared on the Data Warehouse and Database Systems Magic Quadrant as a lone vendor representing the rise of big data solutions. The leaders (Oracle, Teradata, IBM) should come as little surprise: they are household names that come with a portfolio of solutions addressing many areas of the business, and they are also close partners of Cloudera. Cloudera was showcasing its promise as an enterprise analytics solution, with many customers already successfully using it to extend their data warehouse.
In 2015, Cloudera’s rapid innovation in enterprise tooling allowed for a move into the Challengers quadrant. In just a year, Cloudera went from a solution addressing niche use cases to one that was actively challenging deeply embedded solutions and leaders in the space. This was just a preview of what was to come. We also saw other Hadoop vendors enter the picture, while some remained unevaluated.
Perhaps the most exciting move yet was showcased in the recent 2016 Magic Quadrant for Data Warehouse and Data Management Solutions. It has been an exciting year for open source in general, with an explosion of solutions driven by open-source innovation. We now see Hadoop vendors as key players in the visionary roadmap, and it’s clear that the demands of big data are steering the future of EDW solutions. This is perhaps best captured in Gartner’s opening statement:
“Disruption is accelerating in this market, with more demand for broad solutions that address multiple types of data and offer distributed processing and repository options.”
Cloudera retained its leadership over alternative Hadoop distributions and has had a year of solid developments that aim to address the needs of a next-generation analytic database. These included product innovations and open-source projects like RecordService, Apache Kudu (incubating), and Cloudera Navigator Optimizer, all net-new solutions that address large-scale analytic database functionality.
Hadoop as an Analytic Database
One of the most obvious advantages of new technologies like Apache Hadoop is the ability to handle complex data, such as unstructured data. This has allowed companies to begin to analyze images, web logs, and a variety of application data formats that were tricky or impossible to handle with established solutions. In addition, the flexibility to bring compute resources directly to the data meant that incorporating ever more data became a reality without running into increased processing pressures. While Hadoop’s original promise was to store unlimited data, its analytics capabilities have grown considerably since its inception.
Cloudera as an Analytic Database
The unique components of the Cloudera Enterprise platform aim to solve many of the challenges specific to running Hadoop at enterprise scale and with the security assurances that users require. Many of our customers use Cloudera as an agile, high-performing analytic database. Cloudera Enterprise gives you the fastest analytic SQL with Impala, while Kudu allows for fast analytics on real-time data as it changes. Third-party extensibility to popular BI and visualization tools opens up analytic access to a wider group of participants. Opening up access often means you need more visibility into who is using the data. Cloudera Navigator provides comprehensive data governance, lineage, and audit capabilities, which are a key component of housing critical data. To protect the data, tools like Apache Sentry and Navigator Encrypt ensure that access policies are met and that sensitive data is encrypted at all stages. Recently, Cloudera announced new tools that make enterprise functionality even more comprehensive: RecordService for unified policy and access control, and Navigator Optimizer, which provides intelligent guidance for database modernization. Now more than ever, users have the tools they need to succeed in implementing an enterprise Hadoop-based analytics strategy. Cloudera aims to take the vast and complex ecosystem of Hadoop and make it extremely fast, easy to manage, and secure from end to end.
This rise of the Hadoop data warehouse is not to be mistaken for the downfall of legacy data warehouse solutions. In fact, it is probably best summarized in a recent series of videos by Ralph Kimball, a data warehousing and business intelligence thought leader and evangelist for dimensional modeling.
Dr. Ralph Kimball reiterated why Hadoop and the applications it empowers are having such an impact on traditional data warehousing environments: “By modernizing the ‘back room’ of data collection and preparation, it can rapidly open the door to more data, more users, and more diverse analytic perspectives than ever before possible.” – Global News Wire
Many of our customers choose to extend and optimize their current data warehouse with Cloudera Enterprise. This allows them to capture, analyze, and serve all data while meeting the distinct requirements of each schema and data format. It also enables massive scaling to meet the demands of including more data in decision-making. Cox Automotive, for example, needed to manage 20 brands incorporating 200 million rows of data daily. They used a Cloudera EDH to extend their data warehouse and reduced the total cost of ownership of their data environment by about 50 percent per terabyte. Our customers, today and in the future, can choose to leverage our vast ecosystem of integrated partners that provide critical application integration and optimized architecture. This ensures that companies can implement Cloudera alongside their current analytics efforts with minimal disruption, or start from day one with an analytic database that is enterprise-hardened and recognized as an industry visionary.
Read the full Gartner report here.
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
The post The Rise of the Apache Hadoop Data Warehouse: Gartner Magic Quadrant 2016 appeared first on Cloudera VISION.