Production-Ready Hadoop with Cloudera 5.3
Today, Cloudera 5.3 is generally available. This release marks a milestone in a journey that began with the release of Cloudera 5. With Cloudera 5, we set out to make Hadoop accessible to the enterprise. We wanted to build a platform that could store unlimited data of all kinds, that is accessible to many different users, and that has the enterprise capabilities necessary for production use (such as security, governance, administration, and deployment agility).
Since the release of Cloudera 5 and our enterprise data hub vision, we have seen much of this become a reality. We have continued to improve our existing frameworks by updating Cloudera Search, releasing Impala 2.0, integrating Apache Spark into the platform, and driving better batch processing in the ecosystem with Spark as the processing engine.
Cloudera 5.3 includes a number of improvements to Spark, with Spark 1.2 available, HDFS caching integration, and alpha builds of Hive-on-Spark (a key part of improving batch processing). With these frameworks all integrated into a single platform, users have unprecedented access to data, using whatever skills or tools they are most comfortable with.
With broader data access also come added security concerns. That is why security and governance have been a key focus of ours, both for our Cloudera Enterprise customers and for the Hadoop community as a whole. We have worked closely with Intel on Project Rhino to improve Hadoop security. Apache Sentry was donated to this initiative to provide unified authorization across all access tools. With Cloudera 5.3, we are one step closer to that goal by adding HDFS integration to Sentry, meaning permissions can be set once and Sentry enforces them across Impala, Hive, Search, and HDFS.
At-rest encryption for HDFS is also a major part of Project Rhino, and one that is delivered with Cloudera 5.3. This end-to-end encryption is highly performant and provides a critical separation of duties, so HDFS administrators do not have full access to unencrypted data or sensitive key material. For Cloudera Enterprise customers, this encryption is also directly integrated with Navigator Key Trustee for enterprise-grade key management.
As part of Cloudera Enterprise 5, we have also added security tools: auditing and lineage through Cloudera Navigator, and encryption and key management with Navigator Encrypt and Key Trustee (added from the acquisition of Gazzang). Combined with security automation through Cloudera Manager and Sentry as an integrated part of the platform, Cloudera is the only platform to pass PCI security compliance. As we continue to harden the security and governance of our platform, Cloudera Navigator’s policy engine is now generally available with Cloudera Enterprise 5.3 for extended policy management and simple integration with partner data preparation, quality, and profiling tools.
Another trend we have seen is bringing Hadoop to the cloud. Cloudera is focused on making this both easier and more portable, bringing analytics to wherever the data may be. From a product perspective, we recently launched Cloudera Director as the first self-service tool for deploying and managing Hadoop in the cloud. With 5.3, we are also furthering our partnership with Microsoft with the ability to deploy directly from the Microsoft Azure Marketplace and integrate with SQL Server, Power BI, and Azure Machine Learning.
Across many different industries and hundreds of customers, we have worked to add critical, production-ready features with the Cloudera 5 series, and to certify partner solutions to ensure stability and seamless integrations. With rolling upgrades available and a new upgrade wizard in Cloudera Manager 5.3, it is now easier than ever to upgrade and gain all the benefits of Cloudera 5.3 without downtime.
For more information about the security features added with Cloudera 5.3, please register for our upcoming webinar with Intel, “Project Rhino: Enhancing Data Protection for Hadoop.”