Hadoop
topology supervision features of streaming frameworks (or lack thereof)
Blog post edited by Lester Martin
This blog post introduces the three streaming frameworks that are bundled in the Hortonworks Data Platform (HDP) – Apache Storm, Spark Streaming, and Kafka Streams – and focuses on the supervision features offered to…
are partially-written hdfs files accessible? (not exactly, but much more yes than I previously thought)
Blog post added by Lester Martin
Today is one of those days when I thought I knew something, stood firm with my assumption, and then found out that I wasn’t as right as I thought. Yes, a more humble person…
The New Cloudera
A new year is always an opportunity for change. This year, we’re making a big one. On January 3, we closed the merger of Cloudera and Hortonworks — the two leading companies in the big data space — creating a…
Cloudera + Hortonworks, from the Edge to AI
We’ve just announced that Cloudera and Hortonworks have agreed to merge to form a single company. I want to explain the thinking behind the deal and the combination. Rob Bearden from Hortonworks has written up a post sharing his thoughts,…
Introducing Cloudera Enterprise 6.0
Digital technologies are changing business models, reshaping how companies go-to-market, win new customers and drive new revenue-producing opportunities. Consider the following practices that, until recently, were relegated to the R&D department: Data-driven decision making – the collection and analysis of…
Simplifying the Fortification of your Data Lake with Apache Knox
This blog is a first in a series of security-related blogs that we plan to publish in the near future. It’s a myth that usability and security are mutually exclusive. In this blog, we’ll try to dispel it in the…
First Class GPUs support in Apache Hadoop 3.1, YARN & HDP 3.0
This blog is also co-authored by Zian Chen and Sunil Govindan from Hortonworks. Introduction – Apache Hadoop 3.1, YARN, & HDP 3.0 GPUs are increasingly becoming a key tool for many big data applications. Deep-learning / machine learning, data analytics,…
Building the Modern Platform with Cloudera Enterprise 6.x and Altus
Why Enterprises Need to Unify ML, Analytics, and Cloud Times are changing, and the traditional models of analytics and data management don’t serve the needs of the modern enterprise, so the way to address these topics is changing too. While…
Trying out Containerized Applications on Apache Hadoop YARN 3.1
This is the 5th blog of the Hadoop Blog series (part 1, part 2, part 3, part 4). In this blog, we will explore running Docker containers on YARN for faster time to market and faster time to insights for…
From EDW Optimization to Business Transformation
If I asked a question about the benefits in optimizing Enterprise Data Warehouse (EDW) with Apache Hadoop, from my own experience, 9 out of 10 responses had to do with either data archiving or the reservation of high-performance EDW processing…