Kafka
topology supervision features of streaming frameworks (or lack thereof)
Blog post edited by Lester Martin. This blog post introduces the three streaming frameworks that are bundled in the Hortonworks Data Platform (HDP) – Apache Storm, Spark Streaming, and Kafka Streams – and focuses on the supervision features offered to…
Building Secure and Governed Microservices with Kafka Streams
With Hortonworks DataFlow (HDF) 3.3 now supporting Kafka Streams, we are truly excited about the possibilities of the applications that you can benefit from when combined with the rest of our platform. In this post, we will demonstrate how Kafka…
A new era of SQL-development, fueled by a modern data warehouse
SQL development is not a new concept. However, as the data warehousing world shifts into a fast-paced, digital, and agile era, the demands to quickly generate reports and help guide data-driven decisions are constantly increasing. This puts new pressures on…
Building the Modern Platform with Cloudera Enterprise 6.x and Altus
Why Enterprises Need to Unify ML, Analytics, and Cloud. Times are changing, and the traditional models of analytics and data management don’t serve the needs of the modern enterprise, so the way to address these topics is changing too. While…
New Capabilities for Apache Spark Users
In September 2015, Cloudera launched the One Platform Initiative to make Apache Spark the default engine for Cloudera’s modern data platform. At the time, we had about 150 customers using Spark, many of them for simple ETL and data…
What’s New for Apache Spark & Apache Zeppelin in HDP 2.6?
The value of any data is proportional to the insights derived from it. With the Data Lake Architecture, all of the enterprise data is made available in one place. The key to driving insights from the Data Lake is Apache…
Cross-component Lineage for Apache Hadoop
Apache Hadoop® exists within a broader ecosystem of enterprise analytical packages. This includes ETL tools, ERP and CRM systems, enterprise data warehouses, data marts and others. Modern workloads flow from these various traditional analytical sources into Hadoop and then often back…
NoClassDefFoundError for Log4jLoggerFactory on hdp 2.5.3 when running the KafkaSpout in your topology? (how’s that for a title?)
Blog post added by Lester Martin. Ephemeral Issue NOTE: This is a corner-case blog post and really only useful for those who find this entry from a very specific Google search!! Additionally, I’m filing a support case and expect the…
opening up a port on centos 7 firewall (using firewalld)
Blog post edited by Lester Martin. There I was on an AWS-hosted node trying to access ports 2181 and 9092 on another AWS node where I had just followed the instructions at http://kafka.apache.org/documentation/#quickstart to get a stand-alone instance of Kafka…