Latest Posts

Big Data & Brews: Part II on Data Security with Informatica

Informatica’s Anil Chakravarthy and I continue our conversation around data security, this time discussing how risk management is a perfect example of a data-driven exercise. He explains that in the past it was driven either by human expertise or by process, and that increasingly it’s becoming data-driven.

We also talk about the role Informatica plays and how cloud and data aggregation are its sweet spot.

Don’t miss it! Tune in below for part two of our Big Data & Brews with Informatica.


Stefan: But let’s talk a little bit about that “using data to secure” topic. Where do you see the opportunity in the market?

Anil: You mentioned Splunk earlier. You see a lot of companies now which have really changed the ways essentially security happens. Or, I can even broaden the topic further to your earlier conversation about risk management. When you think of managing your risk, that is essentially a data-driven exercise right now. In the past, it was either human-expertise-driven or process-driven. I think increasingly we’ve seen that it is becoming data-driven. A great example is, think of just what is happening even at the network security level. In the past, it used to be that you had specific devices like routers and firewalls, etc. from which you collected logs and you prerecorded what you were looking for and you basically said, “This is what a security attack looks like.” And then, you look for patterns that match that prerecorded knowledge that you had.

Now that world is changing very quickly even at the network level. You basically now collect logs not only from all the network devices, but also from applications, the Active Directory interface, and user access. You pretty much collect all of that information and then you use big data techniques to find the pattern rather than say, “Hey, I already know the pattern of attack and I’m just going to go look for that pattern.” I say, “I don’t know the pattern of attack.” The assumption right now is, I have all this work and attackers only need one way to get in. Therefore, I don’t know what way they’re using to get in. So, let me get the data and see what the data tells me in terms of what may be abnormal and then use that to find if it’s really a security vulnerability, right? That, to me, is how data is being used to change the world and that’s …read more
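To make the contrast Anil describes a bit more concrete, here is a minimal Python sketch. It is not anything Informatica or Splunk ships; the event fields, the `KNOWN_ATTACK_ACTIONS` set, the per-user baseline counts, and the threshold of three standard deviations are all assumptions chosen for illustration. The first function looks only for prerecorded patterns, while the second lets the data show which activity is abnormal.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical "prerecorded" knowledge of what an attack looks like.
KNOWN_ATTACK_ACTIONS = {"port_scan", "sql_injection"}


def signature_alerts(events):
    """Older approach: flag only events that match a known attack pattern."""
    return [e for e in events if e["action"] in KNOWN_ATTACK_ACTIONS]


def anomaly_alerts(baseline_counts, events, threshold=3.0):
    """Data-driven approach: flag users whose event volume is far above
    the baseline, without knowing the attack pattern in advance."""
    values = list(baseline_counts.values())
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # avoid dividing by zero on a flat baseline
    current = Counter(e["user"] for e in events)
    return sorted(u for u, n in current.items() if (n - mu) / sigma > threshold)


# Hypothetical data: typical events per user per day, then today's events.
baseline = {"alice": 10, "bob": 12, "carol": 11, "dave": 9}
events = (
    [{"user": "alice", "action": "file_read"}] * 11
    + [{"user": "bob", "action": "file_read"}] * 13
    + [{"user": "carol", "action": "file_read"}] * 60
)

print(signature_alerts(events))          # [] (nothing matches a known pattern)
print(anomaly_alerts(baseline, events))  # ['carol']
```

In this toy data, none of the events match a known signature, yet one account's volume stands out against the baseline. Real systems would use far richer features and models than a single z-score.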

Big Data & Brews: Informatica Talks Security

I’m extremely excited to return from our hiatus with a new interview with Informatica’s Acting CEO, Anil Chakravarthy. He has over 15 years of experience in security and given the importance of big data governance, I thought he was the perfect candidate to share what he sees coming down the pipeline.

Tune in below to see the first installment.



Anil Chakravarthy, Acting CEO, Informatica

Stefan: Welcome to Big Data and Brews. It’s been a long time. I’m very excited to start off a new season of Big Data and Brews with Anil Chakravarthy from Informatica. Thanks for joining.

Anil: My pleasure.

Stefan: Usually we ask to please introduce yourself and the brew you brought, but it’s so early in the morning, we decided we’d go for coffee and refreshing water. Tell me a little bit about your background. You have a very interesting background, very security-focused. How did that shape how you got to Informatica and what you’re doing there?

Anil: Yes, as you said, I’ve had a deep background in security for the last 15 years. I was at Symantec, where I ran the enterprise security business. I was at Symantec for nearly 10 years. Before that at VeriSign, where I was responsible for product management of the VeriSign security services. Coming to Informatica, to me, was really a great way to bring that security expertise to the data layer.

As you know, a lot of the security world is still very much at the network layer. It’s creeping up into the application layer, but if you really look at where security can be most effective, it’s really at the data layer. There you know what you are trying to protect, what is sensitive, what is valuable. We at Informatica are taking a new approach, based on my background, but based also on what we see from the industry. We are taking a new data-centric approach to security.

Stefan: I think there are two topics I want to talk to you about today. One is really securing data and one is using data to secure, if that makes sense?

Anil: Yeah, yeah, it does.

Stefan: Why don’t we start with the first one? What’s your perspective about what’s going on in … Maybe we expand it from security to overall data governance. What is really the requirement of the market? Where are the products today? Where do they have to come, where are the shortcomings?

Anil: Yeah, let’s start with …read more

Reflections from the New Guy


It’s been 20 years since I was “the new guy.”

Hello friends and colleagues. I wanted to share some thoughts after my first 90 days at Hortonworks. It’s been a thrill ride, to say the least. There is all of the normal new guy / first impression stuff, and for those of you who know me, you know I am very sensitive to all that!

Working with our founders and engineering team has been a blast. Seeing the passion in their eyes and feeling the energy and enthusiasm in their voices has been inspirational. Their unbridled dedication to our new compute and open source paradigm is evident and infectious.

It is clear that we are at the center of multiple inflection points.

First, the open source paradigm will continue to reshape how software is developed. Leveraging a community of brilliant people means constant innovation. It also means that these talented people actually compete to find the best solutions and approaches to data management problems, and the real winners are users of Apache Hadoop and HDP.

Second, it is really about the platform. Providing the first real solution for quickly landing very large and very diverse data, Hadoop along with the broader ecosystem provides the ability to capture data that used to go to waste. This collective pool of information is the raw material for refined and advanced analytics that will drive improved business models.

Meeting our customers and prospects has also been quite revealing. These are companies who are redefining their industries by being data centric and data driven. Along the way, I’ve heard some common themes.

“Get Going.”

At the Hadoop Summit in June, we had a customer panel composed of some real thought leaders. They all mentioned in their comments that the best thing they did was getting started. The sooner you start collecting broad and diverse data and making it available to data scientists and business analysts, the sooner the value shows up. And, while it may seem ‘salesy,’ it’s actually the point. Today’s modern data architecture turns normal IT projects upside down. Schema on read is the opposite of traditional models, and very relevant today. Big data and sources of big data evolve and change so rapidly that it’s only possible to glean value by landing the data first and then analyzing it.
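For readers who haven't met the term, schema on read can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Hadoop or HDP implement it: the record fields, the `read_with_schema` helper, and the two schemas are hypothetical. The raw records are landed exactly as they arrive, and each question applies its own schema when it reads them.

```python
import json

# Raw records are landed as-is; nothing is validated or reshaped on write.
# The second record already has a different shape than the first.
raw_lines = [
    '{"user": "u1", "amount": "19.99", "region": "west"}',
    '{"user": "u2", "amount": "5", "channel": "mobile"}',
]


def read_with_schema(lines, schema):
    """Apply a schema at read time: select fields, cast their types,
    and fill fields that a record lacks with None."""
    for line in lines:
        record = json.loads(line)
        yield {
            name: (cast(record[name]) if name in record else None)
            for name, cast in schema.items()
        }


# Two different questions, two different schemas, same landed data.
spend_schema = {"user": str, "amount": float}
channel_schema = {"user": str, "channel": str}

print(list(read_with_schema(raw_lines, spend_schema)))
# [{'user': 'u1', 'amount': 19.99}, {'user': 'u2', 'amount': 5.0}]
print(list(read_with_schema(raw_lines, channel_schema)))
# [{'user': 'u1', 'channel': None}, {'user': 'u2', 'channel': 'mobile'}]
```

The point of the sketch is the ordering: the data is stored before anyone decides how it will be interpreted, so a new field showing up in the source does not block collection.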

“We’ve only just begun.”

The new and innovative use cases are being invented now, taking advantage of the new ‘land it first’ mentality. From …read more

The State of Hadoop: What’s Next?

**This post originally published on InsideBigData**
Everybody wants a piece of the big data pie, particularly Hadoop. Startups are popping up left and right in an attempt to be a part of the Hadoop action, and industry watchers are fueling the buzz, for good reason.

Hadoop has emerged as the leading software framework for the storage and analysis of big data. Early adopters such as Facebook, Twitter and Yahoo! have successfully built custom analytics using Hadoop to tackle big data analytic challenges. Given this initial success, Hadoop has become the poster child for delivering scalable analytic power that meets today’s big data requirements, and companies are jumping at the opportunity to benefit from that potential.

Yet with the growing buzz surrounding Hadoop comes skepticism. While it would be ludicrous to doubt the value of data and its ability to create high-resolution observations and interpretations about how businesses are performing, it’s time to ponder how to bring big data technologies, such as Hadoop, into the next phase of efficiency and utility. In order to do that, we must understand what’s driving the skepticism that’s out there, and how to address it.

Who’s Jumping on the Hadoop Bandwagon?

Looking at the big data landscape, one of the obvious observations is the increasing number of startups focused on Hadoop. When you have a unique shift in the market like the one brought on by big data, it’s inevitable that startups will want to jump on the bandwagon. If there’s a great opportunity, Silicon Valley and the world of emerging technology will always try to capitalize on it.

With all of this hype, people question whether there is a growing Hadoop bubble and whether expectations have run ahead of reality. We’ve seen Hadoop-related companies leave the gates with initially promising growth numbers and then stagnate early on. People are starting to question if Hadoop is worth all the fuss.

Those looking at the Hadoop landscape need to recognize whether a company is creating value or whether unlimited funds are being used to buy growth. There are Hadoop-related companies that create tremendous value and have solid bookings and revenue numbers; that is where the potential lies. On the flip side, there are also companies where growth is mostly bought; that is where the potential dies.

Bring Something New to the Hadoop Game

The companies that are giving rise to the doubts around the promise of Hadoop are …read more

Mike's Maxims for Hadoop

Once you have a strategy for managing your data architecture, adding nodes will help the performance of Hadoop in that architecture; if you have no strategy, adding nodes will let useless processes run really fast. A Hadoop cluster needs…
Read more

Review – DataMeer is an excellent "bridge" to Hadoop

DataMeer Review on TrustRadius: DataMeer is an excellent “bridge” to Hadoop. Organizations who want to get started with a Hadoop cluster and the analytic capabilities provided by Hadoop, but do not have a large software development team to assist in creating…
Read more