Hadoop Adoption – Where is your organization?
Make Use of Your Data
Organizations have always strived to make optimal use of their available data. As an organization grows past a certain threshold, however, several factors limit how well it can do so.
One of those factors is, of course, the ability, focus, and attention that an organization devotes to organizing its information to support ongoing growth and analysis.
Another factor is that, for years, the technology available to most organizations was not easy to use. Perhaps it was too expensive, required too much reworking of current processes, or was too close to the bleeding edge and not stable enough to justify committing resources to a process change or an investment in the technology.
Hadoop is Available Now
What has changed over the past couple of years is that Hadoop is available to any organization willing to spin up a test environment to assess the technology. Whether an organization deploys to three old servers in its own environment or spins up several virtual servers on, for example, AWS or Rackspace is less important than the fact that a small hardware budget lets it try the technology for a few months and see whether it helps, without a huge commitment.
While Hadoop and data integration products are developing rapidly, a core is available now that can be used to assess the usefulness of Hadoop. In addition, that core can be upgraded with reasonable effort and with respect for existing installations, which supports a decision to move ahead with Hadoop in your organization!
What Level is My Organization?
For the next section, I will borrow the concept of the Capability Maturity Model (CMM) from the Software Engineering Institute at Carnegie Mellon University and apply it to the maturity of an organization with regard to data management and adoption of Hadoop.
Hadoop Adoption Level (HAL)
To find your starting point, ask which of these descriptions best fits the status of your organization. Your answer determines your organization's Hadoop Adoption Level (HAL)!
- Level 1 – We stopped trying to collect certain information years ago, because there was no way to retain that information in a useful way. The difference between Level 1 and Level 0 is that a Level 1 organization recognizes that better data management would be helpful.
- Level 2 – We have been collecting data from various sources such as our ordering process, website access logs, accounting or ERP system, and Google Analytics. Unfortunately, we have no way to combine this mix of structured and unstructured data into something that could produce meaningful information.
- Level 3 – We have a test cluster set up, but are struggling to get data from various sources ingested and cataloged within Hadoop in a consistent way that supports our analyst team.
- Level 4 – We have several Hadoop clusters set up, and each team is pushing data into Hadoop their own way, from their own silo or department.
- Level 5 – We have more than one Hadoop cluster set up, and they are split for a specific purpose or requirement related to critical business policy or compliance factors.
For each one of these levels, there are steps your organization can take to make better use of your data and the latest technology!
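For readers who prefer to see the levels as a checklist, here is a minimal Python sketch of the self-assessment. It is only one rough reading of the descriptions above: the `OrgProfile` fields and the `hadoop_adoption_level` function are hypothetical names invented for this illustration, and a real assessment would weigh far more than three yes/no style answers.

```python
from dataclasses import dataclass


@dataclass
class OrgProfile:
    # All fields are hypothetical simplifications of the level descriptions.
    collects_data: bool            # still gathering data (orders, web logs, ERP, ...)
    hadoop_clusters: int           # number of Hadoop clusters, test or production
    split_by_policy: bool = False  # clusters separated for business policy or compliance


def hadoop_adoption_level(org: OrgProfile) -> int:
    """Return a rough Hadoop Adoption Level (1-5) for the given profile."""
    if org.hadoop_clusters > 1:
        # Several clusters: Level 5 if the split is deliberate, Level 4 if it
        # simply grew out of separate teams or departments.
        return 5 if org.split_by_policy else 4
    if org.hadoop_clusters == 1:
        # A single (test) cluster, still working out consistent ingestion.
        return 3
    # No cluster yet: Level 2 if data is being collected, otherwise Level 1.
    return 2 if org.collects_data else 1


if __name__ == "__main__":
    example = OrgProfile(collects_data=True, hadoop_clusters=1)
    print(hadoop_adoption_level(example))  # prints 3
```

The sketch deliberately checks the number of clusters first, because in the descriptions above that is the clearest dividing line between Levels 1 and 2, Level 3, and Levels 4 and 5.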
Please let me know your thoughts and suggestions about this article.