Best Practices Gap

As you build out your first multi-tenant Hadoop cluster, it is easy to focus on getting things working and to neglect planning the structure and processes you will need to support your users operationally.

A few items to document for your cluster include the following:

  • Access Control – How do your users get set up with credentials to access the cluster?
  • Directory Structures – What is the appropriate directory structure for user directories in Unix and HDFS?
  • Retention and Archiving – How do you clean up unnecessary files, avoid wasting disk space, and still retain what you require?
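
To make the directory-structure question concrete, here is a minimal sketch of how a documented convention could be turned into a repeatable onboarding step. It assumes user home directories live under /user/<name> in HDFS (the usual default); the group name, the 750 permission mode, and the 1t space quota are illustrative assumptions, not recommendations. The helper name hdfs_home_setup_commands is hypothetical, and it only builds the commands so they can be reviewed before anyone runs them.

```python
from typing import List


def hdfs_home_setup_commands(user: str, group: str, quota: str = "1t") -> List[List[str]]:
    """Build (but do not run) the commands that provision an HDFS home directory.

    Assumptions for illustration: homes live under /user/<name>, the
    directory is group-readable (750), and each user gets a space quota.
    """
    home = f"/user/{user}"
    return [
        # Create the home directory, including any missing parents.
        ["hdfs", "dfs", "-mkdir", "-p", home],
        # Hand ownership to the user and their primary group.
        ["hdfs", "dfs", "-chown", f"{user}:{group}", home],
        # Owner: full access; group: read and traverse; others: none.
        ["hdfs", "dfs", "-chmod", "750", home],
        # Cap the disk space the directory tree may consume.
        ["hdfs", "dfsadmin", "-setSpaceQuota", quota, home],
    ]


if __name__ == "__main__":
    # Print the commands for a hypothetical user so an admin can review them.
    for cmd in hdfs_home_setup_commands("alice", "analysts"):
        print(" ".join(cmd))
```

Keeping the convention in a small script like this means the documented structure and the structure actually created on the cluster cannot drift apart unnoticed.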

