Getting Started with Cloudera Data Platform Operational Database (COD)

Concepts

What is Cloudera Operational Database (COD)?

Operational Database is a relational and non-relational database built on Apache HBase. It is designed to support OLTP applications that work with big data.

The operational database in Cloudera Data Platform has the following components:

  • Apache Phoenix provides a relational model facilitating massive scalability. It leverages the scalability and resiliency of Apache HBase.
  • Apache HBase provides a non-relational model designed for massive scalability, so you can store unlimited amounts of data in a single platform and handle growing demands for serving data.
  • Apache ZooKeeper provides a distributed configuration service, a synchronization service, and a naming registry.
  • Apache Knox Gateway provides perimeter security so that the enterprise can confidently extend access to new users.
  • Apache HDFS is used to write the Apache HBase WALs (and HBase HFiles in some cases).
  • Object stores such as Amazon S3 and Microsoft ADLS Gen2 are used to store the Apache HBase HFiles.
  • Shared Data Experience (SDX) is used for security and governance capabilities. Security and governance policies are set once and applied across all data and workloads. Just like CDP itself, SDX is built on community open source projects with Apache Ranger and Apache Atlas taking pride of place. 

Atlas provides open metadata management and governance capabilities to build a catalog of all assets, and also classify and govern these assets. The SDX layer of CDP leverages the full spectrum of Atlas to automatically track and control all data assets.

Ranger provides security key management through the Ranger KMS service, with a separate login for key administrators. Apache Ranger also provides much-needed security features such as column masking and row filtering out of the box. In addition, access policies in Ranger can be customized with dynamic context using attributes such as ‘geographic region’ or ‘time of day’.

  • IDBroker is a REST API built as part of Apache Knox’s authentication services. It allows an authenticated and authorized user to exchange a set of credentials or a token for cloud vendor access tokens.

CDP Operational Database Data Service

CDP Operational Database (COD) is a real-time auto-scaling operational database powered by Apache HBase and Apache Phoenix. It is a data service that runs on Cloudera Data Platform (CDP). You can access COD right from your CDP console. COD enables you to create a new operational database with a single click and auto-scales based on your workload.

The following are the key steps to get started with COD:

  • Create a database in an environment with a single click. The database should be up and available within a few minutes.
  • Set up your workload password. For more information, click here.
  • Download and install Apache Maven, Java, and Python 3.8.
  • Install CDP Client on your machine. For more information, click here.
  • Follow the instructions in the examples repository to make changes to your Maven settings-security.xml, settings.xml, and pom.xml.
  • Build and run the applications.
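
Before building the example applications, it can help to confirm that the tools listed above are installed. The following is a minimal sketch, not part of the COD tooling. The executable names (mvn, java, python3, cdp) are assumptions about how Apache Maven, Java, Python, and the CDP Client appear on your PATH, and find_missing_tools is a hypothetical helper.

```python
import shutil

# Executable names are assumptions: "mvn" for Apache Maven, "java" for Java,
# "python3" for Python, and "cdp" for the CDP Client.
REQUIRED_TOOLS = ("mvn", "java", "python3", "cdp")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The check only confirms that an executable exists. It does not verify versions, so confirm separately that your Python version matches the one the examples expect.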

Apache HBase

HBase is a column-oriented data store built on top of HDFS to overcome its limitations. It builds on the basic features of HDFS to provide scalability, handling a large volume of read and write requests in real time. Although HBase is a NoSQL database, it eases data maintenance by distributing data evenly across the cluster, which makes accessing and altering data in the HBase data model quick. Learn more about Apache HBase.
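
To make the data model more concrete, here is a toy in-memory sketch. It is not the HBase client API. It assumes the general HBase layout in which each row key maps to cells named by column family and qualifier, and a table is split into regions by sorted row-key ranges. ToyTable and its methods are hypothetical names used only for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


class ToyTable:
    """Toy model: row key -> "family:qualifier" -> value, split into regions."""

    def __init__(self, split_keys):
        # split_keys are the start keys of every region after the first one.
        self.split_keys = sorted(split_keys)
        self.rows = defaultdict(dict)

    def region_for(self, row_key):
        """Return the index of the region whose key range contains row_key."""
        # A region's start key is inclusive and its end key is exclusive.
        return bisect_right(self.split_keys, row_key)

    def put(self, row_key, family, qualifier, value):
        self.rows[row_key][f"{family}:{qualifier}"] = value

    def get(self, row_key):
        return dict(self.rows.get(row_key, {}))

    def rows_per_region(self):
        counts = [0] * (len(self.split_keys) + 1)
        for row_key in self.rows:
            counts[self.region_for(row_key)] += 1
        return counts


table = ToyTable(split_keys=["g", "n", "t"])
for user in ["alice", "bob", "heidi", "mallory", "oscar", "trent", "victor"]:
    table.put(user, "info", "email", f"{user}@example.com")

print(table.get("heidi"))       # {'info:email': 'heidi@example.com'}
print(table.rows_per_region())  # [2, 2, 1, 2]
```

In this sketch the seven rows land in four key ranges, which is the idea behind spreading reads and writes across the cluster. How evenly real data spreads depends on how the row keys are chosen.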

Apache Phoenix

Apache Phoenix adds a relational database layer with an ANSI SQL interface on top of Apache HBase. Phoenix implements best-practice optimizations to enable software engineers to develop next-generation data-driven applications based on HBase. Using Phoenix, you can create and interact with tables using typical DDL/DML statements through the standard JDBC API, ODBC, or the Phoenix DB API.

Phoenix provides:

  • SQL and JDBC API support
  • Support for late-bound schema-on-read
  • Access to data stored and produced in other components such as Apache Spark and Apache Hive
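
As an illustration of the DDL/DML workflow, the sketch below runs a CREATE TABLE, an UPSERT (Phoenix's statement for inserting or updating rows), and a SELECT through any Python DB-API connection that accepts ? placeholders, such as one from the phoenixdb package. The users table, its columns, and the load_users helper are hypothetical, and the connection itself is assumed to be created elsewhere.

```python
# Table and column names are illustrative assumptions, not part of COD.
CREATE_TABLE = (
    "CREATE TABLE IF NOT EXISTS users ("
    "id BIGINT PRIMARY KEY, username VARCHAR)"
)
UPSERT_USER = "UPSERT INTO users VALUES (?, ?)"
SELECT_USERS = "SELECT id, username FROM users ORDER BY id"


def load_users(connection, users):
    """Create the table, upsert (id, username) pairs, and read them back."""
    cursor = connection.cursor()
    cursor.execute(CREATE_TABLE)
    for user_id, username in users:
        cursor.execute(UPSERT_USER, (user_id, username))
    cursor.execute(SELECT_USERS)
    return cursor.fetchall()


# Possible usage with the phoenixdb driver (not executed here). The URL is a
# placeholder, and your deployment may need extra authentication settings:
#
#   import phoenixdb
#   conn = phoenixdb.connect("<Phoenix Query Server URL>", autocommit=True)
#   print(load_users(conn, [(1, "admin"), (2, "guest")]))
```

Passing the connection in as a parameter keeps the SQL separate from connection details, which differ for each database.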

Learn more about Apache Phoenix.

Procedure

How to create an Operational Database

You can create an operational database in your registered environment using CDP Operational Database (COD).

Prerequisites

  • You must be logged in to the COD environment as an ODAdmin.
  • Ensure that you are authorized to create a database.

Steps

  1. Log in to the CDP web interface (for example, the CDP console).
  2. Select Operational Database.
  3. In the COD web interface, click Create Database.
  4. From the list, select the environment in which you want to create the database.
  5. Provide a name for the database in the Database Name field.
  6. Click Create Database.

Result

An information page is displayed that shows the status of the database. Your new database is ready to use once its status becomes Available.
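
If you script database creation instead of using the web interface, the same wait-for-Available step can be sketched as a polling loop. This is a sketch under stated assumptions: the cdp opdb create-database command and its flags are assumed from the CDP CLI and should be verified against your installed client, get_status is a hypothetical callable that you supply to return the current status string, and the timeout and polling interval are arbitrary.

```python
import time


def create_database_command(environment, database):
    """Build the (assumed) CDP CLI invocation that creates a database."""
    return ["cdp", "opdb", "create-database",
            "--environment-name", environment,
            "--database-name", database]


def wait_until_available(get_status, timeout_s=1800, poll_s=30, sleep=time.sleep):
    """Poll get_status() until it reports "Available" (case-insensitive)."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status.lower() == "available":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"database still in status {status!r}")
        sleep(poll_s)
```

The command list can be passed to subprocess.run, and get_status can wrap whatever status lookup you use. Only the Available status comes from the procedure above; any other status names you see depend on the service.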

How to manage a database connection

After you create an operational database and it is available, you can manage the database connections.

Prerequisites

  • Ensure that a database is up and available.
  • Ensure that you are authorized to make changes to the database.

Steps

  1. In the COD web interface, select the database for which you want to manage the connections.
  2. Under Connect, open each tab and modify the parameters as needed.

References

If you are interested in trying out CDP Public Cloud and the Operational Database, try out our Test Drive.

The post Getting Started with Cloudera Data Platform Operational Database (COD) appeared first on Cloudera Blog.