Thursday, November 3, 2016

How-to: Use the New Apache Oozie Database Migration Tool

The Apache Oozie server is a stateless web application by design, with all information about running and completed workflows, coordinator jobs, and bundle jobs stored in a relational database. Prior to Cloudera Manager 5.4, Oozie was configured to use the embedded Apache Derby database for this purpose by default. However, while Derby can safely be used in very small or test/dev clusters, it is not recommended for production Oozie installations, mainly because Derby suffers from known locking issues at high scale and with large amounts of data. Furthermore, to provide additional scalability and fault tolerance, Oozie introduced high availability (HA) starting in CDH 5. Unfortunately, Derby cannot be used in an HA setup, as it does not support multiple concurrent connections. In this post, we’ll describe how Cloudera has addressed these issues for users in a forthcoming Oozie release (and soon to ship in Cloudera Enterprise).

http://blog.cloudera.com/blog/2016/11/how-to-use-the-new-apache-oozie-database-migration-tool/
