This document briefly describes how Kyuubi submits itself to YARN.
Kyuubi supports “client” mode by default, which means that Kyuubi launches a server process on the local node and serves client-side JDBC/ODBC connections. Every node that runs a Kyuubi server must therefore be set up with the full environment and all other prerequisites. This makes deploying Kyuubi servers inconvenient, especially in HA mode, and even more so when the nodes run different Linux releases.
Kyuubi containerization makes deployment much easier: it turns the Kyuubi server instance into a containerized, server-less service running in a YARN container.
The above picture shows the overall architecture of Kyuubi containerization. The key concept is simple: run the Kyuubi server as a YARN container and serve JDBC/ODBC clients remotely. In this deployment mode, we do not need to configure individual nodes or make customizations for complicated situations.
We can use the Client to launch as many Kyuubi servers as we need. Each containerized Kyuubi server is maintained in the YARN cluster as a long-running service.
|Component|Description|
|---|---|
|Client|Kyuubi YARN Client, with all the information we need to deploy Kyuubi|
|Kyuubi Server|Kyuubi server instance wrapped as KyuubiAppMaster and launched by YARN as an ApplicationMaster container|
|Spark AM|Spark’s ApplicationMaster, here acting as the ExecutorLauncher|
|Spark Executor|A process launched on a NodeManager that runs tasks and keeps data in memory or disk storage across them. Each SparkContext has its own executors.|
|Zookeeper Service Discovery|ZooKeeper Dynamic Service Discovery, which is useful in Kyuubi containerization because the port of the KyuubiServer frontend service is randomly picked.|
|JDBC/ODBC/Thrift Client|Various kinds of clients that talk to the Kyuubi Server|
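Because the frontend port is picked at random, clients typically resolve the server through ZooKeeper rather than a fixed host and port. The sketch below builds a HiveServer2-style JDBC URL for that purpose. The ZooKeeper hosts and the namespace are hypothetical placeholders, and the assumption that the Kyuubi server registers itself in a form the Hive JDBC driver can discover is not confirmed by this document.

```shell
# Build a HiveServer2-style JDBC URL that locates the server through ZooKeeper.
# Both arguments are placeholders supplied by the caller, not defaults from this guide.
kyuubi_jdbc_url() {
  zk_quorum="$1"     # comma-separated host:port list of the ZooKeeper ensemble
  zk_namespace="$2"  # znode namespace the Kyuubi servers register under
  printf 'jdbc:hive2://%s/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=%s\n' \
    "$zk_quorum" "$zk_namespace"
}

# Hypothetical ensemble and namespace, for illustration only.
kyuubi_jdbc_url zk1:2181,zk2:2181,zk3:2181 kyuubiserver
```

The resulting string can then be passed to a JDBC client, for example `beeline -u "$(kyuubi_jdbc_url ...)"`, on a machine where such a client is installed.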
The table below lists the server-side configurations that the Kyuubi container uses to launch and size itself.
|Name|Default|Meaning|
|---|---|---|
|--deploy-mode|client|When “cluster” is set, Kyuubi containerization is enabled|
|spark.driver.memory|1024m|Kyuubi server container heap size|
|spark.yarn.driver.memoryOverhead|spark.driver.memory * 0.1|Overhead memory for the Kyuubi server container|
|spark.driver.cores|DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES|Kyuubi server container cores|
|spark.yarn.am.extraJavaOptions|(none)|Extra JVM options for the Kyuubi container|
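To make the sizing concrete, the following sketch adds the heap size and the overhead from the table to estimate the memory requested for the Kyuubi server container. The function name is hypothetical, values are in MB, and the 10% fallback simply follows the default listed above. The memory YARN actually grants may be larger, since the scheduler can round requests up to its allocation increment.

```shell
# Estimate the memory (in MB) requested for the Kyuubi server container:
# heap (spark.driver.memory) plus overhead (spark.yarn.driver.memoryOverhead).
# When no overhead is given, fall back to the table's default of 10% of the heap.
kyuubi_container_mb() {
  heap_mb="$1"
  overhead_mb="${2:-$(( heap_mb / 10 ))}"  # integer division rounds down
  echo $(( heap_mb + overhead_mb ))
}

kyuubi_container_mb 1024       # default heap from the table, prints 1126
kyuubi_container_mb 4096 512   # explicit overhead, prints 4608
```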
First, please refer to the online Kyuubi Deployment Guide to learn how to configure the Kyuubi client.
Then, the only thing we need to do is launch Kyuubi with bin/start-kyuubi.sh and set the deploy mode to “cluster”.
$ bin/start-kyuubi.sh --master yarn --deploy-mode cluster
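The sizing options from the configuration table could be supplied at launch time as well. The sketch below only prints such a command as a dry run. It assumes that start-kyuubi.sh forwards spark-submit style `--conf key=value` options, which the `--master` and `--deploy-mode` arguments suggest but this document does not confirm. The function name and the example values are hypothetical.

```shell
# Print (dry run) a launch command that also sizes the Kyuubi server container.
# Assumption: start-kyuubi.sh accepts spark-submit style "--conf key=value" options.
kyuubi_launch_cmd() {
  heap="$1"   # value for spark.driver.memory, e.g. 2g
  cores="$2"  # value for spark.driver.cores, e.g. 2
  echo "bin/start-kyuubi.sh --master yarn --deploy-mode cluster" \
       "--conf spark.driver.memory=$heap --conf spark.driver.cores=$cores"
}

kyuubi_launch_cmd 2g 2
```

Printing the command first makes it easy to review the settings before running it from the Kyuubi installation directory.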
Finally, a KYUUBI-type YARN application named KYUUBI SERVER[version] will be created on the YARN cluster. In the ResourceManager UI, we should see something like the following:
The server log is also available through the ApplicationMaster page.
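The application can also be looked up from the command line with the standard YARN CLI. The helper below is a minimal sketch with a hypothetical name: it reads the output of `yarn application -list` on standard input and keeps only the application IDs. The KYUUBI application type used in the usage comment is the one named above.

```shell
# Extract application IDs from `yarn application -list` output read on stdin.
# Data rows start with an ID of the form application_<clusterTimestamp>_<sequence>;
# summary and header lines do not match and are skipped.
kyuubi_app_ids() {
  awk '$1 ~ /^application_[0-9]+_[0-9]+$/ { print $1 }'
}

# Usage on a machine with the YARN CLI configured (not run here):
#   yarn application -list -appTypes KYUUBI -appStates RUNNING | kyuubi_app_ids
#   yarn logs -applicationId <application id>
```

Whether `yarn logs` returns anything for a running application depends on the cluster's log aggregation setup, so the ApplicationMaster page remains the more direct route to the server log.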