An optimization rule that provides SQL Standard Authorization for Apache Spark
Ranger security support is one of the available authorization methods for Spark SQL with spark-authorizer.
Ranger is a framework to enable, monitor, and manage comprehensive data security across the Hadoop platform. spark-authorizer gives Spark SQL access control by reusing the Ranger Plugin for Hive MetaStore. Ranger extends the scope of the existing SQL-Standard Based Authorization but does not support Spark SQL itself; spark-authorizer ties the two together.
Configuration | Configuration File | Example | Description |
---|---|---|---|
ranger.plugin.hive.policy.rest.url | ranger-hive-security.xml | http://ranger.admin.one:6080,http://ranger.admin.two.lt.163.org:6080 | Comma-separated list of Ranger Admin addresses |
ranger.plugin.hive.service.name | ranger-hive-security.xml | | Name of the Ranger service containing policies for this YARN instance |
ranger.plugin.hive.policy.cache.dir | ranger-hive-security.xml | policycache | Local directory for cached Ranger policies |
Create ranger-hive-security.xml in $SPARK_HOME/conf with the configurations above properly set.
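As a rough illustration of the table above, a minimal ranger-hive-security.xml might look like the following sketch. The URL and cache directory are taken from the example column; the service name `hive_service_example` is a hypothetical placeholder (the table gives no example value), so replace it with the service actually defined in your Ranger Admin.

```xml
<?xml version="1.0"?>
<!-- Illustrative $SPARK_HOME/conf/ranger-hive-security.xml; all values are placeholders -->
<configuration>
  <property>
    <name>ranger.plugin.hive.policy.rest.url</name>
    <!-- Comma-separated list of Ranger Admin addresses -->
    <value>http://ranger.admin.one:6080,http://ranger.admin.two.lt.163.org:6080</value>
  </property>
  <property>
    <name>ranger.plugin.hive.service.name</name>
    <!-- Hypothetical service name; use the one configured in your Ranger Admin -->
    <value>hive_service_example</value>
  </property>
  <property>
    <name>ranger.plugin.hive.policy.cache.dir</name>
    <!-- Local directory where downloaded policies are cached -->
    <value>policycache</value>
  </property>
</configuration>
```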
<!-- Ranger -->
<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizerFactory</value>
</property>
<property>
  <name>hive.security.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
</property>
<property>
  <name>hive.conf.restricted.list</name>
  <value>
    hive.security.authorization.enabled,hive.security.authorization.manager,hive.security.authenticator.manager
  </value>
</property>
Add the configurations above to $SPARK_HOME/conf/hive-site.xml to enable Ranger security support.
All access to Spark SQL/Hive tables that is authorized by Ranger is automatically audited by Ranger. Audit destinations such as HDFS and Solr are supported.
Configuration | Configuration File | Example | Description |
---|---|---|---|
xasecure.audit.is.enabled | ranger-hive-audit.xml | false | When true, auditing is enabled |
xasecure.audit.jpa.javax.persistence.jdbc.driver | ranger-hive-audit.xml | com.mysql.jdbc.Driver | JDBC driver used when auditing to a MySQL database destination |
xasecure.audit.jpa.javax.persistence.jdbc.url | ranger-hive-audit.xml | jdbc:mysql://address/dbname | JDBC URL of the database instance that receives audit records |
xasecure.audit.jpa.javax.persistence.jdbc.user | ranger-hive-audit.xml | username | Database user name |
xasecure.audit.jpa.javax.persistence.jdbc.password | ranger-hive-audit.xml | Password | Database password |
Create ranger-hive-audit.xml in $SPARK_HOME/conf with the configurations above properly set to enable or disable auditing.
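For orientation, a ranger-hive-audit.xml that audits to a MySQL database could be sketched as below. The property names come from the table above; the JDBC URL, user name, and password are the table's placeholder values rather than working credentials, and `xasecure.audit.is.enabled` is set to `true` here on the assumption that auditing should be turned on.

```xml
<?xml version="1.0"?>
<!-- Illustrative $SPARK_HOME/conf/ranger-hive-audit.xml; all values are placeholders -->
<configuration>
  <property>
    <name>xasecure.audit.is.enabled</name>
    <!-- Set to false to disable auditing -->
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.jpa.javax.persistence.jdbc.driver</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>xasecure.audit.jpa.javax.persistence.jdbc.url</name>
    <!-- Replace address and dbname with your audit database location -->
    <value>jdbc:mysql://address/dbname</value>
  </property>
  <property>
    <name>xasecure.audit.jpa.javax.persistence.jdbc.user</name>
    <value>username</value>
  </property>
  <property>
    <name>xasecure.audit.jpa.javax.persistence.jdbc.password</name>
    <value>Password</value>
  </property>
</configuration>
```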
ranger-hive-plugin for Spark SQL
Please refer to Install and Enable Ranger Hive Plugin for an overview of how to set up the Ranger jars for Spark SQL.