Apache Zeppelin Configuration
Zeppelin Properties
Zeppelin can be configured via several sources.
Sources descending by priority:
- environment variables, which can be defined in conf/zeppelin-env.sh (conf\zeppelin-env.cmd for Windows)
- system properties
- the configuration file, which can be defined in conf/zeppelin-site.xml
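As a rough illustration of this priority order, the sketch below shows what one might put in conf/zeppelin-env.sh. The port numbers are arbitrary examples, and passing a system property through ZEPPELIN_JAVA_OPTS with -D is an assumption about how the JVM options reach the server process.

```shell
# conf/zeppelin-env.sh (sketch; the port numbers are arbitrary examples)

# Highest priority: an environment variable.
export ZEPPELIN_PORT=8081

# Next: a JVM system property, assumed here to be passed via ZEPPELIN_JAVA_OPTS.
export ZEPPELIN_JAVA_OPTS="-Dzeppelin.server.port=8082"

# Lowest priority: zeppelin.server.port in conf/zeppelin-site.xml.
# With all three set, the priority list implies the server listens on 8081.
echo "ZEPPELIN_PORT=${ZEPPELIN_PORT}"
```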
Hover over a property and click it to get a direct link to that property.
zeppelin-env.sh | zeppelin-site.xml | Default value | Description |
---|---|---|---|
ZEPPELIN_ADDR | zeppelin.server.addr | 127.0.0.1 | Zeppelin server binding address |
ZEPPELIN_PORT | zeppelin.server.port | 8080 | Zeppelin server port. Note: Please make sure you're not using the same port as the Zeppelin web application development port (default: 9000). |
ZEPPELIN_SSL_PORT | zeppelin.server.ssl.port | 8443 | Zeppelin server SSL port (used when the ssl environment variable/property is set to true) |
ZEPPELIN_JMX_ENABLE | zeppelin.jmx.enable | false | Enable JMX by setting this to "true" |
ZEPPELIN_JMX_PORT | zeppelin.jmx.port | 9996 | Port number which JMX uses |
ZEPPELIN_MEM | N/A | -Xmx1024m -XX:MaxMetaspaceSize=512m | JVM mem options |
ZEPPELIN_INTP_MEM | N/A | ZEPPELIN_MEM | JVM mem options for interpreter process |
ZEPPELIN_JAVA_OPTS | N/A |  | JVM options |
ZEPPELIN_ALLOWED_ORIGINS | zeppelin.server.allowed.origins | * | Enables a way to specify a ',' separated list of allowed origins for REST and websockets. e.g. http://localhost:8080 |
ZEPPELIN_CREDENTIALS_PERSIST | zeppelin.credentials.persist | true | Persist credentials on a JSON file (credentials.json) |
ZEPPELIN_CREDENTIALS_ENCRYPT_KEY | zeppelin.credentials.encryptKey |  | If provided, encrypt passwords on the credentials.json file (passwords will be stored as plain-text otherwise) |
ZEPPELIN_SERVER_CONTEXT_PATH | zeppelin.server.context.path | / | Context path of the web application |
ZEPPELIN_NOTEBOOK_COLLABORATIVE_MODE_ENABLE | zeppelin.notebook.collaborative.mode.enable | true | Enable basic support for collaborative editing. Does not change the logic of operation if the note is used by one person. |
ZEPPELIN_SSL | zeppelin.ssl | false |  |
ZEPPELIN_SSL_CLIENT_AUTH | zeppelin.ssl.client.auth | false |  |
ZEPPELIN_SSL_KEYSTORE_PATH | zeppelin.ssl.keystore.path | keystore |  |
ZEPPELIN_SSL_KEYSTORE_TYPE | zeppelin.ssl.keystore.type | JKS |  |
ZEPPELIN_SSL_KEYSTORE_PASSWORD | zeppelin.ssl.keystore.password |  |  |
ZEPPELIN_SSL_KEY_MANAGER_PASSWORD | zeppelin.ssl.key.manager.password |  |  |
ZEPPELIN_SSL_TRUSTSTORE_PATH | zeppelin.ssl.truststore.path |  |  |
ZEPPELIN_SSL_TRUSTSTORE_TYPE | zeppelin.ssl.truststore.type |  |  |
ZEPPELIN_SSL_TRUSTSTORE_PASSWORD | zeppelin.ssl.truststore.password |  |  |
ZEPPELIN_SSL_PEM_KEY | zeppelin.ssl.pem.key |  | This directive points to the PEM-encoded private key file for the server. |
ZEPPELIN_SSL_PEM_KEY_PASSWORD | zeppelin.ssl.pem.key.password |  | Password of the PEM-encoded private key. |
ZEPPELIN_SSL_PEM_CERT | zeppelin.ssl.pem.cert |  | This directive points to a file with certificate data in PEM format. |
ZEPPELIN_SSL_PEM_CA | zeppelin.ssl.pem.ca |  | This directive sets the all-in-one file where you can assemble the Certificates of Certification Authorities (CA) whose clients you deal with. These are used for Client Authentication. Such a file is simply the concatenation of the various PEM-encoded Certificate files. |
ZEPPELIN_NOTEBOOK_HOMESCREEN | zeppelin.notebook.homescreen |  | Display note IDs on the Apache Zeppelin homescreen e.g. 2A94M5J1Z |
ZEPPELIN_NOTEBOOK_HOMESCREEN_HIDE | zeppelin.notebook.homescreen.hide | false | Hide the note ID set by ZEPPELIN_NOTEBOOK_HOMESCREEN on the Apache Zeppelin homescreen. For further information, please read Customize your Zeppelin homepage. |
ZEPPELIN_WAR_TEMPDIR | zeppelin.war.tempdir | webapps | Location of the Jetty temporary directory |
ZEPPELIN_NOTEBOOK_DIR | zeppelin.notebook.dir | notebook | The root directory where notebook directories are saved |
ZEPPELIN_NOTEBOOK_S3_BUCKET | zeppelin.notebook.s3.bucket | zeppelin | S3 Bucket where notebook files will be saved |
ZEPPELIN_NOTEBOOK_S3_USER | zeppelin.notebook.s3.user | user | User name of an S3 bucket e.g. bucket/user/notebook/2A94M5J1Z/note.json |
ZEPPELIN_NOTEBOOK_S3_ENDPOINT | zeppelin.notebook.s3.endpoint | s3.amazonaws.com | Endpoint for the bucket |
N/A | zeppelin.notebook.s3.timeout | 120000 | Bucket endpoint request timeout in msec |
ZEPPELIN_NOTEBOOK_S3_KMS_KEY_ID | zeppelin.notebook.s3.kmsKeyID |  | AWS KMS Key ID to use for encrypting data in S3 (optional) |
ZEPPELIN_NOTEBOOK_S3_EMP | zeppelin.notebook.s3.encryptionMaterialsProvider |  | Class name of a custom S3 encryption materials provider implementation to use for encrypting data in S3 (optional) |
ZEPPELIN_NOTEBOOK_S3_SSE | zeppelin.notebook.s3.sse | false | Save notebooks to S3 with server-side encryption enabled |
ZEPPELIN_NOTEBOOK_S3_CANNED_ACL | zeppelin.notebook.s3.cannedAcl |  | Save notebooks to S3 with the given [Canned ACL](https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/CannedAccessControlList.html) which determines the S3 permissions. |
ZEPPELIN_NOTEBOOK_S3_PATH_STYLE_ACCESS | zeppelin.notebook.s3.pathStyleAccess | false | Access S3 bucket using path style |
ZEPPELIN_NOTEBOOK_S3_SIGNEROVERRIDE | zeppelin.notebook.s3.signerOverride |  | Optional override to control which signature algorithm should be used to sign AWS requests |
ZEPPELIN_NOTEBOOK_AZURE_CONNECTION_STRING | zeppelin.notebook.azure.connectionString |  | The Azure storage account connection string e.g. DefaultEndpointsProtocol=https; |
ZEPPELIN_NOTEBOOK_AZURE_SHARE | zeppelin.notebook.azure.share | zeppelin | Azure Share where the notebook files will be saved |
ZEPPELIN_NOTEBOOK_AZURE_USER | zeppelin.notebook.azure.user | user | Optional user name of an Azure file share e.g. share/user/notebook/2A94M5J1Z/note.json |
ZEPPELIN_NOTEBOOK_STORAGE | zeppelin.notebook.storage | org.apache.zeppelin.notebook.repo.GitNotebookRepo | Comma-separated list of notebook storage locations |
ZEPPELIN_NOTEBOOK_ONE_WAY_SYNC | zeppelin.notebook.one.way.sync | false | If there are multiple notebook storage locations, should we treat the first one as the only source of truth? |
ZEPPELIN_NOTEBOOK_PUBLIC | zeppelin.notebook.public | true | Make notebooks public (only owners are set) by default when created/imported. If set to false, the user is also added to readers and writers, making the notebook private and invisible to other users unless permissions are granted. |
ZEPPELIN_INTERPRETER_DIR | zeppelin.interpreter.dir | interpreter | Interpreter directory |
ZEPPELIN_INTERPRETER_DEP_MVNREPO | zeppelin.interpreter.dep.mvnRepo | https://repo1.maven.org/maven2/,https://repo2.maven.org/maven2/ | Remote principal repository for interpreter's additional dependency loading |
ZEPPELIN_INTERPRETER_OUTPUT_LIMIT | zeppelin.interpreter.output.limit | 102400 | Output message from interpreter exceeding the limit will be truncated |
ZEPPELIN_INTERPRETER_CONNECT_TIMEOUT | zeppelin.interpreter.connect.timeout | 600s | Interpreter process connect timeout. Default time unit is msec. |
ZEPPELIN_DEP_LOCALREPO | zeppelin.dep.localrepo | local-repo | Local repository for dependency loader, e.g. visualization modules of npm. |
ZEPPELIN_HELIUM_NODE_INSTALLER_URL | zeppelin.helium.node.installer.url | https://nodejs.org/dist/ | Remote Node installer URL for Helium dependency loader |
ZEPPELIN_HELIUM_NPM_INSTALLER_URL | zeppelin.helium.npm.installer.url | http://registry.npmjs.org/ | Remote npm installer URL for Helium dependency loader |
ZEPPELIN_HELIUM_YARNPKG_INSTALLER_URL | zeppelin.helium.yarnpkg.installer.url | https://github.com/yarnpkg/yarn/releases/download/ | Remote Yarn package installer URL for Helium dependency loader |
ZEPPELIN_WEBSOCKET_MAX_TEXT_MESSAGE_SIZE | zeppelin.websocket.max.text.message.size | 1024000 | Size (in characters) of the maximum text message that can be received by websocket. |
ZEPPELIN_SERVER_DEFAULT_DIR_ALLOWED | zeppelin.server.default.dir.allowed | false | Enable directory listings on server. |
ZEPPELIN_NOTEBOOK_GIT_REMOTE_URL | zeppelin.notebook.git.remote.url |  | GitHub's repository URL. It could be either the HTTP URL or the SSH URL. For example git@github.com:apache/zeppelin.git |
ZEPPELIN_NOTEBOOK_GIT_REMOTE_USERNAME | zeppelin.notebook.git.remote.username | token | GitHub username. By default it is `token` to use GitHub's API |
ZEPPELIN_NOTEBOOK_GIT_REMOTE_ACCESS_TOKEN | zeppelin.notebook.git.remote.access-token | token | GitHub access token to use GitHub's API. If a username/password combination is used instead of the GitHub API, then this value is the password |
ZEPPELIN_NOTEBOOK_GIT_REMOTE_ORIGIN | zeppelin.notebook.git.remote.origin | origin | GitHub remote name. Default is `origin` |
ZEPPELIN_RUN_MODE | zeppelin.run.mode | auto | Run mode: 'auto', 'local' or 'k8s'. 'auto' autodetects the environment. 'local' runs the interpreter as a local process. 'k8s' runs the interpreter on a Kubernetes cluster |
ZEPPELIN_K8S_PORTFORWARD | zeppelin.k8s.portforward | false | Port forward to interpreter rpc port. Set 'true' only for local development when zeppelin.run.mode is 'k8s'. Don't use 'true' in a production environment |
ZEPPELIN_K8S_CONTAINER_IMAGE | zeppelin.k8s.container.image | apache/zeppelin:0.11.0 | Docker image for interpreters |
ZEPPELIN_K8S_SPARK_CONTAINER_IMAGE | zeppelin.k8s.spark.container.image | apache/spark:latest | Docker image for Spark executors |
ZEPPELIN_K8S_TEMPLATE_DIR | zeppelin.k8s.template.dir | k8s | Kubernetes yaml spec files |
ZEPPELIN_K8S_SERVICE_NAME | zeppelin.k8s.service.name | zeppelin-server | Name of the Zeppelin server service resources |
ZEPPELIN_K8S_TIMEOUT_DURING_PENDING | zeppelin.k8s.timeout.during.pending | true | Value to enable/disable timeout handling when starting Interpreter Pods. Caution: This can lead to an infinite loop |
ZEPPELIN_METRIC_ENABLE_PROMETHEUS | zeppelin.metric.enable.prometheus | false | Value to enable/disable Prometheus metric endpoint on /metric |
ZEPPELIN_NOTEBOOK_CRON_ENABLE | zeppelin.notebook.cron.enable | false | Value to enable/disable Cron support in Notes |
ZEPPELIN_NOTEBOOK_CRON_FOLDERS | zeppelin.notebook.cron.folders |  | Comma-separated list of folders where cron is allowed |
ZEPPELIN_NOTE_CACHE_THRESHOLD | zeppelin.note.cache.threshold | 50 | Threshold for the number of notes in the cache before an eviction occurs. |
ZEPPELIN_NOTEBOOK_VERSIONED_MODE_ENABLE | zeppelin.notebook.versioned.mode.enable | true | Value to enable/disable version control support in Notes. |
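The S3 rows above describe where a note ends up: bucket/user/notebook/&lt;note ID&gt;/note.json. The small sketch below, using only the default values from the table and the example note ID 2A94M5J1Z, shows how those settings combine. The helper name s3_note_key is hypothetical and not part of Zeppelin.

```shell
# Defaults taken from the properties table.
export ZEPPELIN_NOTEBOOK_S3_BUCKET=zeppelin
export ZEPPELIN_NOTEBOOK_S3_USER=user

# Hypothetical helper: builds the object location for a note ID,
# following the pattern bucket/user/notebook/<note ID>/note.json.
s3_note_key() {
  printf '%s/%s/notebook/%s/note.json\n' \
    "$ZEPPELIN_NOTEBOOK_S3_BUCKET" "$ZEPPELIN_NOTEBOOK_S3_USER" "$1"
}

s3_note_key 2A94M5J1Z
```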
SSL Configuration
Enabling SSL requires a few configuration changes. First, create the certificates, then update the necessary configuration to enable server-side SSL and/or client-side certificate authentication.
Creating and configuring the Certificates
Information about how to generate certificates and a keystore can be found here.
A condensed example can be found in the top answer to this StackOverflow post.
The keystore holds the private key and certificate on the server end. The truststore holds the trusted client certificates. Be sure that the path and password for these two stores are correctly configured in the properties below. The passwords can be obfuscated using the Jetty password tool. After Maven pulls in all the dependencies to build Zeppelin, one of the Jetty jars contains the Password tool. Invoke this command from the Zeppelin home build directory with the appropriate version, user, and password.
java -cp ./zeppelin-server/target/lib/jetty-all-server-<version>.jar \
org.eclipse.jetty.util.security.Password <user> <password>
If you are using a self-signed certificate or a certificate signed by an untrusted CA, or if client authentication is enabled, then the client must have the browser create exceptions for both the normal HTTPS port and the WebSocket port. This can be done by trying to establish an HTTPS connection to both ports in a browser (e.g. if the ports are 443 and 8443, then visit https://127.0.0.1:443 and https://127.0.0.1:8443). This step can be skipped if the server certificate is signed by a trusted CA and client auth is disabled.
Configuring server side SSL
The following properties need to be updated in zeppelin-site.xml to enable server-side SSL.
<property>
<name>zeppelin.server.ssl.port</name>
<value>8443</value>
<description>Server ssl port. (used when ssl property is set to true)</description>
</property>
<property>
<name>zeppelin.ssl</name>
<value>true</value>
<description>Should SSL be used by the servers?</description>
</property>
<property>
<name>zeppelin.ssl.keystore.path</name>
<value>keystore</value>
<description>Path to keystore relative to Zeppelin configuration directory</description>
</property>
<property>
<name>zeppelin.ssl.keystore.type</name>
<value>JKS</value>
<description>The format of the given keystore (e.g. JKS or PKCS12)</description>
</property>
<property>
<name>zeppelin.ssl.keystore.password</name>
<value>change me</value>
<description>Keystore password. Can be obfuscated by the Jetty Password tool</description>
</property>
<property>
<name>zeppelin.ssl.key.manager.password</name>
<value>change me</value>
<description>Key Manager password. Defaults to keystore password. Can be obfuscated.</description>
</property>
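The same server-side settings can presumably be expressed in conf/zeppelin-env.sh using the environment variable names from the properties table. This is only a sketch: "change me" is a placeholder, and reading the password from a caller-supplied KEYSTORE_PASSWORD variable is a hypothetical convention, not something Zeppelin defines.

```shell
# conf/zeppelin-env.sh (sketch): environment-variable form of the XML properties above
export ZEPPELIN_SSL=true
export ZEPPELIN_SSL_PORT=8443
export ZEPPELIN_SSL_KEYSTORE_PATH=keystore
export ZEPPELIN_SSL_KEYSTORE_TYPE=JKS

# KEYSTORE_PASSWORD is a hypothetical variable supplied by the caller;
# "change me" is only a placeholder fallback.
export ZEPPELIN_SSL_KEYSTORE_PASSWORD="${KEYSTORE_PASSWORD:-change me}"

# The key manager password defaults to the keystore password,
# so it only needs to be set when the two differ.
```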
Enabling client side certificate authentication
The following properties need to be updated in zeppelin-site.xml to enable client-side certificate authentication.
<property>
<name>zeppelin.server.ssl.port</name>
<value>8443</value>
<description>Server ssl port. (used when ssl property is set to true)</description>
</property>
<property>
<name>zeppelin.ssl.client.auth</name>
<value>true</value>
<description>Should client authentication be used for SSL connections?</description>
</property>
<property>
<name>zeppelin.ssl.truststore.path</name>
<value>truststore</value>
<description>Path to truststore relative to Zeppelin configuration directory. Defaults to the keystore path</description>
</property>
<property>
<name>zeppelin.ssl.truststore.type</name>
<value>JKS</value>
<description>The format of the given truststore (e.g. JKS or PKCS12). Defaults to the same type as the keystore type</description>
</property>
<property>
<name>zeppelin.ssl.truststore.password</name>
<value>change me</value>
<description>Truststore password. Can be obfuscated by the Jetty Password tool. Defaults to the keystore password</description>
</property>
Storing user credentials
To avoid having to re-enter credentials every time you restart or redeploy Zeppelin, you can store the user credentials. Zeppelin supports this via the ZEPPELIN_CREDENTIALS_PERSIST configuration.
Please note that passwords will be stored in plain text by default. To encrypt the passwords, use the ZEPPELIN_CREDENTIALS_ENCRYPT_KEY config variable. This will encrypt passwords using the AES-128 algorithm.
You can generate an appropriate encryption key any way you'd like - for instance, by using the openssl tool:
openssl enc -aes-128-cbc -k secret -P -md sha1
Important: storing your encryption key in a configuration file is not advised. Depending on your environment's security needs, you may want to consider using a credentials server, storing the ZEPPELIN_CREDENTIALS_ENCRYPT_KEY as an OS environment variable, or any other approach that does not colocate the encryption key and the encrypted content (the credentials.json file).
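One way to keep the key out of the Zeppelin configuration files is to load it from a separate, access-restricted file into the environment before the server starts. The sketch below assumes such a file exists. The helper name load_encrypt_key and the example path are hypothetical, and the expected key format is not covered here.

```shell
# Hypothetical helper: load the encryption key from a file kept outside the
# Zeppelin configuration directory and export it for the server process.
load_encrypt_key() {
  key_file="$1"
  if [ ! -r "$key_file" ]; then
    echo "cannot read key file: $key_file" >&2
    return 1
  fi
  ZEPPELIN_CREDENTIALS_ENCRYPT_KEY="$(cat "$key_file")"
  export ZEPPELIN_CREDENTIALS_ENCRYPT_KEY
  export ZEPPELIN_CREDENTIALS_PERSIST=true
}

# Example call (the path is an assumption):
# load_encrypt_key /etc/zeppelin/credentials.key
```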
Obfuscating Passwords using the Jetty Password Tool
Security best practices advise against using plain-text passwords, and Jetty provides a password tool to help obfuscate the passwords used to access the KeyStore and TrustStore.
The Password tool documentation can be found here.
After using the tool:
java -cp $ZEPPELIN_HOME/zeppelin-server/target/lib/jetty-util-9.2.15.v20160210.jar \
org.eclipse.jetty.util.security.Password \
password
2016-12-15 10:46:47.931:INFO::main: Logging initialized @101ms
password
OBF:1v2j1uum1xtv1zej1zer1xtn1uvk1v1v
MD5:5f4dcc3b5aa765d61d8327deb882cf99
update your configuration with the obfuscated password:
<property>
<name>zeppelin.ssl.keystore.password</name>
<value>OBF:1v2j1uum1xtv1zej1zer1xtn1uvk1v1v</value>
<description>Keystore password. Can be obfuscated by the Jetty Password tool</description>
</property>
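To avoid copying the value by hand, the tool's output can be filtered for the line that starts with OBF:. The helper below is a hypothetical convenience, not part of Jetty or Zeppelin, and the sample input is the output shown earlier in this section.

```shell
# Hypothetical helper: keep only the first OBF: line from the Password tool's output.
extract_obf() {
  grep '^OBF:' | head -n 1
}

# Sample output copied from the run shown earlier in this section.
printf '%s\n' 'password' \
  'OBF:1v2j1uum1xtv1zej1zer1xtn1uvk1v1v' \
  'MD5:5f4dcc3b5aa765d61d8327deb882cf99' | extract_obf
```

In practice one would pipe the java command's output into extract_obf instead of the sample text.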
Create GitHub Access Token
When using GitHub to track notebooks, one can use GitHub's API for authentication. To create an access token, please use the following link https://github.com/settings/tokens.
The value of the generated access token is set in the zeppelin.notebook.git.remote.access-token property.
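Using the environment variable names from the properties table, the GitHub remote settings might look like the sketch below in conf/zeppelin-env.sh. The repository URL is only an example, and GITHUB_ACCESS_TOKEN is a hypothetical variable assumed to be set outside this file so that the token itself is not written into it.

```shell
# conf/zeppelin-env.sh (sketch)
# Example repository URL; replace it with your own repository.
export ZEPPELIN_NOTEBOOK_GIT_REMOTE_URL="https://github.com/apache/zeppelin.git"
# "token" is the documented default username for using GitHub's API.
export ZEPPELIN_NOTEBOOK_GIT_REMOTE_USERNAME=token
export ZEPPELIN_NOTEBOOK_GIT_REMOTE_ORIGIN=origin
# GITHUB_ACCESS_TOKEN is a hypothetical variable provided by the caller's environment.
export ZEPPELIN_NOTEBOOK_GIT_REMOTE_ACCESS_TOKEN="${GITHUB_ACCESS_TOKEN:-}"
```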
Note: After updating these configurations, the Zeppelin server needs to be restarted.