Shell interpreter for Apache Zeppelin
Overview
Zeppelin Shell provides two interpreters: the shell interpreter (%sh), which is the default, and the terminal interpreter (%sh.terminal).
Shell interpreter
The shell interpreter uses Apache Commons Exec to execute external processes.
In a Zeppelin notebook, put %sh at the beginning of a paragraph to invoke the system shell and run commands.
Terminal interpreter
The terminal interpreter uses hterm and Pty4J to emulate terminal operation.
Note: Currently, each command runs as the user that the Zeppelin server is running as.
Configuration
In the "Interpreters" menu of the Zeppelin dropdown menu, you can set the property values for the Shell interpreter.
Name | Default | Description |
---|---|---|
shell.command.timeout.millisecs | 60000 | Shell command timeout in milliseconds |
shell.working.directory.user.home | false | If set to true, the shell's working directory is set to the user's home directory |
zeppelin.shell.auth.type | | Authentication type; the supported types are SIMPLE and KERBEROS |
zeppelin.shell.principal | | The principal name to load from the keytab |
zeppelin.shell.keytab.location | | The path to the keytab file |
zeppelin.shell.interpolation | false | Enable ZeppelinContext variable interpolation into paragraph text |
zeppelin.terminal.ip.mapping | | Mapping between the internal and external IP addresses of the Zeppelin server |
zeppelin.concurrency.max | 10 | Maximum concurrency of the shell interpreter |
Example
Shell interpreter
The following example demonstrates the basic usage of Shell in a Zeppelin notebook.
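As a minimal sketch (the specific commands are illustrative choices, not taken from the Zeppelin documentation), a paragraph that begins with %sh could contain ordinary shell commands such as the following. The %sh directive is shown as a comment so that the block is also valid as a plain shell script.

```shell
# In the notebook, the first line of the paragraph is the directive: %sh
# Everything after it is handed to the system shell.

# Show which user the commands run as (the user running the Zeppelin server)
current_user="$(id -un)"
echo "Running as: $current_user"

# Show the working directory of the shell
working_dir="$(pwd)"
echo "Working directory: $working_dir"
```

If shell.working.directory.user.home is set to true, the second command would be expected to print the user's home directory.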
For more information about Zeppelin interpreter settings for the Shell interpreter, read the "What is interpreter setting?" section first.
Kerberos refresh interval
To change when the Kerberos ticket is renewed, make the following changes in conf/zeppelin-env.sh.
# Change the Kerberos refresh interval (default value is 1d). Allowed suffixes are ms, s, m, min, h, and d.
export KERBEROS_REFRESH_INTERVAL=4h
# Change the number of kinit retries (default value is 5). With the default, the interpreter is closed after the kinit command fails 5 times consecutively.
export KINIT_FAIL_THRESHOLD=10
Object Interpolation
The shell interpreter also supports interpolation of ZeppelinContext
objects into the paragraph text.
The following example shows one use of this facility:
In a Scala cell:
z.put("dataFileName", "members-list-003.parquet")
// ...
val members = spark.read.parquet(z.get("dataFileName").toString)
// ...
In a later Shell cell:
%sh
rm -rf {dataFileName}
Object interpolation is disabled by default, and can be enabled (for the Shell interpreter) by setting the value of the property zeppelin.shell.interpolation to true (see Configuration above).
More details of this feature can be found in Zeppelin-Context.
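To make the effect of interpolation concrete, the following sketch imitates the substitution from the example above in plain shell. It only illustrates the result and is not how Zeppelin implements the feature; the variable names template and interpolated are hypothetical, and the value is hard-coded where Zeppelin would read it from ZeppelinContext.

```shell
# Stand-in for the value stored by z.put("dataFileName", ...) in the Scala cell
dataFileName="members-list-003.parquet"

# Paragraph text as written in the Shell cell, before interpolation
template='rm -rf {dataFileName}'

# Replace the {dataFileName} placeholder with the stored value
interpolated="$(printf '%s\n' "$template" | sed "s/{dataFileName}/$dataFileName/g")"

# Print the resulting command text; it is deliberately not executed here
echo "$interpolated"
```

With interpolation enabled, the command the shell receives would therefore be rm -rf members-list-003.parquet rather than the literal text {dataFileName}.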
Terminal interpreter
The following example demonstrates the basic usage of terminal in a Zeppelin notebook.
%sh.terminal
input any char
zeppelin.terminal.ip.mapping
When the terminal interpreter runs in a notebook, the notebook front end needs the IP address of the server hosting the terminal interpreter in order to communicate with it.
In a public cloud environment, the cloud host on which the interpreter runs has both an internal IP and an external access IP. The front end may then be given the internal IP, which it cannot reach, leaving the terminal interpreter unusable.
Solution: Set the mapping between internal and external IPs in the terminal interpreter, so that the notebook front end connects through the external IP of the terminal interpreter.
Example: {"internal-ip1":"external-ip1", "internal-ip2":"external-ip2"}