HDFS File System Interpreter for Apache Zeppelin
Overview
The Hadoop Distributed File System (HDFS) is a distributed, fault-tolerant file system that is part of the Apache Hadoop project. It is commonly used as storage for distributed processing engines such as Hadoop MapReduce and Apache Spark, or as the underlying store for file systems like Alluxio.
Configuration
Property | Default | Description |
---|---|---|
hdfs.url | http://localhost:50070/webhdfs/v1/ | The URL for WebHDFS |
hdfs.user | hdfs | The WebHDFS user |
hdfs.maxlength | 1000 | Maximum number of lines of results fetched |
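As a sketch, pointing the interpreter at a remote namenode could look like the following property values; the host name is a placeholder, and the WebHDFS HTTP port is typically 50070 on Hadoop 2.x and 9870 on Hadoop 3.x:

```
hdfs.url = http://namenode.example.com:50070/webhdfs/v1/
hdfs.user = hdfs
hdfs.maxlength = 1000
```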
This interpreter connects to HDFS over the HTTP WebHDFS interface. It supports the basic shell file commands applied to HDFS, but currently only browsing is supported (see the example paragraph after the list below).
- You can use ls [PATH] and ls -l [PATH] to list a directory. If the path is omitted, the current directory is listed. ls also supports a -h flag for human-readable file sizes.
- You can use cd [PATH] to change your current directory by giving a relative or an absolute path.
- You can invoke pwd to see your current directory.
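For example, once the interpreter has been added to a notebook (see Create Interpreter below), a paragraph using it might look like this; the %file prefix is the binding used in the standard Zeppelin distribution, and /user is just an example path:

```
%file
ls -l /user
```

The output is a listing of the directory's contents, limited to the number of lines set by hdfs.maxlength.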
Tip: Use (Ctrl + .) for autocompletion.
Create Interpreter
To enable the HDFS interpreter in a notebook, click the Gear icon and select HDFS.
WebHDFS REST API
You can confirm that you are able to reach the WebHDFS API by running a curl command against the WebHDFS endpoint configured for the interpreter (the hdfs.url property).
Here is an example:

```
$> curl "http://localhost:50070/webhdfs/v1/?op=LISTSTATUS"
```
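When simple (non-Kerberos) authentication is in use, WebHDFS identifies the caller through the user.name query parameter; this is presumably how the hdfs.user property comes into play. As a further sanity check, you can request the status of a single path; /tmp and the user hdfs below are only examples:

```
curl "http://localhost:50070/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=hdfs"
```

Both operations return JSON; LISTSTATUS returns a FileStatuses object with one FileStatus entry per child, which is the data the interpreter's ls listing is built from.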