Contribution Guidelines

Apache Zeppelin is licensed under the Apache License, Version 2.0.

Contributing to Zeppelin (source code, documents, images, website) means you agree to the Apache License, Version 2.0.

  1. Make sure your issue is not already in the Jira issue tracker.
  2. If not, create a ticket in the Jira issue tracker describing the change you're proposing.
  3. Set up Travis continuous integration.
  4. Contribute your patch via a Pull Request on our GitHub mirror.

Before you start, please read the Code of Conduct carefully, familiarize yourself with it, and refer to it whenever you need to.

If you are not familiar with Apache projects, understanding How it works would be quite helpful.

Creating a Pull Request

When creating a Pull Request, you will automatically get the template below.

Filling it out thoroughly can speed up the review process.

### What is this PR for?
A few sentences describing the overall goals of the pull request's commits.
First time? Check out the contribution guidelines - https://zeppelin.apache.org/contribute.html

### What type of PR is it?
[Bug Fix | Improvement | Feature | Documentation | Hot Fix | Refactoring]

### Todos
* [ ] - Task

### What is the Jira issue?
* Open an issue on Jira https://issues.apache.org/jira/browse/ZEPPELIN/
* Put link here, and add [ZEPPELIN-*Jira number*] in PR title, eg. [ZEPPELIN-533]

### How should this be tested?
Outline the steps to test the PR here.

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update?
* Is there breaking changes for older versions?
* Does this needs documentation?
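
The title convention from the template (a [ZEPPELIN-*Jira number*] tag in the PR title) can be checked locally before opening a pull request. The following is a minimal sketch; `has_jira_tag` is a hypothetical helper written for illustration and is not part of the Zeppelin repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of Zeppelin): succeeds when the given
# pull request title contains a Jira tag such as [ZEPPELIN-533].
has_jira_tag() {
  printf '%s\n' "$1" | grep -Eq '\[ZEPPELIN-[0-9]+\]'
}
```

For example, `has_jira_tag "[ZEPPELIN-533] Fix typo" && echo "title ok"` prints a confirmation only when the tag is present.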

Testing a Pull Request

You can also test and review a particular Pull Request. Here are two useful ways.

  • Using a utility provided by Zeppelin.

    dev/test_zeppelin_pr.py [# of PR]
    

    For example, if you want to test #513, then the command will be:

    dev/test_zeppelin_pr.py 513
    
  • Another way is using github/hub.

    hub checkout https://github.com/apache/zeppelin/pull/[# of PR]
    

The above two methods will help you test and review Pull Requests.
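
If neither tool is available, plain git can do the same job, because GitHub publishes every pull request under a read-only `pull/<number>/head` ref. The sketch below assumes the remote points at a GitHub repository; the helper names and the local branch name `pr-<number>` are illustrative choices, not Zeppelin conventions.

```shell
#!/bin/sh
# Build the refspec that maps GitHub's read-only pull request ref
# to a local branch named pr-<number> (the branch name is our choice).
pr_refspec() {
  printf 'pull/%s/head:pr-%s\n' "$1" "$1"
}

# Fetch pull request $1 from remote $2 (default: origin) and check it out.
# Assumes the remote is hosted on GitHub.
fetch_pr() {
  remote="${2:-origin}"
  git fetch "$remote" "$(pr_refspec "$1")" && git checkout "pr-$1"
}
```

Running `fetch_pr 513` inside a clone would fetch #513 into a local branch `pr-513` and switch to it.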

Source Control Workflow

Zeppelin follows the Fork & Pull model.

The Review Process

When a Pull Request is submitted, it is merged or rejected through the following review process.

  • Anybody can be a reviewer and may comment on the change or suggest modifications.
  • A reviewer can indicate that a patch looks suitable for merging with a comment such as: "Looks good", "LGTM", "+1".
  • At least one indication of suitability (e.g. "LGTM") from a committer is required before a pull request can be merged.
  • A pull request stays open for 1 or 2 days for potential additional review, unless it has already received enough indications of suitability.
  • A committer can then initiate lazy consensus ("Merge if there is no more discussion"), after which the code can be merged once a certain time (normally 24 hours) has passed without further reviews.
  • Contributors can ping reviewers (including committers) by commenting 'Ready to review' or a similar indication.

Becoming a Committer

The PMC adds new committers from the active contributors, based on their contribution to Zeppelin.

The qualifications for new committers include:

  1. Sustained contributions: Committers should have a history of constant contributions to Zeppelin.
  2. Quality of contributions: Committers, more than any other community member, should submit simple, well-tested, and well-designed patches.
  3. Community involvement: Committers should have a constructive and friendly attitude in all community interactions. They should also be active on the dev and user lists, review patches, and help new contributors and users.

Setting up

Here are some things you will need to build and test Zeppelin.

Software Configuration Management (SCM)

Zeppelin uses Git for its SCM system, so you'll need a git client installed on your development machine.

Integrated Development Environment (IDE)

You are free to use whatever IDE you prefer, or your favorite command line editor.

Project Structure

The Zeppelin project is based on Maven. Maven works by convention and defines the directory structure for a project. The top-level pom.xml describes the basic project structure. Currently Zeppelin has the following modules.

<module>zeppelin-interpreter</module>
<module>zeppelin-zengine</module>
<module>spark</module>
<module>markdown</module>
<module>angular</module>
<module>shell</module>
<module>flink</module>
<module>ignite</module>
<module>lens</module>
<module>cassandra</module>
<module>zeppelin-web</module>
<module>zeppelin-server</module>
<module>zeppelin-distribution</module>

Code convention

We follow the Google code style.

There are plugins to format and lint your code in your IDE (use _tools/checkstyle.xml as the rules).

The Checkstyle report is located at ${submodule}/target/site/checkstyle.html.
The test coverage report is located at ${submodule}/target/site/cobertura/index.html.
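
Because each submodule writes its own reports, it can help to list every report that exists after a build. This is a small sketch built only on the two report paths named above; `list_reports` is a hypothetical helper, and it finds nothing until the reports have actually been generated.

```shell
#!/bin/sh
# Hypothetical helper: list Checkstyle and Cobertura reports found under
# a directory ($1, default: current directory), one path per line.
list_reports() {
  root="${1:-.}"
  find "$root" -type f \( \
    -path '*/target/site/checkstyle.html' -o \
    -path '*/target/site/cobertura/index.html' \)
}
```

Calling `list_reports` from the top of the source tree prints the report paths of all submodules that have them.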

Getting the source code

First of all, you need the Zeppelin source code.

The official location for Zeppelin is http://git.apache.org/zeppelin.git.

git access

Get the source code on your development machine using git.

git clone git://git.apache.org/zeppelin.git zeppelin

You may also want to develop against a specific branch. For example, for branch-0.5.6:

git clone -b branch-0.5.6 git://git.apache.org/zeppelin.git zeppelin

or with write access

git clone https://git-wip-us.apache.org/repos/asf/zeppelin.git

Fork repository

If you want not only to build Zeppelin but also to make changes, you need to fork the Zeppelin GitHub mirror repository and make a pull request.
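
A fork-based setup might look like the sketch below. It assumes your fork keeps the default repository name `zeppelin` under your GitHub account; the helper names, the remote name `upstream`, and the branch name are illustrative choices rather than project requirements.

```shell
#!/bin/sh
# Print the HTTPS URL of a user's fork.
# Assumption: the fork keeps the default repository name "zeppelin".
fork_url() {
  printf 'https://github.com/%s/zeppelin.git\n' "$1"
}

# Clone the fork of GitHub user $1, register the Apache mirror as
# "upstream", and start a topic branch named $2.
# Note: this leaves the calling shell inside the new "zeppelin" directory.
setup_fork() {
  user="$1"
  branch="$2"
  git clone "$(fork_url "$user")" zeppelin &&
    cd zeppelin &&
    git remote add upstream https://github.com/apache/zeppelin.git &&
    git checkout -b "$branch"
}
```

After `setup_fork YOUR_GITHUB_USER ZEPPELIN-533`, commits go on the topic branch, `git push origin ZEPPELIN-533` publishes it to the fork, and the pull request is opened from there.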

Build

Build Tools

To build the code, install

  • Oracle Java 7
  • Apache Maven

Building the code

mvn install

To skip tests

mvn install -DskipTests

To build with a specific Spark / Hadoop version

mvn install -Phadoop-2.2 -Dhadoop.version=2.2.0 -Pspark-1.3 -Dspark.version=1.3.0
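
In the command above, each profile name is the version with its last component removed (2.2.0 becomes hadoop-2.2, 1.3.0 becomes spark-1.3). The sketch below generalizes that pattern under the assumption that a matching profile exists in the pom.xml for the versions you pass; `build_command` is a hypothetical helper that only prints the command.

```shell
#!/bin/sh
# Hypothetical helper: print the Maven command for the given Hadoop ($1)
# and Spark ($2) versions.
# Assumption: the profile name is the version without its last component
# (2.2.0 -> hadoop-2.2) and that such a profile is defined in the pom.xml.
build_command() {
  hadoop="$1"
  spark="$2"
  printf 'mvn install -Phadoop-%s -Dhadoop.version=%s -Pspark-%s -Dspark.version=%s\n' \
    "${hadoop%.*}" "$hadoop" "${spark%.*}" "$spark"
}
```

`build_command 2.2.0 1.3.0` reproduces the command shown above; check the profiles in the top-level pom.xml before relying on other version combinations.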

Tests

Each new file should have its own accompanying unit tests. Each new interpreter should come with its own tests.

Zeppelin has 3 types of tests:

  • Unit Tests: The unit tests run as part of each package's build. E.g. the SparkInterpreter module's unit test is SparkInterpreterTest.
  • Integration Tests: The integration tests run after all modules are built. The integration tests launch an instance of Zeppelin server. ZeppelinRestApiTest is an example integration test.
  • GUI integration tests: These tests validate the Zeppelin UI elements. They require a running Zeppelin server and launch a web browser to validate Notebook UI elements like Notes and their execution. See ZeppelinIT as an example.

Currently the GUI integration tests are not run as part of the Maven build; they are run only in the CI environment when the pull request is submitted to GitHub.

Make sure to watch the CI results for your pull request.

Running GUI integration tests locally

All tests, just like the CI:

PATH=~/Applications/Firefox.app/Contents/MacOS/:$PATH CI="true" mvn verify -Pspark-1.6 -Phadoop-2.3 -Ppyspark -B -pl "zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark" -Dtest="org.apache.zeppelin.AbstractFunctionalSuite" -DfailIfNoTests=false

Next to a running instance of Zeppelin

This allows you to target a specific GUI integration test.

TEST_SELENIUM="true" mvn package -DfailIfNoTests=false -pl 'zeppelin-interpreter,zeppelin-zengine,zeppelin-server' -Dtest=ParagraphActionsIT

Continuous Integration

The Zeppelin project's CI system collects information from the pull request author's Travis CI and displays the status in the pull request.

Each individual contributor should set up Travis CI for their fork before making a pull request. Go to https://travis-ci.org/profile and switch on the 'zeppelin' repository.

Run Zeppelin server in development mode

cd zeppelin-server
HADOOP_HOME=YOUR_HADOOP_HOME JAVA_HOME=YOUR_JAVA_HOME mvn exec:java -Dexec.mainClass="org.apache.zeppelin.server.ZeppelinServer" -Dexec.args=""

or use the daemon script

bin/zeppelin-daemon.sh start

The server will run at http://localhost:8080
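
The server can take a moment to come up, so a script that starts it may want to wait until the address above answers. The following is a minimal sketch that assumes curl is installed; `wait_for_server` and its default retry values are hypothetical and not part of Zeppelin.

```shell
#!/bin/sh
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# $1: URL, $2: number of attempts (default 30), $3: seconds between attempts (default 2).
# Returns 0 as soon as the URL responds successfully, 1 otherwise.
wait_for_server() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors as failure, -o: discard the body
    if curl -sf -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_server http://localhost:8080 && echo "Zeppelin is up"` prints the message once the server responds.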