Basic Display System in Apache Zeppelin
Text
By default, Apache Zeppelin prints the interpreter response as plain text using the text display system.
You can also request the text display system explicitly with the %text directive.
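As a minimal sketch, assuming a Spark (Scala) paragraph such as the `%spark` ones used later on this page, explicit text output could look like this. The message and the `textOutput` name are placeholders.

```scala
// Output that starts with %text is shown as plain text.
// The message itself is only a placeholder.
val textOutput = "%text Hello from the text display system"
println(textOutput)
```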
HTML
With the %html directive, Zeppelin treats your output as HTML.
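A short sketch of the same idea from a Spark (Scala) paragraph. The markup and the `htmlOutput` name are illustrative only.

```scala
// Output that starts with %html is rendered as HTML instead of plain text.
// The markup below is only a placeholder.
val htmlOutput = "%html <h3>Hello, <b>Zeppelin</b></h3>"
println(htmlOutput)
```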
Mathematical expressions
The HTML display system automatically formats mathematical expressions using MathJax. Use \\( INLINE EXPRESSION \\) for inline expressions and $$ EXPRESSION $$ for display expressions.
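The sketch below prints one inline and one display expression from a Spark (Scala) paragraph. It copies the delimiters literally from the text above and uses a triple-quoted string so that Scala leaves the backslashes untouched. The formulas and the `mathOutput` name are placeholders.

```scala
// A triple-quoted string keeps backslashes literal, so the MathJax
// delimiters and LaTeX commands reach the HTML display system unchanged.
val mathOutput =
  """%html Inline: \\( e^{i\pi} + 1 = 0 \\) <br/> Display: $$ \sum_{k=1}^{n} k = \frac{n(n+1)}{2} $$"""
println(mathOutput)
```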
Table
If your data has rows separated by \n (newline) and columns separated by \t (tab), with the first row as the header row, you can simply use the %table display system to leverage Zeppelin's built-in visualization.
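A minimal sketch of producing such output from a Spark (Scala) paragraph. The column names and rows are made-up sample data.

```scala
// Sample data (placeholder values): one tuple per table row.
val rows = Seq(("sun", 100), ("moon", 10))

// First line is the header row; columns are tab-separated, rows newline-separated.
val header = "name\tsize"
val body = rows.map { case (name, size) => s"$name\t$size" }.mkString("\n")

val tableOutput = s"%table $header\n$body"
println(tableOutput)
```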
If table contents start with %html, they are interpreted as HTML.
Note: The display system is backend independent.
Network
With the %network directive, Zeppelin treats your output as a graph. Zeppelin can leverage the Property Graph Model.
What is the Labelled Property Graph Model?
A Property Graph is a graph that has these elements:
- a set of vertices
  - each vertex has a unique identifier.
  - each vertex has a set of outgoing edges.
  - each vertex has a set of incoming edges.
  - each vertex has a collection of properties defined by a map from key to value.
- a set of edges
  - each edge has a unique identifier.
  - each edge has an outgoing tail vertex.
  - each edge has an incoming head vertex.
  - each edge has a label that denotes the type of relationship between its two vertices.
  - each edge has a collection of properties defined by a map from key to value.
A Labelled Property Graph is a Property Graph where the nodes can be tagged with labels representing their different roles in the graph model.
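The model above can be sketched as a small Scala data structure. The type and field names (`Vertex`, `Edge`, `PropertyGraph`, `tail`, `head`) are hypothetical and chosen for illustration; they are not part of Zeppelin's API. Incoming and outgoing edges are derived from the edge list rather than stored on each vertex.

```scala
// Hypothetical types for illustration only; not a Zeppelin API.
final case class Vertex(
  id: Long,                                 // unique identifier
  labels: Set[String] = Set.empty,          // roles of the vertex (labelled model)
  properties: Map[String, Any] = Map.empty  // key -> value properties
)

final case class Edge(
  id: Long,                                 // unique identifier
  tail: Long,                               // id of the vertex the edge leaves
  head: Long,                               // id of the vertex the edge enters
  label: String,                            // type of the relationship
  properties: Map[String, Any] = Map.empty  // key -> value properties
)

final case class PropertyGraph(vertices: Seq[Vertex], edges: Seq[Edge]) {
  // Edges leaving / entering a vertex, computed from the edge list.
  def outgoing(vertexId: Long): Seq[Edge] = edges.filter(_.tail == vertexId)
  def incoming(vertexId: Long): Seq[Edge] = edges.filter(_.head == vertexId)
}

val graph = PropertyGraph(
  vertices = Seq(Vertex(1, Set("User")), Vertex(2, Set("User")), Vertex(3, Set("Project"))),
  edges = Seq(
    Edge(1, tail = 2, head = 1, label = "HELPS"),
    Edge(2, tail = 2, head = 3, label = "CREATE"),
    Edge(3, tail = 1, head = 3, label = "CONTRIBUTE_TO")
  )
)

println(graph.outgoing(2).map(_.label))
println(graph.incoming(3).map(_.label))
```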
What are the APIs?
The new NETWORK visualization is based on JSON with the following params:
- "nodes" (mandatory): list of the nodes of the graph; every node can have the following params:
  - "id" (mandatory): the id of the node (must be unique);
  - "label": the main label of the node;
  - "labels": the list of the labels of the node;
  - "data": the data attached to the node;
- "edges": list of the edges of the graph; every edge can have the following params:
  - "id" (mandatory): the id of the edge (must be unique);
  - "source" (mandatory): the id of the source node of the edge;
  - "target" (mandatory): the id of the target node of the edge;
  - "label": the main type of the edge;
  - "data": the data attached to the edge;
- "labels": a map (K, V) where K is the node label and V is the color of the node;
- "directed": (true/false, default false) which tells whether the graph is directed or not;
- "types": a distinct list of the edge types of the graph.
If you click on a node or an edge, a list of the entity's properties appears at the bottom of the paragraph.
This kind of graph can easily be flattened in order to support the other visualization formats provided by Zeppelin.
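The JSON described above can also be assembled programmatically instead of being written by hand. The following is a minimal sketch under stated assumptions: `Node`, `Link`, `quote`, and `networkJson` are hypothetical helpers, only the JSON field names come from the parameter list above, and the escaping handles only backslashes and double quotes. A JSON library would be the safer choice for real data.

```scala
// Hypothetical helper types; only the JSON field names come from the docs above.
final case class Node(id: Int, label: String)
final case class Link(id: Int, source: Int, target: Int, label: String)

// Minimal JSON string quoting: escapes backslashes and double quotes only.
def quote(s: String): String =
  "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

def networkJson(nodes: Seq[Node], links: Seq[Link],
                colors: Map[String, String], directed: Boolean): String = {
  val nodesStr = nodes
    .map(n => s"""{"id": ${n.id}, "label": ${quote(n.label)}}""")
    .mkString(", ")
  val edgesStr = links
    .map(l => s"""{"id": ${l.id}, "source": ${l.source}, "target": ${l.target}, "label": ${quote(l.label)}}""")
    .mkString(", ")
  // "labels" maps each node label to a color.
  val labelsStr = colors.map { case (k, v) => s"${quote(k)}: ${quote(v)}" }.mkString(", ")
  // "types" is the distinct list of edge labels.
  val typesStr = links.map(_.label).distinct.map(quote).mkString(", ")
  s"""{"nodes": [$nodesStr], "edges": [$edgesStr], "labels": {$labelsStr}, "directed": $directed, "types": [$typesStr]}"""
}

val json = networkJson(
  nodes = Seq(Node(1, "User"), Node(2, "Project")),
  links = Seq(Link(1, source = 1, target = 2, label = "CONTRIBUTE_TO")),
  colors = Map("User" -> "#8BC34A", "Project" -> "#3071A9"),
  directed = true
)

// Prefix the JSON with the directive so the output is treated as a graph.
println(s"%network $json")
```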
How to use it?
An example of a simple graph:
%spark
print(s"""
%network {
  "nodes": [
    {"id": 1},
    {"id": 2},
    {"id": 3}
  ],
  "edges": [
    {"source": 1, "target": 2, "id" : 1},
    {"source": 2, "target": 3, "id" : 2},
    {"source": 1, "target": 2, "id" : 3},
    {"source": 1, "target": 2, "id" : 4},
    {"source": 2, "target": 1, "id" : 5},
    {"source": 2, "target": 1, "id" : 6}
  ]
}
""")
A little more complex graph:
%spark
print(s"""
%network {
"nodes": [{"id": 1, "label": "User", "data": {"fullName":"Andrea Santurbano"}},{"id": 2, "label": "User", "data": {"fullName":"Lee Moon Soo"}},{"id": 3, "label": "Project", "data": {"name":"Zeppelin"}}],
"edges": [{"source": 2, "target": 1, "id" : 1, "label": "HELPS"},{"source": 2, "target": 3, "id" : 2, "label": "CREATE"},{"source": 1, "target": 3, "id" : 3, "label": "CONTRIBUTE_TO", "data": {"oldPR": "https://github.com/apache/zeppelin/pull/1582"}}],
"labels": {"User": "#8BC34A", "Project": "#3071A9"},
"directed": true,
"types": ["HELPS", "CREATE", "CONTRIBUTE_TO"]
}
""")