Lens Interpreter for Apache Zeppelin

Overview

Apache Lens provides a unified analytics interface. Lens aims to break down data analytics silos by providing a single view of data across multiple tiered data stores and an optimal execution environment for analytical queries. It seamlessly integrates Hadoop with traditional data warehouses so that they appear as one.

Installing and Running Lens

To use the Lens interpreter, install Apache Lens in a few simple steps:

  1. Download the latest version of Lens from the ASF. Older releases can be found in the Archives.
  2. Before running Lens, set HIVE_HOME and HADOOP_HOME. For more information, see the Apache Lens documentation. Lens also provides a pseudo-distributed mode, which is set up using Docker. In that setup, the Hive server and the Hadoop daemons run as separate processes.
  3. Start the Lens server (use stop instead of start to shut it down):

    ./bin/lens-ctl start
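
Since step 2 requires both environment variables, it can help to verify them before starting the server. The following is a minimal sketch, not part of Lens: the function name check_lens_env and its messages are hypothetical, and the paths in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Lens): report which of the
# environment variables required by Lens are missing.
check_lens_env() {
  missing=0
  for var in HIVE_HOME HADOOP_HOME; do
    # Look up the value of the variable whose name is stored in $var.
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "$var is not set" >&2
      missing=1
    fi
  done
  # Return 0 when both variables are set, 1 otherwise.
  return "$missing"
}

# Example usage, run from the Lens installation directory
# (the two paths are placeholders for your own installation):
#   export HIVE_HOME=/path/to/hive
#   export HADOOP_HOME=/path/to/hadoop
#   check_lens_env && ./bin/lens-ctl start
```

Chaining the check with && means the server is only started when both variables are present.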

Configuring Lens Interpreter

In the "Interpreters" menu, you can edit the Lens interpreter or create a new one. Zeppelin provides the following properties for Lens.

| Property Name | Value | Description |
|---|---|---|
| lens.client.dbname | default | The database schema name |
| lens.query.enable.persistent.resultset | false | Whether to enable persistent result sets for queries. When enabled, the server fetches results from the driver, applies custom formatting if any is configured, and stores them in a configured location. The query output file is named after the query handle ID, with the configured extension. |
| lens.server.base.url | http://hostname:port/lensapi | The base URL of the Lens server. Replace "hostname" and "port" with the values you use (e.g. http://0.0.0.0:9999/lensapi). |
| lens.session.cluster.user | default | Hadoop cluster username |
| zeppelin.lens.maxResult | 1000 | Maximum number of rows to display |
| zeppelin.lens.maxThreads | 10 | Number of threads to use when zeppelin.lens.run.concurrent is true |
| zeppelin.lens.run.concurrent | true | Whether to run Lens sessions concurrently |
| xxx | yyy | Any other property from [Configuring lens server](https://lens.apache.org/admin/config-server.html) |

[Image: Apache Lens Interpreter Setting]

Interpreter Binding for Zeppelin Notebook

After configuring the Lens interpreter, create your own notebook and bind interpreters to it, as shown in the image below.

[Image: Zeppelin Notebook Interpreter Binding]

For more information on interpreter binding, see the Zeppelin documentation.

How to use

You can analyze your data using OLAP Cube QL, a high-level, SQL-like language for querying and describing data sets organized in data cubes. The Lens video tutorial gives an introduction to OLAP cubes. The video uses the Lens client shell (./bin/lens-cli.sh); all of the functions shown there can also be used in Zeppelin through the Lens interpreter.

  • Create and use (switch) databases.

    create database newDb
    
    use newDb
    

  • Create Storage.

    create storage your/path/to/lens/client/examples/resources/db-storage.xml
    

  • Create dimensions and show their fields and join chains.

    create dimension your/path/to/lens/client/examples/resources/customer.xml
    
    dimension show fields customer
    
    dimension show joinchains customer
    

  • Create cubes and show their fields and join chains.

    create cube your/path/to/lens/client/examples/resources/sales-cube.xml
    
    cube show fields sales
    
    cube show joinchains sales
    

  • Create dimtables and facts.

    create dimtable your/path/to/lens/client/examples/resources/customer_table.xml
    
    create fact your/path/to/lens/client/examples/resources/sales-raw-fact.xml
    

  • Add partitions to Dimtable and Fact.

    dimtable add single-partition --dimtable_name customer_table --storage_name local --path your/path/to/lens/client/examples/resources/customer-local-part.xml
    
    fact add partitions --fact_name sales_raw_fact --storage_name local --path your/path/to/lens/client/examples/resources/sales-raw-local-parts.xml
    

  • Now, you can run queries on cubes.

    query execute cube select customer_city_name, product_details.description, product_details.category, product_details.color, store_sales from sales where time_range_in(delivery_time, '2015-04-11-00', '2015-04-13-00')
    

[Image: Lens Query Result]

These are just examples that Lens provides in advance. To explore the complete Lens tutorial, see the tutorial video.
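
The create commands above all point into the same examples/resources directory of the Lens client, so the placeholder your/path/to/lens/client has to be replaced in each of them. The sketch below prints the commands with that prefix filled in, ready to be copied into a notebook. The function name lens_example_commands is hypothetical and not part of Lens or Zeppelin; the file names are the ones used in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: print the example "create" commands for a given
# Lens client directory (passed as the first argument).
lens_example_commands() {
  res="$1/examples/resources"
  cat <<EOF
create storage $res/db-storage.xml
create dimension $res/customer.xml
create cube $res/sales-cube.xml
create dimtable $res/customer_table.xml
create fact $res/sales-raw-fact.xml
EOF
}

# Example usage (the argument is the placeholder path from the text):
#   lens_example_commands your/path/to/lens/client
```

The helper only prints text; it does not contact the Lens server.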

Lens UI Service

Lens also provides a web UI service. Once the server has started, you can open the UI at http://serverhost:19999/index.html and browse it. There you can also inspect the structures you created and run queries easily.
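
As a quick way to confirm that the web UI is reachable, the sketch below builds the UI address for a host and requests it with curl. The function names lens_ui_url and lens_ui_status are hypothetical helpers, not part of Lens; the port 19999 and the path /index.html are taken from the text above, and curl is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Lens) for checking the Lens web UI.

# Print the UI address for a host; the port defaults to 19999 as in the text.
lens_ui_url() {
  host=$1
  port=${2:-19999}
  printf 'http://%s:%s/index.html\n' "$host" "$port"
}

# Print the HTTP status code returned by the UI page.
# -s: silent, -o /dev/null: discard the body, -w: print only the status code.
lens_ui_status() {
  curl -s -o /dev/null --max-time 5 -w '%{http_code}\n' "$(lens_ui_url "$@")"
}

# Example usage ("serverhost" is the placeholder from the text):
#   lens_ui_status serverhost
```

A status of 200 suggests the UI page is being served; if the connection fails, curl exits with a non-zero status.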

[Image: Lens UI Service]