1 line of code to Visualize thousands of SOTA NLP models with John Snow Labs NLU & Streamlit with NER, Dependency Trees, Similarity Matrices for BERT, ELMO, ALBERT, XLNET, ELECTRA and much more! 6 New Magical Streamlit components

NLU :heart: Streamlit


The latest release of the John Snow Labs NLU library, NLU 3.0.2, integrates dozens of visualization capabilities. Make sure to scroll all the way down and check out all the demo GIFs, or go straight to the release notes, since not all the GIFs load properly in the forum! :slight_smile:

This release contains examples and tutorials on how to visualize the 1000+ state-of-the-art NLP models provided by NLU in just 1 line of code in Streamlit.
It includes simple 1-liners you can sprinkle into your Streamlit app for features like Dependency Trees, Named Entities (NER), text classification results, semantic similarity,
embedding visualizations via ELMO, BERT, ALBERT, XLNET and much more.

This is the ultimate NLP research tool. You can visualize and compare the results of hundreds of context-aware deep learning embeddings, compare them with classical vanilla embeddings like GloVe,
and see with your own eyes how context is encoded by transformer models like BERT, XLNET and many more!
Besides that, you can also compare the results of the 200+ NER models John Snow Labs provides and see how performance changes with varying embeddings, like Contextual, Static and Domain-Specific Embeddings.


For detailed instructions, refer to the NLU install documentation here.
You need OpenJDK 8 installed and the following Python packages:

pip install nlu streamlit pyspark==3.0.1 scikit-learn plotly
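
Before launching an app, it can help to confirm that the installed packages are actually importable. The following is a small sketch using only the standard library; the helper name `missing_packages` is made up for illustration, and note that scikit-learn is imported under the name `sklearn`.

```python
from importlib.util import find_spec

# Import names of the packages from the install command above
# (scikit-learn is imported as "sklearn").
REQUIRED = ["nlu", "streamlit", "pyspark", "sklearn", "plotly"]


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

This only checks that the Python packages can be found; it does not verify the Java installation.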

Problems? Connect with us on Slack!

Impatient and want some action?

Just run this Streamlit app; you can use it to generate Python code for each NLU-Streamlit building block.

streamlit run https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/examples/streamlit/01_dashboard.py

Quick Starter cheat sheet - All you need to know in 1 picture for NLU + Streamlit

For NLU models to load, see the NLU Namespace or the John Snow Labs Modelshub or go straight to the source.


Just try out any of these.
You can use the first example to generate Python code snippets which you can
recycle as building blocks in your Streamlit apps!

Example: 01_dashboard

streamlit run https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/examples/streamlit/01_dashboard.py

Example: 02_NER

streamlit run https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/examples/streamlit/02_NER.py

Example: 03_text_similarity_matrix

streamlit run https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/examples/streamlit/03_text_similarity_matrix.py

Example: 04_dependency_tree

streamlit run https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/examples/streamlit/04_dependency_tree.py

Example: 05_classifiers

streamlit run https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/examples/streamlit/05_classifiers.py

Example: 06_token_features

streamlit run https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/examples/streamlit/06_token_features.py
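
If you would rather not copy the commands one by one, they can be assembled from the example names listed above. This is just a convenience sketch; `run_command` is a hypothetical helper, and it only builds the command string without executing anything.

```python
# Base URL and example names are taken from the commands listed above.
BASE_URL = "https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/examples/streamlit"

EXAMPLES = [
    "01_dashboard",
    "02_NER",
    "03_text_similarity_matrix",
    "04_dependency_tree",
    "05_classifiers",
    "06_token_features",
]


def run_command(example):
    """Build the `streamlit run` command for one of the example scripts."""
    if example not in EXAMPLES:
        raise ValueError(f"Unknown example: {example!r}")
    return f"streamlit run {BASE_URL}/{example}.py"


if __name__ == "__main__":
    for name in EXAMPLES:
        print(run_command(name))
```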

How to use NLU?

All you need to know about NLU is that there is the nlu.load() method, which returns an NLUPipeline object
with a .predict() method that works on the most common data types in the PyData stack, like Pandas DataFrames.
On top of that, an NLUPipeline provides various visualization methods that integrate easily into Streamlit as reusable components. See also the viz() method.
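
The load-then-predict pattern can be wrapped in a tiny helper. This is a minimal sketch: `predict_with` is a made-up name, while `nlu.load()` and `.predict()` are the calls described above. The import sits inside the function so the module can be imported even where NLU is not installed yet.

```python
def predict_with(model_ref, data):
    """Load an NLU pipeline by its model reference and run it on `data`."""
    import nlu  # imported lazily; needs the packages from the install step

    pipe = nlu.load(model_ref)  # returns an NLUPipeline
    # `data` can be a string, a list of strings, a Pandas DataFrame, ...
    return pipe.predict(data)
```

For instance, `predict_with('sentiment', ['I love NLU and Streamlit!', 'I hate buggy software'])` would run the sentiment model on both sentences.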

Overview of NLU + Streamlit building blocks

| Method | Description |
|---|---|
| nlu.load('<Model>').predict(data) | Load any of the 1000+ models by providing the model name, and predict on most Pythonic data structures like Pandas DataFrames, strings, arrays of strings and more |
| nlu.load('<Model>').viz_streamlit(data) | Display the full NLU exploration dashboard, which showcases every available feature, with dropdown selectors for 1000+ models |
| nlu.load('<Model>').viz_streamlit_similarity([string1, string2]) | Display a similarity matrix and scalar similarity for every word embedding loaded and 2 strings |
| nlu.load('<Model>').viz_streamlit_ner(data) | Visualize predicted NER tags from a Named Entity Recognizer model |
| nlu.load('<Model>').viz_streamlit_dep_tree(data) | Visualize the Dependency Tree together with Part of Speech labels |
| nlu.load('<Model>').viz_streamlit_classes(data) | Display all extracted class features and confidences for every classifier loaded in the pipeline |
| nlu.load('<Model>').viz_streamlit_token(data) | Display all detected token features and information in Streamlit |
| nlu.load('<Model>').viz(data, write_to_streamlit=True) | Display the raw visualization without any UI elements. See the viz docs for more info. By default, all applicable NLU model references will be shown. |
| nlu.enable_streamlit_caching() | Enable caching of the nlu.load() call. Once enabled, results of nlu.load() are cached automatically. Running this first is recommended, for large performance gains. |
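
To see why caching the load call matters: Streamlit re-runs your script on every interaction, so an uncached loader would be invoked again each time. The sketch below is only an analogy built on `functools.lru_cache` with a stand-in loader; `fake_load` is hypothetical and is not how NLU implements its caching.

```python
from functools import lru_cache

load_calls = []  # records every time the expensive loader body actually runs


@lru_cache(maxsize=None)
def fake_load(model_ref):
    """Stand-in for an expensive model load; the body runs once per distinct reference."""
    load_calls.append(model_ref)
    return f"<pipeline for {model_ref}>"


# Simulate three script re-runs requesting the same model, then a different one.
for _ in range(3):
    fake_load("ner")
fake_load("sentiment")

print(load_calls)  # prints ['ner', 'sentiment']
```

The loader body ran twice for four requests, which is the effect you want when the same model is requested on every re-run.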

Detailed visualizer information and API docs

function pipe.viz_streamlit

Display a highly configurable UI that showcases almost every feature available for Streamlit visualization, with model selection dropdowns, in your applications.
This includes:

  • Similarity matrix, scalar similarities and embedding information for any of the 100+ word embedding models
  • NER visualizations for any of the 200+ named entity recognizers
  • Labeled and unlabeled dependency tree visualizations with part-of-speech tags for any of the 100+ part-of-speech models
  • Token information predicted by any of the 1000+ models
  • Classification results predicted by any of the 100+ classification models
  • Pipeline configuration, model information and a link to the John Snow Labs Modelshub for all loaded pipelines
  • Auto-generated Python code that can be copy-pasted to re-create the individual Streamlit visualization blocks

NLU takes the first model specified in nlu.load() for the first visualization run.
Once the Streamlit app is running, additional models can easily be added via the UI.
It is recommended to run this first, since you can generate Python code snippets to recreate individual Streamlit visualization blocks.

nlu.load('ner').viz_streamlit(['I love NLU and Streamlit!','I hate buggy software'])

function pipe.viz_streamlit_classes

Visualize the predicted classes, their confidences and additional metadata in Streamlit.
Applicable with any of the 100+ classifiers.

nlu.load('sentiment').viz_streamlit_classes(['I love NLU and Streamlit!','I love buggy software', 'Sign up now get a chance to win 1000$ !', 'I am afraid of Snakes','Unicorns have been sighted on Mars!','Where is the next bus stop?'])

function pipe.viz_streamlit_ner

Visualize the predicted named entities, with their classes, confidences and additional metadata, in Streamlit.
Applicable with any of the 250+ NER models.
You can filter which NER tags to highlight via the dropdown in the main window.

Basic usage

nlu.load('ner').viz_streamlit_ner('Donald Trump from America and Angela Merkel from Germany dont share many views')

Example for coloring

# Color all entities of class PERSON green (#6e992e) and all entities of class GPE black (#000000)
nlu.load('ner').viz_streamlit_ner('Donald Trump from America and Angela Merkel from Germany dont share many views',colors={'PERSON':'#6e992e', 'GPE':'#000000'})
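
Since a mistyped color value is easy to miss, you may want to check the mapping before passing it on. This sketch assumes colors are given as #RRGGBB hex strings, as in the example above; whether other color formats are accepted is not covered here, and `check_colors` is a hypothetical helper.

```python
import re

# Matches exactly one '#' followed by six hexadecimal digits.
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def check_colors(colors):
    """Return `colors` unchanged if every value is a #RRGGBB hex string, else raise."""
    for tag, color in colors.items():
        if not HEX_COLOR.fullmatch(color):
            raise ValueError(f"Color for {tag!r} is not a #RRGGBB hex string: {color!r}")
    return colors


ner_colors = check_colors({"PERSON": "#6e992e", "GPE": "#000000"})
```

The validated dict can then be passed as `colors=ner_colors` to `viz_streamlit_ner`.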

function pipe.viz_streamlit_dep_tree

Visualize a typed dependency tree, showing the relations between tokens and the predicted part-of-speech tags.
Applicable with any of the 100+ Part of Speech (POS) models and the dependency tree model.

nlu.load('dep.typed').viz_streamlit_dep_tree('POS tags define a grammatical label for each token and the Dependency Tree classifies Relations between the tokens')

function pipe.viz_streamlit_token

Visualize predicted token and text features for every model loaded.
You can use this with any of the 1000+ models and select them from the left dropdown.

nlu.load('stemm pos spell').viz_streamlit_token('I liek pentut buttr and jelly !')

function pipe.viz_streamlit_similarity

  • Displays a similarity matrix, where the x-axis is every token in the first text and the y-axis is every token in the second text.
  • Index i,j in the matrix describes the similarity of token-i to token-j based on the loaded embeddings and distance metrics, which are based on Sklearn's Pairwise Metrics. See this article for more elaboration on similarities.
  • Displays dropdown selectors from which various similarity metrics and over 100 embeddings can be selected.
  • There will be one similarity matrix per metric and embedding pair selected: num_plots = num_metric*num_embeddings
  • Also displays embedding vector information.

Applicable with any of the 100+ Word Embedding models.

nlu.load('bert').viz_streamlit_word_similarity(['I love love loooove NLU! <3','I also love love looove  Streamlit! <3'])
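
To make the i,j indexing concrete, here is a plain-Python sketch of a token-to-token cosine similarity matrix, followed by the plot count for a selection of metrics and embeddings. The 2-dimensional token vectors are invented for illustration and the metric and embedding names are just example labels; real embeddings have hundreds of dimensions, and this is not the code NLU runs internally.

```python
import math
from itertools import product


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors_1, vectors_2):
    """Entry [i][j] is the similarity of token i in text 1 to token j in text 2."""
    return [[cosine_similarity(u, v) for v in vectors_2] for u in vectors_1]


# Hypothetical token vectors, invented for illustration only.
text_1 = {"I": [1.0, 0.0], "love": [0.0, 1.0]}
text_2 = {"I": [1.0, 0.0], "also": [1.0, 1.0], "love": [0.0, 2.0]}

matrix = similarity_matrix(list(text_1.values()), list(text_2.values()))

# One matrix is drawn per (metric, embedding) pair selected in the UI.
metrics = ["cosine", "euclidean"]
embeddings = ["bert", "glove"]
num_plots = len(list(product(metrics, embeddings)))
```

Here `matrix` has one row per token of the first text and one column per token of the second, and `num_plots` is 4 because two metrics are combined with two embeddings.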

1 line Install NLU on Google Colab

!wget https://setup.johnsnowlabs.com/nlu/colab.sh -O - | bash

1 line Install NLU on Kaggle

!wget https://setup.johnsnowlabs.com/nlu/kaggle.sh -O - | bash

Install via PIP

! pip install nlu pyspark==3.0.1