
SweetViz is an open-source Python library that produces detailed visualizations to kick-start exploratory data analysis (EDA). It covers everything reported by the pandas df.describe() method and much more. The output is a simple HTML file that you can download and open whenever you like.

Features of SweetViz:

Target analysis (optional):

  • Shows how a target variable associates with the other variables

Compare:

  • Compares two distinct datasets (for example, training and test sets)

Type detection:

  • Automatically detects numerical and categorical features

Association:

  • Shows associations for numerical as well as categorical data

Statistical summary:

  • Shows missing values, unique values, most frequent values, largest values, and smallest values
  • Numerical summary: min, max, range, mean, median, mode, standard deviation, skewness, IQR, and much more
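
To make the numerical summary concrete, here is a minimal sketch of what those statistics mean, written with plain pandas. It is not how SweetViz computes them internally; the function name numeric_summary and the toy values are made up for illustration, and the sketch assumes the column holds at least one non-missing value.

```python
import pandas as pd


def numeric_summary(column: pd.Series) -> dict:
    """Summarize one numerical column (hypothetical helper, not part of SweetViz)."""
    present = column.dropna()  # statistics are computed on non-missing values
    q1, q3 = present.quantile([0.25, 0.75])
    return {
        "missing": int(column.isna().sum()),
        "distinct": int(present.nunique()),
        "min": present.min(),
        "max": present.max(),
        "range": present.max() - present.min(),
        "mean": present.mean(),
        "median": present.median(),
        "mode": present.mode().iloc[0],  # smallest value if several are tied
        "std": present.std(),            # sample standard deviation (ddof=1)
        "skewness": present.skew(),      # > 0 means a longer right tail
        "iqr": q3 - q1,                  # spread of the middle 50% of values
    }


# Toy data standing in for a real column; None marks a missing value.
summary = numeric_summary(pd.Series([1, 2, 2, 3, 10, None]))
print(summary)
```

The single large value (10) pulls the mean above the median and gives a positive skewness, which is the kind of pattern the report makes visible at a glance.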

Getting Started:

First, we install the SweetViz library:

!pip install sweetviz

Setting up Dependencies:

import pandas as pd
import seaborn as sns
import sweetviz as sv

Next we load the dataset. We will use the planets dataset from the seaborn library:

# Loading the dataset
planets = sns.load_dataset('planets')
planets.head()

Let's analyze our dataset:

# Analyzing the dataset
report = sv.analyze(planets)


# Save the report as planets.html and open it in the browser
report.show_html('planets.html')

We can also explore the relationships between features by clicking the Associations tab in the report.
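
As far as we can tell from the SweetViz documentation, the association between a categorical and a numerical feature is measured with the correlation ratio. The sketch below shows that measure in plain pandas and NumPy so the numbers in the Associations view are easier to read. It is an illustration under that assumption, not SweetViz's own code, and the function name correlation_ratio is our own.

```python
import numpy as np
import pandas as pd


def correlation_ratio(categories: pd.Series, values: pd.Series) -> float:
    """Correlation ratio (eta) between a categorical and a numerical series.

    0 means the category tells us nothing about the value;
    1 means the category fully determines the value.
    """
    frame = pd.DataFrame({"cat": categories, "val": values}).dropna()
    overall_mean = frame["val"].mean()
    groups = frame.groupby("cat")["val"]
    # Variation explained by the category: group sizes times squared
    # distance of each group mean from the overall mean.
    between = (groups.count() * (groups.mean() - overall_mean) ** 2).sum()
    # Total variation of the numerical values.
    total = ((frame["val"] - overall_mean) ** 2).sum()
    return float(np.sqrt(between / total)) if total > 0 else 0.0


# Toy data: in the first case the category fixes the value, in the second it does not.
strong = correlation_ratio(pd.Series(list("aabb")), pd.Series([1, 1, 3, 3]))
weak = correlation_ratio(pd.Series(list("aabb")), pd.Series([1, 3, 1, 3]))
print(strong, weak)
```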

And it's done. The EDA report is ready and contains a lot of information about every feature. It is easy to understand, and the whole report requires only a few lines of code.
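
The two optional features from the list, target analysis and dataset comparison, need only a little more code. Here is a minimal sketch, assuming the target_feat argument of sv.analyze and sv.compare and the [DataFrame, label] form that sv.compare accepts. The helper names split_frame and build_reports, the output file names, and the random 75/25 split are our own choices for illustration. To our knowledge the target must be a numerical or boolean column without missing values.

```python
import pandas as pd


def split_frame(df: pd.DataFrame, test_fraction: float = 0.25, seed: int = 0):
    """Randomly split a frame into (train, test) parts that share no rows.

    Hypothetical helper; assumes the frame has a unique index.
    """
    test = df.sample(frac=test_fraction, random_state=seed)
    train = df.drop(test.index)
    return train, test


def build_reports(df: pd.DataFrame, target: str) -> None:
    """Write a target-analysis report and a train/test comparison report."""
    import sweetviz as sv  # imported here so split_frame works without sweetviz

    # Target analysis: target_feat names the column to relate to all the others.
    target_report = sv.analyze(df, target_feat=target)
    target_report.show_html("target_report.html")

    # Comparison: each dataset is passed as a [DataFrame, label] pair.
    train, test = split_frame(df)
    compare_report = sv.compare([train, "Train"], [test, "Test"], target_feat=target)
    compare_report.show_html("compare_report.html")
```

With the planets frame loaded earlier, a call such as build_reports(planets, target="year") would write both files. The year column is used here only as an example of a numerical column.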
