EDA using Sweetviz

Today we will see how to perform EDA (Exploratory Data Analysis) on a dataset using Sweetviz. There are many ways to automate EDA, and Sweetviz is one of them.

What is EDA (Exploratory Data Analysis)?

EDA (Exploratory Data Analysis) is the process of analyzing a dataset, mainly with summary statistics and visual methods, to find patterns and gain insights. Several libraries can help with EDA. Today we will look at one of them: Sweetviz.

Sweetviz

Sweetviz is an open-source Python library that generates visualizations to kickstart your EDA with just two lines of code. Let’s explore Sweetviz in detail.

source: https://pypi.org/project/sweetviz/

Install Sweetviz

Like any other Python library, Sweetviz can be installed with pip:

pip install sweetviz
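
To confirm that the install worked, one option is to look up the installed version with the standard library. This is only a small sketch; the helper name `installed_version` is made up for illustration and is not part of Sweetviz.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("sweetviz")
if version is None:
    print("sweetviz is not installed; run: pip install sweetviz")
else:
    print(f"sweetviz {version} is installed")
```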

Load Data

Today we are going to work with the breast cancer dataset from scikit-learn. Once the target column is added, the dataframe has shape (569, 31): 569 rows and 31 columns (30 features plus the target).

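import pandas as pd
from sklearn.datasets import load_breast_cancer

# Load the dataset and put the 30 numeric features into a dataframe
cancer = load_breast_cancer()
df = pd.DataFrame(data=cancer.data, columns=cancer.feature_names)

# Add the target column and replace the numeric codes with readable labels
df['target'] = cancer.target
df['target'] = df['target'].replace({0: 'malignant', 1: 'benign'})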
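
Before generating any report, it can help to check that the dataframe looks as expected. The sketch below rebuilds the same dataframe so that it runs on its own, then prints the shape, the class counts, and the number of missing cells. It uses `map` instead of `replace` for the labels, which is just an equivalent choice made here.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

# Rebuild the dataframe from the section above
cancer = load_breast_cancer()
df = pd.DataFrame(data=cancer.data, columns=cancer.feature_names)
df["target"] = pd.Series(cancer.target).map({0: "malignant", 1: "benign"})

print(df.shape)  # (569, 31)

# How many rows belong to each class
class_counts = df["target"].value_counts()
print(class_counts)  # benign: 357, malignant: 212

# Total number of missing cells across all columns
missing_cells = int(df.isna().sum().sum())
print(missing_cells)  # 0
```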

Sweetviz has 3 main functions for creating reports:

  • analyze()
  • compare()
  • compare_intra()

Pass your dataframe to the analyze() function. Then call show_html() on the result to save the report as an HTML file and open it in your browser.

import sweetviz as sv
my_report = sv.analyze(df)
my_report.show_html('filename.html')
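
analyze() also accepts a `target_feat` argument that highlights how each feature relates to a target column. As far as I know, Sweetviz only accepts Boolean or numerical targets, so the sketch below assumes a Boolean column named `is_malignant` (a name chosen here for illustration) instead of the text labels. The helper `write_target_report` and the output file name are likewise hypothetical.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

# Rebuild the dataframe with a Boolean target instead of text labels
cancer = load_breast_cancer()
df = pd.DataFrame(data=cancer.data, columns=cancer.feature_names)
df["is_malignant"] = cancer.target == 0  # in this dataset, code 0 means malignant


def write_target_report(frame, path="analyze_report.html"):
    """Analyze `frame` against the Boolean target and save the report to `path`."""
    import sweetviz as sv  # imported lazily so the data prep above runs on its own

    report = sv.analyze(frame, target_feat="is_malignant")
    # open_browser=False only writes the file, which suits scripts and servers
    report.show_html(path, open_browser=False)
    return path
```

Calling `write_target_report(df)` should then write `analyze_report.html` to the working directory without opening a browser tab.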

To compare two dataframes, for example a train set and a test set, use the compare() function. It takes two dataframes as input.

import sweetviz as sv
train_data = df[:400]
test_data = df[400:]
my_report = sv.compare(train_data, test_data)
my_report.show_html('filename.html')
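
Slicing by row position only gives a fair comparison if the rows are in random order. A possible alternative, sketched below, is a shuffled and stratified split with scikit-learn's `train_test_split`. compare() also accepts `[dataframe, "name"]` pairs so that each side is labeled in the report. The helper `write_compare_report`, the 30% test size, and the file name are assumptions made for this example.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
df = pd.DataFrame(data=cancer.data, columns=cancer.feature_names)
df["target"] = pd.Series(cancer.target).map({0: "malignant", 1: "benign"})

# Shuffle the rows and keep the class balance similar in both parts
train_data, test_data = train_test_split(
    df, test_size=0.3, random_state=42, stratify=df["target"]
)
print(len(train_data), len(test_data))


def write_compare_report(train, test, path="compare_report.html"):
    """Compare two dataframes and save the labeled report to `path`."""
    import sweetviz as sv  # imported lazily so the split above runs on its own

    report = sv.compare([train, "Train"], [test, "Test"])
    report.show_html(path, open_browser=False)
    return path
```

Calling `write_compare_report(train_data, test_data)` should produce a report with the two sides labeled "Train" and "Test".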

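compare_intra() compares two subsets of the same dataframe, split by a Boolean condition (e.g. Male vs. Female). It takes the dataframe, a Boolean series that marks the first subset, and the names of the two subsets.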

import sweetviz as sv
my_report = sv.compare_intra(df, df['target'] == 'malignant', ['malignant', 'benign'])
my_report.show_html('filename.html')
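
The Boolean condition is an ordinary pandas series, so it can be inspected before it is handed to compare_intra(). The sketch below builds the mask, counts the rows on each side, and wraps the report call in a hypothetical helper `write_intra_report` (the helper and file name are not part of Sweetviz).

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()
df = pd.DataFrame(data=cancer.data, columns=cancer.feature_names)
df["target"] = pd.Series(cancer.target).map({0: "malignant", 1: "benign"})

# True for rows in the first subset, False for rows in the second
mask = df["target"] == "malignant"
n_malignant = int(mask.sum())
n_benign = int((~mask).sum())
print(n_malignant, n_benign)  # 212 357


def write_intra_report(frame, condition, path="intra_report.html"):
    """Compare the rows where `condition` is True against the rest of `frame`."""
    import sweetviz as sv  # imported lazily so the mask above can be built on its own

    # The first name labels the True rows, the second labels the False rows
    report = sv.compare_intra(frame, condition, ["malignant", "benign"])
    report.show_html(path, open_browser=False)
    return path
```

Calling `write_intra_report(df, mask)` should then compare the 212 malignant rows with the 357 benign rows.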

If you find any issues, please let us know.
