Steps in Exploratory Data Analysis Using Python

Exploratory Data Analysis

Hello, guys! Today we will focus on the basic steps we need to perform on a data set during Exploratory Data Analysis (EDA) when the data is continuous.

First, let's import all the required libraries for the analysis.

Together, these libraries provide the functions we need for Exploratory Data Analysis.

# Numpy Library

import numpy as np

# Pandas Library

import pandas as pd

# Matplotlib Library

import matplotlib.pyplot as plt

# Seaborn Library

import seaborn as sns

Step-1

  • Start by checking the structure of the data using dataframe.info() and dataframe.shape.
dataframe.info() lists every column along with its data type and non-null count.
dataframe.shape gives the dimensions of the dataframe as (rows, columns). Note that shape is an attribute, so it is written without parentheses.
  • Go through each column using dataframe.head().
dataframe.head() shows the first five rows by default, so you can look at every column and understand what it holds.
  • Identify which columns are significant for the analysis and which are not.
  • Reclassify columns where needed. For example, suppose a column holds the categorical values 'Very Bad', 'Bad', 'Average', 'Good' and 'Very Good'. Keeping 'Very Bad' and 'Bad' as separate categories adds little to the analysis, and the same goes for 'Good' and 'Very Good'. So we reclassify 'Very Bad' and 'Bad' as 'Bad', and 'Very Good' and 'Good' as 'Good', which makes the column easier to analyze.
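
The steps above can be sketched roughly as follows. This is only a minimal illustration: the dataframe df and its columns (customer_id, amount, rating) are made up for the example, and treating customer_id as the insignificant column is an assumption. Which columns matter depends on your own data.

```python
import pandas as pd

# Hypothetical toy data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "customer_id": [101, 102, 103, 104, 105],
    "amount": [250.0, 120.5, 310.0, 99.9, 180.0],
    "rating": ["Very Bad", "Bad", "Average", "Good", "Very Good"],
})

df.info()          # column names, non-null counts and data types
print(df.shape)    # shape is an attribute, not a method: (rows, columns)
print(df.head())   # first five rows by default

# Drop a column judged insignificant for the analysis (here, an ID column).
df = df.drop(columns=["customer_id"])

# Reclassify: merge 'Very Bad' into 'Bad' and 'Very Good' into 'Good'.
rating_map = {"Very Bad": "Bad", "Very Good": "Good"}
df["rating"] = df["rating"].replace(rating_map)

# Check the category counts after reclassifying.
print(df["rating"].value_counts())
```

Using a mapping dictionary with replace() leaves the values that are not in the dictionary (such as 'Average') unchanged, so only the categories you want to merge are touched.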

Currently pursuing my postgraduate studies at Manipal University.