Creating an ETL Process in SSIS. Data Cleaning and Analysis

Description of Project

Exploring PPP (Paycheck Protection Program) business loan data (650K rows)
Source of data: https://data.sba.gov/dataset/ppp-foia
Libraries used: Pandas, NumPy, Matplotlib, Seaborn

Methods of exploring data:

Categorical variables: univariate plots, bivariate plots, cross-tabulation
Numerical variables: discrete variables, outlier plots, distribution plots (skewness and kurtosis), scatter plots
Relationships between variables: correlation, heatmap matrix
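
As a rough illustration of the numeric side of these methods, here is a minimal sketch in Pandas and NumPy. It runs on synthetic data, not the actual PPP file, and the column names (business_type, state, loan_amount, jobs_reported) are hypothetical stand-ins chosen for the example. The IQR rule used for outliers is one common choice and is an assumption, not necessarily the approach taken in this project.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the loan data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 1_000
df = pd.DataFrame({
    "business_type": rng.choice(["LLC", "Corporation", "Sole Proprietorship"], size=n),
    "state": rng.choice(["CA", "TX", "NY"], size=n),
    "loan_amount": rng.lognormal(mean=11, sigma=1, size=n),  # right-skewed amounts
    "jobs_reported": rng.integers(1, 50, size=n),            # discrete numeric variable
})

# Univariate view of a categorical variable: frequency of each category.
type_counts = df["business_type"].value_counts()

# Bivariate view of two categorical variables: cross-tabulation.
type_by_state = pd.crosstab(df["business_type"], df["state"])

# Shape of a numerical distribution: skewness and (excess) kurtosis.
skewness = df["loan_amount"].skew()
kurtosis = df["loan_amount"].kurt()

# Outliers via the 1.5 * IQR rule (an assumed, commonly used convention).
q1, q3 = df["loan_amount"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["loan_amount"] < q1 - 1.5 * iqr) | (df["loan_amount"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

# Pairwise Pearson correlation between the numerical columns.
corr = df[["loan_amount", "jobs_reported"]].corr()

print(type_counts)
print(type_by_state)
print(f"skewness={skewness:.2f}, kurtosis={kurtosis:.2f}, outliers={len(outliers)}")
print(corr)
```

Each of these tables has a natural plot in Seaborn: countplot for the category counts, histplot or boxplot for the distribution and outliers, scatterplot for two numerical columns, and heatmap for the correlation matrix.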

Elements Used