Correlation Analysis in Python

Description of Project

Using Pandas, NumPy, Seaborn and Matplotlib, we run a correlation analysis on a movies dataset and visualise the relationships between its variables.

The following skills were utilised:

  • Data cleaning
  • Converting data types
  • Checking for duplicates
  • Plotting a scatter plot
  • Plotting a linear regression model
  • Calculating the Pearson correlation coefficient between variables and presenting the results in a correlation matrix
  • Visualising the correlation matrix in a heatmap
  • Converting columns with an object type to numeric values so that those variables can be compared in a correlation matrix
  • Pivoting a level of the hierarchical index labels, then sorting the resulting variable pairs to see the values of interest
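
As a rough illustration of how several of these steps fit together, here is a minimal sketch in Pandas. The toy DataFrame, its column names and its values are assumptions made for the example and are not taken from the Kaggle dataset. Encoding text columns as category codes is one possible way to turn object-type columns into numbers and may differ from the approach used in the project.

```python
import pandas as pd

# Hypothetical toy data standing in for the movies dataset.
# Column names and values are assumptions for illustration only.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D", "D"],
    "company": ["X", "Y", "X", "Z", "Z"],
    "budget": ["10", "20", "30", "40", "40"],   # numbers stored as text
    "gross": [25.0, 45.0, 65.0, 85.0, 85.0],
    "score": [8.0, 5.0, 7.0, 6.0, 6.0],
})

# Checking for duplicates, then dropping the repeated rows.
n_duplicates = int(df.duplicated().sum())
df = df.drop_duplicates().reset_index(drop=True)

# Converting data types: parse the text budget column into numbers.
df["budget"] = pd.to_numeric(df["budget"])

# Converting the remaining non-numeric columns to integer category codes
# so that they can take part in the correlation matrix.
encoded = df.copy()
for col in encoded.columns:
    if not pd.api.types.is_numeric_dtype(encoded[col]):
        encoded[col] = encoded[col].astype("category").cat.codes

# Pearson correlation matrix across all (now numeric) columns.
corr = encoded.corr(method="pearson")

# Pivot the matrix into a Series indexed by (variable, variable) pairs,
# drop each variable's correlation with itself, and sort the rest.
pairs = corr.unstack()
is_self_pair = pairs.index.get_level_values(0) == pairs.index.get_level_values(1)
sorted_pairs = pairs[~is_self_pair].sort_values()

print(corr)
print(sorted_pairs)
```

On real data, the `corr` frame could then be passed to `seaborn.heatmap(corr, annot=True)` for the heatmap step. Note that category codes are arbitrary integer labels, so correlations involving encoded text columns should be read with caution.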

Data source: https://www.kaggle.com/datasets/danielgrijalvas/movies?resource=download