100 Days Challenge Day 4 - K-Means Clustering (FIFA 19 Dataset Project)

I revised clustering from Andrew Ng's Machine Learning course on Coursera, which I had completed a few months ago, and then applied it to the FIFA 19 dataset. The goal was to cluster a set of top football players into 4 clusters using FIFA 19 ratings such as shooting, crossing, speed, acceleration and strength, expecting the clusters to reflect each player's position, style and quality of play.

Topics covered include:

K-Means Clustering
  • Algorithm
  • Optimization Objective
  • Choosing number of Clusters
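
For reference, the three topics above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the project notebook: the function names (assign, move, distortion, kmeans) are hypothetical, the two-feature ratings are made up, and the starting centroids are simply picked by hand. The algorithm alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The optimization objective J is the mean squared distance between each point and its assigned centroid. Comparing J across several values of K is one common way to choose the number of clusters.

```python
import math

def assign(points, centroids):
    """Cluster-assignment step: index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points]

def move(points, labels, centroids):
    """Move-centroid step: each centroid becomes the mean of its assigned points."""
    new = []
    for k, old in enumerate(centroids):
        members = [p for p, c in zip(points, labels) if c == k]
        if not members:  # empty cluster: keep the old centroid
            new.append(old)
            continue
        new.append(tuple(sum(col) / len(members) for col in zip(*members)))
    return new

def distortion(points, labels, centroids):
    """Objective J: mean squared distance from each point to its centroid."""
    return sum(math.dist(p, centroids[c]) ** 2
               for p, c in zip(points, labels)) / len(points)

def kmeans(points, init_centroids, max_iters=100):
    """Repeat the two steps until the assignments stop changing."""
    centroids = [tuple(c) for c in init_centroids]
    labels = assign(points, centroids)
    for _ in range(max_iters):
        centroids = move(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels, centroids

# Made-up (shooting, strength) ratings, for illustration only.
ratings = [(90, 60), (88, 62), (86, 58), (30, 85), (35, 88), (32, 90)]

labels, centroids = kmeans(ratings, init_centroids=[ratings[0], ratings[3]])
cost = distortion(ratings, labels, centroids)

# Choosing the number of clusters: compute J for several K and look for the
# point where adding another cluster stops reducing J by much (the "elbow").
costs = {}
for k in (1, 2, 3):
    init = ratings[::len(ratings) // k][:k]  # k evenly spaced starting points
    k_labels, k_centroids = kmeans(ratings, init)
    costs[k] = distortion(ratings, k_labels, k_centroids)
    print(k, round(costs[k], 2))
```

On real data the features would be the full set of player ratings rather than two columns, and the result can depend heavily on which starting centroids are chosen.
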
Dataset: https://www.kaggle.com/karangadiya/fifa19/version/1

Project Link:
https://github.com/hithesh111/Hith100/blob/master/fifa19playerclustering.ipynb

Results:
The last column shows the cluster assigned to each player. As expected, the goalkeepers are grouped together in the De Gea cluster. The Ramos cluster includes all the defenders and defensive midfielders. Wingers and a few strikers have found their way into the Messi cluster. The De Bruyne cluster contains all kinds of attacking-minded players, which makes it hard to interpret.
(I will try to expand this project later on with more players, random initialisation of cluster centroids and other optimization methods.)
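
As a rough idea of what random initialisation could look like, the sketch below picks K random players as the starting centroids, runs K-means from there, repeats this several times, and keeps the run with the lowest distortion. It is only an assumption about how such an extension might be written: the names lloyd and best_of_restarts, the number of restarts, the seed and the two-feature ratings are all invented for the example.

```python
import math
import random

def lloyd(points, centroids, max_iters=100):
    """Plain K-means iterations from the given starting centroids."""
    def nearest(p):
        return min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))

    for _ in range(max_iters):
        labels = [nearest(p) for p in points]
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, c in zip(points, labels) if c == k]
            # Mean of the members, or the old centroid if the cluster is empty.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else old)
        if updated == centroids:  # centroids stopped moving: converged
            break
        centroids = updated

    labels = [nearest(p) for p in points]
    cost = sum(math.dist(p, centroids[c]) ** 2
               for p, c in zip(points, labels)) / len(points)
    return cost, labels, centroids

def best_of_restarts(points, k, restarts=20, seed=0):
    """Run K-means from several random starts and keep the lowest-cost run."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    runs = (lloyd(points, rng.sample(points, k)) for _ in range(restarts))
    return min(runs, key=lambda run: run[0])

# Made-up (shooting, strength) ratings, for illustration only.
ratings = [(90, 60), (88, 62), (86, 58), (30, 85), (35, 88), (32, 90)]
best_cost, best_labels, best_centroids = best_of_restarts(ratings, k=2)
print(best_labels, round(best_cost, 2))
```

Keeping the lowest-cost run helps because a single unlucky start can leave K-means stuck in a poor local optimum.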