Golf Project

Table of Contents

  1. Description of the Data
  2. Data Cleaning
  3. Exploratory Data Analysis
  4. Machine Learning Model (Classification)
  5. Machine Learning Model (Regression)
  6. Conclusion

1. Description of the Data

Let's import the libraries we need and load the data (CSV) file.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv("golfdata.csv")

# Return first 5 rows
df.head()

Output:

   Player Name     Rounds  Fairway Percentage  Year  Avg Distance    gir  Average Putts  Average Scrambling  Average Score  Points  Wins  Top 10  Average SG Putts  Average SG Total  SG:OTT  SG:APR  SG:ARG       Money
0  Henrik Stenson    60.0               75.19  2018         291.5  73.51          29.93               60.67         69.617     868   NaN     5.0            -0.207             1.153   0.427   0.960  -0.027  $2,680,487
1  Ryan Armour      109.0               73.58  2018         283.5  68.22          29.31               60.13         70.758   1,006   1.0     3.0            -0.058             0.337  -0.012   0.213   0.194  $2,485,203
2  Chez Reavie       93.0               72.24  2018         286.5  68.67          29.12               62.27         70.432   1,020   NaN     3.0             0.192             0.674   0.183   0.437  -0.137  $2,700,018
3  Ryan Moore        78.0               71.94  2018         289.2  68.80          29.17               64.16         70.015     795   NaN     5.0            -0.271             0.941   0.406   0.532   0.273  $1,986,608
4  Brian Stuard     103.0               71.44  2018         278.9  67.12          29.11               59.23         71.038     421   NaN     3.0             0.164             0.062  -0.227   0.099   0.026  $1,089,763

OK, now what?

Let's clean the data

  • First, we can look at the data and see what needs to be cleaned
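As a rough sketch of that first look, the snippet below rebuilds three columns from the df.head() output shown earlier and checks their dtypes and missing values. The name sample is hypothetical and exists only so the example runs without golfdata.csv. The cleaning steps that follow (treating a missing Wins value as zero, and stripping "$" and "," from Points and Money) are assumptions about what the cleaning might involve, not steps confirmed by this notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: values copied from the df.head() output above.
sample = pd.DataFrame({
    "Player Name": ["Henrik Stenson", "Ryan Armour", "Chez Reavie",
                    "Ryan Moore", "Brian Stuard"],
    "Points": ["868", "1,006", "1,020", "795", "421"],
    "Wins": [np.nan, 1.0, np.nan, np.nan, np.nan],
    "Money": ["$2,680,487", "$2,485,203", "$2,700,018",
              "$1,986,608", "$1,089,763"],
})

# Step 1: inspect. Text dtypes and missing-value counts hint at what to clean.
print(sample.dtypes)
missing_counts = sample.isna().sum()
print(missing_counts)

# Step 2 (assumption): a missing Wins value means the player had no wins.
cleaned = sample.copy()
cleaned["Wins"] = cleaned["Wins"].fillna(0).astype(int)

# Step 3 (assumption): remove "$" and "," so Points and Money become integers.
for col in ["Points", "Money"]:
    cleaned[col] = cleaned[col].str.replace(r"[$,]", "", regex=True).astype(int)

print(cleaned)
```

On the full dataset the same checks could be run on df directly, for example df.dtypes and df.isna().sum(), before deciding how each column should be handled.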

2. Data Cleaning

3. Exploratory Data Analysis

4. Machine Learning Model (Classification)

5. Machine Learning Model (Regression)

6. Conclusion
