Table of Contents
  • Project Overview
  • Section 1: Setup
    • 1.1: Installation
    • 1.2: Libraries
  • Section 2: Data Acquisition
  • Section 3: Exploration
    • 3.1: Overview
    • 3.2: Quality
    • 3.3: Filtering
  • Section 4: Cleaning & Features
    • 4.1: Summary
  • Section 5: League Table
    • 5.1: Zones
    • 5.2: Visualization
  • Section 6: Team Performance
    • 6.1: Attacking
    • 6.2: Defensive
    • 6.3: Efficiency
  • Section 7: Correlations
    • 7.1.1: Heatmap
    • 7.1.2: Rankings
    • 7.2: Summary
    • 7.3: Distribution
  • Section 8: Player Analysis
    • 8.1: Top Scorers
    • 8.2: Positional
    • 8.3: Shooting
    • 8.4: Defensive
    • 8.5: Goalkeepers
    • 8.6: Visualizations
    • 8.7: Contributions
  • Section 9: Data Science
    • 9.1: Hypothesis Testing
    • 9.2: Regression
    • 9.3: Clustering
  • Section 10: Summary
  • Conclusion
  • Attributions

Premier League 2024-25: A Data-Driven Analysis

An End-to-End Data Analysis & Data Science Project

Author: Shorya Raj

Project Overview

This project presents a comprehensive exploratory data analysis (EDA) of the English Premier League 2024-25 season. It moves from a foundational data-analysis approach, which describes what happened through team performance metrics and player statistics, into a data-science exploration that tests hypotheses, builds explanatory models, and uses unsupervised learning (K-Means clustering) to discover hidden patterns in the data.

The notebook provides a complete, end-to-end workflow, from secure data acquisition via the Kaggle API to final interactive visualizations and a detailed summary of key insights.

Project Highlights

  • Dual Approach: Combines descriptive data analysis with inferential data science techniques.
  • Holistic Coverage: In-depth analysis of both team dynamics and individual player performance.
  • Advanced Techniques: Features hypothesis testing, explanatory regression modeling, and K-Means clustering.
  • Interactive Visualizations: Employs Plotly, Matplotlib, and Seaborn to create publication-quality charts and dashboards.
  • Professional Workflow: Demonstrates best practices in data cleaning, feature engineering, and reporting.

Technologies & Libraries

  • Data Manipulation & Analysis: Pandas, NumPy
  • Data Visualization: Matplotlib, Seaborn, Plotly
  • Data Acquisition: Kaggle API
  • Statistical Modeling & Machine Learning: SciPy, StatsModels, Scikit-learn

Section 1: Project Overview and Environment Setup

This initial section prepares the notebook environment. It involves installing all necessary Python packages for data analysis, visualization, and statistical modeling, followed by importing the required libraries. A consistent plotting style is also set for all visualizations.

Section 1.1: Package Installation

✅ Packages installed successfully!

Section 1.2: Import Libraries

✅ Environment setup complete!
📦 Libraries imported successfully!

Section 2: Secure Data Acquisition

Here, we automate downloading the Premier League datasets from Kaggle using the Kaggle API. The API credentials are read from Colab secrets rather than hard-coded in the notebook.

✅ Kaggle credentials loaded securely from Colab secrets!
📥 Downloading datasets...
✅ Datasets downloaded successfully!
✅ Extracted premier-league-2024-2025-team-statistics.zip
✅ Extracted football-players-stats-2024-2025.zip
🎉 All datasets ready for analysis!
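The acquisition step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual cell: the helper names `set_kaggle_credentials` and `extract_archives` are hypothetical, and the download call itself is left out because the output above does not show the dataset identifiers. The sketch assumes the credentials are exposed through the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, which the Kaggle client reads.

```python
import os
import tempfile
import zipfile
from pathlib import Path


def set_kaggle_credentials(username: str, key: str) -> None:
    """Expose credentials via the environment variables the Kaggle client reads.

    In Colab the two values would typically come from the secrets store
    (e.g. google.colab.userdata.get(...)); the secret names are up to the user.
    """
    os.environ["KAGGLE_USERNAME"] = username
    os.environ["KAGGLE_KEY"] = key


def extract_archives(zip_dir: Path, out_dir: Path) -> list[str]:
    """Extract every .zip archive found in zip_dir into out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(zip_dir.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(out_dir)
        extracted.append(archive.name)
    return extracted


# Demo on a throwaway archive so the sketch runs without network access.
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    with zipfile.ZipFile(tmp_path / "sample.zip", "w") as zf:
        zf.writestr("teams.csv", "Squad,Pts\nLiverpool,84\n")
    extracted = extract_archives(tmp_path, tmp_path / "data")
    csv_exists = (tmp_path / "data" / "teams.csv").exists()

for name in extracted:
    print(f"Extracted {name}")
```

Keeping credentials in environment variables means the key never appears in a saved cell or its output.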

Section 3: Data Loading and Initial Exploration

Once downloaded, the datasets are loaded into Pandas DataFrames. This section includes an initial exploration to understand the structure, shape, and quality of both the team and player statistics data, forming the basis for our subsequent cleaning and analysis.

📂 Extracting and loading datasets...
✅ Data loaded successfully!

Section 3.1: Dataset Overview

======================================================================
📊 DATASET OVERVIEW
======================================================================
🏟️  Team Statistics:
   • Shape: (20, 19)
   • Teams: 20
   • Metrics: 19

⚽ Player Statistics:
   • Shape: (2854, 165)
   • Total Players: 2854
   • Metrics: 165

📋 Team Statistics Columns:
   ['Rk', 'Squad', 'MP', 'W', 'D', 'L', 'GF', 'GA', 'GD', 'Pts', 'Pts/MP', 'xG', 'xGA', 'xGD', 'xGD/90', 'Attendance', 'Top Team Scorer', 'Goalkeeper', 'Notes']

📋 Player Statistics Columns:
   ['Rk', 'Player', 'Nation', 'Pos', 'Squad', 'Comp', 'Age', 'Born', 'MP', 'Starts', 'Min', '90s', 'Gls', 'Ast', 'G+A', 'G-PK', 'PK', 'PKatt', 'CrdY', 'CrdR', 'xG', 'npxG', 'xAG', 'npxG+xAG', 'G+A-PK', 'xG+xAG', 'PrgC', 'PrgP', 'PrgR', 'Sh', 'SoT', 'SoT%', 'Sh/90', 'SoT/90', 'G/Sh', 'G/SoT', 'Dist', 'FK', 'PK_stats_shooting', 'PKatt_stats_shooting', 'xG_stats_shooting', 'npxG_stats_shooting', 'npxG/Sh', 'G-xG', 'np:G-xG', 'Cmp', 'Att', 'Cmp%', 'TotDist', 'PrgDist', 'Ast_stats_passing', 'xAG_stats_passing', 'xA', 'A-xAG', 'KP', '1/3', 'PPA', 'CrsPA', 'PrgP_stats_passing', 'Live', 'Dead', 'FK_stats_passing_types', 'TB', 'Sw', 'Crs', 'TI', 'CK', 'In', 'Out', 'Str', 'Cmp_stats_passing_types', 'Tkl', 'TklW', 'Def 3rd', 'Mid 3rd', 'Att 3rd', 'Att_stats_defense', 'Tkl%', 'Lost', 'Blocks_stats_defense', 'Sh_stats_defense', 'Pass', 'Int', 'Tkl+Int', 'Clr', 'Err', 'SCA', 'SCA90', 'PassLive', 'PassDead', 'TO', 'Sh_stats_gca', 'Fld', 'Def', 'GCA', 'GCA90', 'Touches', 'Def Pen', 'Def 3rd_stats_possession', 'Mid 3rd_stats_possession', 'Att 3rd_stats_possession', 'Att Pen', 'Live_stats_possession', 'Att_stats_possession', 'Succ', 'Succ%', 'Tkld', 'Tkld%', 'Carries', 'TotDist_stats_possession', 'PrgDist_stats_possession', 'PrgC_stats_possession', '1/3_stats_possession', 'CPA', 'Mis', 'Dis', 'Rec', 'PrgR_stats_possession', 'CrdY_stats_misc', 'CrdR_stats_misc', '2CrdY', 'Fls', 'Fld_stats_misc', 'Off_stats_misc', 'Crs_stats_misc', 'Int_stats_misc', 'TklW_stats_misc', 'PKwon', 'PKcon', 'OG', 'Recov', 'Won', 'Lost_stats_misc', 'Won%', 'GA', 'GA90', 'SoTA', 'Saves', 'Save%', 'W', 'D', 'L', 'CS', 'CS%', 'PKatt_stats_keeper', 'PKA', 'PKsv', 'PKm', 'PSxG', 'PSxG/SoT', 'PSxG+/-', '/90', 'Cmp_stats_keeper_adv', 'Att_stats_keeper_adv', 'Cmp%_stats_keeper_adv', 'Att (GK)', 'Thr', 'Launch%', 'AvgLen', 'Opp', 'Stp', 'Stp%', '#OPA', '#OPA/90', 'AvgDist']

Section 3.2: Data Quality Assessment

======================================================================
🔍 DATA QUALITY ASSESSMENT
======================================================================
🏟️  Team Statistics - Missing Values:
Notes    10
dtype: int64

📊 Team Statistics Summary:
          Rk    MP      W      D      L     GF     GA     GD    Pts  Pts/MP  \
count  20.00  20.0  20.00  20.00  20.00  20.00  20.00  20.00  20.00   20.00   
mean   10.50  38.0  14.35   9.30  14.35  55.75  55.75   0.00  52.35    1.38   
std     5.92   0.0   6.00   2.87   6.96  14.71  14.42  27.04  18.58    0.49   
min     1.00  38.0   2.00   5.00   4.00  26.00  34.00 -60.00  12.00    0.32   
25%     5.75  38.0  11.00   7.75   9.75  45.50  45.50 -11.25  42.00    1.11   
50%    10.50  38.0  15.00   9.00  12.00  58.00  52.50   3.50  55.00    1.44   
75%    15.25  38.0  19.25  10.25  18.50  66.00  62.75  14.25  66.00    1.74   
max    20.00  38.0  25.00  15.00  30.00  86.00  86.00  45.00  84.00    2.21   

          xG    xGA    xGD  xGD/90  Attendance  
count  20.00  20.00  20.00   20.00       20.00  
mean   53.90  53.89   0.00    0.00    40475.55  
std    13.04  12.00  23.30    0.61    16886.82  
min    32.60  34.40 -52.10   -1.37    11210.00  
25%    45.05  47.28  -6.55   -0.17    29979.75  
50%    57.40  49.60   2.70    0.07    35118.50  
75%    61.25  58.50  16.20    0.43    54629.75  
max    82.20  84.80  43.60    1.15    73747.00  

📊 Player Statistics Summary:
            Rk      Age     Born       MP   Starts      Min      90s      Gls  \
count  2854.00  2846.00  2846.00  2854.00  2854.00  2854.00  2854.00  2854.00   
mean   1427.50    25.02  1998.64    19.01    13.50  1211.53    13.46     1.68   
std     824.02     4.49     4.50    11.50    11.32   965.19    10.72     3.15   
min       1.00    15.00  1982.00     1.00     0.00     1.00     0.00     0.00   
25%     714.25    22.00  1996.00     9.00     3.00   317.25     3.50     0.00   
50%    1427.50    25.00  1999.00    20.00    11.00  1052.50    11.70     0.00   
75%    2140.75    28.00  2002.00    30.00    23.00  1996.75    22.20     2.00   
max    2854.00    41.00  2008.00    38.00    38.00  3420.00    38.00    31.00   

           Ast      G+A  ...  Att (GK)     Thr  Launch%  AvgLen     Opp  \
count  2854.00  2854.00  ...    212.00  212.00   212.00  212.00  212.00   
mean      1.20     2.88  ...    491.60   69.45    34.14   33.04  226.56   
std       1.95     4.53  ...    410.27   57.99    14.24    6.07  187.82   
min       0.00     0.00  ...      1.00    0.00     0.00    6.00    0.00   
25%       0.00     0.00  ...    112.25   15.75    25.45   29.48   55.75   
50%       0.00     1.00  ...    397.50   55.00    33.20   32.45  175.50   
75%       2.00     4.00  ...    847.25  120.25    41.02   35.90  408.00   
max      18.00    47.00  ...   1498.00  197.00    92.30   56.30  710.00   

          Stp    Stp%    #OPA  #OPA/90  AvgDist  
count  212.00  211.00  212.00   212.00   208.00  
mean    14.38    6.16   18.77     1.16    13.91  
std     13.87    4.07   18.28     1.01     3.73  
min      0.00    0.00    0.00     0.00     2.00  
25%      2.00    4.00    3.00     0.67    11.98  
50%     10.50    5.60   14.00     1.00    13.70  
75%     22.00    7.90   30.25     1.47    15.52  
max     64.00   33.30   89.00    10.00    28.00  

[8 rows x 160 columns]

Section 3.3: Player Data Filtering and Assessment

======================================================================
⚽ PREMIER LEAGUE PLAYER DATA
======================================================================
✅ Premier League Players Found: 574
🏟️  Teams Represented: 20
📍 Position Distribution:
   • DF: 186 players
   • MF: 112 players
   • FW: 85 players
   • FW,MF: 60 players
   • GK: 44 players
   • MF,FW: 44 players
   • DF,MF: 16 players
   • MF,DF: 13 players
   • FW,DF: 7 players
   • DF,FW: 7 players
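The filtering behind the counts above might look like the sketch below. It assumes the competition is stored in the `Comp` column and that Premier League rows contain the text "Premier League". The exact label in the dataset is not shown in the output, so the values here are illustrative, and "Player A" and "Player B" are placeholder rows.

```python
import pandas as pd

# Tiny stand-in for the full 2,854-row player table (Comp values are assumed).
players = pd.DataFrame({
    "Player": ["Mohamed Salah", "Erling Haaland", "Player A", "Player B"],
    "Comp": ["eng Premier League", "eng Premier League", "es La Liga", "it Serie A"],
    "Squad": ["Liverpool", "Manchester City", "Club A", "Club B"],
    "Pos": ["FW", "FW", "MF", "DF"],
})

# Keep only rows whose competition label mentions the Premier League.
is_pl = players["Comp"].str.contains("Premier League", na=False)
pl_players = players[is_pl].copy()

n_teams = pl_players["Squad"].nunique()          # number of distinct clubs
position_counts = pl_players["Pos"].value_counts()  # players per position label

print(f"Premier League players found: {len(pl_players)}")
print(f"Teams represented: {n_teams}")
print(position_counts)
```

Note that `value_counts` treats combined labels such as `FW,MF` and `MF,FW` as separate categories, which is why both appear in the distribution above.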

Section 4: Data Cleaning and Feature Engineering

Data preprocessing is a critical step for ensuring the accuracy and reliability of any analysis. In this section, we clean the datasets by handling missing values and create new, insightful features that will enable a deeper analysis of team and player performance.

🧹 Cleaning team statistics...
✅ Team data cleaned and enhanced!
🧹 Cleaning player statistics...
✅ Player data cleaned! 506 players with 90+ minutes

Section 4.1: Feature Engineering Summary

======================================================================
⚙️  FEATURE ENGINEERING SUMMARY
======================================================================
🏟️  Team Metrics Added:
   • Goals_Per_Game
   • Goals_Against_Per_Game
   • Goal_Difference_Per_Game
   • Win_Rate (%)
   • Points_Per_Game

⚽ Player Metrics Added:
   • Goals_Per_90
   • Assists_Per_90
   • Goal_Contributions_Per_90

📊 Clean Data Summary:
   • Teams: 20
   • Players: 506

✅ Data preparation complete!
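The engineered metrics listed above can be reproduced with a few column operations. The sketch below is an assumption about how they are defined (per-match ratios for teams, per-90-minute ratios for players with at least 90 minutes), checked only against figures reported elsewhere in this notebook. The team rows and the two named players use values from the tables in Sections 5 and 8; "Hypothetical Sub" is an invented row that shows the minutes filter.

```python
import pandas as pd

teams = pd.DataFrame({
    "Squad": ["Liverpool", "Arsenal", "Manchester City"],
    "MP": [38, 38, 38],
    "W": [25, 20, 21],
    "GF": [86, 69, 72],
    "GA": [41, 34, 44],
    "Pts": [84, 74, 71],
})

# Team metrics: simple per-match ratios.
teams["Goals_Per_Game"] = teams["GF"] / teams["MP"]
teams["Goals_Against_Per_Game"] = teams["GA"] / teams["MP"]
teams["Goal_Difference_Per_Game"] = (teams["GF"] - teams["GA"]) / teams["MP"]
teams["Win_Rate"] = teams["W"] / teams["MP"] * 100
teams["Points_Per_Game"] = teams["Pts"] / teams["MP"]

players = pd.DataFrame({
    "Player": ["Mohamed Salah", "Alexander Isak", "Hypothetical Sub"],
    "Gls": [29, 23, 0],
    "Ast": [18, 6, 0],
    "Min": [3371, 2756, 45],
})

# Player metrics: drop players below 90 minutes, then scale by 90s played.
active = players[players["Min"] >= 90].copy()
nineties = active["Min"] / 90
active["Goals_Per_90"] = active["Gls"] / nineties
active["Assists_Per_90"] = active["Ast"] / nineties
active["Goal_Contributions_Per_90"] = (active["Gls"] + active["Ast"]) / nineties

print(teams.round(2))
print(active.round(2))
```

Filtering on minutes before dividing avoids inflated per-90 rates for players with very short appearances.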

Section 5: Final League Table and Standings Analysis

The core analysis begins by reconstructing the final Premier League table. This section breaks down the season's outcomes, identifying the champions, teams qualifying for European competitions, and the relegated clubs. This provides the foundational context for all subsequent team performance analysis.

🏆 PREMIER LEAGUE 2024-25 FINAL STANDINGS ANALYSIS
================================================================================

📋 FINAL LEAGUE TABLE:
==========================================================================================
 Position           Squad  MP  W  D  L  GF  GA  GD  Pts
        1       Liverpool  38 25  9  4  86  41  45   84
        2         Arsenal  38 20 14  4  69  34  35   74
        3 Manchester City  38 21  8  9  72  44  28   71
        4         Chelsea  38 20  9  9  64  43  21   69
        5   Newcastle Utd  38 20  6 12  68  47  21   66
        6     Aston Villa  38 19  9 10  58  51   7   66
        7 Nott'ham Forest  38 19  8 11  58  46  12   65
        8        Brighton  38 16 13  9  66  59   7   61
        9     Bournemouth  38 15 11 12  58  46  12   56
       10       Brentford  38 16  8 14  66  57   9   56
       11          Fulham  38 15  9 14  54  54   0   54
       12  Crystal Palace  38 13 14 11  51  51   0   53
       13         Everton  38 11 15 12  42  44  -2   48
       14        West Ham  38 11 10 17  46  62 -16   43
       15  Manchester Utd  38 11  9 18  44  54 -10   42
       16          Wolves  38 12  6 20  54  69 -15   42
       17       Tottenham  38 11  5 22  64  65  -1   38
       18  Leicester City  38  6  7 25  33  80 -47   25
       19    Ipswich Town  38  4 10 24  36  82 -46   22
       20     Southampton  38  2  6 30  26  86 -60   12

Section 5.1: European Qualification and Relegation Zones

============================================================
🎯 QUALIFICATION AND RELEGATION ZONES
============================================================

🏆 CHAMPIONS LEAGUE QUALIFIERS (Top 5):
   1. Liverpool - 84 points
   2. Arsenal - 74 points
   3. Manchester City - 71 points
   4. Chelsea - 69 points
   5. Newcastle Utd - 66 points

🥈 EUROPA LEAGUE QUALIFIERS (6th-7th):
   6. Aston Villa - 66 points
   7. Nott'ham Forest - 65 points

⬇️ RELEGATED TEAMS:
   18. Leicester City - 25 points
   19. Ipswich Town - 22 points
   20. Southampton - 12 points
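The zone assignment above amounts to a lookup on final league position. The following sketch mirrors the cut-offs shown in the output (top 5, 6th-7th, bottom 3). The function name `classify_zone` and the "Mid-table" label are hypothetical, and the cut-offs are taken from this notebook's output rather than from any official rulebook.

```python
def classify_zone(position: int) -> str:
    """Map a final league position (1-20) to the zone used in the output above."""
    if not 1 <= position <= 20:
        raise ValueError("position must be between 1 and 20")
    if position <= 5:
        return "Champions League"
    if position <= 7:
        return "Europa League"
    if position >= 18:
        return "Relegated"
    return "Mid-table"


# A few rows from the final table: (position, squad, points).
standings = [
    (1, "Liverpool", 84),
    (6, "Aston Villa", 66),
    (12, "Crystal Palace", 53),
    (18, "Leicester City", 25),
]

for position, squad, points in standings:
    print(f"{position:>2}. {squad} ({points} pts): {classify_zone(position)}")
```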

Section 5.2: League Table Visualization

[Figure: league table visualization]

Section 6: Team Performance Analysis

This section moves beyond the final standings to conduct a detailed examination of team performance. The analysis is broken down into three key areas: attacking prowess, defensive solidity, and overall team efficiency, using both traditional and advanced metrics like Expected Goals (xG).

🔥 ATTACKING PERFORMANCE ANALYSIS
============================================================

⚽ TOP 5 ATTACKING TEAMS:
          Squad  GF  MP  Goals_Per_Game
      Liverpool  86  38            2.26
Manchester City  72  38            1.89
        Arsenal  69  38            1.82
  Newcastle Utd  68  38            1.79
       Brighton  66  38            1.74

📈 TOP 5 GOAL OVERPERFORMERS (vs Expected Goals):
          Squad  GF   xG  xG_Difference
Nott'ham Forest  58 45.5           12.5
         Wolves  54 43.7           10.3
        Arsenal  69 59.9            9.1
       Brighton  66 58.7            7.3
      Brentford  66 59.0            7.0
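The overperformance figures above are consistent with a plain subtraction of expected goals from goals scored. A minimal sketch, assuming `xG_Difference = GF - xG` and using the five rows shown in the table:

```python
import pandas as pd

teams = pd.DataFrame({
    "Squad": ["Nott'ham Forest", "Wolves", "Arsenal", "Brighton", "Brentford"],
    "GF": [58, 54, 69, 66, 66],
    "xG": [45.5, 43.7, 59.9, 58.7, 59.0],
})

# Positive values mean the team scored more than its chances suggested.
teams["xG_Difference"] = (teams["GF"] - teams["xG"]).round(1)

top_overperformers = teams.sort_values("xG_Difference", ascending=False)
print(top_overperformers.to_string(index=False))
```

A large positive gap can reflect strong finishing, but over a single season it can also reflect variance, so it is best read alongside the underlying xG figure.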

Section 6.1: Attacking Performance Visualization

📊 Creating attacking performance visualizations...
[Figure: attacking performance charts with team labels]
✅ Attacking visualizations created with team labels!

Section 6.2: Defensive Performance Analysis

============================================================
🛡️ DEFENSIVE PERFORMANCE ANALYSIS
============================================================

🛡️ BEST DEFENSIVE TEAMS (Fewest Goals Conceded):
          Squad  GA  MP  Goals_Against_Per_Game
        Arsenal  34  38                    0.89
      Liverpool  41  38                    1.08
        Chelsea  43  38                    1.13
Manchester City  44  38                    1.16
        Everton  44  38                    1.16

📊 BEST GOAL DIFFERENCE:
          Squad  GD  GF  GA
      Liverpool  45  86  41
        Arsenal  35  69  34
Manchester City  28  72  44
        Chelsea  21  64  43
  Newcastle Utd  21  68  47

Section 6.3: Team Efficiency Analysis

============================================================
⚡ TEAM EFFICIENCY ANALYSIS
============================================================

⚡ MOST EFFICIENT TEAMS (Points per Game):
          Squad  Points_Per_Game  Win_Rate  Goals_Per_Game
      Liverpool             2.21     65.79            2.26
        Arsenal             1.95     52.63            1.82
Manchester City             1.87     55.26            1.89
        Chelsea             1.82     52.63            1.68
  Newcastle Utd             1.74     52.63            1.79

📈 BIGGEST OVERPERFORMERS (Actual vs Expected):
          Squad  GD  xGD  Performance_vs_Expected
Nott'ham Forest  12 -3.4                     15.4
        Arsenal  35 25.5                      9.5
Manchester City  28 20.4                      7.6
      Brentford   9  3.6                      5.4
      Tottenham  -1 -4.5                      3.5

Section 7: Statistical Insights and Correlations

To understand the underlying drivers of success, this section applies statistical methods to the team data. We explore the relationships between different performance metrics using correlation heatmaps and rank-based visualizations to identify which factors are most strongly associated with winning points and a positive goal difference.

🔍 CALCULATING STATISTICAL CORRELATIONS
============================================================

⭐ FACTORS MOST CORRELATED WITH POINTS:
----------------------------------------
  • Points_Per_Game     :  1.000
  • Pts/MP              :  1.000
  • Win_Rate            :  0.988
  • W                   :  0.988
  • Goal_Difference_Per_Game:  0.970
  • GD                  :  0.970
  • xGD/90              :  0.944

🎯 FACTORS MOST CORRELATED WITH GOAL DIFFERENCE:
----------------------------------------
  • Goal_Difference_Per_Game:  1.000
  • xGD                 :  0.975
  • xGD/90              :  0.975
  • Points_Per_Game     :  0.970
  • Pts                 :  0.970
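A ranking like the one above can be produced with a Pearson correlation matrix. The sketch below uses only the `W`, `GD`, and `Pts` columns, copied from the final league table in Section 5 (Liverpool first, Southampton last). Restricting it to three columns is a simplification for illustration, not the notebook's full feature set.

```python
import pandas as pd

teams = pd.DataFrame({
    "W":   [25, 20, 21, 20, 20, 19, 19, 16, 15, 16,
            15, 13, 11, 11, 11, 12, 11, 6, 4, 2],
    "GD":  [45, 35, 28, 21, 21, 7, 12, 7, 12, 9,
            0, 0, -2, -16, -10, -15, -1, -47, -46, -60],
    "Pts": [84, 74, 71, 69, 66, 66, 65, 61, 56, 56,
            54, 53, 48, 43, 42, 42, 38, 25, 22, 12],
})

# Pearson correlation of each remaining column with points, strongest first.
corr_with_pts = (
    teams.corr()["Pts"]
    .drop("Pts")
    .sort_values(ascending=False)
)
print(corr_with_pts.round(3))
```

Because every team played 38 matches, `Points_Per_Game` and `Win_Rate` are rescaled copies of `Pts` and `W`. Their correlations in the list above therefore carry no information beyond the raw columns.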

Section 7.1.1: Correlation Heatmap Visualization

📊 Creating enhanced correlation visualizations...
[Figure: correlation heatmap]
✅ Enhanced correlation visualizations created!

Section 7.1.2: Team Rankings Heatmap

📊 Creating comprehensive team rankings heatmap...
[Figure: team rankings heatmap]
✅ Team rankings heatmap created!

🏆 CATEGORY LEADERS:
  • Overall: Liverpool
  • Attack: Liverpool
  • Defense: Arsenal
  • Efficiency: Liverpool
  • Consistency: Liverpool

Section 7.2: League Statistical Summary

============================================================
📈 LEAGUE STATISTICAL SUMMARY
============================================================

🏟️  SEASON OVERVIEW:
  • Total Goals Scored: 1,115
  • Total Games Played: 380
  • Average Goals per Game: 2.93

📊 POINTS DISTRIBUTION:
  • Average Points: 52.4
  • Standard Deviation: 18.6
  • Points Range: 72 points
  • League Competitiveness: ⚖️ Moderately Competitive

⚽ GOALS ANALYSIS:
  • Highest Scoring Team: Liverpool (86 goals)
  • Best Defensive Team: Arsenal (34 conceded)
  • Goal Difference Range: -60 to +45
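The season totals above follow directly from the league table. A small sketch using only the standard library, with `GF` and `Pts` copied from the final table in Section 5. The games count assumes each match involves two of the 20 teams, and the standard deviation is the sample version, as reported by pandas `describe()`.

```python
from statistics import mean, stdev

goals_for = [86, 69, 72, 64, 68, 58, 58, 66, 58, 66,
             54, 51, 42, 46, 44, 54, 64, 33, 36, 26]
points = [84, 74, 71, 69, 66, 66, 65, 61, 56, 56,
          54, 53, 48, 43, 42, 42, 38, 25, 22, 12]
matches_per_team = 38

total_goals = sum(goals_for)
# Every match is counted once per team, so halve the team-level total.
total_games = len(goals_for) * matches_per_team // 2
avg_goals_per_game = total_goals / total_games

points_mean = mean(points)
points_std = stdev(points)               # sample standard deviation (n - 1)
points_range = max(points) - min(points)

print(f"Total goals: {total_goals:,}")
print(f"Total games: {total_games}")
print(f"Average goals per game: {avg_goals_per_game:.2f}")
print(f"Points mean / std / range: {points_mean:.1f} / {points_std:.1f} / {points_range}")
```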

Section 7.3: Performance Distribution Visualization

📊 Creating performance distribution visualizations...
[Figure: performance distribution charts with team names]
✅ Performance distribution visualizations created with team names!

Section 8: Comprehensive Premier League Player Analysis

This section provides a detailed analysis of individual player performance across all positions. We examine top performers, efficiency metrics, and positional effectiveness to identify standout players and performance patterns.

🔍 LOADING AND PREPARING PREMIER LEAGUE PLAYER DATA
======================================================================
✅ Found 574 Premier League players!
Teams represented: 20
Positions: {'DF': 186, 'MF': 112, 'FW': 85, 'FW,MF': 60, 'GK': 44, 'MF,FW': 44, 'DF,MF': 16, 'MF,DF': 13, 'FW,DF': 7, 'DF,FW': 7}
📊 Active players (90+ minutes): 506

Section 8.1: Top Scorers and Goal Contributors

======================================================================
⚽ TOP GOAL SCORERS AND CREATORS ANALYSIS
======================================================================

🔥 TOP 15 GOAL SCORERS:
              Player           Squad   Pos  Gls  Ast  MP  Goals_Per_90   xG  Goals_vs_xG
       Mohamed Salah       Liverpool    FW   29   18  38          0.77 25.2          3.8
      Alexander Isak   Newcastle Utd    FW   23    6  34          0.75 20.3          2.7
      Erling Haaland Manchester City    FW   22    3  31          0.72 22.0          0.0
        Bryan Mbeumo       Brentford    FW   20    7  38          0.53 12.3          7.7
          Chris Wood Nott'ham Forest    FW   20    3  36          0.61 13.4          6.6
         Yoane Wissa       Brentford    FW   19    4  35          0.59 18.5          0.5
       Ollie Watkins     Aston Villa    FW   16    8  38          0.55 15.3          0.7
       Matheus Cunha          Wolves MF,FW   15    6  33          0.52  8.6          6.4
         Cole Palmer         Chelsea MF,FW   15    8  37          0.42 17.3         -2.3
Jean-Philippe Mateta  Crystal Palace    FW   14    2  37          0.48 13.5          0.5
Jørgen Strand Larsen          Wolves    FW   14    4  35          0.49 10.3          3.7
        Jarrod Bowen        West Ham FW,MF   13    8  34          0.39  8.6          4.4
           Luis Díaz       Liverpool    FW   13    5  36          0.49 12.0          1.0
          Liam Delap    Ipswich Town    FW   12    2  37          0.42  9.3          2.7
        Raúl Jiménez          Fulham    FW   12    3  38          0.43 12.0          0.0

🎯 TOP 15 ASSIST PROVIDERS:
            Player           Squad   Pos  Ast  Gls  MP  Assists_Per_90  xAG  Assists_vs_xAG
     Mohamed Salah       Liverpool    FW   18   29  38            0.48 14.2             3.8
      Jacob Murphy   Newcastle Utd    FW   12    8  35            0.46  8.9             3.1
    Anthony Elanga Nott'ham Forest FW,MF   11    6  38            0.40  5.7             5.3
  Mikkel Damsgaard       Brentford MF,FW   10    2  38            0.31  8.4             1.6
   Bruno Fernandes  Manchester Utd    MF   10    8  36            0.30  8.5             1.5
  Antonee Robinson          Fulham    DF   10    0  36            0.28  4.2             5.8
     Morgan Rogers     Aston Villa FW,MF   10    8  37            0.29  7.8             2.2
       Bukayo Saka         Arsenal FW,MF   10    6  25            0.52  7.6             2.4
     Son Heung-min       Tottenham    FW    9    7  30            0.38  8.2             0.8
      Jarrod Bowen        West Ham FW,MF    8   13  34            0.24  6.8             1.2
      Eberechi Eze  Crystal Palace MF,FW    8    8  34            0.28  6.2             1.8
Morgan Gibbs-White Nott'ham Forest    MF    8    7  34            0.26  5.0             3.0
       Cole Palmer         Chelsea MF,FW    8   15  37            0.23 10.9            -2.9
             Sávio Manchester City FW,MF    8    1  29            0.41  6.9             1.1
     Ollie Watkins     Aston Villa    FW    8   16  38            0.28  3.3             4.7

🏅 TOP 15 GOAL CONTRIBUTORS (Goals + Assists):
         Player           Squad   Pos  Gls  Ast  Goal_Contributions  Goal_Contributions_Per_90  MP
  Mohamed Salah       Liverpool    FW   29   18                  47                       1.25  38
 Alexander Isak   Newcastle Utd    FW   23    6                  29                       0.95  34
   Bryan Mbeumo       Brentford    FW   20    7                  27                       0.71  38
 Erling Haaland Manchester City    FW   22    3                  25                       0.82  31
  Ollie Watkins     Aston Villa    FW   16    8                  24                       0.83  38
    Cole Palmer         Chelsea MF,FW   15    8                  23                       0.65  37
    Yoane Wissa       Brentford    FW   19    4                  23                       0.71  35
     Chris Wood Nott'ham Forest    FW   20    3                  23                       0.70  36
   Jarrod Bowen        West Ham FW,MF   13    8                  21                       0.64  34
  Matheus Cunha          Wolves MF,FW   15    6                  21                       0.73  33
   Jacob Murphy   Newcastle Utd    FW    8   12                  20                       0.76  35
      Luis Díaz       Liverpool    FW   13    5                  18                       0.67  36
Bruno Fernandes  Manchester Utd    MF    8   10                  18                       0.54  36
Justin Kluivert     Bournemouth    MF   12    6                  18                       0.69  34
  Morgan Rogers     Aston Villa FW,MF    8   10                  18                       0.52  37

⚡ MOST EFFICIENT SCORERS (Goals per 90 min, 500+ minutes):
        Player           Squad   Pos  Gls  Goals_Per_90  Min  xG_Per_90
   Jáder Durán     Aston Villa    FW    7          0.99  638       0.69
 Mohamed Salah       Liverpool    FW   29          0.77 3371       0.67
Alexander Isak   Newcastle Utd    FW   23          0.75 2756       0.66
 Rodrigo Muniz          Fulham    FW    8          0.75  964       0.54
Erling Haaland Manchester City    FW   22          0.72 2736       0.72
   Richarlison       Tottenham    FW    4          0.71  504       0.66
Ryan Sessegnon          Fulham FW,DF    4          0.62  580       0.25
    Chris Wood Nott'ham Forest    FW   20          0.61 2959       0.41
   Yoane Wissa       Brentford    FW   19          0.59 2919       0.57
 Ollie Watkins     Aston Villa    FW   16          0.55 2598       0.53

Section 8.2: Positional Performance Analysis

======================================================================
📍 PERFORMANCE ANALYSIS BY POSITION
======================================================================

📊 STATISTICS BY POSITION:
        Gls             Ast       Goal_Contributions_Per_90 xG_Per_90  \
      count  sum  mean  sum  mean                      mean      mean   
Pos                                                                     
DF      168  118  0.70  151  0.90                      0.08      0.04   
DF,FW     5    5  1.00    5  1.00                      0.18      0.08   
DF,MF    16    8  0.50   16  1.00                      0.13      0.08   
FW       72  452  6.28  173  2.40                      0.46      0.34   
FW,DF     4    6  1.50    3  0.75                      0.35      0.17   
FW,MF    54  179  3.31  141  2.61                      0.38      0.25   
GK       42    0  0.00    9  0.21                      0.01      0.00   
MF       93  168  1.81  181  1.95                      0.20      0.12   
MF,DF    12   18  1.50   17  1.42                      0.13      0.08   
MF,FW    40  127  3.18  107  2.68                      0.32      0.21   

      xAG_Per_90     MP  
            mean   mean  
Pos                      
DF          0.05  21.07  
DF,FW       0.16  22.80  
DF,MF       0.07  16.88  
FW          0.14  24.71  
FW,DF       0.10  13.00  
FW,MF       0.17  23.89  
GK          0.00  18.29  
MF          0.11  25.37  
MF,DF       0.10  25.58  
MF,FW       0.16  23.75  
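The two-level column header in the table above is what a pandas `groupby(...).agg(...)` with several aggregations per column produces. The sketch below illustrates that pattern on four players whose figures appear in Section 8.1. It also shows one plausible way to pick the top contributor per position with `idxmax`, though the notebook may do this differently.

```python
import pandas as pd

players = pd.DataFrame({
    "Player": ["Mohamed Salah", "Alexander Isak", "Bruno Fernandes", "Antonee Robinson"],
    "Pos": ["FW", "FW", "MF", "DF"],
    "Gls": [29, 23, 8, 0],
    "Ast": [18, 6, 10, 10],
    "MP": [38, 34, 36, 36],
})
players["Goal_Contributions"] = players["Gls"] + players["Ast"]

# Several aggregations per column yield a (column, statistic) MultiIndex header.
by_pos = players.groupby("Pos").agg({
    "Gls": ["count", "sum", "mean"],
    "Ast": ["sum", "mean"],
    "MP": ["mean"],
}).round(2)
print(by_pos)

# Row label of the highest Goal_Contributions within each position group.
top_idx = players.groupby("Pos")["Goal_Contributions"].idxmax()
top_by_pos = players.loc[top_idx].set_index("Pos")
print(top_by_pos[["Player", "Gls", "Ast", "Goal_Contributions"]])
```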

⭐ TOP PERFORMERS BY POSITION:

DF (Top Contributor):
  🏆 Rayan Aït-Nouri (Wolves)
     4G + 7A = 11 contributions
     0.32 per 90min

DF,FW (Top Contributor):
  🏆 Keane Lewis-Potter (Brentford)
     1G + 3A = 4 contributions
     0.12 per 90min

DF,MF (Top Contributor):
  🏆 Matheus Nunes (Manchester City)
     1G + 6A = 7 contributions
     0.38 per 90min

FW (Top Contributor):
  🏆 Mohamed Salah (Liverpool)
     29G + 18A = 47 contributions
     1.25 per 90min

FW,DF (Top Contributor):
  🏆 Ryan Sessegnon (Fulham)
     4G + 2A = 6 contributions
     0.94 per 90min

FW,MF (Top Contributor):
  🏆 Jarrod Bowen (West Ham)
     13G + 8A = 21 contributions
     0.64 per 90min

GK (Top Contributor):
  🏆 Ederson (Manchester City)
     0G + 4A = 4 contributions
     0.16 per 90min

MF (Top Contributor):
  🏆 Bruno Fernandes (Manchester Utd)
     8G + 10A = 18 contributions
     0.54 per 90min

MF,DF (Top Contributor):
  🏆 Jack Hinshelwood (Brighton)
     5G + 2A = 7 contributions
     0.34 per 90min

MF,FW (Top Contributor):
  🏆 Cole Palmer (Chelsea)
     15G + 8A = 23 contributions
     0.65 per 90min

Section 8.3: Shooting and Attacking Metrics

======================================================================
🎯 SHOOTING ANALYSIS
======================================================================

🎯 BEST SHOT ACCURACY (10+ shots):
              Player          Squad  Sh  SoT  Shot_Accuracy  Gls
    Nathan Broadhead   Ipswich Town  13    8           61.5    2
Jørgen Strand Larsen         Wolves  54   33           61.1   14
  Riccardo Calafiori        Arsenal  10    6           60.0    2
      Justin Devenny Crystal Palace  10    6           60.0    1
       Donyell Malen    Aston Villa  15    9           60.0    3
       Ethan Pinnock      Brentford  10    6           60.0    2
      Ryan Sessegnon         Fulham  14    8           57.1    4
     Sammie Szmodics   Ipswich Town  21   12           57.1    4
     Marcus Rashford Manchester Utd  16    9           56.2    4
    Jack Hinshelwood       Brighton  18   10           55.6    5

💥 BEST CONVERSION RATE (10+ shots):
              Player           Squad  Sh  Gls  Conversion_Rate   xG
          Chris Wood Nott'ham Forest  65   20             30.8 13.4
       Michael Keane         Everton  10    3             30.0  0.6
      Ryan Sessegnon          Fulham  14    4             28.6  1.6
    Jack Hinshelwood        Brighton  18    5             27.8  2.7
     Trevoh Chalobah  Crystal Palace  11    3             27.3  1.2
Jørgen Strand Larsen          Wolves  54   14             25.9 10.3
       Iliman Ndiaye         Everton  35    9             25.7  6.2
        Bryan Mbeumo       Brentford  79   20             25.3 12.3
        James Mcatee Manchester City  12    3             25.0  2.8
     Marcus Rashford  Manchester Utd  16    4             25.0  1.7

📈 BIGGEST OVERPERFORMERS (Goals vs xG):
              Player           Squad  Gls   xG  Goals_vs_xG
        Bryan Mbeumo       Brentford   20 12.3          7.7
          Chris Wood Nott'ham Forest   20 13.4          6.6
       Matheus Cunha          Wolves   15  8.6          6.4
        Jarrod Bowen        West Ham   13  8.6          4.4
          Alex Iwobi          Fulham    9  4.7          4.3
       Mateo Kovačić Manchester City    6  1.9          4.1
       Mohamed Salah       Liverpool   29 25.2          3.8
Jørgen Strand Larsen          Wolves   14 10.3          3.7
         Amad Diallo  Manchester Utd    8  4.7          3.3
      James Maddison       Tottenham    9  5.8          3.2
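The shooting metrics in this section are consistent with the following definitions, which are assumptions inferred from the reported figures: shot accuracy is shots on target divided by shots, conversion rate is goals divided by shots, and both apply only to players with at least 10 shots. The first four rows use values from the tables above; "Hypothetical Low Volume" is an invented row that shows why the shot threshold matters.

```python
import pandas as pd

shooters = pd.DataFrame({
    "Player": ["Jørgen Strand Larsen", "Ryan Sessegnon", "Marcus Rashford",
               "Jack Hinshelwood", "Hypothetical Low Volume"],
    "Sh": [54, 14, 16, 18, 2],
    "SoT": [33, 8, 9, 10, 2],
    "Gls": [14, 4, 4, 5, 1],
    "xG": [10.3, 1.6, 1.7, 2.7, 0.3],
})

# Apply the volume threshold first so tiny samples cannot top the rankings.
qualified = shooters[shooters["Sh"] >= 10].copy()
qualified["Shot_Accuracy"] = qualified["SoT"] / qualified["Sh"] * 100
qualified["Conversion_Rate"] = qualified["Gls"] / qualified["Sh"] * 100
qualified["Goals_vs_xG"] = qualified["Gls"] - qualified["xG"]

best_accuracy = qualified.sort_values("Shot_Accuracy", ascending=False)
best_conversion = qualified.sort_values("Conversion_Rate", ascending=False)
print(best_accuracy[["Player", "Sh", "SoT", "Shot_Accuracy"]].round(1))
print(best_conversion[["Player", "Sh", "Gls", "Conversion_Rate"]].round(1))
```

Without the threshold, the invented two-shot player would lead both rankings with 100% accuracy and 50% conversion.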

Section 8.4: Defensive Performance Analysis

======================================================================
🛡️ DEFENSIVE PERFORMANCE ANALYSIS
======================================================================

⚔️ TOP TACKLERS:
                   Player           Squad   Pos  Tkl  TklW  Int  Clr  MP
       Idrissa Gana Gueye         Everton    MF  133    80   48   36  37
             Daniel Muñoz  Crystal Palace    DF  123    80   44  108  37
               João Gomes          Wolves    MF  116    71   25   33  36
        Noussair Mazraoui  Manchester Utd    DF  115    68   34   95  37
           Moisés Caicedo         Chelsea MF,DF  114    73   49   60  38
      Alexis Mac Allister       Liverpool    MF   95    58   22   29  35
         Antonee Robinson          Fulham    DF   95    61   62  133  36
          Elliot Anderson Nott'ham Forest    MF   92    56   31   76  37
                    André          Wolves    MF   91    55   37   30  33
          Tyrick Mitchell  Crystal Palace    DF   91    55   19  108  37
            Neco Williams Nott'ham Forest    DF   90    57   31  137  35
          Rayan Aït-Nouri          Wolves    DF   89    57   26   75  37
         Mateus Fernandes     Southampton    MF   89    48   28   35  36
            Thomas Partey         Arsenal MF,DF   89    53   35   49  35
Victor Bernth Kristiansen  Leicester City    DF   87    51   42   75  30

🔍 TOP INTERCEPTORS:
            Player           Squad   Pos  Int  Tkl  Clr  MP
 Aaron Wan-Bissaka        West Ham    DF   66   70  125  36
  Antonee Robinson          Fulham    DF   62   95  133  36
  Ryan Gravenberch       Liverpool    MF   60   69   59  37
      Jan Bednarek     Southampton    DF   56   36  190  30
   Virgil van Dijk       Liverpool    DF   56   38  190  37
   Maxence Lacroix  Crystal Palace    DF   54   68  207  35
      Dean Huijsen     Bournemouth    DF   51   36  198  32
    Moisés Caicedo         Chelsea MF,DF   49  114   60  38
Christian Nørgaard       Brentford    MF   49   79   70  34
Idrissa Gana Gueye         Everton    MF   48  133   36  37
     Carlos Baleba        Brighton    MF   46   79   44  34
      James Justin  Leicester City    DF   46   58  141  36
      Milos Kerkez     Bournemouth    DF   45   52  106  38
      Fabian Schär   Newcastle Utd    DF   45   36  138  34
    Joško Gvardiol Manchester City    DF   44   58  122  37

🏃 MOST DEFENSIVE ACTIONS PER 90:
         Player           Squad Pos  Defensive_Actions_Per_90  Tkl  Int  Clr
  Harry Toffolo Nott'ham Forest  DF                     12.67    5    2   12
     Willy Boly Nott'ham Forest  DF                     12.35    6    3   12
      Welington     Southampton  DF                     11.85   21    7   36
     James Hill     Bournemouth  DF                     11.00   15    9   31
         Morato Nott'ham Forest  DF                     10.89   22   10   78
 Charlie Taylor     Southampton  DF                     10.75    7    5   31
   Dean Huijsen     Bournemouth  DF                     10.56   36   51  198
 Philip Billing     Bournemouth  MF                     10.50   14    5    2
   Jan Bednarek     Southampton  DF                     10.04   36   56  190
   Ben Chilwell  Crystal Palace  DF                     10.00    7    6   16
 Woyo Coulibaly  Leicester City  DF                     10.00    6    5    1
James Tarkowski         Everton  DF                      9.78   64   41  213
        Murillo Nott'ham Forest  DF                      9.55   53   36  249
    Caleb Okoli  Leicester City  DF                      9.52   25   15   79
Maxence Lacroix  Crystal Palace  DF                      9.51   68   54  207
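
The per-90 figures above are consistent with (Tkl + Int + Clr) divided by the number of full 90s played, which would explain why players with very few minutes (19 actions in roughly one and a half matches, for instance) lead the list. The sketch below shows that calculation with an optional minimum-minutes filter. The `90s` column name, the `min_90s` parameter, and the sample 90s values are assumptions for illustration; they are not taken from the notebook.

```python
import pandas as pd

def defensive_actions_per_90(players: pd.DataFrame, min_90s: float = 0.0) -> pd.DataFrame:
    """Rank players by (Tkl + Int + Clr) per 90 minutes played."""
    # Drop players without playing time to avoid division by zero.
    out = players[players["90s"] > 0].copy()
    # Optional filter: require a minimum number of full matches played.
    out = out[out["90s"] >= min_90s]
    total_actions = out["Tkl"] + out["Int"] + out["Clr"]
    out["Defensive_Actions_Per_90"] = (total_actions / out["90s"]).round(2)
    return out.sort_values("Defensive_Actions_Per_90", ascending=False)

sample = pd.DataFrame({
    "Player": ["Harry Toffolo", "Dean Huijsen", "James Tarkowski"],
    "Tkl": [5, 36, 64],
    "Int": [2, 51, 41],
    "Clr": [12, 198, 213],
    "90s": [1.5, 27.0, 32.5],  # assumed values, chosen to match the printed ratios
})

unfiltered = defensive_actions_per_90(sample)
filtered = defensive_actions_per_90(sample, min_90s=10)
print(unfiltered[["Player", "Defensive_Actions_Per_90"]].to_string(index=False))
print(filtered[["Player", "Defensive_Actions_Per_90"]].to_string(index=False))
```

With a threshold such as ten full matches, the ranking is led by regular starters rather than by small-sample outliers.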

Section 8.5: Goalkeeper Analysis¶

======================================================================
🥅 GOALKEEPER ANALYSIS
======================================================================

📊 GOALKEEPER STATISTICS (33 goalkeepers):
           Player           Squad  MP    W    D    L   CS   GA  Save%  Saves
        Matz Sels Nott'ham Forest  38 19.0  8.0 11.0 13.0 46.0   73.9  119.0
       David Raya         Arsenal  38 20.0 14.0  4.0 13.0 34.0   74.2   86.0
  Jordan Pickford         Everton  38 11.0 15.0 12.0 12.0 44.0   73.0  117.0
   Dean Henderson  Crystal Palace  38 13.0 14.0 11.0 11.0 51.0   66.7  101.0
   Robert Sánchez         Chelsea  32 17.0  9.0  6.0 10.0 34.0   76.4   92.0
          Ederson Manchester City  26 16.0  4.0  6.0 10.0 26.0   69.2   53.0
          Alisson       Liverpool  28 18.0  7.0  3.0  9.0 29.0   72.0   73.0
      André Onana  Manchester Utd  34 10.0  9.0 15.0  9.0 44.0   68.9   88.0
Kepa Arrizabalaga     Bournemouth  31 13.0  7.0 11.0  8.0 39.0   73.9   95.0
        Nick Pope   Newcastle Utd  28 13.0  6.0  9.0  8.0 35.0   71.7   86.0
Emiliano Martínez     Aston Villa  37 18.0  9.0 10.0  8.0 45.0   69.0   96.0
          José Sá          Wolves  29 11.0  5.0 13.0  7.0 48.0   63.2   69.0
  Bart Verbruggen        Brighton  36 14.0 13.0  9.0  7.0 58.0   65.7   87.0
     Mark Flekken       Brentford  37 16.0  8.0 13.0  7.0 55.0   73.4  150.0
  Martin Dúbravka   Newcastle Utd  10  7.0  0.0  3.0  5.0 12.0   70.7   29.0
       Bernd Leno          Fulham  38 15.0  9.0 14.0  5.0 54.0   67.9  106.0
  Alphonse Areola        West Ham  26  5.0  7.0 13.0  5.0 41.0   64.3   77.0
Guglielmo Vicario       Tottenham  24  9.0  3.0 12.0  4.0 37.0   64.7   67.0
Caoimhín Kelleher       Liverpool  10  7.0  2.0  1.0  4.0 12.0   67.6   24.0
   Aaron Ramsdale     Southampton  30  2.0  5.0 23.0  3.0 66.0   67.6  120.0
    Stefan Ortega Manchester City  13  5.0  4.0  3.0  3.0 18.0   68.0   33.0
 Łukasz Fabiański        West Ham  14  6.0  3.0  4.0  2.0 21.0   74.6   50.0
 Jakub Stolarczyk  Leicester City  10  3.0  1.0  6.0  2.0 16.0   63.6   28.0
   Mads Hermansen  Leicester City  27  3.0  6.0 18.0  1.0 58.0   64.5   99.0
   Fraser Forster       Tottenham   7  1.0  2.0  4.0  1.0 15.0   69.0   27.0
     Mark Travers     Bournemouth   5  2.0  2.0  1.0  1.0  5.0   80.0   20.0
   Arijanet Muric    Ipswich Town  18  2.0  6.0 10.0  1.0 33.0   70.0   67.0
  Filip Jørgensen         Chelsea   6  3.0  0.0  3.0  1.0  9.0   71.4   19.0
   Antonín Kinský       Tottenham   6  1.0  0.0  5.0  1.0 11.0   65.6   23.0
 Christian Walton    Ipswich Town   7  1.0  1.0  5.0  1.0 19.0   56.4   20.0
    Sam Johnstone          Wolves   7  0.0  1.0  6.0  0.0 17.0   61.5   23.0
    Alex McCarthy     Southampton   5  0.0  0.0  5.0  0.0 13.0   70.3   24.0
      Alex Palmer    Ipswich Town  13  1.0  3.0  9.0  0.0 30.0   59.2   43.0

🏆 BEST CLEAN SHEET %:
           Player           Squad  MP   CS  CS%   GA
  Martin Dúbravka   Newcastle Utd  10  5.0 50.0 12.0
Caoimhín Kelleher       Liverpool  10  4.0 40.0 12.0
          Ederson Manchester City  26 10.0 38.5 26.0
       David Raya         Arsenal  38 13.0 34.2 34.0
        Matz Sels Nott'ham Forest  38 13.0 34.2 46.0
          Alisson       Liverpool  28  9.0 32.1 29.0
  Jordan Pickford         Everton  38 12.0 31.6 44.0
   Robert Sánchez         Chelsea  32 10.0 31.3 34.0
   Dean Henderson  Crystal Palace  38 11.0 28.9 51.0
        Nick Pope   Newcastle Utd  28  8.0 28.6 35.0
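
The CS% column above matches clean sheets divided by matches played, times 100 (5 / 10 = 50.0, 10 / 26 = 38.5). A minimal sketch of that calculation follows; the function name and the `min_matches` parameter are hypothetical, and the rows are copied from the table above.

```python
import pandas as pd

def clean_sheet_pct(keepers: pd.DataFrame, min_matches: int = 1) -> pd.DataFrame:
    """Add a CS% column (clean sheets per match played, in percent) and rank by it."""
    out = keepers[keepers["MP"] >= min_matches].copy()
    out["CS%"] = (out["CS"] / out["MP"] * 100).round(1)
    return out.sort_values("CS%", ascending=False)

keepers = pd.DataFrame({
    "Player": ["Martin Dúbravka", "Ederson", "David Raya", "Nick Pope"],
    "MP": [10, 26, 38, 28],
    "CS": [5, 10, 13, 8],
})

ranked = clean_sheet_pct(keepers)
regulars = clean_sheet_pct(keepers, min_matches=20)
print(ranked.to_string(index=False))
```

Raising `min_matches` is one way to keep back-up goalkeepers with only a handful of appearances from leading a percentage-based ranking.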

🤲 BEST SAVE %:
           Player           Squad  Saves  SoTA  Save%   GA
     Mark Travers     Bournemouth   20.0  25.0   80.0  5.0
   Robert Sánchez         Chelsea   92.0 127.0   76.4 34.0
 Łukasz Fabiański        West Ham   50.0  71.0   74.6 21.0
       David Raya         Arsenal   86.0 120.0   74.2 34.0
Kepa Arrizabalaga     Bournemouth   95.0 134.0   73.9 39.0
        Matz Sels Nott'ham Forest  119.0 165.0   73.9 46.0
     Mark Flekken       Brentford  150.0 203.0   73.4 55.0
  Jordan Pickford         Everton  117.0 163.0   73.0 44.0
          Alisson       Liverpool   73.0 100.0   72.0 29.0
        Nick Pope   Newcastle Utd   86.0 120.0   71.7 35.0

Section 8.6: Player Performance Visualizations¶

======================================================================
📊 CREATING PLAYER PERFORMANCE VISUALIZATIONS
======================================================================
[Figure: player performance visualizations (image not included)]

✅ Enhanced player performance visualizations created!

📋 VISUALIZATION LEGEND:
  🟡 Yellow boxes: Top overall contributors
  🔵 Blue boxes: Top goal scorers
  🟢 Green boxes: Top assist providers / xG overperformers
  🔴 Red boxes: xG underperformers

Section 8.7: Team Player Contributions¶

======================================================================
🏟️ TEAM PLAYER CONTRIBUTIONS ANALYSIS
======================================================================

🏆 TEAM GOAL CONTRIBUTIONS:
                 Gls  Ast  Goal_Contributions  Players_Count    xG   xAG
Squad                                                                   
Liverpool         85   65                 150             22  84.1  62.3
Arsenal           67   55                 122             22  62.0  46.6
Manchester City   71   51                 122             25  69.8  54.9
Newcastle Utd     66   50                 116             23  65.0  46.5
Brentford         65   44                 109             22  60.5  42.9
Chelsea           61   47                 108             26  69.3  53.5
Tottenham         61   46                 107             28  59.9  45.0
Brighton          64   41                 105             28  59.5  40.3
Aston Villa       56   45                 101             28  57.6  42.0
Nott'ham Forest   57   42                  99             22  46.7  33.8
Bournemouth       57   41                  98             24  65.4  44.1
Fulham            53   44                  97             23  49.9  37.6
Wolves            53   42                  95             24  44.0  35.2
Crystal Palace    48   38                  86             24  60.7  46.9
West Ham          43   29                  72             25  48.2  34.2
Manchester Utd    42   29                  71             30  53.5  39.8
Everton           39   27                  66             23  42.2  32.7
Ipswich Town      35   26                  61             30  35.0  24.2
Leicester City    33   25                  58             27  32.9  25.2
Southampton       25   16                  41             30  33.0  24.9
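
A table like the one above can be built by grouping the player-level data by squad and summing. The sketch below uses a small hypothetical frame (teams and players are invented placeholders) and assumes `Players_Count` is a plain row count per squad, which the notebook does not confirm.

```python
import pandas as pd

def team_contributions(players: pd.DataFrame) -> pd.DataFrame:
    """Aggregate player goals and assists to squad level."""
    team = players.groupby("Squad").agg(
        Gls=("Gls", "sum"),
        Ast=("Ast", "sum"),
        Players_Count=("Player", "count"),  # assumption: one row per player
    )
    team["Goal_Contributions"] = team["Gls"] + team["Ast"]
    return team.sort_values("Goal_Contributions", ascending=False)

# Hypothetical data for illustration only.
players = pd.DataFrame({
    "Player": ["P1", "P2", "P3", "P4", "P5"],
    "Squad": ["Team A", "Team A", "Team A", "Team B", "Team B"],
    "Gls": [10, 3, 0, 6, 1],
    "Ast": [4, 7, 1, 2, 1],
})

team_table = team_contributions(players)
print(team_table)
```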

⚽ TOP PERFORMER PER TEAM:
Arsenal              | Kai Havertz               | 9G 3A
Aston Villa          | Ollie Watkins             | 16G 8A
Bournemouth          | Justin Kluivert           | 12G 6A
Brentford            | Bryan Mbeumo              | 20G 7A
Brighton             | João Pedro                | 10G 6A
Chelsea              | Cole Palmer               | 15G 8A
Crystal Palace       | Jean-Philippe Mateta      | 14G 2A
Everton              | Iliman Ndiaye             | 9G 0A
Fulham               | Raúl Jiménez              | 12G 3A
Ipswich Town         | Liam Delap                | 12G 2A
Leicester City       | Jamie Vardy               | 9G 4A
Liverpool            | Mohamed Salah             | 29G 18A
Manchester City      | Erling Haaland            | 22G 3A
Manchester Utd       | Amad Diallo               | 8G 6A
Newcastle Utd        | Alexander Isak            | 23G 6A
Nott'ham Forest      | Chris Wood                | 20G 3A
Southampton          | Paul Onuachu              | 4G 1A
Tottenham            | Brennan Johnson           | 11G 3A
West Ham             | Jarrod Bowen              | 13G 8A
Wolves               | Matheus Cunha             | 15G 6A
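
One way to produce a per-team list like this is to take, for each squad, the row with the highest value of a chosen metric. Whether the notebook ranks by goals or by goals plus assists is not shown, so the `by` parameter below is an assumption, and the data is a hypothetical placeholder frame.

```python
import pandas as pd

def top_performer_per_team(players: pd.DataFrame, by: str = "Gls") -> pd.DataFrame:
    """Return one row per squad: the player with the highest value of `by`."""
    # idxmax gives, per squad, the index label of the row with the maximum value.
    best_rows = players.groupby("Squad")[by].idxmax()
    cols = ["Squad", "Player", "Gls", "Ast"]
    return players.loc[best_rows, cols].sort_values("Squad").reset_index(drop=True)

# Hypothetical data for illustration only.
players = pd.DataFrame({
    "Player": ["P1", "P2", "P3", "P4", "P5"],
    "Squad": ["Team A", "Team A", "Team A", "Team B", "Team B"],
    "Gls": [10, 3, 0, 6, 1],
    "Ast": [4, 7, 1, 2, 1],
})

top = top_performer_per_team(players)
for row in top.itertuples(index=False):
    print(f"{row.Squad:<20} | {row.Player:<25} | {row.Gls}G {row.Ast}A")
```

On ties, `idxmax` keeps the first matching row, so a deliberate tiebreak (assists, for example) would need an explicit sort beforehand.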

Section 9: Applying Data Science - Inference & Modeling¶

This section elevates the project from a descriptive analysis to a data science investigation. Instead of only observing what happened, we use statistical tools to understand why it might have happened and to quantify the relationships between different performance variables. This includes formal hypothesis testing, building an explanatory regression model, and using unsupervised learning to discover player archetypes.

Section 9.1: Statistical Hypothesis Testing (Inferential Statistics)¶

Hypothesis: Do teams that finish in the Top 4 (Champions League spots) have a significantly higher Goal Difference (GD) than the rest of the league?

🔬 HYPOTHESIS TEST: Does Goal Difference define Top 4 teams?
============================================================
H₀ (Null Hypothesis): Top 4 teams and the rest of the league have the same mean Goal Difference.
H₁ (Alternative Hypothesis): The mean Goal Difference of Top 4 teams differs from that of the rest of the league.
------------------------------------------------------------
T-statistic: 5.16
P-value: 0.0002

✅ Conclusion: The result is statistically significant (p < 0.05).
   We REJECT the null hypothesis. Top 4 teams have a significantly different Goal Difference.
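
For readers who want to reproduce this kind of comparison, here is a minimal sketch using `scipy.stats.ttest_ind`. The goal-difference values are illustrative placeholders, not the season's data, and the notebook does not state whether it used the pooled-variance test or Welch's variant, so both choices below are assumptions. The one-sided call corresponds to the directional question ("higher"), while the two-sided call corresponds to H₁ as printed.

```python
from scipy import stats

# Illustrative goal differences for 20 teams (placeholders, not real data).
top4_gd = [45, 35, 28, 27]
rest_gd = [20, 12, 9, 7, 5, 2, 0, -3, -6, -9, -12, -16, -21, -31, -40, -52]

# Two-sided test: is the mean GD different between the groups?
t_two, p_two = stats.ttest_ind(top4_gd, rest_gd)

# One-sided test: is the Top 4 mean GD greater than the rest?
t_one, p_one = stats.ttest_ind(top4_gd, rest_gd, alternative="greater")

# Welch's variant drops the equal-variance assumption.
t_welch, p_welch = stats.ttest_ind(top4_gd, rest_gd, equal_var=False)

alpha = 0.05
print(f"two-sided: t={t_two:.2f}, p={p_two:.4f}, reject H0: {p_two < alpha}")
print(f"one-sided: t={t_one:.2f}, p={p_one:.4f}, reject H0: {p_one < alpha}")
print(f"Welch:     t={t_welch:.2f}, p={p_welch:.4f}, reject H0: {p_welch < alpha}")
```

With only four teams in one group, the result is sensitive to the variance assumption, so reporting which variant was used makes the p-value easier to interpret.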

Section 9.2: Explanatory Modeling with Linear Regression¶

Goal: Build a model to explain how Goals For (GF), Goals Against (GA), and Expected Goal Difference (xGD) contribute to the final Points (Pts) tally.

 EXPLANATORY MODEL: What factors drive league points?
============================================================
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                    Pts   R-squared:                       0.942
Model:                            OLS   Adj. R-squared:                  0.931
Method:                 Least Squares   F-statistic:                     86.38
Date:                Tue, 08 Jul 2025   Prob (F-statistic):           4.25e-10
Time:                        15:33:11   Log-Likelihood:                -57.857
No. Observations:                  20   AIC:                             123.7
Df Residuals:                      16   BIC:                             127.7
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     57.4780     11.684      4.920      0.000      32.710      82.246
GF             0.6552      0.203      3.231      0.005       0.225       1.085
GA            -0.7471      0.227     -3.290      0.005      -1.229      -0.266
xGD           -0.0407      0.219     -0.186      0.854      -0.504       0.423
==============================================================================
Omnibus:                        8.548   Durbin-Watson:                   2.172
Prob(Omnibus):                  0.014   Jarque-Bera (JB):                6.408
Skew:                          -0.909   Prob(JB):                       0.0406
Kurtosis:                       5.095   Cond. No.                         848.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

📊 INTERPRETATION:
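 - R-squared: The share of the variance in Points explained by the model; here 0.942 (adjusted 0.931).
 - Coef: The expected change in Points per one-unit change in a factor, holding the others fixed; here about +0.66 per goal scored (GF) and -0.75 per goal conceded (GA).
 - P>|t|: A low p-value (< 0.05) suggests the factor is statistically significant; GF and GA qualify (both 0.005), while xGD does not (0.854).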
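
The "Intercept" row in the summary suggests the model was fitted through the statsmodels formula interface. The sketch below fits the same specification, `Pts ~ GF + GA + xGD`, on synthetic data; every number in it is randomly generated for illustration, and the generating coefficients are arbitrary assumptions rather than the notebook's estimates.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_teams = 20

# Synthetic season: goals for/against, an xGD that tracks actual goal difference,
# and points generated from GF and GA plus noise (all values are made up).
gf = rng.integers(25, 90, size=n_teams)
ga = rng.integers(30, 90, size=n_teams)
xgd = 0.8 * (gf - ga) + rng.normal(0, 5, size=n_teams)
pts = 52 + 0.65 * gf - 0.75 * ga + rng.normal(0, 3, size=n_teams)

teams = pd.DataFrame({"GF": gf, "GA": ga, "xGD": xgd, "Pts": pts})

# Ordinary least squares with an intercept (added automatically by the formula API).
model = smf.ols("Pts ~ GF + GA + xGD", data=teams).fit()

print(model.params.round(3))
print(f"R-squared: {model.rsquared:.3f}, residual df: {int(model.df_resid)}")
```

Because xGD is closely related to GF minus GA, the three predictors overlap heavily; that overlap is a plausible reason for the wide confidence interval on xGD in the summary above, though the notebook does not investigate it.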

Section 9.3: Unsupervised Learning for Player Profiling (Clustering)¶

Goal: Identify different profiles of attacking players (e.g., "Pure Finishers", "Creative Forwards", "All-Rounders") using K-Means clustering.

👥 PLAYER PROFILING: Identifying attacker archetypes
============================================================
✅ Successfully clustered 67 attackers into 3 profiles.

--- Cluster 0: Profile ---
Goals_Per_90              0.434474
Assists_Per_90            0.132803
Sh/90                     2.689615
PrgC_stats_possession    34.269231
KP                       20.423077
dtype: float64

Sample Players in this Profile:
               Player        Squad
                 Beto      Everton
Dominic Calvert-Lewin      Everton
           Liam Delap Ipswich Town

--- Cluster 1: Profile ---
Goals_Per_90               0.384251
Assists_Per_90             0.243794
Sh/90                      2.850526
PrgC_stats_possession    107.578947
KP                        51.684211
dtype: float64

Sample Players in this Profile:
         Player           Squad
  Harvey Barnes   Newcastle Utd
  Matheus Cunha          Wolves
Kevin De Bruyne Manchester City

--- Cluster 2: Profile ---
Goals_Per_90              0.130380
Assists_Per_90            0.118158
Sh/90                     1.477727
PrgC_stats_possession    50.045455
KP                       26.772727
dtype: float64

Sample Players in this Profile:
                Player       Squad
        Cameron Archer Southampton
Jean-Ricner Bellegarde      Wolves
      Mikkel Damsgaard   Brentford
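
A minimal sketch of this kind of profiling with scikit-learn is shown below. It standardizes the five features before clustering, since `PrgC_stats_possession` and `KP` are on a much larger scale than the per-90 rates; whether the notebook scales its features is not shown, so that step is an assumption. The player rows are synthetic, drawn around the printed cluster means, with group sizes (26, 19, 22) chosen to be consistent with those means.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

FEATURES = ["Goals_Per_90", "Assists_Per_90", "Sh/90", "PrgC_stats_possession", "KP"]

def profile_attackers(attackers: pd.DataFrame, n_clusters: int = 3, seed: int = 42) -> pd.DataFrame:
    """Assign each attacker to a K-Means cluster computed on standardized features."""
    scaled = StandardScaler().fit_transform(attackers[FEATURES])
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(scaled)
    out = attackers.copy()
    out["Cluster"] = labels
    return out

# Synthetic attackers drawn around three centres (illustration only).
rng = np.random.default_rng(0)
centres = np.array([
    [0.43, 0.13, 2.69, 34.3, 20.4],
    [0.38, 0.24, 2.85, 107.6, 51.7],
    [0.13, 0.12, 1.48, 50.0, 26.8],
])
sizes = [26, 19, 22]
spread = np.array([0.08, 0.05, 0.4, 10.0, 6.0])  # assumed within-group spread
rows = np.vstack([rng.normal(c, spread, size=(n, 5)) for c, n in zip(centres, sizes)])
attackers = pd.DataFrame(rows, columns=FEATURES)

clustered = profile_attackers(attackers)
profiles = clustered.groupby("Cluster")[FEATURES].mean()
print(profiles.round(2))
```

Cluster numbers from K-Means are arbitrary labels, so names such as "Pure Finishers" have to be assigned afterwards by reading the per-cluster means.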

Section 10: Season Summary and Export¶

This section consolidates the findings from the preceding analyses into a season summary, then exports the key tables and an executive summary to files.

================================================================================
🏁 PREMIER LEAGUE 2024-25 COMPREHENSIVE SEASON SUMMARY
================================================================================

🏆 SEASON HIGHLIGHTS:
  👑 Champion: Liverpool
     • Final Points: 84
     • Goal Difference: +45
     • Win Rate: 65.8%

  ⬇️ Relegated Teams:
     18. Leicester City - 25 points
     19. Ipswich Town - 22 points
     20. Southampton - 12 points

📊 CATEGORY WINNERS:
  ⚽ Best Attack: Liverpool (86 goals)
  🛡️ Best Defense: Arsenal (34 conceded)
  🏆 Most Wins: Liverpool (25 wins)

📈 LEAGUE STATISTICS:
  • Total Goals: 1,115
  • Average Goals per Game: 2.93
  • Points Spread: 72 points
  • Most Competitive Positions: Top 4 & Relegation battles
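
The headline figures above can be checked with simple arithmetic, assuming the standard 20-team double round-robin (each team plays 38 matches, 380 in total). Variable names are illustrative.

```python
teams = 20
matches_per_team = 2 * (teams - 1)        # home and away against every other team
matches = teams * matches_per_team // 2   # each match involves two teams

total_goals = 1115
goals_per_game = round(total_goals / matches, 2)

champion_points, bottom_points = 84, 12
points_spread = champion_points - bottom_points

champion_wins = 25
win_rate = round(champion_wins / matches_per_team * 100, 1)

print(matches, goals_per_game, points_spread, win_rate)
```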

============================================================
💾 EXPORTING ANALYSIS RESULTS
============================================================
✅ Export completed successfully!
📁 Files saved to 'premier_league_2024_25_analysis' directory:
   • final_league_table.csv
   • enhanced_team_statistics.csv
   • executive_summary.txt
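
An export step of this shape could be sketched as follows. The function name, its parameters, and the tiny sample frames are hypothetical; only the directory and file names come from the output above. The example writes into a temporary directory so that it leaves no files behind.

```python
import tempfile
from pathlib import Path

import pandas as pd

def export_results(out_dir, league_table: pd.DataFrame,
                   team_stats: pd.DataFrame, summary_text: str) -> list:
    """Write the two CSV tables and the text summary, returning the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    table_path = out / "final_league_table.csv"
    stats_path = out / "enhanced_team_statistics.csv"
    summary_path = out / "executive_summary.txt"
    league_table.to_csv(table_path, index=False)
    team_stats.to_csv(stats_path, index=False)
    summary_path.write_text(summary_text, encoding="utf-8")
    return [table_path, stats_path, summary_path]

# Minimal sample inputs using figures from the season summary.
league_table = pd.DataFrame({"Squad": ["Liverpool", "Southampton"], "Pts": [84, 12]})
team_stats = pd.DataFrame({"Squad": ["Liverpool"], "GF": [86]})

with tempfile.TemporaryDirectory() as tmp:
    written = export_results(Path(tmp) / "premier_league_2024_25_analysis",
                             league_table, team_stats, "Champion: Liverpool (84 points)")
    names = [p.name for p in written]
    all_exist = all(p.exists() for p in written)

print(names, all_exist)
```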

Project Conclusion & Final Thoughts¶

This project successfully demonstrates an end-to-end data analysis and data science workflow within the exciting domain of sports analytics. By integrating descriptive statistics, advanced visualizations, hypothesis testing, and machine learning, we have extracted deep, multi-faceted insights from the Premier League 2024-25 season data.

The structured approach—from data acquisition and cleaning to team and player analysis, and finally to statistical modeling—provides a robust and reproducible framework that can be adapted for future seasons or other sports leagues.

Final Summary of Project¶

📊 ANALYSIS SCOPE COVERED:

  • ✅ Complete season review and final standings
  • ✅ Detailed team performance metrics and comparisons
  • ✅ Comprehensive player performance analysis across all positions
  • ✅ Statistical insights and correlation analysis
  • ✅ Data science applications (hypothesis testing, regression, clustering)
  • ✅ Interactive visualizations and data storytelling

🚀 TECHNICAL SKILLS DEMONSTRATED:

  • Data acquisition and API integration (Kaggle)
  • Advanced data wrangling and feature engineering (Pandas)
  • Statistical analysis and modeling (SciPy, StatsModels, Scikit-learn)
  • Interactive dashboard creation (Plotly)
  • Data visualization (Matplotlib, Seaborn)
  • Professional coding practices with functional programming

💼 PROJECT VALUE:

  • 🎯 Perfect for demonstrating both Data Analysis and Data Science expertise.
  • 📈 Shows end-to-end analytical thinking, from descriptive to inferential analysis.
  • 🏆 Explores an industry-relevant domain (sports analytics) with real-world data.
  • 🔧 Highlights technical versatility across a wide range of popular data science tools.

📚 Data Sources & Attributions¶

Primary Data Sources¶

This analysis was made possible through the following high-quality datasets:

Team Statistics:

  • Dataset: Premier League 2024-2025 Team Statistics
  • Source: Kaggle Dataset by @sattvikyadav
  • URL: kaggle.com/datasets/sattvikyadav/premier-league-2024-2025-team-statistics
  • License: Open Dataset License
  • Usage: Complete team performance metrics, final league standings, and advanced statistics

Player Statistics:

  • Dataset: Football Players Stats 2024-2025
  • Source: Kaggle Dataset by @hubertsidorowicz
  • URL: kaggle.com/datasets/hubertsidorowicz/football-players-stats-2024-2025
  • License: Open Dataset License
  • Usage: Individual player performance data across all Premier League teams

Data Acknowledgments¶

  • All statistical data is sourced from official Premier League records and verified third-party providers
  • Team and player performance metrics reflect the complete 2024-25 Premier League season
  • Expected Goals (xG) and advanced metrics sourced from professional football analytics providers

🛠️ Technical Stack & Tools¶

Programming & Analysis¶

  • Python 3.x - Primary programming language
  • Jupyter Notebook - Interactive development environment
  • Google Colaboratory - Cloud-based execution platform with GPU acceleration

Data Science Libraries¶

  • pandas 1.5+ - Data manipulation and analysis
  • NumPy 1.21+ - Numerical computing and array operations
  • SciPy - Statistical analysis and hypothesis testing
  • Scikit-learn - Machine learning algorithms and statistical modeling
  • StatsModels - Advanced statistical analysis and regression modeling

Visualization Libraries¶

  • Matplotlib 3.5+ - Static data visualization
  • Seaborn 0.11+ - Statistical data visualization
  • Plotly 5.0+ - Interactive charts and dashboards

Data Acquisition¶

  • Kaggle API - Secure dataset download and management
  • Python requests - HTTP library for data fetching

📖 Methodology References¶

Statistical Methods¶

  • Correlation Analysis: Pearson correlation coefficients for metric relationships
  • Hypothesis Testing: Independent t-tests for group comparisons
  • Linear Regression: Ordinary Least Squares (OLS) for explanatory modeling
  • Clustering Analysis: K-Means clustering for player profiling

Sports Analytics Framework¶

  • Expected Goals (xG) methodology follows industry-standard football analytics practices
  • Per-90-minute metrics calculated using official playing time data
  • Performance efficiency ratios based on established sports science literature

Data Science Best Practices¶

  • Cross-validation techniques for model validation
  • Feature engineering following domain expertise principles
  • Interactive visualization design based on data storytelling principles

🏆 Project Information¶

Project Scope¶

This project demonstrates end-to-end data science capabilities including:

  • Secure data acquisition and preprocessing
  • Exploratory Data Analysis (EDA) and statistical inference
  • Advanced visualization and interactive dashboard creation
  • Machine learning applications in sports analytics
  • Professional reporting and insight generation

Educational Purpose¶

This analysis was conducted for educational and portfolio purposes, showcasing:

  • Technical proficiency in Python and data science tools
  • Domain expertise in sports analytics
  • Statistical analysis and modeling capabilities

⚖️ Legal & Ethical Considerations¶

Data Usage Compliance¶

  • All datasets used are publicly available under open data licenses
  • Data usage complies with Kaggle Terms of Service and dataset-specific licenses
  • No personally identifiable information (PII) was processed or stored
  • Analysis conducted in accordance with data protection principles

Disclaimer¶

  • This analysis is for educational and demonstration purposes only
  • Statistical findings reflect historical data and should not be used for commercial betting or gambling
  • All insights and conclusions are based on available data and may not reflect complete season context

🙏 Acknowledgments¶

Special Thanks¶

  • Kaggle Community for providing high-quality, accessible sports datasets
  • Premier League for maintaining comprehensive statistical records
  • Open Source Community for developing the excellent tools that made this analysis possible

Inspiration¶

This project was inspired by the growing field of sports analytics and the desire to apply data science techniques to understand football performance dynamics. Special recognition to the broader sports analytics community for pioneering statistical approaches to football analysis.