FIFA Soccer Players’ Value Analysis

Group Members: Haochen Chen, Maria Chen

Dataset:

Kaggle Competition Datasets

FIFA Players 2019 data: https://www.kaggle.com/datasets/javagarm/fifa-19-complete-player-dataset
FIFA Players 2020 data: https://www.kaggle.com/datasets/sagunsh/fifa-20-complete-player-dataset
FIFA Players 2021 data: https://www.kaggle.com/datasets/umeshkumar017/fifa-21-player-and-formation-analysis

Introduction of Project:

Football has been drawing more and more attention. FIFA, the international governing body of association football, works to ensure fair competition among players from all countries. The football video game series of the same name bases its player data on real-world data. Each player in the game has a different value, wage and skill rating. Users can buy player cards to build their own team and win matches. This project analyzes the factors that drive changes in player value, to help users get the most valuable player cards at the lowest cost. We use three years of data because we want to see whether COVID-19 affected players' value and what changed across these three years.

Simple Timeline for Project:

Our team will meet every Friday to discuss each member's progress, the problems encountered and their solutions. Maria Chen will be mainly responsible for the player analysis, and Haochen Chen will be mainly responsible for the team analysis. After both have completed the analysis of their respective datasets, we will integrate the results and finally work together to predict the future performance of players and teams.

Connect and Import: connect to Google Drive and import the basic functions.

# Connect our notebook to the Drive
from google.colab import drive
drive.mount('/content/drive')
%cd /content/drive/MyDrive/CMPS3160_Project

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
/content/drive/MyDrive/CMPS3160_Project

# Import the packages used in the rest of the notebook
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns; sns.set_theme()
from scipy import stats, integrate
from collections import Counter
from sklearn.utils import shuffle

# Original FIFA 19-21 tables; 1.csv holds the FIFA 19 data
data190 = pd.read_csv('/content/drive/MyDrive/CMPS3160_Project/data/1.csv')

data_2=pd.read_csv('/content/drive/MyDrive/CMPS3160_Project/data/2.csv')

data210 = pd.read_csv('/content/drive/MyDrive/CMPS3160_Project/data/3.csv')

# Load the cleaned CSV files, in which Value, Age and Wage no longer contain strings
path1 = "/content/drive/MyDrive/CMPS3160_Project/data/19.csv"
data19 = pd.read_csv(path1, sep=",",encoding = "ISO-8859-1")

path2 = "/content/drive/MyDrive/CMPS3160_Project/data/20.csv"
data20 = pd.read_csv(path2, sep=",",encoding = "ISO-8859-1")

path3 = "/content/drive/MyDrive/CMPS3160_Project/data/21.csv"
data21 = pd.read_csv(path3, sep=",",encoding = "ISO-8859-1")

/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py:3326: DtypeWarning:

Columns (16,86) have mixed types.Specify dtype option on import or set low_memory=False.

/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py:3326: DtypeWarning:

Columns (74) have mixed types.Specify dtype option on import or set low_memory=False.

Table Introduction

We have collected FIFA player data for the past three years to analyze which factors affect a player's value. The main fields we need are the players' name, age, value, wage, country, club, position and overall score. The overall score comes from the FIFA game's scoring system. Six main criteria appear alongside the overall score: pace, shooting, passing, dribbling, defending and physicality.

Key Factor Definitions:

Value: the value of a player card in FIFA is an estimate of the amount for which the owner can sell the card to another user, or the amount another user is willing to pay for it.

Wage: the wage of a player is the amount of money regularly paid to them by the club they play for.

Table 1: (data19) FIFA Players 2019 data

data19.head()

Table 2: (data20) FIFA Players 2020 data

data20.head()

Table 3: (data21) FIFA Players 2021 data

data21.head()

ETL

Because we want to analyze which factors affect changes in players' value, we first need to clean the data so that it is easier to analyze in later steps. We mark unknown Age, Value and Wage entries as NaN and fill missing Country and Club entries with the placeholder 'No'. This step is very important: in the original Excel sheet an unknown age, for example, appears as 0, and such entries would distort our calculations later on.

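# Treat an age of 0 as missing (an unknown age is stored as 0 in the source files)
data19['Age'] = data19['Age'].replace(0, np.nan)
data20['Age'] = data20['Age'].replace(0, np.nan)
data21['Age'] = data21['Age'].replace(0, np.nan)
# Fill missing nationality and club entries with the placeholder 'No'
data19['Nationality'] = data19['Nationality'].fillna('No')
data20['Country'] = data20['Country'].fillna('No')
data21['Nationality'] = data21['Nationality'].fillna('No')
data19['Club'] = data19['Club'].fillna('No')
data20['Club'] = data20['Club'].fillna('No')
data21['Club'] = data21['Club'].fillna('No')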
data19["Value"] = data19.Value.replace(0,np.nan)
data20["Value"] = data20.Value.replace(0,np.nan)
data21["Value"] = data21.Value.replace(0,np.nan)
data19["Wage"] = data19.Wage.replace(0,np.nan)
data20["Wage"] = data20.Wage.replace(0,np.nan)
data21["Wage"] = data21.Wage.replace(0,np.nan)
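As a small illustration of the cleaning idea, here is a sketch on a made-up four-row table (the names and numbers are invented, not taken from the project data). It marks 0 placeholders as missing and then keeps only rows where Age, Value and Wage are all known.

```python
import numpy as np
import pandas as pd

# Toy table: P2 has an unknown age (0), P3 an unknown value (0), P4 a missing wage
toy = pd.DataFrame({
    "Name": ["P1", "P2", "P3", "P4"],
    "Age": [31, 0, 26, 27],
    "Value": [110.5e6, 77e6, 0, 72e6],
    "Wage": [565e3, 405e3, 290e3, np.nan],
})

cols = ["Age", "Value", "Wage"]
# Replace the 0 placeholders with NaN so they are treated as missing
toy[cols] = toy[cols].replace(0, np.nan)
# Keep only the rows where all three fields are known
cleaned = toy.dropna(subset=cols).reset_index(drop=True)
print(cleaned)
```

Whether rows with missing fields should be dropped or kept as NaN depends on the later analysis; histograms and groupby means skip NaN on their own, while scikit-learn models generally do not accept it.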

In the next step, we collect all the useful data into tables of our own. For each year we want a table with the players' Name, Age, Club, Nationality, Value and Wage, because we think these factors may affect how value changes in the future. This makes the data easier to analyze in later steps.

First, we create a new table called "information19", which holds the players' information in FIFA 2019.

information19 = pd.DataFrame()
information19["Name"] = data19["Name"]
information19["19Age"]= data19["Age"]
information19["19Club"] = data19["Club"]
information19["19Nationality"] = data19["Nationality"]
information19["19Value"] = data19["Value"]
information19["19Wage"] = data19["Wage"]

information19.head()

Second, we create a new table called "information20", which holds the players' information in FIFA 2020.

information20 = pd.DataFrame()
information20["Name"] = data20["Name"]
information20["20Age"]= data20["Age"]
information20["20Club"] = data20["Club"]
information20["20Nationality"] = data20["Country"]
information20["20Value"] = data20["Value"]
information20["20Wage"] = data20["Wage"]

information20.head()

Third, we create a new table called "information21", which holds the players' information in FIFA 2021.

information21 = pd.DataFrame()
information21["Name"] = data21["Name"]
information21["21Age"] = data21["Age"]
information21["21Club"] = data21["Club"]
information21["21Nationality"] = data21["Nationality"]
information21["21Value"] = data21["Value"]
information21["21Wage"] = data21["Wage"]

information21.head()

We will use these three tables to compare the data. We want to see whether COVID-19 affected the change in players' value during these three years. In the next step we do some simple data analysis comparing how players' attributes changed over the three years.

In this step we compare how players' ages changed over the three years. While doing this analysis, we discovered a problem in the original tabular data: some players' ages do not increase from one year to the next. This may be due to errors in how the original data was collected. Our main interest, however, is the age distribution of the players: what is the age range of most players, and how old are the oldest and youngest players?
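One way to check the year-to-year age changes is to join the three tables on a shared key and look at the differences. The sketch below uses invented toy tables with the same column names as information19, information20 and information21. Joining on "Name" is an assumption made for illustration: the datasets spell names differently (for example "L. Messi" and "Lionel Messi"), so a join on the "ID" column would likely match more players.

```python
import pandas as pd

# Toy stand-ins for information19 / information20 / information21
info19 = pd.DataFrame({"Name": ["A", "B", "C"], "19Age": [24, 30, 19]})
info20 = pd.DataFrame({"Name": ["A", "B", "D"], "20Age": [25, 30, 22]})
info21 = pd.DataFrame({"Name": ["A", "B", "C"], "21Age": [26, 32, 21]})

# Keep only players that appear in all three years
merged = (
    info19.merge(info20, on="Name", how="inner")
          .merge(info21, on="Name", how="inner")
)

# Age gained between consecutive editions; one year is the expected value
merged["age_gain_19_20"] = merged["20Age"] - merged["19Age"]
merged["age_gain_20_21"] = merged["21Age"] - merged["20Age"]

# Players whose age does not advance by exactly one year per edition
inconsistent = merged[
    (merged["age_gain_19_20"] != 1) | (merged["age_gain_20_21"] != 1)
]
print(inconsistent)
```

A gain of 0 or 2 is not necessarily an error, because the snapshots may have been taken at different points in the year relative to a player's birthday.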

fig, axes = plt.subplots(1,3,figsize = (12,4),sharey = True)
fig.tight_layout(h_pad =4)
sns.set(color_codes=True)
information19["19Age"].plot.hist(ax= axes[0]).set_title("19Age")
information20["20Age"].plot.hist(ax= axes[1]).set_title("20Age")
information21["21Age"].plot.hist(ax= axes[2]).set_title("21Age")

Text(0.5, 1.0, '21Age')

In these three histograms, we can see how the players' ages are distributed in each of the three years. In every year the most common age is around 25. The oldest player is no older than 45, and the youngest players are about 16. This shows that, although the player pool changes every year, new young players keep joining. Football places relatively strict demands on players' age. Will the value of young players be lower than that of experienced players? Is a player's age directly proportional to the player's value? We will analyze this in more depth later.

Next, we want to see whether players' wages changed a lot under the influence of COVID-19. We first log-transform the wages so that the final charts show the distribution of players' wages at a glance.

fig, axes = plt.subplots(1,3,figsize = (12,4),sharey = True)
fig.tight_layout(h_pad =4)
sns.set(color_codes=True)
log19Wage = np.log(information19["19Wage"])
log20Wage = np.log(information20["20Wage"])
log21Wage = np.log(information21["21Wage"])
log19Wage.plot.hist(ax= axes[0]).set_title("19Wage")
log20Wage.plot.hist(ax= axes[1]).set_title("20Wage")
log21Wage.plot.hist(ax= axes[2]).set_title("21Wage")

Text(0.5, 1.0, '21Wage')

According to these three histograms, we can analyze how the wage distribution changed over the three years. In 2019 and 2020, most of the wage distribution is concentrated on the far left. Perhaps because of the pandemic, players' wages did not change much, and more players' wages were reduced. By 2021, the wages of many originally low-paid players had increased, but the maximum wage had decreased.
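The effect of the logarithm on a skewed wage distribution can be seen in a small sketch. The wage figures below are invented for illustration. Zero wages are turned into NaN first, because the logarithm of 0 is not a finite number.

```python
import numpy as np
import pandas as pd

# Made-up weekly wages in euros: many low values, one very high value, one zero
wages = pd.Series([1_000, 2_000, 2_000, 3_000, 5_000, 8_000,
                   20_000, 60_000, 565_000, 0])

# Treat non-positive wages as missing before taking the logarithm
positive = wages.where(wages > 0)
log_wages = np.log10(positive)

# Skewness shrinks after the transform, so a histogram spreads out more evenly
print("skew of raw wages:", positive.skew())
print("skew of log wages:", log_wages.skew())
```

Base 10 is used here only because the axis is then easy to read (3 means 1,000 euros, 5 means 100,000 euros); the natural logarithm used in the notebook changes the scale but not the shape.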

Next, we want to analyze how player values changed over the three years, and whether COVID-19 reduced or increased them. At the same time, we want to look at the distribution of player values to find out whether changes in value are related to changes in other factors.

fig, axes = plt.subplots(1,3,figsize = (12,4),sharey = True)
fig.tight_layout(h_pad =4)
sns.set(color_codes=True)
log19Value = np.log(information19["19Value"])
log20Value = np.log(information20["20Value"])
log21Value = np.log(information21["21Value"])
log19Value.plot.hist(ax = axes[0]).set_title("19Value")
log20Value.plot.hist(ax = axes[1]).set_title("20Value")
log21Value.plot.hist(ax = axes[2]).set_title("21Value")

Text(0.5, 1.0, '21Value')

We first log-transform the player values so that the charts show the distribution more clearly. We find that player values generally increase over time. The increase in 2020 is not obvious. Perhaps because of COVID-19, most players did not have the opportunity to do endorsements or play, so their value did not change. In 2021, the value of most low-value players grew.

We also wonder whether players from some countries are worth more. Country is likely to be an important factor affecting changes in player value.

information19.groupby(["19Nationality"])["19Value"].mean().plot.bar(figsize=(60,25))

<matplotlib.axes._subplots.AxesSubplot at 0x7f881b242d30>

information20.groupby(["20Nationality"])["20Value"].mean().plot.bar(figsize=(60,25))

<matplotlib.axes._subplots.AxesSubplot at 0x7f881b0a9a00>

information21.groupby(["21Nationality"])["21Value"].mean().plot.bar(figsize=(60,25))

<matplotlib.axes._subplots.AxesSubplot at 0x7f8816320160>

To analyze the relationship between a player's country and value, we first calculate the average value of all players from each country. Doing so, we found that some countries have only a few players, each with a very high value. This makes the country's average high, and because the number of players per country is not consistent, the averages alone cannot tell us accurately which countries' players are more valuable. In addition, we found that some countries have missing player value data. In the following analysis, we will address this problem to analyze the relationship between player country and value.
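One possible way to handle the small-sample problem, sketched on invented data, is to report the player count next to the mean and only compare countries above a minimum count. The threshold `min_players` is a hypothetical parameter chosen for this example, not a value used in the project.

```python
import pandas as pd

# Toy data: country Y has a single, very valuable player
players = pd.DataFrame({
    "Nationality": ["X", "X", "X", "X", "Y", "Z", "Z", "Z"],
    "Value": [1.0e6, 2.0e6, 3.0e6, 2.0e6, 50.0e6, 4.0e6, 6.0e6, 5.0e6],
})

min_players = 3  # hypothetical cut-off for a "reliable" average

# Count, mean and median per country; NaN values are skipped automatically
summary = players.groupby("Nationality")["Value"].agg(["count", "mean", "median"])

# Keep only countries with enough players, ranked by mean value
reliable = summary[summary["count"] >= min_players].sort_values("mean", ascending=False)
print(reliable)
```

The median column is included because it is less sensitive than the mean to one or two star players.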

EDA

Next, we try to find the correlation between a player's value and his age and wage. Our initial assumption is that a player's value is closely connected to both. We use several graphs to test and illustrate this idea.

# Pairwise Pearson correlation between value, age and wage in the FIFA 19 data
features = ["Value","Age","Wage"]
corr = data19[features].corr()
sns.heatmap(corr)

<matplotlib.axes._subplots.AxesSubplot at 0x7f8815fbf7f0>

From the correlation heatmap, we can see that a player's value is highly correlated with the player's wage, but has almost no linear relationship with the player's age.
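A correlation close to zero does not by itself mean that age is irrelevant. The Pearson coefficient only measures linear association, so a relationship that rises and then falls can cancel out. The sketch below uses an invented value curve that peaks at age 27; this is an assumption for illustration and has not been verified on the FIFA data.

```python
import numpy as np
import pandas as pd

# Hypothetical value curve: rises until age 27, then falls symmetrically
age = np.arange(18, 37)
value = 100 - (age - 27) ** 2
toy = pd.DataFrame({"Age": age, "Value": value})

# Over the whole age range the rise and the fall cancel out
pearson_all = toy["Age"].corr(toy["Value"])

# Restricted to players up to the peak, the relationship is clearly positive
young = toy[toy["Age"] <= 27]
pearson_young = young["Age"].corr(young["Value"])

print("all ages:", pearson_all)
print("age <= 27:", pearson_young)
```

A scatter plot of Value against Age, or correlations computed within age bands, would show whether something similar happens in the real tables.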

# Reload the cleaned CSV files, in which Value, Age and Wage no longer contain strings
path1 = "/content/drive/MyDrive/CMPS3160_Project/data/19.csv"
data19 = pd.read_csv(path1, sep=",",encoding = "ISO-8859-1")

path2 = "/content/drive/MyDrive/CMPS3160_Project/data/20.csv"
data20 = pd.read_csv(path2, sep=",",encoding = "ISO-8859-1")

path3 = "/content/drive/MyDrive/CMPS3160_Project/data/21.csv"
data21 = pd.read_csv(path3, sep=",",encoding = "ISO-8859-1")

import plotly.express as px
nat_cnt=data19.groupby('Nationality')['Name'].count().reset_index(name='Counts')
nat_cnt.sort_values(by='Counts',ascending=False,inplace=True)
top_20_nat_cnt=nat_cnt[:20]
fig=px.bar(top_20_nat_cnt,x='Nationality',y='Counts',color='Counts',title='Nationwise Representation in the FIFA Game')
fig.show()

import plotly.express as px
cost_prop=data21[['Name','Club','Nationality','Wage','Value','Position']]
fig=px.scatter(cost_prop,x='Value',y='Wage',color='Value',size='Wage',hover_data=['Name','Club','Nationality','Position'],title='Value vs Wage Presentation of all the Players')
fig.show()

From the scatter plot above, we can see that players with higher wages tend to have higher values in FIFA, and players with lower wages tend to have lower values. (The plot shows a roughly linear relationship with a positive slope.)
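The slope of such a relationship can be estimated with an ordinary least-squares fit. The following sketch uses `scipy.stats.linregress` on invented wage and value pairs, so the printed numbers say nothing about the real data.

```python
import numpy as np
from scipy import stats

# Made-up (wage, value) pairs in euros
wage = np.array([1, 5, 20, 60, 120, 250, 400], dtype=float) * 1_000
value = np.array([0.1, 0.6, 2.5, 9.0, 20.0, 45.0, 80.0]) * 1_000_000

# Least-squares line: value ~ slope * wage + intercept
fit = stats.linregress(wage, value)

print("slope (euros of value per euro of wage):", fit.slope)
print("intercept:", fit.intercept)
print("R^2:", fit.rvalue ** 2)
```

On the project data the same call could be made with the Wage and Value columns of data21 after dropping rows with missing entries.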

After looking at the relationship between value and wage, we want to explore the relationship between a player's value and his position. Which positions have higher value than others? This provides us with a basis going forward. Let's see how large a slice of the pie each position takes.

# Total player value per position in 2019, sorted from largest to smallest
y = data19.groupby("Position")[["Value"]].sum()
y = y.reset_index()
y.sort_values("Value",ascending=False,inplace=True)

mylabels = y["Position"]
ys = y["Value"]

percent = 100.*ys/ys.sum()
patches, texts = plt.pie(ys, startangle=90, radius=1.2,shadow = True)
labels = ['{0} - {1:1.2f} %'.format(i,j) for i,j in zip(mylabels,percent)]

sort_legend = True
if sort_legend:
    patches, labels, dummy =  zip(*sorted(zip(patches, labels, ys),
                                          key=lambda x: x[2],
                                          reverse=True))

plt.legend(patches, labels, loc='upper right', bbox_to_anchor=(-0.1, 1.),
           fontsize=8)
plt.title("FIFA Share of Total Player Value per Position in 2019")

plt.show()

# Total player value per best position (BP) in 2020, sorted from largest to smallest
y = data20.groupby("BP")[["Value"]].sum()
y = y.reset_index()
y.sort_values("Value",ascending=False,inplace=True)

mylabels = y["BP"]
ys = y["Value"]

percent = 100.*ys/ys.sum()
patches, texts = plt.pie(ys, startangle=90, radius=1.2,shadow = True)
labels = ['{0} - {1:1.2f} %'.format(i,j) for i,j in zip(mylabels,percent)]

sort_legend = True
if sort_legend:
    patches, labels, dummy =  zip(*sorted(zip(patches, labels, ys),
                                          key=lambda x: x[2],
                                          reverse=True))

plt.legend(patches, labels, loc='upper right', bbox_to_anchor=(-0.1, 1.),
           fontsize=8)
plt.title("FIFA Share of Total Player Value per Position in 2020")

plt.show()

# Total player value per position in 2021, sorted from largest to smallest
y = data21.groupby("Position")[["Value"]].sum()
y = y.reset_index()
y.sort_values("Value",ascending=False,inplace=True)

mylabels = y["Position"]
ys = y["Value"]

percent = 100*ys/ys.sum()
patches, texts = plt.pie(ys, startangle=90, radius=1.2,shadow = True)
labels = ['{0} - {1:1.2f} %'.format(i,j) for i,j in zip(mylabels,percent)]

sort_legend = True
if sort_legend:
    patches, labels, dummy =  zip(*sorted(zip(patches, labels, ys),
                                          key=lambda x: x[2],
                                          reverse=True))

plt.legend(patches, labels, loc='upper right', bbox_to_anchor=(-0.1, 1.),
           fontsize=8)
plt.title("FIFA Share of Total Player Value per Position in 2021")

plt.show()

Based on the three pie charts above, we find that striker accounts for the largest share of total player value among all positions in all three years. This is one reason many FIFA users are willing to pay a lot for an outstanding striker. In addition, center-back, goalkeeper and central attacking midfielder are the next most valuable positions, which suggests that player cards for those positions will command a higher price when bought or sold in the future.
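Because the pie charts are built from summed values, a position with many players can take a large slice even if its typical player is not expensive. The sketch below, on invented numbers, shows how total, average and player count per position can be computed side by side to separate the two effects.

```python
import pandas as pd

# Toy data: many moderately valued strikers, one very valuable goalkeeper
toy = pd.DataFrame({
    "Position": ["ST", "ST", "ST", "ST", "ST", "GK", "CAM", "CAM"],
    "Value": [10, 12, 8, 10, 10, 30, 15, 25],
})

# Total, mean and number of players per position
by_pos = toy.groupby("Position")["Value"].agg(
    total="sum", average="mean", players="count"
)
# Share of the overall value, which is the quantity a pie chart of sums displays
by_pos["share_pct"] = 100 * by_pos["total"] / by_pos["total"].sum()
print(by_pos.sort_values("total", ascending=False))
```

In this toy table ST has the largest total while GK has the highest average, so the two rankings need not agree.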

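def mon_to_num(num):
    """Convert a money string such as "€565K" or "€110.5M" to a whole number of euros."""
    # Values that are already numeric are returned unchanged
    if not isinstance(num, str):
        return num
    text = num.replace("€", "")
    if text.endswith("K"):
        # Thousands: "565K" -> 565000
        return int(round(float(text[:-1]) * 1_000))
    if text.endswith("M"):
        # Millions: "110.5M" -> 110500000
        return int(round(float(text[:-1]) * 1_000_000))
    return int(float(text))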


del_list = list()
for i in range(0,len(data190["Wage"])):
    if type(data190["Wage"][i]) == str and list(data190["Wage"][i])[0] == "€" and list(data190["Wage"][i])[-1] == "K":
        data190["Wage"][i] = mon_to_num(data190["Wage"][i])
    elif type(data190["Wage"][i]) == str and list(data190["Wage"][i])[0] == "€" and list(data190["Wage"][i])[-1] == "M":
        data190["Wage"][i] = mon_to_num(data190["Wage"][i])
    else :
        del_list.append(i)
  
data190 = data190.drop(del_list)
data190.reset_index(inplace=True,drop=True)
data190 = data190.loc[data190["Age"].notnull()]
data190.reset_index(inplace=True,drop=True)
data19_final = data190.loc[:,["Age","Wage","Value"]]

del_list = list()
for i in range(0,len(data19_final["Value"])):
    if type(data19_final["Value"][i]) == str and list(data19_final["Value"][i])[0] == "€" and list(data19_final["Value"][i])[-1] == "K":
        data19_final["Value"][i] = mon_to_num(data19_final["Value"][i])
    elif type(data19_final["Value"][i]) == str and list(data19_final["Value"][i])[0] == "€" and list(data19_final["Value"][i])[-1] == "M":
        data19_final["Value"][i] = mon_to_num(data19_final["Value"][i])
    else :
        del_list.append(i)

data19_final = data19_final.drop(del_list)
data19_final.reset_index(inplace=True,drop=True)
data19_final

<ipython-input-70-d9fdcb1b3217>:22: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
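The row-by-row loops above could also be written with vectorized pandas string operations, which avoids assigning into a column one cell at a time. The following is a sketch only: `parse_money` is a hypothetical helper name, and the regular expression assumes that amounts look like "€565K", "€110.5M" or "€0".

```python
import pandas as pd

def parse_money(series: pd.Series) -> pd.Series:
    """Hypothetical helper: turn "€565K" / "€110.5M" strings into euros, else NaN."""
    text = series.astype(str)
    # Capture the numeric part and an optional K or M suffix
    parts = text.str.extract(r"^€(?P<amount>\d+(?:\.\d+)?)(?P<unit>[KM]?)$")
    # Multiplier for each suffix; rows that did not match stay NaN
    factor = parts["unit"].map({"K": 1_000, "M": 1_000_000, "": 1})
    return pd.to_numeric(parts["amount"], errors="coerce") * factor

raw = pd.Series(["€565K", "€110.5M", "€0", None, 405])
euros = parse_money(raw)
print(euros)
```

Rows that come back as NaN can then be removed with `dropna`, and a result of 0 can be treated as missing in the same way as in the ETL step.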


del_list = list()


for i in range(0,len(data_2["Wage"])):
    if type(data_2["Wage"][i]) == str and list(data_2["Wage"][i])[0] == "€" and list(data_2["Wage"][i])[-1] == "K":
        data_2["Wage"][i] = mon_to_num(data_2["Wage"][i])
    elif type(data_2["Wage"][i]) == str and list(data_2["Wage"][i])[0] == "€" and list(data_2["Wage"][i])[-1] == "M":
        data_2["Wage"][i] = mon_to_num(data_2["Wage"][i])
    else :
        del_list.append(i)
        
data_2 = data_2.drop(del_list)

data_2.reset_index(inplace=True,drop=True)

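# Drop rows without an age and renumber the rows so the positional loop below stays valid
data_2 = data_2.loc[data_2["Age"].notnull()].reset_index(drop=True)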

data_2_final = data_2.loc[:,["Age","Wage","Value"]]

del_list = list()
for i in range(0,len(data_2_final["Value"])):
    if type(data_2_final["Value"][i]) == str and list(data_2_final["Value"][i])[0] == "€" and list(data_2_final["Value"][i])[-1] == "K":
        data_2_final["Value"][i] = mon_to_num(data_2_final["Value"][i])
    elif type(data_2_final["Value"][i]) == str and list(data_2_final["Value"][i])[0] == "€" and list(data_2_final["Value"][i])[-1] == "M":
        data_2_final["Value"][i] = mon_to_num(data_2_final["Value"][i])    
    else :
        del_list.append(i)

data_2_final = data_2_final.drop(del_list)

data_2_final.reset_index(inplace=True,drop=True)

data20_final = data_2_final

<ipython-input-71-4835b3585793>:23: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

<ipython-input-71-4835b3585793>:42: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

<ipython-input-71-4835b3585793>:40: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

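# Inspect rows with a missing Wage, Value or Age; a notebook cell displays only the last expression
data21.loc[data21["Wage"].isnull()]
data21.loc[data21["Value"].isnull()]
data21.loc[data21["Age"].isnull()]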

data21_final = data21.loc[:,["Age","Wage","Value"]]

dataf = pd.concat((data19_final,data20_final,data21_final),axis=0)

label_need=dataf.keys()
print(label_need)

Index(['Age', 'Wage', 'Value'], dtype='object')

#define x and y
x = dataf[label_need].values[:,0:2]
y = dataf[label_need].values[:,2]
print(x)
print(y)

[['31' 565]
 ['33' 405]
 ['26' 290]
 ...
 [35 0]
 [32 20000]
 [35 6000]]
[110000 77000 118000 ... 0 2200000 110000]

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=33, test_size=0.25)
#  print(x_test.shape) 
# Train the model
from sklearn.neighbors import KNeighborsRegressor
k = 5
knn = KNeighborsRegressor(k)
knn.fit(x_train, y_train)

KNeighborsRegressor()

y_pred = knn.predict(x_test) 
plt.figure(figsize=(16, 10), dpi=144)
plt.scatter(x_test[:,0], y_test, c='g', s=100)         
plt.scatter(x_test[:,0], y_pred, c='k')       
plt.axis('tight')
plt.title("KNeighborsRegressor (k = %i)" % k)
plt.xlabel("Age")
plt.ylabel("Value")
plt.show()

plt.figure(figsize=(16, 10), dpi=144)
plt.scatter(x_test[:,1], y_test, c='g', s=100)        
plt.scatter(x_test[:,1], y_pred, c='k')      
plt.axis('tight')
plt.title("KNeighborsRegressor (k = %i)" % k)
plt.xlabel("Wage")
plt.ylabel("Value")
plt.show()

The KNN regression plots above compare the predicted values (black) with the actual values (green) in the test set, showing how a player's value can be predicted from his age and wage.
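The plots give only a visual impression of the fit. A minimal sketch of how the fit could be quantified is shown below. It runs on synthetic data: the age, wage and value arrays and the relationship between them are made up for illustration and are not taken from the FIFA tables. The features are standardized first, since KNN distances are otherwise dominated by the feature with the largest numeric range (here, wage).

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the (Age, Wage) -> Value data
rng = np.random.default_rng(0)
n = 500
age = rng.integers(16, 40, size=n)
wage = rng.uniform(1_000, 300_000, size=n)
value = 200 * wage + rng.normal(0, 2_000_000, size=n)  # assumed relationship

X = np.column_stack([age, wage])
X_train, X_test, y_train, y_test = train_test_split(
    X, value, test_size=0.25, random_state=33
)

# Scale both features to zero mean and unit variance, then fit KNN with k = 5
model = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Held-out scores: R^2 (1.0 is perfect) and mean absolute error in euros
r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print("R^2:", r2)
print("MAE:", mae)
```

Applied to the project data, the same two scores would give a number to compare against when trying other values of k or additional features such as position.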

We were relatively successful in predicting a FIFA player's value with a limited amount of data.

However, a large portion of a player's value depends on his wage and position. Our goal was not to evaluate the 'value' statistic itself, but to separate out positions and other factors and use only those deemed significant. We believe we accomplished this, and we have more ideas for improvement.

While this project is far from perfect, we are satisfied with what we have created and can proudly present a working analysis with many interesting aspects.

This is our final product for our FIFA Player Value Analysis. Thank you for reading.

Convert Format

%%shell
jupyter nbconvert --to html /content/drive/MyDrive/CMPS3160_Project/Project.ipynb
# Change our notebook format to the html format

[NbConvertApp] Converting notebook /content/drive/MyDrive/CMPS3160_Project/Project.ipynb to html
[NbConvertApp] Writing 2767513 bytes to /content/drive/MyDrive/CMPS3160_Project/Project.html

Final Plan

We changed some of our previous code in the current project and added missing explanations. We spent a lot of time looking for the right data and found it very difficult to find data that suits the project well. In addition, we will keep applying the knowledge learned in class to continue this research.

	Unnamed: 0	ID	Name	Age	Photo	Nationality	Flag	Overall	Potential	Club	...	Composure	Marking	StandingTackle	SlidingTackle	GKDiving	GKHandling	GKKicking	GKPositioning	GKReflexes	Release Clause
0	0	158023	L. Messi	31.0	https://cdn.sofifa.org/players/4/19/158023.png	Argentina	https://cdn.sofifa.org/flags/52.png	94.0	94	FC Barcelona	...	96.0	33.0	28.0	26.0	6.0	11.0	15.0	14.0	8.0	226.5M
1	1	20801	Cristiano Ronaldo	33.0	https://cdn.sofifa.org/players/4/19/20801.png	Portugal	https://cdn.sofifa.org/flags/38.png	94.0	94	Juventus	...	95.0	28.0	31.0	23.0	7.0	11.0	15.0	14.0	11.0	127.1M
2	2	190871	Neymar Jr	26.0	https://cdn.sofifa.org/players/4/19/190871.png	Brazil	https://cdn.sofifa.org/flags/54.png	92.0	93	Paris Saint-Germain	...	94.0	27.0	24.0	33.0	9.0	9.0	15.0	15.0	11.0	228.1M
3	3	193080	De Gea	27.0	https://cdn.sofifa.org/players/4/19/193080.png	Spain	https://cdn.sofifa.org/flags/45.png	91.0	93	Manchester United	...	68.0	15.0	21.0	13.0	90.0	85.0	87.0	88.0	94.0	138.6M
4	4	192985	K. De Bruyne	27.0	https://cdn.sofifa.org/players/4/19/192985.png	Belgium	https://cdn.sofifa.org/flags/7.png	91.0	92	Manchester City	...	88.0	68.0	58.0	51.0	15.0	13.0	5.0	10.0	13.0	196.4M

	Name	Image	Country	Position	Age	Overall	Potential	Club	ID	Height	...	A/W	D/W	IR	PAC	SHO	PAS	DRI	DEF	PHY	Hits
0	Lionel Messi	https://cdn.sofifa.org/players/4/20/158023.png	Argentina	RW,CF,ST	32	94	94	FC Barcelona	158023	5'7"	...	Medium	Low	5	87	92	92	96	39	66	585
1	C. Ronaldo dos Santos Aveiro	https://cdn.sofifa.org/players/4/20/20801.png	Portugal	ST,LW	34	93	93	Juventus	20801	6'2"	...	High	Low	5	90	93	82	89	35	78	448
2	Neymar da Silva Santos Jr.	https://cdn.sofifa.org/players/4/20/190871.png	Brazil	LW,CAM	27	92	92	Paris Saint-Germain	190871	5'9"	...	High	Medium	5	91	85	87	95	32	58	432
3	Jan Oblak	https://cdn.sofifa.org/players/4/20/200389.png	Slovenia	GK	26	91	91	Atlético Madrid	200389	6'2"	...	Medium	Medium	3	87	92	78	89	52	90	240
4	Kevin De Bruyne	https://cdn.sofifa.org/players/4/20/192985.png	Belgium	CAM,CM	28	91	91	Manchester City	192985	5'11"	...	High	High	4	76	86	92	86	61	78	298

	ï»¿	ID	Name	Age	Photo	Nationality	Flag	Overall	Potential	Club	...	Penalties	Composure	Defensive Awareness	Standing Tackle	Sliding Tackle	GK Diving	GK Handling	GK Kicking	GK Positioning	GK Reflexes
0	0	253283	Facundo Pellistri	18	https://cdn.sofifa.com/players/253/283/20_60.png	Uruguay	https://cdn.sofifa.com/flags/uy.png	71	87	Peñarol	...	66.0	61.0	35.0	11.0	18.0	9.0	12.0	7.0	8.0	7.0
1	1	179813	Edinson Cavani	32	https://cdn.sofifa.com/players/179/813/20_60.png	Uruguay	https://cdn.sofifa.com/flags/uy.png	86	86	Paris Saint-Germain	...	85.0	80.0	57.0	48.0	39.0	12.0	5.0	13.0	13.0	10.0
2	2	245541	Giovanni Reyna	17	https://cdn.sofifa.com/players/245/541/20_60.png	United States	https://cdn.sofifa.com/flags/us.png	68	87	Borussia Dortmund	...	50.0	59.0	30.0	23.0	24.0	10.0	13.0	14.0	12.0	7.0
3	3	233419	Raphael Dias Belloli	23	https://cdn.sofifa.com/players/233/419/20_60.png	Brazil	https://cdn.sofifa.com/flags/br.png	81	85	Stade Rennais FC	...	73.0	79.0	45.0	54.0	38.0	8.0	7.0	13.0	8.0	14.0
4	4	198710	James Rodríguez	28	https://cdn.sofifa.com/players/198/710/20_60.png	Colombia	https://cdn.sofifa.com/flags/co.png	82	82	Everton	...	81.0	87.0	52.0	41.0	44.0	15.0	15.0	15.0	5.0	14.0

	Name	19Age	19Club	19Nationality	19Value	19Wage
0	L. Messi	NaN	FC Barcelona	Argentina	110500000.0	565000.0
1	Cristiano Ronaldo	33.0	Juventus	Portugal	77000000.0	405000.0
2	Neymar Jr	26.0	Paris Saint-Germain	Brazil	118500000.0	290000.0
3	De Gea	27.0	Manchester United	Spain	72000000.0	260000.0
4	K. De Bruyne	27.0	Manchester City	Belgium	102000000.0	355000.0

	Name	20Age	20Club	20Nationality	20Value	20Wage
0	Lionel Messi	NaN	FC Barcelona	Argentina	95500000.0	565000.0
1	C. Ronaldo dos Santos Aveiro	34.0	Juventus	Portugal	58500000.0	405000.0
2	Neymar da Silva Santos Jr.	27.0	Paris Saint-Germain	Brazil	105500000.0	290000.0
3	Jan Oblak	26.0	Atlético Madrid	Slovenia	77500000.0	125000.0
4	Kevin De Bruyne	28.0	Manchester City	Belgium	90000000.0	370000.0

	Age	Wage	Value
0	31	565	110000
1	33	405	77000
2	26	290	118000
3	27	260	72000
4	27	355	102000
...	...	...	...
17619	19	1	60
17620	19	1	60
17621	16	1	60
17622	17	1	60
17623	16	1	60