import pandas as pd
most_wickets_2021 = pd.read_html("https://stats.espncricinfo.com/ci/engine/records/bowling/most_wickets_career.html?id=13781;type=tournament")[0]
most_wickets_2021
Link to where I got the data from
This code prints out a dataframe that looks like this: image of the dataframe
How do I make the player's team show up in the first column, right next to their name, instead of appearing in every column as a separate row? For example:
Player | Mat |
---|---|
Shahnawaz Dahani (Multan Sultans) | 11 |
I didn't type every column but I hope you understand what I mean.
CodePudding user response:
One simple way to do this is to loop over the rows with a plain for loop and "look ahead" to the next row, which holds the team name.
import pandas as pd
df = pd.read_html("https://stats.espncricinfo.com/ci/engine/records/bowling/most_wickets_career.html?id=13781;type=tournament")[0]
# Header line for the printed table
print('Player/Team ', '\t\t\t', 'Mat', '\t', 'Inns')
# Even rows hold the player's stats; the row right after each one holds the team name
for x in range(len(df) - 1):
    if x % 2 == 0:
        print(df['Player'][x], df['Player'][x + 1], '\t\t\t', df['Mat'][x], '\t', df['Inns'][x])
Basically, this snippet walks through the data frame and, for every other row, takes the player's stats from that row and the team name from the row immediately after it.
Here is a sampling of the output from my terminal when this program was run.
@Una:~/Python_Programs/Cricket$ python3 Cricket.py
Player/Team Mat Inns
Shahnawaz Dahani (Multan Sultans) 11 11
Wahab Riaz (Peshawar Zalmi) 12 12
Shaheen Shah Afridi (Lahore Qalandars) 10 10
JP Faulkner (Lahore Qalandars) 6 6
Imran Tahir (Multan Sultans) 7 7
Hasan Ali (Islamabad United) 10 10
S Mahmood (Peshawar Zalmi) 5 5
Imran Khan (Multan Sultans) 7 7
Mohammad Wasim (Islamabad United) 11 11
I didn't do too much work on formatting, but that should be easily cleaned up. You can try this method out and/or wait to see if someone has a more elegant way to do this.
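As a rough idea of what that cleanup could look like, here is a minimal sketch that swaps the tab characters for f-string width specifiers so the columns line up. The small `df` below is a hand-made stand-in for the scraped table (I am assuming the team name repeats in every column of the odd rows, as described in the question), and `format_rows` and its `width` parameter are names I made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the scraped table: each stats row is
# followed by a row that repeats the team name in every column.
df = pd.DataFrame({
    "Player": ["Shahnawaz Dahani", "(Multan Sultans)", "Wahab Riaz", "(Peshawar Zalmi)"],
    "Mat": ["11", "(Multan Sultans)", "12", "(Peshawar Zalmi)"],
    "Inns": ["11", "(Multan Sultans)", "12", "(Peshawar Zalmi)"],
})


def format_rows(frame, width=40):
    """Return aligned text lines, one per player, with the team appended to the name."""
    lines = [f"{'Player/Team':<{width}}{'Mat':>5}{'Inns':>6}"]
    # Step through the even rows only and look ahead one row for the team.
    for x in range(0, len(frame) - 1, 2):
        name = f"{frame['Player'][x]} {frame['Player'][x + 1]}"
        lines.append(f"{name:<{width}}{frame['Mat'][x]:>5}{frame['Inns'][x]:>6}")
    return lines


print("\n".join(format_rows(df)))
```

The name column is left-aligned and padded to `width` characters, while the numeric columns are right-aligned, so every line comes out the same length as long as no name exceeds `width`.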
Hope that helps.
Regards.
CodePudding user response:
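import pandas as pd
# Pair each player row with the team row below it: column 0 = name, column 1 = team
data = pd.DataFrame(most_wickets_2021["Player"].values.reshape(-1, 2))
res = pd.DataFrame({
    "Player": data[0] + " " + data[1],
})
# Keep only the even rows (the actual stats) of the remaining columns
data = pd.DataFrame(most_wickets_2021[most_wickets_2021.columns[1:]].values[::2])
data.columns = most_wickets_2021.columns[1:]
pd.concat([res, data], axis=1)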
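To see what the reshape is doing without hitting the website, here is a small self-contained sketch of the same idea. The `raw` frame is a hand-made miniature of the scraped table (assuming the rows strictly alternate between stats and team, so the row count is even), and `merge_team_rows` is just a name I picked for illustration. It uses `iloc[::2, 1:]` in place of the `.values` slicing above, which keeps the column names without reassigning them.

```python
import pandas as pd

# Hypothetical miniature of the scraped table: stats rows alternate
# with rows that repeat the team name in every column.
raw = pd.DataFrame({
    "Player": ["Shahnawaz Dahani", "(Multan Sultans)", "Wahab Riaz", "(Peshawar Zalmi)"],
    "Mat": ["11", "(Multan Sultans)", "12", "(Peshawar Zalmi)"],
    "Inns": ["11", "(Multan Sultans)", "12", "(Peshawar Zalmi)"],
})


def merge_team_rows(frame):
    """Fold each team row into the player row above it."""
    # reshape(-1, 2) lays the Player column out as (name, team) pairs;
    # it raises an error if the number of rows is odd.
    pairs = pd.DataFrame(frame["Player"].to_numpy().reshape(-1, 2))
    player = (pairs[0] + " " + pairs[1]).rename("Player")
    # Even rows carry the real stats; drop the old Player column and renumber.
    stats = frame.iloc[::2, 1:].reset_index(drop=True)
    return pd.concat([player, stats], axis=1)


result = merge_team_rows(raw)
print(result)
```

Because both pieces end up with a fresh 0..n-1 index, `pd.concat(..., axis=1)` lines them up row by row.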