This is example data that would be pasted into an input() prompt. Ideally I would like it to be processed and written to a CSV file with Python:
,,,,,,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Expected,Expected,Expected,SCA,SCA,Passes,Passes,Passes,Passes,Carries,Carries,Dribbles,Dribbles,-additional
Player,#,Nation,Pos,Age,Min,Gls,Ast,PK,PKatt,Sh,SoT,CrdY,CrdR,Touches,Press,Tkl,Int,Blocks,xG,npxG,xA,SCA,GCA,Cmp,Att,Cmp%,Prog,Carries,Prog,Succ,Att,-9999
Gabriel Jesus,9,br BRA,FW,25-124,82,0,0,0,0,1,0,0,0,40,13,1,1,0,0.1,0.1,0.0,4,0,20,27,74.1,2,33,1,4,5,b66315ae
Eddie Nketiah,14,eng ENG,FW,23-067,8,0,0,0,0,0,0,0,0,6,2,0,0,0,0.0,0.0,0.1,2,0,4,4,100.0,1,4,1,0,0,a53649b7
Martinelli,11,br BRA,LW,21-048,90,1,0,0,0,2,1,0,0,38,21,0,2,1,0.6,0.6,0.1,1,0,24,28,85.7,1,34,5,3,4,48a5a5d6
Bukayo Saka,7,eng ENG,RW,20-334,90,0,0,0,0,3,0,0,0,52,23,3,0,3,0.2,0.2,0.0,2,1,24,36,66.7,2,37,8,2,2,bc7dc64d
Martin Ødegaard,8,no NOR,AM,23-231,89,0,0,0,0,2,0,0,0,50,22,2,1,2,0.1,0.1,0.0,2,0,30,39,76.9,5,28,3,1,2,79300479
Albert Sambi Lokonga,23,be BEL,CM,22-287,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0.0,0.0,0.0,0,0,1,1,100.0,0,1,1,0,0,1b4f1169
Granit Xhaka,34,ch SUI,DM,29-312,90,0,0,0,0,0,0,1,0,60,5,0,2,3,0.0,0.0,0.0,4,0,42,49,85.7,6,32,2,0,0,e61b8aee
Thomas Partey,5,gh GHA,DM,29-053,90,0,0,0,0,1,0,0,0,62,25,7,1,2,0.1,0.1,0.0,0,0,40,47,85.1,5,26,4,0,1,529f49ab
Oleksandr Zinchenko,35,ua UKR,LB,25-233,82,0,1,0,0,1,1,0,0,64,16,3,3,1,0.0,0.0,0.3,2,1,44,54,81.5,6,36,5,0,0,51cf8561
Kieran Tierney,3,sct SCO,LBWB,25-061,8,0,0,0,0,0,0,0,0,6,1,0,0,0,0.0,0.0,0.0,0,0,2,4,50.0,0,1,0,0,0,fce2302c
Gabriel Dos Santos,6,br BRA,CB,24-229,90,0,0,0,0,0,0,0,0,67,5,1,1,2,0.0,0.0,0.0,0,0,52,58,89.7,1,48,3,0,0,67ac5bb8
William Saliba,12,fr FRA,CB,21-134,90,0,0,0,0,0,0,0,0,58,3,1,2,2,0.0,0.0,0.0,0,0,42,46,91.3,1,35,1,0,0,972aeb2a
Ben White,4,eng ENG,RB,24-301,90,0,0,0,0,0,0,1,0,61,22,7,4,5,0.0,0.0,0.1,1,0,29,40,72.5,5,25,2,1,1,35e413f1
Aaron Ramsdale,1,eng ENG,GK,24-083,90,0,0,0,0,0,0,0,0,33,0,0,0,0,0.0,0.0,0.0,0,0,24,32,75.0,0,21,0,0,0,466fb2c5
14 Players,,,,,990,1,1,0,0,10,2,2,0,599,158,25,17,21,1.1,1.1,0.5,18,2,378,465,81.3,35,361,36,11,15,-9999
The link to the table is: https://fbref.com/en/matches/e62f6e78/Crystal-Palace-Arsenal-August-5-2022-Premier-League#stats_18bb7c10_summary
I have attempted to use a pandas DataFrame, but I am only able to export the first row of headers and nothing else (only the items before Player).
CodePudding user response:
Would have been nice for you to include your attempt.
Pandas works just fine:
import pandas as pd
url = 'https://fbref.com/en/matches/e62f6e78/Crystal-Palace-Arsenal-August-5-2022-Premier-League#stats_18bb7c10_summary'
df = pd.read_html(url, header=1)[10]
Output:
print(df.to_markdown())
| | Player | # | Nation | Pos | Age | Min | Gls | Ast | PK | PKatt | Sh | SoT | CrdY | CrdR | Touches | Press | Tkl | Int | Blocks | xG | npxG | xA | SCA | GCA | Cmp | Att | Cmp% | Prog | Carries | Prog.1 | Succ | Att.1 |
|---:|:---------------------|----:|:---------|:------|:-------|------:|------:|------:|-----:|--------:|-----:|------:|-------:|-------:|----------:|--------:|------:|------:|---------:|-----:|-------:|-----:|------:|------:|------:|------:|-------:|-------:|----------:|---------:|-------:|--------:|
| 0 | Gabriel Jesus | 9 | br BRA | FW | 25-124 | 82 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 40 | 13 | 1 | 1 | 0 | 0.1 | 0.1 | 0 | 4 | 0 | 20 | 27 | 74.1 | 2 | 33 | 1 | 4 | 5 |
| 1 | Eddie Nketiah | 14 | eng ENG | FW | 23-067 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 2 | 0 | 0 | 0 | 0 | 0 | 0.1 | 2 | 0 | 4 | 4 | 100 | 1 | 4 | 1 | 0 | 0 |
| 2 | Martinelli | 11 | br BRA | LW | 21-048 | 90 | 1 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 38 | 21 | 0 | 2 | 1 | 0.6 | 0.6 | 0.1 | 1 | 0 | 24 | 28 | 85.7 | 1 | 34 | 5 | 3 | 4 |
| 3 | Bukayo Saka | 7 | eng ENG | RW | 20-334 | 90 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 52 | 23 | 3 | 0 | 3 | 0.2 | 0.2 | 0 | 2 | 1 | 24 | 36 | 66.7 | 2 | 37 | 8 | 2 | 2 |
| 4 | Martin Ødegaard | 8 | no NOR | AM | 23-231 | 89 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 50 | 22 | 2 | 1 | 2 | 0.1 | 0.1 | 0 | 2 | 0 | 30 | 39 | 76.9 | 5 | 28 | 3 | 1 | 2 |
| 5 | Albert Sambi Lokonga | 23 | be BEL | CM | 22-287 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 100 | 0 | 1 | 1 | 0 | 0 |
| 6 | Granit Xhaka | 34 | ch SUI | DM | 29-312 | 90 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 60 | 5 | 0 | 2 | 3 | 0 | 0 | 0 | 4 | 0 | 42 | 49 | 85.7 | 6 | 32 | 2 | 0 | 0 |
| 7 | Thomas Partey | 5 | gh GHA | DM | 29-053 | 90 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 62 | 25 | 7 | 1 | 2 | 0.1 | 0.1 | 0 | 0 | 0 | 40 | 47 | 85.1 | 5 | 26 | 4 | 0 | 1 |
| 8 | Oleksandr Zinchenko | 35 | ua UKR | LB | 25-233 | 82 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 64 | 16 | 3 | 3 | 1 | 0 | 0 | 0.3 | 2 | 1 | 44 | 54 | 81.5 | 6 | 36 | 5 | 0 | 0 |
| 9 | Kieran Tierney | 3 | sct SCO | LB,WB | 25-061 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 4 | 50 | 0 | 1 | 0 | 0 | 0 |
| 10 | Gabriel Dos Santos | 6 | br BRA | CB | 24-229 | 90 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 67 | 5 | 1 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 52 | 58 | 89.7 | 1 | 48 | 3 | 0 | 0 |
| 11 | William Saliba | 12 | fr FRA | CB | 21-134 | 90 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 58 | 3 | 1 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 42 | 46 | 91.3 | 1 | 35 | 1 | 0 | 0 |
| 12 | Ben White | 4 | eng ENG | RB | 24-301 | 90 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 61 | 22 | 7 | 4 | 5 | 0 | 0 | 0.1 | 1 | 0 | 29 | 40 | 72.5 | 5 | 25 | 2 | 1 | 1 |
| 13 | Aaron Ramsdale | 1 | eng ENG | GK | 24-083 | 90 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 24 | 32 | 75 | 0 | 21 | 0 | 0 | 0 |
| 14 | 14 Players | nan | nan | nan | nan | 990 | 1 | 1 | 0 | 0 | 10 | 2 | 2 | 0 | 599 | 158 | 25 | 17 | 21 | 1.1 | 1.1 | 0.5 | 18 | 2 | 378 | 465 | 81.3 | 35 | 361 | 36 | 11 | 15 |
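Since the question asks for a CSV file, the frame can be written out with `to_csv`. Below is a minimal sketch that uses a small stand-in frame instead of the live page, so it runs offline. The helper name `frame_to_csv_text` is hypothetical, and dropping the trailing "14 Players" totals row is an assumption about what you want in the file.

```python
import pandas as pd

def frame_to_csv_text(df: pd.DataFrame) -> str:
    """Drop the trailing 'N Players' totals row and return the frame as CSV text."""
    # str.match anchors at the start of the string; na=False treats missing names as non-matches
    is_total = df["Player"].str.match(r"\d+ Players$", na=False)
    return df[~is_total].to_csv(index=False)

# Stand-in for the frame returned by pd.read_html(url, header=1)[10]
sample = pd.DataFrame(
    {
        "Player": ["Gabriel Jesus", "Martinelli", "14 Players"],
        "Min": [82, 90, 990],
        "Gls": [0, 1, 1],
    }
)
csv_text = frame_to_csv_text(sample)
print(csv_text)
```

To write a file instead of a string, pass a path: `df.to_csv('players.csv', index=False)`.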
CodePudding user response:
Could you elaborate more? Maybe you could split the raw text by comma and then convert it to a DataFrame, like:
import pandas as pd
raw_text = input()
list_of_string = raw_text.split(',')
df = pd.DataFrame(list_of_string)
df.to_csv('yourfile.csv')
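Note that a single `input()` call returns only one line, which would explain getting just the first header row from a multi-line paste. If the whole pasted text is available as one string, `pd.read_csv` can parse it through `io.StringIO`. The sketch below assumes the layout shown in the question (a group-header line first, an id column last); the function name `pasted_text_to_frame` and the shortened sample are made up for illustration.

```python
import io
import pandas as pd

def pasted_text_to_frame(text: str) -> pd.DataFrame:
    """Parse the pasted block, skipping the group-header line and the id column."""
    # skiprows=1 skips the first line (",,,,,,Performance,..."), so the
    # "Player,#,..." line becomes the header
    df = pd.read_csv(io.StringIO(text.strip()), skiprows=1)
    # The last column holds the player id (its header is "-9999"); drop it
    return df.iloc[:, :-1]

# Shortened stand-in for the pasted data
sample = (
    ",,Performance,Performance,-additional\n"
    "Player,#,Min,Gls,-9999\n"
    "Gabriel Jesus,9,82,0,b66315ae\n"
    "Martinelli,11,90,1,48a5a5d6\n"
    "2 Players,,172,1,-9999\n"
)
df = pasted_text_to_frame(sample)
print(df)
```

From there, `df.to_csv('yourfile.csv', index=False)` writes one cell per value.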
CodePudding user response:
You can use a while loop controlled by user input to keep accepting entries, and exit when the user chooses to stop. Look at the code below:
user_input = 'Y'
while user_input.lower() == 'y':
    # Run your code here.
    user_input = input('Do you want to add one more entry: Y or N?')
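A related way to collect a multi-line paste is to keep reading lines until an empty one is entered. Here is a rough sketch using the two-argument form of `iter`; the function name `read_block` and its `read_line` parameter are hypothetical, and the parameter only exists so the example can run with fake input instead of a real terminal.

```python
def read_block(read_line=input):
    """Collect lines until an empty line is entered."""
    # iter(callable, sentinel) keeps calling read_line() until it returns ""
    return list(iter(read_line, ""))

# Simulated typing: two data lines, then an empty line that ends the block
fake_lines = iter(["Player,Min", "Gabriel Jesus,82", "", "never read"])
lines = read_block(lambda: next(fake_lines))
print(lines)
```

In a real script you would call `read_block()` with no arguments, paste the data, and press Enter on an empty line to finish.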
CodePudding user response:
This is the most intuitive and understandable solution I could come up with. It uses basic linear algebra (a transpose) to solve the problem, which I find pretty neat. I still recommend finding another way to parse the data, though. Check out beautifulsoup and requests.
import pandas as pd#for dataframe
data = '''
,,,,,,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Expected,Expected,Expected,SCA,SCA,Passes,Passes,Passes,Passes,Carries,Carries,Dribbles,Dribbles,-additional
Player,#,Nation,Pos,Age,Min,Gls,Ast,PK,PKatt,Sh,SoT,CrdY,CrdR,Touches,Press,Tkl,Int,Blocks,xG,npxG,xA,SCA,GCA,Cmp,Att,Cmp%,Prog,Carries,Prog,Succ,Att,-9999
Gabriel Jesus,9,br BRA,FW,25-124,82,0,0,0,0,1,0,0,0,40,13,1,1,0,0.1,0.1,0.0,4,0,20,27,74.1,2,33,1,4,5,b66315ae
Eddie Nketiah,14,eng ENG,FW,23-067,8,0,0,0,0,0,0,0,0,6,2,0,0,0,0.0,0.0,0.1,2,0,4,4,100.0,1,4,1,0,0,a53649b7
Martinelli,11,br BRA,LW,21-048,90,1,0,0,0,2,1,0,0,38,21,0,2,1,0.6,0.6,0.1,1,0,24,28,85.7,1,34,5,3,4,48a5a5d6
Bukayo Saka,7,eng ENG,RW,20-334,90,0,0,0,0,3,0,0,0,52,23,3,0,3,0.2,0.2,0.0,2,1,24,36,66.7,2,37,8,2,2,bc7dc64d
Martin Ødegaard,8,no NOR,AM,23-231,89,0,0,0,0,2,0,0,0,50,22,2,1,2,0.1,0.1,0.0,2,0,30,39,76.9,5,28,3,1,2,79300479
Albert Sambi Lokonga,23,be BEL,CM,22-287,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0.0,0.0,0.0,0,0,1,1,100.0,0,1,1,0,0,1b4f1169
Granit Xhaka,34,ch SUI,DM,29-312,90,0,0,0,0,0,0,1,0,60,5,0,2,3,0.0,0.0,0.0,4,0,42,49,85.7,6,32,2,0,0,e61b8aee
Thomas Partey,5,gh GHA,DM,29-053,90,0,0,0,0,1,0,0,0,62,25,7,1,2,0.1,0.1,0.0,0,0,40,47,85.1,5,26,4,0,1,529f49ab
Oleksandr Zinchenko,35,ua UKR,LB,25-233,82,0,1,0,0,1,1,0,0,64,16,3,3,1,0.0,0.0,0.3,2,1,44,54,81.5,6,36,5,0,0,51cf8561
Kieran Tierney,3,sct SCO,LBWB,25-061,8,0,0,0,0,0,0,0,0,6,1,0,0,0,0.0,0.0,0.0,0,0,2,4,50.0,0,1,0,0,0,fce2302c
Gabriel Dos Santos,6,br BRA,CB,24-229,90,0,0,0,0,0,0,0,0,67,5,1,1,2,0.0,0.0,0.0,0,0,52,58,89.7,1,48,3,0,0,67ac5bb8
William Saliba,12,fr FRA,CB,21-134,90,0,0,0,0,0,0,0,0,58,3,1,2,2,0.0,0.0,0.0,0,0,42,46,91.3,1,35,1,0,0,972aeb2a
Ben White,4,eng ENG,RB,24-301,90,0,0,0,0,0,0,1,0,61,22,7,4,5,0.0,0.0,0.1,1,0,29,40,72.5,5,25,2,1,1,35e413f1
Aaron Ramsdale,1,eng ENG,GK,24-083,90,0,0,0,0,0,0,0,0,33,0,0,0,0,0.0,0.0,0.0,0,0,24,32,75.0,0,21,0,0,0,466fb2c5
14 Players,,,,,990,1,1,0,0,10,2,2,0,599,158,25,17,21,1.1,1.1,0.5,18,2,378,465,81.3,35,361,36,11,15,-9999
'''
#you can just replace data with user input
def tryNum(x):#input a value and if its a number then it returns a number, if not it returns itself back
    try:
        x = float(x)
        return x
    except ValueError:#not a number, so hand the original string back
        return x
rows = [i.split(',')[:-1] for i in data.split('\n')[2:-2]]#drop the blank first line, the group-header line, the totals line and the last (id) column
col_names = rows[0]#fetching all column names
cols = [[tryNum(rows[j][i]) for j in range(1,len(rows))] for i in range(len(rows[0]))]#get all column info by transposing the "matrix" if you will
full = {}#setting up the dictionary
for i,y in zip(col_names,cols):#putting the data in the dict
    full[i]=y#note: 'Prog' and 'Att' each appear twice in the header, so the later column overwrites the earlier one
df = pd.DataFrame(data = full)#loading it all into the df
print(df.head())
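The transposing step can also be written with `zip(*rows)`. The sketch below does that and, as an extra assumption on my part, renames repeated headers (`Prog`, `Prog.1`) so that no column is lost when building the dict. The helper names `to_number`, `unique_names` and `columns_from_rows` are made up for illustration.

```python
def to_number(value):
    """Return value as a float when possible, otherwise return it unchanged."""
    try:
        return float(value)
    except ValueError:
        return value

def unique_names(names):
    """Append .1, .2, ... to repeated names so every dict key is distinct."""
    seen = {}
    result = []
    for name in names:
        count = seen.get(name, 0)
        result.append(name if count == 0 else f"{name}.{count}")
        seen[name] = count + 1
    return result

def columns_from_rows(rows):
    """Turn [header, row1, row2, ...] into {column name: [values]}."""
    header, *body = rows
    transposed = zip(*body)  # rows become columns
    return {
        name: [to_number(v) for v in column]
        for name, column in zip(unique_names(header), transposed)
    }

rows = [
    ["Player", "Prog", "Carries", "Prog"],
    ["Gabriel Jesus", "2", "33", "1"],
    ["Martinelli", "1", "34", "5"],
]
columns = columns_from_rows(rows)
print(columns)
```

The resulting dict can be passed to `pd.DataFrame(columns)` exactly like `full` above.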
CodePudding user response:
The correct approach is the one proposed by chitown88. However, if you want to copy and paste the data by hand into the terminal and get a CSV, you can do something like this:
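import pandas as pd
from datetime import datetime
while True:
    print("Enter/Paste your content. Ctrl-D or Ctrl-Z ( windows ) to save it.")
    contents = []
    while True:
        try:
            line = input()
        except EOFError:
            break
        contents.append(line)
    if not contents:#nothing was pasted, so stop instead of writing an empty file
        break
    df = pd.DataFrame([row.split(",") for row in contents])#one cell per comma-separated value
    df.to_csv(f"df_{int(datetime.now().timestamp())}.csv", index=False, header=False)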
Start the Python script, paste the data into the terminal, press Enter so the cursor is on an empty line, then press Ctrl-D (on Windows, Ctrl-Z followed by Enter) to export the pasted data to a CSV file.
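If you would rather not depend on pandas for the copy-paste route, the standard-library `csv` module can do the same job. This is only a sketch under the assumption that the first pasted line is the group header and the last column is the player id, as in the question; the function name `paste_to_csv` is hypothetical.

```python
import csv
import io

def paste_to_csv(source, target):
    """Copy comma-separated lines from source to target.

    Skips the first (group-header) line and drops the last (id) column.
    Returns the number of rows written.
    """
    reader = csv.reader(source)
    next(reader, None)  # skip the ",,,,,,Performance,..." line
    writer = csv.writer(target, lineterminator="\n")
    written = 0
    for row in reader:
        if row:  # ignore blank lines
            writer.writerow(row[:-1])
            written += 1
    return written

# In-memory stand-ins for the terminal and the output file
pasted = io.StringIO(
    ",,Performance,-additional\n"
    "Player,#,Gls,-9999\n"
    "Gabriel Jesus,9,0,b66315ae\n"
)
out = io.StringIO()
row_count = paste_to_csv(pasted, out)
print(out.getvalue())
```

In a real script you could pass `sys.stdin` as `source` and a file opened with `open('players.csv', 'w', newline='')` as `target`.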