Here is the CSV file I am using.
Goals:
I want to extract the rows where the value of column "generation_id" or "is_main_series" is greater than 0 and assign them to a variable "selection". (Just to be clear, I want rows where at least one of the two columns is greater than 0.)
Then, I want to extract the "identifier" column from "selection" and assign it to a variable "name".
Finally, I would like to format the rows of "selection" for each "name" in this way --> name(generation_id, is_main_series), name1(is_main_series), etc.
Error: line 7: ValueError: Cannot index with multidimensional key
I'm struggling to find a solution for this error because I don't think this DataFrame is multi-indexed.
Here is the code I wrote so far:
import pandas as pd
df = pd.read_csv('abilities.csv')
df = df.fillna(0)
def getPokedex():
    selection = df.loc[df[['generation_id', 'is_main_series']] != 0 ]
    for donor in selection:
        name = selection['identifier']
        name = name.to_string(index=False)
        for types in selection.columns:
            return f"{name}(types)"
print(getPokedex())
Would appreciate any help.
CodePudding user response:
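I hope I can help. I think I can resolve at least the first part. You are comparing two columns at once with a number, which gives a two-dimensional boolean DataFrame rather than a one-dimensional row mask, and .loc cannot index rows with that. Did you try something like this?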
# get the boolean values (columns 2 and 3 are generation_id and is_main_series)
booleanValues = (df.iloc[:,2] != 0) | (df.iloc[:,3] != 0)
# then filter the DataFrame with that mask
df_filtered = df[booleanValues]
I think that is what raises the error. I still have to think about the rest of the code, but I wanted to give you this quick help.
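To make the point about the mask more concrete, here is a minimal sketch. The small DataFrame is a made-up stand-in for abilities.csv (the rows and values are assumptions for illustration only), and using any(axis=1) is just one possible way to collapse the two-column comparison into a single row mask:

```python
import pandas as pd

# Hypothetical stand-in for abilities.csv (values are made up for illustration)
df = pd.DataFrame({
    "id": [1, 2, 3],
    "identifier": ["stench", "dummy", "shackle"],
    "generation_id": [3, 0, 5],
    "is_main_series": [1, 0, 0],
})

# Comparing two columns at once gives a 2-D boolean DataFrame, not a 1-D mask
mask_2d = df[["generation_id", "is_main_series"]] > 0

# Reduce it row-wise: True where at least one of the two columns is > 0
row_mask = mask_2d.any(axis=1)

# A 1-D boolean Series is a valid row indexer for .loc
selection = df.loc[row_mask]
print(selection["identifier"].tolist())
```

Selecting the columns by name rather than by position also keeps the filter working if the column order in the file changes.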
CodePudding user response:
As I understand your question, you want to print, in a particular format, the rows where "generation_id" or "is_main_series" is greater than 0. You can try the code below. See the inline comments.
import pandas as pd
url="https://raw.githubusercontent.com/veekun/pokedex/master/pokedex/data/csv/abilities.csv"
df=pd.read_csv(url)
#Filter dataframe for generation_id or is_main_series greater than 0
df = df[(df['generation_id']>0) | (df['is_main_series']>0)]
#Format and print the rows.
for index, row in df.iterrows():
    print(row['identifier'],'(', row['generation_id'], ',' , row['is_main_series'] ,')')
Output:
stench ( 3 , 1 )
drizzle ( 3 , 1 )
speed-boost ( 3 , 1 )
battle-armor ( 3 , 1 )
...
shackle ( 5 , 0 )
decoy ( 5 , 0 )
shield ( 5 , 0 )
Or
If you want all rows printed on one line, separated by commas, you can do this. The idea is to create a new column holding the formatted text for each row, convert that column to a list, and join the list into a comma-separated string.
df['Formatted_column'] = df['identifier'] + '(' + df['generation_id'].astype(str) + ',' + df['is_main_series'].astype(str) + ')'
print(', '.join(df['Formatted_column'].to_list()))
Output:
stench(3,1), drizzle(3,1), speed-boost(3,1), battle-armor(3,1), .... decoy(5,0), shield(5,0)
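If the goal in the question is instead to list, for each name, only the columns that are actually greater than 0 (for example name(generation_id, is_main_series), name1(is_main_series)), something along these lines might work. This is only a sketch under that reading of the question: the sample rows are invented, and format_row is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Hypothetical sample rows; the real data would come from abilities.csv
df = pd.DataFrame({
    "identifier": ["stench", "dummy", "shackle"],
    "generation_id": [3, 0, 5],
    "is_main_series": [1, 0, 0],
})

cols = ["generation_id", "is_main_series"]

def format_row(row):
    # Keep only the names of the columns whose value is > 0 in this row
    positive = [c for c in cols if row[c] > 0]
    return f"{row['identifier']}({', '.join(positive)})"

# Rows where at least one of the two columns is > 0
selection = df[(df[cols] > 0).any(axis=1)]

# Build one formatted string per row, then join them with commas
formatted = ", ".join(selection.apply(format_row, axis=1))
print(formatted)
```

With these sample rows, "dummy" is dropped by the filter and "shackle" lists only generation_id, since its is_main_series value is 0.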