I've browsed a few answers but haven't found exactly what I'm looking for yet.
I have a pandas DataFrame with a single column structured as follows (example):
0 alex
1 7
2 female
3 nora
4 3
5 female
...
999 fred
1000 15
1001 male
I want to split that single column into 3 columns holding name, age, and gender, so that it looks something like this:
name age gender
0 alex 7 female
1 nora 3 female
...
333 fred 15 male
Is there a way to do this? I was thinking about using the index, but I'm not sure how to actually do it.
CodePudding user response:
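Assuming 0 is your column label:
import numpy as np
import pandas as pd
# dtype=object keeps the ages as ints; without it NumPy casts every value to str
list_a = list(df[0])
a = np.array(list_a, dtype=object).reshape(-1, 3).tolist()
df2 = pd.DataFrame(a, columns=["name", "age", "gender"])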
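For reference, a self-contained sketch of the same reshape idea. The sample data is made up to match the question's layout, and casting age to int at the end is an assumption about what you want, not something the question asks for:

```python
import pandas as pd

# Sample data in the question's layout; the single column is labelled 0.
df = pd.DataFrame(["alex", 7, "female", "nora", 3, "female", "fred", 15, "male"])

# to_numpy() on an object column returns an object array, so the ages stay ints.
# reshape(-1, 3) raises ValueError if the row count is not a multiple of 3.
values = df[0].to_numpy().reshape(-1, 3)
df2 = pd.DataFrame(values, columns=["name", "age", "gender"])

# Every column is still object dtype after the reshape; cast age explicitly.
df2 = df2.astype({"age": int})
print(df2)
```

The reshape only works if every person has exactly three rows in the same order, so it is worth checking the input before relying on it.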
CodePudding user response:
Not the most efficient solution perhaps, but you can use pd.concat()
and put them all next to each other, if they're always in order:
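import pandas as pd
df = pd.DataFrame({'Value': ['alex', 7, 'female', 'nora', 3, 'female', 'fred', 15, 'male']})
# x = 0 picks rows 0, 3, 6 (name); x = 1 picks rows 2, 5, 8 (gender); x = 2 picks rows 1, 4, 7 (age)
df2 = pd.concat([df[(df.index + x) % 3 == 0].reset_index(drop=True) for x in range(3)], axis=1)
df2.columns = ["name", "gender", "age"]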
Returns:
name gender age
0 alex female 7
1 nora female 3
2 fred male 15
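If you would rather get the columns in the question's order (name, age, gender), a variant of the same modulo idea might look like the sketch below. It selects the 'Value' column as a Series and labels the pieces through the keys argument of pd.concat; the name parts is just an illustrative choice:

```python
import pandas as pd

df = pd.DataFrame({"Value": ["alex", 7, "female", "nora", 3, "female", "fred", 15, "male"]})

# Row position modulo 3 tells which field a row holds: 0 = name, 1 = age, 2 = gender.
parts = [
    df.loc[df.index % 3 == x, "Value"].reset_index(drop=True)
    for x in range(3)
]

# With Series inputs and axis=1, the keys become the column names.
df2 = pd.concat(parts, axis=1, keys=["name", "age", "gender"])
print(df2)
```

This relies on the default 0, 1, 2, ... index, so reset the index first if the frame has been filtered or sorted.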
CodePudding user response:
Consider unstack:
import pandas as pd
df = pd.DataFrame(["alex", 7, "female", "nora", 3, "female", "fred", 15, "male"])
people = range(len(df) // 3)
attributes = ["name", "age", "gender"]
multi_index = pd.MultiIndex.from_product([people, attributes])
df.set_index(multi_index).unstack(level=1).droplevel(level=0, axis=1).reindex(columns=attributes)
Output:
name age gender
0 alex 7 female
1 nora 3 female
2 fred 15 male
CodePudding user response:
Here is one way to do it:
# step through the DataFrame in strides of 3 and collect name, age and gender as arrays
# the slices start at positions 0, 1 and 2 respectively
name = df['Value'][::3].values
age = df['Value'][1::3].values
gender = df['Value'][2::3].values
# create a DataFrame from the values
out = pd.DataFrame({'name': name,
                    'age': age,
                    'gender': gender})
out
name age gender
0 alex 7 female
1 nora 3 female
2 fred 15 male
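The slicing above can also be wrapped in a small function. This is only a sketch: split_every_third is a hypothetical helper name, and the length check and the int cast for age are assumptions about how strict you want to be:

```python
import pandas as pd


def split_every_third(values: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: turn repeating name, age, gender rows into three columns."""
    if len(values) % 3 != 0:
        raise ValueError("expected a multiple of 3 rows (name, age, gender)")
    out = pd.DataFrame({
        "name": values.iloc[0::3].to_numpy(),    # positions 0, 3, 6, ...
        "age": values.iloc[1::3].to_numpy(),     # positions 1, 4, 7, ...
        "gender": values.iloc[2::3].to_numpy(),  # positions 2, 5, 8, ...
    })
    # The sliced arrays are object dtype; cast age so it behaves like a number.
    out["age"] = out["age"].astype(int)
    return out


df = pd.DataFrame({"Value": ["alex", 7, "female", "nora", 3, "female", "fred", 15, "male"]})
out = split_every_third(df["Value"])
print(out)
```

Using iloc makes the slices positional, so the function does not depend on what the index labels happen to be.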