I'd like to create new columns in my dataframe using the unique values from another column. For example,
Column 1 has the following values:
Apple
Apple
Banana
Strawberry
Strawberry
Strawberry
When I check the unique values in Column 1, the output is:
Apple
Banana
Strawberry
Now I want to use these three values to create columns named "Apple", "Banana", and "Strawberry", and I want to keep the code dynamic so it adapts to however many unique values are present in Column 1.
I'm new to Python, any help will be appreciated!
So far, I've been getting that output by manually creating new columns in the dataset. I need this to happen automatically, depending on the unique values in Column 1.
CodePudding user response:
Extract the unique values, then iterate over them to create the columns and fill in the data.
Here I only put boolean values, based on whether each row's col1 value matches the new column's name.
import pandas as pd
df = pd.DataFrame({"col1": ["apple", "apple", "banana", "pineapple", "banana", "apple"]})
data:
        col1
0      apple
1      apple
2     banana
3  pineapple
4     banana
5      apple
transform:
unique_col1_val = df["col1"].unique().tolist()
for u in unique_col1_val:
    df[u] = df["col1"] == u  # you need to decide how to fill these new columns
    # here we just store a bool indicating whether col1 matches the new column name
    # to store an int truth value (0/1) instead, use:
    # df[u] = (df["col1"] == u).astype(int)
In [72]: df
Out[72]:
        col1  apple  banana  pineapple
0      apple   True   False      False
1      apple   True   False      False
2     banana  False    True      False
3  pineapple  False   False       True
4     banana  False    True      False
5      apple   True   False      False
Using df[u] = (df["col1"] == u).astype(int) instead gives:
        col1  apple  banana  pineapple
0      apple      1       0          0
1      apple      1       0          0
2     banana      0       1          0
3  pineapple      0       0          1
4     banana      0       1          0
5      apple      1       0          0
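As an alternative to the loop, pandas has a built-in, pd.get_dummies, that builds one indicator column per unique value in a single call. This is a different technique from the loop above, not part of the original answer. A minimal sketch, assuming the same example frame and that 0/1 integers are the fill you want (the names dummies and result are just illustrative):

```python
import pandas as pd

df = pd.DataFrame({"col1": ["apple", "apple", "banana", "pineapple", "banana", "apple"]})

# One indicator column per unique value in col1.
# dtype=int yields 0/1; leave it out to get the library's default dtype.
dummies = pd.get_dummies(df["col1"], dtype=int)

# dummies shares df's index, so join lines the rows up one-to-one.
result = df.join(dummies)
print(result)
```

Note that get_dummies orders the new columns by sorted value, while the loop orders them by first appearance in col1. For this data the two orders happen to coincide.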