Add row values as new columns in a dataframe-CodePudding

I have a dataframe like follows:

A     B     C
a1    b1    c1
a1    b2    c2
a2    b1    c3
a2    b2    c4

I want to convert it to something like:

A     B1   B2     
a1    c1   c2     
a1    c1   c2     
a2    c3   c4     
a2    c3   c4

The values in B which are repeated are column names whose values are in C column. I want them to be columns of the dataframe

In fact the dataset is created by flattening a tree, there are more columns each of which is an internal node. the first column is root, C are leaves

These are some tries by me:

cmd = "b1,b2"
cond = ""
for c in cmd.split(","):
    cond  = "|"   f" (df['B'] == '{c}') "
    cols = [c]   cols
    new_col = main_df[main_df['B'] == c]['C']
    df[c] = new_col
df = df[eval(cond)]

However, it doesn't work.

CodePudding user response：

Try:

df.pivot('A', columns='B')

You will get only two rows, but your four rows are 2 rows duplicated anyhow.

    A   C
B       b1  b2
0   a1  c1  c2
1   a2  c3  c4

CodePudding user response：

df.pivot can be a solution for such cases. However, I wanted to add my desired columns and so I have this solution:

df = df[df["B"] == "b1"]
df = df.reset_index(drop=True)
for c in desired_cols.split(","):
    new_col = main_df[main_df['B'] == c]['C']
    new_col = new_col.reset_index(drop=True)
    df[c] = new_col