Home > other >  Split a python pandas column using positional string in another column
Split a python pandas column using positional string in another column

Time:07-06

I have a dataframe with the structure:

ID Split Data
1 GT:RC:BC:CN 1:4:5:3
2 GT:RC:CN 1:7:0
3 GT:BC 4:2

I would like to create n new columns and populate with the data in the Data column, where n is the total number of unique fields split by a colon in the Split column (in this case, this would be 4 new columns: GT, RC, BC, CN). The new columns should be populated with the corresponding data in the Data column, so for ID 3, only column GT and BC should be populated. I have tried using string splitting, but that doesn't take into account the correct column to move the data to.

The output should look like this:

ID Split Data GT RC BC CN
1 GT:RC:BC:CN 1:4:5:3 1 4 5 3
2 GT:RC:CN 1:7:0 1 7 0
3 GT:BC 4:2 4 2

CodePudding user response:

You can use:

out = df.join(pd.concat([pd.Series(d.split(':'), index=s.split(':'))
                         for s,d in zip(df['Split'], df['Data'])], axis=1).T)

output:

   ID        Split     Data GT   RC   BC   CN
0   1  GT:RC:BC:CN  1:4:5:3  1    4    5    3
1   2     GT:RC:CN    1:7:0  1    7  NaN    0
2   3        GT:BC      4:2  4  NaN    2  NaN
  • Related