Home > other >  Find Multiple Patterns and Merge
Find Multiple Patterns and Merge

Time:11-05

How can I find and sort in a new dataframe the following

Input:

      IdA        IdB           Value
0     123        ABC_25_01     1
1     456        DEF_25_01     2
2     123        ABC_60_01     3
3     456        DEF_60_01     4

Output

      IdA        IdB      Value25 Value60
0     123        ABC_01   1       3
1     456        DEF_01   2       4

Thank you!

CodePudding user response:

Assuming the values in IdB are all in the form X_Y_Z where X_Z is the final IdB and Y is the Column header:

  1. Use str.extract to get the "Category" i.e. the number between the two underscores
  2. User str.replace to remove the "Category" from IdB.
  3. pivot to get the required output
df["Category"] = df["IdB"].str.extract("_(.*)_")
df["IdB"] = df["IdB"].str.replace("_(.*)_", "_", regex=True)
output = (df.pivot(["IdA", "IdB"], "Category", "Value")
          .add_prefix("Value")
          .reset_index())

>>> output

Category  IdA     IdB  Value25  Value60
0         123  ABC_01        1        3
1         456  DEF_01        2        4
  • Related