How do I split a column based on strings, clean up data, then do calculations on it?

Time:11-30

I'm still learning my way around Python and trying to figure out how to process some data. I have a dataframe with one column that I need to split into three columns. I don't need to keep the original column.

Here's the data: "Given Data" is the original column. I want to extract columns A and B from it, then compute column C as A / B. Thanks for your help!

Data Screenshot

Answer:

Try str.strip to remove the surrounding parentheses and str.split to separate the two numbers:

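# Strip the parentheses, split on " / " into two columns, and cast both to int
df[["A", "B"]] = df["Given Data"].str.strip("()").str.split(" / ", expand=True).astype(int)
# Element-wise division A / B
df["C"] = df["A"].div(df["B"])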

>>> df
    Given Data    A    B         C
0  (313 / 321)  313  321  0.975078
1  (654 / 654)  654  654  1.000000
2  (673 / 842)  673  842  0.799287
3  (342 / 402)  342  402  0.850746
4  (586 / 774)  586  774  0.757106
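
If the real column is messier than the sample (stray spaces, rows that aren't numbers at all), the astype(int) step above will raise. A possible alternative, sketched here as an assumption rather than something from the question, is to pull the two numbers out with a regex via str.extract and convert them with pd.to_numeric. The first two rows are copied from the output above; the third has its whitespace deliberately altered and the fourth is a made-up bad row.

```python
import pandas as pd

# Rows 0-1 come from the output above; rows 2-3 are made-up messy input.
df = pd.DataFrame(
    {"Given Data": ["(313 / 321)", "(654 / 654)", " ( 673/842 ) ", "(n/a)"]}
)

# Capture the integers on either side of the slash, allowing optional spaces.
# Rows that do not match the pattern come back as NaN in both columns.
parts = df["Given Data"].str.extract(r"(\d+)\s*/\s*(\d+)")

# Convert the captured text to numbers; NaN rows force a float dtype.
df["A"] = pd.to_numeric(parts[0], errors="coerce")
df["B"] = pd.to_numeric(parts[1], errors="coerce")
df["C"] = df["A"].div(df["B"])

print(df)
```

Unmatched rows end up as NaN in A, B and C instead of stopping the script, so you can inspect or drop them afterwards with df.dropna(subset=["A", "B"]).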

If you want to convert the numeric "C" column to percentage strings, you can do:

df["C"] = df["C"].mul(100).map("{:.2f}%".format)

>>> df
    Given Data    A    B        C
0  (313 / 321)  313  321   97.51%
1  (654 / 654)  654  654  100.00%
2  (673 / 842)  673  842   79.93%
3  (342 / 402)  342  402   85.07%
4  (586 / 774)  586  774   75.71%
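
Note that after this conversion "C" holds strings, so further math on it (mean, sorting by value, and so on) no longer works as expected. One way around that, shown as a rough sketch, is to keep "C" numeric and put the formatted text in a separate column. The name "C_pct" is only an illustrative choice, and the three rows are taken from the output above.

```python
import pandas as pd

# First three rows of A and B from the output above.
df = pd.DataFrame({"A": [313, 654, 673], "B": [321, 654, 842]})
df["C"] = df["A"].div(df["B"])

# "C_pct" is a hypothetical column name: a string copy used only for display.
df["C_pct"] = df["C"].mul(100).map("{:.2f}%".format)

# "C" is still a float column, so numeric operations keep working.
print(df["C"].mean())
print(df)
```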