Home > front end >  How to move a character to another column in the same row in a pandas dataframe
How to move a character to another column in the same row in a pandas dataframe

Time:12-19

I got stuck trying to clean a dataframe similar to this one:

code course name EOS Mid test
AA101 Course 1 350 420 NaN
AA102 Course 2 400 470 NaN
AB101 Course 3 #560 570 NaN
AB102 Course 4 410 465 NaN
AC101 Course 5 # 522 NaN

I need to keep only numerical values in the column EOS and move # characters that appear in it to the column test, to indicate that an additional test is required for that course. This is because some of the values have a # before the actual number, such as Course 3, and some have only the # as the value, such as Course 5.

The dataframe was created using Camelot to extract those values from a PDF table.

What I need is to take this # out of this column and add it to the test column instead.

Is there an easy way to do that?

CodePudding user response:

There is no builtin function to do just this, but it can be done using two lines:

df.loc[df['EOS'].str.contains('#', na=False) 'test'] = '#'
df['EOS'] = df['EOS'].str.replace('#', '')

Output:

>>> df
    code  course name  EOS  Mid test
0  AA101     Course 1  350  420  NaN
1  AA102     Course 2  400  470  NaN
2  AB101     Course 3  560  570    #
3  AB102     Course 4  410  465  NaN
4  AC101     Course 5       522    

CodePudding user response:

You can do this

data = pd.DataFrame(
    [  ["AA101", "Course 1", "350",  420,   None],
       ["AA102", "Course 2", "400",  470,   None],
       ["AB101", "Course 3", "#560", 570,   None],
       ["AB102", "Course 4", "410",  465,   None],
       ["AC101", "Course 5", "#",    522,   None]  ],
    columns = ["code", "course name", "EOS", "Mid", "test"]
)

transformed_data = (
    data
    .assign(test=lambda d: d["EOS"].apply(lambda d_: "#" if "#" in d_ else None))
    .assign(EOS=lambda d: d["EOS"].replace({"#": ""}, regex=True))
)
  • Related