As described in the title, I have the following problem:
The data arrives as a pandas DataFrame that looks like this:
Article | Title |
---|---|
A0 | A00183 |
BB2 | BB2725 |
C2C3 | C2C3945 |
As you can see, the "Title" column repeats the string value of the "Article"
column as a prefix.
I want this prefix removed, so that the table looks as follows:
Article | Title |
---|---|
A0 | 0183 |
BB2 | 725 |
C2C3 | 945 |
I want to do this with Pandas.
I already found out how to read the length of the string in each row of the Article
column, so I already know how many characters to remove:
df1['Length of Article string:'] = df1['Article:'].apply(len)
But now I am too stupid to figure out how to delete these strings, whose length can differ in every row, from the Title
column.
Thanks for your help!
Kind regards
I tried the pandas documentation and found some hints regarding split and strip, but I do not have enough know-how to implement...
CodePudding user response:
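You can replace using a list derived from the Article column. With regex=True, each article string is treated as a regular expression and removed wherever it matches in any title.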
df["Title"] = df["Title"].replace(df["Article"].tolist(), "", regex=True)
print(df)
  Article Title
0      AA  0123
1     BBB   234
2    CCCC   345
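One caveat with the list-based replace is that the patterns are not tied to their own row and can match in the middle of a title. A minimal sketch of a stricter variant, assuming the article strings should be matched literally and only at the start of the title (the sample frame is rebuilt from the question, and the `patterns` name is just for illustration):

```python
import re

import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({
    "Article": ["A0", "BB2", "C2C3"],
    "Title": ["A00183", "BB2725", "C2C3945"],
})

# re.escape makes regex metacharacters in an article literal,
# and the leading ^ restricts the match to the start of the title.
patterns = ["^" + re.escape(article) for article in df["Article"]]

df["Title"] = df["Title"].replace(patterns, "", regex=True)
print(df)
```

This still tries every pattern against every row, so a title that happens to start with another row's article would be shortened as well. If that can happen in your data, a row-wise approach is safer.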
CodePudding user response:
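You can use Python's str.replace() in a lambda function applied row by row (axis=1). Note that this removes every occurrence of the article string in the title, not only the leading one.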
df['Title'] = df.apply(lambda x: x['Title'].replace(x['Article'], ''), axis=1)
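If only the leading copy of the article should go, the length idea from the question can be used directly. A rough sketch, assuming both columns contain plain strings without missing values (`strip_article` is a hypothetical helper name, and the sample frame is rebuilt from the question):

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({
    "Article": ["A0", "BB2", "C2C3"],
    "Title": ["A00183", "BB2725", "C2C3945"],
})


def strip_article(title: str, article: str) -> str:
    """Cut the article off the front of the title, if it is there."""
    if title.startswith(article):
        # Slice away exactly len(article) characters from the start.
        return title[len(article):]
    # Leave titles that do not start with their article untouched.
    return title


# Pair each title with the article from the same row.
df["Title"] = [
    strip_article(title, article)
    for title, article in zip(df["Title"], df["Article"])
]
print(df)
```

Because each title is only compared with the article from its own row, repeated or overlapping article strings elsewhere in the title are left alone.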