How do I use regex within a certain cell in a csv to edit?

Time:10-29

I have a CSV with a structure as:

Test CSV:

Column A Column B
abc-dfcv rebtgsergbsedrfgesrg
water rdfe egreg
oluiuilegregreg
def fefd rtjtyujdtgfhndgfhjfh
water edgregerg
rygebfvkjuer

As can be seen, each cell in Column B contains multiple lines. I need to edit it so that only the lines which start with "water" are kept within the cell and the rest of the lines are omitted. This has to be done for all cells in Column B.

The regex statement I've made is re.findall("^water'.*").

I tried to directly apply regex, but it halts and errors at the end of a line within a cell.

Thinking of something along these lines, but blanking on what the regex input should be.

df = pd.read_csv("MyFile.csv")
for p in range(len(df.index)):
    df._set_value(p, "SCHEDULES", str(re.findall("^water'.*", ??????????????? )))
df.to_csv("Nexpose_Schedules.csv", index=False)
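The missing second argument in the attempt above would be the text of the cell itself. Here is a minimal sketch of that idea, assuming each cell is a plain string with embedded newlines. The helper name keep_water_lines is made up for illustration, and the stray apostrophe in the pattern is dropped on the assumption that it was a typo. The re.MULTILINE flag is what lets ^ match at the start of every line rather than only at the start of the whole cell.

```python
import re

# Hypothetical helper: keep only the lines of one cell that start with "water".
WATER_LINE = re.compile(r"^water.*$", re.MULTILINE)

def keep_water_lines(cell: str) -> str:
    # With re.MULTILINE, ^ and $ anchor at every line boundary, and "."
    # does not cross a newline, so each match is exactly one line.
    return "\n".join(WATER_LINE.findall(cell))

# Sample cell text taken from the question.
cell = "rebtgsergbsedrfgesrg\nwater rdfe egreg\noluiuilegregreg"
print(keep_water_lines(cell))  # prints: water rdfe egreg
```

A cell with no matching line comes back as an empty string.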

CodePudding user response:

You can do it like this:

df = pd.read_csv('MyFile.csv')
df_new = df.loc[df['Column B'].str.contains(r'\bwater', case=False)]
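Note that the filter above selects whole rows whose cell contains "water" somewhere; it does not change the text inside a cell. If the goal is to edit the cells themselves, something like the following sketch might be closer. It assumes the column is named "Column B" as in the sample and uses a small in-memory frame in place of MyFile.csv; the split of values between the two columns is a guess from the flattened sample.

```python
import re
import pandas as pd

# Sample frame standing in for MyFile.csv (values adapted from the question).
df = pd.DataFrame({
    "Column A": ["abc-dfcv", "def fefd"],
    "Column B": [
        "rebtgsergbsedrfgesrg\nwater rdfe egreg\noluiuilegregreg",
        "rtjtyujdtgfhndgfhjfh\nwater edgregerg\nrygebfvkjuer",
    ],
})

# str.findall returns, per cell, the list of lines starting with "water";
# str.join glues that list back into a single multi-line string.
df["Column B"] = (
    df["Column B"].str.findall(r"^water.*$", flags=re.MULTILINE).str.join("\n")
)
print(df["Column B"].tolist())  # ['water rdfe egreg', 'water edgregerg']
```

Every row is kept; only the content of Column B is trimmed.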

CodePudding user response:

You can use the 'startswith' function instead of regex, and the answer would look like this:

result = df[df["Column B"].str.startswith("water")]
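One caveat: .str.startswith("water") tests the start of the whole cell, so a multi-line cell that begins with some other line would be filtered out. A regex-free sketch that checks each line separately could look like this. The function name keep_lines_starting_with is hypothetical, and the sample cell is taken from the question.

```python
def keep_lines_starting_with(cell: str, prefix: str = "water") -> str:
    # splitlines() handles both "\n" and "\r\n" line endings.
    kept = [line for line in cell.splitlines() if line.startswith(prefix)]
    return "\n".join(kept)

cell = "rtjtyujdtgfhndgfhjfh\nwater edgregerg\nrygebfvkjuer"
print(cell.startswith("water"))        # False: the cell begins with another line
print(keep_lines_starting_with(cell))  # water edgregerg
```

With a DataFrame, this could be applied as df["Column B"].apply(keep_lines_starting_with), assuming every cell in that column is a string.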