How do I use regex to edit the contents of a certain cell in a CSV?

I have a CSV with a structure as:

Test CSV:

Column A	Column B
abc-dfcv	rebtgsergbsedrfgesrg
	water rdfe egreg
	oluiuilegregreg


def fefd	rtjtyujdtgfhndgfhjfh
	water edgregerg
	rygebfvkjuer

As can be seen, each cell of Column B contains multiple lines. I need to edit it so that only the lines which start with "water" are kept within the cell and the rest of the lines are omitted. This has to be done for all cells in Column B.

The regex statement I've made is re.findall("^water'.*").

I tried to directly apply regex, but it halts and errors at the end of a line within a cell.

Thinking of something along these lines, but blanking on what the regex input should be.

df = pd.read_csv("MyFile.csv")
for p in range(len(df.index)):
    df._set_value(p, "SCHEDULES", str(re.findall("^water'.*", ??????????????? )))
df.to_csv("Nexpose_Schedules.csv", index=False)

CodePudding user response：

You can do it like this:

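import pandas as pd

df = pd.read_csv('MyFile.csv')
# Keep the rows whose Column B contains a word starting with "water" (case-insensitive)
df_new = df.loc[df['Column B'].str.contains(r'\bwater', case=False)]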
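
Note that the filter above keeps or drops whole rows; it does not change the text inside a cell. If the goal is to trim each cell down to its "water" lines, something like the sketch below might work. It assumes the lines inside a cell are separated by "\n" and that a wanted line begins with "water" with no leading whitespace. The sample DataFrame and the helper name keep_water_lines are made up for illustration. The key part is re.MULTILINE, which lets ^ match at the start of every line in the cell instead of only at the start of the whole string.

```python
import re
import pandas as pd

# Hypothetical sample data mirroring the layout in the question
df = pd.DataFrame({
    "Column A": ["abc-dfcv", "def fefd"],
    "Column B": [
        "rebtgsergbsedrfgesrg\nwater rdfe egreg\noluiuilegregreg",
        "rtjtyujdtgfhndgfhjfh\nwater edgregerg\nrygebfvkjuer",
    ],
})

def keep_water_lines(cell):
    """Return only the lines of a cell that start with 'water'."""
    if not isinstance(cell, str):
        # Leave missing values (NaN) untouched
        return cell
    # With re.MULTILINE, ^ and $ anchor at every line inside the cell
    matches = re.findall(r"^water.*$", cell, flags=re.MULTILINE)
    return "\n".join(matches)

df["Column B"] = df["Column B"].apply(keep_water_lines)
```

With a real file you would replace the sample DataFrame with pd.read_csv("MyFile.csv") and write the result back with df.to_csv(..., index=False), as in the question.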

CodePudding user response：

You can use the function 'startswith' instead of regex, and the answer would look like this:

result = df[df["Column B"].str.startswith("water")]
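
One caveat: str.startswith tests the beginning of the whole cell, so a multi-line cell whose first line is something else will not match even if a later line starts with "water". A possible regex-free way to filter the lines inside each cell is sketched below. The sample data and the helper name water_lines are hypothetical, and the sketch assumes the lines in a cell are newline-separated.

```python
import pandas as pd

# Hypothetical sample data: one multi-line cell, one single-line cell, one without a match
df = pd.DataFrame({
    "Column B": ["rebtg\nwater rdfe egreg\nolui", "water only", "no match here"],
})

# Row-level test: True only when the whole cell begins with "water"
row_mask = df["Column B"].str.startswith("water")

def water_lines(cell):
    """Keep each line of the cell that begins with 'water' (no regex needed)."""
    lines = str(cell).splitlines()
    return "\n".join(line for line in lines if line.startswith("water"))

# Line-level filter applied to every cell in the column
df["Column B"] = df["Column B"].map(water_lines)
```

Cells with no matching line (including missing values, which str() turns into text first) end up as empty strings here; whether that is what you want depends on your data.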