How to remove values after the last comma in a string

I am new to Python and to this platform, so apologies in advance if my post is not clear enough; I am struggling a bit with the table design.

I am basically trying to remove the last comma in my string and insert the word "and" instead.

Public_Transport_Route_Desc	Target
First: This route coincides with portions of existing bus routes 214, 216.	First: This route coincides with portions of existing bus routes 214 and 216.
Second: This route coincides with portions of existing bus routes 201, 205, 208, 214, 216, 220, 220X, 226X.	Second: This route coincides with portions of existing bus routes 201, 205, 208, 214, 216, 220, 220X and 226X.
Third: This route coincides with portions of existing bus routes 201, 208, 214, 216, 220, 220X, 226X.	Third: This route coincides with portions of existing bus routes 201, 208, 214, 216, 220, 220X and 226X.
Forth:This route does not coincide with any portions of existing bus routes.	Forth:This route does not coincide with any portions of existing bus routes.

I have handled the last part of the string (after the last comma) separately and added the word "and" to it. Now I need to remove the last part from the original string and join what remains with the amended last part, which includes the word "and". To do this, I tried to remove the part of the string after the last comma, but I haven't had much luck.

Is there a way to remove the part of the string after the last comma, or is there another way to approach the problem?

import pandas as pd

data = pd.read_csv("MCA Data.csv")
# Keep only the text after the last comma of each description
data["Temp"] = data["Public_Transport_Route_Desc"].str.rsplit(',').str[-1]
# Flag rows whose remaining text starts with "T" (missing values count as False)
data["Desc_Check"] = data["Temp"].str.startswith('T', na=False)

for index, row in data.iterrows():
    if data.loc[index, "Desc_Check"] == True:
        data.loc[index, "Temp"] = ""
    else:
        # Prefix the last part with "and"
        data.loc[index, "Temp"] = "and" + data.loc[index, "Temp"]

Thanks!

CodePudding user response：

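Simple string splitting and rejoining.

Note that this only replaces the last comma in the whole string, so it has issues with text that contains multiple sentences. You might need to split the text into sentences on the period character first and process each sentence separately.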

def replace_last_comma(instr: str) -> str:
    if ',' in instr:
        # split and rejoin, taking all commas excluding the last
        result = ",".join(instr.split(",")[:-1])
        # add the last part with " and"
        result = f"{result} and{instr.split(',')[-1]}"
        return result
    else:
        return instr

print(replace_last_comma("this, is, a, test"))
print(replace_last_comma("this is a, test"))
print(replace_last_comma("this is a test"))

output:

this, is, a and test
this is a and test
this is a test
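
The multi-sentence caveat mentioned in this answer could be handled along the following lines. This is only a minimal sketch: `replace_last_comma_per_sentence` is a hypothetical helper name, and it assumes that every sentence ends with a period followed by whitespace, which may not hold for abbreviations or decimal numbers.

```python
import re


def replace_last_comma(instr: str) -> str:
    # Same split-and-rejoin idea as in the answer above
    if "," not in instr:
        return instr
    parts = instr.split(",")
    return f"{','.join(parts[:-1])} and{parts[-1]}"


def replace_last_comma_per_sentence(text: str) -> str:
    # Assumption: a sentence ends with "." followed by whitespace.
    # The lookbehind keeps each period attached to its own sentence.
    sentences = re.split(r"(?<=\.)\s+", text)
    return " ".join(replace_last_comma(s) for s in sentences)


print(replace_last_comma_per_sentence("Routes 1, 2, 3. Stops A, B."))
# Routes 1, 2 and 3. Stops A and B.
```

Note that rejoining with a single space normalizes whatever whitespace originally separated the sentences.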

CodePudding user response：

I would use str.rpartition.

>>> s = "Foo, bar, baz, bang"
>>> start, sep, end = s.rpartition(", ")
>>> if sep:
...     sep = " and "
... 
>>> result = start + sep + end
>>> print(result)
Foo, bar, baz and bang

rpartition will split a string into three parts: the part of the string before the last occurrence of the separator, the separator itself, and the string after the separator. You can then change the separator to whatever you want and put the string back together. If the separator does not occur in the string, the first two strings will be empty, with the entire string in end.

>>> s = "Foo bar baz bang"
>>> start, sep, end = s.rpartition(", ")
>>> if sep:
...     sep = " and "
... 
>>> result = start + sep + end
>>> print(result)
Foo bar baz bang
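
The interactive session above can be wrapped in a small function and tried on strings shaped like the ones in the question. This is a sketch rather than a definitive version: `last_comma_to_and` is a hypothetical name, and it partitions on "," alone and strips the surrounding whitespace, on the assumption that the spacing after the last comma is not always a single space.

```python
def last_comma_to_and(text: str) -> str:
    # rpartition returns ("", "", text) when there is no comma
    start, sep, end = text.rpartition(",")
    if not sep:
        return text
    # Assumption: whitespace around the last comma is not significant
    return f"{start.rstrip()} and {end.lstrip()}"


samples = [
    "This route coincides with portions of existing bus routes 214, 216.",
    "This route does not coincide with any portions of existing bus routes.",
]
for sample in samples:
    print(last_comma_to_and(sample))
# This route coincides with portions of existing bus routes 214 and 216.
# This route does not coincide with any portions of existing bus routes.
```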

CodePudding user response：

Since you are loading the data into a pandas DataFrame, you can try the following code:

def replace_last_comma_with_and(input_str: str) -> str:
    # Split once, from the right, on the last comma
    comma_splits = input_str.rsplit(",", 1)
    if len(comma_splits) > 1:
        return f"{comma_splits[0].rstrip()} and {comma_splits[-1].lstrip()}"

    return input_str

data["Target"] = data["Public_Transport_Route_Desc"].apply(replace_last_comma_with_and)
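
One thing to watch for with `apply`: the question's code passes `na=False` to `startswith`, which suggests the column may contain missing values. pandas represents those as NaN (a float), so calling `rsplit` on them raises an `AttributeError`. A possible guard is sketched below; `replace_last_comma_safe` is a hypothetical name, and returning non-string cells unchanged is an assumption about what you want.

```python
def replace_last_comma_safe(value):
    # Assumption: non-string cells (NaN, None, numbers) are returned unchanged
    if not isinstance(value, str):
        return value
    parts = value.rsplit(",", 1)
    if len(parts) > 1:
        return f"{parts[0].rstrip()} and {parts[1].lstrip()}"
    return value


print(replace_last_comma_safe("bus routes 214, 216."))
# bus routes 214 and 216.
print(replace_last_comma_safe(float("nan")))
# nan
```

It can be passed to `apply` in the same way as the function above, for example `data["Public_Transport_Route_Desc"].apply(replace_last_comma_safe)`.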