I am new to Python and to this platform, so apologies in advance if my post is not clear enough. I am struggling a bit with the table design.
I am trying to replace the last comma in my string with the word "and":
Public_Transport_Route_Desc | Target |
---|---|
First: This route coincides with portions of existing bus routes 214, 216. | First: This route coincides with portions of existing bus routes 214 and 216. |
Second: This route coincides with portions of existing bus routes 201, 205, 208, 214, 216, 220, 220X, 226X. | Second: This route coincides with portions of existing bus routes 201, 205, 208, 214, 216, 220, 220X and 226X. |
Third: This route coincides with portions of existing bus routes 201, 208, 214, 216, 220, 220X, 226X. | Third: This route coincides with portions of existing bus routes 201, 208, 214, 216, 220, 220X and 226X. |
Forth:This route does not coincide with any portions of existing bus routes. | Forth:This route does not coincide with any portions of existing bus routes. |
I have handled the last part of the string (after the last comma) separately and added the word "and" to it. Now I need to remove the last part of the original string and join what remains with the amended last part, which includes the word "and". To do this, I tried to remove everything after the last comma, but I haven't had much luck.
Is there a way to remove the part of the string after the last comma, or is there another way to approach the problem?
import pandas as pd

data = pd.read_csv("MCA Data.csv")
# Keep only the text after the last comma
data["Temp"] = data["Public_Transport_Route_Desc"].str.rsplit(',').str[-1]
# Rows whose last part starts with 'T' contain no comma at all
data["Desc_Check"] = data["Temp"].str.startswith('T', na=False)
for index, row in data.iterrows():
    if data.loc[index, "Desc_Check"]:
        data.loc[index, "Temp"] = ""
    else:
        data.loc[index, "Temp"] = "and" + data.loc[index, "Temp"]
Thanks!
CodePudding user response:
Simple string splitting and rejoining works here.
Note that this has issues with text containing multiple sentences, because only the last comma of the whole string is replaced. You might need to split the text into sentences on the period character first.
def replace_last_comma(instr: str) -> str:
    if ',' in instr:
        # split and rejoin, taking all commas excluding the last
        result = ",".join(instr.split(",")[:-1])
        # add the last part with " and"
        result = f"{result} and{instr.split(',')[-1]}"
        return result
    else:
        return instr

print(replace_last_comma("this, is, a, test"))
print(replace_last_comma("this is a, test"))
print(replace_last_comma("this is a test"))
output:
this, is, a and test
this is a and test
this is a test
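To illustrate the multi-sentence caveat above, here is a minimal sketch that applies the replacement to each sentence separately. It assumes sentences end with a period followed by whitespace, which may not hold for your data; the helper name `replace_last_comma_per_sentence` is made up for this example, and the function is redefined with `rpartition` only to keep the snippet self-contained.

```python
import re


def replace_last_comma(text: str) -> str:
    # Split around the last comma; sep is empty when there is no comma.
    head, sep, tail = text.rpartition(",")
    if not sep:
        return text
    return f"{head} and{tail}"


def replace_last_comma_per_sentence(text: str) -> str:
    # Assumption: a sentence ends with "." followed by whitespace.
    sentences = re.split(r"(?<=\.)\s+", text)
    # Whitespace between sentences is normalised to a single space.
    return " ".join(replace_last_comma(s) for s in sentences)


print(replace_last_comma_per_sentence("Routes 1, 2, 3. Stops A, B."))
```

Abbreviations or decimals followed by a space would be treated as sentence boundaries, so check this against real rows before relying on it.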
CodePudding user response:
I would use str.rpartition.
>>> s = "Foo, bar, baz, bang"
>>> start, sep, end = s.rpartition(", ")
>>> if sep:
...     sep = " and "
...
>>> result = start + sep + end
>>> print(result)
Foo, bar, baz and bang
rpartition splits a string into three parts: the part of the string before the last occurrence of the separator, the separator itself, and the part after the separator. You can then change the separator to whatever you want and put the string back together. If the separator does not occur in the string, the first two strings will be empty, with the entire string in end.
>>> s = "Foo bar baz bang"
>>> start, sep, end = s.rpartition(", ")
>>> if sep:
...     sep = " and "
...
>>> result = start + sep + end
>>> print(result)
Foo bar baz bang
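If you want to use this on a DataFrame column, the same idea can be wrapped in a small function. This is only a sketch: the name `last_comma_to_and` is hypothetical, and it assumes that non-string cells (for example missing values) should be passed through unchanged.

```python
def last_comma_to_and(value):
    # Assumption: leave non-string values (e.g. NaN or None) untouched.
    if not isinstance(value, str):
        return value
    start, sep, end = value.rpartition(",")
    if not sep:
        # No comma found: return the text as it is.
        return value
    # Trim the whitespace around the comma before inserting " and ".
    return f"{start.rstrip()} and {end.lstrip()}"


print(last_comma_to_and("existing bus routes 214, 216."))
print(last_comma_to_and("with any portions of existing bus routes."))
```

A function like this can then be passed to `Series.apply` or `Series.map` on the description column.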
CodePudding user response:
Since you are loading the data into a pandas DataFrame, you can try the following code:
def replace_last_comma_with_and(input_str: str) -> str:
    # Split once from the right, so only the last comma is affected
    comma_splits = input_str.rsplit(",", 1)
    if len(comma_splits) > 1:
        return f"{comma_splits[0].rstrip()} and {comma_splits[-1].lstrip()}"
    return input_str

data["Target"] = data["Public_Transport_Route_Desc"].apply(replace_last_comma_with_and)
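Another option, not taken from the answers above, is a regular expression that matches only the last comma. The sketch below uses the standard `re` module; the names `LAST_COMMA` and `replace_last_comma_regex` are made up for illustration.

```python
import re

# A comma (plus surrounding whitespace) that is followed by no further commas.
LAST_COMMA = re.compile(r"\s*,\s*(?=[^,]*$)")


def replace_last_comma_regex(text: str) -> str:
    # Replace the single matching comma with " and ".
    return LAST_COMMA.sub(" and ", text, count=1)


print(replace_last_comma_regex("routes 201, 205, 220X, 226X."))
print(replace_last_comma_regex("with any portions of existing bus routes."))
```

The same pattern string should also work with `data["Public_Transport_Route_Desc"].str.replace(pattern, " and ", regex=True)`, which avoids `apply` and leaves missing values as they are.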