I have a dataframe
df1 = pd.DataFrame({"strings":["stackoverflow", "stackexchange"], "start":[3, 4], "end": [7, 9]})
I want to slice each value in the strings column between the start and end positions from the same row.
df1['strings'].str[df1['start']:df1['end']]
gives me NaN.
I managed to get the result this way, but it doesn't seem optimal.
[df1['strings'].str[i:j] for i, j in zip(df1['start'], df1['end'])][:1]
CodePudding user response:
I don't think there's any way to vectorize this - the best you can do is row-wise function application.
import pandas as pd

df1 = pd.DataFrame({"strings":["stackoverflow", "stackexchange"], "start":[3, 4], "end": [7, 9]})
# Slice each string using the start and end values from its own row
df1['strings'] = df1.apply(lambda x: x['strings'][x["start"]:x["end"]], axis=1)
See also: Get substring in one column based on the value in another column
Note that your example is not quite correct - you're taking the start and end values from a particular row and applying them to every row.
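If apply feels heavy, a plain list comprehension over the three columns is a common alternative. This is only a sketch: it pairs each string with the start and end values from its own row, and the "substring" column name is made up here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"strings": ["stackoverflow", "stackexchange"], "start": [3, 4], "end": [7, 9]})

# zip walks the three columns in lockstep, so each string is sliced
# with the start/end values from its own row.
# "substring" is a hypothetical column name, not from the question.
df1["substring"] = [s[i:j] for s, i, j in zip(df1["strings"], df1["start"], df1["end"])]

print(df1)
```

Writing to a new column keeps the original strings intact, which makes it easy to check the slices against the inputs.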
CodePudding user response:
One way is to loop over the rows:
for _, row in df1.iterrows():
    print(row['strings'][row['start']:row['end']])
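The loop above only prints the slices. If you want to keep them, you could collect the values and assign them back. A rough sketch, assuming the column names are valid Python identifiers so that itertuples can expose them as attributes; the "substring" column name is hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({"strings": ["stackoverflow", "stackexchange"], "start": [3, 4], "end": [7, 9]})

# Collect one slice per row instead of printing it.
pieces = []
for row in df1.itertuples(index=False):
    pieces.append(row.strings[row.start:row.end])

# "substring" is a made-up column name for the result.
df1["substring"] = pieces
print(df1)
```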