Python, split column of strings at positions specified in other columns

Time:09-21

I have a dataframe:

df1 = pd.DataFrame({"strings":["stackoverflow", "stackexchange"], "start":[3, 4], "end": [7, 9]})

I want to split the strings column at start and end positions.

df1['strings'].str[df1['start']:df1['end']]

gives me NaN.

I managed to get the result this way, but it doesn't seem optimal.

[df1['strings'].str[i:j] for i, j in zip(df1['start'], df1['end'])][:1]

CodePudding user response:

I don't think there's any way to vectorize this - the best you can do is row-wise function application.

import pandas as pd

df1 = pd.DataFrame({"strings":["stackoverflow", "stackexchange"], "start":[3, 4], "end": [7, 9]})
# Slice each string using the start and end values from its own row
df1['strings'] = df1.apply(lambda x: x['strings'][x["start"]:x["end"]], axis=1)

See also: Get substring in one column based on the value in another column

Note that your example is not quite correct - you're taking the start and end values from a particular row and applying them to every row.
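If you'd rather avoid `apply`, a plain list comprehension over the three columns does the same row-wise slicing. This is only a sketch of that idea; the `substring` column name is my own choice for illustration, not something from the question, and it is used here so the original `strings` column is left untouched.

```python
import pandas as pd

df1 = pd.DataFrame({
    "strings": ["stackoverflow", "stackexchange"],
    "start": [3, 4],
    "end": [7, 9],
})

# Pair each string with the start and end values from its own row.
# "substring" is a hypothetical column name chosen for this example.
df1["substring"] = [
    s[i:j] for s, i, j in zip(df1["strings"], df1["start"], df1["end"])
]

print(df1["substring"].tolist())  # ['ckov', 'kexch']
```

Unlike the attempt in the question, each `i` and `j` here is applied only to the string from the same row.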

CodePudding user response:

One way is to loop over the rows:

    for _, row in df1.iterrows():
        print(row['strings'][row['start']:row['end']])
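The loop above only prints the substrings. If you need to keep them, a variation along these lines should work; it uses `itertuples` in place of `iterrows` and collects the results in a list. The `result` list and the `substring` column name are assumptions of mine for the example.

```python
import pandas as pd

df1 = pd.DataFrame({
    "strings": ["stackoverflow", "stackexchange"],
    "start": [3, 4],
    "end": [7, 9],
})

# Collect each row's slice instead of printing it.
result = []
for row in df1.itertuples(index=False):
    # Columns are available as attributes on each named tuple.
    result.append(row.strings[row.start:row.end])

# "substring" is a hypothetical column name chosen for this example.
df1["substring"] = result
```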