Say we have this dict as a dataframe with two columns:
import pandas as pd

data = {
    "slice_by": [2, 2, 1],
    "string_to_slice": ["one", "two", "three"],
}
df = pd.DataFrame(data)
First line works just fine, second one doesn't:
df["string_to_slice"].str[1:]
df["string_to_slice"].str[:df["slice_by"]]
Output:
0 ne
1 wo
2 hree
Name: string_to_slice, Length: 3, dtype: object
0 NaN
1 NaN
2 NaN
Name: string_to_slice, Length: 3, dtype: float64
What would be the appropriate way to do this? I'm sure I could put something together with df.iterrows(), but that's probably not the most efficient way.
CodePudding user response:
Here is one way to do it, using apply:
df.apply(lambda x: x['string_to_slice'][:x['slice_by']], axis=1)
0 on
1 tw
2 t
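For reference, here is a self-contained sketch of the same row-wise idea, assuming the prefix slice str[:slice_by] is what is wanted. It also shows a plain list comprehension over zip, which gives the same result without building a row object for every row. The names prefix_apply and prefix_zip are only illustrative and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "slice_by": [2, 2, 1],
    "string_to_slice": ["one", "two", "three"],
})

# Row-wise apply: each row is passed to the lambda as a Series.
prefix_apply = df.apply(lambda x: x["string_to_slice"][:x["slice_by"]], axis=1)

# Same result with a list comprehension over the two columns.
prefix_zip = pd.Series(
    [s[:n] for s, n in zip(df["string_to_slice"], df["slice_by"])],
    index=df.index,  # keep the original index so the result aligns with df
)

print(prefix_apply.tolist())  # ['on', 'tw', 't']
print(prefix_zip.tolist())    # ['on', 'tw', 't']
```

Both versions still loop in Python, so which one is faster on a given dataset is worth measuring rather than assuming.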
CodePudding user response:
I am assuming you want str[slice_by:] and not str[:slice_by]. With that assumption you can do:
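import numpy as np

def slice_string(string_to_slice, slice_by):
    # Return everything from position slice_by onward.
    return string_to_slice[slice_by:]

# np.vectorize applies the function element-wise over both columns.
np_slice_string = np.vectorize(slice_string)
out = np_slice_string(df['string_to_slice'], df['slice_by'])
print(out)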
['e' 'o' 'hree']
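Note that np.vectorize returns a plain NumPy array, so the DataFrame index is not carried along. A minimal sketch, again assuming the suffix slice str[slice_by:], that stores the result back on the frame; the column name "sliced" is hypothetical and chosen only for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "slice_by": [2, 2, 1],
    "string_to_slice": ["one", "two", "three"],
})

def slice_string(string_to_slice, slice_by):
    # Return everything from position slice_by onward.
    return string_to_slice[slice_by:]

np_slice_string = np.vectorize(slice_string)

# The call returns an ndarray; wrap it in a Series that reuses df's index.
out = np_slice_string(df["string_to_slice"], df["slice_by"])
df["sliced"] = pd.Series(out, index=df.index)

print(df["sliced"].tolist())  # ['e', 'o', 'hree']
```

np.vectorize is a convenience wrapper around a Python-level loop rather than a compiled operation, so it should not be expected to outperform apply by a large margin.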