I'm trying to use a cell value as the slice for a string in a new column. For example, if I create this table:
data = pd.DataFrame(data = {'Name':['This is a title'], 'Number':[-5]})
Name Number
0 This is a title -5
And create a new column like so:
data['Test'] = data.Name.str[:data.Number.item()]
It'll create the new column, as expected:
Name Number Test
0 This is a title -5 This is a
The issue occurs when I have more than one row, so if I create the following table:
data = pd.DataFrame(data = {'Name':['This is a title', 'This is another title'], 'Number':[-5, -13]})
Name Number
0 This is a title -5
1 This is another title -13
The creation of the 'Test' column yields:
can only convert an array of size 1 to a Python scalar
I understand why this is happening: the column now has more than one value. What I want to know is how to do this with a dataframe that has more than one row. I've tried .items(), .values(), etc., and the new column just becomes NaN.
Any thoughts?
Thanks!
CodePudding user response:
You can use apply with axis=1 to go through the dataframe row by row.
import pandas as pd
data = pd.DataFrame(data = {'Name':['This is a title', 'This is another title'], 'Number':[-5, -13]})
data['Test'] = data.apply(lambda row: row['Name'][:row['Number']], axis=1)
print(data)
Output:
Name Number Test
0 This is a title -5 This is a
1 This is another title -13 This is
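For context on the original error, here is a minimal sketch (reusing the two-row table from the question; the variable names single, item_error and per_row are only for illustration). Series.item() only works when the Series holds exactly one element, which is why the one-row table worked. With apply and axis=1, each row is passed to the function on its own, so every string is sliced with its own Number.

```python
import pandas as pd

data = pd.DataFrame({'Name': ['This is a title', 'This is another title'],
                     'Number': [-5, -13]})

# .item() succeeds on a Series with exactly one element...
single = data['Number'].iloc[:1].item()

# ...but raises ValueError once the Series has more than one element.
try:
    data['Number'].item()
    item_error = None
except ValueError as exc:
    item_error = str(exc)

# With axis=1 the lambda receives one row at a time, so each
# string is sliced using the Number from the same row.
per_row = data.apply(lambda row: row['Name'][:row['Number']], axis=1)
print(per_row.tolist())
```

Note that a negative stop counts from the end of the string, so the results keep the trailing space that preceded the removed text.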
CodePudding user response:
Unfortunately, you need to loop here, since each row needs a different slice. A list comprehension will likely be the most efficient way:
data['Test'] = [s[:i] for s,i in zip(data['Name'], data['Number'])]
Output:
Name Number Test
0 This is a title -5 This is a
1 This is another title -13 This is
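If you need this in more than one place, the list comprehension can be wrapped in a small function. This is only a sketch: slice_by_column is a hypothetical helper name, not a pandas function, and it assumes the text column contains only strings and the stop column contains only integers (a missing value in either would raise a TypeError). The last line recomputes the column with apply so the two approaches can be compared.

```python
import pandas as pd

def slice_by_column(df, text_col, stop_col):
    """Hypothetical helper: slice each string in text_col up to the
    stop index stored in the same row of stop_col."""
    # zip pairs each string with the stop value from its own row.
    return [s[:i] for s, i in zip(df[text_col], df[stop_col])]

data = pd.DataFrame({'Name': ['This is a title', 'This is another title'],
                     'Number': [-5, -13]})

data['Test'] = slice_by_column(data, 'Name', 'Number')

# Same computation with apply, for comparison.
via_apply = data.apply(lambda row: row['Name'][:row['Number']], axis=1)
print(data['Test'].tolist() == via_apply.tolist())
```

Both approaches loop in Python under the hood; the list comprehension simply avoids building a Series object for every row, which is where apply with axis=1 tends to spend its time.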