def mean1(x):
    return sum(x)/len(x)
df2['children'] = df2['children'].apply(mean1)
The error I get is: TypeError: 'int' object is not iterable
I think I am calling apply() correctly, but I still get the error.
CodePudding user response:
You should apply mean1 on the column, not the items:
df2['children'] = mean1(df2['children'])
Or better, use the pandas built-in mean method:
df2['children'] = df2['children'].mean()
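As a minimal sketch of the difference: the question does not show the contents of df2, so the values below are made up for illustration, and record is a hypothetical helper used only to capture what apply passes in.

```python
import pandas as pd

# Hypothetical data; the question does not show what df2 contains.
df2 = pd.DataFrame({"children": [0, 2, 1, 3]})

def mean1(x):
    return sum(x) / len(x)

# Series.apply calls the function once per element of the column.
received = []

def record(value):
    received.append(value)
    return value

df2["children"].apply(record)
print(received)  # the individual cell values, one call each

# Passing the whole column yields a single number,
# which matches the built-in Series.mean().
col_mean = mean1(df2["children"])
builtin_mean = df2["children"].mean()
print(col_mean, builtin_mean)
```

Note that assigning this single number back to df2['children'] fills every row with the same value.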
CodePudding user response:
With a sample dataframe:
In [372]: df
Out[372]:
   0    1   2   3
1  0    1   2   3
2  4  100   6   7
3  8    9  10  11
In [373]: df[1] # one column
Out[373]:
1      1
2    100
3      9
Name: 1, dtype: int64
and your function, modified to show the x it receives:
In [375]: def mean1(x):
     ...:     print(x)
     ...:     return sum(x)/len(x)
     ...:
In [376]: df[1].apply(mean1)
1
Traceback (most recent call last):
  File "<ipython-input-376-e12f9dfea5ae>", line 1, in <module>
    df[1].apply(mean1)
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/series.py", line 4357, in apply
    return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/apply.py", line 1043, in apply
    return self.apply_standard()
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/apply.py", line 1099, in apply_standard
    mapped = lib.map_infer(
  File "pandas/_libs/lib.pyx", line 2859, in pandas._libs.lib.map_infer
  File "<ipython-input-375-48efb527b53e>", line 3, in mean1
    return sum(x)/len(x)
TypeError: 'int' object is not iterable
See, x is 1, a single number. Python can't do the sum and len on 1. The error wasn't in the apply, but in your function, which was not written with single numbers in mind.
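The same failure can be reproduced without pandas at all, which suggests the problem lies in what the function is given rather than in apply. A small sketch, reusing the numbers from the sample column above:

```python
def mean1(x):
    return sum(x) / len(x)

# A single number: sum() needs something it can iterate over.
try:
    mean1(1)
except TypeError as exc:
    message = str(exc)
    print(message)

# A collection of numbers works as intended.
result = mean1([1, 100, 9])
print(result)
```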
What were you intending it to do? Take the mean of the whole column? Or the mean of an array or list in each cell?
In [378]: mean1(df[1])
1      1
2    100
3      9
Name: 1, dtype: int64
Out[378]: 36.666666666666664
apply and your function would work if the dataframe column contained lists or arrays:
In [386]: df = pd.DataFrame([None,None,None],columns=['one'])
In [387]: df['one'] = [np.ones(5).tolist(), np.arange(4).tolist(), np.zeros(9).tolist()]
In [388]: df
Out[388]:
                                             one
0                      [1.0, 1.0, 1.0, 1.0, 1.0]
1                                   [0, 1, 2, 3]
2  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
In [389]: df['one'].apply(mean1)
[1.0, 1.0, 1.0, 1.0, 1.0]
[0, 1, 2, 3]
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
Out[389]:
0    1.0
1    1.5
2    0.0
Name: one, dtype: float64
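A self-contained sketch of this list-per-cell case, assuming every cell holds a non-empty list of numbers (an empty list would make mean1 divide by zero). np.mean is swapped in here as one possible alternative to the hand-written function; it is not part of the session above.

```python
import numpy as np
import pandas as pd

# Same three list-valued cells as in the session above.
df = pd.DataFrame(
    {"one": [np.ones(5).tolist(), np.arange(4).tolist(), np.zeros(9).tolist()]}
)

def mean1(x):
    return sum(x) / len(x)

# apply hands each cell (a list) to the function, so sum and len both work.
by_hand = df["one"].apply(mean1)

# Alternative: let NumPy compute the mean of each list.
by_numpy = df["one"].apply(lambda cell: float(np.mean(cell)))

print(by_hand.tolist())
print(by_numpy.tolist())
```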