Problems with the apply method

def mean1(x):
    return sum(x)/len(x)

df2['children'] = df2['children'].apply(mean1)

The error I am getting is: 'int' object is not iterable

I think I am using apply() correctly, but I still get the error.

CodePudding user response:

Series.apply calls the function once per element, so mean1 receives a single int rather than the column. Call mean1 on the whole column instead:

df2['children'] = mean1(df2['children'])

Or better, use the built-in pandas mean method:

df2['children'] = df2['children'].mean()
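
As a minimal sketch of what that assignment does, assuming a small made-up frame standing in for df2 (the values and the children_mean column name are illustrative, not from the question): Series.mean() returns one scalar, and assigning a scalar to a column broadcasts it to every row.

```python
import pandas as pd

# Hypothetical data standing in for the asker's df2
df2 = pd.DataFrame({"children": [0, 1, 3, 2]})

# A single float computed over the whole column
col_mean = df2["children"].mean()

# Assigning a scalar broadcasts it to every row; a new column is used
# here so the original values stay visible for comparison
df2["children_mean"] = col_mean

print(col_mean)
print(df2)
```

Writing the result back into df2['children'] itself, as in the line above, would overwrite every value in that column with the same number, so check that this is really what you want.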

CodePudding user response:

With a sample dataframe

In [372]: df
Out[372]: 
   0    1   2   3
1  0    1   2   3
2  4  100   6   7
3  8    9  10  11
In [373]: df[1]     # one column
Out[373]: 
1      1
2    100
3      9
Name: 1, dtype: int64

and your function - modified to show what x it's getting:

In [375]: def mean1(x):
     ...:     print(x)
     ...:     return sum(x)/len(x)
     ...: 
In [376]: df[1].apply(mean1)
1
Traceback (most recent call last):
  File "<ipython-input-376-e12f9dfea5ae>", line 1, in <module>
    df[1].apply(mean1)
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/series.py", line 4357, in apply
    return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/apply.py", line 1043, in apply
    return self.apply_standard()
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/apply.py", line 1099, in apply_standard
    mapped = lib.map_infer(
  File "pandas/_libs/lib.pyx", line 2859, in pandas._libs.lib.map_infer
  File "<ipython-input-375-48efb527b53e>", line 3, in mean1
    return sum(x)/len(x)
TypeError: 'int' object is not iterable

Note that x is 1, a single number. Python can't call sum or len on the int 1. The error isn't in apply but in your function, which wasn't written to handle single numbers.
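
The same thing can be checked without triggering a traceback. This is only a sketch: the series reuses the values of column 1 from the sample, and the names received and record are made up for illustration. It collects every argument that Series.apply hands to the function.

```python
import numbers
import pandas as pd

# Same values as column 1 of the sample dataframe
s = pd.Series([1, 100, 9], index=[1, 2, 3])

received = []  # hypothetical list collecting what apply passes in

def record(x):
    # Store the argument and return it unchanged
    received.append(x)
    return x

s.apply(record)

# Each call received one plain number, never the whole Series
print(received)
print(all(isinstance(x, numbers.Number) for x in received))
```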

What did you intend it to do? Take the mean of the whole column? Or the mean of an array or list in each cell?

Called directly on the whole column, the function works:

In [378]: mean1(df[1])
1      1
2    100
3      9
Name: 1, dtype: int64
Out[378]: 36.666666666666664
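
A related point, sketched here with the same sample values: DataFrame.apply (as opposed to Series.apply) passes each column to the function as a Series by default, so a function like mean1 works there unchanged. The names col_means and builtin_means are my own.

```python
import pandas as pd

# Rebuild the sample dataframe from the transcript
df = pd.DataFrame(
    [[0, 1, 2, 3], [4, 100, 6, 7], [8, 9, 10, 11]],
    index=[1, 2, 3],
)

def mean1(x):
    return sum(x) / len(x)

# DataFrame.apply with the default axis=0 calls mean1 once per column,
# passing that column as a Series
col_means = df.apply(mean1)

# The built-in method computes the same per-column means
builtin_means = df.mean()

print(col_means)
print(builtin_means)
```

In practice df.mean() is the simpler choice; the apply version is shown only to make clear which object the function receives in each case.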

apply and your function would work if the dataframe column contained lists or arrays:

In [386]: df = pd.DataFrame([None,None,None],columns=['one'])
In [387]: df['one'] = [np.ones(5).tolist(),np.arange(4).tolist(),np.zeros(9).tolist()]
In [388]: df
Out[388]: 
                                             one
0                      [1.0, 1.0, 1.0, 1.0, 1.0]
1                                   [0, 1, 2, 3]
2  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
In [389]: df['one'].apply(mean1)
[1.0, 1.0, 1.0, 1.0, 1.0]
[0, 1, 2, 3]
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
Out[389]: 
0    1.0
1    1.5
2    0.0
Name: one, dtype: float64
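
For list-valued cells like these, a sketch using np.mean inside a lambda (an alternative I'm substituting for the hand-written mean1, not something from the question) gives the same per-row result. The name row_means is made up.

```python
import numpy as np
import pandas as pd

# One list per cell, matching the example above
df = pd.DataFrame({"one": [
    np.ones(5).tolist(),
    np.arange(4).tolist(),
    np.zeros(9).tolist(),
]})

# Series.apply passes each cell (a list) to the lambda, which averages it
row_means = df["one"].apply(lambda cell: float(np.mean(cell)))

print(row_means)
```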