I have been trying to learn Python on DataCamp. When I run statistical aggregations with agg, such as min and max, I get the same results whether or not I use np. However, I cannot use mean and median without np.
So:
Is there any difference between np.min and min, or between np.max and max?
Why don't median and mean work without numpy?
Code:
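# Select several columns with a list (double brackets); tuple-style selection was removed in pandas 2.0
unemp_fuel_stats = sales.groupby("type")[["unemployment",
    "fuel_price_usd_per_l"]].agg([np.min, np.max, np.mean, np.median])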
Result:
     unemployment                       fuel_price_usd_per_l
             amin   amax   mean median                  amin   amax   mean median
type
A           3.879  8.992  7.973  8.067                 0.664  1.107  0.745  0.735
B           7.170  9.765  9.279  9.199                 0.760  1.108  0.806  0.803
Another result:
     unemployment        fuel_price_usd_per_l
              min    max                  min    max
type
A           3.879  8.992                0.664  1.107
B           7.170  9.765                0.760  1.108
Answer:
min is the built-in Python function, and numpy.min is a separate function with the same name that lives in the numpy package. They behave similarly but not identically.
Given a single list, they will act the same:
print(np.min([1, 2, 3, 4, 5]))
print(min([1, 2, 3, 4, 5]))
1
1
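However, given multiple arguments, numpy.min will throw an error: only the first argument is treated as the data (and it must be array-like), while the remaining positional arguments are read as axis, out, and so on. The built-in min will simply return the minimum of all the arguments.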
print(min(1, 2, 3, 4, 5))
try:
    print(np.min(1, 2, 3, 4, 5))
except Exception as e:
    print(e)
1
output must be an array
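To make the reason for that error concrete, here is a minimal sketch of how the positional arguments line up. It assumes the documented parameter order of numpy.min (a, axis, out, keepdims, initial); the variable names are only illustrative, and the exact error message may differ between numpy versions.

```python
import numpy as np

values = [1, 2, 3, 4, 5]

# Pass the data as ONE array-like argument.
smallest = int(np.min(values))
print(smallest)

# np.min(1, 2, 3, 4, 5) is read as a=1, axis=2, out=3, keepdims=4, initial=5,
# so it fails instead of comparing the five numbers.
try:
    np.min(1, 2, 3, 4, 5)
    raised = False
except Exception as e:
    raised = True
    print(type(e).__name__, e)
```

So with numpy, wrap the values in a list or array first; with the built-in min, either form works.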
numpy.min can take the minimum along a specific axis of a multi-dimensional array (written as nested lists, hence the multiple square brackets) or the minimum of all elements. The built-in min, given a list of lists, simply returns the smallest inner list, where lists are compared element by element from left to right (lexicographic order).
twod_array = [[1, 2, 5],
              [3, 7, 5],
              [10, 5, 7]]
print(min(twod_array))             # Smallest inner list
print(np.min(twod_array))          # Minimum of all elements
print(np.min(twod_array, axis=0))  # Minimum of each column
print(np.min(twod_array, axis=1))  # Minimum of each row
[1, 2, 5]
1
[1 2 5]
[1 3 5]
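For comparison, here is a small sketch of how the built-in min treats nested lists, using only plain Python. The rows data is made up for illustration; it is chosen so that the smallest inner list is not the first one.

```python
rows = [[3, 7, 5],
        [1, 9, 9],
        [1, 2, 50]]

# Inner lists are compared element by element, left to right:
# [1, 2, 50] < [1, 9, 9] because the first elements tie and 2 < 9.
smallest_row = min(rows)
print(smallest_row)

# Plain-Python equivalents of np.min(rows), axis=0 and axis=1.
smallest_value = min(value for row in rows for value in row)
column_minimums = [min(column) for column in zip(*rows)]
row_minimums = [min(row) for row in rows]
print(smallest_value, column_minimums, row_minimums)
```

Note that the smallest inner list contains 50, the largest value overall, so min on a list of lists is not a substitute for np.min on a 2-D array.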
Hope that was enlightening. The tl;dr is that they are simply different functions with the same name. It's good to use import numpy as np so that min and np.min stay clearly distinct. Using something like from numpy import min replaces the built-in min in your namespace, which makes it unclear which function is actually being called.
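A quick sketch of that shadowing effect is below. It uses the standard builtins module to reach the original function after the name has been replaced; this is shown only to illustrate the problem, not as a recommended pattern.

```python
import builtins

import numpy as np
from numpy import min  # the name "min" now refers to numpy's function

# The bare name and np.min are the same object after the import above.
print(min is np.min)

# The built-in is still reachable, but only through the builtins module.
print(builtins.min(3, 1, 2))
```

With import numpy as np alone, min keeps its built-in meaning and np.min is always explicit.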
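Python has no built-in median or mean function, which is why those bare names fail without numpy (the standard library's statistics module has both, but it needs an import as well). numpy provides them as numpy.median and numpy.mean.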
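As a side note for the groupby case in the question, pandas also accepts aggregation names as strings, so none of the four statistics needs numpy. The sketch below uses a tiny made-up sales DataFrame (the values are hypothetical, not the DataCamp data) with the same column names as the question.

```python
import pandas as pd

# Hypothetical stand-in for the "sales" DataFrame from the question.
sales = pd.DataFrame({
    "type": ["A", "A", "B", "B"],
    "unemployment": [4.0, 8.0, 7.0, 9.0],
    "fuel_price_usd_per_l": [0.7, 0.9, 0.8, 1.0],
})

# String names are resolved by pandas itself, so no np.* functions are needed.
stats = sales.groupby("type")[["unemployment", "fuel_price_usd_per_l"]].agg(
    ["min", "max", "mean", "median"]
)
print(stats)
```

With string names the result columns are labelled min, max, mean and median.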