How can I replace nan in a numpy array with a blank or empty string? I googled it, but the results are all about nan inside a pandas dataframe rather than a numpy array.
CodePudding user response:
An array with np.nan will be float dtype (let's not talk about object dtypes here :))
In [274]: arr = np.array([1,2,np.nan, 4,np.nan])
In [275]: arr
Out[275]: array([ 1., 2., nan, 4., nan])
In [277]: arr[[2,4]]
Out[277]: array([nan, nan])
We can't replace any value in such array with a string!
In [278]: arr[[2,4]] = ' '
Traceback (most recent call last):
Input In [278] in <cell line: 1>
arr[[2,4]] = ' '
ValueError: could not convert string to float: ' '
But if we first convert the float dtype to string:
In [279]: sarr = arr.astype(str)
In [280]: sarr
Out[280]: array(['1.0', '2.0', 'nan', '4.0', 'nan'], dtype='<U32')
In [281]: sarr[[2,4]] = ' '
In [282]: sarr
Out[282]: array(['1.0', '2.0', ' ', '4.0', ' '], dtype='<U32')
In a string dtype array, 'nan' isn't special, not like it is in a float. We have to use isnan to identify float nan:
In [283]: np.isnan(arr)
Out[283]: array([False, False, True, False, True])
In [284]: np.nonzero(np.isnan(arr))
Out[284]: (array([2, 4]),)
but use ordinary == to test for string 'nan':
In [285]: sarr = arr.astype(str)
In [286]: sarr == 'nan'
Out[286]: array([False, False, True, False, True])
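The two steps above (find nan while the data is still float, then overwrite in the string copy) can be wrapped in a small helper. This is only a sketch; the function name nan_to_blank and its blank parameter are made up for illustration and are not part of numpy.

```python
import numpy as np

def nan_to_blank(arr, blank=""):
    """Return a string-dtype copy of a float array with nan replaced by `blank`.

    Hypothetical helper for illustration only.
    """
    arr = np.asarray(arr, dtype=float)
    mask = np.isnan(arr)      # locate nan while the data is still float
    out = arr.astype(str)     # '1.0', '2.0', 'nan', ...
    out[mask] = blank         # overwrite the 'nan' strings
    return out

result = nan_to_blank([1, 2, np.nan, 4, np.nan])
print(result)
```

The result is a string array, so the remaining values are text like '1.0' and can no longer be used in arithmetic without converting back.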
Several answers suggest pandas - as in:
In [287]: S = pd.Series(arr)
In [288]: S
Out[288]:
0 1.0
1 2.0
2 NaN
3 4.0
4 NaN
dtype: float64
In [289]: S.replace?
In [290]: S.replace(np.nan, ' ')
Out[290]:
0 1.0
1 2.0
2
3 4.0
4
dtype: object
Note, though, the change in dtype, from float to object. In this case, the series contains a mix of floats and strings.
In [292]: _.to_numpy()
Out[292]: array([1.0, 2.0, ' ', 4.0, ' '], dtype=object)
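If pandas isn't otherwise needed, a similar mixed float/string result can be produced with numpy alone by converting to object dtype first. A minimal sketch, assuming the same arr as above; note that the mask has to be computed on the float array, because np.isnan does not work on object arrays.

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4, np.nan])

mask = np.isnan(arr)       # computed on the float array
obj = arr.astype(object)   # elements become Python floats
obj[mask] = ""             # an object array can hold strings too
print(obj)
```

As with the pandas result, this array holds floats and strings side by side, so numeric operations on the whole array will generally fail.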
CodePudding user response:
Try this:
dfCopy = df.replace(np.nan, '', regex=True)
Check out the pandas documentation for DataFrame.replace.
CodePudding user response:
You can use built-in functions to replace particular values, for example:
import numpy as np
arr = np.array((np.nan, 1, 0, np.nan, -42))
arr[np.isnan(arr)] = -100
print(arr)
The output would be:
[-100.    1.    0. -100.  -42.]
Note: you should be careful about what value you replace np.nan with, as it should be the same type as the array (i.e. if your array is of type str you can replace with an empty string).
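To illustrate the note about types, here is a small sketch showing that a float array accepts a number as the replacement but rejects an empty string. The variable name failed is just for the demonstration.

```python
import numpy as np

arr = np.array([np.nan, 1, 0, np.nan, -42])
mask = np.isnan(arr)

try:
    arr[mask] = ""         # a string cannot be stored in a float array
    failed = False
except ValueError:
    failed = True

arr[mask] = -100           # a number is fine
print(failed, arr)
```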
CodePudding user response:
You can use the np.where() function to do that, like this:
import numpy as np
a = np.array([[np.nan, 2], [3, np.nan]])
a = np.where(np.isnan(a), '', a)
print(a)
Output:
[['' '2.0']
['3.0' '']]
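Note that the np.where() call above returns a string array, because the empty string and the floats have to share one dtype. A sketch that makes the conversion explicit, rather than relying on numpy's type promotion, might look like this:

```python
import numpy as np

a = np.array([[np.nan, 2], [3, np.nan]])

# Convert to strings explicitly, then pick '' wherever the float was nan
out = np.where(np.isnan(a), "", a.astype(str))
print(out)
print(out.dtype.kind)   # 'U' means a unicode string dtype
```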
Also, if you want to replace it with a numeric value, you could use the np.nan_to_num() function:
a = np.array([[np.nan, 2], [3, np.nan]])
a = np.nan_to_num(a, nan=0)
print(a)
Output:
[[0. 2.]
[3. 0.]]
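One detail worth knowing: by default np.nan_to_num() returns a new array and leaves its input alone. A short sketch, assuming a numpy version recent enough to support the nan keyword:

```python
import numpy as np

a = np.array([[np.nan, 2], [3, np.nan]])
filled = np.nan_to_num(a, nan=0.0)   # returns a copy by default

print(filled)
print(np.isnan(a).sum())             # the original still has its two nan values
```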
CodePudding user response:
Using the pandas fillna method, for example:
df2 = df.fillna("")
You can also convert a numpy array to a DataFrame as follows:
df = pd.DataFrame(numpy_array)
For more, please check the following: https://sparkbyexamples.com/pandas/pandas-replace-nan-with-blank-empty-string/
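Putting the two snippets together, a round trip from a numpy array through pandas and back could look like the sketch below. The sample values in numpy_array are made up for illustration.

```python
import numpy as np
import pandas as pd

numpy_array = np.array([[1.0, np.nan], [np.nan, 4.0]])

df = pd.DataFrame(numpy_array)   # numpy array -> DataFrame
df2 = df.fillna("")              # nan -> empty string (columns become object dtype)
result = df2.to_numpy()          # DataFrame -> numpy array again

print(result)
```

The array that comes back has object dtype, holding floats and empty strings together.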