I have a pd.DataFrame (called O) of 1402 rows × 1402 columns with columns and index both as ['XXX-icsd', 'YYY-icsd', ...] and cell values as some np.float64, some np.nan and problematically, some as pandas.core.series.Series.
202324-icsd | 644068-icsd | 27121-icsd | 93847-icsd | 154319-icsd | |
---|---|---|---|---|---|
202324-icsd | 0.000000 | 0.029729 | NaN | 0.098480 | 0.097867 |
644068-icsd | NaN | 0.000000 | NaN | 0.091311 | 0.091049 |
27121-icsd | 0.144897 | 0.137473 | 0.0 | 0.081610 | 0.080442 |
93847-icsd | NaN | NaN | NaN | 0.000000 | 0.005083 |
154319-icsd | NaN | NaN | NaN | NaN | 0.000000 |
The problem is that some cells (e.g. O.loc["192693-icsd", "192401-icsd"]
) contain a pandas.core.series.Series of form:
192693-icsd 0.129562
192693-icsd 0.129562
Name: 192401-icsd, dtype: float64
I'm struggling to make this cell contain only a np.float64. I tried:
O.loc["192693-icsd", "192401-icsd"] = O.loc["192693-icsd", "192401-icsd"][0]
and other various known forms of assignnign a new value to a cell in pd.DataFrame, but they only assign a new element to the same series in this cell, e.g. if I do
O.loc["192693-icsd", "192401-icsd"] = 5
then when calling O.loc["192693-icsd", "192401-icsd"]
I get:
192693-icsd 5.0
192693-icsd 5.0
Name: 192401-icsd, dtype: float64
How to modify O.loc["192693-icsd", "192401-icsd"]
so that it is of type np.float64?
CodePudding user response:
IIUC, you can try DataFrame.applymap
to check each cell type and get the first row if it is Series
df = df.applymap(lambda x: x.iloc[0] if type(x) == pd.Series else x)
CodePudding user response:
It works as expected for O.loc["192693-icsd", "192401-icsd"] = O.loc["192693-icsd", "192401-icsd"][0]
Check this colab link: https://colab.research.google.com/drive/1XFXuj4OBu8GXQx6DTqv04XellmFcFWbC?usp=sharing