Convert missing values in a DataFrame to a given data type (Python)

I have a DataFrame d1 with strings and missing values, such as

import numpy as np
import pandas as pd

d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"]],
                  columns=[1, 2, 3])

whose columns I would like to aggregate into a single-row DataFrame d2.

Following suggestions in a previous post, I tried the following code:

d2 = d1.agg(''.join).to_frame().T

Still, as one of the values in d1 was missing (and, thus, a float), I got the following error:

TypeError: sequence item 1: expected str instance, float found

Would you know how to change missing values in DataFrames to another data type such as string?

CodePudding user response：

You can fill the missing value with an empty string:

d1.fillna('')

So the overall code becomes

d1.fillna('').agg(''.join).to_frame().T

     1   2    3
0  ADG  BH  CFI
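
As a self-contained check of the approach above, here is a minimal sketch that rebuilds d1 from the question and runs the same chain. The intermediate name filled is only illustrative. It also shows that fillna returns a new DataFrame rather than modifying d1.

```python
import numpy as np
import pandas as pd

d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"]],
                  columns=[1, 2, 3])

# fillna returns a new DataFrame; d1 itself still holds the NaN.
filled = d1.fillna('')

# ''.join is applied to each column (axis=0 is the default for agg),
# which gives a Series; to_frame().T turns it into a single-row DataFrame.
d2 = filled.agg(''.join).to_frame().T

print(d2)
print(d1.isna().sum().sum())  # number of NaN values still present in d1
```

If d1 should be updated permanently, assign the result back (d1 = d1.fillna('')) before aggregating.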

CodePudding user response：

You can replace the NaN values with '':

d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"],],
                  columns=['1', '2', '3'])
d1.replace(np.nan, '', inplace=True)
# Join down each column (agg's default axis=0), as the question asks
d2 = d1.agg(''.join).to_frame().T
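
The axis argument of agg decides the direction of the join, so it is worth checking which one matches the desired output. A minimal sketch, assuming the same d1 with string column labels; cleaned, by_column and by_row are illustrative names, not part of the original code:

```python
import numpy as np
import pandas as pd

d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"]],
                  columns=['1', '2', '3'])

# Non-inplace variant of the replace call, so d1 stays untouched.
cleaned = d1.replace(np.nan, '')

by_column = cleaned.agg(''.join)          # axis=0 (default): join down each column
by_row = cleaned.agg(''.join, axis=1)     # axis=1: join across each row

print(by_column.tolist())  # ['ADG', 'BH', 'CFI']
print(by_row.tolist())     # ['ABC', 'DF', 'GHI']
```

The column-wise result is the one that matches the single-row output shown in the other answers.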

CodePudding user response：

The null value is causing the error, so fill it with an empty string. You could try this:

d2 = pd.DataFrame(d1.fillna('').agg(''.join)).T
print(d2)

     1   2    3
0  ADG  BH  CFI
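
To see why the original call fails, and as a possible alternative to filling, here is a minimal sketch. It assumes the same d1 as in the question; dropping the missing entries per column with dropna is a swapped-in variant, not what the answers above use.

```python
import numpy as np
import pandas as pd

# NaN is a float, which is why str.join rejects it.
print(type(np.nan))  # <class 'float'>

try:
    ''.join(["B", np.nan, "H"])
except TypeError as exc:
    print(exc)  # sequence item 1: expected str instance, float found

d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"]],
                  columns=[1, 2, 3])

# Alternative: drop the missing entries from each column before joining,
# instead of replacing them with ''.
d2 = d1.agg(lambda col: ''.join(col.dropna())).to_frame().T
print(d2)
```

For plain concatenation both routes give the same strings; with a non-empty separator they would differ, since fillna('') keeps an empty slot where the value was missing.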