I have a DataFrame d1 with strings and missing values, such as
d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"]],
                  columns=[1, 2, 3])
whose columns I would like to aggregate into a single-row DataFrame d2.
Following suggestions in a previous post, I tried the following code:
d2 = d1.agg(''.join).to_frame().T
However, because one of the values in d1 is missing (and is therefore stored as a float NaN), I got the following error:
TypeError: sequence item 1: expected str instance, float found
How can I convert the missing values in a DataFrame to another data type, such as a string?
CodePudding user response:
You can fill the missing value with an empty string:
d1.fillna('')
So the overall code becomes
d1.fillna('').agg(''.join).to_frame().T
     1   2    3
0  ADG  BH  CFI
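For completeness, here is a minimal self-contained sketch of the approach above, using the d1 from the question. The variable name `filled` is only illustrative. Note that `fillna` returns a new DataFrame by default, so d1 itself still contains the NaN afterwards.

```python
import numpy as np
import pandas as pd

d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"]],
                  columns=[1, 2, 3])

# The missing value is a float NaN, which is why ''.join raises a TypeError
print(type(d1.loc[1, 2]))

# fillna returns a new DataFrame; d1 is left unchanged
filled = d1.fillna('')

# Join the strings of each column, then turn the resulting Series into one row
d2 = filled.agg(''.join).to_frame().T
print(d2)
```

If d1 should be overwritten instead, assigning the result back (`d1 = d1.fillna('')`) should work just as well.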
CodePudding user response:
You can replace the NaN values with an empty string:
d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"]],
                  columns=['1', '2', '3'])
d1.replace(np.nan,'',inplace=True)
# join each column's values (default axis=0), then transpose into a single row
d2 = d1.agg(''.join).to_frame().T
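The `axis` argument of `agg` decides which values get joined, so it is worth checking it against the output you want. A minimal sketch, assuming the same d1 as in this answer (the names `clean`, `by_column`, and `by_row` are only illustrative):

```python
import numpy as np
import pandas as pd

d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"]],
                  columns=['1', '2', '3'])

# Without inplace=True, replace returns a new DataFrame
clean = d1.replace(np.nan, '')

# Default axis=0: each column is joined from top to bottom
by_column = clean.agg(''.join)

# axis=1: each row is joined from left to right
by_row = clean.agg(''.join, axis=1)

print(by_column.tolist())  # ['ADG', 'BH', 'CFI']
print(by_row.tolist())     # ['ABC', 'DF', 'GHI']
```

The column-wise result matches what the question asks for; the row-wise one concatenates across columns instead.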
CodePudding user response:
The null value is causing the error, so fill it with an empty string. You could try this:
d2 = pd.DataFrame(d1.fillna('').agg(''.join)).T
print(d2)
     1   2    3
0  ADG  BH  CFI
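As a possible alternative, if you would rather not create a filled copy of the data, the missing entries can be skipped while joining. This is only a sketch of a different technique (dropping NaN per column rather than filling it), using the d1 from the question; the name `joined` is illustrative.

```python
import numpy as np
import pandas as pd

d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"]],
                  columns=[1, 2, 3])

# For each column, drop the missing entries and join what is left
joined = d1.apply(lambda col: ''.join(col.dropna()))

# Turn the resulting Series into a single-row DataFrame
d2 = joined.to_frame().T
print(d2)
```

For this data the result should be the same as with `fillna('')`, since an empty string contributes nothing to the joined text.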