I have a dataframe:
id1 id2
a NaN
b c
d e
I want to create a new column ids as a concatenation of id1 and id2:
df['ids'] = df.id1 + "-" + df.id2
But the result is:
id1 id2 ids
a NaN NaN
b c b-c
d e d-e
As you can see, the concatenation of a and NaN is NaN, but I would like to get a-NaN. How can I do that? Desired result:
id1 id2 ids
a NaN a-NaN
b c b-c
d e d-e
CodePudding user response:
import numpy as np
import pandas as pd
df = pd.DataFrame({"id1":['a', 'b', 'c'], "id2":[np.nan, 'c', 'e']})
df['ids'] = df.id1.replace(np.nan, 'NaN') + "-" + df.id2.replace(np.nan, 'NaN')
print(df)
> id1 id2 ids
> 0 a NaN a-NaN
> 1 b c b-c
> 2 c e c-e
I think you can replace the np.nan values with a string before the concatenation.
CodePudding user response:
import numpy as np
import pandas as pd
df = pd.DataFrame({'id1': ['a', 'b', 'c'], 'id2': [np.nan, 'c', 'e']})
df['id2'] = df['id2'].astype(str)
df['ids'] = df['id1'] + "-" + df['id2']
df
Result:
id1 id2 ids
0 a nan a-nan
1 b c b-c
2 c e c-e
Alternatively, use:
df['id2'] = df['id2'].fillna('NaN')
instead of the type conversion. This way, you get to pick the exact string (including capitalization) you want for the NaNs.
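To illustrate the fillna route without overwriting id2, here is a minimal sketch. The helper name concat_ids and its parameters (sep, na_text) are made up for this example and are not part of pandas; the data mirrors the question.

```python
import numpy as np
import pandas as pd

def concat_ids(frame, left="id1", right="id2", sep="-", na_text="NaN"):
    """Hypothetical helper: join two string columns, writing na_text for missing values."""
    # fillna returns new Series, so the original columns keep their real NaNs.
    return frame[left].fillna(na_text) + sep + frame[right].fillna(na_text)

df = pd.DataFrame({"id1": ["a", "b", "d"], "id2": [np.nan, "c", "e"]})
df["ids"] = concat_ids(df)

print(df["ids"].tolist())         # ['a-NaN', 'b-c', 'd-e']
print(df["id2"].isna().tolist())  # [True, False, False]
```

Because id2 is left untouched, later checks such as isna() still see the missing value.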
CodePudding user response:
This will work:
df['ids'] = df['id1'].astype(str) + "-" + df['id2'].astype(str)
Output:
id1 id2 ids
0 a NaN a-nan
1 b c b-c
2 d e d-e
Another way would be to first convert NaN to a string with .fillna():
df['id2'] = df['id2'].fillna('NaN')
df['ids'] = df['id1'] + "-" + df['id2']
Output:
id1 id2 ids
0 a NaN a-NaN
1 b c b-c
2 d e d-e
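A different technique from the ones above, in case it helps: Series.str.cat takes a na_rep argument, so the separator and the placeholder can be given in one call and neither column is modified. A short sketch on the question's data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id1": ["a", "b", "d"], "id2": [np.nan, "c", "e"]})

# na_rep is the text written in place of a missing value in either column.
df["ids"] = df["id1"].str.cat(df["id2"], sep="-", na_rep="NaN")

print(df["ids"].tolist())  # ['a-NaN', 'b-c', 'd-e']
```

Note that without na_rep, str.cat gives NaN for any row with a missing value, just like the + operator.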
CodePudding user response:
Given a number a, I would say that you cannot compute a-NaN, as NaN stands for "Not a Number".
If you just want to skip NaNs as if they were zeros, you'll need to filter them out (as done in this other case).
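For the string columns in the question, one way to read "filter them" is to drop the missing parts before joining. A minimal sketch, assuming you want the missing id left out of the result entirely (this is not what the question asked for, which was a-NaN):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id1": ["a", "b", "d"], "id2": [np.nan, "c", "e"]})

# For each row, drop the missing entries and join whatever is left.
df["ids"] = df[["id1", "id2"]].apply(lambda row: "-".join(row.dropna()), axis=1)

print(df["ids"].tolist())  # ['a', 'b-c', 'd-e']
```

The row-wise apply is slower than the vectorized versions, so it is probably only worth it on small frames.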