I want to merge 2 columns of the same dataframe, and add a duplicate row using the same values as it has in the other columns
consider the following dataframe:
Column A | Column B | Column C |
---|---|---|
ABC | '' | 1 |
GHI | XYZ | 2 |
'' | PQR | 3 |
'' | '' | 4 |
The conditions are:
- If the Column A has a alphanumeric value and the Column B has a Nan value or a '' (empty string) -> the Result column should only consider the value from Number-first
- If the Column A has a Nan or '' (empty string) value and the Column B has a alphanumeric value -> the Result column should only consider the value from Number-second
- If the values from both the columns are alphanumeric the result column should duplicate itself where the first value should be Column A and the second value should be Column B
- If both the Columns have Nan or empty string values, the result should consist of a '' (empty string) value
Following would be the output for the above dataframe:
Column A | Column B | Column C | Result |
---|---|---|---|
ABC | '' | 1 | ABC |
GHI | XYZ | 2 | GHI |
GHI | XYZ | 2 | XYZ |
'' | PQR | 3 | PQR |
'' | '' | 4 | '' |
I have been unsuccessful in making it work
Thanks a lot in advance for the help !
CodePudding user response:
You can create a list of values from Column A
and Column B
columns then explode it:
result = df[['A', 'B']].replace('', np.nan).stack().groupby(level=0).apply(list)
df = df.assign(Result=result.fillna('')).explode('Result')
print(df)
# Output
A B C Result
0 ABC 1 ABC
1 GHI XYZ 2 GHI
1 GHI XYZ 2 XYZ
2 PQR 3 PQR
3 4