Home > Enterprise >  Pandas or R - Merge Rows By Same Value in Column Over NaN values - Look at Example
Pandas or R - Merge Rows By Same Value in Column Over NaN values - Look at Example

Time:12-22

I have a very specific dataset it looks something like this:

record_id event_id instrument repeat_inst
PI0005 v03_abc_1 NaN 1
PI0005 v03_abc_1 i_sensor NaN
PI0005 v03_abc_1 NaN NaN
PI0005 v02_abc_33 i_sensor NaN
PI0005 v02_abc_33 NaN NaN
PI0006 v02_abc_1 i_sensor 1
PI0006 v02_abc_1 NaN NaN

How do I make it look like this:

record_id event_id instrument repeat_inst
PI0005 v03_abc_1 i_sensor 1
PI0005 v02_abc_33 i_sensor NaN
PI0006 v02_abc_2 i_sensor 1

Where rows with the same record_id and event_id get merged together, where NaN values are replaced with the other value, and if both values are NaN, then NaN can be kept (like in the forth and fifth row in the original dataframe).

Assume that only one of the related cells have a value and all others have NaN.

This should apply to all columns of the data, there are thousands of columns and rows.

I tried using group by, but don't know how to continue.

CodePudding user response:

With R

library(dplyr)
df1 %>%
   group_by(record_id, event_id) %>%
   summarise(across(everything(),  ~.x[!is.na(.x)][1]))
  • Related