I have a dataframe of steps/actions in user behaviour (a sample is provided). There are many steps, and each step has two columns: subtitle and dimension. For each step I need to merge the subtitle and dimension columns into a new column: if dimension is empty, keep only subtitle; otherwise keep only dimension.
So for the new column step0: if df['dimension1 (step0)'] is not null, use df['dimension1 (step0)']; if it is null, use df['subtitle (step0)']. The same is then repeated for step1.
I am a complete newbie.
Expected output:
values for df['step0']: client, homepage, internal
values for df['step1']: client, client, map
etc.
Please help by providing code
CodePudding user response:
Assume idVisit is the index. Then you may try the .combine_first() method on every odd column (dimension) with every even one (subtitle):
# set the index just in case
df.set_index('idVisit', inplace=True)
# loop over subtitles and dimensions zipped together and enumerated
for n, (subtitle, dimension) in enumerate(zip(df.columns[0::2], df.columns[1::2])):
    # take dimension where it is present, otherwise fall back to subtitle
    df[f'step {n}'] = df[dimension].combine_first(df[subtitle])
# show only added columns
df.iloc[:, 8:]
Output:
# only the added columns are shown
           step 0  step 1    step 2  step 3
idVisit
1          client  client    client  client
2        homepage  client  homepage  client
3        internal     map  internal     map
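The loop above relies on the columns alternating strictly as subtitle, dimension, subtitle, dimension. If that ordering is not guaranteed, the pairs can be matched by step name instead. The following is only a minimal sketch: the sample data is made up, and the column naming pattern ('subtitle (stepN)' and 'dimension1 (stepN)') is assumed from the question rather than taken from the real dataframe. It also assumes that an "empty" dimension may be an empty string as well as a null, so empty strings are converted to NaN first.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are assumed from the question.
df = pd.DataFrame(
    {
        "idVisit": [1, 2, 3],
        "subtitle (step0)": ["page", "homepage", "internal"],
        "dimension1 (step0)": ["client", None, ""],
        "subtitle (step1)": ["client", "page", "page"],
        "dimension1 (step1)": [None, "client", "map"],
    }
).set_index("idVisit")

# Collect the subtitle columns up front, before any new columns are added.
subtitle_cols = [c for c in df.columns if c.startswith("subtitle")]

for sub_col in subtitle_cols:
    # Extract the step name, e.g. "step0" from "subtitle (step0)".
    step = re.search(r"\((step\d+)\)", sub_col).group(1)
    dim_col = f"dimension1 ({step})"

    # Treat empty strings as missing so that combine_first skips them.
    dimension = df[dim_col].replace("", np.nan)

    # Prefer dimension; fall back to subtitle where dimension is missing.
    df[step] = dimension.combine_first(df[sub_col])

print(df[["step0", "step1"]])
```

With this made-up data, step0 comes out as client, homepage, internal and step1 as client, client, map, which matches the expected output in the question. If the real column names follow a different pattern, the regular expression and the dim_col format string would need adjusting.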