Home > Back-end >  Рow to swap values in columns depending on a certain condition?
Рow to swap values in columns depending on a certain condition?

Time:08-16

I have a dataset:

id1          id2         val 
first10     second5      10
second3     first19      14
first2      second7      8
first10     second10     1
second8     first22      9

I want to swap values in columns id1 and id2 to have in id1 only values containing "first" and in id2 only values containing "second". So desired result is:

id1          id2         val 
first10     second5      10
first19     second3      14
first2      second7      8
first10     second10     1
first22     second8      9

How could I do that? i know about str.contains() but how should condition look like?

CodePudding user response:

here is one way do it, using np.where by checking if the first id starts with second, then swap, else keep as is

df['id1'],df['id2']=np.where(df['id1'].str.startswith('second'),
                           [df['id2'],df['id1']],
                           [df['id1'],df['id2']]
                          )
df
        id1         id2     val
0   first10     second5     10
1   first19     second3     14
2   first2      second7     8
3   first10     second10    1
4   first22     second8     9

CodePudding user response:

Maybe not the most elegant solution, but you can achieve it, transforming DataFrame in a list of dict and then making it a DataFrame again. For example:

df_list = []
for i in range(df.shape[0]):
    row = dict(df.iloc[i])
    new_row = {}
    for key in row:
        val = row[key]
        if type(val) == str and val.startswith('f'):
            new_row['id1'] = val
        elif type(val) == str and val.startswith('s'):
            new_row['id2'] = val
        else:
            new_row['val'] = val
    df_list.append(new_row)

new_df = pd.DataFrame(df_list)

CodePudding user response:

My suggestion is to define a function that can take a row of the dataframe and swap the id1 and id2 values in the condition required, and then apply it to the dataframe with df.apply(func, axis=1).

def conditional_swap(row):
    if row["id1"].startswith("second"):
        row["id1"], row["id2"] = row["id2"], row["id1"]
    return row

df.apply(conditional_swap, axis=1)

(axis=1 ensures that it is applied row-wise, not column-wise)

Hope this helps!

  • Related