Swap rows between two dataframes with a pattern-CodePudding

I have 2 dataframes looking like this :

In each dataframe there is pattern of 1-2 in the values column. (the values are not significant to my problem, just to demonstrate the pattern)

df1 = {'idx': [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
       'values': [20, 1000, 10001, 21, 1000, 1002, 22, 1003, 1007,23]}
df2 = {'idx': [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
       'values': [1000, 21, 22, 1000, 22, 23, 1000, 20, 21, 1000]}

I need to swap rows between the two dataframes so that the outcome would be :

df_expected1 = {'idx': [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
       'values': [20, 21, 22, 21, 22, 23, 22, 20, 21,23]}

df_expected2 = {'idx': [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
       'values': [1000, 1000, 10001, 1000, 1000, 1002, 1000, 1003, 1007, 1000]}

CodePudding user response：

Knowing that swap-needed indices are from 3n-2, you can make mask then use numpy.where:

m = df1["idx"].add(2).mod(3).eq(0)
s1 = np.where(m, df1["values"], df2["values"])
s2 = np.where(~m, df1["values"], df2["values"])

df1["values"] = s1
df2["values"] = s2

Output:

   idx  values
0    1      20
1    2      21
2    3      22
3    4      21
4    5      22
5    6      23
6    7      22
7    8      20
8    9      21
9   10      23

   idx  values
0    1    1000
1    2    1000
2    3   10001
3    4    1000
4    5    1000
5    6    1002
6    7    1000
7    8    1003
8    9    1007
9   10    1000

CodePudding user response：

This should do it, knowing that the indices are the same between two dataframes :

df1[df1['idx']%3 == 1], df2[df1['idx']%3 == 1] = df2[df1['idx']%3 == 1], df1[df1['idx']%3 == 1]

Output :

   idx  values
 0    1    1000
 1    2    1000
 2    3   10001
 3    4    1000
 4    5    1000
 5    6    1002
 6    7    1000
 7    8    1003
 8    9    1007
 9   10    1000
    idx  values
 0    1      20
 1    2      21
 2    3      22
 3    4      21
 4    5      22
 5    6      23
 6    7      22
 7    8      20
 8    9      21
 9   10      23

CodePudding user response：

In the code snippet below, I assumed that indexes in both df1 and df2 were equal and that values from df1 are always supposed to be greater than df2.

import pandas as pd
import numpy as np

from pprint import pprint

df1 = {'idx': [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
       'values': [20, 1000, 10001, 21, 1000, 1002, 22, 1003, 1007,23]}
df2 = {'idx': [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
       'values': [1000, 21, 22, 1000, 22, 23, 1000, 20, 21, 1000]}

a = pd.DataFrame(df1).set_index('idx')
b = pd.DataFrame(df2).set_index('idx')

col_name = 'values'
a_series = a[col_name]
b_series = b[col_name]
for i in a_series.index:
    if a_series.loc[i] > b_series.loc[i]:
        a_series.loc[i], b_series.loc[i] = b_series.loc[i], a_series.loc[i]

df_expected1 = {'idx': a_series.index.tolist(), 'values': a_series.values.tolist()}
df_expected2 = {'idx': b_series.index.tolist(), 'values': b_series.values.tolist()}

pprint(df_expected1)
pprint(df_expected2)

Output:

{'idx': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
 'values': [1000, 1000, 10001, 1000, 1000, 1002, 1000, 1003, 1007, 1000]}
{'idx': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
 'values': [20, 21, 22, 21, 22, 23, 22, 20, 21, 23]}