Home > Net >  Find the values based on date in another data frame with different ids
Find the values based on date in another data frame with different ids

Time:07-31

I have three data frames. df_k, df_w and df_related. df_related is a df which connect the other two dfs. I want to choose the value from df_w which the ids and the date is same as the date in df_k. Here is a simple example:

df_k = pd.DataFrame()
df_k['id_k'] = [3, 4, 4]
df_k['date'] = ['2021-05-01', '2021-05-01', '2021-05-02']
df_k 
    id_k    date
0   3   2021-05-01
1   4   2021-05-01
2   4   2021-05-02

Here is the df_related:

df_related = pd.DataFrame()
df_related['id_w'] = [1,  2,  3]

df_related['id_k'] = [3, 4, 5]
df_related
      id_w  id_k
  0     1   3
  1     2   4
  2     3   5

And here is the df_w:

df_w['id_w'] = [1, 1, 2, 2, 3]
df_w['date'] = ['2021-05-01', '2021-05-02', '2021-05-01', '2021-05-02', '2021-05-01']
df_w['val'] = [1,-1, 2,-2, 6]

   id_w     date    val
0   1   2021-05-01   1
1   1   2021-05-02  -1
2   2   2021-05-01   2
3   2   2021-05-02  -2
4   3   2021-05-01   6

Here is the desired output:

   id_w     date    val
0   1   2021-05-01   1
1   2   2021-05-01   2
2   2   2021-05-02  -2

Explanation: I only has the id_w in the df_k for one day.(2021-05-01). I dont any data in df_k regarding the id_w=3. I have two data in two different date for id=3 in id_k. Could you please help me with that?

CodePudding user response:

You can map and merge. There are several ways such as assigning a new column, or like here, using a merge key:

key = df_k['id_k'].map(df_related.set_index('id_k')['id_w']).rename('id_w')

out = (df_k.merge(df_w, left_on=[key, 'date'], right_on=['id_w', 'date'])
           #.drop(columns='id_k')  # uncomment to drop id_k
       )

output:

   id_k        date  id_w  val
0     3  2021-05-01     1    1
1     4  2021-05-01     2    2
2     4  2021-05-02     2   -2
  • Related