I have 3 different dataframe's where each one contains different dates (df1, df2, & df3(df3 contains a list of every date starting from 2014-01-01)). What I am trying to accomplish is merging all 3 together and have all matching dates line up on the same row, and any rows where dates don't match to display 'NaN' or something similar. I'm having a bit of a problem where I outer merge df1 to df2 (we'll call the df1 & df2 merged dataframe df4) on the columns which contain dates. My result is below:
VIPR (Y-M-D) Viriciti (Y-M-D)
0 NaN 2021-07-07
1 NaN 2021-07-08
2 NaN 2021-07-09
3 NaN 2021-07-10
4 NaN 2021-07-11
5 NaN 2021-07-13
6 NaN 2021-07-14
7 NaN 2021-07-15
8 NaN 2021-07-16
9 NaN 2021-07-18
10 NaN 2021-07-20
11 NaN 2021-07-21
12 NaN 2021-07-22
13 NaN 2021-07-31
14 NaN 2021-08-02
15 NaN 2021-08-03
16 NaN 2021-08-10
17 NaN 2021-08-11
18 NaN 2021-08-12
19 NaN 2021-08-16
20 NaN 2021-08-17
21 NaN 2021-08-18
22 NaN 2021-08-19
23 NaN 2021-08-21
24 NaN 2021-08-24
25 NaN 2021-08-25
26 NaN 2021-08-26
27 NaN 2021-08-27
28 NaN 2021-08-28
29 NaN 2021-08-29
30 NaN 2021-08-31
31 NaN 2021-09-01
32 NaN 2021-09-02
33 NaN 2021-09-03
34 NaN 2021-09-04
35 NaN 2021-09-05
36 NaN 2021-09-10
37 NaN 2021-09-12
38 NaN 2021-09-14
39 NaN 2021-09-15
40 NaN 2021-09-16
41 NaN 2021-09-17
42 NaN 2021-09-18
43 NaN 2021-09-21
44 NaN 2021-09-22
45 NaN 2021-09-23
46 NaN 2021-09-24
47 NaN 2021-09-25
48 NaN 2021-09-28
49 NaN 2021-09-29
50 NaN 2021-09-30
51 NaN 2021-10-01
52 NaN 2021-10-02
53 NaN 2021-10-05
54 NaN 2021-10-12
55 NaN 2021-10-13
56 NaN 2021-10-14
57 NaN 2021-10-15
58 NaN 2021-10-16
59 NaN 2021-10-17
60 NaN 2021-10-18
61 NaN 2021-10-20
62 NaN 2021-10-21
63 NaN 2021-10-23
64 NaN 2021-10-27
65 NaN 2021-10-28
66 NaN 2021-10-29
67 NaN 2021-10-30
68 NaN 2021-11-01
69 NaN 2021-11-02
70 NaN 2021-11-03
71 NaN 2021-11-04
72 NaN 2021-11-09
73 NaN 2021-11-11
74 NaN 2021-11-12
75 NaN 2021-11-16
76 NaN 2021-11-19
77 NaN 2021-11-20
78 NaN 2021-11-23
79 NaN 2021-11-24
80 NaN 2021-11-25
81 NaN 2021-12-02
82 NaN 2021-12-03
83 NaN 2021-12-10
84 NaN 2021-12-11
85 NaN 2021-12-14
86 NaN 2021-12-15
87 NaN 2021-12-16
88 NaN 2021-12-17
89 NaN 2021-12-18
90 NaN 2022-01-04
91 NaN 2022-01-05
92 NaN 2022-01-06
93 NaN 2022-01-07
94 NaN 2022-01-08
95 NaN 2022-01-10
96 NaN 2022-01-11
97 NaN 2022-01-18
98 NaN 2022-01-19
99 NaN 2022-01-20
100 NaN 2022-01-21
101 NaN 2022-01-22
102 NaN 2022-01-25
103 NaN 2022-01-26
104 NaN 2022-01-27
105 NaN 2022-01-28
106 NaN 2022-01-29
107 NaN 2022-02-01
108 NaN 2022-02-02
109 NaN 2022-02-03
110 NaN 2022-02-04
111 NaN 2022-02-05
112 NaN 2022-02-08
113 NaN 2022-02-09
114 NaN 2022-02-11
115 NaN 2022-02-12
116 NaN 2022-02-15
117 NaN 2022-02-16
118 NaN 2022-02-17
119 NaN 2022-02-23
120 NaN 2022-02-24
121 NaN 2022-02-25
122 NaN 2022-02-26
123 NaN 2022-02-27
124 NaN 2022-03-03
125 NaN 2022-03-04
126 NaN 2022-03-08
127 NaN 2022-03-10
128 NaN 2022-03-22
129 NaN 2022-03-23
130 NaN 2022-03-24
131 NaN 2022-03-30
132 NaN 2022-03-31
133 NaN 2022-04-04
134 NaN 2022-04-05
135 NaN 2022-04-06
136 NaN 2022-04-07
137 NaN 2022-04-08
138 NaN 2022-04-09
139 NaN 2022-04-13
140 NaN 2022-04-18
141 NaN 2022-04-19
142 NaN 2022-04-20
143 NaN 2022-04-21
144 NaN 2022-04-22
145 NaN 2022-04-23
146 NaN 2022-04-24
147 NaN 2022-04-25
148 NaN 2022-04-26
149 NaN 2022-04-27
150 NaN 2022-04-28
151 NaN 2022-04-29
152 NaN 2022-05-02
153 NaN 2022-05-03
154 NaN 2022-05-04
155 NaN 2022-05-05
156 NaN 2022-05-06
157 NaN 2022-05-07
158 NaN 2022-05-08
159 NaN 2022-05-10
160 NaN 2022-05-11
161 NaN 2022-05-12
162 NaN 2022-05-13
163 NaN 2022-05-14
164 NaN 2022-05-15
165 NaN 2022-05-16
166 NaN 2022-05-17
167 NaN 2022-05-19
168 NaN 2022-05-20
169 NaN 2022-05-21
170 NaN 2022-05-22
171 NaN 2022-05-23
172 NaN 2022-05-24
173 NaN 2022-05-25
174 NaN 2022-05-26
175 NaN 2022-05-27
176 NaN 2022-05-28
177 NaN 2022-05-29
178 NaN 2022-05-30
179 NaN 2022-05-31
180 NaN 2022-06-01
181 NaN 2022-06-02
182 NaN 2022-06-03
183 2022-06-04 2022-06-04
184 2022-06-05 2022-06-05
185 2022-06-06 2022-06-06
186 2022-06-07 2022-06-07
187 NaN 2022-06-08
188 NaN 2022-06-09
189 2022-06-10 2022-06-10
190 2022-06-11 2022-06-11
191 2022-06-12 2022-06-12
192 2022-06-13 2022-06-13
193 2022-06-14 2022-06-14
194 NaN 2022-06-15
195 2022-06-16 NaN
196 NaN 2022-06-17
197 2022-06-20 NaN
198 2022-06-21 2022-06-21
199 NaN 2022-06-22
200 NaN 2022-06-23
201 NaN 2022-06-24
202 NaN 2022-06-25
203 NaN 2022-06-26
204 NaN 2022-06-27
205 NaN 2022-06-28
206 NaN 2022-06-29
207 NaN 2022-06-30
208 NaN 2022-07-01
209 NaN 2022-07-02
210 NaN 2022-07-03
211 NaN 2022-07-04
212 2022-07-05 2022-07-05
213 2022-07-06 2022-07-06
214 2022-07-07 2022-07-07
215 2022-07-08 2022-07-08
216 2022-07-09 2022-07-09
217 NaN 2022-07-10
218 NaN 2022-07-16
219 NaN 2022-07-19
220 NaN 2022-07-20
221 2022-07-21 NaN
222 2022-07-22 2022-07-22
223 2022-07-22 2022-07-22
224 2022-07-23 2022-07-23
225 2022-07-24 2022-07-24
226 2022-07-25 2022-07-25
227 2022-07-26 2022-07-26
228 2022-07-27 2022-07-27
229 2022-07-28 2022-07-28
230 2022-07-29 2022-07-29
231 2022-07-30 2022-07-30
232 2022-07-31 2022-07-31
233 2022-08-01 2022-08-01
234 2022-08-02 2022-08-02
235 2022-08-03 2022-08-03
236 2022-08-04 2022-08-04
237 2022-08-05 2022-08-05
238 NaN 2022-08-06
239 NaN 2022-08-09
240 NaN 2022-08-13
241 NaN 2022-08-14
242 2022-08-15 NaN
243 NaN 2022-08-16
244 NaN 2022-08-17
245 2022-08-20 NaN
246 NaN 2022-08-21
247 2022-08-25 NaN
248 2022-08-26 2022-08-26
249 NaN 2022-08-27
250 2022-08-29 NaN
251 2022-08-30 2022-08-30
252 2022-08-31 2022-08-31
253 2022-09-01 2022-09-01
254 NaN 2022-09-02
255 NaN 2022-09-03
256 2022-09-08 NaN
257 2022-09-09 2022-09-09
258 NaN 2022-09-10
259 2022-09-13 NaN
260 2022-09-14 2022-09-14
261 2022-09-15 2022-09-15
262 2022-09-16 2022-09-16
263 NaN 2022-09-17
264 NaN 2022-09-21
This is the result I am looking for! Now when I try to outer merge df3 to df4 the 'VIPR (Y-M-D)' column becomes unsorted. It seems that the dates in 'VIPR (Y-M-D)' that had 'NaN' next to them in df4 are becoming unsorted and going to the bottom are doing so because on the last merge of df3 to df4 my code is:
df5 = df3.merge(df4, how='outer', left_on='Date', right_on='Viriciti (Y-M-D)', sort=True)
The result of this merge is:
Date VIPR (Y-M-D) Viriciti (Y-M-D)
0 2014-01-01 NaN NaN
1 2014-01-02 NaN NaN
2 2014-01-03 NaN NaN
3 2014-01-04 NaN NaN
4 2014-01-05 NaN NaN
5 2014-01-06 NaN NaN
...
2744 2021-07-07 NaN 2021-07-07
...
3076 2022-06-04 2022-06-04 2022-06-04
3077 2022-06-05 2022-06-05 2022-06-05
3078 2022-06-06 2022-06-06 2022-06-06
3079 2022-06-07 2022-06-07 2022-06-07
3080 2022-06-08 NaN 2022-06-08
3081 2022-06-09 NaN 2022-06-09
3082 2022-06-10 2022-06-10 2022-06-10
3083 2022-06-11 2022-06-11 2022-06-11
3084 2022-06-12 2022-06-12 2022-06-12
3085 2022-06-13 2022-06-13 2022-06-13
3086 2022-06-14 2022-06-14 2022-06-14
3087 2022-06-15 NaN 2022-06-15
3088 2022-06-16 NaN NaN
3089 2022-06-17 NaN 2022-06-17
3090 2022-06-18 NaN NaN
3091 2022-06-19 NaN NaN
3092 2022-06-20 NaN NaN
3093 2022-06-21 2022-06-21 2022-06-21
3094 2022-06-22 NaN 2022-06-22
3095 2022-06-23 NaN 2022-06-23
3096 2022-06-24 NaN 2022-06-24
3097 2022-06-25 NaN 2022-06-25
3098 2022-06-26 NaN 2022-06-26
3099 2022-06-27 NaN 2022-06-27
3100 2022-06-28 NaN 2022-06-28
3101 2022-06-29 NaN 2022-06-29
3102 2022-06-30 NaN 2022-06-30
3103 2022-07-01 NaN 2022-07-01
3104 2022-07-02 NaN 2022-07-02
3105 2022-07-03 NaN 2022-07-03
3106 2022-07-04 NaN 2022-07-04
3107 2022-07-05 2022-07-05 2022-07-05
3108 2022-07-06 2022-07-06 2022-07-06
3109 2022-07-07 2022-07-07 2022-07-07
3110 2022-07-08 2022-07-08 2022-07-08
3111 2022-07-09 2022-07-09 2022-07-09
3112 2022-07-10 NaN 2022-07-10
3113 2022-07-11 NaN NaN
3114 2022-07-12 NaN NaN
3115 2022-07-13 NaN NaN
3116 2022-07-14 NaN NaN
3117 2022-07-15 NaN NaN
3118 2022-07-16 NaN 2022-07-16
3119 2022-07-17 NaN NaN
3120 2022-07-18 NaN NaN
3121 2022-07-19 NaN 2022-07-19
3122 2022-07-20 NaN 2022-07-20
3123 2022-07-21 NaN NaN
3124 2022-07-22 2022-07-22 2022-07-22
3125 2022-07-22 2022-07-22 2022-07-22
3126 2022-07-23 2022-07-23 2022-07-23
3127 2022-07-24 2022-07-24 2022-07-24
3128 2022-07-25 2022-07-25 2022-07-25
3129 2022-07-26 2022-07-26 2022-07-26
3130 2022-07-27 2022-07-27 2022-07-27
3131 2022-07-28 2022-07-28 2022-07-28
3132 2022-07-29 2022-07-29 2022-07-29
3133 2022-07-30 2022-07-30 2022-07-30
3134 2022-07-31 2022-07-31 2022-07-31
3135 2022-08-01 2022-08-01 2022-08-01
3136 2022-08-02 2022-08-02 2022-08-02
3137 2022-08-03 2022-08-03 2022-08-03
3138 2022-08-04 2022-08-04 2022-08-04
3139 2022-08-05 2022-08-05 2022-08-05
3140 2022-08-06 NaN 2022-08-06
3141 2022-08-07 NaN NaN
3142 2022-08-08 NaN NaN
3143 2022-08-09 NaN 2022-08-09
3144 2022-08-10 NaN NaN
3145 2022-08-11 NaN NaN
3146 2022-08-12 NaN NaN
3147 2022-08-13 NaN 2022-08-13
3148 2022-08-14 NaN 2022-08-14
3149 2022-08-15 NaN NaN
3150 2022-08-16 NaN 2022-08-16
3151 2022-08-17 NaN 2022-08-17
3152 2022-08-18 NaN NaN
3153 2022-08-19 NaN NaN
3154 2022-08-20 NaN NaN
3155 2022-08-21 NaN 2022-08-21
3156 2022-08-22 NaN NaN
3157 2022-08-23 NaN NaN
3158 2022-08-24 NaN NaN
3159 2022-08-25 NaN NaN
3160 2022-08-26 2022-08-26 2022-08-26
3161 2022-08-27 NaN 2022-08-27
3162 2022-08-28 NaN NaN
3163 2022-08-29 NaN NaN
3164 2022-08-30 2022-08-30 2022-08-30
3165 2022-08-31 2022-08-31 2022-08-31
3166 2022-09-01 2022-09-01 2022-09-01
3167 2022-09-02 NaN 2022-09-02
3168 2022-09-03 NaN 2022-09-03
3169 2022-09-04 NaN NaN
3170 2022-09-05 NaN NaN
3171 2022-09-06 NaN NaN
3172 2022-09-07 NaN NaN
3173 2022-09-08 NaN NaN
3174 2022-09-09 2022-09-09 2022-09-09
3175 2022-09-10 NaN 2022-09-10
3176 2022-09-11 NaN NaN
3177 2022-09-12 NaN NaN
3178 2022-09-13 NaN NaN
3179 2022-09-14 2022-09-14 2022-09-14
3180 2022-09-15 2022-09-15 2022-09-15
3181 2022-09-16 2022-09-16 2022-09-16
3182 2022-09-17 NaN 2022-09-17
3183 2022-09-18 NaN NaN
3184 2022-09-19 NaN NaN
3185 2022-09-20 NaN NaN
3186 NaN NaN 2022-09-21
3187 NaN 2022-06-16 NaN
3188 NaN 2022-06-20 NaN
3189 NaN 2022-07-21 NaN
3190 NaN 2022-08-15 NaN
3191 NaN 2022-08-20 NaN
3192 NaN 2022-08-25 NaN
3193 NaN 2022-08-29 NaN
3194 NaN 2022-09-08 NaN
3195 NaN 2022-09-13 NaN
As you can see the 'VIPR (Y-M-D)' column has some dates that are not matched at the very bottom. How can I fix this so all dates are matched in the corresponding rows?
CodePudding user response:
since your third dataframe includes all dates, use how="left"
. Additionally, in your second join, you're joining on a column which does have NaNs. Instead, you should join both df1 and df2 directly to df3:
merged = (
df3
.merge(df1, how="left", left_on="Date", right_on="VIPR (Y-M-D)")
.merge(df2, how="left", left_on="Date", right_on="Viriciti (Y-M-D)")
)