I have a large dataframe of reading data, and I want to merge into it another dataframe that has the same structure but only a subset of the columns and far fewer rows.
The large dataframe holds almost everything I want, but I will have a set of readings, starting at an arbitrary row of the larger frame, whose columns I need to drop onto it.
As an example, suppose the large dataframe looks like this and has 5 rows:
   A   B
0  1  11
1  2  12
2  3  13
3  4  14
4  5  15
The smaller dataframe looks like the following; it has fewer rows and only one of the columns:
      B
0  1000
1  2000
When I merge, I want a dataframe with the same number of rows as the first, but with the second frame "overlaid" onto it starting from a row I specify. For example, starting from row 2, I would expect the new dataframe to look like this:
   A     B
0  1    11
1  2    12
2  3  1000
3  4  2000
4  5    15
The end result is a new dataframe the same size as the first, in which the values of column B have been updated from the row I specify for the length of the second dataframe, and only for the columns present in the second dataframe.
CodePudding user response:
Here you go:
Let's say df is the bigger dataframe and df1 is the smaller one.
...
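shift = 2  # row of df where the overlay starts
df1.index = df1.index + shift  # relabel df1's rows from 0, 1 to 2, 3
df.update(df1)  # overwrite the matching cells of df in place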
Result:
   A       B
0  1    11.0
1  2    12.0
2  3  1000.0
3  4  2000.0
4  5    15.0
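For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The frames are rebuilt from the question's example, and `shifted` is a name introduced here (not part of the answer above) so that df1 itself is left untouched. Depending on the pandas version, column B may come back as floats, as in the result above.

```python
import pandas as pd

# Example frames rebuilt from the question.
df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [11, 12, 13, 14, 15]})
df1 = pd.DataFrame({"B": [1000, 2000]})

shift = 2  # row of df where the overlay should start

# Work on a copy so the original small frame keeps its 0, 1 index.
shifted = df1.copy()
shifted.index = shifted.index + shift  # labels become 2, 3

# update() modifies df in place and returns None; only cells whose
# index label and column name both exist in `shifted` are overwritten.
df.update(shifted)

print(df)
```

Note that update() skips NaN values in the smaller frame, so a missing reading would leave the original value in place rather than blanking it.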
CodePudding user response:
Yes, you need to align your indexes first. Pandas aligns data on index labels in most operations, so once the indexes line up you can overlay df2 (the small dataframe) onto df1 (the large one):
df2.set_axis([2, 3]).combine_first(df1)
Output:
   A     B
0  1    11
1  2    12
2  3  1000
3  4  2000
4  5    15
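A runnable sketch of this combine_first approach, with the hard-coded [2, 3] computed from a start row instead. The names `start` and `result` are introduced here for illustration, and the frames are rebuilt from the question's example; treat it as one possible way to write it rather than the definitive form.

```python
import pandas as pd

# Example frames rebuilt from the question.
df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [11, 12, 13, 14, 15]})
df2 = pd.DataFrame({"B": [1000, 2000]})

start = 2  # hypothetical parameter: row of df1 where the overlay begins

# Relabel df2's rows so they line up with rows start .. start+len(df2)-1.
aligned = df2.set_axis(pd.RangeIndex(start, start + len(df2)))

# combine_first() keeps the caller's values where they exist and fills
# everything else (other rows, other columns) from df1.
result = aligned.combine_first(df1)

print(result)
```

Unlike update(), combine_first() returns a new dataframe and leaves both inputs unchanged, which may be preferable if the original large frame is still needed.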