I am trying to map time category from the other time range dataframe. Here is the Time Range Dataframe. The time between "Start" and "End" is categorized into each "Weeks".
Start End Weeks
0 2019-08-30 2019-10-03 1
1 2019-10-04 2019-10-31 2
2 2019-11-01 2019-11-28 3
Here is my time dataframe.
Date
0 2019-09-06
1 2019-10-13
2 2019-10-12
3 2019-11-04
Here is the output I want. You can see that each date is categorized into the corresponding "Weeks" based on Time Range Dataframe.
Date Weeks
0 2019-09-06 1
1 2019-10-13 2
2 2019-10-12 2
3 2019-11-04 3
My code is as follows.
import pandas as pd
# Date Range dataframe
Date_Range = {'Start':[datetime.datetime(2019, 8, 30, 0, 0),datetime.datetime(2019, 10, 4, 0, 0),datetime.datetime(2019, 11, 1, 0, 0)],
'End':[datetime.datetime(2019, 10, 3, 0, 0),datetime.datetime(2019, 10, 31, 0, 0),datetime.datetime(2019, 11, 28, 0, 0)],
'Weeks':[1,2,3],}
Date_Range_DF = pd.DataFrame(Date_Range)
print(Date_Range_DF)
# My dataframe
Date = {'Date':[datetime.datetime(2019, 9, 6, 0, 0),datetime.datetime(2019, 10, 13, 0, 0),datetime.datetime(2019, 10, 12, 0, 0),datetime.datetime(2019, 11, 4, 0, 0)]}
Date_DF = pd.DataFrame(Date)
print(Date_DF)
# Merge Dataframe
import numpy as np
import pandas as pd
df1 = Date_Range
df2 = Date_Range_DF
df2.start, df2.End = pd.to_datetime(df2.Start), pd.to_datetime(df2.End)
filtered = pd.merge_asof(df1, df2, left_on='Date', right_on='Start')
However, it keeps showing the erroe TypeError: Can only merge Series or DataFrame objects, a <class 'dict'> was passed
Please help me to find the bugs. Thank.
CodePudding user response:
I have looked over your code and it seems to me that you haven't defined "datetime", which is inside your Date_Range on line 3. That may be why your getting the error you are seeing. You may want to define this and it should be good. Unless I misunderstood what you wanted...
CodePudding user response:
To merge two pandas dataframes, you can use the merge function. This function will allow you to specify which columns from each dataframe to merge on, as well as how to handle any rows that don't have a match in the other dataframe.
Here is an example of how to use the merge function to merge two dataframes based on time range:
import pandas as pd
load the first dataframe
df1 = pd.read_csv('df1.csv')
load the second dataframe
df2 = pd.read_csv('df2.csv')
merge the two dataframes on the 'time' column
merged_df = df1.merge(df2, on='time')
print the merged dataframe
print(merged_df)
In this example, we assume that both dataframes have a column called time that contains the time range for each row. The merge function will merge the two dataframes by matching rows with the same time range.
You can also specify additional arguments to the merge function to customize how the dataframes are merged. For example, you can specify the how argument to control how the rows are combined (e.g. using an inner join, outer join, etc.), or the left_on and right_on arguments to specify different column names to merge on in each dataframe.
For more information on the merge function, you can refer to the official pandas documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html
CodePudding user response:
Step1
make boundary list (bins
)
bins = [Date_Range_DF.loc[0, 'Start'] - pd.Timedelta('1day')] Date_Range_DF['End'].tolist()
bins
[Timestamp('2019-08-29 00:00:00'),
Timestamp('2019-10-03 00:00:00'),
Timestamp('2019-10-31 00:00:00'),
Timestamp('2019-11-28 00:00:00')]
Step2
divide category by pd.cut
(use boundary from bins
)
s = pd.cut(pd.to_datetime(Date_DF['Date']), bins=bins, labels=df1['Weeks'].tolist())
s
0 1
1 2
2 2
3 3
Name: Date, dtype: category
Categories (3, int64): [1 < 2 < 3]
Step3
make s
to Weeks
column
Date_DF['Weeks'] = s
Date_DF
Date Weeks
0 2019-09-06 1
1 2019-10-13 2
2 2019-10-12 2
3 2019-11-04 3
if you don want that dtype of Week
is category, use following code
for example make int
Date_DF['Weeks'] = s.astype('int')