Create a dataframe from a date range in Python

Time:04-01

I have an interval defined by two dates, each of which will be a Python Timestamp. I want a call like this:

create_interval('2022-01-12', '2022-01-17', 'Holidays')

It should create the following dataframe:

date interval_name
2022-01-12 00:00:00 Holidays
2022-01-13 00:00:00 Holidays
2022-01-14 00:00:00 Holidays
2022-01-15 00:00:00 Holidays
2022-01-16 00:00:00 Holidays
2022-01-17 00:00:00 Holidays

If it can be done in a few lines of code, I would appreciate it. Thank you very much for your help.

CodePudding user response:

If you're open to using pandas, this should accomplish what you've requested:

import pandas as pd

def create_interval(start, end, field_val):
    # Build the date range that will serve as the index
    idx = pd.date_range(start, end)
    # Create the dataframe on that index with an empty 'interval_name' column
    df = pd.DataFrame(index=idx, columns=['interval_name'])
    # Name the index
    df.index.name = 'date'
    # Fill every row of 'interval_name' with the field_val parameter
    df['interval_name'] = field_val
    return df

create_interval('2022-01-12', '2022-01-17', 'holiday')
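If the date should be an ordinary column rather than the index, as in the table in the question, a shorter variant may be enough. This is only a sketch: the function name create_interval_frame is hypothetical, and it relies on pandas broadcasting a scalar value across all rows when it is passed alongside an array-like column.

```python
import pandas as pd

def create_interval_frame(start, end, interval_name):
    # date_range uses a daily frequency by default and includes both endpoints
    dates = pd.date_range(start, end)
    # The scalar interval_name is broadcast to every row of the frame
    return pd.DataFrame({'date': dates, 'interval_name': interval_name})

df = create_interval_frame('2022-01-12', '2022-01-17', 'Holidays')
print(df)
```

Here the date column keeps its datetime64 dtype, so pandas will typically print it without the 00:00:00 part when every value falls on midnight. The values are still full timestamps.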

CodePudding user response:

I hope I coded exactly what you need.

import pandas as pd

def create_interval(ts1, ts2, interval_name):
    # Convert each timestamp in the range to its string form, e.g. '2022-01-12 00:00:00'
    ts_list = [str(ts) for ts in pd.date_range(start=ts1, end=ts2)]
    d = {'date': ts_list, 'interval_name': [interval_name]*len(ts_list)}
    df = pd.DataFrame(data=d)
    return df

df = create_interval('2022-01-12', '2022-01-17', 'Holidays')
print(df)

Output:

                  date interval_name
0  2022-01-12 00:00:00      Holidays
1  2022-01-13 00:00:00      Holidays
2  2022-01-14 00:00:00      Holidays
3  2022-01-15 00:00:00      Holidays
4  2022-01-16 00:00:00      Holidays
5  2022-01-17 00:00:00      Holidays

If you want the DataFrame without the default integer index column, call df = df.set_index('date') right after creating it with df = pd.DataFrame(data=d). You will then get:

                     interval_name
date
2022-01-12 00:00:00       Holidays
2022-01-13 00:00:00       Holidays
2022-01-14 00:00:00       Holidays
2022-01-15 00:00:00       Holidays
2022-01-16 00:00:00       Holidays
2022-01-17 00:00:00       Holidays
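The effect of set_index can be checked with a small standalone snippet. This is a sketch that rebuilds the same data inline; the variable name indexed is only illustrative.

```python
import pandas as pd

# Same data as in the answer above, built inline so the snippet runs on its own
dates = [str(ts) for ts in pd.date_range('2022-01-12', '2022-01-17')]
df = pd.DataFrame({'date': dates, 'interval_name': ['Holidays'] * len(dates)})

# set_index returns a new DataFrame with 'date' moved from the columns to the index
indexed = df.set_index('date')

print(indexed.index.name)     # date
print(list(indexed.columns))  # ['interval_name']
# Rows can now be looked up by their date string
print(indexed.loc['2022-01-14 00:00:00', 'interval_name'])  # Holidays
```

Because the dates were converted to strings, the index holds plain strings. Lookups therefore need the exact text, including the time part.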