Python - generate a timestamp table in pandas given a date period

Time:02-19

This is a mixture of these two questions:

Pandas: is a Timestamp within a Period (because it deals with time periods in pandas)

Generate a random date between two other dates (but I need many dates: at least 1 million, a count I specify with a variable LIMIT)

How can I generate a given number of random dates, each with a random time, within a given date period?

Performance is rather important to me, which is why I chose pandas. Any performance boost is appreciated, even if it means using another library.

My approach so far would be the following:

import pandas as pd

# both bounds written in the same (ISO) format so they parse consistently
tstamp = pd.to_datetime(['2010-01-01', '2020-12-31'])
# ???

But I don't know how to randomize between the two dates. I was thinking of using randint to draw a random Unix epoch time and then converting it, but I expect that would slow things down a lot.

CodePudding user response:

All I had to do was add str(fake.date_time_between(start_date='-10y', end_date='now')) to my pandas DataFrame append logic. I'm not even sure that the str() is necessary.

P.S. You initialize it like this:

from faker import Faker
# initialize Faker
fake = Faker()

CodePudding user response:

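You can try this; it is very fast. Note that it picks whole days only, without a time-of-day component:

import numpy as np

start = np.datetime64('2017-01-01')
end = np.datetime64('2018-01-01')
limit = 1000000
# every day in [start, end), at day resolution
delta = np.arange(start, end)
# draw `limit` positions with replacement
indices = np.random.choice(len(delta), limit)
delta[indices]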
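
The snippet above samples at day resolution, while the question asks for a random time as well. One possible way to get that, sketched here under a few assumptions, is to draw random second offsets from the start date instead of building an array of every candidate value. The function name random_timestamps and its seed parameter are hypothetical, the bounds are assumed to be ISO-formatted strings, and the end bound is treated as exclusive.

```python
import numpy as np
import pandas as pd


def random_timestamps(start, end, limit, seed=None):
    """Draw `limit` timestamps uniformly from [start, end) at one-second resolution.

    Hypothetical helper for illustration; `start` and `end` are assumed
    to be ISO date strings such as '2010-01-01'.
    """
    rng = np.random.default_rng(seed)
    start_s = np.datetime64(start, "s")
    end_s = np.datetime64(end, "s")
    # number of whole seconds between the two bounds
    span = int((end_s - start_s) / np.timedelta64(1, "s"))
    # one vectorized draw of integer offsets, with replacement
    offsets = rng.integers(0, span, size=limit)
    # add the offsets to the start bound and hand the result to pandas
    return pd.DatetimeIndex(start_s + offsets.astype("timedelta64[s]"))


LIMIT = 1_000_000
stamps = random_timestamps("2010-01-01", "2020-12-31", LIMIT, seed=0)
```

Because the offsets come from a single vectorized call, no Python-level loop is involved, so the epoch-offset idea from the question need not be slow. If a table is needed, the result can be wrapped with pd.DataFrame({'timestamp': stamps}).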