I'm trying to normalize an array of elements over a time range. Say you have 20 bank transactions that occurred on Jan 1st, 2022:
transaction 1 - 2022/01/01
transaction 2 - 2022/01/01
...
transaction 20 - 2022/01/01
We don't have any data other than the day they occurred, but we still want to assign each of them a time of day, so they end up as:
transaction 1 - 2022/01/01 00:00
transaction 2 - 2022/01/01 ??:??
...
transaction 20 - 2022/01/01 23:59
In Go I have this function that tries to calculate the normalized time of day for an index in an array of elements:
func normal(start, end time.Time, arraySize, index float64) time.Time {
	delta := end.Sub(start)
	minutes := delta.Minutes()
	duration := minutes * ((index + 1) / arraySize)
	return start.Add(time.Duration(duration) * time.Minute)
}
However, I get an unexpected result of 2022/1/1 05:59 for index 0 in an array of 4 elements over a time range of 2022/1/1 00:00 to 2022/1/1 23:59, where I would expect to see 2022/1/1 00:00. The only index that works correctly under these conditions is index 3.
So, what am I doing wrong with my normalization?
EDIT:
Here is the function fixed thanks to @icza
func timeIndex(min, max time.Time, entries, position float64) time.Time {
	delta := max.Sub(min)
	minutes := delta.Minutes()
	if position < 0 {
		position = 0
	}
	duration := (minutes * (position / (entries - 1)))
	return min.Add(time.Duration(duration) * time.Minute)
}
Here is an example: let's say our start and end times are 2022/01/01 00:00 and 2022/01/01 00:03, we have 3 entries in our array of bank transactions, and we want to get the normalized time for transaction nº 3 (index 2 in the array):
result := timeIndex(time.Date(2022, time.January, 1, 0, 0, 0, 0, time.UTC), time.Date(2022, time.January, 1, 0, 3, 0, 0, time.UTC), 3, 2)
Since there are only 3 minutes between the start and end times (from 00:00 to 00:03) and we want the normalized time for the last entry (index 2) in the array (size 3), the result should be:
fmt.Printf("%t", result.Equal(time.Date(2022, time.January, 1, 0, 3, 0, 0, time.UTC)))
// prints "true"
that is, the last minute in the range, which is 00:03.
Here is a reproducible example: https://go.dev/play/p/EzwkqaNV1at
CodePudding user response:
Between n points there are n-1 segments. This means that if you want to include both start and end in the interpolation, the number of time periods that delta is split into is arraySize - 1.
Also, if you add 1 to the index, you can't possibly get start as the result (you'll skip the 00:00).
So the correct algorithm is this:
func normal(start, end time.Time, arraySize, index float64) time.Time {
	minutes := end.Sub(start).Minutes()
	duration := minutes * (index / (arraySize - 1))
	return start.Add(time.Duration(duration) * time.Minute)
}
Try it on the Go Playground.
Also note that if you have many transactions (on the order of the number of minutes in a day, which is 1440), you may easily end up with multiple transactions sharing the same timestamp (same hour and minute). If you want to avoid this, use a finer precision than minutes, e.g. seconds or milliseconds:
func normal(start, end time.Time, arraySize, index float64) time.Time {
	sec := end.Sub(start).Seconds()
	duration := sec * (index / (arraySize - 1))
	return start.Add(time.Duration(duration) * time.Second)
}
Yes, this will result in timestamps whose seconds are not necessarily zero, but it will keep timestamps distinct for larger numbers of transactions.
If the number of transactions is of an order of magnitude close to the number of seconds in a day (which is 86400), then you can completely drop this "unit" and use time.Duration itself (which is a number of nanoseconds). This will guarantee timestamp uniqueness even for the highest number of transactions:
func normal(start, end time.Time, arraySize, index float64) time.Time {
	delta := float64(end.Sub(start))
	duration := delta * (index / (arraySize - 1))
	return start.Add(time.Duration(duration))
}
Testing this with 1 million transactions, here are the first 20 time parts (neighboring entries differ only by a fraction of a second):
0 - 00:00:00.00000
1 - 00:00:00.08634
2 - 00:00:00.17268
3 - 00:00:00.25902
4 - 00:00:00.34536
5 - 00:00:00.43170
6 - 00:00:00.51804
7 - 00:00:00.60438
8 - 00:00:00.69072
9 - 00:00:00.77706
10 - 00:00:00.86340
11 - 00:00:00.94974
12 - 00:00:01.03608
13 - 00:00:01.12242
14 - 00:00:01.20876
15 - 00:00:01.29510
16 - 00:00:01.38144
17 - 00:00:01.46778
18 - 00:00:01.55412
19 - 00:00:01.64046
Try this one on the Go Playground.