How can Haskell manage / resample a series of irregular data?

Time:06-13

A DESCRIPTION OF THE PROBLEM DOMAIN IS PROVIDED BELOW

Given a TimeStamp defined as...

type TimeStamp = Int

a DataPoint defined as...

data DataPoint a = DataPoint {index :: TimeStamp, value :: a} deriving (Show, Eq)

instance Foldable DataPoint where
  foldMap f (DataPoint _ y) = f y

a Series defined as...

data Series a = Series [DataPoint a]

instance Foldable Series where
  foldMap f (Series xs) = foldMap (foldMap f) xs
  length = size

and the following helpers:

emptySeries :: Series a
emptySeries = Series []

timeSeries :: [(TimeStamp, a)] -> Series a
timeSeries xs = Series $ map (uncurry DataPoint) xs

size :: Series a -> Int
size (Series xs) = length xs

How can I create a timeSeries of a specified resolution, given an irregular set of data? The resulting timeSeries should contain the most up-to-date value for every point in time at the given resolution; to make the latest values easy to reach, the newest data is at the front of the timeSeries. For example...

If irrData = [(98,5), (96,4), (93,9)] and the resolution is 1, the resulting timeSeries would be [(98,5), (97,4), (96,4), (95,9), (94,9), (93,9)]
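
The rule behind this example can be written down as a small lookup, which may help when checking any candidate solution. The sketch below is only an illustration: `latestAt` and `expected` are hypothetical names, not part of the code above, and it assumes plain (TimeStamp, value) pairs rather than the Series type.

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

type TimeStamp = Int

-- Hypothetical helper: the value of the newest point whose timestamp is <= t,
-- or Nothing when every point is newer than t.
latestAt :: TimeStamp -> [(TimeStamp, a)] -> Maybe a
latestAt t pts =
  case filter ((<= t) . fst) (sortOn (Down . fst) pts) of
    (_, v) : _ -> Just v
    []         -> Nothing

-- The irregular data from the example, looked up at every timestamp
-- from 98 down to 93 (a resolution of 1).
expected :: [(TimeStamp, Maybe Int)]
expected = [(t, latestAt t irrData) | t <- [98, 97 .. 93]]
  where
    irrData = [(98, 5), (96, 4), (93, 9)]
-- [(98,Just 5),(97,Just 4),(96,Just 4),(95,Just 9),(94,Just 9),(93,Just 9)]
```

Timestamps 97, 95 and 94 have no data point of their own, so each takes the value of the nearest older point.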

Bonus points if you can...

  • use the same function to resample a timeSeries to a different resolution
  • correctly parse data provided in any order (e.g. [(98,5), (101,2), (93,4)])

MY CURRENT SOLUTION ATTEMPT IS PROVIDED BELOW, AND WILL CHANGE OVER TIME AS I WORK TOWARDS ITS FINAL FORM

The solution below functions as expected; I would still like to clean the code, as one line is very long. See the posted answer for details.

import Data.Function (on)
import Data.List (sortBy)

resample :: Int -> [(Int, v)] -> [(Int, v)]
resample _ [] = []
resample _ [x] = [x]
resample r xs = foldr (\i acc -> (i, snd $ head (filter (\x -> (fst x) <= i) s)):acc) [] [(fst $ head s), (fst $ head s) - r .. (fst $ last s)]
  where
    s = sortBy (flip $ on compare fst) xs

CodePudding user response:

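import Data.Function (on)
import Data.List (sortBy)

resample :: Int -> [(Int, v)] -> [(Int, v)]
resample _ [] = []
resample _ [x] = [x]
resample r xs = [(i, latest i) | i <- [newest, newest - r .. oldest]]
  where
    s = sortBy (flip $ on compare fst) xs            -- newest point first
    (newest, oldest) = (fst $ head s, fst $ last s)  -- bounds of the series
    latest i = snd . head $ filter ((<= i) . fst) s  -- newest value at or before i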

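Given an Int resolution and a [(Int, v)] series, resample returns the resampled series, newest point first. The result stays within the timestamps of the original series, stepping down from the newest timestamp by the given resolution; each point carries the most recent value from the original series at or before its timestamp. The input may be in any order, since it is sorted first. The resolution must be positive: a resolution of 0 produces an infinite list.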
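
As a rough check, the function can be run against the example from the question and both bonus cases. The sketch below carries its own copy of resample, with the same logic and the imports it needs, so that it compiles on its own. `resampleSeries` and the three example bindings are hypothetical names, not from the original post; the wrapper shows one way the result might be lifted into the question's Series type.

```haskell
import Data.Function (on)
import Data.List (sortBy)

type TimeStamp = Int

data DataPoint a = DataPoint {index :: TimeStamp, value :: a} deriving (Show, Eq)

data Series a = Series [DataPoint a]

-- Copy of resample with the same logic as above, newest point first.
resample :: Int -> [(Int, v)] -> [(Int, v)]
resample _ [] = []
resample _ [x] = [x]
resample r xs = [(i, latest i) | i <- [newest, newest - r .. oldest]]
  where
    s = sortBy (flip $ on compare fst) xs
    (newest, oldest) = (fst $ head s, fst $ last s)
    latest i = snd . head $ filter ((<= i) . fst) s

-- Hypothetical wrapper (an assumption, not from the post): unpack the
-- points, resample the pairs, and rebuild the Series.
resampleSeries :: Int -> Series a -> Series a
resampleSeries r (Series pts) =
  Series [DataPoint t v | (t, v) <- resample r [(index p, value p) | p <- pts]]

-- The example from the question, at a resolution of 1.
fromQuestion :: [(Int, Int)]
fromQuestion = resample 1 [(98, 5), (96, 4), (93, 9)]
-- [(98,5),(97,4),(96,4),(95,9),(94,9),(93,9)]

-- Bonus: input in arbitrary order, at a resolution of 2.
unordered :: [(Int, Int)]
unordered = resample 2 [(98, 5), (101, 2), (93, 4)]
-- [(101,2),(99,5),(97,4),(95,4),(93,4)]

-- Bonus: resampling an already regular series to a coarser resolution.
coarser :: [(Int, Int)]
coarser = resample 2 fromQuestion
-- [(98,5),(96,4),(94,9)]
```

Note that in the last case the oldest timestamp (93) does not appear, because stepping down from 98 in steps of 2 ends at 94. Each sample also filters the whole sorted list, so the cost grows with the number of samples times the number of input points; that is fine for small series but may be worth revisiting for large ones.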
