According to the inherited time in the +RTS -p profile, 90% of my execution time is spent running Generic-based instance MyClass MyType definitions, for a single class, recursively over 100 types/instances. I've tried getting GHC to tell me the details of that, to figure out which instances are slow or which are called more often, so I can go and hand-optimize those.
However, I can't seem to get GHC to put cost centres on these functions, even with -fprof-auto-calls.
What can I do to get cost centres on each type class instance function, rather than just one for the whole top-level thing, so I know how much time is spent in each one?
Using ghc 9.2.4.
CodePudding user response:
If you have a typical generic function setup with a class for processing the representation whose instances do all the actual work:
class GSlow f where
  gslow :: f a -> Int
a separate type class for the generic function itself:
class Slow a where
  slow :: a -> Int
  default slow :: (Generic a, GSlow (Rep a)) => a -> Int
  slow = defaultSlow

defaultSlow :: (Generic a, GSlow (Rep a)) => a -> Int
defaultSlow = gslow . from
an instance for fields (K1) that passes control from one data type to the other:
instance Slow c => GSlow (K1 i c) where
  gslow (K1 x) = ... slow x ...
and a whole bunch of empty instances for your 100 data types:
instance Slow Type1
instance Slow Type2
etc.
then, by far, the easiest thing to do is to search and replace your empty instance Slow declarations with:
instance Slow Type1 where slow = defaultSlow
instance Slow Type2 where slow = defaultSlow
With -fprof-auto, you should get a cost centre for every type-specific slow = defaultSlow function, which should allow you to attribute the work in the GSlow instances to individual data types.
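
To try this out in isolation, here is a minimal, self-contained sketch of the setup described above. The data types Leaf and Tree, the Slow Int instance, and the arithmetic inside the GSlow instances are made up purely for illustration (the original K1 instance body is elided, so the "1 + slow x" below is an assumption, not the asker's actual code). The point is only that each data type gets its own explicit slow = defaultSlow binding.

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeOperators #-}

module Main (main) where

import GHC.Generics

-- Generic worker class: its instances walk the representation type.
class GSlow f where
  gslow :: f a -> Int

instance GSlow U1 where
  gslow U1 = 0

instance (GSlow f, GSlow g) => GSlow (f :*: g) where
  gslow (x :*: y) = gslow x + gslow y

instance (GSlow f, GSlow g) => GSlow (f :+: g) where
  gslow (L1 x) = gslow x
  gslow (R1 y) = gslow y

instance GSlow f => GSlow (M1 i c f) where
  gslow (M1 x) = gslow x

-- Fields hand control back to the user-facing class.
-- The "1 +" is a placeholder for whatever real work happens here.
instance Slow c => GSlow (K1 i c) where
  gslow (K1 x) = 1 + slow x

-- User-facing class with a generic default.
class Slow a where
  slow :: a -> Int
  default slow :: (Generic a, GSlow (Rep a)) => a -> Int
  slow = defaultSlow

defaultSlow :: (Generic a, GSlow (Rep a)) => a -> Int
defaultSlow = gslow . from

-- Hypothetical base instance so the recursion bottoms out.
instance Slow Int where
  slow _ = 1

-- Hypothetical data types standing in for Type1, Type2, ...
data Leaf = Leaf Int Int deriving (Generic)
data Tree = Node Leaf Leaf | Tip deriving (Generic)

-- Explicit method bindings instead of empty instances, so each
-- type has its own binding for the profiler to annotate.
instance Slow Leaf where slow = defaultSlow
instance Slow Tree where slow = defaultSlow

main :: IO ()
main = print (slow (Node (Leaf 1 2) (Leaf 3 4)))
```

Assuming the file is saved as Main.hs, building with ghc -prof -fprof-auto -rtsopts Main.hs and running ./Main +RTS -p should produce a Main.prof report in which the time inherited by each per-type slow binding can be compared.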