I have the following dataframe called df:
using DataFrames
df = DataFrame(group = ["A", "A", "A", "A", "B", "B", "B", "B"],
value = [2,1,4,3,3,5,2,1])
8×2 DataFrame
Row │ group value
│ String Int64
─────┼───────────────
1 │ A 2
2 │ A 1
3 │ A 4
4 │ A 3
5 │ B 3
6 │ B 5
7 │ B 2
8 │ B 1
I would like to calculate the difference with previous values of consecutive rows in column value per group. The offset should have NaN, 0, or missing. Here is the desired output:
8×3 DataFrame
Row │ group value diff
│ String Int64 Float64
─────┼────────────────────────
1 │ A 2 NaN
2 │ A 1 -1.0
3 │ A 4 3.0
4 │ A 3 -1.0
5 │ B 3 NaN
6 │ B 5 2.0
7 │ B 2 -3.0
8 │ B 1 -2.0
So I was wondering if anyone knows how to calculate the difference between consecutive rows per group in Julia
?
CodePudding user response:
Using DataFrames.jl (you can replace missing
by any value you like):
julia> select(groupby(df, :group),
:value => (x -> [missing; diff(x)]) => :diff)
8×2 DataFrame
Row │ group diff
│ String Int64?
─────┼─────────────────
1 │ A missing
2 │ A -1
3 │ A 3
4 │ A -1
5 │ B missing
6 │ B 2
7 │ B -3
8 │ B -1
Using DataFramesMeta.jl:
julia> @chain df begin
groupby(:group)
@select :diff = [missing; diff(:value)]
end
8×2 DataFrame
Row │ group diff
│ String Int64?
─────┼─────────────────
1 │ A missing
2 │ A -1
3 │ A 3
4 │ A -1
5 │ B missing
6 │ B 2
7 │ B -3
8 │ B -1
Normally diff
in Julia like in e.g. R would produce one less row (and the syntax would be simpler:
julia> combine(groupby(df, :group), :value => diff => :diff)
6×2 DataFrame
Row │ group diff
│ String Int64
─────┼───────────────
1 │ A -1
2 │ A 3
3 │ A -1
4 │ B 2
5 │ B -3
6 │ B -1
julia> @chain df begin
groupby(:group)
@combine :diff = diff(:value)
end
6×2 DataFrame
Row │ group diff
│ String Int64
─────┼───────────────
1 │ A -1
2 │ A 3
3 │ A -1
4 │ B 2
5 │ B -3
6 │ B -1
Yet another way would be to use lag
from ShiftedArrays.jl:
julia> using ShiftedArrays: lag
julia> @chain df begin
groupby(:group)
@combine :diff = :value - lag(:value)
end
8×2 DataFrame
Row │ group diff
│ String Int64?
─────┼─────────────────
1 │ A missing
2 │ A -1
3 │ A 3
4 │ A -1
5 │ B missing
6 │ B 2
7 │ B -3
8 │ B -1