I have the following dataframe:
using DataFrames
df = DataFrame(
condition = [false, false, true, false, false, false, true, false, false, false],
time = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
Output:
10×2 DataFrame
 Row │ condition  time
     │ Bool       Int64
─────┼──────────────────
   1 │     false      1
   2 │     false      2
   3 │      true      3
   4 │     false      4
   5 │     false      5
   6 │     false      6
   7 │      true      7
   8 │     false      8
   9 │     false      9
  10 │     false     10
I would like to calculate, for each row, the distance in rows to the nearest row where the condition is true. For example, for row 1 the nearest true is 2 rows away. Rows where the condition is true should have a value of 0. Here is the desired output:
10×3 DataFrame
 Row │ condition  time   diff
     │ Bool       Int64  Int64
─────┼─────────────────────────
   1 │     false      1      2
   2 │     false      2      1
   3 │      true      3      0
   4 │     false      4      1
   5 │     false      5      2
   6 │     false      6      1
   7 │      true      7      0
   8 │     false      8      1
   9 │     false      9      2
  10 │     false     10      3
Does anyone know how to calculate the distance in rows to the closest conditioned value in a Julia DataFrame?
Answer:
transform(df, :condition =>
    (w -> ((f, u) -> min.(f(u), reverse(f(reverse(u)))))(
        # f: rows since the last true, scanning left to right
        v -> accumulate(
            (x, y) -> ifelse(y, 0, x + 1),
            v; init = length(v)
        ),
        w
    )) => :diff)
(u, v, w are vectors; x is the running Int count and y is a Bool element; f is a function.)
This does the job, with output:
10×3 DataFrame
 Row │ condition  time   diff
     │ Bool       Int64  Int64
─────┼─────────────────────────
   1 │     false      1      2
   2 │     false      2      1
   3 │      true      3      0
   4 │     false      4      1
   5 │     false      5      2
   6 │     false      6      1
   7 │      true      7      0
   8 │     false      8      1
   9 │     false      9      2
  10 │     false     10      3
In my REPL it was a one-liner, as follows; the version above is just reformatted to be more readable:
transform(df, :condition => (v->((f, v)->min.(f(v),reverse(f(reverse(v)))))(v->accumulate((x, y)->ifelse(y, 0, x + 1), v; init=length(v)), v)) => :diff)
This is neither the clearest nor the most efficient way, but it is a short piece of code. For a clearer and more efficient result, a separate function should be defined.
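As a rough sketch of what such a separate function might look like (the name nearest_true_distance is made up for illustration, and 1-based indexing is assumed), the same two-pass idea can be written with plain loops: one pass counts rows since the last true on the left, the other does the same from the right, and the smaller of the two is kept.

```julia
# Hypothetical helper (name is an assumption, not from the original answer).
# Returns, for each element, the distance in rows to the nearest `true`.
# Assumes a 1-based vector; if there is no `true`, all values are >= length(cond).
function nearest_true_distance(cond::AbstractVector{Bool})
    n = length(cond)
    dist = fill(n, n)

    # Forward pass: rows since the most recent `true` to the left.
    gap = n                      # same role as init=length(v) in the one-liner
    for i in 1:n
        gap = cond[i] ? 0 : gap + 1
        dist[i] = gap
    end

    # Backward pass: rows until the next `true` to the right; keep the smaller.
    gap = n
    for i in n:-1:1
        gap = cond[i] ? 0 : gap + 1
        dist[i] = min(dist[i], gap)
    end

    return dist
end

# With DataFrames loaded, it could be applied as:
# transform(df, :condition => nearest_true_distance => :diff)
```

This avoids the intermediate reversed copies of the vector and keeps the logic readable, at the cost of a few more lines.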
One last thing: the column has to contain at least one true value, otherwise the results are not meaningful (this can be checked easily with a bit more code, but I am not sure what the OP wants in this case).
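A minimal sketch of such a check, assuming (and this is only a guess at the desired behavior) that a column with no true value should yield missing in every row. The function name diff_or_missing is hypothetical, and the distance computation here is a simple, less efficient variant chosen for brevity.

```julia
# Hypothetical wrapper (name and `missing` policy are assumptions).
# Returns a vector of `missing` when the column contains no `true`,
# otherwise the distance in rows to the nearest `true`.
function diff_or_missing(cond::AbstractVector{Bool})
    # Guard: without any `true` there is no meaningful distance.
    any(cond) || return fill(missing, length(cond))

    hits = findall(cond)         # indices of the `true` rows
    # Simple O(n * k) computation: nearest hit for every index.
    return [minimum(abs(i - j) for j in hits) for i in eachindex(cond)]
end

# With DataFrames loaded, it could be applied as:
# transform(df, :condition => diff_or_missing => :diff)
```

Throwing an error instead of returning missing would be an equally reasonable choice, depending on how the result is used downstream.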