Pentaho data integration loop count variable

Time:11-11

I want a simple loop that counts the number of iterations, like the following Java code:

int count = 0;
for (int i = 0; i < 3; i++) {
    count = count + 1;
}
System.out.println(count);

I am doing this in Pentaho Data Integration. I have one job containing three transformations. The first transformation sets the number of loops (3 in the example above). The second transformation has "Execute every input row" checked so that it loops, and it updates a variable using JavaScript with the getVariable() and setVariable() functions. The last transformation just gets the variable and writes it to the log to show the count. The problem is that every iteration of transformation 2 reads the variable as 0, so the result ends up as 1, whereas I expect 3.

added the project files here: Passing a value to a variable in a job

I prefer using parameters (next tab) better than arguments/variables, but that's my preference.

CodePudding user response:

The problem is that in the t2 transformation you get the variable and set a new value for the same variable at the same time, which does not work within a single transformation. When you close the Set variable step you get this warning:

[Image: Set variable step warning]

To avoid this, use two variables: one that you set before executing the loop, and another that you set either each time the loop executes or once after the loop, with the final value.

I have modified your job to make it work. In the t1 transformation, I added a new field (rownum_seq), created with the Add sequence step, so that I know how much to add to the variable cnt in each execution of the loop. I could have used your id field, but in case your real-world job has no similar field, that is the step you need to achieve something equivalent. I have also renamed the variables to make it clearer what I'm doing: in t1 I set the value of the variable var_cnt_before.

In the t2 transformation, I read var_cnt_before and set var_cnt_after to the sum var_cnt_before + rownum_seq. This means the value of var_cnt_after changes each time t2 is executed.
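The flow described above can be modelled in plain Java. This is only a sketch of the logic, not PDI API code: the map standing in for the job's variable space and the runT2 and runJob helpers are hypothetical, and rownum_seq is assumed to run 1, 2, 3, ... as produced by an Add sequence step that starts at 1 and increments by 1.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the t1 -> t2 (looped) -> t3 flow; not PDI API code.
class TwoVariableLoopSketch {

    // Hypothetical stand-in for the job-level variable space.
    static final Map<String, String> jobVariables = new HashMap<>();

    // t2: read var_cnt_before, write var_cnt_after.
    // The same variable is never read and written in one transformation.
    static void runT2(int rownumSeq) {
        int before = Integer.parseInt(jobVariables.get("var_cnt_before"));
        jobVariables.put("var_cnt_after", String.valueOf(before + rownumSeq));
    }

    static int runJob(int loops) {
        jobVariables.clear();
        // t1: set the starting value once, before the loop.
        jobVariables.put("var_cnt_before", "0");
        // "Execute every input row": t2 runs once per row, rownum_seq = 1..loops.
        for (int rownumSeq = 1; rownumSeq <= loops; rownumSeq++) {
            runT2(rownumSeq);
        }
        // t3: read the value left by the last execution of t2
        // (falls back to var_cnt_before if t2 never ran).
        String after = jobVariables.getOrDefault(
                "var_cnt_after", jobVariables.get("var_cnt_before"));
        return Integer.parseInt(after);
    }

    public static void main(String[] args) {
        System.out.println(runJob(3)); // prints 3
    }
}
```

Because var_cnt_before never changes inside the loop, each execution of t2 overwrites var_cnt_after with the starting value plus the current sequence number, so the last execution leaves the total.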

In t3 transformation, I read var_cnt_after, which has the value of the last execution of t2.

You could also calculate var_cnt_after in t1 and leave it untouched in t2, using the Group by step to get the maximum value of rownum_seq. That way you don't need to modify the variable each time t2 is executed. Which option fits depends on what you need: if t2 has to use or change the running value, set it in t2; if you only need the final value, calculate it once in t1.
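As a rough illustration of that alternative, the final value can be computed in one pass by taking the maximum of rownum_seq, which is what a Group by step with a maximum aggregate would produce. The sketch below is plain Java, not PDI code; the countAfter helper and the sample sequence are assumptions made for the example.

```java
import java.util.List;

// Plain-Java model of computing var_cnt_after once in t1; not PDI API code.
class GroupByMaxSketch {

    // Hypothetical helper: starting value plus the maximum of rownum_seq.
    static int countAfter(int varCntBefore, List<Integer> rownumSeq) {
        int max = rownumSeq.stream()
                .mapToInt(Integer::intValue)
                .max()
                .orElse(0); // no rows means nothing to add
        return varCntBefore + max;
    }

    public static void main(String[] args) {
        // rownum_seq for three input rows, assuming the sequence starts at 1.
        System.out.println(countAfter(0, List.of(1, 2, 3))); // prints 3
    }
}
```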

This is the link to the modified job and transformations.
