Reactor Flux does not run on multiple threads on ECS

Time:08-08

I am parsing a large file using Flux. Here is the main code: [image: code]

I want it to run on multiple threads, which .parallel() followed by .runOn(Schedulers.boundedElastic()) should do.

It works fine on my personal PC. As shown in the picture, five threads run concurrently. [image: local run]

But when it is deployed on ECS, things are different: only a single thread is running. [image: ecs run]

Tests I made

I don't know if this is related to millicores on ECS.

When I ran some tests on ECS, I found that requesting 500m (millicores) of CPU gives one thread, and requesting 2000m of CPU gives two threads.

Question

But why is it related to CPU cores on ECS?

And what should I do to make it run on multiple threads, say 10 or more?

CodePudding user response:

It is indeed related to the number of available CPU cores as that's how the parallel operator is implemented in Reactor by default.
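To see what the JVM itself reports inside the container, a quick check like the following may help. It is a minimal sketch that uses only the standard library; the class name CpuCheck and the method visibleCores are made up for illustration. On container-aware JVMs this value typically reflects the container's CPU limit rather than the host's core count, which would be consistent with the thread counts observed in the question.

```java
public class CpuCheck {

    // Number of CPU cores the JVM believes it can use.
    // Inside a container this usually follows the CPU limit, not the host hardware.
    static int visibleCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("CPU cores visible to the JVM: " + visibleCores());
    }
}
```

Running this once locally and once in the ECS task should show whether the two environments really differ in the number of cores the JVM sees.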

The naive solution is to use the overloaded version of the parallel operator and pass the parallelization level yourself:

flux
  .parallel(10)
  .runOn(...)
  //...

However, note that if the pipeline is doing CPU intensive work, then this will not make your processing any faster as you'll have one CPU core anyway which can only do one thing at a time.

If the pipeline is doing IO intensive work (database calls, HTTP calls, file reads/writes, etc.) then it doesn't make sense to use ParallelFlux as that is not intended for this purpose. Instead you can use flatMap operator to make the IO operations concurrent like this:

flux
  .flatMap(x -> reactiveDatabaseCall(x))
  .flatMap(x -> Mono.fromCallable(() -> blockingIoOperation(x)).subscribeOn(Schedulers.boundedElastic()))

By default, flatMap has a concurrency level of 256, meaning it subscribes to at most 256 inner publishers at a time. If that's too many, use the overloaded version of flatMap that takes the desired concurrency level.
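To illustrate what a concurrency level means, here is a rough sketch that does not use Reactor at all: it swaps in plain java.util.concurrent with a Semaphore to cap the number of blocking tasks in flight. The class name BoundedConcurrency, the method runWithLimit, and the 20 ms sleep standing in for a blocking IO call are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedConcurrency {

    // Runs taskCount simulated blocking tasks with at most `limit` in flight
    // and returns the highest number of tasks observed running at once.
    static int runWithLimit(int taskCount, int limit) {
        ExecutorService pool = Executors.newCachedThreadPool();
        Semaphore permits = new Semaphore(limit);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < taskCount; i++) {
                permits.acquireUninterruptibly(); // wait until a slot is free
                futures.add(pool.submit(() -> {
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(20); // stand-in for a blocking IO call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release(); // free the slot for the next task
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every task to finish
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak in-flight tasks: " + runWithLimit(20, 3));
    }
}
```

Because the tasks spend their time waiting rather than computing, several of them can be in flight even with a single CPU core, which is why bounding IO concurrency is a separate concern from the core-based parallelism discussed above.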

CodePudding user response:

Per the official documentation for ECS Task Definitions, the supported CPU and Memory values are as follows:

CPU value        | Memory value                               | Operating systems supported for Fargate
256 (.25 vCPU)   | 512 MiB, 1 GB, 2 GB                        | Linux
512 (.5 vCPU)    | 1 GB, 2 GB, 3 GB, 4 GB                     | Linux
1024 (1 vCPU)    | 2 GB, 3 GB, 4 GB, 5 GB, 6 GB, 7 GB, 8 GB   | Linux, Windows
2048 (2 vCPU)    | Between 4 GB and 16 GB in 1 GB increments  | Linux, Windows
4096 (4 vCPU)    | Between 8 GB and 30 GB in 1 GB increments  | Linux, Windows

Note the number of vCPU. To get multiple CPU cores available you will need to use either 2048 or 4096 as your CPU value in the task definition.
