We have a parallel scheduling system that works as follows:
There are N parallel processes (or agents), each of which computes one task at a time and runs it to completion without preemption.
We have an event log containing the start and end times of every task. Tasks are grouped into jobs, and several jobs run in parallel. Jobs don't depend on each other, and the tasks within a job can ALL run in parallel at the same time.
What happens is that as we increase the number of jobs, the effective parallelism of each job drops; or rather, a major chunk of a job's tasks ends up executing nearly serially.
e.g. if I have 10 processes and 10 jobs with 10 tasks each, then each job effectively runs on just 1 process, so the tasks within each job run serially.
I want to quantify this effect, and I am not sure what the best way to do so is. Perhaps what I want to quantify is resource contention.
Some ideas I had in mind:
1. Create a graph of tasks per job.
There is a directed edge from task T1 to task T2 if T1.end < T2.start.
Then the number of tasks that were executed serially would be the length of the longest path in this directed graph (see the sketch after this list).
Do you see any concern with this approach? Is there a better way to model this?
2. Time to start a task.
For each task, I calculate task.start - job.start.
But then again, what information does this data actually provide?
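Regarding idea 1, here is a minimal sketch, assuming each task is a dict with hypothetical 'start'/'end' keys and that it is called once per job. Because two tasks are linked exactly when one ends before the other starts, the longest path in this DAG equals the largest set of pairwise non-overlapping tasks, which a greedy pass over tasks sorted by end time finds:

```python
def longest_serial_chain(tasks):
    """Length of the longest path in the DAG with an edge T1 -> T2
    whenever T1.end < T2.start, i.e. the longest chain of tasks that
    ran strictly one after another. Pass in the tasks of one job."""
    chain = 0
    last_end = float("-inf")
    for t in sorted(tasks, key=lambda t: t["end"]):
        if t["start"] > last_end:   # matches the edge condition prev.end < next.start
            chain += 1
            last_end = t["end"]
    return chain
```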
CodePudding user response:
A simple metric: calculate how long it would have taken if every job had run fully in parallel (i.e. if there were an infinite number of processors). This is the "perfect time". Subtract the perfect time from how long it actually took to complete every job, divide by the perfect time, and express the result as a percentage. This is a measure of how much contention occurred.
First, let's look at the perfect job completion time: since all of a job's tasks can run in parallel, this is simply the maximum of the task durations in the job.
If all the jobs start at the same time, then the perfect time for ALL jobs is simply the longest perfect job time.
If the jobs start at different times, then you have to take the maximum of job start time + perfect job time over all jobs.
Perfect job time for job j:
PJT(j) = MAX( td(t) ) over all tasks t in j, where td(t) is the duration of task t
Perfect time:
PT = MAX( starttime(j) + PJT(j) ) over all jobs j
Contention metric:
C = 100 * ( actual completion time of all jobs - PT ) / PT
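As a sketch of how this could be computed from the event log, assuming tasks are dicts with hypothetical 'job', 'start', 'end' keys and that each job's start time (the question's job.start) is available in a separate mapping:

```python
from collections import defaultdict

def contention_metric(tasks, job_start):
    """Contention metric C = 100 * (actual - PT) / PT.

    Assumes `tasks` is a list of dicts with hypothetical keys 'job',
    'start', 'end' (numeric timestamps on one clock) and `job_start`
    maps each job id to the time the job became runnable (the
    question's job.start). Durations are measured from the earliest
    job start."""
    by_job = defaultdict(list)
    for t in tasks:
        by_job[t["job"]].append(t)

    t0 = min(job_start.values())             # reference point for durations

    # PJT(j) = MAX( td(t) ) over the tasks t of job j
    # PT     = MAX( starttime(j) + PJT(j) ) over all jobs j
    pt = max(
        (job_start[j] - t0) + max(t["end"] - t["start"] for t in ts)
        for j, ts in by_job.items()
    )

    # Actual time taken to complete all the jobs
    actual = max(t["end"] for t in tasks) - t0
    return 100.0 * (actual - pt) / pt
```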
Basic example:
3 jobs, 1 task each, each task 10 secs long
2 processors available
Actual time is 20 secs
Perfect time is 10 secs
Contention is 100%

4 jobs, 1 task each, each task 10 secs long
2 processors available
Actual time is 20 secs
Perfect time is 10 secs
Contention is 100%
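For instance, the first example above, expressed as an event log and fed to the contention_metric sketch (same hypothetical field names):

```python
# 3 jobs, 1 task each, 10 secs per task, 2 processors:
# two tasks run at t=0..10, the third has to wait and runs at t=10..20.
log = [
    {"job": "A", "start": 0,  "end": 10},
    {"job": "B", "start": 0,  "end": 10},
    {"job": "C", "start": 10, "end": 20},
]
starts = {"A": 0, "B": 0, "C": 0}        # all three jobs became runnable at t=0
print(contention_metric(log, starts))    # 100.0: actual 20 secs vs perfect 10 secs
```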
Task Delayed Metric:
TDM = 100 * ( number of tasks delayed ) / ( total number of tasks )
Basic example:
3 jobs, 1 task each, each task 10 secs long
2 processors available
1 task is delayed
TDM is 33%

4 jobs, 1 task each, each task 10 secs long
2 processors available
2 tasks are delayed
TDM is 50%
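A sketch of the TDM over the same hypothetical log format. The answer doesn't pin down what "delayed" means; here I assume a task is delayed if it starts later than its job's start time (plus an optional tolerance), since every task of a job could in principle start the moment the job does:

```python
def task_delayed_metric(tasks, job_start, tolerance=0.0):
    """TDM = 100 * (number of tasks delayed) / (total number of tasks).

    Assumption: a task counts as delayed if it starts more than
    `tolerance` after its job's start time, job_start[task's job]."""
    delayed = sum(
        1 for t in tasks if t["start"] - job_start[t["job"]] > tolerance
    )
    return 100.0 * delayed / len(tasks)
```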
Metric comparison
The contention metric gives a measure of the total delay incurred by all the jobs considered as one unit, while the TDM gives a better measure of the delays experienced by individual tasks. Which one to use depends on what you are most interested in: overall delay or individual task delay.
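To make that concrete, running both sketches on the 4-job example (same hypothetical log format) shows that the contention metric cannot distinguish the two examples above, while the TDM can:

```python
# 4 jobs, 1 task each, 10 secs per task, 2 processors:
log4 = [
    {"job": "A", "start": 0,  "end": 10},
    {"job": "B", "start": 0,  "end": 10},
    {"job": "C", "start": 10, "end": 20},
    {"job": "D", "start": 10, "end": 20},
]
starts4 = {j: 0 for j in "ABCD"}
print(contention_metric(log4, starts4))    # 100.0 - same as the 3-job example
print(task_delayed_metric(log4, starts4))  # 50.0  - higher than the 3-job example
```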