Home > database >  Really slow Dijkstra algorithm, what am I doing wrong?
Really slow Dijkstra algorithm, what am I doing wrong?

Time:12-13

I was tasked to perform the Dijkstra Algorithm on big graphs (25 million nodes). These are represented as a 2D array: -each node as a double[] with latitude, longitude and offset (offset meaning index of the first outgoing edge of that node) -each edge as a int[] with sourceNodeId,targetNodeId and weight of that edge Below is the code, I used int[] as a tupel for the comparison in the priority queue.

The algorithm is working and gets the right results HOWEVER it is required to be finished in 15s but takes like 8min on my laptop. Is my algorithm fundamentally slow? Am I using the wrong data structures? Am I missing something? I tried my best optimizing as far as I saw fit. Any help or any ideas would be greatly appreciated <3

public static int[] oneToAllArray(double[][]nodeList, int[][]edgeList,int sourceNodeId) {
        int[] distance = new int[nodeList[0].length]; //the array that will be returned
        //the priorityQueue will use arrays with the length 2, representing [index, weight] for each node and order them by their weight
        PriorityQueue<int[]> prioQueue = new PriorityQueue<>((a, b) -> ((int[])a)[1] - ((int[])b)[1]);
        int offset1; //used for determining the amount of outgoing edges
        int offset2;
        int newWeight; //declared here so we dont need to declare it a lot of times later (not sure if that makes a difference)
        //currentSourceNode here means the node that will be looked at for OUTGOING edges
        int[] currentSourceNode= {sourceNodeId,0};
        prioQueue.add(currentSourceNode); 
        //at the start we only add the sourceNode, then we start the actual algorithm
        
        while(!prioQueue.isEmpty()) {
            if(prioQueue.size() % 55 == 2) {
                System.out.println(prioQueue.size());
            }
            
            currentSourceNode=prioQueue.poll();
            int sourceIndex = currentSourceNode[0];
            
            if(sourceIndex == nodeList[0].length-1) {
                offset1= (int) nodeList[2][sourceIndex];
                offset2= edgeList[0].length;
            } else {
                offset1= (int) nodeList[2][sourceIndex];
                offset2= (int) nodeList[2][sourceIndex 1];
            }
            //checking every outgoing edge for the currentNode
            for(int i=offset1;i<offset2;i  ) {
                int targetIndex = edgeList[1][i];
                
                //if the node hasnt been looked at yet, the weight is just the weight of this edge   distance to sourceNode
                if(distance[targetIndex]==0&&targetIndex!=sourceNodeId) {
                    distance[targetIndex] = distance[sourceIndex]   edgeList[2][i];
                    int[]targetArray = {targetIndex, distance[targetIndex]};
                    prioQueue.add(targetArray);
                } else if(prioQueue.stream().anyMatch(e -> e[0]==targetIndex)) {
                    //above else if checks if this index is already in the prioQueue
                    newWeight=distance[sourceIndex] edgeList[2][i];
                    //if new weight is better, we have to update the distance   the prio queue
                    if(newWeight<distance[targetIndex]) {
                        distance[targetIndex]=newWeight;
                        int[] targetArray;
                        targetArray=prioQueue.stream().filter(e->e[0]==targetIndex).toList().get(0);
                        prioQueue.remove(targetArray);
                        targetArray[1]=newWeight;
                        prioQueue.add(targetArray);
                    }
                    
                }
                
            }
            
            
            
            
        }
        
        
        
        
        return distance;
    }

CodePudding user response:

For each node that you process, you are doing a linear scan of the priority queue to see if something is already queued, and a second scan to find all the things that are queued if you have to update the distance. Instead, keep a separate multi-set of things that are in the queue.

CodePudding user response:

This is not a proper Dijkstra's implementation.

One of the key elements of Dijkstra is that you mark nodes as "visited" when they have been evaluated and prevent looking at them again because you can't do any better. You are not doing that, so your algorithm is doing many many more computations than necessary. The only place where a priority queue or sort is required is to pick the next node to visit, from amongst the unvisited. You should re-read the algorithm, implement the "visitation tracking" and re-formulate.

  • Related