Home > Net >  How to calculate the similarity of two movies ranking list effciently?
How to calculate the similarity of two movies ranking list effciently?

Time:03-06

Problem description:

You and Peter are talking about n movies, which are represented by integers [1,n]. You have made a ranking list for the movies according to your preference. Now, Peter tells you his ranking list. You want to know how similar your and Peter's tastes are. For 2 movies i, j, if you and Peter both rank movie i before movie j, You will get11 similarity. Please output the total similarity.

I know that I can solve this problem in brute force way. Its Java code is like this:

int n = in.nextInt();
int[] rankingListOfMe = new int[n];
int[] rankingListOfPeter = new int[n];
int[] peterMovie2RankingIndex = new int[n   1];
for (int i = 0; i < n;   i) rankingListOfMe[i] = in.nextInt();
for (int i = 0; i < n;   i) {
    rankingListOfPeter[i] = in.nextInt();
    peterMovie2RankingIndex[rankingListOfPeter[i]] = i;
}
long similarity = 0L;
for (int i = 1; i < n;   i) {
    if (rankingListOfMe[i] == rankingListOfPeter[0]) continue;
    int curJMovieIndex = peterMovie2RankingIndex[rankingListOfMe[i]];
    for (int j = 0; j < i;   j) {
        if (peterMovie2RankingIndex[rankingListOfMe[j]] < curJMovieIndex) similarity  ;
    }
}

Array rankingListOfXX is the array whose index is the movie ranking and it stores movie id. Array peterMovie2RankingIndex is the array, with length n 1, whose index is the movie id and it stores corresponding movie ranking, to get the movie's ranking by the movie id easily. every time I traverse an movie id, I just count how many movies satisfy the request. Although this way can solve the problem, I wonder if there is any other way to solve if more efficiently. The time complexity of algorithm above is O(n^2), which is too much for me. I've been thinking for a while but don't know where to optimize. I think it has something with sort algorithm but I don't know how to use sort algorithm to solve this problem.

CodePudding user response:

public void similarity(int[] me, int[] peter){
    int[] peterTemp = new int[peter.length];
    Map<Integer, Integer> map = new HashMap<>();
    for(int i = 0; i < me.length; i  ){
        map.put(me[i], i);
    }
    for(int i = 0; i < peter.length; i  ){ 
        peterTemp[peterTemp.length - (i   1)] = map.get(peter[i]);
    }

    // as David Eisenstat pointed out we are going to count inversion in array, invCount method copied from here
    // https://stackoverflow.com/questions/337664/counting-inversions-in-an-array
    System.out.println(invCount(peterTemp));
}

long merge(int[] arr, int[] left, int[] right) {
    int i = 0, j = 0, count = 0;
    while (i < left.length || j < right.length) {
        if (i == left.length) {
            arr[i j] = right[j];
            j  ;
        } else if (j == right.length) {
            arr[i j] = left[i];
            i  ;
        } else if (left[i] <= right[j]) {
            arr[i j] = left[i];
            i  ;
        } else {
            arr[i j] = right[j];
            count  = left.length-i;
            j  ;
        }
    }
    return count;
}

long invCount(int[] arr) {
    if (arr.length < 2)
        return 0;

    int m = (arr.length   1) / 2;
    int left[] = Arrays.copyOfRange(arr, 0, m);
    int right[] = Arrays.copyOfRange(arr, m, arr.length);

    return invCount(left)   invCount(right)   merge(arr, left, right);
}

test:

for {1, 5, 6, 7} and  {6, 7, 1, 5} result is 2
for {4, 7, 8, 3, 1, 2} and {3, 8, 7, 1, 2, 4} result is 7

Explanation:‌ If we build peterTemp like the following:

peterTemp[i] = map.get(peter[i]);

for every i, j such that i > j and peterTemp[i] >‌ peterTemp[j] we found a similarity. I built peterTemp like this peterTemp[peterTemp.length - (i 1)] = map.get(peter[i]); to use the referenced code only.

  • Related