Home > Back-end >  What is the most efficient way to join one list to another in kotlin?
What is the most efficient way to join one list to another in kotlin?

Time:12-22

I start with a list of integers from 1 to 1000 in listOfRandoms.

I would like to left join on random from the createDatabase list.

I am currently using a find{} statement within a loop to do this but feel like this is too heavy. Is there not a better (quicker) way to achieve same result?

Psuedo Code

data class DatabaseRow(
  val refKey: Int,
  val random: Int  
)

fun main() {
    val createDatabase = (1..1000).map { i -> DatabaseRow(i, Random()) }
    
    val listOfRandoms = (1..1000).map { j ->
        val lookup = createDatabase.find { it.refKey == j }
        lookup.random
    }
}

CodePudding user response:

As mentioned in comments, the question seems to be mixing up database and programming ideas, which isn't helping.

And it's not entirely clear which parts of the code are needed, and which can be replaced. I'm assuming that you already have the createDatabase list, but that listOfRandoms is open to improvement.

The ‘pseudo’ code compiles fine except that:

  • You don't give an import for Random(), but none of the likely ones return an Int. I'm going to assume that should be kotlin.random.Random.nextInt().
  • And because lookup is nullable, you can't simply call lookup.random; a quick fix is lookup!!.random, but it would be safer to handle the null case properly with e.g. lookup?.random ?: -1. (That's irrelevant, though, given the assumption above.)

I think the general solution is to create a Map. This can be done very easily from createDatabase, by calling associate():

val map = createDatabase.associate{ it.refKey to it.random }

That should take time roughly proportional to the size of the list. Looking up values in the map is then very efficient (approx. constant time):

map[someKey]

In this case, that takes rather more memory than needed, because both keys and values are integers and will be boxed (stored as separate objects on the heap). Also, most maps use a hash table, which takes some memory.

Since the key is (according to comments) “an ascending list starting from a random number, like 18123..19123”, in this particular case it can instead be stored in an IntArray without any boxing. As you say, array indexes start from 0, so using the key directly would need a huge array and use only the last few cells — but if you know the start key, you could simply subtract that from the array index each time.

Creating such an array would be a bit more complex, for example:

val minKey = createDatabase.minOf{ it.refKey }
val maxKey = createDatabase.maxOf{ it.refKey }
val array = IntArray(maxKey - minKey   1)
for (row in createDatabase)
    array[row.refKey - minKey] = row.random

You'd then access values with:

array[someKey - minKey]

…which is also constant-time.

Some caveats with this approach:

  • If createDatabase is empty, then minOf() will throw a NoSuchElementException.
  • If it has ‘holes’, omitting some keys inside that range, then the array will hold its default value of 0 — you can change that by using the alternative IntArray constructor which also takes a lambda giving the initial value.)
  • Trying to look up a value outside that range will give an ArrayIndexOutOfBoundsException.

Whether it's worth the extra complexity to save a bit of memory will depend on things like the size of the ‘database’, and how long it's in memory for; I wouldn't add that complexity unless you have good reason to think memory usage will be an issue.

  • Related