Home > Blockchain >  How to find specific word in objects past 6 months and categories them week by week?
How to find specific word in objects past 6 months and categories them week by week?

Time:03-25

I'm new to Kotlin and trying to figure out how I can do the best way. I have an api call that I call and I convert the response to a list of objects:

data class JobAd(
   val published: LocalDate?,
   val title: String?,
   val jobtitle: String?,
   val description: String?
)

On the api call, I search for all job ads that are from today and back in time of 6 months. For example I get all objects which is from LocalDate.now() and 6 months back LocalDate).now().minusMonths(6). I want to iterate through all the objects and see if 2 random words (java and kotlin) are contained in the object. I want to check either title, jobtitle or description contain the word java or kotlin. I only need one hit of the word java or kotlin in these properties, if title contain java or kotlin, add it to list and check next object. If not title contain the words and either jobtitle, but description does it, add it to the list and check next object. and add it to a list based on which week it is. I want the output to be like this:

(2022) Week 12 -> Java: 0, Kotlin: 1
(2022) Week 11 -> Java: 0, Kotlin: 0 (If some weeks does not have hit, i want to show to too)
...
(2021) Week 52 -> Java: 1, Kotlin: 2

This is my code so far:

private fun findAdsBasedOnKeyWords(jobAds: MutableList<JobAd>, keywords: List<String>, from: LocalDate, to: LocalDate): MutableMap<Any, MutableMap<String, Any>> {
    val resultMap = mutableMapOf<Any, MutableMap<String, Any>>()
    val counter = mutableMapOf<String, Any>() //Meta data

    for (jobAd: JobAd in jobAds) {
        for (keyword: String in keywords) {
            val weekNumber = DateParser.getWeekNumber(jobAd.published!!)

            // Initialize placeholder data, to fill even empty weeks
            resultMap.putIfAbsent(weekNumber, emptyMapOfKeywords(keywords, jobAd.published))

            // Validate keyword exist in job ad
            val contains = jobAd.toString().lowercase()
                .contains(keyword.lowercase()) //Can be an issue if the toString gets overridden
            if (contains) {
                counter.putIfAbsent(keyword, 0)
                counter.compute(keyword) { _, v -> v.toString().toInt()   1 }
                resultMap[weekNumber]!!.compute(keyword) { _, v -> v.toString().toInt()   1 }
            }
        }
    }
    resultMap["total"] = counter
    resultMap["period"] = mutableMapOf("from" to from, "to" to to)
    logger.info("[{}] matches found", counter)
    return resultMap
}

//Helper method to generate placeholder data
private fun emptyMapOfKeywords(keywords: List<String>, published: LocalDate): MutableMap<String, Any> {
    val keywordMap = mutableMapOf<String, Any>()
    for (keyword in keywords) {
        keywordMap.putIfAbsent(keyword, 0)
    }

    keywordMap.putIfAbsent("from", DateParser.startOfWeekDate(published))//Monday of the week
    keywordMap.putIfAbsent("to", DateParser.endOfWeekDate(published))//Sunday of the week
    return keywordMap
}

Is there any way to do it better or optimize it and please add comment for why.

CodePudding user response:

It's a pretty extreme anti-pattern to use Maps to hold various types of data that you need to inspect. That's trying to force a strongly typed language to behave like a weakly typed language, losing all the protection you get from using types.

Maps are appropriate when the keys are something you don't know at compile time and you know you'll need to look up items by their keys at runtime.

So instead of a MutableMap<Any, MutableMap<String, Any>> return value, you should create classes for holding results. From what I can tell, you want to return a series of line items for every week in the input range, so you can create a class like this to represent a line item, and then return a simple list of them from your function. You are currently also returning the range, but I don't see what you're using it for so I left it out.

You're working with a week of a year a lot, so I think it will also be helpful to have a class to represent that, along with a couple of functions to help convert from LocalDate.

data class LocalWeek(val year: Int, val week: Int)

fun LocalDate.toLocalWeek() = LocalWeek(year, get(IsoFields.WEEK_OF_WEEK_BASED_YEAR))

/** Gets every week represented in a range of dates. */
fun ClosedRange<LocalDate>.toLocalWeeks() = sequence {
    var date = start
    val lastExclusive = endInclusive   Period.ofWeeks(1)
    while (date < lastExclusive ) {
        yield(date.toLocalWeek())
        date  = Period.ofWeeks(1)
    }
}

data class JobAdsSearchLineItem(
    val localWeek: LocalWeek,
    val keywordHitCountsByKeyword: Map<String, Int>
) {
    fun toReadableString() =
        "(${localWeek.year}) Week ${localWeek.week} -> "  
                keywordHitCountsByKeyword.entries
                    .joinToString { (word, count) -> "$word: $count" }
}

Using toString() is fragile, like you mentioned in your code comments. I would create a helper function like this to evaluate whether a term is found:

fun JobAd.containsIgnoreCase(str: String): Boolean {
    val value = str.lowercase()
    return title.orEmpty().lowercase().contains(value)
            || jobtitle.orEmpty().lowercase().contains(value)
            || description.orEmpty().lowercase().contains(value)
}

Since you're using !! on your published date, I'm assuming these values don't need to be nullable. It would be much easier to work with if you make the property non-nullable:

data class JobAd(
   val published: LocalDate,
   val title: String?,
   val jobtitle: String?,
   val description: String?
)

Then your search function can be written like this:

private fun findAdsBasedOnKeyWords(
    jobAds: List<JobAd>,
    keywords: List<String>,
    from: LocalDate,
    to: LocalDate
): List<JobAdsSearchLineItem> {

    // Initialize empty results holders representing every week in the range
    // Use an outer map here because we need to keep retrieving the inner maps by
    // the week when iterating the input below.
    val results = mutableMapOf<LocalWeek, MutableMap<String, Int>>()
    for (localWeek in (from..to).toLocalWeeks()) {
        results[localWeek] = mutableMapOf<String, Int>().apply {
            for (keyword in keywords) {
                put(keyword, 0)
            }
        }
    }

    for (jobAd in jobAds) {
        val weekResults = results[jobAd.published.toLocalWeek()] ?: continue
        for (keyword in keywords) {
            if (jobAd.containsIgnoreCase(keyword)) {
                weekResults[keyword] = weekResults.getOrDefault(keyword, 0)   1
            }
        }
    }

    return results.entries.map { JobAdsSearchLineItem(it.key, it.value) }
}

And to use it you can call this function and use the toReadableString() function to help generate your output from the list of results.

  • Related