How to explore features of strings PYTHON-CodePudding

I have a table with relative abundances of strings collected elsewhere. I also have a list of features that are associated with the strings. What is the best way to explore each string for each feature and sum the relative abundancies.

Example Input Table:

 ──────────── ──────────── 
| String     | Abundance  |
 ──────────── ──────────── 
| abcdef     | 12         |
| cdefgh     | 15         |
| fghijk     | 36         |
| jklmnoabc  | 37         |
 ──────────── ────────────

Example String Features:

cdef, abc, jk

Example Output

 ────────── ──────────────── 
| Feature  | Abundance (%)  |
 ────────── ──────────────── 
| cdef     | 27             |
| abc      | 59             |
| jk       | 73             |
 ────────── ────────────────

Any help would be greatly appreciated!

CodePudding user response：

The answer is to go through the list of string feature for each string you have and use the inoperator of Python. This will check if your feature has an occurence in the string you apply it to. You then want to accumulate abundance and associate it to your feature.

CodePudding user response：

You can use this code:

a = [['abcdef', 12], ['cdefgh', 15], ['fghijk', 36], ['jklmnoabc', 37]]
s = ['cdef', 'abc', 'jk']

def func(a, s):
    ans = []
    for i in s:
        ans.append(0)
        for j in a:
            if i in j[0]:  # string j[0] contains string i
                ans[len(ans) - 1]  = j[1]
    return ans
print(func(a, s))

CodePudding user response：

Did you try to do a loop with a regex? Because, you must go through your two lists (inputs and features). I don't think there is a particular algorithm to highly accelerate the process.

Here is what I'm thinking about

import re

for feature in features:
    p = re.compile(f"{feature.string}")
    feature.abundance = 0
    for input in inputs:
        m = p.match(input.string)
        if m: # if not None
           feature.abundance  = input.abundance

With that, you will have all your stuff in your features list.