Counting occurrences of multiple characters in a string, with python

Time:07-18

I'm trying to create a function that, given a string, returns the count of non-allowed characters ('error_char') in the form 'total count of non-allowed characters / total length of string'.

So far I've tried:

def allowed_characters(s):
    s = s.lower()
    correct_char = 'abcdef'
    error_char = 'ghijklmnopqrstuvwxyz'
    counter = 0
    
    for i in s:
        if i in correct_char:
            no_error = '0' + '/' + str(len(s))
            return no_error
    
        elif i in error_char: 
            counter += 1
            result = str(sum(counter)) + '/' + str(len(s))
            return result

but all I get is '0/56' where I'm expecting '22/56' since m,x,y,z are 'not allowed' and m repeats 19 times

allowed_characters('aaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbmmmmmmmmmmmmmmmmmmmxyz')
'0/56'

Then I've tried:

def allowed_characters(s):
    s = s.lower()
    correct_char = 'abcdef'
    error_char = 'ghijklmnopqrstuvwxyz'
    counter = 0
    
    for i in s:
        if i in correct_char:
            no_error = '0' + '/' + str(len(s))
            return no_error
    
        elif i in error_char: 
            import regex as re
            rgx_pattern = re.compile([error_char])
            count_e = rgx_pattern.findall(error_char, s)
            p_error = sum([count_e.count(i) for i in error_char])
            result = str(p_error) + '/' + str(len(s))

But I get the same result...

I've also tried these other ways, but keep getting the same:

def allowed_characters1(s):
    s = s.lower()
    correct_char = 'abcdef'
    
    for i in s:
        if i not in correct_char:
            counter = sum([s.count(i) for i in s])
            p_error = str(counter) + '/' + str(len(s))
            return p_error
            
        elif i in correct_char:
            no_error = '0' + '/' + str(len(s))
            return no_error

and...

def allowed_characters2(s):
    s = s.lower()
    correct_char = 'abcdef'
    
    for i in s:
        if i not in correct_char:
            counter = sum(s.count(i))
            p_error = str(counter) + '/' + str(len(s))
            return p_error
            
        elif i in correct_char:
            no_error = '0' + '/' + str(len(s))
            return no_error

I've even tried changing the logic and iterating over 'correct_char'/'error_char' instead, but nothing seems to work... I keep getting the same result over and over. It looks as though the loop stops right after the first character, or never runs the 'elif' part?

CodePudding user response:

When you need to count things quickly, collections.Counter is always worth considering.

Note: please don't change your problem description while people are in the middle of answering. That makes it very hard to keep answers in sync.

You can simplify your code like this (there is still room to improve it, though):

from collections import Counter

def allowed_char(s):
    s = s.lower()
    correct_char = 'abcdef'
    error_char = 'ghijklmnopqrstuvwxyz'

    # Count every character once, then add up the counts per group.
    counts = Counter(s)
    correct_count = sum(count for c, count in counts.items() if c in correct_char)
    error_count = sum(count for c, count in counts.items() if c in error_char)

    return correct_count, error_count  # return both counts


s =('aaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbmmmmmmmmmmmmmmmmmmmxyz')

print(allowed_char(s))             # (34, 22)

print(allowed_char('abdmmmmxyz'))  # (3, 7)
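
If you want the 'errors/length' string from the question rather than a tuple, the same Counter idea could be adapted roughly as follows. This is only a sketch: error_ratio is a made-up name, and it assumes that any character outside 'abcdef' (including digits or spaces) counts as an error.

```python
from collections import Counter

def error_ratio(s, allowed='abcdef'):
    """Return 'non_allowed/total' as a string (hypothetical helper)."""
    s = s.lower()
    counts = Counter(s)
    # Assumption: every character not in `allowed` is an error.
    errors = sum(n for ch, n in counts.items() if ch not in allowed)
    return f'{errors}/{len(s)}'

sample = 'a' * 16 + 'b' * 18 + 'm' * 19 + 'xyz'
print(error_ratio(sample))        # 22/56
print(error_ratio('abdmmmmxyz'))  # 7/10
```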

Alternatively, if you really want to use a for loop and learn how to process the string character by character, you could try this:


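def loop_count(s):
    s = s.lower()
    correct_char = 'abcdef'  # defined here so the function is self-contained
    correct_count = error_count = 0

    for c in s:
        if c in correct_char:
            correct_count += 1
        else:
            error_count += 1

    # Return only after the whole string has been processed.
    return correct_count, error_count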
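
On the question of why every attempt returns '0/56': each one has a return in both branches inside the for loop, so the function exits on the very first character ('a', which is allowed). Here is a minimal sketch of the difference. The function names are made up for illustration, and anything outside 'abcdef' is assumed to count as an error.

```python
def count_errors_early_return(s, allowed='abcdef'):
    # Mirrors the structure of the attempts in the question: both branches
    # return inside the loop, so only the first character is examined.
    for ch in s.lower():
        if ch in allowed:
            return 0
        else:
            return 1
    return 0  # reached only for an empty string

def count_errors_full_scan(s, allowed='abcdef'):
    errors = 0
    for ch in s.lower():
        if ch not in allowed:
            errors += 1
    return errors  # return after the loop has finished

sample = 'aaabbbmmmxyz'
print(count_errors_early_return(sample))  # 0
print(count_errors_full_scan(sample))     # 6
```

Moving the return out of the loop body is the only structural change; the counting logic itself stays the same.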

CodePudding user response:

I would use a regex replacement trick here, combined with len():

import re

def allowed_characters(s):
    s = s.lower()
    # Strip every non-allowed character, then compare lengths.
    return len(s) - len(re.sub(r'[ghijklmnopqrstuvwxyz]+', '', s))

The above returns the length of the input string minus the length of the input with all non-allowed characters removed. Equivalently, it is the length of the string that contains only the non-allowed characters.
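
As a quick check of the length-difference idea, here is a small sketch that shows the intermediate string. It assumes the non-allowed letters are exactly g through z, and count_by_length_difference is a hypothetical name.

```python
import re

def count_by_length_difference(s):
    lowered = s.lower()
    # Remove the non-allowed letters (assumed here to be g-z).
    kept = re.sub(r'[g-z]', '', lowered)
    return len(lowered) - len(kept), kept

count, kept = count_by_length_difference('abdmmmmxyz')
print(kept)   # abd
print(count)  # 7

# Counting the matches directly gives the same number.
print(len(re.findall(r'[g-z]', 'abdmmmmxyz')))  # 7
```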
