Count the number of letter occurrences in a string

Time:12-01

I would like to write Python code that sums up the number of atoms in a molecular formula (a string). The string consists of letters, each optionally followed by a number (when there is no number, the atom is counted once).

Input: C3H7NO2C3H7NO2S
Output: C6H14N2O4S

The only letters that occur are O, C, N, H and S.

CodePudding user response:

A C# "oneliner" that you certainly cannot send in as a homework solution:

using System.Linq;
using System.Text.RegularExpressions;

var s = "C3H7NO2C3H7NO2S";
var s2 = string.Join(" ",Regex.Split(s, @"([A-Z]\d*)")
        .Where(x => !string.IsNullOrEmpty(x))
        .Select(x => Regex.Match(x, @"([A-Z])(\d*)"))
        .Select(m => new {elt = m.Groups[1].Value, cnt = m.Groups.Count > 2 ? m.Groups[2].Value : "1"})
        .Select(e => new {e.elt, cnt = string.IsNullOrEmpty(e.cnt) ? 1 : int.Parse(e.cnt)})
        .GroupBy(e => e.elt)
        .Select(g => $"{g.Key}{g.Sum(x=>x.cnt)}")
        );

// s2 contains "C6 H14 N2 O4 S1"


  • first get the element-count pairs (where count is optional)
  • then split those into element and count (missing count = 1)
  • then group by element and sum the counts
  • then join up to a single string
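
For comparison, the same four steps can be sketched in Python, the language the question asks for. This is only a sketch: it assumes single-letter element symbols, as in the question, and the function name `sum_formula` is made up for illustration.

```python
import re
from collections import Counter


def sum_formula(formula):
    """Hypothetical helper: total the atoms in a formula such as 'C3H7NO2'."""
    totals = Counter()
    # Step 1: get the element-count pairs (the count is optional)
    for element, count in re.findall(r"([A-Z])(\d*)", formula):
        # Steps 2 and 3: a missing count means 1; sum the counts per element
        totals[element] += int(count) if count else 1
    # Step 4: join up to a single string, in order of first appearance
    return " ".join(f"{element}{n}" for element, n in totals.items())


print(sum_formula("C3H7NO2C3H7NO2S"))  # C6 H14 N2 O4 S1
```

Like the C# version, this prints a count of 1 explicitly; dropping it for single atoms would be needed to match the output format in the question exactly.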

CodePudding user response:

import re

def convert(str_atoms):
    # Split the string into (atom, number) pairs; the number may be empty.
    # Assumes single-letter atoms, as in the question (O, C, N, H and S).
    temp = re.findall(r'([A-Z])(\d*)', str_atoms)

    # dictionary mapping each atom to its total count
    dicta = {}

    # list to dictionary
    for key, num in temp:
        val = int(num) if num else 1  # a missing number counts as one
        if key in dicta:  # add value to existing key
            dicta[key] += val
        else:  # if key does not exist, create it
            dicta[key] = val

    # dictionary to string; a count of 1 is omitted, as in the expected output
    final_str = ''
    for x in dicta:
        final_str += x + (str(dicta[x]) if dicta[x] > 1 else '')
    return final_str

print(convert('C3H7NO2C3H7NO2S'))  # C6H14N2O4S