I would like to write Python code that sums the number of atoms of each element in a molecular formula given as a string. Each element is a letter optionally followed by a number; when there is no number, the count is one.
Input: C3H7NO2C3H7NO2S Output: C6H14N2O4S
The only letters that can occur are O, C, N, H and S.
CodePudding user response:
A C# "oneliner" that you certainly cannot send in as a homework solution:
using System.Linq;
using System.Text.RegularExpressions;

var s = "C3H7NO2C3H7NO2S";
var s2 = string.Join(" ", Regex.Split(s, @"([A-Z]\d*)")
    .Where(x => !string.IsNullOrEmpty(x))
    .Select(x => Regex.Match(x, @"([A-Z])(\d*)"))
    .Select(m => new {elt = m.Groups[1].Value, cnt = m.Groups.Count > 2 ? m.Groups[2].Value : "1"})
    .Select(e => new {e.elt, cnt = string.IsNullOrEmpty(e.cnt) ? 1 : int.Parse(e.cnt)})
    .GroupBy(e => e.elt)
    .Select(g => $"{g.Key}{g.Sum(x => x.cnt)}")
);
// s2 contains "C6 H14 N2 O4 S1"
- first get the element-count pairs (where count is optional)
- then split those into element and count (missing count = 1)
- then group by element and sum the counts
- then join up to a single string
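Since the question asks for Python, the same four steps can be sketched roughly as follows. This is only a sketch, not a translation endorsed by the answer above: the function name `sum_formula` is made up for illustration, it assumes every element is a single uppercase letter (as with the question's O, C, N, H and S), and it leaves out a count of 1 to match the expected output in the question.

```python
import re
from collections import Counter


def sum_formula(formula):
    """Hypothetical helper: merge repeated elements in a formula string."""
    counts = Counter()
    # Steps 1 and 2: find element-count pairs; an empty count means 1.
    for element, count in re.findall(r"([A-Z])(\d*)", formula):
        # Step 3: group by element and sum the counts.
        counts[element] += int(count) if count else 1
    # Step 4: join into one string, leaving out a count of 1.
    return "".join(
        element + (str(total) if total > 1 else "")
        for element, total in counts.items()
    )


print(sum_formula("C3H7NO2C3H7NO2S"))  # C6H14N2O4S
```

`Counter` keeps elements in the order they first appear, so the output order follows the input rather than any chemical convention.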
CodePudding user response:
import re

def convert(str_atoms):
    # split the string into (atom, count) pairs; the count may be empty
    temp = re.findall(r'([A-Z])(\d*)', str_atoms)
    # dictionary
    dicta = {}
    # list to dictionary
    for key, num in temp:
        val = int(num) if num else 1  # a missing number counts as one
        if key in dicta:  # add value to existing key
            dicta[key] += val
        else:  # if key does not exist create this
            dicta[key] = val
    print(dicta)
    # dictionary to string
    final_str = ''
    for x in dicta:
        # append the atom, and its count unless the count is 1
        final_str += x + (str(dicta[x]) if dicta[x] > 1 else '')
    print(str_atoms)
    print(final_str)
    return final_str