I wrote code that takes a file name, filename: str, and prints the number of times each letter occurs in the file, drawn as a row of ' ' characters. Here is my code:
def letterhelper(filename):
    r = list(filename)
    c_r = set(r)
    c_r.remove(' ')
    c_r.remove(',')
    c_r.remove('.')
    c_r.remove('\n')
    f = []
    for x in c_r:
        f.append([-r.count(x), x])
    return f

def charHistogram(data: str):
    r = open(filename)
    q = r.read()
    g = letterhelper(str.lower(q))
    for t in sorted(g):
        print(t[1], (-t[0]) * ' ')
Here data is a separate file; it is opened by charHistogram(), and its contents are passed to letterhelper().
A sample input that data may contain is...
"My Brothers and Sisters give me stress"
So the issue is, when data is
Lorem ipsum dolor sit amet, consectetur adipiscing
elit. Praesent ac sem lorem. Integer elementum
ultrices purus, sit amet malesuada tortor
pharetra ac. Vestibulum sapien nibh, dapibus
nec bibendum sit amet, sodales id justo.
the function correctly prints
e
t
s
i
a
m
r
u
l
n
o
c
d
p
b
g
h
j
v
None
but if data is "Someday Imma be greater than the rest"
the output is
c_r.remove(',')
KeyError: ','
What changes should I make so that my code prints a correct histogram for every input string, the way it does when data is "Lorem ipsum ....."?
CodePudding user response:
The following would solve the problem and make it easier for you to add more characters for removal in the future.
def letterhelper(filename):
    r = list(filename)
    c_r = set(r)
    chars_to_remove = (' ', ',', '.', '\n')
    for char in chars_to_remove:
        if char in c_r:
            c_r.remove(char)
    f = []
    for x in c_r:
        f.append([-r.count(x), x])
    return f
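As a possible alternative to the loop above, the unwanted characters can be dropped in one step with a set difference, which does not raise KeyError when a character is absent. This is only a sketch: the names letterhelper_diff and CHARS_TO_REMOVE are made up for illustration, and the parameter is called data on the assumption that it holds the file's text rather than a file name.

```python
CHARS_TO_REMOVE = {' ', ',', '.', '\n'}  # hypothetical constant, not in the original code


def letterhelper_diff(data):
    # Set difference ignores characters that are not present,
    # so input without ',' or '.' no longer raises KeyError.
    wanted = set(data) - CHARS_TO_REMOVE
    # Same [-count, char] pairs as the original, so sorted() still works.
    return [[-data.count(x), x] for x in wanted]


pairs = letterhelper_diff("someday imma be greater than the rest")
print(sorted(pairs)[0])  # most frequent letter first: [-6, 'e']
```

The return value has the same shape as before, so charHistogram can use it unchanged.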
In case you want suggestions on your code.
def letterhelper(filename):  # I GUESS you wanted the parameter here to be 'data' and not 'filename'
    r = list(filename)  # You don't need to convert the str to a list to use the 'count' method
    c_r = set(r)
    chars_to_remove = (' ', ',', '.', '\n')
    for char in chars_to_remove:
        if char in c_r:
            c_r.remove(char)
    f = []  # You can use a list comprehension
    for x in c_r:
        f.append([-r.count(x), x])  # You don't need to negate the count here; you can reverse the sorted list in 'charHistogram'
    return f

def charHistogram(data: str):  # I GUESS you wanted the parameter here to be 'filename' and not 'data'
    r = open(filename)  # It is always good practice to close the file as soon as you are done with it
    q = r.read()
    g = letterhelper(str.lower(q))
    for t in sorted(g):  # sorted(g, reverse=True)
        print(t[1], (-t[0]) * ' ')  # t[0] * ' '
The following is what I could come up with.
def letterhelper(data: str):
    chars_to_remove = (" ", ",", ".", "\n")
    # f = []
    # for x in set(data):
    #     if x not in chars_to_remove:
    #         f.append((data.count(x), x))
    # The list comprehension in the next line does exactly the same thing as the for loop above
    f = [(data.count(x), x) for x in set(data) if x not in chars_to_remove]
    return f

def charHistogram(filename: str):
    # r = open(filename)
    # q = r.read()
    # r.close()
    # The with statement in the next line does exactly the same thing as the 3 lines above
    with open(filename) as r:
        q = r.read()
    g = letterhelper(str.lower(q))  # q.lower() will also work
    for t in sorted(g, reverse=True):  # this will sort the list in descending order
        print(t[1], t[0] * " ")
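One thing to be aware of: sorted(g, reverse=True) also reverses the order of letters that share a count, whereas the original negated-count version listed ties alphabetically. If that order matters, a possible variation is sketched below using collections.Counter from the standard library. The function name histogram_lines and its symbol parameter are hypothetical; it takes the text instead of a file name so it can be tried without a file, and it returns the lines instead of printing them.

```python
from collections import Counter

CHARS_TO_REMOVE = " ,.\n"  # same characters the code above strips


def histogram_lines(text, symbol=" "):
    # Count every character in one pass, skipping the unwanted ones.
    counts = Counter(ch for ch in text.lower() if ch not in CHARS_TO_REMOVE)
    # Sort by descending count, then alphabetically for ties.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{ch} {symbol * n}" for ch, n in ordered]


for line in histogram_lines("Someday Imma be greater than the rest", symbol="*"):
    print(line)
```

With symbol="*" the bars are visible; the default " " matches the output format in the question.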
CodePudding user response:
Easy:
if ',' in c_r:
    c_r.remove(',')
That should be an efficient operation.
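A related option, in case it helps: Python sets also have a discard() method, which behaves like remove() but does nothing when the element is missing, so the membership check can be dropped. A minimal sketch (the sample string and variable names are only illustrative):

```python
c_r = set("someday imma be greater than the rest")

for char in (' ', ',', '.', '\n'):
    # discard() removes the element if present and is a no-op otherwise,
    # whereas remove() raises KeyError for a missing element.
    c_r.discard(char)

print(sorted(c_r))
```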