I have input data that contains strings, and I want to convert each element in the list to a number.
To do this, I first need to remove the word 'stars', which I did in the code below.
I am stuck on the next step, where I need to remove the letter 'k' from the elements that have it. Then, for each element that originally had a 'k', I need to multiply its value by 1000.
#Input data
starlist = [ '1.1k stars', '1k stars', '1k stars', '978 stars']
#Remove the word stars
starlist = [starlist .replace('stars','') for starlist in starlist ]
Expected output
starlist = ['1100', '1000', '1000', '978']
I tried writing a for loop that iterates through the list and checks whether each element contains a 'k'. If it does, the loop removes the 'k', converts the rest to a float, and multiplies it by 1000. The problem with this approach is that it does not overwrite the value in the original list, and I'm not sure how to write the result back.
#Remove letter 'k' and multiply by 1000 if the element had letter 'k'
for rating in starlist:
    if starlist.find('k') != -1:
        rating = rating.replace('k','')
        rating = float(rating) * 1000
CodePudding user response:
#Input data
starlist = ['1.1k stars', '1k stars', '1k stars', '978 stars']
#Remove the word stars
starlist = [s.replace('stars', '') for s in starlist]
newlist = []
#Remove letter 'k' and multiply by 1000 if the element had letter 'k'
for rating in starlist:
    if rating.find('k') != -1:
        rating = rating.replace('k', '')
        rating = float(rating) * 1000
        newlist.append(rating)
    else:
        rating = float(rating)
        newlist.append(rating)
starlist = newlist
print(starlist)
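As a brief sketch of why the loop in the question leaves the list unchanged (the short `ratings` list here is made up for illustration): assigning to the loop variable only rebinds a local name, while assigning through an index writes into the list.

```python
ratings = ['1.1k ', '1k ', '978 ']  # illustrative sample, not the full input

# Rebinding the loop variable changes only the name `rating`, not the list.
for rating in ratings:
    rating = rating.replace('k', '')
unchanged = list(ratings)  # snapshot: still contains the 'k' suffixes

# Assigning through an index stores the new value back into the list.
for i, rating in enumerate(ratings):
    ratings[i] = rating.replace('k', '')
```

This is why the code above collects the converted values in a separate list instead of assigning to `rating`.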
CodePudding user response:
You were on the right track. Two small things need fixing:
- use rating (not starlist) to find "k".
- consider the case where there is no "k".
#Remove letter 'k' and multiply by 1000 if the element had letter 'k'
mlist = []
for rating in starlist:
    if rating.find('k') != -1:
        rating = rating.replace('k', '')
        rating = float(rating) * 1000
    else:
        rating = float(rating)
    mlist.append(rating)
starlist = mlist
print(starlist)
CodePudding user response:
For simplicity, we will define a function to convert a rating:
def convert_rating(rating):
    # Here we found a bug in your code! It should be rating, not starlist
    if rating.find('k') != -1:
        rating = rating.replace('k', '')
        return float(rating) * 1000
    return int(rating)
You have a few options here.
- Create a new list:
starlist = [convert_rating(r) for r in starlist]
- Use the index to replace the values in the original starlist:
for index, rating in enumerate(starlist):
    starlist[index] = convert_rating(rating)
One more thing! Floats can be inaccurate, so you should use Decimal, or update the convert_rating function to remove the decimal point from the string and multiply by the appropriate amount.
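A minimal sketch of the Decimal suggestion, assuming the word 'stars' has already been removed and that 'k' only ever appears as a trailing suffix. The name convert_rating_exact is hypothetical and not part of the code above.

```python
from decimal import Decimal

def convert_rating_exact(rating):
    # Hypothetical variant of convert_rating; assumes 'stars' is already gone.
    rating = rating.strip()
    if rating.endswith('k'):
        # Decimal parses '1.1' exactly, so the product is exactly 1100.0.
        return int(Decimal(rating[:-1]) * 1000)
    return int(rating)

converted = [convert_rating_exact(r) for r in ['1.1k ', '1k ', '1k ', '978 ']]
print(converted)  # [1100, 1000, 1000, 978]
```

Returning int in both branches also keeps the list to a single type, which may or may not be what you want.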
CodePudding user response:
# Input data
starlist = [ '1.1k stars', '1k stars', '1k stars', '978 stars']
# Remove the word stars
starlist = [starlist .replace('stars','') for starlist in starlist ]
The output for your piece of code is:
['1.1k ', '1k ', '1k ', '978 ']
To remove the k, you can loop with enumerate, which lets you update the list inside the loop.
for i, s in enumerate(starlist):
    if 'k' in s:
        starlist[i] = str(float(s.replace('k', '')) * 1000)
After the multiplication by 1000, the value is converted back to str. This gives a result close to the desired output:
['1100.0', '1000.0', '1000.0', '978 ']
You can drop the str conversion if you want a list of floats.
CodePudding user response:
Here are two methods.
- Create a new list, then assign it to the original name.
starlist = [float(rating.replace('k', '')) * 1000 if 'k' in rating else float(rating) for rating in starlist]
- Modify the original list in place by index.
for i, rating in enumerate(starlist):
    starlist[i] = float(rating.replace('k', '')) * 1000 if 'k' in rating else float(rating)