In Python, how do you convert a list of strings within a dictionary to a list of integers?

Time:06-13

I have a function (main) that reads a CSV file into a dictionary. Each key is the entry in a row's first column, and its value is a list of the remaining entries in that row. For example, the row 2020-12-20,0,0,0,0,206 gives the key 2020-12-20 and the value ['0', '0', '0', '0', '206']:

def main():
    import csv
    # doses_data_mar_20.csv
    dict_doses_by_date = {}

    filename_input = str(input("Please enter a .csv file to read: "))
    with open(filename_input, "r") as inp, open('doses.csv', 'w') as out:
        header = inp.readline()
        reader = csv.reader(inp, delimiter=",", quotechar='"')
        for line in reader:
            dict_doses_by_date[line[0]] = line[1:6]
    return dict_doses_by_date

def count_doses_by_date(dict_dose_by_date):

Now I need to define a new function, count_doses_by_date, that takes this dictionary, converts each list of strings into a list of integers, and sums the integers to get a total for each date. It should then write the results to another CSV file.

I tried doing this:

def count_doses_by_date(dict_dose_by_date):
    import csv
    # doses_data_mar_20.csv
    dict_doses_by_date = {}
    filename_input = str(input("Please enter a .csv file to read: "))
    with open(filename_input, "r") as inp, open('doses.csv', 'w') as out:
        header = inp.readline()
        reader = csv.reader(inp, delimiter=",", quotechar='"')
        for line in reader:
            dict_doses_by_date[line[0]] = line[1:6]
        for k in dict_doses_by_date:
            list_integers = [int(x) for x in dict_doses_by_date[k]]
            sum_integers = sum(list_integers)
            print_value = "{}, {} \n".format(k, sum_integers)
    return out.write(print_value)

But I'm getting errors because some of the lists contain strings like '1,800', whose thousands comma prevents the conversion to an integer. I don't know how to get rid of these thousands commas without disrupting the commas that separate the CSV values.

I'm stuck. How would this be done?

CodePudding user response:

You should use the pandas library. pd.read_csv gives you a DataFrame directly from the file, and you can set the first column as the index column. Then df.applymap(lambda x: int(x.replace(',', ''))) strips the commas and converts each value to int (this assumes the cells were read as strings), and df.sum(axis=1) gives a row-by-row sum.
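
A minimal sketch of the pandas route described above, assuming pandas is installed. The in-memory sample and its column names (a to e) are made up for illustration and stand in for the real file. Instead of applymap, it cleans each column with the .str accessor, which does the same comma stripping one column at a time.

```python
import io
import pandas as pd

# Hypothetical sample standing in for the CSV file; "1,800" is quoted,
# so its thousands comma is not treated as a field separator.
sample = io.StringIO(
    'date,a,b,c,d,e\n'
    '2020-12-20,0,0,0,0,206\n'
    '2020-12-21,0,0,0,"1,800",316\n'
)

# dtype=str keeps every cell as text, so the string cleanup below always applies.
df = pd.read_csv(sample, index_col=0, dtype=str)

# Strip thousands commas column by column, then convert to int.
df = df.apply(lambda col: col.str.replace(",", "", regex=False).astype(int))

# Row-by-row sum, indexed by date.
totals = df.sum(axis=1)
print(totals)
```

pd.read_csv also accepts a thousands="," argument that parses such values as numbers directly, which may be simpler if every column is numeric.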

CodePudding user response:

Would you try this? Use str.isdigit() to check whether each entry is a number:

line = ['2020-12-20', '0', '0', '0', '0', '206']
filtered_line = [int(e) if e.isdigit() else '' for e in line[1:6]]
print([x for x in filtered_line if x != ''])

Output

[0, 0, 0, 0, 206]

In your use case, the code could be this:

dict_doses_by_date = {}
reader = [['2020-12-20', '0', '0', '0', '10', '206'], ['2020-12-21', '0', '0', '0', '20', '316'], ['2020-12-22', '0', '0', '0', '30', '426']]

for line in reader:
    dict_doses_by_date[line[0]] = line[1:6]
    sum_integers = sum([int(x) for x in line[1:6]])
    print("{}, {}".format(line[0], sum_integers))

print(dict_doses_by_date)

Output

2020-12-20, 216
2020-12-21, 336
2020-12-22, 456
{'2020-12-20': ['0', '0', '0', '10', '206'], '2020-12-21': ['0', '0', '0', '20', '316'], '2020-12-22': ['0', '0', '0', '30', '426']}
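
One caveat, sketched here under the assumption that the values look like '1,800' as in the question: str.isdigit() returns False for a string containing a comma, so a plain isdigit() filter would silently drop that value. Stripping the comma first avoids this. The helper name to_int is made up for this example.

```python
def to_int(text):
    """Convert a numeric string such as '1,800' to an int (hypothetical helper)."""
    cleaned = text.strip().replace(",", "")
    if not cleaned.isdigit():
        # Fail loudly rather than silently skipping a malformed value.
        raise ValueError(f"not a whole number: {text!r}")
    return int(cleaned)

line = ['2020-12-21', '0', '0', '0', '1,800', '316']
print('1,800'.isdigit())                  # False
print(sum(to_int(e) for e in line[1:6]))  # 2116
```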

CodePudding user response:

If your string is something like "1234", you can call

int(number, base=base)

and you will get an integer back. For example:

print(int("1234"))

prints 1234.

Please check the rest of the documentation here: https://docs.python.org/3/library/functions.html#int

Then, to actually achieve what you want, you can proceed as suggested in the other answers or any way you like: loop through the list of elements, convert each one (a = int("1234")) and keep adding them, then return the total and write it to the file.

Of course, if your strings contain unexpected symbols such as thousands commas, you need to normalize them before calling int(), for example by removing the character with replace().
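
Putting that together, a possible count_doses_by_date could look like the sketch below. It assumes the dictionary comes from the main() function in the question and that the only unexpected character is the thousands comma. The filename_output parameter is an addition for this example. Note that csv.reader already returns a quoted field such as "1,800" as a single string, so removing the comma afterwards does not interfere with the commas that separate the CSV values.

```python
import csv

def count_doses_by_date(dict_doses_by_date, filename_output="doses.csv"):
    """Sum each date's counts and write 'date,total' rows to a CSV file."""
    totals = {}
    for date, values in dict_doses_by_date.items():
        # Remove thousands commas before converting, e.g. '1,800' -> 1800.
        totals[date] = sum(int(v.replace(",", "")) for v in values)

    # newline="" lets the csv module control line endings itself.
    with open(filename_output, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerows(totals.items())
    return totals
```

You would then call it as count_doses_by_date(main()).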
