I have a csv file with four columns. The first two columns are single strings, and the last two columns are meant to be multi-value lists. Using a pandas dataframe to pull the columns in, I'm trying to make a series of dictionaries. The first was easy -
import pandas as pd
df = pd.read_csv('data.csv')
dict1 = dict(zip(df['col_1'], df['col_2']))
- but now, where I'm trying to make {col_1: col_3} and {col_1: col_4}, where col_3 and col_4 each have multiple values, I'm finding it harder to get pandas/python to produce what I'm looking for.
My csv data is structured like this:
col_1, col_2, col_3, col_4
John Doe, A4w, "22,35,67", "45,78,99"
My desired output is a dictionary where the entry in col_1 is the key, and the value is a list of each individual item in col_3 (and then another dictionary structured in the same way for col_4).
So far, I've been able to get pandas to give me {'John Doe': "22,35,67"}, but what I want is {'John Doe': ['22', '35', '67']}. (I need to be able to iterate over the list later.) How do I change a 'multiple value' into a 'list' here?
CodePudding user response:
I believe that in this case you will have to call .str.split(',') on the columns you would like to transform into lists, for example:
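import pandas as pd

# One-row frame that mirrors the sample data from the question
df = pd.DataFrame({'col1' : 'John Doe', 'col2' : 'A4w', 'col3' : "22,35,67", 'col4' : "45,78,99"}, index = [0])
# .str.split(',') turns each comma-separated string into a list of strings
print(dict(zip(df['col1'], df['col3'].str.split(','))))
# {'John Doe': ['22', '35', '67']}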
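If you are reading straight from the CSV rather than building the frame by hand, something like the sketch below might be closer to what you need. It assumes the file looks exactly like the sample in the question, with a space after each comma, which is why skipinitialspace=True is passed to read_csv (without it the quoted fields and the column names would keep their leading space). The csv_text string and the column_to_lists helper are made up for illustration; with a real file you would pass 'data.csv' instead of the StringIO object.

```python
import io
import pandas as pd

# Stand-in for data.csv, copied from the sample in the question (assumption:
# the real file uses the same "comma followed by a space" layout).
csv_text = '''col_1, col_2, col_3, col_4
John Doe, A4w, "22,35,67", "45,78,99"
'''

# skipinitialspace=True drops the space after each delimiter, so the
# quoted fields are parsed as single values and the headers have no padding.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

def column_to_lists(frame, key_col, value_col):
    """Hypothetical helper: map key_col to the split values of value_col."""
    return dict(zip(frame[key_col], frame[value_col].str.split(',')))

dict3 = column_to_lists(df, 'col_1', 'col_3')
dict4 = column_to_lists(df, 'col_1', 'col_4')

print(dict3)
print(dict4)
```

The values are still strings after the split, so you can iterate over each list directly or convert the items yourself if you need numbers.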