I'm trying to change the data in one column of a .csv file. The new data is in a dict. I want to tell Python that if a key in my dict equals the ID in a row of my .csv, then the data in another column of that row should change.
The CSV file looks something like:
ID | Value_1 | Value_2 | Value_3 |
1 | info | changeme | info |
1 | info | changeme | info |
2 | info | changeme2| info |
3 | info | changeme3| info |
and the dict looks something like:
dictionary = {1: 'info1', 2: 'info2', 3: 'info3'}
Note that IDs can be repeated in the .csv file, but not in the dict (its keys are unique). So when a key in my dictionary equals the ID in a row of the .csv file, the content of Value_2 (as an example) in that row has to be replaced with the value stored under that key in the dictionary.
I hope y'all understand my explanation :/
I tried something like this, but I don't really know if the problem is with my code or with pandas:
for key, values in dictionary.items():
    if list(str(keys)) == list(df['ID']):
        df['VALUE'].replace(to_replace='VALUES', value= values, inplace= True)
But it's not working. I also tried it outside the for loop, and without the if. It just doesn't work; instead it creates a new row that seems to show the length of the .csv file.
Maybe I don't have to use pandas for this? Any advice would be helpful!
CodePudding user response:
I am pretty sure this is what you want, though it is not very clear from your question.
# assuming df is this
"""
ID | Value_1 | Value_2 | Value_3 |
1 | info | changeme | info |
1 | info | changeme | info |
2 | info | changeme2| info |
3 | info | changeme3| info |
"""
dictionary = {1: 'info1', 2: 'info2', 3: 'info3'}
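# make sure the IDs are ints so they match the dict keys
df["ID"] = df["ID"].astype(int)
# IDs found in the dict are replaced by their value; other IDs are left unchanged
df["column_with_mapped_value"] = df["ID"].replace(to_replace=dictionary)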
# output
"""
ID Value_1 Value_2 Value_3 column_with_mapped_value
0 1 info changeme info info1
1 1 info changeme info info1
2 2 info changeme2 info info2
3 3 info changeme3 info info3
"""
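If the goal is to overwrite Value_2 itself rather than add a new column, one possible sketch is to combine Series.map with fillna, so that rows whose ID is not in the dict keep their old value. The frame below is built inline as an assumption standing in for the real CSV, and the extra row with ID 5 is only there to show the fallback:

```python
import pandas as pd

# Sample data built inline (an assumption standing in for the real CSV file).
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 5],
    "Value_1": ["info"] * 5,
    "Value_2": ["changeme", "changeme", "changeme2", "changeme3", "keepme"],
    "Value_3": ["info"] * 5,
})
dictionary = {1: "info1", 2: "info2", 3: "info3"}

# map() looks each ID up in the dict and yields NaN where there is no key;
# fillna() then puts the original Value_2 back for those rows.
df["Value_2"] = df["ID"].map(dictionary).fillna(df["Value_2"])

print(df["Value_2"].tolist())
# ['info1', 'info1', 'info2', 'info3', 'keepme']
```

Repeated IDs are not a problem here, since every row is looked up independently.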
CodePudding user response:
To iterate over a dataset while checking a condition, I usually use the apply method. In your case it would look like this:
import pandas as pd
df = pd.read_csv('teste.csv', sep=';')
# ID Value_1 Value_2 Value_3
# 1 info changeme info
# 1 info changeme info
# 2 info changeme info
# 3 info changeme info
# 5 info changeme info
dictionary = {1: 'info1', 2: 'info2', 3: 'info3', 4: 'info4'}
df['Value_2'] = df.apply(lambda x: dictionary[x.ID] if x.ID in dictionary else x.Value_2, axis=1)
# ID Value_1 Value_2 Value_3
# 1 info info1 info
# 1 info info1 info
# 2 info info2 info
# 3 info info3 info
# 5 info changeme info
I added one more row to the CSV and one more key to the dict to test what happens when a key doesn't exist in one of them.
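On the question of whether pandas is needed at all: probably not. A rough sketch with only the standard csv module could look like the following. The inline text stands in for the real file (assumed to be ';'-separated, like the read_csv call above), and in practice you would open the input and output files instead of using io.StringIO:

```python
import csv
import io

# Inline sample standing in for the real file (assumption: ';' as separator).
raw = (
    "ID;Value_1;Value_2;Value_3\n"
    "1;info;changeme;info\n"
    "1;info;changeme;info\n"
    "2;info;changeme;info\n"
    "3;info;changeme;info\n"
    "5;info;changeme;info\n"
)
dictionary = {1: 'info1', 2: 'info2', 3: 'info3', 4: 'info4'}

reader = csv.DictReader(io.StringIO(raw), delimiter=";")
rows = []
for row in reader:
    # csv yields strings, so convert the ID to int before the dict lookup;
    # dict.get() falls back to the old Value_2 when the ID has no key.
    row["Value_2"] = dictionary.get(int(row["ID"]), row["Value_2"])
    rows.append(row)

# Write the updated rows back out in the same column order.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                        delimiter=";", lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

The row with ID 5 keeps changeme, and the unused key 4 is simply ignored.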