I'm trying to write a script that takes the keys in a dictionary and replaces them with the values they map to in a CSV file. I'm having trouble finding the matching rows.
CSV file
QuestionKey,QuestionId
BASIC1,F4AB5C41-5BB2-41BD-AF7C-08E76BA05DCE
BASIC2,1E6D5B13-BDD2-43E7-9B74-36AD8C816A9C
Script:
QUESTIONS_MAP = pd.read_csv('data/question_ids.csv', dtype=str)

def replace_questions_ids_with_keys(content: dict) -> dict:
    """Replace the question ids with the keys used in the decision response"""
    # LOGGER.info(QUESTIONS_MAP)
    for key, value in content.items():
        # LOGGER.info(QUESTIONS_MAP.QuestionId)
        # Find item QuestionKey for item in QUESTIONS_MAP where key in QuestionId
        question_key = list(filter(lambda x: key in x['QuestionId'], QUESTIONS_MAP))
        LOGGER.info(question_key)
        if question_key:
            content[QUESTIONS_MAP[key]] = value
            del content[key]
    return content
Example content dict:
{'F4AB5C41-5BB2-41BD-AF7C-08E76BA05DCE': '', '1E6D5B13-BDD2-43E7-9B74-36AD8C816A9C': ''}
Runtime error:
2022-05-18 14:46:55,340 - ERROR - string indices must be integers
Expected response:
{'BASIC1': '', 'BASIC2': ''}
CodePudding user response:
First, a dict is more convenient to work with here than a dataframe, so build one from the CSV:
# map question id -> question key
# index_col=1 tells pandas to use the QuestionId column as the index
# .squeeze("columns") turns the remaining one-column dataframe into a series
# (the squeeze=True argument of read_csv was deprecated in pandas 1.4 and removed in 2.0)
# finally, .to_dict() makes a dictionary out of the series
QUESTIONS_MAP = pd.read_csv('filename.csv', index_col=1).squeeze("columns").to_dict()
Then, just use a dict comprehension:
content = {
    QUESTIONS_MAP[id_]: value
    for id_, value in content.items()
    if id_ in QUESTIONS_MAP
}
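Note that the comprehension above drops any entry whose id is not in the map, whereas the function in the question would have left such entries in place. If that behaviour matters, a variant like the following might help. It is only a sketch: the hard-coded QUESTIONS_MAP stands in for the dict built from the CSV, and the 'UNKNOWN-ID' entry and the function name replace_question_ids_with_keys are made up for illustration. Building a new dict also avoids adding and deleting keys on content while iterating over it, which is not safe in Python.

```python
# Hypothetical stand-in for the id -> key dict built from the CSV.
QUESTIONS_MAP = {
    "F4AB5C41-5BB2-41BD-AF7C-08E76BA05DCE": "BASIC1",
    "1E6D5B13-BDD2-43E7-9B74-36AD8C816A9C": "BASIC2",
}


def replace_question_ids_with_keys(content: dict) -> dict:
    """Return a new dict with known question ids swapped for their keys."""
    # dict.get falls back to the original id when the map has no entry for it,
    # so unmatched entries are kept rather than dropped.
    return {QUESTIONS_MAP.get(id_, id_): value for id_, value in content.items()}


content = {
    "F4AB5C41-5BB2-41BD-AF7C-08E76BA05DCE": "yes",
    "UNKNOWN-ID": "kept",  # made-up id that is not in the map
}
result = replace_question_ids_with_keys(content)
print(result)
```

Whether unmatched ids should be kept or dropped depends on what the consumer of the dict expects, so pick whichever variant matches that.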
CodePudding user response:
IMHO, building a lookup with dataframe operations is overkill for what I can see here.
Here's a simple solution:
import pandas as pd

QUESTIONS_MAP = pd.read_csv('data/question_ids.csv', dtype=str)

def replace_questions_ids_with_keys(content: dict) -> dict:
    """Replace the question ids with the keys used in the decision response"""
    result = {}
    for key, value in content.items():
        # Scan every CSV row: column 0 is QuestionKey, column 1 is QuestionId
        for question_key, question_id in QUESTIONS_MAP.values:
            if question_id == key:
                # Carry the original value over instead of overwriting it with ""
                result[question_key] = value
    return result

my_dict = {'F4AB5C41-5BB2-41BD-AF7C-08E76BA05DCE': '', '1E6D5B13-BDD2-43E7-9B74-36AD8C816A9C': ''}
print(replace_questions_ids_with_keys(my_dict))
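If pandas really is overkill here, the same lookup could probably be done with the standard library's csv module alone. The following is a minimal sketch under a few assumptions: the CSV is inlined through io.StringIO so the example is self-contained (a real script would open 'data/question_ids.csv' instead), and load_questions_map plus the extra questions_map parameter are names introduced only for this illustration.

```python
import csv
import io

# Inline copy of the sample CSV from the question; a real script would
# open 'data/question_ids.csv' instead of using io.StringIO.
CSV_TEXT = """QuestionKey,QuestionId
BASIC1,F4AB5C41-5BB2-41BD-AF7C-08E76BA05DCE
BASIC2,1E6D5B13-BDD2-43E7-9B74-36AD8C816A9C
"""


def load_questions_map(csv_file) -> dict:
    """Build a QuestionId -> QuestionKey lookup from an open CSV file."""
    # csv.DictReader yields one dict per row, keyed by the header names.
    return {row["QuestionId"]: row["QuestionKey"] for row in csv.DictReader(csv_file)}


def replace_questions_ids_with_keys(content: dict, questions_map: dict) -> dict:
    """Return a new dict keyed by question key; unknown ids are dropped."""
    return {
        questions_map[id_]: value
        for id_, value in content.items()
        if id_ in questions_map
    }


questions_map = load_questions_map(io.StringIO(CSV_TEXT))
my_dict = {
    "F4AB5C41-5BB2-41BD-AF7C-08E76BA05DCE": "",
    "1E6D5B13-BDD2-43E7-9B74-36AD8C816A9C": "",
}
result = replace_questions_ids_with_keys(my_dict, questions_map)
print(result)  # {'BASIC1': '', 'BASIC2': ''}
```

Building the lookup once makes each replacement a single dict access instead of a scan over every CSV row.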