Having a hard time with Python and CSV

Time:12-14

In short, I'm trying to write a Python script that reads symptoms from a medical task given in the terminal, compares them to the symptoms in dataset.csv, and then reports what the patient from the task is likely suffering from.

The problem is that it doesn't seem to read dataset.csv and just prints "The patient is likely suffering from d."

The dataset.csv looks like this:

Asthma, Wheezing, coughing, chest tightness, and shortness of breath
Atelectasis, Shortness of breath, chest pain or discomfort, and a cough
Atypical pneumonia, Fever, chills, chest pain or discomfort, and shortness of breath
Basal cell carcinoma, Flat, pale, or yellowish patch of skin
Bell's palsy, Facial droop or weakness, numbness, pain around the jaw
Biliary colic, Pain in the upper abdomen that may spread to the shoulder or back
Bladder cancer, Blood in the urine, pain or burning with urination, and frequent urination
Brain abscess, Headache, fever, confusion, drowsiness, seizures, and weakness

And my script is the following:

#!/usr/bin/env python3

import argparse
import csv

# Parse the command line arguments
parser = argparse.ArgumentParser()
parser.add_argument('-t', '--task', help='The symptoms to search for in the dataset')
parser.add_argument('-d', '--dataset', help='The dataset to search in')
args = parser.parse_args()

# Get the task symptoms
task_symptoms = args.task.split(', ')

# Initialize a dictionary to store disease counts
disease_counts = {}

# Open the dataset
try:
    # Open the dataset
    with open(args.dataset, 'r') as csv_file:
        csv_reader = csv.reader('dataset.csv')

# Iterate through each row
    for row in csv_reader:
        
        # Get the disease and symptoms
        disease = row[0].strip()
        symptoms = row[1:]
        
        # Initialize the count
        count = 0
        
        # Iterate through each symptom in the task
        for task_symptom in task_symptoms:
            
            # Iterate through each symptom in the dataset
            for symptom in symptoms:

                # If the symptom matches a symptom in the task
                if task_symptom == symptom:
                    
                    # Increment the count
                    count  = 1

        # Store the disease name and count in the dictionary
        disease_counts[disease] = count
# Get the maximum count
    max_count = max(disease_counts.values())

    # Get the most probable disease from the counts
    most_probable_disease = [k for k, v in disease_counts.items() if v == max_count][0]

    print(f'The patient is likely suffering from {most_probable_disease}.')

except FileNotFoundError:
    print("Error: Could not open the file.")

What am I doing wrong?

What I expect is "The patient is likely suffering from Asthma." (for example; giving other symptoms should give other diseases).

It's been 3 weeks and I can't figure it out.

Thank you for helping me

CodePudding user response:

It looks like there are a few mistakes in the code:

The for loop that iterates through the rows is not inside the with block. It sits at the same level as the with statement, so it runs after the file has already been closed. A reader built on that file would then fail with "ValueError: I/O operation on closed file". To fix this, indent the for loop and the lines after it so that they are inside the with block. They can stay inside the try block.

The "# Iterate through each row" and "# Get the maximum count" comments start at column 0. Python ignores comments when it works out block structure, so this is not a syntax error, but it makes the nesting hard to read. Indent them to match the code they describe.

The except block for handling the FileNotFoundError is at the same level as the try block, which is correct, so it will run if the CSV file cannot be found.

Two more things need fixing before the script can find the most probable disease. First, csv.reader('dataset.csv') receives the file name as a string rather than the open file object csv_file, so the reader iterates over the characters of the name; that is where the "d" in the output comes from. Second, as posted, count = 1 sets the count instead of incrementing it; it should be count += 1.

If that does not work, then you have another problem with the csv file itself.
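
One thing worth checking in the posted script is the argument to csv.reader. It accepts any iterable of strings, so a plain file name is treated as a sequence of one-character lines. The sketch below shows the difference; io.StringIO stands in for the opened dataset file and the sample line is shortened, both only for illustration.

```python
import csv
import io

# Passing a string: csv.reader iterates over its characters,
# so every character becomes its own one-field row.
rows_from_name = list(csv.reader('dataset.csv'))
first_row = rows_from_name[0]

# Passing a file-like object: each line of the file becomes one row.
# io.StringIO is a stand-in for open(args.dataset) in this sketch.
sample = io.StringIO("Asthma, Wheezing, coughing\n")
rows_from_file = list(csv.reader(sample))

print(first_row)
print(rows_from_file)
```

With the string argument every "disease" is a single character and every count is 0, so the first key in the dictionary, "d", is the one reported. Passing csv_file instead, with the loop inside the with block, is likely what was intended.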

CodePudding user response:

I believe the problem is the format of the csv file.

Asthma, Wheezing, coughing, chest tightness, and shortness of breath

Because there is a space after each comma, this line in the csv file will produce these fields:

row[0] = "Asthma"
row[1] = " Wheezing"
row[2] = " coughing"
row[3] = " chest tightness"
row[4] = " and shortness of breath"

See how all of the fields after the first begin with a space? The string " coughing" does not match the string "coughing".
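
A possible way to handle those leading spaces is the skipinitialspace option of csv.reader, combined with strip() and a case-insensitive comparison. This is only a sketch: count_matches is a hypothetical helper, io.StringIO stands in for the real file, and treating a leading "and " as punctuation rather than part of the symptom is an assumption about the data.

```python
import csv
import io

def count_matches(lines, task_symptoms):
    """Return {disease: number of task symptoms found in its row}."""
    wanted = {s.strip().lower() for s in task_symptoms}
    counts = {}
    # skipinitialspace=True drops the space that follows each comma
    for row in csv.reader(lines, skipinitialspace=True):
        if not row:
            continue  # ignore blank lines
        disease = row[0].strip()
        symptoms = set()
        for field in row[1:]:
            field = field.strip().lower()
            # Assumption: a leading "and " is list wording, not part of the symptom
            if field.startswith('and '):
                field = field[4:]
            symptoms.add(field)
        counts[disease] = len(wanted & symptoms)
    return counts

# Two rows copied from the dataset in the question
sample = io.StringIO(
    "Asthma, Wheezing, coughing, chest tightness, and shortness of breath\n"
    "Bell's palsy, Facial droop or weakness, numbness, pain around the jaw\n"
)
counts = count_matches(sample, ['coughing', 'wheezing'])
best = max(counts, key=counts.get)
print(f'The patient is likely suffering from {best}.')
```

Exact string matching is still fragile here: a task symptom such as "cough" would not match "coughing" or "a cough", so the dataset wording and the task wording need to agree.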
