How to convert a CSV file to a dictionary in Python?

I have a CSV file that holds the configuration needed to create a YAML file (the final desired result). As a first step, I am trying to convert each row of the CSV file into a dictionary, which I can then convert to YAML with yaml.dump(Created_Dictionary).

Sample Input file (test.csv):

fieldname|type|allowed
field_A|String|10,20,30
field_B|Integer|

My source code using pandas library:

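import pandas as pd
import yaml

df = pd.read_csv("test.csv", sep="|")  # pass the delimiter by keyword; recent pandas rejects it positionally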
df_to_dict = df.to_dict(orient='records')
print(df_to_dict) # print the dictionary

test_yaml = yaml.dump(df_to_dict)
print(test_yaml) # print the yaml file

Output I am getting for the dictionary (df_to_dict):

[{'fieldname': 'field_A', 'type': 'String', 'allowed': '10,20,30'}, {'fieldname': 'field_B', 'type': 'Integer', 'allowed': nan}]

Output I am getting for yaml (test_yaml):

- allowed: 10,20,30
  fieldname: field_A
  type: String
- allowed: .nan
  fieldname: field_B
  type: Integer

Desired dictionary output (df_to_dict) is:

[
      {'EXT_FILE_IND':
          {'type': 'String', 'maxlength': '1', 'required': 'TRUE', 'empty': 'TRUE', 'coerce': '', 'allowed': '10,20,30'}
       },
      {'EXT_0003_SITE_ID':
          {'type': 'String', 'maxlength': '4', 'required': 'TRUE', 'empty': 'TRUE', 'coerce': '', 'allowed': ''}
       },
      {'EXT_1001_CLAIM_NUMBER':
          {'type': 'String', 'maxlength': '15', 'required': 'TRUE', 'empty': 'TRUE', 'coerce': '', 'allowed': ''}
       }
     ]

Desired yaml output (test_yaml) is:

field_A:
  type: String
  allowed: 10,20,30
field_B:
  type: Integer
  allowed:

I see that the variable df_to_dict is a list of dictionaries. Do I have to loop through each list item and build the dictionary for each row myself? I don't understand what the correct approach is. Any help is appreciated.

CodePudding user response:

Try:

my_dict = df.set_index("fieldname").to_dict("index")
test_yaml = yaml.dump(my_dict, sort_keys=False)

>>> print(test_yaml)
field_A:
  type: String
  allowed: 10,20,30
field_B:
  type: Integer
  allowed: .nan
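
If pandas is not a requirement, the same nesting can be sketched with the standard library's csv.DictReader, which also answers the "do I have to loop" question: one short loop is enough. This is only a minimal sketch. The helper name rows_to_nested_dict and the inlined SAMPLE string are made up for illustration, and the data is the test.csv content from the question.

```python
import csv
import io

# Sample data from the question, inlined so the sketch runs without test.csv.
SAMPLE = "fieldname|type|allowed\nfield_A|String|10,20,30\nfield_B|Integer|\n"


def rows_to_nested_dict(handle, key="fieldname", delimiter="|"):
    """Map each row's key column to a dict of its remaining columns."""
    nested = {}
    for row in csv.DictReader(handle, delimiter=delimiter):
        name = row.pop(key)  # take the key column out of the row...
        nested[name] = row   # ...and use its value as the outer key
    return nested


config = rows_to_nested_dict(io.StringIO(SAMPLE))
print(config)
```

With a real file, pass open("test.csv", newline="") instead of the StringIO object. Note that DictReader returns the empty cell as an empty string rather than nan, so the .nan entry does not appear with this approach.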

CodePudding user response:

You want to use the fieldname column as the index of your pandas DataFrame.

>>> df = pd.read_csv("test.csv", sep="|", index_col=0)
>>> df
              type   allowed
fieldname                   
field_A     String  10,20,30
field_B    Integer       NaN
>>> df.to_dict('index') # returns dict like {index -> {column -> value}}
{'field_A': {'type': 'String', 'allowed': '10,20,30'}, 'field_B': {'type': 'Integer', 'allowed': nan}}
>>> print(yaml.dump(df.to_dict('index')))
field_A:
  allowed: 10,20,30
  type: String
field_B:
  allowed: .nan
  type: Integer

The .nan value has to be handled separately, either with a custom dumper or by filtering it out before dumping.
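
One possible filter, as a rough sketch: walk the nested dictionary and replace every float NaN with None before dumping. The helper name nan_to_none is made up for illustration, and the raw dictionary below is typed in by hand to mirror the to_dict('index') result shown above, so the snippet runs without pandas.

```python
import math


def nan_to_none(value):
    """Recursively replace float NaN with None in nested dicts and lists."""
    if isinstance(value, dict):
        return {k: nan_to_none(v) for k, v in value.items()}
    if isinstance(value, list):
        return [nan_to_none(v) for v in value]
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


# Hand-written stand-in for df.to_dict('index') from the session above.
raw = {
    "field_A": {"type": "String", "allowed": "10,20,30"},
    "field_B": {"type": "Integer", "allowed": float("nan")},
}
cleaned = nan_to_none(raw)
print(cleaned)
```

Passing cleaned to yaml.dump then emits allowed: null instead of allowed: .nan. Another option is to read the file with pd.read_csv("test.csv", sep="|", index_col=0, keep_default_na=False), which keeps empty cells as empty strings, so the dump shows allowed: '' instead.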

See

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_dict.html?highlight=to_dict#pandas.DataFrame.to_dict

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html