Create a dictionary from a CSV file that has no header, using a given list as the dictionary keys

I've tried several different ways but none of them work.

For example

data = readData('energy_2.csv', ['M', 'V', 'H'])

Should return:

{'M': [150.270685, 150.062813, 150.090797, 150.050383, 150.065112, 149.968068, 149.915192, 150.060597, 149.798183, 150.074012, 150.052881, 149.9411, 150.01887, 149.924113, 149.906676], 'V': [12.977528, 12.595397, 13.489379, 13.802984, 12.841754, 12.651333, 13.346861, 11.646957, 11.92044, 12.43258, 12.695264, 12.583452, 12.592251, 12.903853, 12.53648], 'H': [75.638787, 75.329646, 74.502896, 74.24593, 74.056594, 75.484752, 74.883227, 76.901755, 75.238127, 76.996652, 74.006737, 75.1968, 73.863355, 75.000366, 76.025984]}

And mine returns:

{'M': ['150.270685;12.977528;75.638787', '150.062813;12.595397;75.329646', '150.090797;13.489379;74.502896', '150.050383;13.802984;74.24593', '150.065112;12.841754;74.056594', '149.968068;12.651333;75.484752', '149.915192;13.346861;74.883227', '150.060597;11.646957;76.901755', '149.798183;11.92044;75.238127', '150.074012;12.43258;76.996652', '150.052881;12.695264;74.006737', '149.9411;12.583452;75.1968', '150.01887;12.592251;73.863355', '149.924113;12.903853;75.000366', '149.906676;12.53648;76.025984']}

My code:

def readData(filename, labels):
    import pandas as pd
    df = pd.read_csv(filename, header=None)
    return {k: list(v) for k, v in zip(labels, df.values.T)}

CSV FILE:

150.270685;12.977528;75.638787
150.062813;12.595397;75.329646
150.090797;13.489379;74.502896
150.050383;13.802984;74.24593
150.065112;12.841754;74.056594
149.968068;12.651333;75.484752
149.915192;13.346861;74.883227
150.060597;11.646957;76.901755
149.798183;11.92044;75.238127
150.074012;12.43258;76.996652
150.052881;12.695264;74.006737
149.9411;12.583452;75.1968
150.01887;12.592251;73.863355
149.924113;12.903853;75.000366
149.906676;12.53648;76.025984

CodePudding user response：

You don't need pandas. Just use csv.reader!

From your question, it appears your CSV file is separated by semicolons. You need to specify this, since the default separator is a comma.
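To see the effect of the separator in isolation, here is a minimal sketch that parses two lines in the same format as the question's file, once with the default delimiter and once with delimiter=";". The sample string and the variable names are made up for illustration; io.StringIO stands in for the real file.

```python
import csv
import io

# Two sample lines in the same format as the question's file.
sample = "150.270685;12.977528;75.638787\n150.062813;12.595397;75.329646\n"

# Default delimiter (comma): each line comes back as a single field.
default_rows = list(csv.reader(io.StringIO(sample)))

# Explicit delimiter: each line is split into three fields.
semicolon_rows = list(csv.reader(io.StringIO(sample), delimiter=";"))

print(default_rows[0])    # ['150.270685;12.977528;75.638787']
print(semicolon_rows[0])  # ['150.270685', '12.977528', '75.638787']
```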

First, create a dictionary with the keys from labels where the values are empty lists. Then, append the values in each row to the correct list. Since you want float values, remember to convert them to float before appending!

import csv

def readData(filename, labels):
    data = {lbl: [] for lbl in labels}
    with open(filename, "r") as f:
        reader = csv.reader(f, delimiter=";")
        for row in reader:
            for lbl, value in zip(labels, row):
                data[lbl].append(float(value))
    return data

This gives you the required data:

{'M': [150.270685,
  150.062813,
  150.090797,
  150.050383,
  150.065112,
  149.968068,
  149.915192,
  150.060597,
  149.798183,
  150.074012,
  150.052881,
  149.9411,
  150.01887,
  149.924113,
  149.906676],
 'V': [12.977528,
  12.595397,
  13.489379,
  13.802984,
  12.841754,
  12.651333,
  13.346861,
  11.646957,
  11.92044,
  12.43258,
  12.695264,
  12.583452,
  12.592251,
  12.903853,
  12.53648],
 'H': [75.638787,
  75.329646,
  74.502896,
  74.24593,
  74.056594,
  75.484752,
  74.883227,
  76.901755,
  75.238127,
  76.996652,
  74.006737,
  75.1968,
  73.863355,
  75.000366,
  76.025984]}

CodePudding user response：

The issue is the default separator (sep=','). Set sep=';' instead of relying on the default. You can also pass the given list, labels, as names so that the columns are named after the keys you want.

For example:

import pandas as pd

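def readData(filename, labels):
    # sep=";" matches the file's separator; names=labels assigns the column names
    df = pd.read_csv(filename, header=None, sep=";", names=labels)
    # orient "list" maps each column name to a list of that column's values
    return df.to_dict("list")

data = readData('energy_2.csv', ['M', 'V', 'H'])
print(data)

Output:

{'M': [150.270685, 150.062813, 150.090797, 150.050383, 150.065112, 149.968068, 149.915192, 150.060597, 149.798183, 150.074012, 150.052881, 149.9411, 150.01887, 149.924113, 149.906676], 'V': [12.977528, 12.595397, 13.489379, 13.802984, 12.841754, 12.651333, 13.346861, 11.646957, 11.92044, 12.43258, 12.695264, 12.583452, 12.592251, 12.903853, 12.53648], 'H': [75.638787, 75.329646, 74.502896, 74.24593, 74.056594, 75.484752, 74.883227, 76.901755, 75.238127, 76.996652, 74.006737, 75.1968, 73.863355, 75.000366, 76.025984]}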

Source: pandas.read_csv (docs)

Side note: the answer above assumes energy_2.csv looks like the CSV file shown in the question (semicolon-separated, no header row).

CodePudding user response：

As was already said, the main issue was the ; separator in your .csv (the default separator being , as the name 'csv' implies).
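This also explains why the original result only had the key 'M'. With the wrong separator, each line is read as a single field, so there is only one column to pair with the labels, and zip stops at the shorter input. A minimal sketch with made-up variable names and two shortened sample lines:

```python
labels = ["M", "V", "H"]

# With the wrong separator, every line is parsed as one field, so after
# transposing there is only a single "column" of unsplit strings.
columns = [["150.270685;12.977528;75.638787", "150.062813;12.595397;75.329646"]]

# zip() stops at the shorter input, so only the first label gets paired.
result = {k: list(v) for k, v in zip(labels, columns)}
print(result)
# {'M': ['150.270685;12.977528;75.638787', '150.062813;12.595397;75.329646']}
```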

I first read the .csv as a list of rows (triplets, in this case), then turn it into three lists of values, converting each value to float, then into a dictionary.

def readData(filename, labels):
    import csv
    with open(filename, newline='') as f:
        # Each row is a list of strings, e.g. ['150.270685', '12.977528', '75.638787']
        data = list(csv.reader(f, delimiter=';'))
    # One list per label; `if d` skips blank lines, float() converts the strings
    return {label: [float(d[i]) for d in data if d] for i, label in enumerate(labels)}

headers = ['M', 'V', 'H']

print(readData('test.csv', headers))

# {'M': [150.270685, 150.062813, 150.090797, 150.050383, 150.065112, 149.968068, 149.915192, 150.060597, 149.798183, 150.074012, 150.052881, 149.9411, 150.01887, 149.924113, 149.906676], 'V': [12.977528, 12.595397, 13.489379, 13.802984, 12.841754, 12.651333, 13.346861, 11.646957, 11.92044, 12.43258, 12.695264, 12.583452, 12.592251, 12.903853, 12.53648], 'H': [75.638787, 75.329646, 74.502896, 74.24593, 74.056594, 75.484752, 74.883227, 76.901755, 75.238127, 76.996652, 74.006737, 75.1968, 73.863355, 75.000366, 76.025984]}
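The step of turning a list of rows into one list per column can also be written with zip(*rows), which transposes rows into columns. The following is only a sketch: read_columns is a hypothetical helper name, it takes any iterable of lines rather than a filename, and io.StringIO with two sample rows stands in for the real file. It converts each value to float, as the question expects.

```python
import csv
import io

def read_columns(lines, labels):
    """Hypothetical helper: map each label to one column of floats."""
    # Skip blank lines, which csv.reader yields as empty lists.
    rows = [row for row in csv.reader(lines, delimiter=";") if row]
    columns = zip(*rows)  # transpose rows -> columns
    return {label: [float(x) for x in col] for label, col in zip(labels, columns)}

sample = io.StringIO("150.270685;12.977528;75.638787\n\n150.062813;12.595397;75.329646\n")
result = read_columns(sample, ["M", "V", "H"])
print(result)
# {'M': [150.270685, 150.062813], 'V': [12.977528, 12.595397], 'H': [75.638787, 75.329646]}
```

To use it with a real file, an open file object can be passed in place of the StringIO, for example inside a with open(filename, newline='') as f: block.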