filename as key in dictionary - pandas

Time:10-02

I have around 100 .csv files in a folder.
They are named like AA.csv, BB.csv, CC.csv, and so on.

I used the code below to load all the files into a single DataFrame:

import pandas as pd
import glob

df = pd.concat(map(pd.read_csv, glob.glob('/Users/redman/stock-data/*.csv')))

How can I store them in a dictionary instead, using the filename as the key?
The dictionary keys would then be "AA", "BB", "CC", and so on.
Any suggestions would be great.

CodePudding user response:

We can use a dictionary comprehension. Note that each key here is the full path string returned by glob.glob, not just the file name:

import glob

import pandas as pd

d = {f: pd.read_csv(f)
     for f in glob.glob('/Users/redman/stock-data/*.csv')}

We can use Path.stem and Path.glob if we only want the file name and not the extension or path:

from pathlib import Path

import pandas as pd

d = {f.stem: pd.read_csv(f.resolve())
     for f in Path('/Users/redman/stock-data/').glob('*.csv')}
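The same idea can be wrapped in a small helper. This is only a sketch: the function name load_csv_dict is hypothetical, and sorting the paths is an added assumption that makes the key order deterministic, since Path.glob does not guarantee any particular order. The demo writes to a temporary directory rather than the real stock-data folder.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_csv_dict(folder):
    """Return {file stem: DataFrame} for every .csv directly inside folder."""
    # sorted() is an addition for reproducible key order; Path.glob alone
    # yields files in arbitrary order.
    return {p.stem: pd.read_csv(p) for p in sorted(Path(folder).glob('*.csv'))}


# Demo on a throwaway directory with made-up data (not the real files).
with tempfile.TemporaryDirectory() as tmp:
    for name in ['BB', 'AA']:
        pd.DataFrame({'close': [1.0, 2.0]}).to_csv(
            Path(tmp) / f'{name}.csv', index=False)
    frames = load_csv_dict(tmp)

print(list(frames))  # ['AA', 'BB']
```

pd.read_csv accepts a Path object directly, so calling resolve() is optional.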

Complete Working Example with Generated Sample Data:

from pathlib import Path
from pprint import pprint

import numpy as np
import pandas as pd

# Generate Sample Directory and csv
np.random.seed(26)
Path('./stock-data').mkdir(exist_ok=True)
for f in ['AA', 'BB', 'CC']:
    pd.DataFrame(
        np.random.randint(1, 100, (3, 5)),
    ).to_csv(f'./stock-data/{f}.csv', index=False)

# Build Dictionary
d = {f.stem: pd.read_csv(f.resolve())
     for f in Path('./stock-data/').glob('*.csv')}
pprint(d)

d:

{'AA':     0   1   2   3   4
0  54  63   7  49  66
1  84  78  91  33  69
2  99  46  18  56  54,
 'BB':     0   1   2   3   4
0  90  53  89  22  31
1  16  43  24  48  92
2  57  83  48  13  48,
 'CC':     0   1   2   3   4
0  17  13  61  77  84
1  20  30  52  29  82
2  45  18  60  53  71}
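If the single combined DataFrame from the question is still needed, the dictionary can be passed straight to pd.concat, which uses the dictionary keys as an outer index level. A minimal sketch, with a small hand-written dictionary standing in for d:

```python
import pandas as pd

# Hypothetical stand-in for the dictionary of DataFrames built above.
d = {
    'AA': pd.DataFrame({'0': [54, 84], '1': [63, 78]}),
    'BB': pd.DataFrame({'0': [90, 16], '1': [53, 43]}),
}

# Passing a mapping to pd.concat uses its keys as the outer index level,
# so every row stays labelled with the file it came from.
combined = pd.concat(d)

# Rows from one file can be selected again by key.
aa_rows = combined.loc['AA']
print(combined)
```

This keeps the per-file lookup (d['AA']) and the combined view available side by side.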