I have a series of csv files inside a directory. Each csv file has the following columns:
slotID; NLunAn; NLunTot; MeanBPM
Starting from the values in the slotID column, I would like to create one dataframe per slot ID that contains the corresponding rows from every file. For example, the first CSV has the following values:
slotID NLunAn NLunTot MeanBPM
7 11 78 129,7
11 6 63 123,3
12 6 33 120,6
13 5 41 124,5
14 4 43 118,9
The second CSV has the following values:
slotID NMarAn NMarTot MeanBPM
7 10 72 131,2
11 5 48 121,5
12 4 17 120,9
13 4 19 125,6
16 6 45 127,4
I would like to create a dataframe, for example called dataframe1, that contains the values of slot 7, another dataframe that contains the values of slot 11, and so on. Any suggestion is welcome. I've been trying for several days but can't find a way out, please help me. This is what I've done so far:
import pandas as pd
#import matplotlib.pyplot as plt
import os
import glob
import numpy as np
path = os.getcwd()
csv_files = glob.glob(os.path.join(path, "*.csv"))
for f in csv_files:
    dfDay = pd.read_csv(f, encoding = "ISO-8859-1", sep = ';')
    # dfDay holds the data of the file currently being read
CodePudding user response:
Provided that all the csv-files have the same structure (i.e. column names) you could do something like this:
...
path = os.getcwd()
csv_files = glob.glob(os.path.join(path, "*.csv"))
df = pd.concat(
    (pd.read_csv(f, encoding='ISO-8859-1', sep=';') for f in csv_files),
    ignore_index=True
)
slot_dfs = {slot: group for slot, group in df.groupby("slotID")}
# Exporting to csv-files
for n, df_slot in enumerate(slot_dfs.values(), start=1):
    df_slot.to_csv(f"dataframe{n}.csv", index=False)
The dictionary slot_dfs contains the dataframes for each available slot.
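To get the dataframe of a single slot you can index the dictionary with the slot ID. Here is a minimal sketch that uses a small in-memory frame in place of the concatenated CSV data; the slot IDs and totals are taken from the question, while the column name NTot is only a placeholder for this example:

```python
import pandas as pd

# Small stand-in for the concatenated frame `df`
# (NTot is a placeholder column name, not one from the real files)
df = pd.DataFrame({
    "slotID": [7, 11, 12, 7, 11, 16],
    "NTot": [78, 63, 33, 72, 48, 45],
})

# One dataframe per slot ID, keyed by the slot ID itself
slot_dfs = {slot: group for slot, group in df.groupby("slotID")}

print(sorted(slot_dfs))  # the slot IDs that occur in the data
print(slot_dfs[7])       # all rows that belong to slot 7
```

Keying by slot ID means you don't have to remember which running number belongs to which slot.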
If you really want to create variables for the dataframes then you could try
for n, (_, group) in enumerate(df.groupby("slotID"), start=1):
    globals()[f"dataframe{n}"] = group
    # Exporting to csv-file
    group.to_csv(f"dataframe{n}.csv", index=False)
instead of creating the slot_dfs dictionary. After that, print(dataframe1) should show the dataframe for the first slot, etc.
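One caveat: the two sample files in the question use different names for some columns (for example NLunTot vs NMarTot), so a plain concat would keep them as separate columns filled with NaN. If every file really has the same four columns in the same order (that is an assumption on my side), you could rename the columns by position while reading. The names COLUMNS, read_day and the extra "source" column below are made up for this sketch. I also pass decimal=',' so that values like 129,7 are parsed as numbers rather than strings:

```python
import io
import pandas as pd

# Hypothetical common column names; assumes four columns in this order in every file
COLUMNS = ["slotID", "NAn", "NTot", "MeanBPM"]

def read_day(source, label):
    """Read one semicolon-separated file and replace its header with COLUMNS."""
    day = pd.read_csv(source, sep=";", decimal=",", header=0, names=COLUMNS)
    day["source"] = label  # remember which file each row came from
    return day

# In-memory stand-ins for two of the files (values taken from the question)
mon = io.StringIO("slotID;NLunAn;NLunTot;MeanBPM\n7;11;78;129,7\n11;6;63;123,3\n")
tue = io.StringIO("slotID;NMarAn;NMarTot;MeanBPM\n7;10;72;131,2\n16;6;45;127,4\n")

df = pd.concat([read_day(mon, "mon"), read_day(tue, "tue")], ignore_index=True)
slot_dfs = {slot: group for slot, group in df.groupby("slotID")}
print(slot_dfs[7])
```

With real files you would pass the file path instead of the StringIO object, plus the encoding argument if needed.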