Read file names up to a point and look them up inside a csv file

Time:05-26

I have a folder with hundreds of ascii files whose names look like this:

(POD20keuaks_221599.ascii, POD11ejsjs_221202.ascii etc.) 

I also have a csv file with several columns. It looks like the sample below, with the file names written only up to the '_':

name,sale
POD20keuaks,20
POD11ejsjs,22
...

I have working code that first reads the csv file, takes each name, and uses it to find the matching file. However, it does not handle every file when two files share the same name up to the '_', such as "POD20keuaks_221599.ascii" and "POD20keuaks_352599.ascii". In that case it only reads one of them.

import os
import numpy as np
import pandas as pd
import re

dir_path = '/home/user/Downloads/Project/'
Dvalues = pd.read_csv(dir_path + 'Dvalues.csv')
file_names = list(Dvalues['name'])
ascii_files_name = []

# Collect every POD file whose name up to '_' appears in the csv
for filename in os.listdir(dir_path):
    if filename[:3] != 'POD': continue
    if filename[:filename.index('_')] in file_names:
        ascii_files_name.append(filename)

# Load each collected file together with its 'sale' value from the csv
files_data = {}
for file in ascii_files_name:
    file_short_name = file[:file.index('_')]
    D_value_sale = Dvalues[Dvalues['name'] == file_short_name]['sale'].values[0]
    files_data[file_short_name] = [np.loadtxt(dir_path + file), D_value_sale]

I can't understand why it's not reading all of them, or how to fix it. Ideally, how can I first read the file names up to the '_' and then look in the csv file for the value that corresponds to that name (the reverse of how I did it)?

CodePudding user response:

I can't run it, but I think the whole problem is that you use file_short_name as the key in files_data.

The code may read all the files, but the line

files_data[file_short_name] = [np.loadtxt(dir_path + file), D_value_sale]

replaces the value stored for the previous file with the same short name.
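
To illustrate the overwrite, here is a toy version of that loop. It is only a sketch: the two file names come from the question, and plain strings stand in for the arrays that np.loadtxt would return.

```python
# Toy stand-in for the loading loop: strings replace the loaded arrays.
files = ["POD20keuaks_221599.ascii", "POD20keuaks_352599.ascii"]

files_data = {}
for file in files:
    file_short_name = file[:file.index('_')]
    # Both files produce the key 'POD20keuaks', so the second
    # assignment overwrites the first one.
    files_data[file_short_name] = file

print(files_data)  # {'POD20keuaks': 'POD20keuaks_352599.ascii'}
```

Only one entry survives, even though the loop visited both files.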

Maybe you should use the full file name as the key:

files_data[file] = [np.loadtxt(dir_path + file), D_value_sale]

Or you should use a list to keep all the files that share a short name:

if file_short_name not in files_data:
    files_data[file_short_name] = []

files_data[file_short_name].append([np.loadtxt(dir_path + file), D_value_sale])
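
For the "reverse" direction asked about in the question (start from the file names, then look up the csv value), something like the sketch below might work. It is untested against your real data. The helper name group_by_prefix and the sample values are made up for illustration, and a plain dict stands in for the csv so the example runs without pandas.

```python
from collections import defaultdict


def group_by_prefix(filenames, sale_by_name):
    """Map each name before '_' to (sale, [every matching file name])."""
    matches = defaultdict(list)
    for filename in filenames:
        # Skip anything that is not a POD file or has no '_' to split on.
        if not filename.startswith('POD') or '_' not in filename:
            continue
        short_name = filename.split('_', 1)[0]
        # Keep the file only if the csv knows this short name.
        if short_name in sale_by_name:
            matches[short_name].append(filename)
    return {name: (sale_by_name[name], sorted(found))
            for name, found in matches.items()}


# Sample data (hypothetical): stands in for the csv and os.listdir(dir_path).
sale_by_name = {'POD20keuaks': 20, 'POD11ejsjs': 22}
filenames = [
    'POD20keuaks_221599.ascii',
    'POD20keuaks_352599.ascii',
    'POD11ejsjs_221202.ascii',
    'POD99zzzzz_000001.ascii',  # not in the csv, so it is skipped
    'Dvalues.csv',              # not a POD file, so it is skipped
]

grouped = group_by_prefix(filenames, sale_by_name)
print(grouped['POD20keuaks'])
# (20, ['POD20keuaks_221599.ascii', 'POD20keuaks_352599.ascii'])
```

With the real data, sale_by_name could probably be built with dict(zip(Dvalues['name'], Dvalues['sale'])), filenames taken from os.listdir(dir_path), and each file loaded with np.loadtxt(os.path.join(dir_path, filename)).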