Currently I am using data (wavelength, flux) in txt format and have six txt files. The wavelengths are the same but the fluxes are different. I have imported the txt files using pd.read_cvs
(as can be seen in the code) and assigned each flux a different name. These different named fluxes are placed in an array. Finally, I interpolate the fluxes with a temperature array. The codes works and because currently I only have six files writing the code this way is ok. The problem I have moving forward is that when I have 100s of txt files I need a better method.
How can I use glob to import the txt files, assign a different name to each flux (if that is necessary) and finally interpolate? Any help would be appreciated. Thank you.
import pandas as pd
import numpy as np
from scipy import interpolate
fcf = 0.0000001 # flux conversion factor
wcf = 10 #wave conversion factor
temperature = np.array([725,750,775,800,825,850])
# import files and assign column headers; blank to ignore spaces
c1p = pd.read_csv("../c/725.txt",sep=" ",header=None)
c1p.columns = ["blank","0","blank","blank","1"]
c2p = pd.read_csv("../c/750.txt",sep=" ",header=None)
c2p.columns = ["blank","0","blank","blank","1"]
c3p = pd.read_csv("../c/775.txt",sep=" ",header=None)
c3p.columns = ["blank","0","blank","blank","1"]
c4p = pd.read_csv("../c/800.txt",sep=" ",header=None)
c4p.columns = ["blank","0","blank","blank","1"]
c5p = pd.read_csv("../c/825.txt",sep=" ",header=None)
c5p.columns = ["blank","0","blank","blank","1"]
c6p = pd.read_csv("../c/850.txt",sep=" ",header=None)
c6p.columns = ["blank","0","blank","blank","1"]
wave = np.array(c1p['0']/wcf)
c1fp = np.array(c1p['1']*fcf)
c2fp = np.array(c2p['1']*fcf)
c3fp = np.array(c3p['1']*fcf)
c4fp = np.array(c4p['1']*fcf)
c5fp = np.array(c5p['1']*fcf)
c6fp = np.array(c6p['1']*fcf)
cfp = np.array([c1fp,c2fp,c3fp,c4fp,c5fp,c6fp])
flux_int = interpolate.interp1d(temperature,cfp,axis=0,kind='linear',bounds_error=False,fill_value='extrapolate')
My attempts so far...I think I have loaded the files into a list using glob as
import pandas as pd
import numpy as np
from scipy import interpolate
import glob
c_list=[]
path = "../c/*.*"
for file in glob.glob(path):
print(file)
c = pd.read_csv(file,sep=" ",header=None)
c.columns = ["blank","0","blank","blank","1"]
c_list.append
I am still unsure how to extract just the fluxes into an array in order to interpolate. I will continue to post my attempts.
My updated code
fcf = 0.0000001
import pandas as pd
import numpy as np
from scipy import interpolate
import glob
c_list=[]
path = "../c/*.*"
for file in glob.glob(path):
print(file)
c = pd.read_csv(file,sep=" ",header=None)
c.columns = ["blank","0","blank","blank","1"]
c = c['1']*fcf
c_list.append(c)
fluxes = np.array(c_list)
temperature = np.array([7250,7500,7750,8000,8250,8500])
flux_int =interpolate.interp1d(temperature,fluxes,axis=0,kind='linear',bounds_error=False,fill_value='extrapolate')
When I run this code I get the following error
raise ValueError("x and y arrays must be equal in length along "
ValueError: x and y arrays must be equal in length along interpolation axis.
I think the error in the code that needs correcting is here fluxes = np.array(c_list)
. This is one list of all fluxes but I need a list of fluxes from each file. How is this done?
Final attempt
import pandas as pd
import numpy as np
from scipy import interpolate
import glob
c_list=[]
path = "../c/*.*"
for file in glob.glob(path):
print(file)
c = pd.read_csv(file,sep=" ",header=None)
c.columns = ["blank","0","blank","blank","1"]
c = c['1']* 0.0000001
c_list.append(c)
c1=np.array(c_list[0])
c2=np.array(c_list[1])
c3=np.array(c_list[2])
c4=np.array(c_list[3])
c5=np.array(c_list[4])
c6=np.array(c_list[5])
fluxes = np.array([c1,c2,c3,c4,c5,c6])
temperature = np.array([7250,7500,7750,8000,8250,8500])
flux_int = interpolate.interp1d(temperature,fluxes,axis=0,kind='linear',bounds_error=False,fill_value='extrapolate')
This code work but I am still not sure about
c1=np.array(c_list[0])
c2=np.array(c_list[1])
c3=np.array(c_list[2])
c4=np.array(c_list[3])
c5=np.array(c_list[4])
c6=np.array(c_list[5])
Is there a better way to write this?
CodePudding user response:
Here's 2 things that you can tdo:
Instead of
c = c['1']* 0.0000001
try doing
c = c['1'].to_numpy()* 0.0000001
This will build a list of numpy Arrays rather than a list of pandas Series
When constructing fluxes, you can just do
fluxes = np.array(c_list)