I'm trying to write a short program that subtracts the values in one file (A.dat) from the values in other files (B0.dat, B1.dat, B2.dat, ...). The program should repeat the subtraction for however many B files I have (3, 7, or 81). All the files have the same number of columns. File A has one row of values; each B file has two rows. I guess the best solution would be a loop, but I'm getting errors. At the end, I'd like to save each corrected file as B0sub, B1sub, B2sub, ...
file A.dat
A B C D
-1 2 2.5 4
file B0.dat
A B C D
7 8 9 10
5 3 13 11
file B1.dat
A B C D
11 12 13 14
3 4 7 8
file B2.dat
A B C D
6 8.5 5.3 1
0.8 4.2 2 9
I have no idea how to do this. So far I have tried the following:
import os
filepath = 'location of files'
i = 0
filename = f'B{i}.dat'
file = pd.read_csv(filepath, delimiter='\t', decimal=',', header=0)
## adding 'sub' to the file
for file in files
os.rename(os.path.join(directory,file), os.path.join, file 'sub' '.dat')
# next file
i = 1
filename = f'B{i}.dat'
CodePudding user response:
- Please use CSV if you have comma-separated data
- In your future questions, please provide a simple way for us to recreate the data you have (as I did in my answer)
- Just use
B = B - A.loc[0]
to subtract one row from an entire DataFrame
import pandas as pd
import glob

# Create sample data
data_A = pd.DataFrame(data={"A": [-1], "B": [2]})
data_A.to_csv("A.csv", index=False)
data_B0 = pd.DataFrame(data={"A": [7, 5], "B": [8, 3]})
data_B1 = pd.DataFrame(data={"A": [11, 3], "B": [12, 4]})
data_B0.to_csv("B0.csv", index=False)
data_B1.to_csv("B1.csv", index=False)

# Now let's read the data and subtract
A = pd.read_csv("A.csv")
for f in glob.glob("B*.csv"):
    B = pd.read_csv(f)
    # Subtract the single row of A from every row of B (aligned on column names)
    B = B - A.loc[0]
    print(B)
    # Output names start with "sub_", so they do not match "B*.csv" on a rerun
    B.to_csv("sub_" + f, index=False)
Displays:
# For B0:
   A  B
0  8  6
1  6  1
# For B1:
    A   B
0  12  10
1   4   2
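To get the output names asked for in the question (B0sub, B1sub, ...) and work directly on the .dat files, the same idea could be sketched as below. This is only a sketch under assumptions: the files are taken to be tab-separated with "." as the decimal mark, and the function name `subtract_reference` and its parameters are made up for illustration.

```python
from pathlib import Path

import pandas as pd


def subtract_reference(folder, ref_name="A.dat", pattern="B*.dat", sep="\t"):
    """Subtract the single row of the reference file from every matching file.

    The function name, parameter names and defaults are illustrative
    assumptions. Returns the list of paths that were written.
    """
    folder = Path(folder)
    # First (and only) data row of the reference file, as a Series
    ref_row = pd.read_csv(folder / ref_name, sep=sep).iloc[0]

    written = []
    for path in sorted(folder.glob(pattern)):
        # Outputs of an earlier run (B0sub.dat, ...) also match the pattern
        if path.stem.endswith("sub"):
            continue
        # Row-wise subtraction, aligned on column names
        corrected = pd.read_csv(path, sep=sep) - ref_row
        out_path = path.with_name(f"{path.stem}sub{path.suffix}")
        corrected.to_csv(out_path, sep=sep, index=False)
        written.append(out_path)
    return written
```

Calling `subtract_reference("location of files")` would then write B0sub.dat, B1sub.dat, ... next to the inputs, however many B files there are.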
CodePudding user response:
Assuming that reading the files with pd.read_csv() already works for you, first store all the values from the A file, then use three nested loops:
- The first loop goes through each B file.
- The second loop goes through each row of the current B file.
- The third loop goes through each value of the current row.
After that, you can save the outputs in a new file:
import pandas as pd

# Number of B files to process (set this to match your data)
NUMBEROFFILES = 3

# Getting the values of A
df_a = pd.read_csv('A.dat', delimiter='\t', decimal=',', header=0)

# Looping through each file
for num in range(NUMBEROFFILES):
    # Opening the file; cast to float so non-integer results fit in the columns
    df = pd.read_csv(f'B{num}.dat', delimiter='\t', decimal=',', header=0).astype(float)
    # Now looping through each row of the file
    for i in range(len(df)):
        # Finally looping through each value of the row
        for j in range(len(df.columns)):
            # Subtracting the A values from the B values (by position)
            df.iloc[i, j] = df.iloc[i, j] - df_a.iloc[0, j]
    # Saving the df in a new tab-separated file, without the index column
    df.to_csv(f'B{num}sub.dat', sep='\t', index=False)
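As a side note, the two inner loops described above should give the same result as a single row-wise subtraction. The small in-memory check below is a sketch that uses the A.dat and B0.dat values from the question, with no file I/O; the variable names `looped` and `vectorized` are only illustrative.

```python
import pandas as pd

# Values copied from A.dat and B0.dat in the question
df_a = pd.DataFrame({"A": [-1], "B": [2], "C": [2.5], "D": [4]})
df_b = pd.DataFrame({"A": [7, 5], "B": [8, 3], "C": [9, 13], "D": [10, 11]})

# Element-by-element version, mirroring the row loop and the value loop
looped = df_b.astype(float)
for i in range(len(looped)):
    for j in range(len(looped.columns)):
        looped.iloc[i, j] = df_b.iloc[i, j] - df_a.iloc[0, j]

# Vectorized version: the single A row is subtracted from every B row
vectorized = df_b - df_a.iloc[0]

# Compare the two results (values and dtypes)
print(looped.equals(vectorized))
```

The loop version makes each step explicit, while the one-line subtraction is usually shorter and faster on larger files.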