python in loop subtracting files one from another


I'm trying to write a short program that subtracts the values in one file (A.dat) from the values in other files (B0.dat, B1.dat, B2.dat, ...). The program should repeat the subtraction of the A values for however many B files I have (3, 7, or 81). All the files have the same number of columns. File A has one row; the B files have two rows. I guess the best solution would be a loop, but I'm getting errors. At the end, I'd like to save each corrected file as B0sub, B1sub, B2sub, ...

file A.dat

A   B   C   D
-1  2   2.5 4

file B0.dat         
A   B   C   D
7   8   9   10
5   3   13  11

file B1.dat 
A   B   C   D
11  12  13  14
3   4   7   8

file B2.dat 
A   B   C   D
6   8.5 5.3 1
0.8 4.2 2   9

I have no idea how to do it. So far I have tried this:

import os
filepath = 'location of files'

i = 0   
filename = f'B{i}.dat'    
file = pd.read_csv(filepath, delimiter='\t', decimal=',', header=0)
## adding 'sub' to the file 
for file in files 
    os.rename(os.path.join(directory,file), os.path.join, file + 'sub' + '.dat')

# next file 
i += 1
filename = f'B{i}.dat'

CodePudding user response:

  1. Please use CSV if you have comma-separated data.
  2. In future questions, please provide a simple way for us to recreate your data (as I did in my answer).
  3. Just use B = B - A.loc[0] to subtract one row from an entire dataframe.

import pandas as pd
import glob

# Create sample data
data_A = pd.DataFrame(data={"A": [-1], "B": [2]})

data_A.to_csv("A.csv", index=False)

data_B0 = pd.DataFrame(data={"A": [7, 5], "B": [8, 3]})
data_B1 = pd.DataFrame(data={"A": [11, 3], "B": [12, 4]})
data_B0.to_csv("B0.csv", index=False)
data_B1.to_csv("B1.csv", index=False)

# Now let's read the data and subtract
A = pd.read_csv("A.csv")

for f in glob.glob("B*.csv"):
    B = pd.read_csv(f)
    B = B - A.loc[0]
    B.to_csv("sub_" + f, index=False)


# For B0:
   A  B
0  8  6
1  6  1

# For B1:
    A   B
0  12  10
1   4   2
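
Applied to the .dat files from the question, the same idea might look like the sketch below. It assumes the columns are separated by whitespace, use '.' as the decimal mark, and have a header row; the helper name subtract_reference and its directory argument are hypothetical, not part of the original answer.

```python
import glob
import os

import pandas as pd


def subtract_reference(directory="."):
    """Subtract the single row of A.dat from every B*.dat in `directory`.

    Hypothetical helper: assumes whitespace-separated columns and a header row.
    Returns the list of files written.
    """
    # iloc[0] turns the one-row frame into a Series indexed by column name
    a_row = pd.read_csv(os.path.join(directory, "A.dat"), sep=r"\s+").iloc[0]
    written = []
    for path in sorted(glob.glob(os.path.join(directory, "B*.dat"))):
        root, ext = os.path.splitext(path)
        if root.endswith("sub"):
            continue  # skip outputs left over from an earlier run
        b = pd.read_csv(path, sep=r"\s+")
        out_path = root + "sub" + ext  # B0.dat -> B0sub.dat
        # The Series is aligned on column names and subtracted from each row
        (b - a_row).to_csv(out_path, sep="\t", index=False)
        written.append(out_path)
    return written
```

Calling subtract_reference('location of files') would write B0sub.dat, B1sub.dat, ... next to the inputs, however many B files are present.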

CodePudding user response:

Assuming that the way you read the file is working with pd.read_csv(), you first have to save all your values from the A file and then have 3 loops:

    1. The first loop goes through each B file.
    2. The second loop goes through each row of the current B file.
    3. The third loop goes through each value of the current row.

After that, you can save the outputs in a new file:

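import pandas as pd

NUMBER_OF_FILES = 3  # set this to the number of B files you have

# getting values of A
df_a = pd.read_csv('A.dat', delimiter='\t', decimal=',', header=0)

# Looping through each file
for num in range(NUMBER_OF_FILES):
    # Opening the file (cast to float so fractional results fit in every column)
    df = pd.read_csv(f'B{num}.dat', delimiter='\t', decimal=',', header=0).astype(float)
    # Now looping through each row of the file
    for i in range(len(df)):
        # Finally looping through each value of the row
        for j in range(len(df.columns)):
            # Subtracting the A values (its only row) from the B values
            df.iloc[i, j] = df.iloc[i, j] - df_a.iloc[0, j]
    # Saving the df in a new file named B0sub.dat, B1sub.dat, ...
    df.to_csv(f'B{num}sub.dat', sep='\t', decimal=',', index=False)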
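
The two inner loops amount to an element-wise subtraction of A's single row from every row of B, which pandas can also express as one broadcasted operation. The sketch below uses in-memory values copied from A.dat and B0.dat in the question (the variable names are illustrative) and checks that both approaches give the same result.

```python
import pandas as pd

# Values copied from A.dat and B0.dat in the question
df_a = pd.DataFrame({"A": [-1.0], "B": [2.0], "C": [2.5], "D": [4.0]})
df_b = pd.DataFrame(
    {"A": [7.0, 5.0], "B": [8.0, 3.0], "C": [9.0, 13.0], "D": [10.0, 11.0]}
)

# Explicit loops: every row i and column j of B minus A's only row
looped = df_b.copy()
for i in range(len(looped)):
    for j in range(len(looped.columns)):
        looped.iloc[i, j] = df_b.iloc[i, j] - df_a.iloc[0, j]

# Broadcasted form: the row Series is aligned on column names
vectorized = df_b - df_a.iloc[0]

print(looped.equals(vectorized))
```

The loop version is easier to follow step by step; the broadcasted version is shorter and avoids per-cell assignment.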