How to combine every 4 lines in a txt file?

I have a .txt file that looks like this:

data1  data2  data3  
data4  data5  data6  
data7  data8  data9  
data10 data11 data12 
data13 data14 data15 
data16 data17 data18 
data19 data20 data21
data22 data23 data24 
.
.
.

I want to rearrange the file so that data1 through data12 form the first line, data13 through data24 form the second line, and so on. In other words, every 4 lines should be combined into 1 line. The desired output looks like this:

data1  data2  data3  data4  data5  data6  data7  data8  data9  data10 data11 data12 
data13 data14 data15 data16 data17 data18 data19 data20 data21 data22 data23 data24

How can I do this in Python?

Thank you for any advice, Baris

I tried methods shared in various posts, but none of them worked.

CodePudding user response：

You could try something like this:

with open("text.txt", "r") as f:  # load data
    lines = f.readlines()

newlines = []
for i in range(0, len(lines), 4):  # step through in blocks of four
    # join the four lines after stripping the newline characters at the end
    newline = lines[i].strip() + " " + lines[i + 1].strip() + " " + lines[i + 2].strip() + " " + lines[i + 3].strip()
    newlines.append(newline + "\n")  # save them to a list

If the number of lines is not evenly divisible by 4, the last block raises an IndexError, so you would need to add some extra handling for the trailing lines.
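
One way to handle the trailing lines is to slice the list instead of indexing each line individually. This is only a sketch: it assumes blank lines should be skipped and that a short final group is simply written out as a shorter line. The name combine_lines and the sample data are made up for illustration.

```python
def combine_lines(lines, n=4):
    """Join every n consecutive lines into one string.

    Assumption: blank lines are ignored, and a final group with
    fewer than n lines is kept as a shorter combined line.
    """
    stripped = [line.strip() for line in lines if line.strip()]
    # Slicing past the end of a list is safe, so no IndexError on the last group.
    return [" ".join(stripped[i:i + n]) for i in range(0, len(stripped), n)]


# Hypothetical sample: five lines, so the last group has only one line.
sample = ["a b c\n", "d e f\n", "g h i\n", "j k l\n", "m n o\n"]
print(combine_lines(sample))
```

To apply it to a file, pass in the result of f.readlines() and write each returned string followed by "\n".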

CodePudding user response：

If the number of rows is a multiple of N, so that the items form a complete rectangular array, you can use a numpy reshape:

import pandas as pd

N = 4
df = pd.read_csv('your_file', sep=r'\s+', header=None)
df2 = pd.DataFrame(df.to_numpy().reshape(-1, N*df.shape[1]))

Otherwise, a pandas reshape is needed:

N = 4
df = (pd.read_csv('your_file', sep=r'\s+', header=None)
   .stack(dropna=False).to_frame()
   .assign(idx=lambda d: d.index.get_level_values(0)//N,
           col=lambda d: d.groupby('idx').cumcount(),
          )
   .pivot(index='idx', columns='col', values=0)
)

Output:

       0       1       2       3       4       5       6       7       8       9       10      11
0   data1   data2   data3   data4   data5   data6   data7   data8   data9  data10  data11  data12
1  data13  data14  data15  data16  data17  data18  data19  data20  data21  data22  data23  data24

CodePudding user response：

You can use numpy. It takes just a single reshape operation on your data:

import numpy as np

# data.txt:
# data1  data2  data3  
# data4  data5  data6  
# data7  data8  data9  
# data10 data11 data12 
# data13 data14 data15 
# data16 data17 data18 
# data19 data20 data21
# data22 data23 data24

data = np.loadtxt('data.txt', dtype='str')
data_reshaped = data.reshape((-1, 12))  # -1 lets numpy infer the row count (2 for this sample)
print(data_reshaped)