Home > database >  Converting specific data from text file to csv using Python
Converting specific data from text file to csv using Python

Time:11-15

I am trying to convert data from a text file into a csv to their corresponding column names.

This is an example of the text file

Run  1    Tbb= 20 C    Volt=3.093   Tamb= 20.13 C   

 \1b2JTriTemp 1.0.5


 AD Averaged 00f2(mV), 0001, 0001, 0001, 0001, 0001, 0001, 0000,
 RAW Values  00f2(xx), FFFF, FFFF, FFFF, FFFF, FFFF, FFFF,
 AD Averaged 0132(mV), 0001, 0001, 0004, 3061, 0002, 0001, 0000,
 RAW Values  0132(xx), 0000, 0002, 0006, 0D0F, 0003, 0000,

When I run this code:

import pandas as pd

with open("sample data as comma.txt", "r") as f:
    data = f.readlines()
with open("sample data as comma.txt", "w") as f:
    for line in data:
        if "RAW" not in line:
            f.write(line)

df = pd.read_csv("sample data as comma.txt", delimiter=',')
df.columns = ['', 'TH ', 'Vacm', 'Vout', 'Bat mon', 'TH-', 'Vbat2', 'Vamb', '']
df.to_csv('Sample raw data CSV.csv')

I get an error "pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 8, saw 9".

It is important to note that I want to be able to keep the line that defines which run it is. E.G "Run 1" along with its Tbb, Volt and Tamb. This can just be on one line before each data set.

Also note that the Run line should be its own seperate row, and not sorted into the columns.

Here is an example of how it should end up: Here is an example of how it should end up

Any help/advice would be greatly appreciated, thanks!

CodePudding user response:

Just do a little manipulation here and add those "blank" values.

So this code looks to see what the max number of columns needed, and then appends those extra "blank" values in those rows.

import pandas as pd

with open("sample data as comma.txt", "r") as f:
    data = f.readlines()
    
data = [x.strip().split(',') for x in data if "RAW" not in x]
max_len = max([len(i) for i in data])

for row in data:
    if len(row) < max_len:
        row  = [''] * (max_len - len(row))

df = pd.DataFrame(data, columns = ['', 'TH ', 'Vacm', 'Vout', 'Bat mon', 'TH-', 'Vbat2', 'Vamb', ''] )
df.to_csv('Sample raw data CSV.csv', index=False)

Output: From your sample data

print(df.to_string())
                                                        TH    Vacm   Vout Bat mon    TH-  Vbat2   Vamb  
0  Run  1    Tbb= 20 C    Volt=3.093   Tamb= 20.13 C                                                    
1                                                                                                       
2                                 \1b2JTriTemp 1.0.5                                                    
3                                                                                                       
4                                                                                                       
5                               AD Averaged 00f2(mV)   0001   0001   0001    0001   0001   0001   0000  
6                               AD Averaged 0132(mV)   0001   0001   0004    3061   0002   0001   0000  
  • Related