I am trying to import a txt file into Python using pandas.
The file I'm trying to import looks more or less like this:
Final Test Values
***************************
Date: Friday, 24 September
Version : Version 3.0(3)
ID L : 1937
ID P : 60
***************************
A ; B ; C ; D ; E ; F
-----------------------------------------------------------------
660 ; 25 ; 5.6478 ; 0.9381 ; 0.67 ; 8.00
661 ; 25 ; 6.2592 ; 0.6103 ; 0.52 ; 8.00
662 ; 25 ; 6.7193 ; 0.5644 ; 0.52 ; 8.00
663 ; 25 ; 4.3940 ; 1.0760 ; 0.54 ; 8.00
664 ; 25 ; 6.4188 ; 0.5507 ; 0.54 ; 8.00
665 ; 25 ; 6.5221 ; 0.5619 ; 0.00 ; 8.00
The values that I am really interested in are just this part:
660 ; 25 ; 5.6478 ; 0.9381 ; 0.67 ; 8.00
661 ; 25 ; 6.2592 ; 0.6103 ; 0.52 ; 8.00
662 ; 25 ; 6.7193 ; 0.5644 ; 0.52 ; 8.00
663 ; 25 ; 4.3940 ; 1.0760 ; 0.54 ; 8.00
664 ; 25 ; 6.4188 ; 0.5507 ; 0.54 ; 8.00
665 ; 25 ; 6.5221 ; 0.5619 ; 0.00 ; 8.00
The date, ID L and ID P can vary every time.
So far I have made my code work by manually opening the txt file and deleting everything above the 660 row, but that is obviously not the best way to do it.
Does anyone have any suggestions?
Thank you!
CodePudding user response:
Use the skiprows argument of read_csv to ignore the rows you don't need:
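from io import StringIO
import pandas as pd

# Stand-in for the txt file; a path to the real file can be passed to read_csv instead.
data = StringIO("""Final Test Values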
***************************
Date: Friday, 24 September
Version : Version 3.0(3)
ID L : 1937
ID P : 60
***************************
A ; B ; C ; D ; E ; F
-----------------------------------------------------------------
660 ; 25 ; 5.6478 ; 0.9381 ; 0.67 ; 8.00
661 ; 25 ; 6.2592 ; 0.6103 ; 0.52 ; 8.00
662 ; 25 ; 6.7193 ; 0.5644 ; 0.52 ; 8.00
663 ; 25 ; 4.3940 ; 1.0760 ; 0.54 ; 8.00
664 ; 25 ; 6.4188 ; 0.5507 ; 0.54 ; 8.00
665 ; 25 ; 6.5221 ; 0.5619 ; 0.00 ; 8.00
""")
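# Skip the seven header-block lines (0-6) and the dashed rule (8); line 7 holds the column names.
# The regex separator also strips the spaces around each ";" so the columns are named "A", "B", ...
df = pd.read_csv(data, sep=r"\s*;\s*", engine="python", skiprows=[0, 1, 2, 3, 4, 5, 6, 8])
print(df)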
     A   B       C       D     E    F
0  660  25  5.6478  0.9381  0.67  8.0
1  661  25  6.2592  0.6103  0.52  8.0
2  662  25  6.7193  0.5644  0.52  8.0
3  663  25  4.3940  1.0760  0.54  8.0
4  664  25  6.4188  0.5507  0.54  8.0
5  665  25  6.5221  0.5619  0.00  8.0
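The hard-coded row numbers only work if the header block always has the same number of lines. If that is not guaranteed, one possible alternative is to look for the dashed rule and read whatever follows it. The sketch below is not part of the answer above; the helper name read_values is made up for illustration, and it assumes the column names always sit on the line directly above the dashed rule.

```python
from io import StringIO
import pandas as pd


def read_values(text):
    """Parse the ';'-separated table that follows the dashed rule (hypothetical helper)."""
    lines = text.splitlines()
    # Index of the first non-empty line that consists only of dashes.
    rule = next(
        i for i, line in enumerate(lines)
        if line.strip() and set(line.strip()) == {"-"}
    )
    # Assumption: the column names are on the line just above the rule.
    names = [name.strip() for name in lines[rule - 1].split(";")]
    body = "\n".join(lines[rule + 1:])
    # The regex separator strips the spaces around each ';'.
    return pd.read_csv(StringIO(body), sep=r"\s*;\s*", engine="python", names=names)


sample = """Final Test Values
***************************
Date: Friday, 24 September
Version : Version 3.0(3)
ID L : 1937
ID P : 60
***************************
A ; B ; C ; D ; E ; F
-----------------------------------------------------------------
660 ; 25 ; 5.6478 ; 0.9381 ; 0.67 ; 8.00
661 ; 25 ; 6.2592 ; 0.6103 ; 0.52 ; 8.00
662 ; 25 ; 6.7193 ; 0.5644 ; 0.52 ; 8.00
663 ; 25 ; 4.3940 ; 1.0760 ; 0.54 ; 8.00
664 ; 25 ; 6.4188 ; 0.5507 ; 0.54 ; 8.00
665 ; 25 ; 6.5221 ; 0.5619 ; 0.00 ; 8.00
"""

df = read_values(sample)
print(df)
```

For a file on disk, the text could be read first with open(path).read(), where path stands for wherever the txt file lives.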