Python: extract position-dependent strings from .txt and save them to different columns of a datafra

Time:08-11

I have a .txt file (output.txt) from which I want to use specific strings. The required strings start at position 13 and go to the end of a line. I would like to save them to different columns of a dataframe.

I created an empty dataframe with 4 columns:

cameras = pd.DataFrame(columns=['name', 'altitude', 'latitude', 'longitude']) 
 

and I have tried to assign the strings to different columns

with open('output.txt','r') as f:
        for line in f.readlines():
            if line.startswith('name'):
                cameras['name'] = line[13:-1]
            if line.startswith('NN'):
                cameras['altitude'] = line[13:-1]
            if line.startswith('lat'):
                cameras['latitude'] = line[13:-1]
            if line.startswith('lon'):
                cameras['longitude'] = line[13:-1]

But the dataframe is still empty. I guess it's an easy problem to fix. Thanks in advance!

CodePudding user response:

You can collect the data as a list of tuples of the form (<name>, <altitude>, <latitude>, <longitude>).

Then use pd.DataFrame.from_records() to create the dataframe.
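A minimal sketch of this approach, assuming every record in output.txt consists of four consecutive lines starting with the prefixes from the question ('name', 'NN', 'lat', 'lon') and that the value begins at position 13. The sample text and the helper name parse_cameras are made up for illustration. As a side note, assigning a single string to a column of a dataframe that has no rows broadcasts it over zero rows, which is why the attempt in the question leaves the dataframe empty.

```python
import io

import pandas as pd

COLUMNS = ["name", "altitude", "latitude", "longitude"]
PREFIXES = ("name", "NN", "lat", "lon")

# Hypothetical sample data: each key is padded to 13 characters so that
# the value starts at position 13, as described in the question.
SAMPLE = "".join(
    key.ljust(13) + value + "\n"
    for key, value in [
        ("name", "Camera A"), ("NN", "512"), ("lat", "47.1"), ("lon", "11.3"),
        ("name", "Camera B"), ("NN", "830"), ("lat", "46.9"), ("lon", "11.8"),
    ]
)


def parse_cameras(lines):
    """Group every four matching lines into one tuple, then build the frame."""
    records = []
    current = []
    for line in lines:
        if not line.startswith(PREFIXES):
            continue  # skip lines that carry none of the four fields
        current.append(line[13:].rstrip("\n"))
        if len(current) == len(COLUMNS):
            records.append(tuple(current))
            current = []
    return pd.DataFrame.from_records(records, columns=COLUMNS)


# With a real file: with open("output.txt") as f: cameras = parse_cameras(f)
cameras = parse_cameras(io.StringIO(SAMPLE))
print(cameras)
```

All values stay strings here; converting altitude, latitude and longitude to numbers (for example with pd.to_numeric) would be a separate step.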

There are several pitfalls to be aware of. This approach assumes that the input lines always come in the order 'name', 'altitude', 'latitude', 'longitude'. If a line is missing or the order changes, values will end up in the wrong columns, so validate the data strictly.
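One way such a validation could look, as a sketch rather than a definitive implementation: each line is matched to its column by prefix instead of by position in the sequence, and a ValueError is raised when a field repeats before the record is complete or when the file ends mid-record. The prefix-to-column mapping follows the question; the function name parse_validated and the sample data are hypothetical.

```python
import io

import pandas as pd

# Line prefix -> dataframe column, taken from the question's code.
FIELDS = {"name": "name", "NN": "altitude", "lat": "latitude", "lon": "longitude"}
COLUMNS = list(FIELDS.values())


def parse_validated(lines):
    """Build one tuple per record and fail loudly on missing or repeated fields."""
    records = []
    current = {}
    for lineno, line in enumerate(lines, start=1):
        for prefix, column in FIELDS.items():
            if line.startswith(prefix):
                break
        else:
            continue  # line carries none of the four fields
        if column in current:
            raise ValueError(
                f"line {lineno}: '{column}' repeated before the record was complete"
            )
        current[column] = line[13:].rstrip("\n")
        if len(current) == len(COLUMNS):
            records.append(tuple(current[c] for c in COLUMNS))
            current = {}
    if current:
        missing = [c for c in COLUMNS if c not in current]
        raise ValueError(f"incomplete final record, missing: {missing}")
    return pd.DataFrame.from_records(records, columns=COLUMNS)


# Hypothetical sample with the fields in a different order than expected.
shuffled = "".join(
    key.ljust(13) + value + "\n"
    for key, value in [("lat", "47.1"), ("name", "Camera A"), ("lon", "11.3"), ("NN", "512")]
)
cameras = parse_validated(io.StringIO(shuffled))
print(cameras)
```

Because fields are stored by column name, a record whose lines arrive in a different order still lands in the right columns, while a dropped line surfaces as an error instead of silently shifting values.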

Please refer to https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.from_records.html
