I'm trying to import a text file and have the comma-delimited rows parsed into columns, without losing the lines that can't be delimited. Here's an example:
Some text for the title
some more text for a description
some more descriptions.
City,State,Capital
Philadelphia,Pennsylvania,No
Sacramento,California,Yes
New York,New York,No
Austin,Texas,Yes
Miami,Florida,No
The portion with commas would be delimited.
I've tried a few things.
This raises a tokenizing error:
pd.read_csv('file.txt', sep=',')
This works, but the table doesn't start on the same line in every text file, and I'd like to keep the information that comes before it:
pd.read_csv('file.txt', skiprows=x)
Is there some parameter I could pass to get this working?
CodePudding user response:
You could split the text file and read each part separately, using pd.read_csv for the table part.
As far as I know, though, if it has to stay a single file, you should just read it with readline()
and readlines() plus a few conditions.
Try it with this:
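import pandas as pd

with open('your_textfile.txt', 'r') as f:
    # Collect the free-text lines before the table.
    # This assumes a blank line separates them from the table.
    some_information = []
    row = f.readline()
    while row and row != '\n':  # row is '' at end of file, which also ends the loop
        some_information.append(row.strip())
        row = f.readline()
    # Everything after the blank line is the comma-separated table.
    data = [x.strip().split(',') for x in f.readlines()]

df = pd.DataFrame(data[1:], columns=data[0])  # the first table line is the header
print(some_information, data, df, sep='\n\n')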
['Some text for the title', 'some more text for a description', 'some more descriptions.']
[['City', 'State', 'Capital'],
['Philadelphia', 'Pennsylvania', 'No'],
['Sacramento', 'California', 'Yes'],
['New York', 'New York', 'No'],
['Austin', 'Texas', 'Yes'],
['Miami', 'Florida', 'No']]
           City         State Capital
0  Philadelphia  Pennsylvania      No
1    Sacramento    California     Yes
2      New York      New York      No
3        Austin         Texas     Yes
4         Miami       Florida      No
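If the files have no blank line between the description and the table, a variation might be to treat the first line that contains the delimiter as the header. This is only a sketch under that assumption (it would misfire if a description line contains a comma); the function name split_preamble_and_table and the sample text are made up for illustration:

```python
import io

import pandas as pd


def split_preamble_and_table(text, sep=","):
    """Return (preamble_lines, DataFrame) from mixed free-text/CSV content."""
    lines = text.splitlines()
    # Assumption: the first line containing the separator is the table header.
    start = next((i for i, line in enumerate(lines) if sep in line), len(lines))
    preamble = [line.strip() for line in lines[:start] if line.strip()]
    table_text = "\n".join(lines[start:])
    if table_text.strip():
        df = pd.read_csv(io.StringIO(table_text), sep=sep)
    else:
        df = pd.DataFrame()  # no delimited lines were found
    return preamble, df


sample = """Some text for the title
some more text for a description
some more descriptions.
City,State,Capital
Philadelphia,Pennsylvania,No
Sacramento,California,Yes
"""

preamble, df = split_preamble_and_table(sample)
print(preamble)
print(df)
```

Handing the table part to pd.read_csv through io.StringIO keeps pandas' own parsing (quoting, dtypes) instead of a manual split(',').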
CodePudding user response:
I used this and it worked correctly with the text file you provided:
df = pd.read_csv(filepath, skiprows=4)
df
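Since the table doesn't start on the same line in every file, the skiprows value could also be computed instead of hard-coded. A rough sketch, assuming the header line is known in advance and always reads City,State,Capital; find_header_row and the temporary demo file are hypothetical, not part of pandas:

```python
import os
import tempfile

import pandas as pd


def find_header_row(path, header="City,State,Capital"):
    """Return the 0-based index of the line equal to `header`."""
    with open(path, "r") as f:
        for i, line in enumerate(f):
            if line.strip() == header:
                return i
    raise ValueError(f"header line {header!r} not found in {path}")


# Build a small demo file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Title line\nDescription line\nCity,State,Capital\nAustin,Texas,Yes\n")
    demo_path = tmp.name

n_skip = find_header_row(demo_path)          # number of lines before the header
df = pd.read_csv(demo_path, skiprows=n_skip)  # skip exactly those lines
os.remove(demo_path)
print(df)
```

The skipped lines can still be read separately with open() if the description text needs to be kept.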