I have a .txt (notepad) file called Log1. It has the following saved in it: [1, 1, 1, 0]
When I write a program to retrieve the data:
import pandas as pd
Log1 = pd.read_csv('Path...\\Log1.txt')
Log1 = list(Log1)
print(Log1)
It prints: ['[1', ' 1', ' 1.1', ' 0]']
I don't understand where the ".1" on the third number is coming from. It's not in the text file; it just gets added.
Funnily enough, if I change the numbers in the text file to [1, 0, 1, 1], it does not add the .1. It prints ['[1', ' 0', ' 1', ' 1]']
Very odd that it acts this way. Does anyone have an idea?
CodePudding user response:
This should work. Could you please try this:
log2 = log1.values.tolist()
Output:
[['1'], ['1'], ['1'], ['0']]
CodePudding user response:
Your data is not in CSV format. A CSV file would instead contain something like
1;1;0;1
or similar.
If you have multiple lines like this, it might make sense to parse them as CSV; otherwise I'd parse the line with a regexp and call .split on the result.
Proposal: Add a bigger input example and your expected output.
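For a single line like the one in the question, the regexp-plus-split idea could look roughly like this. It is only a sketch: the sample string is copied from the question, and the pattern assumes exactly one bracketed, comma-separated list of integers per line.

```python
import re

# Sample line copied from the question; in practice it would come from the file.
line = "[1, 1, 1, 0]"

# Capture everything between the first "[" and the next "]".
match = re.search(r"\[(.*?)\]", line)

# Split the captured text on commas; int() tolerates the surrounding spaces.
values = [int(part) for part in match.group(1).split(",")] if match else []

print(values)
```

If the file later grows to several such lines, the same pattern can be applied to each line in a loop.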
CodePudding user response:
Solved it with the input from above. It was just pandas' interpretation of the data that was messing up the output:
Log4 = []
with open('path...\\Log4.txt') as f:
    Log4 = f.readlines()
print(Log4)
It prints ['[1, 1, 1, 0]']
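For anyone wondering where the ".1" came from: by default read_csv treats the first line as the header row, so the four pieces of the line become column names, and pandas renames duplicate column names by appending ".1", ".2" and so on. The sketch below reproduces that without a file; the helper name header_names is made up for illustration, and io.StringIO stands in for the text file.

```python
import io
import pandas as pd

def header_names(text):
    # With the default header setting, the first line becomes the column names.
    df = pd.read_csv(io.StringIO(text))
    return list(df.columns), len(df)

# Two columns are both named ' 1', so the second one is renamed to ' 1.1'.
columns, row_count = header_names("[1, 1, 1, 0]\n")
print(columns)
print(row_count)  # no data rows remain, only the header

# Here ' 1' and ' 1]' are different names, so nothing is renamed.
other_columns, _ = header_names("[1, 0, 1, 1]\n")
print(other_columns)
```

So list(Log1) in the question was returning the column names, not the data.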
CodePudding user response:
Well, I worked out some other options as well, just for the record:
Solution 1 (plain read - this one gets a list of strings, one per line)
log4 = []
with open('log4.txt') as f:
    # readlines() returns each line of the file as a string
    log4 = f.readlines()
print(log4)
Solution 2 (convert to list of ints)
import ast
with open('log4.txt', 'r') as f:
    # literal_eval parses the text as a Python literal, here a list of ints
    inp = ast.literal_eval(f.read())
print(inp)
Solution 3 (old school string parsing - put the values in a dataframe, then convert them to ints)
import pandas as pd
with open('log4.txt', 'r') as f:
    # strip() drops a trailing newline, if the file has one
    mylist = f.read().strip()
# remove the brackets and spaces, then split on the commas
mylist = mylist.replace('[','').replace(']','').replace(' ','')
mylist = mylist.split(',')
df = pd.DataFrame({'Col1': mylist})
df['Col1'] = df['Col1'].astype(int)
print(df)
Other ideas here as well:
https://docs.python-guide.org/scenarios/serialization/
In general, reading from a text file (deserializing) is easier if the file was written in a well-structured format in the first place: a csv file, pickle file, json file, etc. In this case ast.literal_eval() worked well because the data was written out as a list using its __repr__ format, though honestly I'd never done that before, so it was an interesting solution to me as well :)
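As a rough illustration of the structured-format point, here is a minimal sketch that writes the list as JSON and reads it back. The file name log1.json is made up, and a temporary directory is used only so the example cleans up after itself.

```python
import json
import os
import tempfile

data = [1, 1, 1, 0]

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name, used only for this demo.
    path = os.path.join(tmp, "log1.json")

    # Write the list in a structured format...
    with open(path, "w") as f:
        json.dump(data, f)

    # ...so reading it back needs no manual string parsing.
    with open(path) as f:
        restored = json.load(f)

print(restored)
```

The values come back as a list of ints, with no bracket stripping or splitting needed.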