I have data like this in data.txt:
Product list
Name Quantity Price
Iphone11 5 14000000.0
SS note 10 4 13000000.0
Nokia C100 1 20000000.0
And this is my code to filter the data:
fname = open("Data.txt")
num_line = 0
for line in fname:
    num_line += 1
    # skip the two header lines
    if num_line == 1 or num_line == 2:
        continue
    data = line.strip().split()
    # skip blank lines
    if data == []:
        continue
    print(data)
And this is the result I get:
['Iphone11', '5', '14000000.0']
['SS', 'note', '10', '4', '13000000.0']
['Nokia', 'C100', '1', '20000000.0']
I want the output in this format so I can put it into my database:
['Iphone11', '5', '14000000.0']
['SS note 10', '4', '13000000.0']
['Nokia C100', '1', '20000000.0']
Please help me.
CodePudding user response:
You can use str.rsplit() to split from the right. Set sep=None to split on any whitespace, and set maxsplit=2 to perform at most two splits, so any whitespace further to the left stays inside the first field (the product name). It's also better to open the file using with, as below; otherwise you need to close the file yourself after reading.
with open("Data.txt") as fname:
    num_line = 0
    for line in fname:
        num_line += 1
        # skip the two header lines
        if num_line == 1 or num_line == 2:
            continue
        data = line.rsplit(maxsplit=2)  # <-> line.rsplit(sep=None, maxsplit=2)
        # skip blank lines
        if data == []:
            continue
        print(data)
['Iphone11', '5', '14000000.0']
['SS note 10', '4', '13000000.0']
['Nokia C100', '1', '20000000.0']
Explanation:
Signature: str.rsplit(self, /, sep=None, maxsplit=-1)
Docstring:
sep -> The delimiter according which to split the string. None (the default value) means split according to any whitespace, and discard empty strings from the result.
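To see why splitting from the right matters here, it may help to compare split() and rsplit() on one of the sample lines. This is only a small sketch; the helper name parse_line is made up for illustration.

```python
def parse_line(line):
    """Split a product line into [name, quantity, price], working from the right."""
    # sep=None (the default) splits on runs of whitespace; maxsplit=2 stops
    # after two splits, so everything left of the quantity stays together.
    return line.rsplit(maxsplit=2)

sample = "SS note 10 4 13000000.0\n"

print(sample.split())            # ['SS', 'note', '10', '4', '13000000.0']
print(parse_line(sample))        # ['SS note 10', '4', '13000000.0']
print(parse_line("   \n"))       # [] (a blank line gives an empty list)
```

Because the trailing newline counts as whitespace, no separate strip() call is needed before rsplit().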
CodePudding user response:
If you cannot modify your data to be properly formatted with commas, then parsing by space will "break up" names with spaces in them.
So, what you can do is use a starred target ("*name") in the unpacking assignment to gather all the parts of the name, and then join them back together as shown below.
(the data variable simulates reading the lines from the file)
data = ['iphone11 5 1100.0', 'nokia 5 plus 2 1220.0', 'batphone 13 extra plus 3 2000.0']
for line in data:
    # *name collects every field except the last two
    *name, qty, price = line.split()
    name = ' '.join(name)
    print(name)
    print(f' qty: {qty}, price: {price}')
Output:
iphone11
qty: 5, price: 1100.0
nokia 5 plus
qty: 2, price: 1220.0
batphone 13 extra plus
qty: 3, price: 2000.0
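Since the goal is to load the rows into a database, the quantity and price will probably need converting from strings to numbers. A minimal sketch of that step, assuming the quantity is always an integer and the price a float; parse_product is a hypothetical helper, not part of the answer above.

```python
def parse_product(line):
    """Return (name, quantity, price) from a whitespace-separated line."""
    # The starred target collects every field except the last two.
    *name_parts, qty, price = line.split()
    return " ".join(name_parts), int(qty), float(price)

lines = ["iphone11 5 1100.0", "nokia 5 plus 2 1220.0"]
rows = [parse_product(line) for line in lines]
print(rows)  # [('iphone11', 5, 1100.0), ('nokia 5 plus', 2, 1220.0)]
```

A line with fewer than two fields raises ValueError at the unpacking step, so blank lines should be skipped before calling the helper.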
CodePudding user response:
Take a look at Pandas!
import pandas as pd
from io import StringIO

# read_fwf expects fixed-width columns, so each field is padded
# to the widths given below (20, 8 and 10 characters).
s = """Product list
Name                Quantity     Price
Iphone11            5       14000000.0
SS note 10          4       13000000.0
Nokia C100          1       20000000.0"""
widths = [20, 8, 10]
df = pd.read_fwf(StringIO(s), widths=widths, header=None, skiprows=2)
read_fwf has many options that cover most fixed-width layouts.
>>> df
            0  1           2
0    Iphone11  5  14000000.0
1  SS note 10  4  13000000.0
2  Nokia C100  1  20000000.0
If you want to capture the headers then try this:
df = pd.read_fwf(StringIO(s), widths=widths, header=0, skiprows=1)
>>> df
         Name  Quantity       Price
0    Iphone11         5  14000000.0
1  SS note 10         4  13000000.0
2  Nokia C100         1  20000000.0
CodePudding user response:
By default, the string split method splits on any whitespace.
If the lines read from data.txt use a different separator, for instance a comma (,), try this:
data = line.strip().split(",")
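If the file really is comma-separated, the standard library's csv module is an alternative to calling split(",") by hand, since it also copes with quoted fields. A sketch, assuming a comma-delimited version of the same data; the sample text below is invented for illustration, and StringIO stands in for the opened file.

```python
import csv
from io import StringIO

# Hypothetical comma-separated version of the data from the question.
text = (
    "Product list\n"
    "Name,Quantity,Price\n"
    "Iphone11,5,14000000.0\n"
    "SS note 10,4,13000000.0\n"
)

reader = csv.reader(StringIO(text))
# Drop empty rows first, then skip the two header lines.
rows = [row for row in reader if row][2:]
print(rows)  # [['Iphone11', '5', '14000000.0'], ['SS note 10', '4', '13000000.0']]
```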