I am getting a list from an Excel file using pandas.
start_path = r'C:\scratch\\'
File = 'test.xlsx'
import pandas as pd
mylist = []
df = pd.read_excel(start_path + File, sheet_name='GIS')
mylist = df['Column A'].tolist()
List:
mylist = ['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56', 'ABC']
My goal is to then create a new list from this list, only with the elements that start with LB. So the new list would then be:
newlist = ['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56']
Or just remove all elements from a list that do not start with 'LB' (thus removing ABC from the list).
newlist = [str(x for x in mylist if "LB" in x)]
I tried the above and this just spits out:
['<generator object <genexpr> at 0x0000024B5B8F62C8>']
I also have tried the following:
approved = ['LB']
mylist[:] = [str(x for x in mylist if any(sub in x for sub in approved))]
This gets the same generator object message as before.
I feel like this is really simple but cannot figure it out.
CodePudding user response:
You can use str.startswith in a list comprehension:
mylist = ["LB-52/LP-7", "LB-53/LI-5", "LB-54/LP-8", "LB-55", "LB-56", "ABC"]
newlist = [value for value in mylist if value.startswith("LB")]
print(newlist)
Prints:
['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56']
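If the prefixes live in a list, as with the approved list in the question, str.startswith also accepts a tuple of prefixes. A minimal sketch, assuming approved holds plain string prefixes (the name prefixes is only illustrative):

```python
mylist = ["LB-52/LP-7", "LB-53/LI-5", "LB-54/LP-8", "LB-55", "LB-56", "ABC"]
approved = ["LB"]

# str.startswith takes a single string or a tuple of strings, not a list,
# so convert the list of approved prefixes first.
prefixes = tuple(approved)

newlist = [value for value in mylist if value.startswith(prefixes)]
print(newlist)
```

Adding more entries to approved then widens the filter without changing the comprehension.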
You can also remove str() from newlist = [str(x for x in mylist if "LB" in x)] to get a real list comprehension, but the "LB" in x test will still keep values such as xxxLBxxx (where LB is inside the string rather than at the start).
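To see where the generator object message comes from, here is a small sketch. The extra 'xxxLBxxx' element is added purely for illustration and is not in the original data:

```python
mylist = ["LB-52/LP-7", "LB-55", "ABC", "xxxLBxxx"]

# Parentheses around "x for x in ..." create a generator expression.
gen = (x for x in mylist if "LB" in x)

# str() does not iterate the generator; it returns its repr, so the
# surrounding brackets build a list with exactly one string in it.
wrapped = [str(gen)]
print(len(wrapped))

# Without str(), the brackets form a list comprehension. The "in" test
# matches "LB" anywhere in the string, so 'xxxLBxxx' is kept.
contains = [x for x in mylist if "LB" in x]

# startswith only matches at the beginning, so 'xxxLBxxx' is dropped.
starts = [x for x in mylist if x.startswith("LB")]

print(contains)
print(starts)
```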
CodePudding user response:
newlist = [x for x in mylist if x[0:2]=="LB"]
You can also use slicing: compare the first two characters of each element against "LB". The same idea works for any other range of indices you want to check.
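A small variation on the slicing approach, as a sketch: deriving the slice length from the prefix avoids hard-coding the 2, so the prefix can change without touching the comparison. The variable name prefix is just an illustrative choice.

```python
mylist = ["LB-52/LP-7", "LB-53/LI-5", "LB-54/LP-8", "LB-55", "LB-56", "ABC"]
prefix = "LB"

# Slicing never raises on short strings: "A"[:2] is simply "A",
# so elements shorter than the prefix are filtered out safely.
newlist = [x for x in mylist if x[:len(prefix)] == prefix]
print(newlist)
```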
CodePudding user response:
In general, vectorized pandas operations tend to be faster than an equivalent Python loop, especially on large data. So it is usually worth doing the filtering in pandas before "moving over" to a Python list with tolist().
mylist = df.loc[df['Column A'].str.startswith("LB"), 'Column A'].tolist()
mylist
>>> ['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56']
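One caveat worth sketching: if the column contains empty cells, the mask from str.startswith can contain missing values, and boolean indexing may then fail. Passing na=False treats missing cells as non-matching. The DataFrame below is built in memory as a stand-in for the Excel sheet, with an empty cell added as an assumption for illustration:

```python
import pandas as pd

# Stand-in for the sheet read from Excel; None plays the role of an empty cell.
df = pd.DataFrame({"Column A": ["LB-52/LP-7", "LB-55", None, "ABC"]})

# na=False makes missing values count as "does not start with LB",
# so the mask is purely boolean and safe to use with .loc.
mask = df["Column A"].str.startswith("LB", na=False)

mylist = df.loc[mask, "Column A"].tolist()
print(mylist)
```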
CodePudding user response:
mylist = ['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56', 'ABC']
new_list = [*filter(lambda x: x.startswith('LB'), mylist)]
# ['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56']
CodePudding user response:
This code works:
mylist = ['LB-52/LP-7', 'LB-53/LI-5', 'LB-54/LP-8', 'LB-55', 'LB-56', 'ABC']
newlist = [ch for ch in mylist if ch[0:2] == "LB"]