I finally gave up finding a nice compact pandas answer and did this:
distinct_states = []
states_list = self.activities_df["park_states"].unique()
for state in states_list:
    if len(state) == 2:
        distinct_states.append(state)
All the other ways I tried to do this with pandas didn't work. I tried this type of logic:
states_list = df[df.column_name.str.len() == 2]
distinct_states = states_list['column_name']
I also tried this type of logic:
list = df[(df['col_1'].str.len() == 2)]
I've tried multiple other things and many different combinations of the above, and I can't get the syntax correct.
What's the best way to do what I'm wanting using pandas?
CodePudding user response:
You need to use .loc with a boolean mask and the column label, then convert the result to a list:
my_list = df.loc[df['col_1'].str.len() == 2, 'col_1'].values.tolist()
For example:
import pandas as pd
df = pd.DataFrame({'col_1':['a','ab','ac','abc','acb','ad']})
length_to_find = 2
my_list = df.loc[df['col_1'].str.len() == length_to_find, 'col_1'].values.tolist()
print(my_list)
Output:
['ab', 'ac', 'ad']
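If you only want the distinct values, as in the loop from the question, one option is to chain .unique() before converting to a list. This is a minimal sketch; the column name and sample data are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data with repeated values.
df = pd.DataFrame({'col_1': ['ab', 'a', 'ab', 'abc', 'ad', 'ad']})

# Boolean mask: True where the string has exactly two characters.
mask = df['col_1'].str.len() == 2

# Select the matching rows of the column, drop duplicates, and convert to a list.
distinct = df.loc[mask, 'col_1'].unique().tolist()
print(distinct)  # ['ab', 'ad']
```

.unique() keeps values in order of first appearance, so the result follows the order of the original column.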
Note:
Depending on the contents of the column, you may need to convert it to strings before extracting the list. The .str accessor raises an AttributeError on a column that does not hold string data (a numeric column, for example).
df['col_1'] = df['col_1'].astype('str')
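A small sketch of that conversion, assuming a made-up column that mixes strings and integers. Be aware that after the conversion a number such as 10 becomes the string '10' and will match a length of 2 as well:

```python
import pandas as pd

# Hypothetical data: strings mixed with integers.
df = pd.DataFrame({'col_1': ['ab', 10, 'xyz', 'CA', 5]})

# Convert every value to its string form so .str.len() applies to each row.
as_text = df['col_1'].astype('str')

# Keep only the values whose string form has exactly two characters.
my_list = as_text[as_text.str.len() == 2].tolist()
print(my_list)  # ['ab', '10', 'CA']
```

If values like '10' should not count, filter them out before or after the conversion.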
CodePudding user response:
I would rewrite it as a list comprehension with a condition, using the syntax my_variable = [x for x in iterable if condition]:
distinct_states = [state for state in self.activities_df["park_states"].unique() if len(state) == 2]
But since I don't have more information that's the best I could come up with.
If you provide more data in your question, I'm sure we could figure it out.
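In the meantime, here is a self-contained sketch of the comprehension pattern. The activities_df below is invented to stand in for self.activities_df, on the assumption that park_states holds either a single two-letter code or a comma-separated list of codes:

```python
import pandas as pd

# Hypothetical stand-in for self.activities_df.
activities_df = pd.DataFrame(
    {"park_states": ["CA", "CA,NV", "UT", "CA", "WY,MT,ID", "UT"]}
)

# The comprehension builds the list itself, so there is no need to call append.
distinct_states = [
    state
    for state in activities_df["park_states"].unique()
    if len(state) == 2
]
print(distinct_states)  # ['CA', 'UT']
```

Calling append inside a comprehension would instead collect the return value of append, which is always None.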