Using .str.contains to filter a df of FRED series

Time:01-06

I am trying to download a data series for each state from the FRED API. I have loaded all the data series containing 'Housing Inventory: Active Listing Count state' into a df; however, there are still over 1000 rows. Is there a way I can search the title of each series to see if it contains the name of a state?

I have tried

df=df.loc[df['title'].str.contains(["Alaska","Alabama",...,"Wyoming"])]

Series ID = ACTLISCOU

CodePudding user response:

Assuming you have a list of all the states, you can define a custom function that checks a single title and apply it to the title column with pd.Series.apply:

state_list = ["Alaska","Alabama",...,"Wyoming"]
def my_filter(value):
    # return True if any state is in the value
    return any(state in value for state in state_list)

# Call apply to filter DF based on True|False by your filter
df_filtered = df[df['title'].apply(my_filter)]
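As an alternative closer to the original attempt: .str.contains expects a single string pattern rather than a list, so the state names can be joined into one regular expression with "|". The sketch below is a minimal illustration under assumptions: the sample titles are hypothetical stand-ins for what the FRED API returns, and the state list is shortened.

```python
import re
import pandas as pd

# Hypothetical sample titles; the real df would come from the FRED API.
df = pd.DataFrame({"title": [
    "Housing Inventory: Active Listing Count in Alabama",
    "Housing Inventory: Active Listing Count in Akron, OH (CBSA)",
    "Housing Inventory: Active Listing Count in Wyoming",
]})

# Shortened list for illustration; use the full list of states in practice.
state_list = ["Alaska", "Alabama", "Wyoming"]

# Build one alternation pattern, escaping each name so it is matched literally.
pattern = "|".join(re.escape(state) for state in state_list)

# na=False treats missing titles as non-matches instead of propagating NaN.
mask = df["title"].str.contains(pattern, regex=True, na=False)
df_filtered = df[mask]
```

Note that this is still substring matching, so a title for a county or metro area that happens to include a state name (or "West Virginia" when searching for "Virginia") would also pass the filter and may need an extra check.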

CodePudding user response:

The following code returns the state named in an ACTLISCOUXX dataset (the code calls it a country), in this case California:

import pandas as pd
df = pd.read_csv('ACTLISCOUCA.csv',sep=';',header=None)
us_country_list=["Arizona","California","Oregon"]
country=[i for i in us_country_list if i in df.dropna().iloc[0][1]][0]
print(country)

How it works

  1. The CSV file is imported as a Pandas dataframe, with no header row and ';' as the separator.
  2. A list comprehension builds a list of the states involved by checking which names from the list of US states appear in the second column of the first row of the dataframe (after rows with missing values are dropped). This list should contain only one element if only one state is mentioned. Only the first element of the list is saved in the country variable.
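The two steps above can be sketched without a file on disk. This is only an illustration under assumptions: the CSV content below is hypothetical (the real layout of ACTLISCOUCA.csv may differ), and the fallback to None when nothing matches is an addition, since indexing an empty list with [0] would raise an IndexError.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for ACTLISCOUCA.csv.
csv_text = (
    "DATE;Housing Inventory: Active Listing Count in California\n"
    "2016-07-01;100000\n"
)

# Step 1: read the CSV with no header row and ';' as the separator.
df = pd.read_csv(io.StringIO(csv_text), sep=";", header=None)

us_state_list = ["Arizona", "California", "Oregon"]

# Step 2: take the second column of the first complete row and
# keep every state whose name appears in it.
title = df.dropna().iloc[0][1]
matches = [state for state in us_state_list if state in title]

# Keep the first match, or None if no state name was found.
state = matches[0] if matches else None
print(state)
```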