rand_samp = fips.sample(n=20)
rand_samp.reset_index(inplace=True)
print(rand_samp)
countys = rand_samp['CountyID'].to_list()
mask = ((temp['CountyID'] == countys[0]) | (temp['CountyID'] == countys[1]) | (temp['CountyID'] == countys[2]) | (temp['CountyID'] == countys[3]) | (temp['CountyID'] == countys[4]) | (temp['CountyID'] == countys[5]) | (temp['CountyID'] == countys[6]) | (temp['CountyID'] == countys[7]) | (temp['CountyID'] == countys[8]) | (temp['CountyID'] == countys[9]) | (temp['CountyID'] == countys[10]) | (temp['CountyID'] == countys[11]) | (temp['CountyID'] == countys[12]) | (temp['CountyID'] == countys[13]) | (temp['CountyID'] == countys[14]) | (temp['CountyID'] == countys[15]) | (temp['CountyID'] == countys[16]) | (temp['CountyID'] == countys[17]) | (temp['CountyID'] == countys[18]) | (temp['CountyID'] == countys[19]))
temp_rand = temp[mask]
temp_rand.reset_index(inplace=True)
temp_rand
I am looking for a better way to build a mask for my dataset that doesn't involve typing out each comparison individually. I tried using a for loop to go through my list and append the matching rows to a new dataset, but is there another way?
Answer:
isin() is what you are looking for. It checks each value in the column for membership in the list and returns a boolean mask.
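import pandas as pd
# Small example frame; the values are placeholders for the real data
temp = pd.DataFrame([['USA', 1], ['Poland', 8], ['Austria', 3]], columns=['CountyID', 'GDP'])
countys = ['Germany', 'Austria', 'Poland']
# True where the row's CountyID appears anywhere in the list
mask = temp['CountyID'].isin(countys)
print(mask)
# Keep only the rows where the mask is True
temp_rand = temp[mask]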
Output:
0    False
1     True
2     True
Name: CountyID, dtype: bool
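Applied to the setup in the question, the same call replaces the 20 chained comparisons. The following is only a sketch: the contents of fips and temp are made up here (50 hypothetical county IDs with three rows each, and a hypothetical Temp column), and random_state is added just so the run is repeatable.

```python
import pandas as pd

# Hypothetical stand-ins for the question's `fips` and `temp` frames.
fips = pd.DataFrame({"CountyID": range(1000, 1050)})
temp = pd.DataFrame({
    "CountyID": [cid for cid in range(1000, 1050) for _ in range(3)],
    "Temp": range(150),
})

# Draw 20 random counties; random_state only makes the sample repeatable.
rand_samp = fips.sample(n=20, random_state=0).reset_index(drop=True)
countys = rand_samp["CountyID"].to_list()

# One isin() call replaces the 20 chained == comparisons.
mask = temp["CountyID"].isin(countys)

# Filter, then renumber the rows from 0 without keeping the old index.
temp_rand = temp[mask].reset_index(drop=True)

print(temp_rand["CountyID"].nunique())  # 20
print(len(temp_rand))                   # 60
```

Using reset_index(drop=True) on the filtered result, rather than inplace=True on a slice, avoids adding the old index as a column and sidesteps the SettingWithCopyWarning that in-place edits on a slice can raise. To select the rows that are not in the list, negate the mask: temp[~mask].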