I am trying to figure out how to do this but not going any where.
import sqlite3
with sqlite3.connect("/content/drive/MyDrive/simple_db.sqlite") as con:
tmp_df = pd.read_sql_query("SELECT * from simple_zipcode", con)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 42724 entries, 0 to 42723
Data columns (total 24 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 zipcode 42724 non-null object
1 zipcode_type 42724 non-null object
2 major_city 42724 non-null object
3 post_office_city 33104 non-null object
4 common_city_list 41877 non-null object
5 county 42724 non-null object
6 state 42724 non-null object
7 lat 42724 non-null float64
8 lng 42724 non-null float64
9 timezone 42724 non-null object
10 radius_in_miles 33104 non-null float64
11 area_code_list 42724 non-null object
12 population 31448 non-null float64
..
def zco(x):
print(x)
y=tmp_df[tmp_df.zipcode==x]["county"].item()
print("hello",y)
if y is None or y is "":
county="NotFound"
else:
county = y ## or major_cit
return county
data['County'] = data['ZIPCode'].apply(zco)
I want search "data['ZIPCode']" in tmp_df and get the "county" and put that in new column in "data["County"]
CodePudding user response:
Don't have time to work it all out but maybe start with this idea:
data.merge(tmp_df[['zipcode','county']], how='left', left_on='ZIPCode', right_on='zipcode')
Then do a fillna
to change the non-matching entries to something like 'not found'
.
CodePudding user response:
It's keep adding the new col every time i re-run this section of code. i am using google collab.
data=data.merge(tmp_df[['zipcode','county']], how='left', left_on='ZIPCode', right_on='zipcode')
data = data.drop(["zipcode"], axis=1)
data.info()