I have a data set like this one:
date PCP1 PCP2 PCP3 PCP4
1/1/1985 0 -99 -99 -99
1/2/1985 0 -99 -99 -99
1/3/1985 0 0 -99 -99
1/4/1985 0 0 -99 -99
1/5/1985 1 -99 1 1
1/6/1985 0 -99 -99 -99
1/7/1985 0 1 -99 0
1/8/1985 0 2 -99 3
1/9/1985 0 -99 -99 -99
I want to create new data frames, each containing only the date column and one PCP column. For example, df1:
df1 =
date PCP1
1/1/1985 0
1/2/1985 0
1/3/1985 0
1/4/1985 0
1/5/1985 1
1/6/1985 0
1/7/1985 0
1/8/1985 0
1/9/1985 0
and df2...
df2 =
date PCP2
1/1/1985 -99
1/2/1985 -99
1/3/1985 0
1/4/1985 0
1/5/1985 -99
1/6/1985 -99
1/7/1985 1
1/8/1985 2
1/9/1985 -99
and so on for df3 and df4.
Then I want to delete the rows with -99 from each data frame, which would result in:
df1 =
date PCP1
1/1/1985 0
1/2/1985 0
1/3/1985 0
1/4/1985 0
1/5/1985 1
1/6/1985 0
1/7/1985 0
1/8/1985 0
1/9/1985 0
and df2...
df2 =
date PCP2
1/3/1985 0
1/4/1985 0
1/7/1985 1
1/8/1985 2
I'm not sure I did this right. I have written the following code, but I don't know how to remove the rows with -99 inside the for loop.
# first I created a list of PCP column names and a list of data frame names
n_cols = 4
pcp_list = []
df_names = []
for i in range(1,n_cols):
    item = "PCP" + str(i)
    pcp_list.append(item)
    item_df = "df" + str(i)
    df_names.append(item_df)
# and then I have created a new df for each name on the list by creating a dict
dfs ={}
for dfn, name in zip(df_names, pcp_list):
    dfs[dfn] = pd.DataFrame(df, columns=['date', name])
# and then I was hoping I could remove the rows with -99
for df, name in zip(dfs, pcp_list):
    df[name] = dfs[df[name] = -99]
Any help will be appreciated!
Thank you!
CodePudding user response:
You can create the DataFrames in a dictionary comprehension:
d = {k: v[v != -99].reset_index() for k,v in df.set_index('date').to_dict('series').items()}
Creating variables by name is not recommended, but it is possible (start=1 makes the names df1, df2, ... to match the question):
for i, (k, v) in enumerate(df.set_index('date').to_dict('series').items(), start=1):
    globals()[f'df{i}'] = v[v != -99].reset_index()
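
For reference, here is a minimal self-contained sketch of the dictionary approach using the sample data from the question. It is one possible way to do it, not the only one. The names `raw`, `MISSING`, and `dfs` are made up for this illustration, and it assumes the data is space-separated and that -99 is the only missing-value marker.

```python
import io

import pandas as pd

# Sample data copied from the question (assumed to be space-separated).
raw = """date PCP1 PCP2 PCP3 PCP4
1/1/1985 0 -99 -99 -99
1/2/1985 0 -99 -99 -99
1/3/1985 0 0 -99 -99
1/4/1985 0 0 -99 -99
1/5/1985 1 -99 1 1
1/6/1985 0 -99 -99 -99
1/7/1985 0 1 -99 0
1/8/1985 0 2 -99 3
1/9/1985 0 -99 -99 -99
"""
df = pd.read_csv(io.StringIO(raw), sep=" ")

MISSING = -99  # assumption: -99 is the only missing-value marker

# One DataFrame per PCP column, keyed "df1", "df2", ...
# The boolean mask keeps only the rows where that column is not -99.
dfs = {
    f"df{i}": df.loc[df[col] != MISSING, ["date", col]].reset_index(drop=True)
    for i, col in enumerate(df.columns.drop("date"), start=1)
}

print(dfs["df2"])
```

Each frame is then available as `dfs["df1"]`, `dfs["df2"]`, and so on, which avoids creating variables through `globals()`.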