How can I delete rows from dictionaries using pandas

Time:09-27

I have a data set like this one:

date    PCP1    PCP2    PCP3    PCP4
1/1/1985    0   -99 -99 -99
1/2/1985    0   -99 -99 -99
1/3/1985    0   0   -99 -99
1/4/1985    0   0   -99 -99
1/5/1985    1   -99 1   1
1/6/1985    0   -99 -99 -99
1/7/1985    0   1   -99 0
1/8/1985    0   2   -99 3
1/9/1985    0   -99 -99 -99

I want to create new data frames, each containing only the date column and one PCP column. For example, df1:

df1 = 
date    PCP1
1/1/1985    0
1/2/1985    0
1/3/1985    0
1/4/1985    0
1/5/1985    1
1/6/1985    0
1/7/1985    0
1/8/1985    0
1/9/1985    0

and df2...

df2 = 
date    PCP2
1/1/1985    -99
1/2/1985    -99
1/3/1985    0
1/4/1985    0
1/5/1985    -99
1/6/1985    -99
1/7/1985    1
1/8/1985    2
1/9/1985    -99

and so on for df3 and df4.

Then I want to delete the rows containing -99 from each data frame, which would result in:

df1 = 
date    PCP1
1/1/1985    0
1/2/1985    0
1/3/1985    0
1/4/1985    0
1/5/1985    1
1/6/1985    0
1/7/1985    0
1/8/1985    0
1/9/1985    0

and df2...

df2 = 
date    PCP2
1/3/1985    0
1/4/1985    0
1/7/1985    1
1/8/1985    2

I have written the following code, but I'm not sure it is right, and I don't know how to remove the rows with -99 inside the for loop.

# first I created a list of PCP column names and a list of data frame names
n_cols = 4
pcp_list = []
df_names = []
for i in range(1,n_cols):
    item = "PCP" + str(i)
    pcp_list.append(item)
    item_df = "df" + str(i)
    df_names.append(item_df)

# and then I have created a new df for each name on the list by creating a dict
dfs ={}
for dfn, name in zip(df_names, pcp_list):
    dfs[dfn] = pd.DataFrame(df, columns=['date', name])

# and then I was hoping I could remove the rows with -99
for df, name in zip(dfs, pcp_list):
    df[name] = dfs[df[name] = -99]

Any help will be appreciated!

Thank you!

CodePudding user response:

You can create the DataFrames in a dictionary comprehension, keyed by column name:

d = {k: v[v != -99].reset_index() for k,v in df.set_index('date').to_dict('series').items()}
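
To try this out, here is a minimal self-contained sketch. The sample data is copied from the question; building `df` inline from a dict is only an assumption for illustration, since the original was presumably read from a file.

```python
import pandas as pd

# Sample data from the question, built inline (assumption: the real df
# was loaded from a file with the same columns).
df = pd.DataFrame({
    "date": ["1/1/1985", "1/2/1985", "1/3/1985", "1/4/1985", "1/5/1985",
             "1/6/1985", "1/7/1985", "1/8/1985", "1/9/1985"],
    "PCP1": [0, 0, 0, 0, 1, 0, 0, 0, 0],
    "PCP2": [-99, -99, 0, 0, -99, -99, 1, 2, -99],
    "PCP3": [-99, -99, -99, -99, 1, -99, -99, -99, -99],
    "PCP4": [-99, -99, -99, -99, 1, -99, 0, 3, -99],
})

# One Series per PCP column, indexed by date; drop the -99 entries,
# then turn the date index back into a regular column.
d = {k: v[v != -99].reset_index()
     for k, v in df.set_index("date").to_dict("series").items()}

print(d["PCP2"])
```

Each value in `d` is a two-column DataFrame (`date` plus one PCP column), so `d["PCP2"]` plays the role of `df2` in the question.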

Creating variables by name is not recommended, but it is possible:

# start=1 so the variables are named df1, df2, ... as in the question
for i, (k, v) in enumerate(df.set_index('date').to_dict('series').items(), start=1):
    globals()[f'df{i}'] = v[v != -99].reset_index()
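
If you would rather stay close to the loop in the question, a boolean mask on each two-column frame does the filtering. This is only a sketch: the inline sample data and the `df1`..`df4` dictionary keys are assumptions taken from the question, and the frames are stored in a dict instead of global variables.

```python
import pandas as pd

# Sample data from the question, built inline for illustration.
df = pd.DataFrame({
    "date": ["1/1/1985", "1/2/1985", "1/3/1985", "1/4/1985", "1/5/1985",
             "1/6/1985", "1/7/1985", "1/8/1985", "1/9/1985"],
    "PCP1": [0, 0, 0, 0, 1, 0, 0, 0, 0],
    "PCP2": [-99, -99, 0, 0, -99, -99, 1, 2, -99],
    "PCP3": [-99, -99, -99, -99, 1, -99, -99, -99, -99],
    "PCP4": [-99, -99, -99, -99, 1, -99, 0, 3, -99],
})

# Every column except the date column is treated as a PCP column.
pcp_cols = [c for c in df.columns if c != "date"]

dfs = {}
for i, name in enumerate(pcp_cols, start=1):
    sub = df[["date", name]]
    # Keep only rows whose PCP value is not the -99 placeholder.
    dfs[f"df{i}"] = sub[sub[name] != -99].reset_index(drop=True)

print(dfs["df2"])
```

Access the results as `dfs["df1"]`, `dfs["df2"]`, and so on.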