I want to save the mean (by row) of different set of dataframe columns and store them in a new dataf

Time:12-31

To do so, I have a list of lists (which are my clusters), for example:

asset_clusts=[[0,1],[3,5],[2,4, 12],...]

and the original dataframe (called 'x' in my code) holds the return time series of S&P 500 companies.

I want to choose column [0,1] of the original dataframe and compute the mean (by row) of them and store it in a new dataframe, then compute the mean of columns [3, 5], and add it to the new dataframe, and so on ...

mu=pd.DataFrame() 
for j in range(get_number_of_elements(asset_clusts)):
    mu=x.iloc[:,asset_clusts[j]].mean(axis=1)

But it gives me only one column, and I checked: that column is the mean of the last cluster's columns.

In case it is unclear, the function get_number_of_elements is:

def get_number_of_elements(clist):
    count = 0
    for element in clist:
        count += 1
    return count

CodePudding user response:

def get_number_of_elements(clust_list):
    count = 0
    for element in clust_list:
        count += 1
    return count
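
As far as I can tell, the loop in the question rebinds mu on every pass (mu = ...), so only the last cluster's mean is left at the end. Here is a minimal sketch of one way around that, assigning each mean to its own column instead. The small DataFrame and the three clusters are made-up stand-ins for the real returns data, not values from the question:

```python
import pandas as pd

# Hypothetical stand-in for the returns DataFrame `x` (2 rows, 6 columns).
x = pd.DataFrame([[1.0, 3.0, 5.0, 7.0, 9.0, 11.0],
                  [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]])
# Hypothetical clusters of column positions.
asset_clusts = [[0, 1], [3, 5], [2, 4]]

# Start from an empty frame that shares the row index of `x`.
mu = pd.DataFrame(index=x.index)
for j, cols in enumerate(asset_clusts):
    # Assign into column j rather than rebinding `mu` itself.
    mu[j] = x.iloc[:, cols].mean(axis=1)

print(mu)
```

Iterating with enumerate also means no separate counting helper is needed here; the built-in len(asset_clusts) would give the number of clusters if it is wanted.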

CodePudding user response:

I solved it, and in case it is helpful for others, here is the final function:

import pandas as pd

def clustered_series(x, org_asset_clust):
    """
    x: return data (DataFrame)
    org_asset_clust: list of clusters (lists of column positions)
    ----> row-wise mean of each cluster's returns
    """
    def get_number_of_elements(org_asset_clust):
        # Count the clusters one by one.
        count = 0
        for element in org_asset_clust:
            count += 1
        return count
    mu = []
    for j in range(get_number_of_elements(org_asset_clust)):
        # Row-wise mean of the columns in cluster j.
        mu.append(x.iloc[:, org_asset_clust[j]].mean(axis=1))
    # Concatenate once, after the loop: one column per cluster.
    cluster_mean = pd.concat(mu, axis=1)

    return cluster_mean
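
For what it's worth, the same idea can probably be written more compactly by looping over the clusters directly and concatenating the per-cluster means in one call. This is only a sketch: the name clustered_series_compact and the sample data are my own placeholders, not part of the original code.

```python
import pandas as pd

def clustered_series_compact(x, clusters):
    """Row-wise mean of each cluster of column positions, one output column per cluster."""
    # One Series of row means per cluster.
    means = [x.iloc[:, cols].mean(axis=1) for cols in clusters]
    # Place the Series side by side; keys become the column labels 0, 1, 2, ...
    return pd.concat(means, axis=1, keys=list(range(len(clusters))))

# Hypothetical sample data standing in for the real returns.
x = pd.DataFrame([[1.0, 3.0, 5.0, 7.0, 9.0, 11.0],
                  [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]])
result = clustered_series_compact(x, [[0, 1], [3, 5], [2, 4]])
print(result)
```

Note that pd.concat raises an error on an empty list, so an empty cluster list would need separate handling.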