Home > Net >  pandas- convert/transpose columns to rows without using loop
pandas- convert/transpose columns to rows without using loop

Time:06-17

I am trying to convert/tranpose a very wide dataframe to a row-based dataframe (for lack of better word).

The origin table looks like this

stock_list = ['AAPL', 'TSLA', ' MSFT', 'COIN', 'META' ...... ]
df

    date     price_AAPLE    volume_AAPLE      price_TSLA   volume_TSLA      price_MSFT    volume_MSFT    ...
    01-01      187            324234            517         5346             128             45143
    01-02      124            234234            512         5436345          130             42345 

I can easily transpose it in loop

cols = ['date', 'price', 'volume']

agg_df = pd.DataFrame()
for i in stock_list:
   dataframe = df[['date', f'price_{i}', f'volume_{i}']]
   dataframe.columns = cols 
   dataframe['stock'] = i
   agg_df = agg_df.append(dataframe)

In this way the dataframe looks like:

   date     price    volume       stock         
   01-01      187    324234        AAPL       
   01-02      124    234234        AAPL       
   01-01      517     5346         TSLA 
   01-02      512    5436345       TSLA
   ....      ....    ....          ....

So my question is how to convert/transpose the table without using loop?

I got a really long list, I believe using loop is a bit slow here.

Thank you !!!!

CodePudding user response:

Let us try wide_to_long

out = pd.wide_to_long(df, 
                      ['volume','price'], 
                      i = ['date'], 
                      j = 'stock', 
                      suffix = '\w ', 
                      sep='_').reset_index()
Out[27]: 
    date  stock   volume  price
0  01-01  AAPLE   324234    187
1  01-02  AAPLE   234234    124
2  01-01   TSLA     5346    517
3  01-02   TSLA  5436345    512
4  01-01   MSFT    45143    128
5  01-02   MSFT    42345    130

CodePudding user response:

This is similar to your final dataset.

df=pd.DataFrame({
    "date":["01-01"  ,"01-02"],
    "price_AAPLE":[187, 214],
    "volume_AAPLE":[324234, 234234],
    "price_TSLA":[517,512],
    "volume_TSLA": ["5346", "5436345"]})

df1 = pd.melt(df,id_vars="date", value_vars=["price_AAPLE", "volume_AAPLE", "price_TSLA", "volume_TSLA"])

df1[["A", "stock"]] = df1.variable.str.split("_", expand=True)

df2 = pd.pivot_table(df1, values = 'value', index = ['date', 'stock'], columns=['A'],  aggfunc=np.sum).reset_index()

print(df2)

CodePudding user response:

Try this and see if it works for you:

d1 = {'col1': [1, 2], 'col2': [3, 4]}
df1 = pd.DataFrame(data=d1)
transposed_df = = df1.T
  • Related