Pandas iloc & loc & multi index

Time:10-19

I have a DataFrame like this:

(image not available)

I want to add some columns using two for loops, so that the new DataFrame looks like this picture:

(image not available)

My code does not work:

for I in range(0,len(df["date"]):
    for sigma in rang(1,2/5):
       df["P*sigma"].iloc[0:i]=df["p"].iloc[0:i]*df["sigma"].iloc[sigma]
print(df)

How do I write code to obtain a DataFrame like the one in the second picture?

CodePudding user response:

You can do this with a MultiIndex. There are several ways to build one, but I always prefer pd.MultiIndex.from_product().

Note that some preparation is needed first. The index has to be set properly on the original DataFrame, and the original DataFrame has to be lengthened (repeated once per sigma value) so that it has room for the new rows.

import pandas as pd


df = pd.DataFrame({'date': ['2020/01/01', '2020/01/02', '2020/01/03'], 'p': [123, 231, 188]})
df = df.set_index('date')
sigma = [0, 1, 2, 5]

# Create new 2-level index
multi_index = pd.MultiIndex.from_product([sigma, df.index], names=['sigma', 'date'])

# Make longer
df = pd.concat([df] * len(sigma))

# Set new index
df = df.set_index(multi_index)

# Print result
print(df.head())
>>>                     p
>>> sigma date
>>> 0     2020/01/01  123
>>>       2020/01/02  231
>>>       2020/01/03  188
>>> 1     2020/01/01  123
>>>       2020/01/02  231

If you want to make new columns or use the index values, you can get those with get_level_values() like this:

df["p*sigma"] = df.index.get_level_values("sigma") * df["p"]
print(df.head())
>>>                     p  p*sigma
>>> sigma date
>>> 0     2020/01/01  123        0
>>>       2020/01/02  231        0
>>>       2020/01/03  188        0
>>> 1     2020/01/01  123      123

CodePudding user response:

Once you have added the rows and a "sigma" column as shown above, you can use DataFrame.apply like this:

df["P*sigma"] = df.apply(lambda x: x["p"] * x["sigma"], axis=1)
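For a self-contained illustration, here is a minimal sketch of that approach. It borrows the sample values from the first answer and builds the long table with a cross join (how="cross" in DataFrame.merge, available from pandas 1.2). The cross join is only one possible way to add the rows and the "sigma" column; the answer itself does not say how they were added, and the names prices, sigmas and long_df are made up for this example.

```python
import pandas as pd

# Sample data borrowed from the first answer (assumed; the question only shows images)
prices = pd.DataFrame({"date": ["2020/01/01", "2020/01/02", "2020/01/03"],
                       "p": [123, 231, 188]})
sigmas = pd.DataFrame({"sigma": [0, 1, 2, 5]})

# Cross join: every sigma is paired with every (date, p) row -> 4 * 3 = 12 rows
long_df = sigmas.merge(prices, how="cross")

# Row-wise product with apply, as suggested in this answer
long_df["P*sigma"] = long_df.apply(lambda x: x["p"] * x["sigma"], axis=1)

# Equivalent vectorized product, usually faster on large frames
vectorized = long_df["p"] * long_df["sigma"]

print(long_df.head())
```

Both computations give the same values; apply calls the lambda once per row, while the vectorized form multiplies the two columns in one step.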

CodePudding user response:

In Python you can repeat a list using the multiplication sign *. If you have the three columns sigma, date and p, it is easy to define a DataFrame in the correct shape. To create the new column, just do an element-wise multiplication (there is no need for an apply() call). Afterwards you can set the index, if wanted.

import pandas as pd
sigma = [0.5, 1, 2, 2/5]
date = ['2020/01/02', '2020/01/03', '2020/01/04']
p = [123, 231, 188]

# Repeat each list until all three columns have len(sigma) * len(p) = 12 entries
df = pd.DataFrame({'sigma': sigma*len(p), 'date': date*len(sigma), 'p': p*len(sigma)})
# Element-wise multiplication of the two columns
df['p*sigma'] = df['p']*df['sigma']
df.set_index(['sigma', 'date'], inplace=True)
>>> df
                    p  p*sigma
sigma date
0.5   2020/01/02  123     61.5
1.0   2020/01/03  231    231.0
2.0   2020/01/04  188    376.0
0.4   2020/01/02  123     49.2
0.5   2020/01/03  231    115.5
...
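
One caveat worth noting: repeating the lists with * pairs every sigma with every date here only because the lists have lengths 4 and 3, which share no common factor. With, say, three sigmas and three dates, each date would only ever meet one sigma. Below is a minimal sketch of a variant that does not depend on the list lengths. It uses pd.MultiIndex.from_product (the same constructor as in the first answer) and the sample values from this answer, with the dates written with single slashes (an assumption).

```python
import pandas as pd

sigma = [0.5, 1, 2, 2/5]
date = ['2020/01/02', '2020/01/03', '2020/01/04']
p = [123, 231, 188]

# One row per (sigma, date) pair, regardless of the list lengths
index = pd.MultiIndex.from_product([sigma, date], names=['sigma', 'date'])
df = index.to_frame(index=False)

# Look up p for each date, then multiply element-wise
df['p'] = df['date'].map(dict(zip(date, p)))
df['p*sigma'] = df['p'] * df['sigma']

df = df.set_index(['sigma', 'date'])
print(df)
```

The rows come out grouped by sigma (all dates for 0.5, then all dates for 1, and so on) rather than in the interleaved order produced by list repetition.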