I have a data frame with an indefinite number of columns (the exact set will be defined later). It looks like this:
index | GDP | 2004 | 2005 | ... |
---|---|---|---|---|
brasil | 1000 | 0.10 | 0.10 | ... |
china | 1000 | 0.15 | 0.10 | ... |
india | 1000 | 0.05 | 0.10 | ... |
import pandas as pd

df = pd.DataFrame({'index': ['brasil', 'china', 'india'],
                   'GDP': [1000, 1000, 1000],
                   '2004': [0.10, 0.15, 0.05],
                   '2005': [0.10, 0.10, 0.10]})
The GDP column holds the initial GDP, and the columns from 2004 onwards are floats representing the GDP growth rate in each year (as a fraction, so 0.10 means 10%).
I want to use these growth rates to compute the absolute GDP in each year, starting from the initial GDP. I need a dataframe like this:
index | GDP | 2004 | 2005 |
---|---|---|---|
brasil | 1000 | 1100 | 1210 |
china | 1000 | 1150 | 1265 |
india | 1000 | 1050 | 1155 |
I tried to use itertuples, df.columns and for loops, but I am probably missing something.
Keep in mind that the number of year columns is indefinite.
Thank you very much in advance!
CodePudding user response:
A simple way is to count the columns and loop over:
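num = df.shape[1]  # total number of columns
start = 2  # position of the first growth-rate column (after 'index' and 'GDP')
for idx in range(start, num):
    # absolute GDP = previous column's GDP * (1 + this year's growth rate)
    df.iloc[:, idx] = df.iloc[:, idx-1] * (1 + df.iloc[:, idx])
print(df)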
which gives
index GDP 2004 2005
0 brasil 1000 1100.0 1210.0
1 china 1000 1150.0 1265.0
2 india 1000 1050.0 1155.0
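If you prefer to avoid the explicit loop, the same compounding can probably be expressed with a cumulative product along each row. This is only a sketch: it assumes that every column after GDP is a growth rate and uses the rates from the question's table; `year_cols`, `factors` and `result` are names I made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'index': ['brasil', 'china', 'india'],
                   'GDP': [1000, 1000, 1000],
                   '2004': [0.10, 0.15, 0.05],
                   '2005': [0.10, 0.10, 0.10]})

# Assumption: every column after 'GDP' holds a yearly growth rate.
year_cols = df.columns[2:]

# Cumulative growth factor per row:
# (1 + g_2004), (1 + g_2004) * (1 + g_2005), ...
factors = (1 + df[year_cols]).cumprod(axis=1)

# Scale the factors by each row's initial GDP.
result = df.copy()
result[year_cols] = factors.mul(df['GDP'], axis=0)
print(result)
```

This leaves the original `df` untouched and works for any number of year columns, as long as they all come after GDP.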
CodePudding user response:
You can use df.columns to access the dataframe's column names.
You can then loop over these names. Here is an example using your data frame in which I multiply every value by 2 (note that this also doubles the strings in the index column). If you want to apply different operations to different columns, you can add conditions inside the loop.
import pandas as pd

df = pd.DataFrame({'index': ['brasil', 'china', 'india'],
                   'GDP': [1000, 1000, 1000],
                   '2004': [0.10, 0.15, 0.05],
                   '2005': [0.10, 0.10, 0.10]})
for colName in df.columns:
    # multiply every column by 2; string columns get repeated instead
    df[colName] *= 2
print(df)
This returns:
index GDP 2004 2005
0 brasilbrasil 2000 0.2 0.2
1 chinachina 2000 0.3 0.2
2 indiaindia 2000 0.1 0.2
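To get closer to what you actually asked for, the loop can skip the non-year columns and build each year from the column before it. This is a rough sketch, assuming the year columns are everything except index and GDP and appear in chronological order; the `previous` variable is just a helper I introduced.

```python
import pandas as pd

# Growth rates as given in the question's table.
df = pd.DataFrame({'index': ['brasil', 'china', 'india'],
                   'GDP': [1000, 1000, 1000],
                   '2004': [0.10, 0.15, 0.05],
                   '2005': [0.10, 0.10, 0.10]})

# Assumption: every column other than 'index' and 'GDP' is a growth rate.
previous = 'GDP'
for colName in df.columns:
    if colName in ('index', 'GDP'):
        continue  # skip the label column and the initial GDP
    # Replace the growth rate with the absolute GDP for that year,
    # computed from the previous column's absolute GDP.
    df[colName] = df[previous] * (1 + df[colName])
    previous = colName
print(df)
```

Because the condition is on the column names, it does not matter how many year columns are added later.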
Hope this helps!