Loop for creating variable based on the name of other variable

Time:08-10

The following is a snapshot of my data:

import pandas as pd
df = pd.DataFrame({
    'brand': ['Yum Yum', 'Yum Yum', 'Indomie', 'Indomie', 'Indomie'],
    'X1': [7,7,7,7,9],
    'X2': [8,9,7,5,6],
})
print(df)

I am looking for a loop that identifies how many "X" variables I have and then creates a matching "Y" variable for each one. In the example above I have X1 and X2, so the new variables are Y1 and Y2 (please see the code below).

If I had X1, X2, and X3, the loop would automatically create Y1 = 1, Y2 = 2, and Y3 = 3, and so on.

df2 = pd.DataFrame({
    'brand': ['Yum Yum', 'Yum Yum', 'Indomie', 'Indomie', 'Indomie'],
    'X1': [7,7,7,7,9],
    'X2': [8,9,7,5,6],
    'Y1': [1,1,1,1,1],
    'Y2': [2,2,2,2,2],
})
print(df2)

I hope my question is clear.

P.S. I have just started to learn Python today, so apologies for the basic question.

Any help would be greatly appreciated. Thanks

CodePudding user response:

It's not clear which values you want to use in the new columns, but you can try something like this:

current_columns = df.columns

for c in current_columns:
    if c[0] == 'X':
        # 'X1' -> 'Y1': reuse the suffix of the X column in the new column name
        df['Y' + c[1:]] = 'value for new column'

print(df.head())
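
If the goal is literally to count the X columns first and then number the new columns by position, a minimal sketch could look like the one below. It assumes that "X columns" means every column whose name starts with `X`, and that Y1, Y2, ... should hold 1, 2, ... in column order; the name `x_cols` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'brand': ['Yum Yum', 'Yum Yum', 'Indomie', 'Indomie', 'Indomie'],
    'X1': [7, 7, 7, 7, 9],
    'X2': [8, 9, 7, 5, 6],
})

# Collect the columns whose names start with 'X' (assumption: this is
# what identifies an "X variable")
x_cols = [c for c in df.columns if c.startswith('X')]
print(len(x_cols), 'X columns found')

# Number the new columns by position: the first X column gives Y1 = 1,
# the second gives Y2 = 2, and so on
for i in range(1, len(x_cols) + 1):
    df[f'Y{i}'] = i

print(df)
```

With X1 and X2 this gives the same result as reusing the suffix, but the two approaches differ if the X columns are not numbered consecutively (for example X1 and X3).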

CodePudding user response:

I presume you need to create new columns whose values are based on the suffix of the corresponding X column:

import pandas as pd

df = pd.DataFrame({
    'brand': ['Yum Yum', 'Yum Yum', 'Indomie', 'Indomie', 'Indomie'],
    'X1': [7,7,7,7,9],
    'X2': [8,9,7,5,6],
    'X3': [18,19,17,15,16]
})

for col in df.columns:
    if col.startswith('X'):
        # 'X1' -> new column 'Y1' filled with the integer suffix 1
        df['Y' + col[1:]] = int(col[1:])
print(df)

Output:

     brand  X1  X2  X3  Y1  Y2  Y3
0  Yum Yum   7   8  18   1   2   3
1  Yum Yum   7   9  19   1   2   3
2  Indomie   7   7  17   1   2   3
3  Indomie   7   5  15   1   2   3
4  Indomie   9   6  16   1   2   3
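
One caveat with `startswith('X')` is that `int(col[1:])` raises a `ValueError` for a column such as `Xtra`. A slightly stricter variant, sketched below, only matches names of the form `X` followed by digits and adds all new columns in one `assign` call. The names `pattern`, `matches`, and `new_cols` are illustrative, and the strict `X<digits>` naming rule is an assumption about the data.

```python
import re

import pandas as pd

df = pd.DataFrame({
    'brand': ['Yum Yum', 'Yum Yum', 'Indomie', 'Indomie', 'Indomie'],
    'X1': [7, 7, 7, 7, 9],
    'X2': [8, 9, 7, 5, 6],
    'X3': [18, 19, 17, 15, 16],
})

# Match only names that are exactly 'X' followed by one or more digits
pattern = re.compile(r'X(\d+)')
matches = [pattern.fullmatch(c) for c in df.columns]

# Map each new column name to its constant value, e.g. {'Y1': 1, 'Y2': 2, 'Y3': 3}
new_cols = {f'Y{m.group(1)}': int(m.group(1)) for m in matches if m}

# assign() broadcasts each scalar down the whole column and returns a new DataFrame
df = df.assign(**new_cols)
print(df)
```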