Python pandas: convert a for loop into one line of code

Time:01-18

I have a DataFrame df, which you can create by running:

import pandas as pd
  
data = [10,20,30,40,50,60]
  
df = pd.DataFrame(data, columns=['Numbers'])
  
df

Now I want to check whether each name in an existing list is already a column of df. If it is not, I want to create a new column with that name and set all of its values to 0:

columns_list=["3","5","8","9","12"]

for i in columns_list:
    if i not in df.columns.to_list():
        df[i] = 0

How can I write this in one line? I have tried this:

[df[i]=0 for i in columns_list if i not in df.columns.to_list()]

However, the IDE returns:

SyntaxError: cannot assign to subscript here. Maybe you meant '==' instead of '='?

Can anyone help?
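
A note on the error: in Python, `df[i] = 0` is an assignment statement, and a list comprehension only accepts expressions, which is why the attempt is rejected. One possible workaround, sketched here on the question's own data, is to build the new columns with a dict comprehension and pass them to `DataFrame.assign`. This is only one option and assumes every name in the list is a string, since the names are passed as keyword arguments.

```python
import pandas as pd

df = pd.DataFrame([10, 20, 30, 40, 50, 60], columns=["Numbers"])
columns_list = ["3", "5", "8", "9", "12"]

# A dict comprehension is an expression, so it is allowed where the
# assignment statement df[i] = 0 is not. assign() returns a new
# DataFrame with one zero-filled column per missing name.
df = df.assign(**{c: 0 for c in columns_list if c not in df.columns})
```

Unlike the loop, `assign` does not modify df in place, so the result has to be bound back to a name.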

CodePudding user response:

import numpy as np
import pandas as pd

# Some example data
df = pd.DataFrame(
    np.random.randint(10, size=(5, 6)),
    columns=map(str, range(6))
)

#    0  1  2  3  4  5
# 0  9  4  8  7  3  6
# 1  6  9  0  5  3  4
# 2  7  9  0  9  0  3
# 3  4  4  6  4  6  4
# 4  6  9  7  1  5  5

columns_list=["3","5","8","9","12"]

# Figure out which columns in your list do not appear in your dataframe
# by creating a new Index and using pd.Index.difference:
df[ pd.Index(columns_list).difference(df.columns, sort=False) ] = 0

#    0  1  2  3  4  5  8  9  12
# 0  9  4  8  7  3  6  0  0   0
# 1  6  9  0  5  3  4  0  0   0
# 2  7  9  0  9  0  3  0  0   0
# 3  4  4  6  4  6  4  0  0   0
# 4  6  9  7  1  5  5  0  0   0
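
To make the intermediate step visible, here is a small sketch of the same idea on fixed data instead of random numbers. The frame contents and the name `missing` are made up for illustration, and the sketch assumes a reasonably recent pandas version in which assigning a scalar to several new column labels at once is supported.

```python
import pandas as pd

# Illustrative data: one column ("3") already appears in the list
df = pd.DataFrame({"Numbers": [10, 20, 30], "3": [1, 2, 3]})
columns_list = ["3", "5", "8", "9", "12"]

# Names that are in the list but not yet in the DataFrame;
# sort=False keeps them in the order they have in the list
missing = pd.Index(columns_list).difference(df.columns, sort=False)

# Create every missing column in one assignment, filled with 0
df[missing] = 0
```

Columns that already exist, such as "3" here, are left untouched because they are never part of `missing`.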

CodePudding user response:

Try this:

columns_list=["3","5","8","9","12"]

df = df.reindex(
  list(
    set(
      list(df.columns) + columns_list
    )
  ), 
  axis=1, 
  fill_value=0,
)
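
One caveat with the call above: a set does not keep insertion order, so the columns may come back in an arbitrary order. If the order matters, a possible variant is to drop duplicates with `dict.fromkeys`, which keeps the first occurrence of each name. The sketch below uses made-up data, and the name `all_columns` is only for illustration.

```python
import pandas as pd

# Illustrative data: one column ("3") already appears in the list
df = pd.DataFrame({"Numbers": [10, 20, 30], "3": [1, 2, 3]})
columns_list = ["3", "5", "8", "9", "12"]

# dict.fromkeys removes duplicates while keeping first-seen order,
# so existing columns stay first and new ones follow in list order
all_columns = list(dict.fromkeys(list(df.columns) + columns_list))

# reindex keeps existing columns and fills the new ones with 0
df = df.reindex(columns=all_columns, fill_value=0)
```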