pandas PerformanceWarning that I can't get rid of

Time:12-31

I have a pandas DataFrame df with a column named 'C'. I am adding 280 duplicates of that column to the same DataFrame, named 1 ... 280, as follows:

for l in range(1,281):
    df[str(l)] = df['C']

I haven't figured out how to do this more efficiently. The operation works as expected, but I get the following performance warning:

PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_base[str(d)]=col_vals 

I've tried to suppress this warning with

import warnings
import pandas as pd
warnings.simplefilter(action='ignore', category=pd.errors.PerformanceWarning)

The suppression works when running on 1 core. However, I'm running this code with joblib on 30 cores.

When running this operation with joblib, the warning suppression doesn't work!

How can I get rid of this warning using either of these two approaches?

  1. How can I suppress the warning under joblib? or
  2. How can I create the duplicate columns more efficiently, without warnings?

CodePudding user response:

You can do this in one go:

df = pd.concat([df['C']] * 281, axis=1)
df.columns = list(range(1, 281)) + ['C']
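Note that the snippet above replaces df with a frame containing only the copies and 'C', and it names the columns with integers. If the frame has other columns that should be kept, and the names should be strings as in the question, a variation along these lines may work. It is only a sketch: the sample data and the extra column 'A' are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; 'A' stands in for any other columns in the frame.
df = pd.DataFrame({"A": [10, 20, 30], "C": [1, 2, 3]})

# Build all 280 copies as one separate frame, named "1" ... "280".
copies = pd.DataFrame({str(l): df["C"] for l in range(1, 281)})

# Attach them to the original frame with a single concat, so columns are
# not inserted one at a time.
df = pd.concat([df, copies], axis=1)

print(df.shape)  # (3, 282)
```

Because the columns are joined in a single step, this should avoid the repeated insertions that the warning message describes.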
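As for the first option (suppressing the warning under joblib): with joblib's default process-based backend the work runs in separate worker processes, and a filter installed with warnings.simplefilter in the parent process does not apply there. One common workaround is to set the filter inside the function that each worker runs. A minimal sketch, assuming a hypothetical worker function add_copies and made-up sample frames:

```python
import warnings

import pandas as pd
from joblib import Parallel, delayed


def add_copies(df, n_copies=280):
    """Hypothetical worker: add n_copies duplicates of column 'C', one at a time."""
    # Install the filter inside the function so it takes effect in whichever
    # process actually runs this code.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=pd.errors.PerformanceWarning)
        out = df.copy()
        for l in range(1, n_copies + 1):
            out[str(l)] = out["C"]
    return out


if __name__ == "__main__":
    # Made-up input frames, one per task.
    frames = [pd.DataFrame({"C": range(5)}) for _ in range(4)]
    results = Parallel(n_jobs=2)(delayed(add_copies)(f) for f in frames)
    print(results[0].shape)  # (5, 281)
```

This only hides the warning. The column-by-column insertion is still slow, so building the columns in one concat is likely the better fix.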