Looping through a dataframe to create new columns and avoid the fragmentation warning

I am using a longer version of this code to loop through a large dataframe and create new columns:

categories = ['All Industries, All firms',
              'All Industries, Large firms']

for category in categories:
    sa[category + ', OP margin, %'] = sa[category + ', OP, SA, JPY bn'] / sa[category + ', Sales, SA, JPY bn'] * 100
    sa[category + ', Ordinary profit margin, %'] = sa[category + ', Ordinary profits, SA, JPY bn'] / sa[category + ', Operating and non-operating income, SA, JPY bn'] * 100

It works, but I know it isn't the best method: inserting the columns one at a time triggers the "PerformanceWarning: DataFrame is highly fragmented" warning.

I'd like to find a better solution. I know there are a few questions which discuss the fragmentation warning, but I haven't been able to work out how to use the answers in my case.

CodePudding user response:

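Preallocate all of the new columns in one step with reindex, then fill them in the loop, so that the loop assigns to columns that already exist instead of inserting new ones one at a time.

import pandas as pd

categories = ['All Industries, All firms', 'All Industries, Large firms']
measures = ['OP margin, %', 'Ordinary profit margin, %']

# Full names of the columns to create: one per category and measure
new_columns = [category + ', ' + measure for category in categories for measure in measures]

# Add every new column (filled with NaN) in a single operation
sa = sa.reindex(columns=[*sa.columns, *new_columns])

for category in categories:
    sa[category + ', OP margin, %'] = sa[category + ', OP, SA, JPY bn'] / sa[category + ', Sales, SA, JPY bn'] * 100
    sa[category + ', Ordinary profit margin, %'] = sa[category + ', Ordinary profits, SA, JPY bn'] / sa[category + ', Operating and non-operating income, SA, JPY bn'] * 100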
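
Another option, and the one the warning message itself suggests, is to join all new columns at once with pd.concat(axis=1). The sketch below shows one way that might look for this case: the derived columns are collected in a dict and attached to the frame in a single step. The small sample frame and the name new_cols are made up for illustration and stand in for the real sa.

```python
import pandas as pd

categories = ['All Industries, All firms', 'All Industries, Large firms']

# Hypothetical sample data standing in for the real `sa` frame
rows = {}
for category in categories:
    rows[category + ', OP, SA, JPY bn'] = [10.0, 12.0]
    rows[category + ', Sales, SA, JPY bn'] = [200.0, 240.0]
    rows[category + ', Ordinary profits, SA, JPY bn'] = [15.0, 18.0]
    rows[category + ', Operating and non-operating income, SA, JPY bn'] = [300.0, 360.0]
sa = pd.DataFrame(rows)

# Collect every derived column in a dict instead of inserting it into `sa`
new_cols = {}
for category in categories:
    new_cols[category + ', OP margin, %'] = (
        sa[category + ', OP, SA, JPY bn'] / sa[category + ', Sales, SA, JPY bn'] * 100
    )
    new_cols[category + ', Ordinary profit margin, %'] = (
        sa[category + ', Ordinary profits, SA, JPY bn']
        / sa[category + ', Operating and non-operating income, SA, JPY bn'] * 100
    )

# Join all new columns to the original frame in a single operation
sa = pd.concat([sa, pd.DataFrame(new_cols)], axis=1)
```

Because the new Series are all computed from sa, they share its index, so the concat lines up row for row. If the frame is already fragmented from earlier code, the warning text also mentions that sa = sa.copy() returns a de-fragmented frame.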