I am using a longer version of this code to loop through a large dataframe and create new columns:
categories = ['All Industries, All firms',
              'All Industries, Large firms']
for category in categories:
    sa[category + ', OP margin, %'] = sa[category + ', OP, SA, JPY bn'] / sa[category + ', Sales, SA, JPY bn'] * 100
    sa[category + ', Ordinary profit margin, %'] = sa[category + ', Ordinary profits, SA, JPY bn'] / sa[category + ', Operating and non-operating income, SA, JPY bn'] * 100
It works, but I know it isn't the best method: it triggers the "PerformanceWarning: DataFrame is highly fragmented" warning.
I'd like to find a better solution. I know there are a few questions which discuss the fragmentation warning, but I haven't been able to work out how to use the answers in my case.
CodePudding user response:
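Preallocate the new columns in a single step, then fill them in the loop. Note that DataFrame.update only overwrites columns that already exist, so it cannot create the new ones; reindex can.
categories = ['All Industries, All firms', 'All Industries, Large firms']
suffixes = [', OP margin, %', ', Ordinary profit margin, %']
new_columns = [category + suffix for category in categories for suffix in suffixes]
# Add every new column at once (filled with NaN), so the loop below
# assigns to existing columns rather than inserting new ones
sa = sa.reindex(columns=sa.columns.tolist() + new_columns)
for category in categories:
    sa[category + ', OP margin, %'] = sa[category + ', OP, SA, JPY bn'] / sa[category + ', Sales, SA, JPY bn'] * 100
    sa[category + ', Ordinary profit margin, %'] = sa[category + ', Ordinary profits, SA, JPY bn'] / sa[category + ', Operating and non-operating income, SA, JPY bn'] * 100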
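Another option, and the one the warning text itself suggests, is to collect the new columns first and join them in one go with pd.concat(axis=1). Here is a minimal sketch of that idea. The small `sa` frame and its numbers are made up for illustration, and `new_cols` is just a name I picked; only the column naming pattern comes from the question.

```python
import pandas as pd

categories = ['All Industries, All firms', 'All Industries, Large firms']

# Toy stand-in for the real `sa` frame (values are invented for illustration)
sa = pd.DataFrame({
    'All Industries, All firms, OP, SA, JPY bn': [10.0, 12.0],
    'All Industries, All firms, Sales, SA, JPY bn': [200.0, 300.0],
    'All Industries, All firms, Ordinary profits, SA, JPY bn': [11.0, 15.0],
    'All Industries, All firms, Operating and non-operating income, SA, JPY bn': [220.0, 300.0],
    'All Industries, Large firms, OP, SA, JPY bn': [6.0, 9.0],
    'All Industries, Large firms, Sales, SA, JPY bn': [100.0, 150.0],
    'All Industries, Large firms, Ordinary profits, SA, JPY bn': [8.0, 9.0],
    'All Industries, Large firms, Operating and non-operating income, SA, JPY bn': [160.0, 180.0],
})

# Compute each new column as a Series and keep it in a dict
# instead of assigning it to `sa` one at a time
new_cols = {}
for category in categories:
    new_cols[category + ', OP margin, %'] = (
        sa[category + ', OP, SA, JPY bn']
        / sa[category + ', Sales, SA, JPY bn'] * 100
    )
    new_cols[category + ', Ordinary profit margin, %'] = (
        sa[category + ', Ordinary profits, SA, JPY bn']
        / sa[category + ', Operating and non-operating income, SA, JPY bn'] * 100
    )

# Attach all new columns with a single concat along the column axis
sa = pd.concat([sa, pd.DataFrame(new_cols)], axis=1)
```

If the frame is already fragmented from earlier code, the warning also mentions that sa = sa.copy() gives you a de-fragmented frame.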