I need to add rows to an existing Dataframe, one by one.
If I use append, I get this warning:
main.py:41: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
df_html_consolidate = df_html_consolidate.append(row_html_good, ignore_index=True)
I have the following for loop:
df_html_consolidate = pd.DataFrame()
for _, row_html_good in df_html_good.iterrows():
    canonical = str(row_html_good['Canonical Link Element 1'])
    if "/" in canonical and canonical != row_html_good['Address']:
        row_html_good['Address'] = canonical
    consolidate_mappings[row_html_good['Address']] = row_html_good['Address']
    df_html_consolidate = df_html_consolidate.append(row_html_good, ignore_index=True)
What is the best way to do this with concat?
CodePudding user response:
Instead of appending in a loop, you could collect the rows in a list, construct a DataFrame from that list, and concat once:
lst = []
for _, row_html_good in df_html_good.iterrows():
    canonical = str(row_html_good['Canonical Link Element 1'])
    if "/" in canonical and canonical != row_html_good['Address']:
        row_html_good['Address'] = canonical
    consolidate_mappings[row_html_good['Address']] = row_html_good['Address']
    lst.append(row_html_good)
# ignore_index=True resets the index, matching the original append(..., ignore_index=True)
df_html_consolidate = pd.concat([df_html_consolidate, pd.DataFrame(lst)], ignore_index=True)
If you want to append just one row, you could use loc:
df.loc[len(df)] = a_list # the length of the list must match the number of columns
or
df.loc[len(df)] = pd.Series(a_list, index = df.columns)
or
df.loc[len(df)] = dict(zip(df.columns, a_list))
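A small runnable sketch of the single-row case, using a made-up two-column frame (the `Address` and `Status` values are hypothetical). Note that `df.loc[len(df)]` only adds a new row when the index is the default 0..n-1 range; if the label `len(df)` already exists in the index, the same statement overwrites that row instead.

```python
import pandas as pd

# Hypothetical frame with a default RangeIndex
df = pd.DataFrame({"Address": ["/a"], "Status": [200]})

# Enlarge with a plain list (length must match the number of columns)
df.loc[len(df)] = ["/b", 404]

# Enlarge with a Series aligned on the column names
df.loc[len(df)] = pd.Series(["/c", 301], index=df.columns)

# Alternative that does not depend on the index labels:
# wrap the row in a one-row DataFrame and concat it
new_row = pd.DataFrame([{"Address": "/d", "Status": 500}])
df = pd.concat([df, new_row], ignore_index=True)
```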
Note that your code doesn't seem to need a loop at all. Something like the following vectorized version might do the job:
msk = (df_html_good['Canonical Link Element 1'].astype(str).ne(df_html_good['Address']) &
       df_html_good['Canonical Link Element 1'].astype(str).str.contains('/'))
df_html_good['Address'] = np.where(msk, df_html_good['Canonical Link Element 1'].astype(str), df_html_good['Address'])
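For reference, the same idea as a self-contained sketch with made-up sample data. The string conversion is computed once and reused, and `regex=False, na=False` are passed to `str.contains` so that "/" is matched literally and missing values count as no match; those two arguments are my additions, not part of the snippet above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_html_good
df_html_good = pd.DataFrame({
    "Address": ["https://a.example/x", "https://a.example/y", "https://a.example/z"],
    "Canonical Link Element 1": ["https://a.example/x", "https://a.example/y-canonical", None],
})

canonical = df_html_good["Canonical Link Element 1"].astype(str)

# True where the canonical link looks like a URL/path and differs from Address
msk = canonical.ne(df_html_good["Address"]) & canonical.str.contains("/", regex=False, na=False)

# Replace Address with the canonical link only on the masked rows
df_html_good["Address"] = np.where(msk, canonical, df_html_good["Address"])
```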