How to append rows with concat to a pandas DataFrame using a for loop

Time:03-05

I need to add rows to an existing DataFrame, one by one. If I use append, I get this warning:

main.py:41: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  df_html_consolidate = df_html_consolidate.append(row_html_good, ignore_index=True)

I have the following for loop:

df_html_consolidate = pd.DataFrame()
for _, row_html_good in df_html_good.iterrows():
    canonical = str(row_html_good['Canonical Link Element 1'])

    if "/" in canonical and canonical != row_html_good['Address']:
        row_html_good['Address'] = canonical

    consolidate_mappings[row_html_good['Address']] = row_html_good['Address']

    df_html_consolidate = df_html_consolidate.append(row_html_good, ignore_index=True)

What is the best way to do this concat?

CodePudding user response:

Instead of appending inside the loop, you could collect the rows in a list, build a DataFrame from that list once the loop is done, and concat it a single time:

lst = []
for _, row_html_good in df_html_good.iterrows():
    canonical = str(row_html_good['Canonical Link Element 1'])

    if "/" in canonical and canonical != row_html_good['Address']:
        row_html_good['Address'] = canonical

    consolidate_mappings[row_html_good['Address']] = row_html_good['Address']
    lst.append(row_html_good)

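# df_html_consolidate is the (possibly empty) frame from the question;
# ignore_index=True renumbers the rows, as the original append call did
df_html_consolidate = pd.concat([df_html_consolidate, pd.DataFrame(lst)], ignore_index=True)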
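
As a rough, self-contained sketch of the same collect-then-concat pattern: the sample rows and the `existing` frame below are made up for illustration, and only the column names come from the question. The explicit `row.copy()` is an extra precaution so that editing a row never touches the source frame.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df_html_good = pd.DataFrame({
    "Address": ["https://example.com/a", "https://example.com/b"],
    "Canonical Link Element 1": ["https://example.com/a",
                                 "https://example.com/b-canonical"],
})
# Hypothetical frame that already holds one consolidated row.
existing = pd.DataFrame({
    "Address": ["https://example.com/z"],
    "Canonical Link Element 1": ["https://example.com/z"],
})

consolidate_mappings = {}
rows = []
for _, row in df_html_good.iterrows():
    row = row.copy()  # work on a private copy of the row
    canonical = str(row["Canonical Link Element 1"])
    if "/" in canonical and canonical != row["Address"]:
        row["Address"] = canonical
    consolidate_mappings[row["Address"]] = row["Address"]
    rows.append(row)

# One concat after the loop instead of one append per iteration.
df_html_consolidate = pd.concat([existing, pd.DataFrame(rows)], ignore_index=True)
print(df_html_consolidate)
```

Each call to append or concat copies the whole frame, so doing it once after the loop avoids repeating that copy for every row.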

If you want to append just one row, you could use loc:

df.loc[len(df)] = a_list # the length of the list must match the number of columns

or

df.loc[len(df)] = pd.Series(a_list, index = df.columns)

or

df.loc[len(df)] = dict(zip(df.columns, a_list))    
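
A small sketch of the loc variants on a throwaway frame (the `Status` column and the sample values are invented for illustration). It assumes the frame has the default 0..n-1 index, so that `len(df)` is a label that does not exist yet; if that label already exists, loc overwrites the row instead of adding one.

```python
import pandas as pd

# Hypothetical two-column frame with the default integer index.
df = pd.DataFrame({"Address": ["https://example.com/a"], "Status": [200]})

# Plain list: values are matched to columns by position.
df.loc[len(df)] = ["https://example.com/b", 404]

# Series indexed by the column names: values are matched by label.
df.loc[len(df)] = pd.Series(["https://example.com/c", 301], index=df.columns)

print(df)
```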

Note that your code doesn't seem to need a loop at all. Maybe something like the following could do the job:

import numpy as np

# convert the canonical column to strings once and reuse it
canonical = df_html_good['Canonical Link Element 1'].astype(str)
# rows whose canonical link contains "/" and differs from the current address
msk = canonical.ne(df_html_good['Address']) & canonical.str.contains('/', regex=False)
df_html_good['Address'] = np.where(msk, canonical, df_html_good['Address'])
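
A runnable sketch of this vectorized approach on made-up data (only the column names come from the question). The last line, which rebuilds `consolidate_mappings` from the updated column with `dict(zip(...))`, is an assumption about what the loop's dictionary is meant to hold and is not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df_html_good = pd.DataFrame({
    "Address": ["https://example.com/a",
                "https://example.com/b",
                "https://example.com/c"],
    "Canonical Link Element 1": ["https://example.com/a",
                                 "https://example.com/b-canonical",
                                 ""],
})

canonical = df_html_good["Canonical Link Element 1"].astype(str)
# True where the canonical link contains "/" and differs from the address.
msk = canonical.ne(df_html_good["Address"]) & canonical.str.contains("/", regex=False)
df_html_good["Address"] = np.where(msk, canonical, df_html_good["Address"])

# Assumed equivalent of the loop's mapping: each final address maps to itself.
consolidate_mappings = dict(zip(df_html_good["Address"], df_html_good["Address"]))
print(df_html_good["Address"].tolist())
```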