How to convert DataFrame.append() to pandas.concat()?

Time:02-25

In pandas 1.4.0, DataFrame.append() was deprecated, and the docs say to use pandas.concat() instead.

FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.

Codeblock in question:

def generate_features(data, num_samples, mask):
    """
    The main function for generating features to train or evaluate on.
    Returns a pd.DataFrame()
    """
    logger.debug("Generating features, number of samples: %s", num_samples)
    features = pd.DataFrame()

    for count in range(num_samples):
        row, col = get_pixel_within_mask(data, mask)
        input_vars = get_pixel_data(data, row, col)
        features = features.append(input_vars)
        print_progress(count, num_samples)

    return features

These are the two options I've tried, but neither worked:

features = pd.concat([features],[input_vars])

and

pd.concat([features],[input_vars])

This is the line that uses the deprecated method and raises the warning:

features = features.append(input_vars)

CodePudding user response:

You can keep the DataFrames generated in the loop in a list and concatenate them with features once you finish the loop.

In other words, replace the loop:

for count in range(num_samples):
    row, col = get_pixel_within_mask(data, mask)
    input_vars = get_pixel_data(data, row, col)
    features = features.append(input_vars)
    print_progress(count, num_samples)
    

with the one below:

tmp = [features]
for count in range(num_samples):
    row, col = get_pixel_within_mask(data, mask)
    input_vars = get_pixel_data(data, row, col)
    tmp.append(input_vars)
    print_progress(count, num_samples)
features = pd.concat(tmp)

You can also concatenate inside the loop, but every pd.concat() call copies all the rows accumulated so far, so it's more efficient to concatenate only once, after the loop.
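As a rough, self-contained sketch of the collect-then-concatenate pattern: get_pixel_data_stub() and generate_features_sketch() below are hypothetical stand-ins (the real get_pixel_data() and its return type are not shown in the question), and ignore_index=True is an assumption about the index you want in the result.

```python
import pandas as pd


def get_pixel_data_stub(i):
    # Hypothetical stand-in for get_pixel_data(): returns a one-row DataFrame.
    return pd.DataFrame({"row": [i], "col": [i * 2], "value": [i * 0.5]})


def generate_features_sketch(num_samples):
    frames = []  # plain Python list that collects the per-sample frames
    for count in range(num_samples):
        # list.append is cheap; this is not the deprecated DataFrame.append
        frames.append(get_pixel_data_stub(count))
    if not frames:
        # pd.concat raises ValueError when given an empty list
        return pd.DataFrame()
    # Single concatenation after the loop
    return pd.concat(frames, ignore_index=True)


features = generate_features_sketch(3)
print(features.shape)
```

Drop ignore_index=True if the index values of the individual frames are meaningful and should be preserved.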

CodePudding user response:

This "appends" each new frame to the initially blank df using concat, so the code keeps working once append() is removed:

features = pd.concat([features, input_vars])

That said, without access to the actual data and data structures, this is hard to test or replicate.
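For comparison with the attempts in the question, here is a minimal sketch of the in-loop form. The key point is that pd.concat() takes all frames in one list as its first argument, so [features, input_vars] works while [features], [input_vars] does not, because a second positional argument is not treated as more frames. The one-column input_vars frame is a made-up placeholder for whatever get_pixel_data() returns.

```python
import pandas as pd

features = pd.DataFrame()  # start from a blank frame, as in the question
for i in range(3):
    # Hypothetical placeholder for the frame returned by get_pixel_data()
    input_vars = pd.DataFrame({"value": [i]})
    # Both frames go inside ONE list; reassign because concat returns a new object
    features = pd.concat([features, input_vars], ignore_index=True)

print(len(features))
```

Note that the result has to be assigned back to features; calling pd.concat() without using its return value leaves features unchanged.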
