In pandas 1.4.0: append()
was deprecated, and the docs say to use concat()
instead.
FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
Codeblock in question:
def generate_features(data, num_samples, mask):
"""
The main function for generating features to train or evaluate on.
Returns a pd.DataFrame()
"""
logger.debug("Generating features, number of samples", num_samples)
features = pd.DataFrame()
for count in range(num_samples):
row, col = get_pixel_within_mask(data, mask)
input_vars = get_pixel_data(data, row, col)
features = features.append(input_vars)
print_progress(count, num_samples)
return features
These are the two options I've tried, but did not work:
features = pd.concat([features],[input_vars])
and
pd.concat([features],[input_vars])
This is the line that is deprecated and throwing the error:
features = features.append(input_vars)
CodePudding user response:
You can keep the DataFrames generated in the loop in a list and concatenate them with features
once you finish the loop.
In other words, replace the loop:
for count in range(num_samples):
row, col = get_pixel_within_mask(data, mask)
input_vars = get_pixel_data(data, row, col)
features = features.append(input_vars)
print_progress(count, num_samples)
with the one below:
tmp = [features]
for count in range(num_samples):
row, col = get_pixel_within_mask(data, mask)
input_vars = get_pixel_data(data, row, col)
tmp.append(input_vars)
print_progress(count, num_samples)
features = pd.concat(tmp)
You can certainly concatenate in the loop but it's more efficient to do it only once.
CodePudding user response:
This will "append" the blank df and prevent errors in the future by using the concat option
features= pd.concat([features, input_vars])
However, still, without having access to actually data and data structures this would be hard to test replicate.