How do I CONCAT data from a dataframe to another dataframe?

I have built the following function, but .append will be removed from pandas in a future version, so I would like to convert this code to use concat.

def MyDF(self,DF1,DF2):
    OutputDf = pd.DataFrame([]).reset_index(drop=True)
    for i in range(0,len(DF2)):
        OutputDf = OutputDf.append(DF2.loc[[i]])
        OutputDf = OutputDf.append(DF1.loc[(DF1['TheName'] == DF2['TheName'][i]) & (DF1['WGT'].apply(lambda x: float(x)) > 0) ])
        OutputDf = OutputDf.reset_index(drop=True)
    return OutputDf

I don't know how to use concat in this case, so how can I avoid .append here?

I'm not sure whether this would work:

OutputDf = pd.Concat(OutputDf,DF2.loc[[i]])

CodePudding user response：

pandas.DataFrame.append and pandas.Series.append have been deprecated since version 1.4.0 and were removed in pandas 2.0. See "Deprecated DataFrame.append and Series.append" in the pandas release notes.

The alternative is using pandas.concat.

In OP's case, .append() is used in two places:

OutputDf = OutputDf.append(DF2.loc[[i]])
OutputDf = OutputDf.append(DF1.loc[(DF1['TheName'] == DF2['TheName'][i]) & (DF1['WGT'].apply(lambda x: float(x)) > 0) ])

Case 1

This line becomes:

OutputDf = pd.concat([OutputDf, DF2.loc[[i]]], ignore_index=True)

Case 2

This line becomes:

OutputDf = pd.concat([OutputDf, DF1.loc[(DF1['TheName'] == DF2['TheName'][i]) & (DF1['WGT'].apply(lambda x: float(x)) > 0) ]], ignore_index=True)

Notes:

Since I do not have access to the DataFrames and do not know the desired output, some adjustments may be needed.
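
Putting the two replacements together, a minimal sketch of the whole function might look like the following. Instead of growing the DataFrame inside the loop, it collects the pieces in a list and calls pd.concat once at the end, which avoids copying the accumulated rows on every iteration. The function name interleave_matches and the sample data are hypothetical, and the sketch assumes DF2 has the default 0..n-1 index (which DF2.loc[[i]] in the question implies).

```python
import pandas as pd


def interleave_matches(df1, df2):
    """For each row of df2, emit that row followed by the df1 rows that
    share its 'TheName' and have a positive 'WGT'."""
    # Filter on WGT once instead of on every loop iteration.
    positive = df1[df1["WGT"].astype(float) > 0]
    pieces = []
    for i in range(len(df2)):
        pieces.append(df2.iloc[[i]])  # one-row DataFrame
        matches = positive[positive["TheName"] == df2["TheName"].iloc[i]]
        if not matches.empty:
            pieces.append(matches)
    if not pieces:
        return pd.DataFrame()
    # A single concat at the end; ignore_index=True renumbers rows 0..n-1.
    return pd.concat(pieces, ignore_index=True)


# Hypothetical sample data, only to show the shape of the result.
df1 = pd.DataFrame({"TheName": ["a", "a", "b"], "WGT": ["1.5", "0", "2"]})
df2 = pd.DataFrame({"TheName": ["a", "b"], "WGT": ["9", "8"]})
result = interleave_matches(df1, df2)
print(result)
```

With this sample data the result has four rows: the "a" row of df2, the one "a" row of df1 with a positive weight, the "b" row of df2, and the matching "b" row of df1.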

CodePudding user response：

I think pandas.concat() is easy to understand, so you can say goodbye to append and keep up with current pandas.

To begin with, pay attention to the objs, ignore_index and axis arguments. objs is the list of objects to concatenate. With axis=0 (the default), the DataFrames are stacked vertically, one under the other, just as .append() did. With axis=1, they are joined horizontally, side by side. As the documentation says:

axis : {0/’index’, 1/’columns’}, default 0
The axis to concatenate along.

You can also pass ignore_index=True instead of calling reset_index afterwards. It renumbers the rows of the result from 0 to n-1.
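
As a small illustration of these arguments, here is a sketch with two hypothetical one-column DataFrames (top and bottom are made-up names, not from the question):

```python
import pandas as pd

top = pd.DataFrame({"x": [1, 2]})
bottom = pd.DataFrame({"x": [3, 4]})

stacked = pd.concat([top, bottom])                        # axis=0 is the default
renumbered = pd.concat([top, bottom], ignore_index=True)  # fresh 0..n-1 index
side_by_side = pd.concat([top, bottom], axis=1)           # columns next to each other

print(list(stacked.index))     # [0, 1, 0, 1]  (original labels are kept)
print(list(renumbered.index))  # [0, 1, 2, 3]
print(side_by_side.shape)      # (2, 2)
```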

In summary, to concatenate the two DataFrames as in your question, you can use something like this:

def MyDF(self,DF1,DF2):
    OutputDf = pd.DataFrame([]).reset_index(drop=True)
    for i in range(0,len(DF2)):
        process1 = DF2.loc[[i]]
        process2 = DF1.loc[(DF1['TheName'] == DF2['TheName'][i]) & (DF1['WGT'].apply(lambda x: float(x)) > 0) ]
        # Include OutputDf itself, otherwise each iteration discards the rows collected so far
        OutputDf = pd.concat([OutputDf, process1, process2], ignore_index=True)
    return OutputDf

You can make this code shorter, at the cost of readability:

def MyDF(self,DF1,DF2):
    OutputDf = pd.DataFrame([]).reset_index(drop=True)
    for i in range(0,len(DF2)):
        # OutputDf must be part of the list so that rows from earlier iterations are kept
        OutputDf = pd.concat([OutputDf, DF2.loc[[i]], DF1.loc[(DF1['TheName'] == DF2['TheName'][i]) & (DF1['WGT'].apply(lambda x: float(x)) > 0) ]], ignore_index=True)
    return OutputDf

You could also return the result of pd.concat() directly, but that is harder to read, so it is your decision. Just remember that concat takes the DataFrames as a single list, so do not forget the []:

pd.concat([process1, process2])  # use [] inside concat for dataframes

If you call pd.concat(process1, process2) directly, without the list, pandas raises a TypeError.
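
A quick way to see the difference, sketched with two hypothetical one-row DataFrames (the exact error message depends on the pandas version, so it is not shown here):

```python
import pandas as pd

process1 = pd.DataFrame({"x": [1]})
process2 = pd.DataFrame({"x": [2]})

try:
    pd.concat(process1, process2)  # wrong: frames passed as separate arguments
    raised = False
except TypeError:
    raised = True

# Correct: the frames are wrapped in one list.
combined = pd.concat([process1, process2], ignore_index=True)
print(raised, len(combined))
```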