How to concat a series to a pandas dataframe in Python?

Time:12-03

I would like to iterate through the rows of a dataframe and concatenate each matching row to a different dataframe, basically building up a new dataframe from selected rows.

For example, with the IPCSection and IPCClass dataframes:


allcolumns = np.concatenate((IPCSection.columns, IPCClass.columns), axis = 0)
finalpatentclasses = pd.DataFrame(columns=allcolumns)
for isec, secrow in IPCSection.iterrows():
    for icl, clrow in IPCClass.iterrows():
        if (secrow[0] in clrow[0]):
            pdList = [finalpatentclasses, pd.DataFrame(secrow), pd.DataFrame(clrow)]
            finalpatentclasses = pd.concat(pdList, axis=0, ignore_index=True)
display(finalpatentclasses)

The output is:

I want the NaN values to disappear and all the data to sit under the correct columns. I tried axis=1, but that messes up the column names. Append does not work either: all values are placed diagonally in the table, again with NaN values.

CodePudding user response:

The problem with the current implementation is that pd.DataFrame(secrow) and pd.DataFrame(clrow) each turn a row (a Series) into a single-column dataframe whose index holds the original column names. Calling pd.concat with axis=0 and ignore_index=True then stacks those values vertically and discards the indices, so the values no longer line up with the columns of the final dataframe and the gaps are filled with NaN, as shown in the output.
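A small sketch of that shape difference may help. The row below is a made-up stand-in for what iterrows() yields (the column names and values are hypothetical, not the asker's data); it compares pd.DataFrame(row) with row.to_frame().T, which keeps the row as a row:

```python
import pandas as pd

# Hypothetical stand-in for one row yielded by iterrows()
row = pd.Series({"section": "A", "section_title": "Human necessities"}, name=0)

as_column = pd.DataFrame(row)  # one column named 0, the old column names become the index
as_row = row.to_frame().T      # one row, the old column names stay as columns

print(as_column.shape)         # (2, 1)
print(as_row.shape)            # (1, 2)
print(list(as_row.columns))    # ['section', 'section_title']
```

Concatenating frames shaped like as_column along axis=0 is what creates the extra column and the NaN padding.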

To solve this problem, you can create a new dataframe that has the same columns as the final dataframe, and then assign the values from secrow and clrow to the appropriate columns in the new dataframe. After that, you can append the new dataframe to the final dataframe using the pd.concat function with axis=0, as before.

Here is a modified version of the code that should produce the desired output:

import numpy as np
import pandas as pd

allcolumns = np.concatenate((IPCSection.columns, IPCClass.columns), axis=0)
finalpatentclasses = pd.DataFrame(columns=allcolumns)
for isec, secrow in IPCSection.iterrows():
    for icl, clrow in IPCClass.iterrows():
        # .iloc[0] reads the first value by position
        if secrow.iloc[0] in clrow.iloc[0]:
            # Put the section values first, then the class values, matching allcolumns
            values = list(secrow.values) + list(clrow.values)
            # Create a one-row dataframe with the same columns as the final dataframe
            # (assigning into an empty dataframe would not create a row)
            newrow = pd.DataFrame([values], columns=allcolumns)
            # Append the new dataframe to the final dataframe
            finalpatentclasses = pd.concat([finalpatentclasses, newrow], axis=0, ignore_index=True)
display(finalpatentclasses)

This should result in a final dataframe that has the values from secrow and clrow concatenated horizontally under the correct columns, with no nan values.
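As a variation on the same idea, the matching rows can be collected in a plain list and turned into a dataframe once at the end, which avoids calling pd.concat inside the loop. This is only a sketch: the two sample frames and their column names are assumptions, since the real IPCSection and IPCClass data is not shown.

```python
import pandas as pd

# Hypothetical sample data standing in for the real frames
IPCSection = pd.DataFrame({
    "section": ["A", "B"],
    "section_title": ["Human necessities", "Operations"],
})
IPCClass = pd.DataFrame({
    "class": ["A01", "A21", "B01"],
    "class_title": ["Agriculture", "Baking", "Separating"],
})

allcolumns = list(IPCSection.columns) + list(IPCClass.columns)

rows = []
for _, secrow in IPCSection.iterrows():
    for _, clrow in IPCClass.iterrows():
        # Keep the pair when the section code occurs in the class code
        if secrow.iloc[0] in clrow.iloc[0]:
            rows.append(list(secrow) + list(clrow))

# Build the result in one step from the collected rows
finalpatentclasses = pd.DataFrame(rows, columns=allcolumns)
print(finalpatentclasses)
```

With this sample data the result has three rows (A/A01, A/A21, B/B01) and a clean 0..2 index, so no reset_index call is needed.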

CodePudding user response:

Alright, I have figured it out. The idea is to create a new one-row dataframe (newrow): concatenate the values of both rows into a single array, add that array to newrow as a row, and then concat newrow with the final dataframe.

Here is the code:

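import numpy as np
import pandas as pd

allcolumns = np.concatenate((IPCSection.columns, IPCClass.columns), axis=0)
finalpatentclasses = pd.DataFrame(columns=allcolumns)
for isec, secrow in IPCSection.iterrows():
    for icl, clrow in IPCClass.iterrows():
        # Same match condition as in the question
        if secrow.iloc[0] in clrow.iloc[0]:
            newrow = pd.DataFrame(columns=allcolumns)
            # Join the section and class values into one flat array
            values = np.concatenate((secrow.values, clrow.values), axis=0)
            # Add the array as the first row of newrow
            newrow.loc[len(newrow.index)] = values
            finalpatentclasses = pd.concat([finalpatentclasses, newrow], axis=0)

# Every appended row carries index 0, so renumber the rows
finalpatentclasses.reset_index(drop=True, inplace=True)
display(finalpatentclasses)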
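For larger tables, the nested loops could possibly be replaced by a cross join followed by a filter. The sketch below is an alternative, not the approach used in the answers: it relies on merge(how="cross"), which needs pandas 1.2 or newer, and it assumes the two frames have no column names in common. The sample frames are hypothetical.

```python
import pandas as pd

# Hypothetical sample data standing in for the real frames
IPCSection = pd.DataFrame({
    "section": ["A", "B"],
    "section_title": ["Human necessities", "Operations"],
})
IPCClass = pd.DataFrame({
    "class": ["A01", "A21", "B01"],
    "class_title": ["Agriculture", "Baking", "Separating"],
})

# Pair every section row with every class row
pairs = IPCSection.merge(IPCClass, how="cross")

# Compare the first column of each original frame, as the loop version does
sec_col = IPCSection.columns[0]
cls_col = IPCClass.columns[0]
mask = [s in c for s, c in zip(pairs[sec_col], pairs[cls_col])]

# Keep only the matching pairs and renumber the rows
finalpatentclasses = pairs.loc[mask].reset_index(drop=True)
print(finalpatentclasses)
```

The cross join builds every combination before filtering, so it trades memory for fewer Python-level loop iterations.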
