How to transmit cell values of one row into another row of another dataframe with different columns

I have a problem copying the values of a row in one dataframe into the matching row of another dataframe. The columns I use to compare the rows (name in one dataframe, file in the other) do not hold identical values.

Here is an example:

import pandas as pd

dfzwei = pd.DataFrame([
        {"name": 'web', "b": 10, "c": 2, "d": 21},
        {"name": 'app', "b": 77, "c": 4, "d": 12},
        {"name": 'user', "b": 56, "c": 20, "d": 40},
        {"name": 'code', "b": 44, "c": 8, "d": 70},
        {"name": 'this', "b": 44, "c": 8, "d": 70},
        {"name": 'well', "b": 44, "c": 8, "d": 70}
    ])

# Raw strings (r'...') keep the backslashes literal; without them '\u' is a syntax error.
df = pd.DataFrame([
        {"file": r'bin\main\src\user.java', "b": 10, "c": 0, "d": 99},
        {"file": r'bin\main\src\web.java', "b": 12, "c": 0, "d": 80},
        {"file": r'bin\main\src\code.java', "b": 16, "c": 1, "d": 90},
        {"file": r'bin\main\src\app.cs', "b": 18, "c": 10, "d": 33}
    ])

I want to copy the values of dfzwei.b into df.b (and likewise for c and d). I have tried it like this:

for row in df.itertuples():
    for row2 in dfzwei.itertuples():
        if row2.name in row.file:
            df.at[row.Index, 'b'] = dfzwei.at[row2.Index, 'b']
            df.at[row.Index, 'c'] = dfzwei.at[row2.Index, 'c']
            df.at[row.Index, 'd'] = dfzwei.at[row2.Index, 'd']

I need to make sure that a row of dfzwei really has a counterpart in df, which I can't take for granted. The indexes of the two dataframes don't match either. That's why I use "if row2.name in row.file".
When I run this on big dataframes, the cell values get mixed up seemingly at random and only some are right. I would be very glad for a solution. Thank you very much for any hints.

EDIT

My mistake was to assume that each name occurs only once in the df.file column. I was iterating over the file paths (file) in df and trying to match them with the class names (name) in dfzwei. The issue was that some class names were very similar. For example, in df:

# Raw strings (r'...') keep the backslashes literal; without them '\u' is a syntax error.
df = pd.DataFrame([
        {"file": r'bin\main\src\userw6.java', "b": 10, "c": 0, "d": 99},
        {"file": r'bin\main\src\webapp.py', "b": 12, "c": 0, "d": 80},
        {"file": r'bin\main\src\code.cs', "b": 16, "c": 1, "d": 90},
        {"file": r'bin\main\src\app.java', "b": 18, "c": 10, "d": 33}
    ])

So dfzwei had, for example, these class names:

dfzwei = pd.DataFrame([
        {"name": 'web', "b": 10, "c": 2, "d": 21},
        {"name": 'app', "b": 77, "c": 4, "d": 12},
        {"name": 'user', "b": 56, "c": 20, "d": 40},
        {"name": 'w6', "b": 44, "c": 8, "d": 70},
        {"name": 'code7', "b": 44, "c": 8, "d": 70},
        {"name": 'well', "b": 44, "c": 8, "d": 70}
    ])

So one name was matching multiple file paths in df with the test

if row2.name in row.file :

So the solution is to make sure that each name is matched only with the file path it belongs to. How do I extract the part of file between the last backslash and the '.', so that I can compare it with dfzwei.name?
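
To make the ambiguity concrete, here is a minimal sketch in plain Python that compares the substring test with an exact comparison against the file name without its extension. The paths and names are copied from the EDIT example (with the stray space in ' app' removed, which is an assumption); the helper names files, names and results are only illustrative.

```python
from pathlib import PureWindowsPath

# Paths and class names from the EDIT example (assumption: ' app' is meant to be 'app').
files = [r'bin\main\src\userw6.java', r'bin\main\src\webapp.py',
         r'bin\main\src\code.cs', r'bin\main\src\app.java']
names = ['web', 'app', 'user', 'w6', 'code7', 'well']

results = {}
for path in files:
    # File name without directories and extension, e.g. 'userw6'.
    stem = PureWindowsPath(path).stem
    results[stem] = {
        # Substring test: every name that occurs anywhere in the path.
        'substring': [n for n in names if n in path],
        # Exact test: only a name equal to the whole stem.
        'exact': [n for n in names if n == stem],
    }

for stem, hits in results.items():
    print(stem, hits)
```

For 'userw6' the substring test hits both 'user' and 'w6', so whichever row of dfzwei is visited last overwrites the earlier one. The exact test matches at most one name per file.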

CodePudding user response：

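You can build a lookup Series indexed by the file name and map it onto the class names (the snippet assumes df has a class column and df2 has a file column with '/'-separated paths):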

df['b'] = df['class'].map(df2.set_index(df2['file'].str.rsplit('/', n=1).str[1])['b'])
print(df)

# Output
  class   b
0   web  12
1   app  18
2  user  10
3  code  16
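
The same mapping idea can be sketched for the frames used in the question, where the paths use backslashes, carry a file extension, and the values flow from dfzwei into df. This is only an illustration under stated assumptions: the names in dfzwei are unique (Series.map needs a unique index on the lookup), ' app' is meant to be 'app', and the variables stem and lookup are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'file': [r'bin\main\src\user.java', r'bin\main\src\web.java',
             r'bin\main\src\code.java', r'bin\main\src\app.cs'],
    'b': [10, 12, 16, 18],
})
dfzwei = pd.DataFrame({
    'name': ['web', 'app', 'user', 'code', 'this', 'well'],
    'b': [10, 77, 56, 44, 44, 44],
})

# Keep the part after the last backslash, then drop the extension.
stem = (df['file'].str.rsplit('\\', n=1).str[-1]
                  .str.rsplit('.', n=1).str[0])

# Lookup Series: class name -> b (assumes unique names in dfzwei).
lookup = dfzwei.set_index('name')['b']

# Map each stem to its b value; keep the old value where no name matches.
df['b'] = stem.map(lookup).fillna(df['b'])
print(df)
```

Because the comparison is on the whole stem rather than a substring, a name such as 'app' can no longer hit a file like 'webapp.py'.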

CodePudding user response：

If anyone else comes across this, my final solution was:

import re

for row in df.itertuples():
    # Take everything after the last backslash, e.g. 'user.java'.
    filename = row.file[row.file.rindex('\\') + 1:]
    # Strip the '.java' extension; files with another extension are skipped.
    match = re.search(r'(.*)\.java$', filename)
    if match is None:
        continue
    stem = match.group(1)
    for row2 in dfzwei.itertuples():
        if row2.name == stem:
            df.at[row.Index, 'b'] = dfzwei.at[row2.Index, 'b']
            df.at[row.Index, 'c'] = dfzwei.at[row2.Index, 'c']
            df.at[row.Index, 'd'] = dfzwei.at[row2.Index, 'd']
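
A possible variant of this loop, sketched under assumptions rather than taken from the thread: it accepts any file extension by using pathlib.PureWindowsPath, and it replaces the inner loop with a dictionary lookup so that each file is compared once. The names cols and lookup are illustrative, the data is the first example from the question with ' app' read as 'app', and to_dict('index') requires the names in dfzwei to be unique.

```python
from pathlib import PureWindowsPath

import pandas as pd

df = pd.DataFrame({
    'file': [r'bin\main\src\user.java', r'bin\main\src\web.java',
             r'bin\main\src\code.java', r'bin\main\src\app.cs'],
    'b': [10, 12, 16, 18],
    'c': [0, 0, 1, 10],
    'd': [99, 80, 90, 33],
})
dfzwei = pd.DataFrame({
    'name': ['web', 'app', 'user', 'code', 'this', 'well'],
    'b': [10, 77, 56, 44, 44, 44],
    'c': [2, 4, 20, 8, 8, 8],
    'd': [21, 12, 40, 70, 70, 70],
})

cols = ['b', 'c', 'd']
# {name: {'b': ..., 'c': ..., 'd': ...}}; needs unique names in dfzwei.
lookup = dfzwei.set_index('name')[cols].to_dict('index')

for idx, path in df['file'].items():
    # File name without directories and extension, whatever the extension is.
    stem = PureWindowsPath(path).stem
    if stem in lookup:
        for col in cols:
            df.at[idx, col] = lookup[stem][col]

print(df)
```

Rows whose stem has no entry in dfzwei are left unchanged, which matches the requirement that a name may not be present in df at all.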