Home > Software engineering >  Pandas - ApplyMap using corresponding cell values from two different DataFrames
Pandas - ApplyMap using corresponding cell values from two different DataFrames

Time:06-13

I have two pandas dataframes and want to apply a function "Stat / Percentage Played". The first data frame - "Stat" looks like this:

    -  -----  -------  -------  -------  -------
    0  Name   Round 1  Round 2  Round 3  Round 4
    1  Tom    10       22       18       21
    2  Angus  13       16       21       19
    3  Jack   18       19       16       17
    4  Harry  21       19       15       19
    5  Ben    19       24       22       27
    -  -----  -------  -------  -------  -------

The second dataframe shows the amount of time (percentage) a player has spent on the field, and looks like this:

    -  -----  -------  -------  -------  -------
    0  Name   Round 1  Round 2  Round 3  Round 4
    1  Tom    65       80       81       80
    2  Angus  72       76       85       79
    3  Jack   79       81       68       75
    4  Harry  79       78       62       77
    5  Ben    81       84       76       85
    -  -----  -------  -------  -------  -------

With these I want to applymap to each cell from both dataframes the get the corresponding player stat from that round to be divided by the playing time pertage for that player in the round. The function being applied for each cell would look like so:

    -  -----  -------  -------  -------  -------
    0  Name   Round 1  Round 2  Round 3  Round 4
    1  Tom    10/65    22/80    18/81    21/80
    2  Angus  13/72    16/76    21/85    19/79
    3  Jack   18/79    19/81    16/68    17/75
    4  Harry  21/79    19/78    15/62    19/77
    5  Ben    19/81    24/84    22/76    27/85
    -  -----  -------  -------  -------  -------

Furthermore the final values in my final dataframe should look like this:

    -  -----  -------------------  -------------------  -------------------  -------------------
    0  Name   Round 1              Round 2              Round 3              Round 4
    1  Tom    0.15384615384615385  0.275                0.2222222222222222   0.2625
    2  Angus  0.18055555555555555  0.21052631578947367  0.24705882352941178  0.24050632911392406
    3  Jack   0.22784810126582278  0.2345679012345679   0.23529411764705882  0.22666666666666666
    4  Harry  0.26582278481012656  0.24358974358974358  0.24193548387096775  0.24675324675324675
    5  Ben    0.2345679012345679   0.2857142857142857   0.2894736842105263   0.3176470588235294
    -  -----  -------------------  -------------------  -------------------  -------------------

The only way I can figure out a method is by converting to lists and using loops to complete this function, for example:

    datlist = df.values.tolist()
    headerList = datlist[:1]  # Sava Header row to merge after calulcations
    statList = datlist[1:]  # Remove Header row for calculations
    perlist = dfPerc.values.tolist()[1:]  # Remove Header row for calculations

    AdjList = []
    for z in range(len(statList)):
        PlayerList = []
        for i in range(len(statList[z])):
            if isinstance(statList[z][i], int) == True:
                adjStat = statList[z][i] / perlist[z][i]
            else:
                adjStat = statList[z][i]
            PlayerList.append(adjStat)
        AdjList.append(PlayerList)

    AdjList.insert(0, headerList[0])
    finaldf = pd.DataFrame(AdjList)
    print(finaldf)

Ouput:

    -  -----  -------------------  -------------------  -------------------  -------------------
    0  Name   Round 1              Round 2              Round 3              Round 4
    1  Tom    0.15384615384615385  0.275                0.2222222222222222   0.2625
    2  Angus  0.18055555555555555  0.21052631578947367  0.24705882352941178  0.24050632911392406
    3  Jack   0.22784810126582278  0.2345679012345679   0.23529411764705882  0.22666666666666666
    4  Harry  0.26582278481012656  0.24358974358974358  0.24193548387096775  0.24675324675324675
    5  Ben    0.2345679012345679   0.2857142857142857   0.2894736842105263   0.3176470588235294
    -  -----  -------------------  -------------------  -------------------  -------------------

However this method seems very inefficient and creates temporary lists which I feel is unnecessary. Is there a better method where I can complete this calculation without using lists and loops - for example the applymap function?

CodePudding user response:

Use pandas vectorization:

>>> columns = ['Round 1', 'Round 2', 'Round 3', 'Round 4']
>>> final = df1[columns] / df2[columns]

You can also use filter

>>> df1.filter(like='Round')/df2.filter(like='Round')

Can get the names back by assigning:

>>> final['Name'] = df1['Name']
  • Related