Home > Net >  Pandas dataframe: Calculate ratios within a group
Pandas dataframe: Calculate ratios within a group

Time:12-05

I have a dataframe of the following structure:

df = pd.DataFrame(
    {
        "Farm ID": ["1", "2", "2", "3", "3"],
        "Crop": ["Type A", "Type A", "Type B", "Type B", "Type B"],
        "Plot ID": ["A", "A", "B", "A", "B"],
        "Plot area": ["8", "4", "2", "3"],
        "More values": ["101", "215", "13", "0.6"],
        "More text": ["foo", "bar", "foobar", "barfoo"],
    }
)

For each plot I now want to know the ratio, the plot area has on the total farm area in case there are different crops on the plots. I then also want to combine plots with the same crop. So the output should look like this:

                 Plot ID    Plot area  area share More values   More text
Farm ID Crop                
1       Type A   A          8          1          101           foo
2       Type A   A          4          0.66       215           bar
        Type B   B          2          0.33       13            foobar
3       Type B   AB         35         1          0.642     barfoonothing

I know, that I can use dfNew = df.groupby(["Farm ID", "Crop"]).sum() to get the correct structure, and FarmArea = df.groupby(level=0).sum() to get the farm area, but I cannot figure out how to get area share. Any suggestions?

CodePudding user response:

You can use transform:

df = pd.DataFrame(
    {
        "Farm ID": ["1", "2", "2", "3"],
        "Crop": ["Type A", "Type A", "Type B", "Type B"],
        "Plot ID": ["A", "A", "B", "A"],
        "Plot area": [8, 4, 2, 3],
        "More values": ["101", "215", "13", "0.6"],
        "More text": ["foo", "bar", "foobar", "barfoo"],
    }
)
dfNew = df.groupby(["Farm ID", "Crop"]).sum()

dfNew["Area share"]=dfNew["Plot area"] / dfNew.groupby(["Farm ID"])["Plot area"].transform("sum")

dfNew

output:

        Plot area   Area share
Farm ID     Crop        
1   Type A  8   1.000000
2   Type A  4   0.666667
    Type B  2   0.333333
3   Type B  3   1.000000

Please note that I loaded Plot area as numeric, if that is not the case you may need to convert it before using transform.

  • Related