Home > OS >  How to modify a dataframe with conditions for each ID
How to modify a dataframe with conditions for each ID

Time:03-17

I have a pandas dataframe that looks like this (its a pretty big one)

id pen usd PEN/USD
1 5 0 pen/usd
1 0 20 pen/usd
1 15 0 pen/usd
1 25 0 pen/usd
1 35 0 pen/usd
1 45 0 pen/usd
2 0 15 usd
2 0 5 usd
2 0 5 usd
3 15 0 pen
3 10 0 pen
4 55 0 pen/usd
4 70 0 pen/usd
4 0 60 pen/usd

Thanks for your answer. My question was how to create the column "PEN/USD". Each id tells us if it has an amount in "pen" or "usd" depending on the column. For example: for id "1" there are 6 occurrences. 5 in "pen" and 1 in "usd". Therefore, it has "pen/usd". Even if you have only one amount of one of the two, it should be "pen/usd" in the column "PEN/USD".

CodePudding user response:

You can use groupby followed by a merge:

totals = df2.groupby('id', as_index=False)['pen', 'usd'].sum()
print(totals)

    id  pen usd
0   1   125 20
1   2   0   25
2   3   25  0
3   4   125 60

# Assing values to 'PEN/USD' 
totals.loc[(totals['pen']>0) & (totals['usd']>0), 'PEN/USD'] = 'pen/usd'
totals.loc[(totals['pen']==0) & (totals['usd']>0), 'PEN/USD'] = 'pen'
totals.loc[(totals['pen']>0) & (totals['usd']==0), 'PEN/USD'] = 'usd'

df.merge(counts[['id', 'PEN/USD']], on='id')


    id  pen usd PEN/USD
0   1   5   0   pen/usd
1   1   0   20  pen/usd
2   1   15  0   pen/usd
3   1   25  0   pen/usd
4   1   35  0   pen/usd
5   1   45  0   pen/usd
6   2   0   15  pen
7   2   0   5   pen
8   2   0   5   pen
9   3   15  0   usd
10  3   10  0   usd
11  4   55  0   pen/usd
12  4   70  0   pen/usd
13  4   0   60  pen/usd

CodePudding user response:

I assume that you want to check if an ID contains both currencies or just one of the two.

temp = """
1   5   0   pen/usd
1   0   20  pen/usd
1   15  0   pen/usd
1   25  0   pen/usd
1   35  0   pen/usd
1   45  0   pen/usd
2   0   15  usd
2   0   5   usd
2   0   5   usd
3   15  0   pen
3   10  0   pen
4   55  0   pen/usd
4   70  0   pen/usd
4   0   60  pen/usd
"""
temp2 = re.sub(r"\t", r",", temp)
temp2 = io.StringIO(temp) 
df = pd.read_csv(temp2, sep="\t",header=None)
df = df.rename(columns={0:"id",1:"pen",2:"usd",3:"PEN/USD"})

df.groupby(["id","PEN/USD"]).any().reset_index()

    id  PEN/USD  pen    usd
0   1   pen/usd  True   True
1   2   usd      False  True
2   3   pen      True   False
3   4   pen/usd  True   True
  • Related