Changing a certain cell of a DataFrame based off of other values

I have a list of files that each contain a pandas DataFrame (1550 x 7), and I want to edit them slightly. Every row corresponds to a different atom. I want to find the atoms whose z value is between zmax-5 and zmax, and for those atoms change 'moltype' from 1 to 2. I chose a for loop that iterates over the rows. I believe I can find the rows where z is in the wanted range, but I am having trouble changing that atom's moltype inside the loop.

Here is the simplified DataFrame (sorted by z, descending):

     index  atomtype  moltype  charge         x         y        z
724    725         1        1     0.0 -6.184180 -2.371150  28.2445
739    740         1        1     0.0  5.902450 -3.004580  28.2445
712    713         1        1     0.0 -0.344071  6.614240  28.2445
711    712         1        1     0.0  1.033570  6.542040  28.2445
736    737         1        1     0.0  4.166110 -5.148780  28.2445
..     ...       ...      ...     ...       ...       ...      ...
29      30         1        1     0.0 -1.716680 -6.396840 -27.0166
30      31         1        1     0.0  1.038610 -6.541230 -27.0166
33      34         1        1     0.0  2.371140 -6.184180 -27.0166
34      35         1        1     0.0  4.685090 -4.681490 -27.0166
0        1         1        1     0.0  6.614230  0.344075 -27.0166

Here is the for loop I have been messing around with:

for row in AtomData.itertuples():     #or iterrows()
    if (row.z) >= (zmax-5):
         AtomData.loc[row, 2]=2

CodePudding user response：

One way is to build a list and add it as a column (or overwrite an existing column, if you use a name that already exists):

l = []
for row in AtomData.itertuples():     # or iterrows()
    if row.z >= zmax - 5:
        l.append(2)
    else:
        l.append(1)
AtomData['new_atomtype'] = l

But you could do it without a loop, for instance like this:

# True/False becomes 1/0 after astype(int); adding 1 gives 2 inside the range and 1 outside
AtomData['new_atomtype_2'] = (AtomData['z'] >= (zmax - 5)).astype(int) + 1
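
If the goal is to change 'moltype' itself rather than add a new column, a boolean mask with .loc is one possible way to do it. This is only a minimal sketch: the small AtomData frame below is a made-up stand-in for the real 1550 x 7 data, and it assumes zmax is the maximum of the z column.

```python
import pandas as pd

# Hypothetical stand-in for AtomData (values invented for illustration)
AtomData = pd.DataFrame({
    "moltype": [1, 1, 1, 1],
    "z": [28.2445, 24.0, 23.0, -27.0166],
})

zmax = AtomData["z"].max()          # assumption: zmax is the largest z in the frame
mask = AtomData["z"] >= zmax - 5    # True for rows within 5 of zmax
AtomData.loc[mask, "moltype"] = 2   # overwrite moltype only on the masked rows
print(AtomData)
```

Because the mask is aligned on the index, this works regardless of how the rows are sorted or labelled.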

CodePudding user response：

import pandas as pd

############### Recreate OP's df ##########
atomtype = [1, 1, 1]
moltype = [1, 1, 1]
charge = [0.0, 0.0, 0.0]
x = [-6.184180, 5.902450, -0.344071]
y = [-2.371150, -3.004580, 6.614240]
z = [28.2445, 25.2445, 22.2445]

df = pd.DataFrame({"atomtype": atomtype, "moltype": moltype, "charge": charge, "x": x, "y": y, "z": z})
############################################


zmax = df.z.max()  # find z max
# row positions where zmax - 5 < z <= zmax
index = [i for i, z_val in enumerate(df.z) if zmax - 5 < z_val <= zmax]
# positions equal index labels here because df has the default RangeIndex;
# .loc avoids the chained assignment df["moltype"][index] = 2
df.loc[index, "moltype"] = 2  # replace moltype with 2 for those rows
df

Output:

atomtype  moltype  charge         x         y        z
       1        2     0.0 -6.184180  -2.37115  28.2445
       1        2     0.0  5.902450  -3.00458  25.2445
       1        1     0.0 -0.344071   6.61424  22.2445
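
One caveat: positions from enumerate only match the index labels when the frame has the default 0..n-1 index. The frame in the question is sorted by z and carries labels such as 724 and 739, so a mask-based variant may be safer there. A sketch, with the labels below picked from the question purely for illustration:

```python
import pandas as pd

# Same three z values as above, but with a non-default index (labels are illustrative)
df = pd.DataFrame(
    {"moltype": [1, 1, 1], "z": [28.2445, 25.2445, 22.2445]},
    index=[724, 739, 712],
)

zmax = df["z"].max()
# two-sided range check: zmax - 5 < z <= zmax
in_range = (df["z"] > zmax - 5) & (df["z"] <= zmax)
df.loc[in_range, "moltype"] = 2
print(df)
```

The boolean Series shares the frame's index, so no translation between positions and labels is needed.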