I have a list of files that each contain a pandas DataFrame (1550 x 7), and I want to edit them slightly. Each row corresponds to a different atom. I am trying to find the atoms whose z value is between zmax-5 and zmax, and for those atoms I would like to change 'moltype' from 1 to 2. I chose to run a for loop over the rows. I believe I can find the rows whose z value is in the wanted range, but I am having trouble changing that atom's moltype inside the loop.
Here is the simplified DataFrame (sorted by z, descending):
     index  atomtype  moltype  charge         x         y        z
724    725         1        1     0.0 -6.184180 -2.371150  28.2445
739    740         1        1     0.0  5.902450 -3.004580  28.2445
712    713         1        1     0.0 -0.344071  6.614240  28.2445
711    712         1        1     0.0  1.033570  6.542040  28.2445
736    737         1        1     0.0  4.166110 -5.148780  28.2445
..     ...       ...      ...     ...       ...       ...      ...
29      30         1        1     0.0 -1.716680 -6.396840 -27.0166
30      31         1        1     0.0  1.038610 -6.541230 -27.0166
33      34         1        1     0.0  2.371140 -6.184180 -27.0166
34      35         1        1     0.0  4.685090 -4.681490 -27.0166
0        1         1        1     0.0  6.614230  0.344075 -27.0166
Here is the for loop I have been experimenting with:
for row in AtomData.itertuples(): #or iterrows()
    if (row.z) >= (zmax-5):
        AtomData.loc[row, 2]=2
CodePudding user response:
One way is to build a list and add it as a column (or overwrite an existing column, if you use a name that already exists):
l = []
for row in AtomData.itertuples(): #or iterrows()
    if (row.z) >= (zmax-5):
        l.append(2)
    else:
        l.append(1)
AtomData['new_atomtype'] = l
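If the goal is to change moltype itself from inside the loop, the likely problem in the question's code is that `AtomData.loc[row, 2]` passes the whole row tuple as the row label and the integer 2 as a column label. A minimal sketch of a loop that addresses the cell by its index label and column name instead, assuming a small stand-in DataFrame (the values here are illustrative, not the real file):

```python
import pandas as pd

# Stand-in for AtomData; only the columns the loop needs (illustrative values).
AtomData = pd.DataFrame({
    "moltype": [1, 1, 1, 1],
    "z": [28.2445, 25.0, 23.0, -27.0166],
})
zmax = AtomData["z"].max()

for row in AtomData.itertuples():
    # row.Index is the index label of the current row; row.z is its z value.
    if zmax - 5 <= row.z <= zmax:
        # .at sets a single cell addressed by (row label, column name).
        AtomData.at[row.Index, "moltype"] = 2
```

This keeps the loop structure from the question and only changes how the cell is addressed.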
But you could do it without a loop, for instance like this (the comparison gives True/False, which becomes 1/0, and adding 1 yields 2/1):
AtomData['new_atomtype_2'] = (AtomData['z']>=(zmax-5)).astype(int) + 1
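The same idea can also update moltype in place rather than creating a new column. A minimal sketch using a boolean mask with `.loc`, again assuming a small stand-in DataFrame with illustrative values:

```python
import pandas as pd

# Stand-in for AtomData; only the columns needed here (illustrative values).
AtomData = pd.DataFrame({
    "moltype": [1, 1, 1, 1],
    "z": [28.2445, 25.0, 23.0, -27.0166],
})
zmax = AtomData["z"].max()

# True for rows with zmax - 5 <= z <= zmax (between is inclusive by default).
mask = AtomData["z"].between(zmax - 5, zmax)

# Assign 2 to moltype only on the selected rows; other rows are untouched.
AtomData.loc[mask, "moltype"] = 2
```

Because the mask is aligned on the index, this should work regardless of how the rows are sorted.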
CodePudding user response:
import pandas as pd

############### Recreate OP's df ##########
atomtype = [1,1,1]
moltype = [1,1,1]
charge = [0.0, 0.0, 0.0]
x = [-6.184180, 5.902450, -0.344071]
y = [-2.371150, -3.004580, 6.614240]
z = [28.2445,25.2445,22.2445]
df = pd.DataFrame({"atomtype":atomtype, "moltype":moltype, "charge":charge, "x":x, "y":y, "z":z})
############################################
zmax = df.z.max() # find z max
# positions of values between zmax - 5 and zmax (equal to the index labels here, since df has the default 0..n-1 index)
index = [i for i, zval in enumerate(df.z) if zmax-5 < zval <= zmax]
df.loc[index, "moltype"] = 2 # replace moltype with 2 for those rows (.loc avoids chained assignment)
df
Output:
atomtype  moltype  charge         x        y        z
       1        2     0.0 -6.184180 -2.37115  28.2445
       1        2     0.0  5.902450 -3.00458  25.2445
       1        1     0.0 -0.344071  6.61424  22.2445