I have a subset dataframe taken from a much larger dataframe. I need to write a for loop that searches through the dataframe and pulls out the data corresponding to the correct name.
import pandas as pd
import numpy as np
import re
data = {'Name': ['CH_1', 'CH_2', 'CH_3', 'FV_1', 'FV_2', 'FV_3'],
        'Value': [1, 2, 3, 4, 5, 6]
        }
df = pd.DataFrame(data)
FL = [17.7, 60.0]
CH = [20, 81.4]
tol = 8
time1 = FL[0] * tol
time2 = FL[1] * tol
time3 = CH[0] * tol
time4 = CH[1] * tol
FH_mon = df['Value'] * 5
workpercent = [.7, .92, .94]
mhpy = [2087, 2503, 3128.75]
list1 = list()
list2 = list()
for x in df['Name']:
    if x == [(re.search('FV_', s)) for s in df['Name'].values]:
        y = np.select([FH_mon < time1, (FH_mon >= time1) and (FH_mon < time2), FH_mon > time2], [workpercent[0], workpercent[1], workpercent[2]])
        z = np.select([FH_mon < time1, (FH_mon >= time1) and (FH_mon < time2), FH_mon > time2], [mhpy[0], mhpy[1], mhpy[2]])
    if x == [(re.search('CH_', s)) for s in df['Name'].values]:
        y = np.select([FH_mon < time3, (FH_mon >= time3) and (FH_mon < time4)], [workpercent[0], workpercent[1]])
        z = np.select([FH_mon < time3, (FH_mon >= time3) and (FH_mon < time4)], [mhpy[0], mhpy[1]])
    list1.append(y)
    list2.append(z)
I asked a simpler version of this earlier, where I was just adding a couple of numbers, and got really helpful answers, but here is the more complex version. I need to search through the Name column, and any time a name contains FV, the if block should run and use the data from the rows whose Name contains FV. Same for CH. The lists keep track of each value as the loop goes through the Name column. If there is a simpler way I would really appreciate seeing it. Right now this seems like the cleanest approach, but I am either receiving errors or the loop does not function properly.
CodePudding user response:
If the "Name" column only has values starting with "FV_" or "CH_", use where:
df["Value"] = df["Value"].add(2).where(df["Name"].str.startswith("FV_"), df["Value"].add(4))
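As a quick illustration of how where behaves in that line: the caller keeps its own values wherever the condition is True and takes the second argument everywhere else. A minimal sketch with made-up values (s and cond are illustrative names, not the question's data):

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])
cond = pd.Series([True, False, True, False])

# Rows where cond is True keep s + 2; the remaining rows take s + 4.
result = s.add(2).where(cond, s.add(4))
print(result.tolist())
```

So the condition decides, row by row, which of the two precomputed Series supplies the value.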
If you might have other values in "Name", use numpy.select:
import numpy as np
df["Value"] = np.select([df["Name"].str.startswith("FV_"), df["Name"].str.startswith("CH_")], [df["Value"].add(2), df["Value"].add(4)])
Output:
>>> df
   Name  Value
0  CH_1      5
1  CH_2      6
2  CH_3      7
3  FV_1      6
4  FV_2      7
5  FV_3      8
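The same numpy.select idea might carry over to the thresholds in the question without a loop. This is only a sketch under a few assumptions that the question does not confirm: each bound is multiplied by tol (the operator is missing in the posted code), CH_ rows at or above their upper bound take the third entry of workpercent and mhpy (the question lists only two choices for CH), and the top bin uses >= so a value exactly on the upper bound is not left unmatched. The helper name add_work_columns is hypothetical.

```python
import numpy as np
import pandas as pd

FL = [17.7, 60.0]               # bounds used for FV_ rows in the question
CH = [20, 81.4]                 # bounds used for CH_ rows in the question
tol = 8
workpercent = [0.7, 0.92, 0.94]
mhpy = [2087, 2503, 3128.75]


def add_work_columns(df):
    """Hypothetical helper: return a copy of df with workpercent and mhpy columns."""
    out = df.copy()
    fh_mon = out["Value"] * 5
    is_fv = out["Name"].str.startswith("FV_")
    is_ch = out["Name"].str.startswith("CH_")

    # Per-row bounds picked by prefix; rows with any other prefix get NaN.
    # Assumption: each bound is multiplied by tol.
    lower = np.select([is_fv, is_ch], [FL[0] * tol, CH[0] * tol], default=np.nan)
    upper = np.select([is_fv, is_ch], [FL[1] * tol, CH[1] * tol], default=np.nan)

    # Element-wise tests need & instead of the Python keyword `and`.
    conditions = [
        fh_mon < lower,
        (fh_mon >= lower) & (fh_mon < upper),
        fh_mon >= upper,  # assumption: CH_ rows also use the third value here
    ]
    out["workpercent"] = np.select(conditions, workpercent, default=np.nan)
    out["mhpy"] = np.select(conditions, mhpy, default=np.nan)
    return out


df = pd.DataFrame({
    "Name": ["CH_1", "CH_2", "CH_3", "FV_1", "FV_2", "FV_3"],
    "Value": [1, 2, 3, 4, 5, 6],
})
result = add_work_columns(df)
print(result)
```

With the sample data and the multiplication assumption, every FH_mon value (5 to 30) falls below both lower bounds, so each row gets the first entry of each list. Rows whose Name matches neither prefix end up as NaN rather than silently receiving a value.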
CodePudding user response:
This should be what you want:
import re

for index, row in df.iterrows():
    if re.search("FV_", row["Name"]):
        df.loc[index, "Value"] = 2
    elif re.search("CH_", row["Name"]):
        df.loc[index, "Value"] = 4
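On a larger frame, the same assignment can probably be done without visiting rows one at a time. A minimal sketch, assuming the sample df from the question. The mask names is_fv and is_ch are illustrative, and str.contains with regex=False is used as a plain-substring stand-in for re.search:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["CH_1", "CH_2", "CH_3", "FV_1", "FV_2", "FV_3"],
    "Value": [1, 2, 3, 4, 5, 6],
})

# Boolean masks computed once for the whole column.
is_fv = df["Name"].str.contains("FV_", regex=False)
is_ch = df["Name"].str.contains("CH_", regex=False)

df.loc[is_fv, "Value"] = 2
# Mirrors the elif: only rows that did not already match FV_.
df.loc[is_ch & ~is_fv, "Value"] = 4
print(df)
```

Rows matching neither pattern keep their original Value, which matches what the loop does.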