How to conditionally assign values from another dataframe?-CodePudding

I want to merge 2 dataframes without using the function '.merge' and I try to assign a value to a dataframe column based on an interval and an id.

intervals = pd.DataFrame(
    columns=["ID", "From", "To", "Interval"], data=[[1, 0, 100, "In"]]
)
print("intervals\n", intervals, "\n")

points = pd.DataFrame(
    columns=["ID", "From", "To", "Value"],
    data=[[1, 1, 2, "A"], [2, 2, 10, "B"], [1, 15, 150, "C"], [1, 345, 350, "D"]],
)
print('points\n',points,'\n')

My attempt:

def calculate_value(x):
    return intervals.loc[
        (x >= intervals["From"]) & (x < intervals["To"]), "Interval"
    ].squeeze()

desired_result = points.copy()
desired_result['Value_New'] = desired_result['From'].apply(calculate_value)

and the output:

But I want this:

The points must be between the interval, that's why "15 - 150" doesn't have value in "Value_New".

How to do that?

Source: How to assign values based on an interval in Pandas

CodePudding user response：

Here is one way to do it:

# Filter "points" dataframe to retain only the rows that match given conditions
mask = (
    (points["ID"].isin(intervals["ID"].values))
    & (points["From"] >= intervals["From"].values[0])
    & (points["To"] <= intervals["To"].values[0])
)

# Add a new column "Value_New" to "points" dataframe and fill it once with bool values
# and then with the desired values ("In" or empty string)
points = points.assign(Value_New=mask).pipe(
    lambda df: df.assign(Value_New=df["Value_New"].apply(lambda x: "In" if x else ""))
)

print(points)
# Output
    ID  From   To Value Value_New
0   1     1    2     A        In
1   2     2   10     B
2   1    15  150     C
3   1   345  350     D