Although it is not good coding practice, I've run into a special kind of problem in which I need to go through a column of lists to erase particular values. I suppose one solution would be to melt the 'neighbors' column, but I believe the code I have is close to the objective. I've prepared a reproducible example for better understanding:
import pandas as pd
import numpy as np
def removing_nan_neighboors(custom_df):
    nan_list = list(custom_df[custom_df['values'].notna()]['customer'])
    print(nan_list)
    custom_df['neighbors'] = [x for x in custom_df['neighbors'] if x not in nan_list]
    return custom_df
customer = [1, 2, 3, 4, 5, 6]
values = [np.nan, np.nan, 10, np.nan, 11, 12]
neighbors = [[6, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 1]]
df = pd.DataFrame({'customer': customer, 'values': values, 'neighbors': neighbors})
df = removing_nan_neighboors(df)
print(df)
   customer  values neighbors
0         1     NaN    [6, 2]
1         2     NaN    [1, 3]
2         3    10.0    [2, 4]
3         4     NaN    [3, 5]
4         5    11.0    [4, 6]
5         6    12.0    [5, 1]
The objective is to erase the customer numbers from the neighbors if those customers have NaN values:
   customer  values neighbors
0         1     NaN       [6]
1         2     NaN       [3]
2         3    10.0        []
3         4     NaN    [3, 5]
4         5    11.0       [6]
5         6    12.0       [5]
But I have failed to get that far, for my function doesn't work as intended yet. Help is appreciated.
CodePudding user response:
Try:
df["cust_1"] = np.where(
    np.isnan(np.roll(df["values"], 1)),
    np.nan,
    np.roll(df["customer"], 1),
)
df["cust_2"] = np.where(
    np.isnan(np.roll(df["values"], -1)),
    np.nan,
    np.roll(df["customer"], -1),
)
df["neighbors"] = df[["cust_1", "cust_2"]].agg(
    lambda x: list(x[x.notna()].astype(int)), axis=1
)
df = df.drop(columns=["cust_1", "cust_2"])
print(df)
Prints:
   customer  values neighbors
0         1     NaN       [6]
1         2     NaN       [3]
2         3    10.0        []
3         4     NaN    [3, 5]
4         5    11.0       [6]
5         6    12.0       [5]
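Note that the np.roll version relies on each customer's neighbors being the previous and next rows in a ring, which holds for the example data. If that may not hold, here is a rough sketch of the "melting" idea mentioned in the question, using explode on the existing neighbors lists. The names nan_customers, exploded, kept and filtered are only illustrative, and the frame is rebuilt so the snippet runs on its own:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "customer": [1, 2, 3, 4, 5, 6],
    "values": [np.nan, np.nan, 10, np.nan, 11, 12],
    "neighbors": [[6, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 1]],
})

# Customers whose value is missing.
nan_customers = set(df.loc[df["values"].isna(), "customer"])

# One row per (customer, neighbor) pair; the original index is repeated.
exploded = df.explode("neighbors")

# Drop the pairs whose neighbor has a NaN value.
kept = exploded[~exploded["neighbors"].isin(nan_customers)]

# Collect the surviving neighbors back into one list per original row.
filtered = kept.groupby(level=0)["neighbors"].agg(list)

# Rows that lost every neighbor are absent from `filtered`; give them [].
df["neighbors"] = filtered.reindex(df.index).apply(
    lambda v: v if isinstance(v, list) else []
)
print(df)
```

This makes no assumption about row order, at the cost of an explode/groupby round trip.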
CodePudding user response:
If I understood your objective correctly, you want to remove from every neighbors row the numbers that belong to customer rows where values is NaN. So basically you want to get the result from your last cell.
I attempted to do that in a list comprehension approach:
df['neighbors_new'] = [[n for n in neighbor
                        if n not in df[df['values'].isna()]['customer'].values]
                       for neighbor in df.neighbors]
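The comprehension above recomputes the NaN lookup for every single neighbor. A minimal sketch of the same idea that builds the lookup once as a set, assuming the df from the question (nan_customers is just an illustrative name):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "customer": [1, 2, 3, 4, 5, 6],
    "values": [np.nan, np.nan, 10, np.nan, 11, 12],
    "neighbors": [[6, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 1]],
})

# Customers whose value is missing, collected once for fast membership tests.
nan_customers = set(df.loc[df["values"].isna(), "customer"])

# Keep only the neighbors that are not in the NaN set.
df["neighbors_new"] = [
    [n for n in row if n not in nan_customers] for row in df["neighbors"]
]
print(df)
```

For the example data this should give the same lists as the expected output in the question, and it keeps the original neighbors column untouched for comparison.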