append column values to list if condition met in another column

Time:03-28

Can't quite find the solution to this and it seems pretty basic, but say I have:

import pandas as pd

df = {'ID': [1, 1, 1, 1, 1, 1, 1, 2, 2],
      'x': [76.551, 77.529, 78.336, 77, 76.02, 79.23, 77.733, 79.249, 76.077],
      'y': [151.933, 152.945, 153.970, 119.369, 120.615, 118.935, 119.115, 152.004, 153.027],
      'position': ['start', 'end', 'start', 'NA', 'NA', 'NA', 'end', 'start', 'end']}
df = pd.DataFrame(df)

df
   ID       x        y position
0   1  76.551  151.933    start
1   1  77.529  152.945      end
2   1  78.336  153.970    start
3   1  77.000  119.369       NA
4   1  76.020  120.615       NA
5   1  79.230  118.935       NA
6   1  77.733  119.115      end
7   2  79.249  152.004    start
8   2  76.077  153.027      end

I want to plot the path of each unique ID, with its 'start' and 'end' points marked. I think I can do this with:

import numpy as np
import matplotlib.pyplot as plt

# plt.get_cmap is used here because matplotlib.cm.get_cmap was removed in newer Matplotlib releases
cmap = plt.get_cmap('rainbow')
colors = cmap(np.linspace(0, 1, df['ID'].nunique()))
color_map = dict(zip(df['ID'].unique(), colors))

# fig, ax = plt.subplots()
plt.figure(figsize=(10,10))
for ID, subdf in df.groupby('ID'):
    plt.plot(subdf['x'], subdf['y'], marker=',', label=None, c=color_map[ID], linewidth=0.1)
# ax.legend()

plt.plot(x_start, y_start, 'og') # plot the start coords in green?
plt.plot(x_end, y_end, 'or') # plot the end coords in red?
plt.show()

but I can't figure out how to append the start and end coordinates to empty lists. I thought something like this would work, but it doesn't:

x_start = []
x_end = []
y_start = []
y_end = []

for i in range(len(df)):
    if df.loc[df['position'] == 'start']:
        x_start.append(df['x'][i])
        y_end.append(df['y'][i])
    if df.loc[df['position'] == 'end']:
        x_end.append(df['x'][i])
        y_end.append(df['y'][i])

CodePudding user response:

Are you looking for:

x_start, y_start = df.loc[df['position'] == 'start', ['x', 'y']].values.T.tolist()
x_end, y_end = df.loc[df['position'] == 'end', ['x', 'y']].values.T.tolist()

Output:

>>> x_start
[76.551, 78.336, 79.249]

>>> y_start
[151.933, 153.97, 152.004]

>>> x_end
[77.529, 77.733, 76.077]

>>> y_end
[152.945, 119.115, 153.027]
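
As for why the loop in the question fails: `df.loc[df['position'] == 'start']` returns a whole DataFrame rather than a single True/False, so using it in an `if` raises `ValueError: The truth value of a DataFrame is ambiguous`. The condition also never looks at row `i`, and the first branch appends to `y_end` where `y_start` was presumably intended. If you do want to keep an explicit loop, a minimal sketch (assuming the default 0..n-1 index from the question) would compare the value in row `i` only:

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 1, 1, 1, 1, 1, 1, 2, 2],
    'x': [76.551, 77.529, 78.336, 77, 76.02, 79.23, 77.733, 79.249, 76.077],
    'y': [151.933, 152.945, 153.970, 119.369, 120.615, 118.935, 119.115, 152.004, 153.027],
    'position': ['start', 'end', 'start', 'NA', 'NA', 'NA', 'end', 'start', 'end'],
})

x_start, y_start, x_end, y_end = [], [], [], []

for i in range(len(df)):
    # Compare the single value in row i, not the whole column
    if df['position'][i] == 'start':
        x_start.append(df['x'][i])
        y_start.append(df['y'][i])
    elif df['position'][i] == 'end':
        x_end.append(df['x'][i])
        y_end.append(df['y'][i])

# The boolean-mask version selects the same rows without a Python-level loop
mask_x_start, mask_y_start = df.loc[df['position'] == 'start', ['x', 'y']].values.T.tolist()
```

Both approaches fill the lists with the same values; the mask version is shorter and generally the more idiomatic choice in pandas.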