How to find the smallest positive integer in a data frame row

I have looked everywhere for this answer which must exist. I am trying to find the smallest positive integer per row in a data frame.

Imagine a dataframe

df = pd.DataFrame({'lat': [-120, -90, -100, -100],
                   'long': [20, 21, 19, 18],
                   'dist1': [2, 6, 8, 1],
                   'dist2': [1, 3, 10, 5]})

The following gives me the minimum value per row, but it includes the negative values, i.e. the df['lat'] column:

df.min(axis = 1)

Obviously, I could drop the lat column, or convert to string or something, but I will need it later. The lat column is the only column with negative values. I am trying to return a new column such as

df['min_dist'] = [1,3,8,1]

I hope this all makes sense. Thanks in advance for any help.

CodePudding user response：

In general you can use DataFrame.where to mark negative values as null and exclude them from min calculation:

df['min_dist'] = df.where(df > 0).min(1)
df
   lat  long  dist1  dist2  min_dist
0 -120    20      2      1       1.0
1  -90    21      6      3       3.0
2 -100    19      8     10       8.0
3 -100    18      1      5       1.0
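
For reference, here is a self-contained sketch of the same idea, using the question's data. The name `masked` is only illustrative, and the final cast back to int assumes every row has at least one positive value:

```python
import pandas as pd

df = pd.DataFrame({
    'lat': [-120, -90, -100, -100],
    'long': [20, 21, 19, 18],
    'dist1': [2, 6, 8, 1],
    'dist2': [1, 3, 10, 5],
})

# Values that fail the condition (df > 0) are replaced with NaN.
masked = df.where(df > 0)

# min() skips NaN by default, so only positive values are compared.
# Assumption: no row is entirely non-positive, otherwise the result
# would be NaN and the int cast would raise.
df['min_dist'] = masked.min(axis=1).astype(int)
print(df)
```

The result is float before the cast because NaN forces a floating-point dtype, which is why the output above shows 1.0 rather than 1.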

CodePudding user response：

Filter for just the dist columns and apply the minimum function :

df.assign(min_dist = df.iloc[:, -2:].min(1))
Out[205]: 
   lat  long  dist1  dist2  min_dist
0 -120    20      2      1         1
1  -90    21      6      3         3
2 -100    19      8     10         8
3 -100    18      1      5         1
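
Note that iloc[:, -2:] depends on the distance columns being the last two. If that might change, a possible variant is to select them by name. This sketch assumes the distance columns are all named dist followed by digits, which is only a guess based on the example:

```python
import pandas as pd

df = pd.DataFrame({
    'lat': [-120, -90, -100, -100],
    'long': [20, 21, 19, 18],
    'dist1': [2, 6, 8, 1],
    'dist2': [1, 3, 10, 5],
})

# Pick columns by name pattern instead of position.
# The regex is an assumption about how the distance columns are named.
dist_cols = df.filter(regex=r'^dist\d+$')

# Row-wise minimum over the selected columns only.
df = df.assign(min_dist=dist_cols.min(axis=1))
print(df)
```

Because no masking is involved, the result keeps its integer dtype.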

CodePudding user response：

You can sort each row and then take the second element, as below. This works here because each row has exactly one negative value (lat):

>>> import numpy as np
>>> df['min_dist'] = df.apply(lambda row : np.sort(row)[1], axis=1)
>>> df
   lat  long  dist1  dist2  min_dist
0 -120    20      2      1         1
1  -90    21      6      3         3
2 -100    19      8     10         8
3 -100    18      1      5         1
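
Be aware that taking the second element only gives the right answer when a row has exactly one non-positive value. A small sketch with made-up data (the negative long value in the first row is hypothetical, not from the question) shows the difference from masking:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first row has two negative values.
df = pd.DataFrame({
    'lat': [-120, -90],
    'long': [-20, 21],
    'dist1': [2, 6],
    'dist2': [1, 3],
})

# Second-smallest element per row: returns a negative for the first row.
second_smallest = df.apply(lambda row: np.sort(row)[1], axis=1)

# Masking non-positive values first is not affected by how many there are.
positive_min = df.where(df > 0).min(axis=1)

print(second_smallest.tolist())
print(positive_min.tolist())
```

So if more than one column can go negative, masking is the safer choice.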

CodePudding user response：

Just use:

df['min_dist'] = df[df > 0].min(1)