df:
country | year | index |
---|---|---|
Turkiye | 1992 | NaN |
Spain | 1992 | NaN |
US | 1992 | 1 |
Turkiye | 1993 | 1 |
Spain | 1993 | 1 |
US | 1993 | 0 |
Turkiye | 1994 | 1 |
France | 1994 | 0 |
Italy | 1994 | NaN |
Turkiye | 1995 | 0 |
Here, for example, in 1992 the index is NaN for Turkiye and Spain but exists for the US. I am only interested in the earliest year for which the index exists; the country does not matter in this case.
My code is:
a = np.where(df["Index"]!= None)
a["year"].min()
a is not a DataFrame, and I think that is why I am having a problem. How can I solve this issue?
CodePudding user response:
Use .loc with .idxmin after .dropna:
df.loc[df.dropna()['year'].idxmin()]
country US
year 1992
index 1.0
Name: 2, dtype: object
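As for why the original attempt fails: np.where called with only a condition returns a tuple of index arrays, not a DataFrame, so a["year"] cannot work. Comparing with != None also does not detect NaN; .notna() is the usual check. Here is a minimal sketch that rebuilds the sample frame from the question (the variable names has_index, earliest_year and first_row are only illustrative) and restricts the NaN check to the index column, since a plain df.dropna() would also drop rows that have NaN in any other column:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "country": ["Turkiye", "Spain", "US", "Turkiye", "Spain", "US",
                "Turkiye", "France", "Italy", "Turkiye"],
    "year": [1992, 1992, 1992, 1993, 1993, 1993, 1994, 1994, 1994, 1995],
    "index": [np.nan, np.nan, 1, 1, 1, 0, 1, 0, np.nan, 0],
})

# Boolean mask: True where the index column has a value.
has_index = df["index"].notna()

# Earliest year among rows with a value, regardless of country.
earliest_year = df.loc[has_index, "year"].min()

# Full row for that earliest year (first match if several rows tie).
first_row = df.loc[df.loc[has_index, "year"].idxmin()]

print(earliest_year)
print(first_row)
```

If only the year is needed, earliest_year is enough; first_row is the equivalent of the one-liner above, limited to the index column.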