I have the following data frame:
import pandas as pd
df = pd.DataFrame([([40.33, 40.34, 40.22],[-71.11, -71.21, -71.14],[12, 45, 10]), ([41.23, 41.40, 41.22],[-72.01, -72.01, -72.01],[11, 23, 15]), ([43.33, 43.34],[-70.11, -70.21],[12, 40]), ([41.23, 41.40], [-72.01, -72.01, -72.01], [11, 23, 15])], columns=['long', 'lat', 'accuracy'])
long lat accuracy
[40.33, 40.34, 40.22] [-71.11, -71.21, -71.14] [12, 45, 10]
[41.23, 41.40, 41.22] [-72.01, -72.01, -72.01] [11, 23, 15]
[43.33, 43.34] [-70.11, -70.21] [12, 40]
[41.23, 41.40] [-72.01, -72.01, -72.01] [11, 23, 15]
...
Each column contains a list of numbers. For each row, I want to check whether the lists in all three columns have the same length. What is the best way to do this and store the result in another column named sanity, with TRUE if all lists have the same size and FALSE if at least one list has a different size from the rest?
The expected output is:
long lat accuracy sanity
[40.33, 40.34, 40.22] [-71.11, -71.21, -71.14] [12, 45, 10] TRUE
[41.23, 41.40, 41.22] [-72.01, -72.01, -72.01] [11, 23, 15] TRUE
[43.33, 43.34] [-70.11, -70.21] [12, 40] TRUE
[41.23, 41.40] [-72.01, -72.01, -72.01] [11, 23, 15] FALSE
CodePudding user response:
You can approach this with applymap and nunique:
df["sanity"] = df.applymap(len).nunique(axis=1).eq(1)
# Output :
print(df)
long lat accuracy sanity
0 [40.33, 40.34, 40.22] [-71.11, -71.21, -71.14] [12, 45, 10] True
1 [41.23, 41.4, 41.22] [-72.01, -72.01, -72.01] [11, 23, 15] True
2 [43.33, 43.34] [-70.11, -70.21] [12, 40] True
3 [41.23, 41.4] [-72.01, -72.01, -72.01] [11, 23, 15] False
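Note that DataFrame.applymap has been deprecated since pandas 2.1 in favor of DataFrame.map, so the line above may emit a warning on recent versions. The same length check can also be written per column with Series.str.len, which works on list-valued cells as well as strings. A minimal sketch, rebuilding the question's frame so it runs on its own (the intermediate name `lengths` is only for illustration):

```python
import pandas as pd

# Same data as in the question, built from a dict for readability.
df = pd.DataFrame(
    {
        "long": [[40.33, 40.34, 40.22], [41.23, 41.40, 41.22], [43.33, 43.34], [41.23, 41.40]],
        "lat": [[-71.11, -71.21, -71.14], [-72.01, -72.01, -72.01], [-70.11, -70.21], [-72.01, -72.01, -72.01]],
        "accuracy": [[12, 45, 10], [11, 23, 15], [12, 40], [11, 23, 15]],
    }
)

# Series.str.len() returns the length of each list in a column.
lengths = df.apply(lambda col: col.str.len())

# A row passes when all of its lengths collapse to one distinct value.
df["sanity"] = lengths.nunique(axis=1).eq(1)

print(df)
```

Keeping `lengths` around can also help with debugging, since it shows which column is the odd one out in a failing row.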
CodePudding user response:
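Alternatively, stack the frame into a single Series, take the list lengths with str.len, unstack, and compare the number of distinct lengths per row to 1 to get the boolean column:
df['sanity'] = df.stack().str.len().unstack().nunique(axis=1).eq(1)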
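If you would rather avoid the reshape altogether, a plain-Python comprehension over the rows is another option. This is only a sketch, assuming the three column names from the question; it collects the lengths of each row into a set and checks that the set has exactly one element:

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame(
    {
        "long": [[40.33, 40.34, 40.22], [41.23, 41.40, 41.22], [43.33, 43.34], [41.23, 41.40]],
        "lat": [[-71.11, -71.21, -71.14], [-72.01, -72.01, -72.01], [-70.11, -70.21], [-72.01, -72.01, -72.01]],
        "accuracy": [[12, 45, 10], [11, 23, 15], [12, 40], [11, 23, 15]],
    }
)

# For each row, gather the distinct list lengths; one distinct length means the row is consistent.
df["sanity"] = [
    len({len(values) for values in row}) == 1
    for row in zip(df["long"], df["lat"], df["accuracy"])
]

print(df["sanity"].tolist())
```

This version does not rely on pandas string methods and is easy to adapt if you need a different check per row.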