How to find the longest list in a pandas series?

Time:11-16

I am trying to print the longest list in a pandas Series, and also display the top 10 lengths together with the corresponding lists.

I tried to do

df["listoflists"].value_counts()

But this only counts how often each list occurs; it does not give the length of each list. I also tried

print(df["listoflists"].applymap(len).idxmax(axis=1))

But this raises AttributeError: 'Series' object has no attribute 'applymap'. How can this be solved?

CodePudding user response:

Let's say you have the following DataFrame:

import pandas as pd

values = [['a','a'], ['a','b','b','d','e'],
          ['a','b','b','a'], ['a','b','c','a'],
          ['a','b','b'], ['a','b','b']]

df = pd.DataFrame({'listoflists': values})

For the longest list, you can use:

max(df.listoflists, key=len)
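If you also need to know where the longest list sits, one possible sketch is to compute the lengths with Series.map(len) and then call idxmax. A Series has map and apply, while applymap is a DataFrame method, which is why the attempt in the question fails. The sample Series s below is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, shaped like the column in the question
s = pd.Series(
    [["a", "a"], ["a", "b", "b", "d", "e"], ["a", "b", "b", "a"]],
    name="listoflists",
)

# Series has no applymap; map (or apply) runs len on each element
lengths = s.map(len)

# idxmax returns the index label of the first maximum length
longest_label = lengths.idxmax()
longest_list = s.loc[longest_label]

print(longest_label, longest_list)
```

If several lists share the maximum length, idxmax only reports the first of them.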

For the top n lists, you can use (n = 3 in this example):

df['count'] = df.listoflists.map(len)
df.nlargest(3, ['count'])
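If you would rather not add a permanent column, a variant of the same idea is to build the length column on the fly with assign. This is only a sketch: the column name length and the variable n are my own choices, and n is set to 3 so the effect is visible on six rows (use n = 10 for the case in the question).

```python
import pandas as pd

values = [["a", "a"], ["a", "b", "b", "d", "e"],
          ["a", "b", "b", "a"], ["a", "b", "c", "a"],
          ["a", "b", "b"], ["a", "b", "b"]]
df = pd.DataFrame({"listoflists": values})

n = 3  # number of rows to keep; hypothetical value for this demo

# assign returns a new frame, so df itself keeps only its original column
top = df.assign(length=df["listoflists"].map(len)).nlargest(n, "length")

# Each row shows a list next to its length, longest first
print(top)
```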

CodePudding user response:

I'll show you how to do this on just some random data:

import pandas as pd

# easy random data
df = pd.DataFrame({'listoflists': [[1,2],[3,4,5],[6,7,8,9]]})

# assign back to DataFrame so we can manipulate 
df['len'] = df['listoflists'].apply(len)

# then to get top N:
N = 1
# sort by length, longest first, and keep the first N rows
df.sort_values('len', ascending=False).head(N)
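One thing to watch with any top-N approach is ties: if two lists share the longest length, taking the first N rows silently drops one of them. As a rough sketch, DataFrame.nlargest accepts keep="all" to return every row tied at the cut-off. The extra fourth row below is invented to create a tie.

```python
import pandas as pd

# Same idea as above, plus one made-up row that ties for the longest list
df = pd.DataFrame({"listoflists": [[1, 2], [3, 4, 5], [6, 7, 8, 9], [0, 0, 0, 0]]})
df["len"] = df["listoflists"].apply(len)

# Default keep="first": exactly one row, even though two lists have length 4
first_only = df.nlargest(1, "len")

# keep="all": every row tied for the largest length is returned
with_ties = df.nlargest(1, "len", keep="all")

print(first_only)
print(with_ties)
```

Note that with keep="all" the result can contain more than N rows.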