cast separate lists into one list

Time:09-09

I am following this semantic clustering example:

!pip install sentence_transformers
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

embedder = SentenceTransformer('all-MiniLM-L6-v2')

# Corpus with example sentences
corpus = ['A man is eating food.',
          'A man is eating a piece of bread.',
          'A man is eating pasta.',
          'The girl is carrying a baby.',
          'The baby is carried by the woman',
          'A man is riding a horse.',
          'A man is riding a white horse on an enclosed ground.',
          'A monkey is playing drums.',
          'Someone in a gorilla costume is playing a set of drums.',
          'A cheetah is running behind its prey.',
          'A cheetah chases prey on across a field.'
          ]
corpus_embeddings = embedder.encode(corpus)

# Perform k-means clustering
num_clusters = 5
clustering_model = KMeans(n_clusters=num_clusters)
clustering_model.fit(corpus_embeddings)
cluster_assignment = clustering_model.labels_

clustered_sentences = [[] for i in range(num_clusters)]
for sentence_id, cluster_id in enumerate(cluster_assignment):
    clustered_sentences[cluster_id].append(corpus[sentence_id])

for i, cluster in enumerate(clustered_sentences):
  print("Cluster", i + 1)
  print(cluster)
  print(len(cluster))
  print("")

This results in the following lists:

Cluster  1
['The girl is carrying a baby.', 'The baby is carried by the woman']
2

Cluster  2
['A man is riding a horse.', 'A man is riding a white horse on an enclosed ground.']
2

Cluster  3
['A man is eating food.', 'A man is eating a piece of bread.', 'A man is eating pasta.']
3

Cluster  4
['A cheetah is running behind its prey.', 'A cheetah chases prey on across a field.']
2

Cluster 5
['A monkey is playing drums.', 'Someone in a gorilla costume is playing a set of drums.']
2

How can I combine these separate lists into one?

Expected outcome:

list2 = [['The girl is carrying a baby.', 'The baby is carried by the woman'], ....., ['A monkey is playing drums.', 'Someone in a gorilla costume is playing a set of drums.']]

I tried the following:

list2=[]
for i in cluster:
  list2.append(i)
list2

But it returns only the last one:

['A monkey is playing drums.',
 'Someone in a gorilla costume is playing a set of drums.']

Any ideas?

CodePudding user response:

Following that example, you don't need to do anything to get a list of lists; that's already been done for you.

Try printing clustered_sentences.
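
To see why the attempt in the question returned only the last cluster, here is a minimal sketch. It uses a shortened, hard-coded stand-in for clustered_sentences (an assumption for illustration; no model is run). Once the printing loop has finished, the name cluster still refers to the final inner list, so looping over it copies just that one cluster.

```python
# Hard-coded stand-in for the clustered_sentences built in the question
# (shortened, illustrative data; no embedding model is run here).
clustered_sentences = [
    ['The girl is carrying a baby.', 'The baby is carried by the woman'],
    ['A man is riding a horse.'],
    ['A monkey is playing drums.'],
]

# The printing loop from the question leaves `cluster` bound to the last inner list.
for i, cluster in enumerate(clustered_sentences):
    pass

# The attempt from the question: this iterates over the last cluster only.
list2 = []
for sentence in cluster:
    list2.append(sentence)

# The list of lists already exists; a shallow copy is enough if a new name is wanted.
all_clusters = list(clustered_sentences)

print(list2)         # only the sentences of the last cluster
print(all_clusters)  # every cluster, still grouped
```

If a separate name such as list2 is wanted, list2 = list(clustered_sentences) or simply list2 = clustered_sentences should give the expected list of lists.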

CodePudding user response:

Basically, you need to get a "flat" list from a list of lists. You can achieve that with a Python list comprehension:

flat = [item for sub in clustered_sentences for item in sub]
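
As a rough sketch of what the comprehension does, the snippet below runs it on a shortened, hard-coded stand-in for clustered_sentences (illustrative data, not the real clustering output) and compares it with itertools.chain.from_iterable from the standard library, which is a common alternative.

```python
from itertools import chain

# Shortened, illustrative stand-in for the clustered_sentences from the question.
clustered_sentences = [
    ['The girl is carrying a baby.', 'The baby is carried by the woman'],
    ['A man is riding a horse.'],
    ['A monkey is playing drums.'],
]

# Nested comprehension: the outer loop walks the clusters, the inner loop the sentences.
flat = [item for sub in clustered_sentences for item in sub]

# Standard-library alternative that yields the same flat sequence.
flat_chain = list(chain.from_iterable(clustered_sentences))

print(flat)
print(flat == flat_chain)
```

Note that flattening discards the cluster boundaries, so it only fits if a single list of sentences is wanted rather than the grouped list of lists shown in the expected outcome.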