how to unpack or unzip a column that contains a list of strings into row below along with other colu

Time:10-13

I first need to group by a column, remove the unwanted values from each group, and then unpack (unzip) the result back into one row per value.

My dataset looks like this:

   Text             tag
   drink coke       mic
   eat pizza        mic
   eat fruits       yes
   eat banana       yes
   eat banana       mic
   eat fruits       mic
   eat pizza        no
   eat pizza        mic
   eat pizza        yes
   drink coke       yes
   drink coke       no
   drink coke       no
   drink coke       yes

I used this to group the rows:

# The column is named 'Text' (capital T); grouping on 'text' raises a KeyError.
df = pd.DataFrame(df.groupby(['Text'])['tag'].apply(lambda x: list(x.values)))
  Text           labels               
  eat pizza      [mic,no,mic,yes]    
  eat fruits     [yes,mic]           
  eat banana     [yes,mic]           
  drink coke     [mic,yes,no,no,yes] 

If a group's labels list contains both a 'no' and a 'yes', I need to remove those values from the list, and then unpack the list back into rows.

The output should look like this.

   Text             tag
   drink coke       mic
   eat pizza        mic
   eat fruits       yes
   eat banana       yes
   eat banana       mic
   eat fruits       mic
   eat pizza        mic

CodePudding user response:

Doing:

# For each row: does its Text group contain both 'yes' and 'no'?
contains_both = (df.groupby('Text')['tag']
                   .transform(lambda x: all(i in x.values for i in ('yes', 'no'))))

# We'll keep it if it doesn't contain both yes and no
# But if it does, remove the yes and no.
df = df[~contains_both | ~df.tag.isin(['yes', 'no'])]
print(df)

Output:

         Text  tag
0  drink coke  mic
1   eat pizza  mic
2  eat fruits  yes
3  eat banana  yes
4  eat banana  mic
5  eat fruits  mic
7   eat pizza  mic
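
To try the approach above end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and applies the same mask. The function names `make_df` and `drop_yes_no_when_both` are made up for illustration, and the `.astype(bool)` call is only a defensive assumption so that `~` always acts as a logical NOT.

```python
import pandas as pd


def make_df():
    # Sample data copied from the question.
    return pd.DataFrame({
        "Text": ["drink coke", "eat pizza", "eat fruits", "eat banana",
                 "eat banana", "eat fruits", "eat pizza", "eat pizza",
                 "eat pizza", "drink coke", "drink coke", "drink coke",
                 "drink coke"],
        "tag": ["mic", "mic", "yes", "yes", "mic", "mic", "no", "mic",
                "yes", "yes", "no", "no", "yes"],
    })


def drop_yes_no_when_both(df):
    # Hypothetical helper wrapping the mask from the answer.
    # True for every row whose Text group holds both 'yes' and 'no'.
    contains_both = (
        df.groupby("Text")["tag"]
          .transform(lambda s: {"yes", "no"}.issubset(set(s)))
          .astype(bool)
    )
    # Keep a row if its group lacks the yes/no pair,
    # or if the row's own tag is neither 'yes' nor 'no'.
    return df[~contains_both | ~df["tag"].isin(["yes", "no"])]


if __name__ == "__main__":
    print(drop_yes_no_when_both(make_df()))
```

Because this filters the original rows directly, the original index and row order are preserved, which is why index 6 is missing from the output above.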

FYI, your df calculation could be shortened to:

df = df.groupby('Text', as_index=False)['tag'].agg(list)

# Output:
         Text                      tag
0  drink coke  [mic, yes, no, no, yes]
1  eat banana               [yes, mic]
2  eat fruits               [yes, mic]
3   eat pizza      [mic, no, mic, yes]
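
If you do want to go through the list column and then unpack it, as the question describes, one possible sketch is to clean each list and call `DataFrame.explode`. The helper name `strip_yes_no` is hypothetical, and the sample data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Text": ["drink coke", "eat pizza", "eat fruits", "eat banana",
             "eat banana", "eat fruits", "eat pizza", "eat pizza",
             "eat pizza", "drink coke", "drink coke", "drink coke",
             "drink coke"],
    "tag": ["mic", "mic", "yes", "yes", "mic", "mic", "no", "mic",
            "yes", "yes", "no", "no", "yes"],
})

# One row per Text, with all of its tags collected into a list.
grouped = df.groupby("Text", as_index=False)["tag"].agg(list)


def strip_yes_no(tags):
    # Hypothetical helper: drop 'yes'/'no' only when the list holds both.
    if "yes" in tags and "no" in tags:
        return [t for t in tags if t not in ("yes", "no")]
    return tags


grouped["tag"] = grouped["tag"].map(strip_yes_no)

# explode() turns each list element back into its own row.
unpacked = grouped.explode("tag").reset_index(drop=True)
print(unpacked)
```

Unlike the boolean-mask version, this route loses the original index and returns rows ordered by group rather than in their original order. Also note that a group containing only 'yes' and 'no' would end up with an empty list, which `explode` turns into a single NaN row, so you may want to drop those afterwards.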