Python DataFrame: Extract Column Data Based on Cell Value - CodePudding

From the DataFrame, I take the unique values in "Region" and want to extract the matching "Name" values. However, when a Region contains only one Name, the loop spells the name out letter by letter (l-i-n-d-a instead of linda).

    import pandas as pd

    data = {
        'Region': ['South', 'West', 'North', 'West', 'North', 'South', 'West', 'North', 'East'],
        'Name': ['Tom', 'nick', 'krish', 'jack', 'peter', 'sam', 'jon', 'megan', 'linda']
    }

    df = pd.DataFrame(data)
    list_region = df['Region'].unique().tolist()
    for region in list_region:
        list_person = df.set_index('Region').loc[region, 'Name']
        for person in list_person:
            print(region + ' >> ' + person)

Partial output is shown below; linda was spelled out:

North >> megan
East >> l
East >> i
East >> n
East >> d
East >> a

CodePudding user response：

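When only one row matches the region, .loc returns a plain string instead of a Series, so the inner loop iterates over its characters. You can wrap that case in a list: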

df = pd.DataFrame(data)
list_region = df['Region'].unique().tolist()
for region in list_region:
    list_person = df.set_index('Region').loc[region, 'Name']
    # A single match comes back as a str; wrap it so the loop sees one name
    if isinstance(list_person, str):
        list_person = [list_person]
    for person in list_person:
        print(region + ' >> ' + person)
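
To make the cause concrete, here is a minimal sketch (not part of the original answer) showing the two return types, plus one possible way to avoid the type check: passing the label as a one-element list, which should make .loc return a Series even for a single match. The names indexed, single, several and lines are only illustrative.

```python
import pandas as pd

data = {
    'Region': ['South', 'West', 'North', 'West', 'North', 'South', 'West', 'North', 'East'],
    'Name': ['Tom', 'nick', 'krish', 'jack', 'peter', 'sam', 'jon', 'megan', 'linda']
}
df = pd.DataFrame(data)
indexed = df.set_index('Region')  # set the index once, outside the loop

single = indexed.loc['East', 'Name']    # one matching row: a plain str
several = indexed.loc['North', 'Name']  # several matching rows: a Series

lines = []
for region in df['Region'].unique():
    # A list of labels ([region]) keeps the result a Series in both cases
    for person in indexed.loc[[region], 'Name']:
        lines.append(region + ' >> ' + person)

print('\n'.join(lines))
```

Setting the index once outside the loop also avoids rebuilding it for every region.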

CodePudding user response：

You could use the value_counts() method, which counts each distinct (Region, Name) row, and then keep only the index of the result:


import pandas as pd
data = {
        'Region': ['South','West', 'North','West', 'North', 'South','West', 'North', 'East'],
        'Name': ['Tom', 'nick', 'krish', 'jack','peter','sam','jon','megan','linda']
}
  
df = pd.DataFrame(data)

combinations = df.value_counts().index.to_list()

which yields:

[('East', 'linda'),
 ('North', 'krish'),
 ('North', 'megan'),
 ('North', 'peter'),
 ('South', 'Tom'),
 ('South', 'sam'),
 ('West', 'jack'),
 ('West', 'jon'),
 ('West', 'nick')]

and then for the formatting:

for item in combinations:
    print(item[0] + ' >> ' + item[1])

which outputs:

East >> linda
North >> krish
North >> megan
North >> peter
South >> Tom
South >> sam
West >> jack
West >> jon
West >> nick
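
As another option that neither answer above uses, groupby might be worth a try. This is only a sketch under the assumption that you want the regions in their order of first appearance (hence sort=False); each group arrives as a Series, so a region with a single name is not spelled out. The variable lines is just for illustration.

```python
import pandas as pd

data = {
    'Region': ['South', 'West', 'North', 'West', 'North', 'South', 'West', 'North', 'East'],
    'Name': ['Tom', 'nick', 'krish', 'jack', 'peter', 'sam', 'jon', 'megan', 'linda']
}
df = pd.DataFrame(data)

lines = []
# Iterating the grouped 'Name' column yields (region, Series of names) pairs
for region, names in df.groupby('Region', sort=False)['Name']:
    for person in names:
        lines.append(f'{region} >> {person}')

print('\n'.join(lines))
```

Unlike value_counts(), this keeps repeated (Region, Name) rows instead of collapsing them into one entry.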