I take the unique values in the "Region" column of the dataframe and want to extract the matching "Name" values. However, when a Region has only one Name, the loop spells that name out letter by letter (l-i-n-d-a instead of linda).
import pandas as pd
data = {
    'Region': ['South','West', 'North','West', 'North', 'South','West', 'North', 'East'],
    'Name': ['Tom', 'nick', 'krish', 'jack','peter','sam','jon','megan','linda']
}
df = pd.DataFrame(data)
list_region = df['Region'].unique().tolist()
for region in list_region:
    # .loc returns a plain string when the region matches only one row
    list_person = df.set_index('Region').loc[region, 'Name']
    for person in list_person:
        print(region + ' >> ' + person)
Partial output is below; "linda" was spelled out:
North >> megan
East >> l
East >> i
East >> n
East >> d
East >> a
CodePudding user response:
You can try this:
# 'data' is the dictionary defined in the question
df = pd.DataFrame(data)
list_region = df['Region'].unique().tolist()
for region in list_region:
    list_person = df.set_index('Region').loc[region, 'Name']
    # A single match comes back as a string, so wrap it in a list
    if isinstance(list_person, str):
        list_person = [list_person]
    for person in list_person:
        print(region + ' >> ' + person)
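The check above is needed because .loc[region, 'Name'] gives back a Series when the label matches several rows but a bare string when it matches only one, and looping over a string yields its characters. As a minimal sketch of one way to sidestep that, assuming a boolean mask is acceptable in place of set_index, the selection below always returns a Series, so no type check is required. The names lines and names are illustrative and not from the original post.

```python
import pandas as pd

data = {
    'Region': ['South', 'West', 'North', 'West', 'North', 'South', 'West', 'North', 'East'],
    'Name': ['Tom', 'nick', 'krish', 'jack', 'peter', 'sam', 'jon', 'megan', 'linda'],
}
df = pd.DataFrame(data)

lines = []  # illustrative: collect the formatted rows before printing
for region in df['Region'].unique():
    # A boolean mask selects rows, so the result is a Series
    # even when only one row matches (as with 'East').
    names = df.loc[df['Region'] == region, 'Name']
    for person in names:
        lines.append(region + ' >> ' + person)

print('\n'.join(lines))
```

With this approach 'East' prints a single line, East >> linda, and the regions appear in the order they first occur in the dataframe.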
CodePudding user response:
You could use the value_counts() function, and then get only the index of the result:
import pandas as pd
data = {
    'Region': ['South','West', 'North','West', 'North', 'South','West', 'North', 'East'],
    'Name': ['Tom', 'nick', 'krish', 'jack','peter','sam','jon','megan','linda']
}
df = pd.DataFrame(data)
combinations = df.value_counts().index.to_list()
which yields:
[('East', 'linda'),
('North', 'krish'),
('North', 'megan'),
('North', 'peter'),
('South', 'Tom'),
('South', 'sam'),
('West', 'jack'),
('West', 'jon'),
('West', 'nick')]
and then for the formatting:
for item in combinations:
    print(item[0] + ' >> ' + item[1])
which outputs:
East >> linda
North >> krish
North >> megan
North >> peter
South >> Tom
South >> sam
West >> jack
West >> jon
West >> nick
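If the original row order matters more than a sorted listing, grouping by region is another option. This is only a sketch, not part of either answer: it assumes groupby with sort=False, which keeps the groups in order of first appearance, and the name lines is illustrative.

```python
import pandas as pd

data = {
    'Region': ['South', 'West', 'North', 'West', 'North', 'South', 'West', 'North', 'East'],
    'Name': ['Tom', 'nick', 'krish', 'jack', 'peter', 'sam', 'jon', 'megan', 'linda'],
}
df = pd.DataFrame(data)

lines = []  # illustrative: collect the formatted rows before printing
# Each iteration yields the region label and a Series of its names,
# so a region with a single name is still iterated name by name.
for region, names in df.groupby('Region', sort=False)['Name']:
    for person in names:
        lines.append(region + ' >> ' + person)

print('\n'.join(lines))
```

Here each group is a Series regardless of its size, so the single-name case needs no special handling.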