My objective is to split my dataframe into the groups of repeated values in column A, extract each group into a new dataframe, and ultimately write each new dataframe to a CSV file.
My current dataframe:
column A | column B |
---|---|
A | 2 |
A | 2 |
A | 3 |
B | 2 |
B | 3 |
B | 4 |
C | 2 |
C | 2 |
D | 2 |
D | 2 |
D | 3 |
Desired output:
column A | column B |
---|---|
A | 2 |
A | 2 |
A | 3 |
column A | column B |
---|---|
B | 2 |
B | 3 |
column A | column B |
---|---|
C | 2 |
C | 2 |
column A | column B |
---|---|
D | 2 |
D | 2 |
D | 3 |
CodePudding user response:
You can loop over the unique values of column A and display the rows that match each value.
Code:
[df[df['ColA']==i] for i in set(df.ColA.values)]
Output:
[ ColA ColB
0 A 2
1 A 2
2 A 3,
ColA ColB
6 C 2
7 C 2,
ColA ColB
3 B 2
4 B 3
5 B 4,
ColA ColB
8 D 2
9 D 2
10 D 3]
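Because `set` does not preserve order, the frames above come back in arbitrary order (C before B here). A minimal sketch that keeps first-appearance order by using `Series.unique()` instead, assuming the question's column names `column A` / `column B` rather than `ColA` / `ColB`, and rebuilding the sample data so it runs on its own:

```python
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({
    "column A": list("AAABBBCCDDD"),
    "column B": [2, 2, 3, 2, 3, 4, 2, 2, 2, 2, 3],
})

# unique() returns values in order of first appearance, unlike set()
frames = [df[df["column A"] == key] for key in df["column A"].unique()]

for frame in frames:
    print(frame, end="\n\n")
```

Note that this keeps every row of each group, so group B still contains the row with value 4.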
CodePudding user response:
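# Group rows by their value in column A
g = df.groupby('column A')
# Values of column B that occur in group A; used as the filter for every group
dup_chk = df.loc[df['column A'].eq('A'), 'column B']
# Keep only the rows of each group whose column B value appears in dup_chk
out = [g.get_group(x)[lambda d: d['column B'].isin(dup_chk)] for x in g.groups]
out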
Output (a list of dataframes):
[ column A column B
0 A 2
1 A 2
2 A 3,
column A column B
3 B 2
4 B 3,
column A column B
6 C 2
7 C 2,
column A column B
8 D 2
9 D 2
10 D 3]
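The same list can probably be reached by filtering once and grouping afterwards. This is only a sketch under the same assumption the answer above makes, namely that the values of column B found in group A decide which rows are kept; the name `ref_values` and the rebuilt sample data are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({
    "column A": list("AAABBBCCDDD"),
    "column B": [2, 2, 3, 2, 3, 4, 2, 2, 2, 2, 3],
})

# Assumption carried over from the answer above: group "A" supplies the reference values
ref_values = df.loc[df["column A"].eq("A"), "column B"]

# Filter once, then split; groupby yields (key, sub-frame) pairs in sorted key order
filtered = df[df["column B"].isin(ref_values)]
out = [group for _, group in filtered.groupby("column A")]
```

Filtering before grouping avoids calling `get_group` once per key, and it drops the `B | 4` row just as the desired output does.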
CodePudding user response:
Use the groupby function to group the rows that share a value in column A, then use a for loop to go through each group:
grouped_df = df.groupby('column A')
# Iterating over a GroupBy yields (key, sub-dataframe) pairs
for key, group in grouped_df:
    print(group)
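Since the end goal in the question is one CSV per group, the loop can write each group out instead of printing it. A hedged sketch: the `group_<key>.csv` naming scheme and the temporary output directory are hypothetical choices, not something the question specifies.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({
    "column A": list("AAABBBCCDDD"),
    "column B": [2, 2, 3, 2, 3, 4, 2, 2, 2, 2, 3],
})

# Hypothetical output directory; a temporary folder keeps the sketch safe to run
out_dir = Path(tempfile.mkdtemp())

written = []
for key, group in df.groupby("column A"):
    path = out_dir / f"group_{key}.csv"  # hypothetical file naming scheme
    group.to_csv(path, index=False)      # index=False omits the row index column
    written.append(path)
```

This writes every row of each group. If some rows should be excluded first (such as `B | 4` in the desired output), filter `df` before the loop.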