Suppose I have the following DataFrame in Python (pandas):
input_df = pd.DataFrame({
'Previous': ['1000', '1000', 'latex', 'latex'],
'Ignore':[None, None, ['free'], ['free']],
'New': ['100', '200', 'nylon', 'cloth']})
I would like to generate the following four dataframes:
df1 = pd.DataFrame({
'Previous': ['1000','latex'],
'Ignore': [None, ['free']],
'New': ['100','nylon']})
df2 = pd.DataFrame({
'Previous': ['1000','latex'],
'Ignore': [None, ['free']],
'New': ['100','cloth']})
df3 = pd.DataFrame({
'Previous': ['1000','latex'],
'Ignore': [None, ['free']],
'New': ['200','nylon']})
df4 = pd.DataFrame({
'Previous': ['1000','latex'],
'Ignore': [None, ['free']],
'New': ['200','cloth']})
How can I accomplish this?
Edit: I arrived at the following solution by modifying @TheMaster's answer (c is itertools.combinations, imported as in that answer):
out=[pd.DataFrame(j) for j in c([i[1] for i in input_df.iterrows()], len(input_df['Previous'].unique())) if len(pd.DataFrame(j)['Previous'].unique()) == len(input_df['Previous'].unique())]
This solution keeps only the combinations in which no 'Previous' value is repeated, i.e. each distinct 'Previous' value appears exactly once.
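For readability, here is the same filter written out as a loop. This is only a sketch of the one-liner above: it iterates over index labels instead of iterrows(), and the names n_unique and candidate are mine, not from the answer.

```python
from itertools import combinations

import pandas as pd

input_df = pd.DataFrame({
    'Previous': ['1000', '1000', 'latex', 'latex'],
    'Ignore': [None, None, ['free'], ['free']],
    'New': ['100', '200', 'nylon', 'cloth']})

# Number of distinct 'Previous' values; each result gets this many rows.
n_unique = input_df['Previous'].nunique()

out = []
for idx in combinations(input_df.index, n_unique):
    candidate = input_df.loc[list(idx)]
    # Keep the combination only if no 'Previous' value is repeated.
    if candidate['Previous'].nunique() == n_unique:
        out.append(candidate.reset_index(drop=True))
```

With the input above, out should hold four frames, in the same order as df1 to df4 in the question.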
Answer:
Try itertools.combinations:
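import pandas as pd
from itertools import combinations as c
# df is the question's DataFrame (the output below shows only 'Previous' and 'New')
out = [pd.DataFrame(j) for j in c([i[1] for i in df.iterrows()], 2)]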
Out[2]:
[  Previous  New
0      1000  100
1      1000  200,
   Previous    New
0      1000    100
2     latex  nylon,
   Previous    New
0      1000    100
3     latex  cloth,
   Previous    New
1      1000    200
2     latex  nylon,
   Previous    New
1      1000    200
3     latex  cloth,
   Previous    New
2     latex  nylon
3     latex  cloth]
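Note that the six combinations above include two pairs that repeat a 'Previous' value (rows 0 and 1, rows 2 and 3). If the goal is exactly one row per distinct 'Previous' value, one possible alternative (not part of the answer above, just a sketch) is to group the rows by 'Previous' and take itertools.product across the groups, so the unwanted pairs are never generated. The variable name groups is my own.

```python
from itertools import product

import pandas as pd

input_df = pd.DataFrame({
    'Previous': ['1000', '1000', 'latex', 'latex'],
    'Ignore': [None, None, ['free'], ['free']],
    'New': ['100', '200', 'nylon', 'cloth']})

# One list of index labels per distinct 'Previous' value,
# kept in order of first appearance (sort=False).
groups = [list(g.index) for _, g in input_df.groupby('Previous', sort=False)]

# product picks one row from each group, so 'Previous' never repeats.
out = [input_df.loc[list(rows)].reset_index(drop=True)
       for rows in product(*groups)]
```

This should give the four frames df1 to df4 directly, and it also works when there are more than two distinct 'Previous' values, since product takes one row from every group.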