Python: slice data from multiple child columns

Is it possible to slice my data so that each main code is paired with all of its child codes? The main code may be duplicated in some cases, so I have a year column to support the slicing.

This is my current data, which I need to transform:

cod	child code	year	text
M01Q00500	M01Q00800	2018	text01
M01Q00800	M01Q00830	2018	text02
M01Q00830	M01Q00810	2018	text03
M01Q00810	M02Q00150	2018	text04
M01Q00810	M02Q00170	2018	text04
M02Q00150	null	2018	text05
M02Q00170	null	2018	text06

And this is what I am looking for:

cod	child code	year	text
M01Q00500	M01Q00800	2018	text01
M01Q00500	M01Q00830	2018	text01
M01Q00500	M01Q00810	2018	text01
M01Q00500	M02Q00150	2018	text01
M01Q00500	M02Q00170	2018	text02
M01Q00800	M01Q00830	2018	text02
M01Q00800	M01Q00810	2018	text02
M01Q00800	M02Q00150	2018	text02
M01Q00800	M02Q00170	2018	text02
M01Q00830	M01Q00810	2018	text03
M01Q00830	M02Q00150	2018	text03
M01Q00830	M02Q00170	2018	text03
M01Q00810	M02Q00150	2018	text04
M01Q00810	M02Q00170	2018	text04
M02Q00150	null	2018	text05
M02Q00170	null	2018	text06

CodePudding user response：

I'm not clear what you are asking here. Your expected output is longer than your input, which is not what slicing would produce. Are you perhaps looking to reorder the data?

A multi-index, perhaps?

CodePudding user response：

If I understand this correctly, I think you are looking to get a row in the data frame for every possible combination of the data.

Also, eliminating spaces in column headers is a good habit and will save a lot of time. The code looks cleaner and is faster to type.


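```python
import pandas as pd

# Rename the column so it can be accessed as an attribute (df.child_code).
df = df.rename(columns={'child code': 'child_code'})

df2 = pd.DataFrame(columns=['cod', 'child_code', 'year', 'text'])

# Build one row for every combination of the unique values in each column.
for item1 in df.cod.unique():
    for item2 in df.child_code.unique():
        for item3 in df.year.unique():
            for item4 in df.text.unique():
                new_row = pd.DataFrame({'cod': [item1], 'child_code': [item2], 'year': [item3], 'text': [item4]})
                # Append the new row to the result (not the original df).
                df2 = pd.concat([df2, new_row], ignore_index=True)
```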
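
Note that a full combination of all columns produces many more rows than the expected output in the question, which appears to pair each cod only with its direct and indirect children within the same year. Below is a minimal sketch of that interpretation, not a definitive solution. It assumes the column has been renamed to child_code, that "null" is stored as a missing value, and that each row keeps the text of its own cod. The helper name expand_descendants is made up for this example, and the sample frame is rebuilt from the question's data.

```python
import pandas as pd

# Sample data copied from the question; "null" is represented as None.
df = pd.DataFrame(
    {
        "cod": ["M01Q00500", "M01Q00800", "M01Q00830", "M01Q00810",
                "M01Q00810", "M02Q00150", "M02Q00170"],
        "child_code": ["M01Q00800", "M01Q00830", "M01Q00810", "M02Q00150",
                       "M02Q00170", None, None],
        "year": [2018] * 7,
        "text": ["text01", "text02", "text03", "text04",
                 "text04", "text05", "text06"],
    }
)


def expand_descendants(frame):
    """Hypothetical helper: pair every cod with all of its descendants, per year."""
    rows = []
    for year, group in frame.groupby("year", sort=False):
        links = group.dropna(subset=["child_code"])
        # Map each code to the list of its direct children within this year.
        children = links.groupby("cod", sort=False)["child_code"].agg(list).to_dict()
        # Assumption: one text per cod, taken from the first row where it appears.
        text = group.drop_duplicates("cod").set_index("cod")["text"].to_dict()
        for cod in group["cod"].drop_duplicates():
            seen = []
            stack = list(reversed(children.get(cod, [])))
            # Depth-first walk down the hierarchy, keeping the original order.
            while stack:
                child = stack.pop()
                if child in seen or child == cod:  # guard against cycles
                    continue
                seen.append(child)
                stack.extend(reversed(children.get(child, [])))
            # Codes without children keep a single row with a missing child_code.
            for child in seen or [None]:
                rows.append({"cod": cod, "child_code": child,
                             "year": year, "text": text[cod]})
    return pd.DataFrame(rows, columns=["cod", "child_code", "year", "text"])


result = expand_descendants(df)
print(result)
```

Grouping by year first keeps a code that is reused in another year from picking up children that belong to a different year, which seems to be the reason the question mentions the year column.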