Is it possible to reshape my data so that each main code is listed against all of its child codes? The main code is duplicated in some cases, so I have a year column to support the data slice.
This is my current data, which I need to transform:
cod | child code | year | text |
---|---|---|---|
M01Q00500 | M01Q00800 | 2018 | text01 |
M01Q00800 | M01Q00830 | 2018 | text02 |
M01Q00830 | M01Q00810 | 2018 | text03 |
M01Q00810 | M02Q00150 | 2018 | text04 |
M01Q00810 | M02Q00170 | 2018 | text04 |
M02Q00150 | null | 2018 | text05 |
M02Q00170 | null | 2018 | text06 |
And this is what I am looking for:
cod | child code | year | text |
---|---|---|---|
M01Q00500 | M01Q00800 | 2018 | text01 |
M01Q00500 | M01Q00830 | 2018 | text01 |
M01Q00500 | M01Q00810 | 2018 | text01 |
M01Q00500 | M02Q00150 | 2018 | text01 |
M01Q00500 | M02Q00170 | 2018 | text02 |
M01Q00800 | M01Q00830 | 2018 | text02 |
M01Q00800 | M01Q00810 | 2018 | text02 |
M01Q00800 | M02Q00150 | 2018 | text02 |
M01Q00800 | M02Q00170 | 2018 | text02 |
M01Q00830 | M01Q00810 | 2018 | text03 |
M01Q00830 | M02Q00150 | 2018 | text03 |
M01Q00830 | M02Q00170 | 2018 | text03 |
M01Q00810 | M02Q00150 | 2018 | text04 |
M01Q00810 | M02Q00170 | 2018 | text04 |
M02Q00150 | null | 2018 | text05 |
M02Q00170 | null | 2018 | text06 |
CodePudding user response:
I'm not clear on what you are asking here. Your expected output appears to be longer than the input, which is not what slicing would give. Are you perhaps looking to re-order the data?
A multi-index, perhaps?
CodePudding user response:
If I have this correct, I think you are looking to get a row in the data frame for every possible combination of the data.
Also, eliminating spaces in column headers is a good habit and will save a lot of time. The code looks cleaner and is faster to type.
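```python
import pandas as pd

# df is the DataFrame shown in the question
df = df.rename(columns={'child code': 'child_code'})

# Collect one row for every combination of the unique values in each column
rows = []
for item1 in df.cod.unique():
    for item2 in df.child_code.unique():
        for item3 in df.year.unique():
            for item4 in df.text.unique():
                rows.append({'cod': item1, 'child_code': item2, 'year': item3, 'text': item4})

# Build the result once instead of concatenating inside the loop
df2 = pd.DataFrame(rows, columns=['cod', 'child_code', 'year', 'text'])
```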
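
If the goal is instead what the expected table in the question suggests, pairing each code with every code below it in the hierarchy, one possible approach is to walk the parent/child links within each year. This is only a rough sketch under a few assumptions: the column has already been renamed to `child_code`, the hierarchy is evaluated separately per `year`, and each output row carries the `text` of the parent code's first row. The `expand_descendants` helper is a hypothetical name, not a pandas function.

```python
from collections import deque

import pandas as pd


def expand_descendants(df):
    """Return one row per (code, descendant) pair, computed within each year."""
    rows = []
    for year, group in df.groupby("year", sort=False):
        # Map each code to its direct children, skipping null children.
        children = {}
        for cod, child in zip(group["cod"], group["child_code"]):
            children.setdefault(cod, [])
            if pd.notna(child):
                children[cod].append(child)
        # Assumption: a code's text is the text of its first row in that year.
        text = group.drop_duplicates("cod").set_index("cod")["text"].to_dict()
        for cod in children:
            # Breadth-first walk; `seen` also guards against cycles in the data.
            seen, queue = [], deque(children[cod])
            while queue:
                node = queue.popleft()
                if node in seen:
                    continue
                seen.append(node)
                queue.extend(children.get(node, []))
            # Codes without children keep a single row with a null child.
            for child in seen or [None]:
                rows.append({"cod": cod, "child_code": child,
                             "year": year, "text": text[cod]})
    return pd.DataFrame(rows, columns=["cod", "child_code", "year", "text"])


# Sample data copied from the question
df = pd.DataFrame({
    "cod": ["M01Q00500", "M01Q00800", "M01Q00830", "M01Q00810",
            "M01Q00810", "M02Q00150", "M02Q00170"],
    "child_code": ["M01Q00800", "M01Q00830", "M01Q00810", "M02Q00150",
                   "M02Q00170", None, None],
    "year": [2018] * 7,
    "text": ["text01", "text02", "text03", "text04",
             "text04", "text05", "text06"],
})
result = expand_descendants(df)
print(result)
```

On the sample data this yields 16 rows in the same code order as the expected table. The `text` column follows the assumption stated above, so it may differ from the expected table wherever that table uses a different rule.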