Home > Enterprise >  group or unpivot df not considering empty values
group or unpivot df not considering empty values

Time:11-20

I have a df like this :

PRODUCTNUMBER Jerarquía principal Jerarquía secundaria marcas COT Ecommerce dabra-catalog Dexter-ecommerce Stockcenter-ecommerce
AD802309 Medias-Hombre ADIDAS 950699 NaN NaN NaN NaN
AD481076 NaN Adidas 950699 NaN NaN NaN NaN
AD481137 Medias-Hombre Adidas 950699 Medias-Hombre Medias-Hombre Medias-Hombre Medias-Hombre

and I need to get this output:

PRODUCTNUMBER PRODUCTCATEGORYNAME PRODUCTCATEGORYHIERARCHYNAME
AD802309 Medias-Hombre Jerarquía principal
AD802309 ADIDAS Jerarquía secundaria marcas
AD802309 950699 COT
AD481076 Adidas Jerarquía secundaria marcas
AD481076 950699 COT
AD481137 Medias-Hombre Jerarquía principal
AD481137 Adidas Jerarquía secundaria marcas
AD481137 950699 COT
AD481137 Medias-Hombre Ecommerce
AD481137 Medias-Hombre dabra-catalog
AD481137 Medias-Hombre Dexter-ecommerce
AD481137 Medias-Hombre Stockcenter-ecommerce

is it possible? "NaN" values must not be transposed

CodePudding user response:

Try:

df = (
    df.set_index("PRODUCTNUMBER")
    .stack()
    .reset_index()
    .rename(
        columns={
            0: "PRODUCTCATEGORYNAME",
            "level_1": "PRODUCTCATEGORYHIERARCHYNAME",
        }
    )
)

df = df[["PRODUCTNUMBER", "PRODUCTCATEGORYNAME", "PRODUCTCATEGORYHIERARCHYNAME"]]
print(df)

Prints:

   PRODUCTNUMBER PRODUCTCATEGORYNAME PRODUCTCATEGORYHIERARCHYNAME
0       AD802309       Medias-Hombre          Jerarquía principal
1       AD802309              ADIDAS  Jerarquía secundaria marcas
2       AD802309              950699                          COT
3       AD481076              Adidas  Jerarquía secundaria marcas
4       AD481076              950699                          COT
5       AD481137       Medias-Hombre          Jerarquía principal
6       AD481137              Adidas  Jerarquía secundaria marcas
7       AD481137              950699                          COT
8       AD481137       Medias-Hombre                    Ecommerce
9       AD481137       Medias-Hombre                dabra-catalog
10      AD481137       Medias-Hombre             Dexter-ecommerce
11      AD481137       Medias-Hombre        Stockcenter-ecommerce

CodePudding user response:

Try with melt

out = df.melt('PRODUCTNUMBER',
               value_name='PRODUCTCATEGORYHIERARCHYNAME',
               var_name='PRODUCTCATEGORYNAME').dropna()
Out[201]: 
   PRODUCTNUMBER          PRODUCTCATEGORYNAME PRODUCTCATEGORYHIERARCHYNAME
0       AD802309          Jerarquía principal                Medias-Hombre
2       AD481137          Jerarquía principal                Medias-Hombre
3       AD802309  Jerarquía secundaria marcas                       ADIDAS
4       AD481076  Jerarquía secundaria marcas                       Adidas
5       AD481137  Jerarquía secundaria marcas                       Adidas
6       AD802309                          COT                       950699
7       AD481076                          COT                       950699
8       AD481137                          COT                       950699
11      AD481137                    Ecommerce                Medias-Hombre
14      AD481137                dabra-catalog                Medias-Hombre
17      AD481137             Dexter-ecommerce                Medias-Hombre
20      AD481137        Stockcenter-ecommerce                Medias-Hombre

CodePudding user response:

need simple example for answer

Example

data = {'A': {'a': 'val1', 'b': 'val3'},
        'B': {'a': None, 'b': 'val4'},
        'C': {'a': 'val2', 'b': None}}
df = pd.DataFrame(data)

output(df):

    A       B       C
a   val1    None    val2
b   val3    val4    None



Code

when unpivot by stack, we can drop null automatic

df.stack().reset_index().set_axis(['col1', 'col2', 'col3'], axis=1)

result:

    col1    col2    col3
0   a       A       val1
1   a       C       val2
2   b       A       val3
3   b       B       val4
  • Related