Home > other >  Concatenate columns based on certain group
Concatenate columns based on certain group

Time:01-15

I have dataframe df1 ike this:

    Database/Schema Name          entity Name            temp
0           schema1               ITEM_DESC1             Item_Nbr INT(1,2) NOT NULL
1           schema1               ITEM_DESC1             Old_Nbr INT(3,2) NOT NULL
2           schema1               ITEM_DESC1             Order_Dept_Nbr SMALLINT(6,2) NULL
3           schema1               ITEM_DESC1             Acct_Dept_Nbr SMALLINT(9,2) NULL
4           schema2               ITEM_DESC2             Primary_Desc CHAR(20,2) NULL
5           schema2               ITEM_DESC2             Secondary_Desc CHAR(20,4) NULL
6           schema2               ITEM_DESC2             Color_Desc CHAR(6,5) NULL
7           schema2               ITEM_DESC2             Size_Desc CHAR(6,3) NULL

In this dataframe I have two different schemas and tables(ITEM_DESC1 and ITEM_DESC2). I want to build create table statement out of this.

So, from the above dataframe I need a new dataframe which will have 2 rows (since 2 different tables in df1 ) and the value would be

df2:

    ddl_statement
0   create table schema1.ITEM_DESC1 (Item_Nbr INT(1,2) NOT NULL,Old_Nbr INT(3,2) NOT NULL,Order_Dept_Nbr SMALLINT(6,2) NULL,Acct_Dept_Nbr SMALLINT(9,2) NULL)

1   create table schema2.ITEM_DESC2 (Primary_Desc CHAR(20,2) NULL,Secondary_Desc CHAR(20,4) NULL,Color_Desc CHAR(6,5) NULL,Size_Desc CHAR(6,3) NULL)

How can I achieve this with out using loop?

CodePudding user response:

Use groupby and f-strings:

df2 = df.groupby(['Database/Schema Name', 'entity Name'])['temp'] \
        .apply(lambda x: f"create table {x.name[0]}.{x.name[1]} ({', '.join(x)})") \
        .reset_index(drop=True).to_frame('ddl_statement')

Output:

>>> df2
                                                                                                                                                  ddl_statement
0  create table schema1.ITEM_DESC1 (Item_Nbr INT(1,2) NOT NULL, Old_Nbr INT(3,2) NOT NULL, Order_Dept_Nbr SMALLINT(6,2) NULL, Acct_Dept_Nbr SMALLINT(9,2) NULL)
1           create table schema2.ITEM_DESC2 (Primary_Desc CHAR(20,2) NULL, Secondary_Desc CHAR(20,4) NULL, Color_Desc CHAR(6,5) NULL, Size_Desc CHAR(6,3) NULL)
  •  Tags:  
  • Related