This question although may seem previously answered it is not. All transposing seem to relate to one column and pivoting the data in that column. I want to make a vertical table from a horizontal set of columns, for example:-
Take this example:-
MyPrimaryKey | Insurer_Factor_1_Name | Insurer_Factor_1_Code | Insurer_Factor_1_Value | Insurer_Factor_2_Name | Insurer_Factor_2_Code | Insurer_Factor_2_Value | Insurer_Factor_[n]_Name | Insurer_Factor_[n]_Code | Insurer_Factor_[n]_Value |
---|---|---|---|---|---|---|---|---|---|
XX-ABCDEF-1234-ABCDEF123 | Special | SP1 | 2500 | Awesome | AW2 | 3500 | ecetera | etc | 999999 |
[n] being any number of iterations
transforming it into a new vertical representation dataframe:-
MyPrimaryKey | Insurer_Factor_ID | Insurer_Factor_Name | Insurer_Factor_Code | Insurer_Factor_Value |
---|---|---|---|---|
XX-ABCDEF-1234-ABCDEF123 | 1 | Special | SP1 | 2500 |
XX-ABCDEF-1234-ABCDEF123 | 2 | Awesome | AW2 | 3500 |
XX-ABCDEF-1234-ABCDEF123 | [n] | ecetera | etc | 999999 |
There is also the possibility that the “Code” column may be missing and we only receive the name and value therefore requiring null to be added to the code column.
I've searched High and low for this, but there just doesn't seem to be anything out there?
Also there could be many rows in the first example...
CodePudding user response:
The reason you haven't found it is that there is not a magic trick to move a 'interstingly' designed table into a well designed table. You are going to have to hand code a query to either union the rows into your table, or select arrays that you then explode.
Sure you could probably write some code to generate the SQL that you want, but really they're isn't a good feature to magically translate this feature
format into a row based format.
CodePudding user response:
In order of preference:
Revisit your decision to send multiple files: It sounds like it would save a lot of work if you just sent multiple files.
Change the column schema: Put a delimiter (every 4th column) into the column schema allowing us to see the rows. We can then suck the file in as rows. Using a delimiter.
Write your own custom datasource: You can use the existing text one as a example for how you can write your own, that cold interpret every 3 columns as a row.
Write a custom UDF that takes all columns as a parameter and returns an array of rows, that you then call explode on to turn them into rows. This will be slow so I give it to you as the final option.