I've got a table "products" which consists of two columns: id, raw
Example:
id | raw |
---|---|
132 | Description: mangoes; Cost: 15-30; Saving: 64; ER: E 51; EnR: D 56; Eligible: Y$ $ Description: oranges; Cost: 1,000-1,400; Saving: 41; ER: C 75; EnR: D 58; Eligible: Y |
497 | Description: bananas; Cost: 25; Saving: 9; ER: E 52; EnR: D 56; Eligible: N$ $ Description: apples; Cost: 4,000-14,000; Saving: 165; ER: E 47; EnR: E 53; Eligible: Y$ $ Description: pears; Cost: 800-1,200; Saving: 393; ER: C 73; EnR: D 55; Eligible: Y |
This table needs to be converted to the final view like this:
id | Description | Cost | Saving | ER | EnR | Eligible |
---|---|---|---|---|---|---|
132 | mangoes | 15-30 | 64 | E 51 | D 56 | Y |
132 | oranges | 1,000-1,400 | 41 | C 75 | D 58 | Y |
497 | bananas | 25 | 9 | E 52 | D 56 | N |
497 | apples | 4,000-14,000 | 165 | E 47 | E 53 | Y |
497 | pears | 800-1,200 | 393 | C 73 | D 55 | Y |
I supposed to do that by the next steps:
- using SPLIT() by '$ $ ' to split each 'raw' string to rows
- using SPLIT() by ';' to split each row from the previous step to separate column and name it.
But I finally confused how to do that.
Could anyone suggest the solution or more elegant way to realize that?
I think the solution is on the surface, but I can't seem to get to it. Thanks.
CodePudding user response:
using SPLIT() by '$ $ ' to split each 'raw' string to rows
Split doesn't split into rows. It splits into array members. To split to rows, use a lateral join to the split_to_table table function:
select ID, split(VALUE, ';')
from t1, table(split_to_table(raw, '$ $'));
You could use string manipulation to form and parse and object, and then pull the columns from it like this:
with X as
(
select ID, parse_json('{"' || replace(replace(replace(trim(VALUE), ':', '":"'), ';','","'), '" ', '"') || '"}') as V
from t1, table(split_to_table(raw, '$ $'))
)
select ID
,v:Description::string as "Description"
,v:Cost::string as "Cost"
,v:Saving::string as "Saving"
,v:ER::string as "ER"
,v:EnR::string as "ENR"
,v:Eligible::string as "Eligible"
from X
;
However; though the title specifies "without using create function", it's going to make the situation cleaner. I wrote a Java UDF that takes an array of tokenized strings (separated by colons) and converts it into keys and values in an object.
create or replace function stk_array_to_object(A array)
returns object
language javascript
strict immutable
as
$$
var kv;
out = {};
for(let i = 0; i < A.length; i ) {
kv = A[i].split(":");
kv[0] = kv[0].trim();
out[kv[0]] = kv[1].trim();
}
return out;
$$;
You can then get an object like this:
select ID, stk_array_to_object(split(VALUE, ';')) as V
from t1, table(split_to_table(raw, '$ $'))
From there, you could do something like a CTE to project the keys to columns and hide the object:
with X as
(
select ID, stk_array_to_object(split(VALUE, ';')) as V
from t1, table(split_to_table(raw, '$ $'))
)
select ID
,v:Description::string as "Description"
,v:Cost::string as "Cost"
,v:Saving::string as "Saving"
,v:ER::string as "ER"
,v:EnR::string as "ENR"
,v:Eligible::string as "Eligible"
from X;
CodePudding user response:
Another approach -
First CTE is raw data.
Second CTE (cte_1) splits each data at '$ $'. Important to note here is to capture index, which will be later used in pivot grouping. Without this index value as there is nothing unique across rows of same ID.
Third CTE (cte_2), again splits data at ';' and separates the key and value into separate columns, using split_part
.
Final CTE (cte_3) does the pivot
on key to get the final desired output.
with cte(id, raw) as
(select * from values
(132,'Description: mangoes; Cost: 15-30; Saving: 64; ER: E 51; EnR: D 56; Eligible: Y
$ $ Description: oranges; Cost: 1,000-1,400; Saving: 41; ER: C 75; EnR: D 58; Eligible: Y'),
(497,'Description: bananas; Cost: 25; Saving: 9; ER: E 52; EnR: D 56; Eligible: N
$ $ Description: apples; Cost: 4,000-14,000; Saving: 165; ER: E 47; EnR: E 53; Eligible: Y
$ $ Description: pears; Cost: 800-1,200; Saving: 393; ER: C 73; EnR: D 55; Eligible: Y')
), cte_1 as (
select id,c.index,c.value from cte, lateral flatten (input=>split(raw,'$ $')) c
), cte_2 as (
select c.id, c.index, trim(split_part(f.value,':',1),' ') key,
trim(split_part(f.value,':',2),' ') val
from cte_1 c,
lateral flatten (input=>split(c.value,';')) f
), cte_3 as (
select * from
(select id, index,key, val from cte_2) tab
pivot (max(val) for key in ('Description','Cost','Saving','ER','EnR','Eligible'))
)
select id,"'Description'","'Cost'","'Saving'","'ER'","'EnR'","'Eligible'"
from cte_3;
ID | 'Description' | 'Cost' | 'Saving' | 'ER' | 'EnR' | 'Eligible' |
---|---|---|---|---|---|---|
132 | mangoes | 15-30 | 64 | E 51 | D 56 | Y |
132 | oranges | 1,000-1,400 | 41 | C 75 | D 58 | Y |
497 | pears | 800-1,200 | 393 | C 73 | D 55 | Y |
497 | bananas | 25 | 9 | E 52 | D 56 | N |
497 | apples | 4,000-14,000 | 165 | E 47 | E 53 | Y |
Note - in the data-section (main cte) there are carriage returns (around $ $) to avoid too much right-scrolling. Might need to adjust that if someone copy-pastes the example to test run.