In sql i want get distict sets of rows : identical group for Characteristic and Value only one time :
The column Characteristic can range from one to 10
Table :
Name | Characteristic | Value |
---|---|---|
Mary | eyes | Blu |
Mary | hair | blonde |
Mary | Sex | Female |
Jhon | eyes | Black |
Jhon | Hair | Black |
Jhon | Sex | Male |
Jhon | Nation | Franch |
Bill | eyes | Blu |
Bill | Hair | Blond |
Bill | Sex | Male |
Will | eyes | Green |
Will | Hair | Blond |
Will | Sex | Male |
Will | Nation | Spain |
Lilly | eyes | Blu |
Lilly | Hair | Blonde |
Lilly | Sex | Female |
mark | eyes | Black |
mark | Hair | Black |
mark | Sex | Male |
mark | Nation | Franch |
Anna | eyes | Blu |
Anna | Hair | Blonde |
Anna | Sex | Female |
Antonio | eyes | Black |
Antonio | Hair | Black |
Antonio | Sex | Male |
Antonio | Nation | Franch |
The result that i want to achieve :
Group | Characteristic | Value |
---|---|---|
1 | eyes | Blu |
1 | Hair | Blonde |
1 | Sex | Female |
2 | eyes | Black |
2 | Hair | Black |
2 | Sex | Male |
2 | Nation | Franch |
3 | eyes | Blu |
3 | Hair | Blond |
3 | Sex | Male |
4 | eyes | Green |
4 | Hair | Blode |
4 | Sex | Male |
4 | Nation | Spain |
and finally if it's possible :
Name | Characteristic | Value | Group |
---|---|---|---|
Mary | eyes | Blu | 1 |
Mary | Hair | Blonde | 1 |
Mary | Sex | Female | 1 |
Jhon | eyes | Black | 2 |
Jhon | Hair | Black | 2 |
Jhon | Sex | Male | 2 |
Jhon | Nation | Franch | 2 |
Bill | eyes | Blu | 3 |
Bill | Hair | Blond | 3 |
Bill | Sex | Male | 3 |
Will | eyes | Green | 4 |
Will | Hair | Blond | 4 |
Will | Sex | Male | 4 |
Will | Nation | Spain | 4 |
Lilly | eyes | Blu | 1 |
Lilly | Hair | Blonde | 1 |
Lilly | Sex | Female | 1 |
mark | eyes | Black | 2 |
mark | Hair | Black | 2 |
mark | Sex | Male | 2 |
mark | Nation | Franch | 2 |
Anna | eyes | Blu | 1 |
Anna | Hair | Blonde | 1 |
Anna | Sex | Female | 1 |
Antonio | eyes | Black | 2 |
Antonio | Hair | Black | 2 |
Antonio | Sex | Male | 2 |
Antonio | Nation | Franch | 2 |
CodePudding user response:
You can use STRING_AGG
to join all the characteristics together, then use ROW_NUMBER
and DENSE_RANK
to count them. Then you re-join that back to the base table.
For your first query, you can do it like this.
SELECT
Groups.GroupId,
t.Characteristic,
t.Value
FROM YourTable t
JOIN (
SELECT
t.Name,
t.GroupDefinition,
GroupId = DENSE_RANK() OVER (ORDER BY t.GroupDefinition),
RowId = ROW_NUMBER() OVER (PARTITION BY t.GroupDefinition ORDER BY t.Name)
FROM (
SELECT
t.Name,
GroupDefinition = STRING_AGG(Characteristic ':' Value, '|')
WITHIN GROUP (ORDER BY t.Characteristic)
FROM YourTable t
GROUP BY
t.Name
) t
) Groups ON Groups.Name = t.Name
WHERE Groups.RowId = 1;
The second query is as follows.
SELECT
Groups.GroupId,
t.*
FROM YourTable t
JOIN (
SELECT
t.Name,
t.GroupDefinition,
GroupId = DENSE_RANK() OVER (ORDER BY t.GroupDefinition),
RowId = ROW_NUMBER() OVER (PARTITION BY t.GroupDefinition ORDER BY t.Name)
FROM (
SELECT
t.Name,
GroupDefinition = STRING_AGG(Characteristic ':' Value, '|')
WITHIN GROUP (ORDER BY t.Characteristic)
FROM YourTable t
GROUP BY
t.Name
) t
) Groups ON Groups.Name = t.Name;
Another option would be to aggregate it into a JSON or XML format, then shred it back out without re-joining the base table.