Home > Mobile >  Extract year, number from a column
Extract year, number from a column

Time:06-24

Can you craft a Postgres sql query that will create a seperate column that will extract from an other column. The extraction needs to be a CVE title. Aka CVE- yyyy - xxxxx year will change and the x’s will change for the CVE. Sometimes the CVE in the other column will be in “()” sometimes it will end with “:”

We need to make sure when extracting the year(yyyy) can change, and the x's(a sequence of numbers)can vary could be 1 to 6.

The column is like

the vulnerability name (CVE-2019-0215) 
the vulnerability name (CVE-2019-0290) extra words 
the vulnerability name CVE-2018-23314: blah blah

Expected Output in a new column:

CVE-2019-0215
CVE-2019-0290
CVE 2018-23314

CodePudding user response:

If this is about the standard CVE format, extracting the year can easily be done using substring() with a regular expression:

substring('CVE-2022-1552' from 'CVE-([0-9]{4})-[0-9] ')

returns 2022

substring() will return the first matching group, so the regex "describes" the whole pattern and by using a capturing group for the year, only that will be returned.

If you need to match other formats, you need to adjust the regex accordingly.

CodePudding user response:

It is useful to layout the problem through sample data, and the expected result (ps: for this please learn some stackoverflow format capability).

sample data

CREATE TABLE mytable(
   sourcecol VARCHAR(100) NOT NULL
);
INSERT INTO mytable(sourcecol) VALUES ('the vulnerability name (CVE-2019-0215)');
INSERT INTO mytable(sourcecol) VALUES ('the vulnerability name (CVE-2019-0290) extra words');
INSERT INTO mytable(sourcecol) VALUES ('the vulnerability name CVE-2018-23314: blah blah');

query: using regex pattern in substring function

select substring(sourcecol from '(CVE-[0-9]{1,6}-[0-9]{1,6}) ')
from mytable

this regex "pattern" looks for the string starting with "CVE-" followed by 1 to 6 digits followed by "-" followed by 1 to 6 digits

result

 ---------------- 
|   substring    |
 ---------------- 
| CVE-2019-0215  |
| CVE-2019-0290  |
| CVE-2018-23314 |
 ---------------- 

see this dbfiddle

  • Related