Home > Mobile >  Get the last and previous for multiple entries using IN sorted by date
Get the last and previous for multiple entries using IN sorted by date

Time:01-03

My target: I have a list of stock_ids and want to get the last bids (its sorted by date) only one per stock_id.

For the picture, it means i want:

stock_id bid
3 663.91953
1 46.44281
2 9.02798

One problem is we have stocks like gazproms which are suspended, so one of the last quotes can be 2021-06-06 for example.

Take a where on quote_day = DATE(NOW()) would not work in this case.

I also need the same for the first lower date, which is not in the first query, this can be done over a second query.

My current solution with using PHP. This is working but the performance is not perfect like for 100 stocks it's take 5 seconds.

I'm able to use Redis, it would be also a option to save the bid somewhere.

Current:

select `quote_date`, 'stocks' as `type`, `bid`, `stock_id` as id
from ( 
  select t.*, row_number()
    over(partition by stock_id order by `quote_date` desc) as rn 
  from end_day_quotes_AVG t 
  where quote_date <= DATE({$date}) 
    AND stock_id in ({$val})
    and currency_id = {$c_id} 
) x where rn = 1

the day before:

select `quote_date`, 'stocks' as `type`, `bid`, `stock_id` as id
from ( 
  select t.*, row_number()
    over(partition by stock_id order by `quote_date` desc) as rn 
  from end_day_quotes_AVG t 
  where quote_date < DATE({$date})
    AND stock_id in ({$val})
    and currency_id = {$c_id}
) x where rn = 1 

Stock_id, quote_date, and currency_id are unique.

The Table I want data using server: 10.9.4-MariaDB-1:10.9.4

enter image description here

edit:

explained query:

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra   
1   PRIMARY     <derived2>  ALL     NULL    NULL    NULL    NULL    220896  Using where
2   DERIVED     t   ALL     stock_id,quote_date     NULL    NULL    NULL    2173105     Using where; Using temporary

create Table:

CREATE TABLE `end_day_quotes_AVG` (
  `id` int(11) NOT NULL,
  `quote_date` date NOT NULL,
  `bid` decimal(15,5) NOT NULL,
  `stock_id` int(11) DEFAULT NULL,
  `etf_id` int(11) DEFAULT NULL,
  `crypto_id` int(11) DEFAULT NULL,
  `certificate_id` int(11) DEFAULT NULL,
  `currency_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;

INSERT INTO `end_day_quotes_AVG` (`id`, `quote_date`, `bid`, `stock_id`, `etf_id`, `crypto_id`, `certificate_id`, `currency_id`) VALUES
(10537515, '2023-01-02', '16.48286', 40581, NULL, NULL, NULL, 2),
(10537514, '2023-01-02', '3.66786', 40569, NULL, NULL, NULL, 2),
(10537513, '2023-01-02', '9.38013', 40400, NULL, NULL, NULL, 2),
(10537512, '2023-01-02', '8.54444', 40396, NULL, NULL, NULL, 2),



ALTER TABLE `end_day_quotes_AVG`
  ADD PRIMARY KEY (`id`),
  ADD KEY `stock_id` (`stock_id`,`currency_id`),
  ADD KEY `etf_id` (`etf_id`,`currency_id`),
  ADD KEY `crypto_id` (`crypto_id`,`currency_id`),
  ADD KEY `certificate_id` (`certificate_id`,`currency_id`),
  ADD KEY `quote_date` (`quote_date`);


ALTER TABLE `end_day_quotes_AVG`
  MODIFY `id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=10570526;

A generated filled query:

select `quote_date`, 'stocks' as `type`, `bid`, `stock_id` as id from 
( select t.*, row_number() over(partition by stock_id order by `quote_date` desc) as rn 
from end_day_quotes_AVG t where quote_date <= DATE('2023-01-02') AND stock_id in (2,23,19,41,40,26,9,43,22,
44,28,32,30,34,20,10,13,17,27,35,8,29,39,16,33,5,36589,25,18,6,38,37,3,45,7,21,46,15,4,24,31,36,38423,40313,
22561,36787,35770,36600,35766,42,22567,40581,40569,29528,22896,24760,40369,40396,40400,40374,36799,1,27863,
29659,40367,27821,24912,36654,21125,22569,22201,
23133,40373,36697,36718,26340,36653,47,34019,36847,36694) and currency_id = 2 ) x where rn = 1; 

CodePudding user response:

It looks like no indexes are being used as is, which can often be the case with large IN lists. Change to join from a values table constructor, create a (currency_id,stock_id,quote_date) index and it should be able to use that.

select `quote_date`, 'stocks' as `type`, `bid`, `stock_id` as id
from (
    select t.*, row_number() over(partition by stock_id order by `quote_date` desc) as rn 
    from (
        select null stock_id where 0
        union all
        values (2),(23),(19),(41),(40),(26),(9),(43),
            (22),(44),(28),(32),(30),(34),(20),(10),
            (13),(17),(27),(35),(8),(29),(39),(16),
            (33),(5),(36589),(25),(18),(6),(38),(37),
            (3),(45),(7),(21),(46),(15),(4),(24),
            (31),(36),(38423),(40313),(22561),(36787),(35770),(36600),
            (35766),(42),(22567),(40581),(40569),(29528),(22896),(24760),
            (40369),(40396),(40400),(40374),(36799),(1),(27863),(29659),
            (40367),(27821),(24912),(36654),(21125),(22569),(22201),(23133),
            (40373),(36697),(36718),(26340),(36653),(47),(34019),(36847),
            (36694)
    ) as stock_ids
    join end_day_quotes_AVG t on t.currency_id=2 and t.stock_id=stock_ids.stock_id and t.quote_date <= date('2023-01-02')
) x where rn = 1

(The select where 0/union are just to give the column a useful name, since mariadb's default is extremely unhelpful.)

CodePudding user response:

wI ant to get the last bids (its sorted by date) only one per stock_id.

I also need the same for the first lower date, which is not in the first query.

Are you looking for the two latest quotes of each bid as of a given date? If so, you can just modify the first query to allow row numbers 1 and 2:

select `quote_date`, 'stocks' as `type`, `bid`, `stock_id` as id 
from ( 
    select t.*, row_number() over(partition by stock_id order by quote_date desc) as rn f
    from end_day_quotes_AVG t 
    where quote_date <= DATE(?) AND stock_id in (?)  and currency_id = ? 
) x 
where rn <= 2  -- the latest two
  • Related