Thursday, March 18, 2021

Here's a requirement I had to recently solve using SQL. I managed to come up with a solution, but it's a bit elaborate and cumbersome.

Without sharing my solution just now, I was wondering how you would solve this.

Given a table:

create table orderline_arrivals
(orderlineid             number not null
,arrival_starttime       date not null
 check(mod(to_number(to_char(arrival_starttime,'MI')),5)=0)
,arrival_stoptime        date not null
 check(mod(to_number(to_char(arrival_stoptime,'MI')),5)=0)
,article_id              varchar2(10) not null
,quantity                number(4,0) not null
 check(quantity > 0)
,check(trunc(arrival_starttime) = trunc(arrival_stoptime))
,check(arrival_starttime < arrival_stoptime)
,primary key (orderlineid)
/

Let's say we have following four tuples: 


 Per orderline we need to distribute the quantity over the 5-minute slots for that orderline:

For a given date, then the final resultset should look like this:

22 comments:

  1. I did a possible solution on LiveSQL:

    https://livesql.oracle.com/apex/livesql/s/liudk1wjsssreegx8kp5l77ba

    Might not be my prettiest work, but it solves the case with only a single access of the table and a tiny bit of analytic functions.

    ReplyDelete
  2. select to_char(slot_start, 'hh24:mi') || ' - ' || to_char(any_value(slot_end), 'hh24:mi') slot,
    article_id,
    sum(case when s# <= r then ceil(quantity / num_of_slots) else trunc(quantity / num_of_slots) end) slot_quantity
    from (select slots.*,
    ol.*,
    row_number() over(partition by ol.orderlineid order by slots.slot_start) s#,
    round(mod(ol.quantity, ol.num_of_slots)) r
    from (select lb + (level-1)*5/1440 slot_start,
    lb + level*5/1440 slot_end
    from (select min(arrival_starttime) lb,
    max(arrival_stoptime) hb
    from orderline_arrivals
    where trunc(arrival_starttime) = date '2021-03-18')
    connect by lb + (level-1)*5/1440 < hb) slots,
    (select a.*,
    (arrival_stoptime - arrival_starttime) * 1440/5 num_of_slots
    from orderline_arrivals a) ol
    where ol.arrival_starttime(+) <= slots.slot_start
    and ol.arrival_stoptime(+) >= slots.slot_end)
    group by slot_start,article_id
    order by slot,article_id;

    ReplyDelete
    Replies
    1. Awesome. Based on your solution, I just added

      HAVING sum
      ( case when s# <= r
      then ceil(quantity / num_of_slots)
      else trunc(quantity / num_of_slots)
      end
      ) != 0

      in case the number of slots exceeds the quantity

      Delete
  3. Hi, a little too easy with Oracle. So here is the queries for MySQL.

    set the given date at the line :

    set @arrival_date=convert('2021-03-18',date); -- the given date

    For the last select, use :

    select * from reparts_period; -- for the distribution per orderline

    or

    select * from reparts_period_synt ; -- for the final resultset for a given date

    Cheers, Zilog.

    create table orderline_arrivals
    (orderlineid int not null
    ,arrival_starttime datetime not null
    check(mod(CAST(DATE_FORMAT(arrival_starttime, '%i') as UNSIGNED) ,5)=0)
    ,arrival_stoptime datetime not null
    check(mod(CAST(DATE_FORMAT(arrival_stoptime, '%i') as UNSIGNED),5)=0)
    ,article_id varchar(10) not null
    ,quantity int(4) not null
    check(quantity > 0)
    ,check(date(arrival_starttime) = date(arrival_stoptime))
    ,check(arrival_starttime < arrival_stoptime));

    alter table orderline_arrivals add primary key (orderlineid);

    DELETE FROM orderline_arrivals WHERE orderlineid>0;

    insert into orderline_arrivals(orderlineid, arrival_starttime, arrival_stoptime, article_id, quantity)
    values (1, convert('2021-03-18 08:00',datetime),convert('2021-03-18 08:15',datetime), 'A1', 10);
    insert into orderline_arrivals(orderlineid, arrival_starttime, arrival_stoptime, article_id, quantity)
    values (2, convert('2021-03-18 08:00',datetime),convert('2021-03-18 08:15',datetime), 'A2', 20);
    insert into orderline_arrivals(orderlineid, arrival_starttime, arrival_stoptime, article_id, quantity)
    values (3, convert('2021-03-18 08:10',datetime),convert('2021-03-18 08:20',datetime), 'A1', 5);
    insert into orderline_arrivals(orderlineid, arrival_starttime, arrival_stoptime, article_id, quantity)
    values (4, convert('2021-03-18 08:30',datetime),convert('2021-03-18 08:40',datetime), 'A3', 50);

    Queries in the next post ^^ (max 4kB...)

    ReplyDelete
  4. First part (4kb limit..)

    set @arrival_date=convert('2021-03-18',date); -- the given date
    with base as (select 1 rowid from dual union all select 2 rowid from dual), -- Cartesian product to simulate hierarchical statements
    counts as (select (@row_number:=@row_number + 1) rownum from (SELECT @row_number:=0) rowcount join base on 1 = 1 join base base2 on 1 = 1 join base base3 on 1 = 1 join base base4 on 1 = 1 join base base5 on 1 = 1 join base base6 on 1 = 1 join base base7 on 1 = 1 join base base8 on 1 = 1 join base base9 on 1 = 1), -- 512 entry to cover 288 periods at max
    orderline_arrivals_date as (select * from orderline_arrivals where @arrival_date between convert(arrival_starttime, date) and convert(arrival_stoptime,date)),
    period as (select min(arrival_starttime) period_start, max(arrival_stoptime) period_end,
    ((CAST(DATE_FORMAT(max(arrival_stoptime), '%H') as UNSIGNED) * 12 + (CAST(DATE_FORMAT(max(arrival_stoptime), '%i') as UNSIGNED) / 5))
    - (CAST(DATE_FORMAT(min(arrival_starttime), '%H') as UNSIGNED) * 12 + CAST(DATE_FORMAT(min(arrival_starttime), '%i') as UNSIGNED) / 5)) nb_period
    from orderline_arrivals_date),
    periods as (select convert((period_start + 500 * (counts.rownum - 1)), datetime) p_start,
    convert((period_start + 500 * counts.rownum), datetime) p_end
    from period join counts on period.nb_period >= counts.rownum ),
    orders_period as (select orders.*,
    ((CAST(DATE_FORMAT(arrival_stoptime, '%H') as UNSIGNED) * 12 + (CAST(DATE_FORMAT(arrival_stoptime, '%i') as UNSIGNED) / 5))
    - (CAST(DATE_FORMAT(arrival_starttime, '%H') as UNSIGNED) * 12 + CAST(DATE_FORMAT(arrival_starttime, '%i') as UNSIGNED) / 5)) nb_period
    from orderline_arrivals_date orders),
    repart as (select p_start, p_end, orderlineid, article_id,
    (quantity / nb_period) + ((case when p_start = orders.arrival_starttime then (quantity % nb_period) else 0 END) / nb_period)
    - ((case when p_end = orders.arrival_stoptime then (quantity % nb_period) else 0 END) / nb_period) qte
    -- (case when p_start = orders.arrival_starttime then (quantity % nb_period) else 0 END) modulus_start,
    -- (case when p_end = orders.arrival_stoptime then (quantity % nb_period) else 0 END) modulus_end
    -- orders.arrival_starttime, orders.arrival_stoptime
    from periods left join orders_period orders on orders.arrival_starttime <= periods.p_start and orders.arrival_stoptime > periods.p_start -- periods.p_start between orders.arrival_starttime and orders.arrival_stoptime
    order by p_start),
    reparts as (select articles.orderlineid, articles.article_id,
    repart.p_start, repart.p_end,
    round(sum(repart.qte), 0) qte_order
    from (select distinct orderlineid, article_id from orderline_arrivals_date) articles
    left join repart on repart.article_id = articles.article_id
    and repart.orderlineid = articles.orderlineid
    group by articles.article_id, articles.orderlineid, repart.p_start, repart.p_end),

    ReplyDelete
  5. Second part:

    repart_cols as (select (@col:=@col + 1) rcol, -- No PIVOT WITH MySQL...
    (@col1:=CONVERT((case when @col = 1 then id else @col1 end), CHAR)) rcol1,
    (@col2:=CONVERT((case when @col = 2 then id else @col2 end), CHAR)) rcol2,
    (@col3:=CONVERT((case when @col = 3 then id else @col3 end), CHAR)) rcol3,
    (@col4:=CONVERT((case when @col = 4 then id else @col4 end), CHAR)) rcol4,
    (@col5:=CONVERT((case when @col = 5 then id else @col5 end), CHAR)) rcol5,
    (@col6:=CONVERT((case when @col = 6 then id else @col6 end), CHAR)) rcol6,
    (@col7:=CONVERT((case when @col = 7 then id else @col7 end), CHAR)) rcol7,
    (@col8:=CONVERT((case when @col = 8 then id else @col8 end), CHAR)) rcol8
    from (SELECT @col:=0, @col1:=NULL, @col2:=NULL, @col3:=NULL, @col4:=NULL, @col5:=NULL, @col6:=NULL, @col7:=NULL, @col8:=NULL ) cols
    join (select distinct concat( orderlineid, '_', article_id) id from orderline_arrivals_date) orders on 1=1 ),
    reparts_period as (select 'slot' slot,
    @col1 col1,
    @col2 col2,
    @col3 col3,
    @col4 col4,
    @col5 col5,
    @col6 col6,
    @col7 col7,
    @col8 col8
    from (select distinct 1 from repart_cols) cols
    UNION ALL
    select CONCAT(reparts.p_start,'_', reparts.p_end) slot,
    (case when CONCAT(reparts.orderlineid,'_', reparts.article_id) = @col1 THEN reparts.qte_order else NULL END) col1,
    (case when CONCAT(reparts.orderlineid,'_', reparts.article_id) = @col2 THEN reparts.qte_order else NULL END) col2,
    (case when CONCAT(reparts.orderlineid,'_', reparts.article_id) = @col3 THEN reparts.qte_order else NULL END) col3,
    (case when CONCAT(reparts.orderlineid,'_', reparts.article_id) = @col4 THEN reparts.qte_order else NULL END) col4,
    (case when CONCAT(reparts.orderlineid,'_', reparts.article_id) = @col5 THEN reparts.qte_order else NULL END) col5,
    (case when CONCAT(reparts.orderlineid,'_', reparts.article_id) = @col6 THEN reparts.qte_order else NULL END) col6,
    (case when CONCAT(reparts.orderlineid,'_', reparts.article_id) = @col7 THEN reparts.qte_order else NULL END) col7,
    (case when CONCAT(reparts.orderlineid,'_', reparts.article_id) = @col8 THEN reparts.qte_order else NULL END) col8
    from (select distinct 1 from repart_cols) cols left join reparts on 1 = 1),
    reparts_period_synt as (select concat(periods.p_start, '-', periods.p_end) slot, reparts.article_id, sum(reparts.qte_order) slot_quantity
    from periods left join reparts on reparts.p_start= periods.p_start
    group by concat(periods.p_start, '-', periods.p_end), reparts.article_id)
    select * from reparts_period_synt ;

    ReplyDelete
  6. I just registered on blogger to keep tracks.

    ReplyDelete
  7. This comment has been removed by the author.

    ReplyDelete
  8. This is using BigQuery, without recursion. To get your null slots back I would need one more outer join after the aggregation, but I didn't bother :)
    DROP TABLE
    jdata.orderline_arrivals;

    CREATE TABLE
    jdata.orderline_arrivals AS
    SELECT
    1 AS orderlineid,
    TIMESTAMP ('2021-03-18 08:00:00' ) AS arrival_starttime,
    TIMESTAMP ('2021-03-18 08:15:00') AS arrival_stoptime,
    'A1' AS article_id,
    10 AS quantity
    UNION ALL
    SELECT
    2,
    TIMESTAMP ('2021-03-18 08:00:00' ),
    TIMESTAMP ('2021-03-18 08:15:00'),
    'A2',
    20
    UNION ALL
    SELECT
    3,
    TIMESTAMP ('2021-03-18 08:10:00' ),
    TIMESTAMP ('2021-03-18 08:20:00'),
    'A1',
    5
    UNION ALL
    SELECT
    4,
    TIMESTAMP ('2021-03-18 08:30:00' ),
    TIMESTAMP ('2021-03-18 08:40:00'),
    'A3',
    50;


    WITH
    time_array AS (
    SELECT
    timeslot_start,
    LEAD(timeslot_start) OVER (ORDER BY timeslot_start) AS timeslot_end
    FROM (
    SELECT
    *
    FROM
    UNNEST(GENERATE_TIMESTAMP_ARRAY('2021-03-18 08:00:00',
    '2021-03-18 08:45:00',
    INTERVAL 5 MINUTE) )timeslot_start) ),
    order_exploded AS (
    SELECT
    orderlineid,
    TIMESTAMP_ADD(arrival_starttime, INTERVAL time_between_orders*nr SECOND) AS item_arrivaltime,
    article_id
    FROM (
    SELECT
    orderlineid,
    arrival_starttime,
    arrival_stoptime,
    article_id,
    nr,
    SAFE_CAST(TIMESTAMP_DIFF(arrival_stoptime, arrival_starttime, SECOND)/quantity AS INT64) AS time_between_orders
    FROM
    jdata.orderline_arrivals oa,
    UNNEST(GENERATE_ARRAY(0,100)) nr
    WHERE
    nr <= oa.quantity))
    SELECT
    FORMAT_TIMESTAMP('%R',timeslot_start)||' - '||FORMAT_TIMESTAMP('%R',timeslot_end) AS slots,
    article_id,
    COUNT(item_arrivaltime) AS slot_quantity
    FROM
    order_exploded,
    time_array
    WHERE
    timeslot_start < item_arrivaltime
    AND timeslot_end >= item_arrivaltime
    GROUP BY 1,2
    ORDER BY 1,2;

    ReplyDelete
  9. Solved in MariaDB. No recursion:
    https://dbfiddle.uk/?rdbms=mariadb_10.5&fiddle=84c7237e07bff4cc73e1f9394c7969f8

    ReplyDelete
  10. My Oracle version :

    For the last select, use :

    select * from reparts_period -- for the distribution per orderline

    or

    select * from reparts_period_synt -- for the final resultset for a given date

    Largely more compact than MySQL ^^

    https://dbfiddle.uk/?rdbms=oracle_18&fiddle=cf2cf07cfa39f92b5e2723aee1484dbb

    ReplyDelete
    Replies
    1. To precise: It's somehow a translation from the MySQL, no optim.

      Delete
    2. I've separated the per orderline distribution and the final resultset here :

      https://dbfiddle.uk/?rdbms=oracle_18&fiddle=8d62dada9585c9c4b6e862d7c2236044

      Delete
    3. Less parenthesis (i love LISP) and better timestamp maths :

      https://dbfiddle.uk/?rdbms=oracle_18&fiddle=7d5b7d628e238a9f3f1b8daf0975842e

      Delete
    4. Condensed to 2à lines with a not so bad exec plan :

      https://dbfiddle.uk/?rdbms=oracle_18&fiddle=fddedae5bff378572629209f8149d5a7

      -- Oracle Version with final result set
      with period as (select min(arrival_starttime) period_start,
      max(arrival_stoptime) period_end,
      round(to_number(max(cast(arrival_stoptime as date)) - min(cast(arrival_starttime as date))) * 300,0) nb_period
      from orderline_arrivals
      where arrival_starttime between to_date('2021-03-18', 'RRRR-MM-DD') and to_date('2021-03-19', 'RRRR-MM-DD')),
      orders_period as (select orders.*,
      round(to_number(cast(arrival_stoptime as date) - cast(arrival_starttime as date)) * 300,0) nb_period
      from orderline_arrivals orders)
      select to_char(periods.p_start, 'HH24:MI:SS')||'-'||to_char(periods.p_end, 'HH24:MI:SS') slot,
      orders.article_id,
      round(sum((case when NVL(nb_period, 0) = 0 THEN NULL ELSE
      (quantity / nb_period) + (((case when p_start = orders.arrival_starttime then mod(quantity , nb_period) else 0 END)
      - (case when p_end = orders.arrival_stoptime then mod(quantity , nb_period) else 0 END)) / nb_period) END)) /*over (partition by orders.article_id, periods.p_start, periods.p_end)*/, 0) qte_order
      from (select (period_start + ((LEVEL - 1)/288)) p_start,
      (period_start + (LEVEL/288)) p_end
      from period
      connect by level <= nb_period) periods left join orders_period orders on orders.arrival_starttime <= periods.p_start and orders.arrival_stoptime > periods.p_start
      group by periods.p_start, periods.p_end, orders.article_id
      order by periods.p_start;

      Delete
    5. I just note i missed to order by article_id ^^.

      Here's the revised fiddle with the last order by including article_id :

      https://dbfiddle.uk/?rdbms=oracle_18&fiddle=65c3915ac03ecfcd2c6e772a245c2394

      Delete
  11. Short final query: https://livesql.oracle.com/apex/livesql/s/liv25sp6wzgmudhxa1xrhpgux

    Detailed query with comments: https://livesql.oracle.com/apex/livesql/s/liv0vini2eaav6xvnnkcx54d4

    ReplyDelete
  12. Short solution: https://gist.github.com/xtender/cf7253a74605f9740ad2c47950c14a13
    /// blogspot doesn't support 'code' in comments...
    select
    slot_1, slot_2, article_id,
    sum(case when (quantity - n*ceil(quantity/max_n)) >= (max_n-n)*floor(quantity/max_n)
    then ceil(quantity/max_n)
    else floor(quantity/max_n)
    end) as quantity_slot
    from orderline_arrivals a
    ,lateral(
    select
    level as n,
    round((arrival_stoptime-arrival_starttime)*24*60/5) as max_n,
    a.arrival_starttime + (level-1)*interval '5' minute as slot_1,
    a.arrival_starttime + (level )*interval '5' minute as slot_2
    from dual
    connect by a.arrival_starttime + (level )*interval '5' minute<=arrival_stoptime
    ) x
    group by slot_1, slot_2, article_id
    order by slot_1, slot_2, article_id;

    ReplyDelete
  13. This comment has been removed by the author.

    ReplyDelete
  14. how about

    select to_char(slot,'YYYY-MM-DD HH24:MI')||'-'||to_char(slot+interval '5' minute,'HH24:MI') slot
    , article_id
    , count(*) slot_quantity
    from ( select oa.*, ai.article_instance
    , oa.arrival_starttime+
    (width_bucket
    ( oa.arrival_starttime+(article_instance-1)*num_buckets/quantity*(interval '5' minute)
    , oa.arrival_starttime
    , oa.arrival_stoptime
    , oa.num_buckets
    )-1
    )*(interval '5' minute) slot
    from ( select oa.*, round((arrival_stoptime-arrival_starttime)*285) num_buckets
    from orderline_arrivals oa
    ) oa
    , lateral ( select level article_instance from dual connect by level <= oa.quantity) ai
    )
    group by slot, article_id
    order by slot, article_id;

    ReplyDelete
    Replies
    1. Maybe Toon Koppelaars expects the output to include the slot 08:20:00-08:25:00 and
      08:25:00-08:30:00. Nice idea to use width and lateral ^^.

      I suggest that for the twi 'null' slots :

      select to_char(periods.slot,'YYYY-MM-DD HH24:MI')||'-'||to_char(periods.slot+interval '5' minute,'HH24:MI') slot
      , article_id
      , count(article_id) slot_quantity
      from (select (p_start + ((LEVEL - 1)/288)) slot
      from (select min(cast(arrival_starttime as date)) p_start, max(cast(arrival_stoptime as date)) p_stop from orderline_arrivals)
      connect by level <= round(to_number(p_stop - p_start) * 300,0)) periods
      LEFT JOIN ( select oa.*, ai.article_instance
      , oa.arrival_starttime+
      (width_bucket
      ( oa.arrival_starttime+(article_instance-1)*num_buckets/quantity*(interval '5' minute)
      , oa.arrival_starttime
      , oa.arrival_stoptime
      , oa.num_buckets
      )-1
      )*(interval '5' minute) slot
      from ( select oa.*, round((cast(arrival_stoptime as date)-cast(arrival_starttime as date))*285) num_buckets
      from orderline_arrivals oa
      ) oa
      , lateral ( select level article_instance from dual connect by level <= oa.quantity) ai
      ) result ON result.slot = periods.slot
      group by periods.slot, article_id
      order by periods.slot, article_id;

      Delete