By Roman Pekar

2016-11-10 16:45:14 8 Comments

I need to store simple data - suppose I have some products with codes as a primary key, some properties and validity ranges. So data could look like this:

code    value   begin_date  end_date
10905   13      2005-01-01  2016-12-31
10905   11      2017-01-01  null

Those ranges are not overlapping, so on every date I have a list of unique products and their properties. So to ease the use of it I've created the function:

create function dbo.f_Products
    @date date
returns table
return (
    from dbo.Products as p
        @date >= p.begin_date and
        @date <= p.end_date

This is how I'm going to use it:

from <some table with product codes> as t
    left join dbo.f_Products(@date) as p on
        p.code = t.product_code

This is all fine, but how I can let optimizer know that those rows are unique to have better execution plan?

I did some googling, and found a couple of really nice articles for DDL which prevents storing overlapping ranges in the table:

But even if I try those constraint I see that optimizer cannot understand that resulting recordset will return unique codes.

What I'd like to have is certain approach which gives me basically the same performance as if I stored those products list on certain date and selected it with date = @date.

I know that some RDMBS (like PostgreSQL) have special data types for this (Range Types). But SQL Server doesn't have anything like this.

Am I missing something or there're no way to do this properly in SQL Server?


@Aducci 2018-02-16 22:05:58

You can create an indexed view that contains a row for each code/date in the range.

ProductDate (indexed view)
code    value   date
10905   13      2005-01-01
10905   13      2005-01-02
10905   13      ...
10905   13      2016-12-31
10905   11      2017-01-01
10905   11      2017-01-02
10905   11      ...
10905   11      Today

Like this:

create schema digits

create table digits.Ones (digit tinyint not null primary key)
insert into digits.Ones (digit) values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)

create table digits.Tens (digit tinyint not null primary key)
insert into digits.Tens (digit) values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)

create table digits.Hundreds (digit tinyint not null primary key)
insert into digits.Hundreds (digit) values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)

create table digits.Thousands (digit tinyint not null primary key)
insert into digits.Thousands (digit) values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)

create table digits.TenThousands (digit tinyint not null primary key)
insert into digits.TenThousands (digit) values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)

create schema info

create table info.Products (code int not null, [value] int not null, begin_date date not null, end_date date null, primary key (code, begin_date))
insert into info.Products (code, [value], begin_date, end_date) values 
(10905, 13, '2005-01-01', '2016-12-31'),
(10905, 11, '2017-01-01', null)

create table info.DateRange ([begin] date not null, [end] date not null, [singleton] bit not null default(1) check ([singleton] = 1))
insert into info.DateRange ([begin], [end]) values ((select min(begin_date) from info.Products), getdate())

create view info.ProductDate with schemabinding 
    dateadd(day, ones.digit + tens.digit*10 + huns.digit*100 + thos.digit*1000 + tthos.digit*10000, dr.[begin]) as [date]
    info.DateRange as dr
cross join
    digits.Ones as ones
cross join
    digits.Tens as tens
cross join
    digits.Hundreds as huns
cross join
    digits.Thousands as thos
cross join
    digits.TenThousands as tthos
    info.Products as p on
    dateadd(day, ones.digit + tens.digit*10 + huns.digit*100 + thos.digit*1000 + tthos.digit*10000, dr.[begin]) between p.begin_date and isnull(p.end_date, datefromparts(9999, 12, 31))

create unique clustered index idx_ProductDate on info.ProductDate ([date], code)

select *
from info.ProductDate with (noexpand)
    date = '2014-01-01'

drop view info.ProductDate
drop table info.Products
drop table info.DateRange
drop table digits.Ones
drop table digits.Tens
drop table digits.Hundreds
drop table digits.Thousands
drop table digits.TenThousands
drop schema digits
drop schema info

@Roman Pekar 2018-02-17 09:31:55

That one is good, i was thinking about this myself, the problem is that for long periods it will take a lot of space

@Pittsburgh DBA 2018-02-17 13:11:03

@RomanPekar What are you trying to accomplish, exactly? Do you want all product and date compbinations, with no gaps, or do you simply want to efficiently query the table you already have, which is Code, Value, Begin_Date, and End_Date? If it's the latter, these solutions are getting out of control. If it's the former, you need to edit your question to make this clear, because it is not.

@Aducci 2018-02-18 21:19:44

@RomanPekar - I can't imagine it taking up that much many products are there? What is the average time span?

@Salman A 2018-02-19 07:08:43

@RomanPekar search for "space-time tradeoff". You cannot minimize both at the same time.

@Roman Pekar 2018-02-19 11:56:55

@Aducci If we want to make it easy-to-use for end user, then it should work with, basically, any date passed (periods could be open-ended). Well, let's say there're 6000 products. If we use time from, let's say, 1990-01-01 till 2100-01-01 it will be 110 * 365 * 6000, which is 219M rows already.And it's still doesn't work with any date (for example 9999-12-31). So it's a fine working solution, but good 'range' solution is still better.

@Aducci 2018-02-19 16:22:59

@RomanPekar you could wrap the view with an inline table function in order to reduce the number of rows you need to store...using a where clause similar to where [date] = case when @date > getdate() then getdate() else @date end

@Pittsburgh DBA 2018-02-16 19:37:36

EDIT: My original answer was using an INNER JOIN, but the questioner wanted a LEFT JOIN.

  , [Value] VARCHAR(30) NOT NULL

code    value   begin_date  end_date
10905   13      2005-01-01  2016-12-31
10905   11      2017-01-01  null
INSERT INTO Products ([Code], [Value], Begin_Date, End_Date) VALUES (10905, 13, '2005-01-01', '2016-12-31')
INSERT INTO Products ([Code], [Value], Begin_Date, End_Date) VALUES (10905, 11, '2017-01-01', NULL)

CREATE NONCLUSTERED INDEX SK_ProductDate ON Products ([Code], Begin_Date, End_Date) INCLUDE ([Value])

CREATE TABLE SomeTableWithProductCodes

 INSERT INTO SomeTableWithProductCodes ([Code]) VALUES (10905)

Here is a prototypical query, with a date predicate. Note that there are more optimal ways to do this in a bulletproof fashion, using a "less than" operator on the upper bound, but that's a different discussion.

  , P.[Value]
  , P.[Begin_Date]
  , P.[End_Date]
   SomeTableWithProductCodes ST
   LEFT JOIN Products AS P ON
     ST.[Code] = P.[Code]
     AND '2016-06-30' BETWEEN P.[Begin_Date] AND ISNULL(P.[End_Date], '9999-12-31')

This query will perform an Index Seek on the Product table.

Here is a SQL Fiddle: SQL Fiddle - Products and Dates

Query Execution Plan

@Vladimir Baranov 2018-02-16 05:12:57

Approach with ROW_NUMBER scans the whole Products table once. It is the best method if you have a lot of product codes in the Products table and few validity ranges for each code.

        ,ROW_NUMBER() OVER (PARTITION BY code ORDER BY begin_date DESC) AS rn
    FROM Products
    WHERE begin_date <= @date
    <some table with product codes> as t
    LEFT JOIN CTE_rn ON CTE_rn.code = t.product_code AND CTE_rn.rn = 1

If you have few product codes and a lot of validity ranges for each code in the Products table, then it is better to seek the Products table for each code using OUTER APPLY.

    <some table with product codes> as t
        SELECT TOP(1)
        FROM Products
            Products.code = t.product_code
            AND Products.begin_date <= @date
        ORDER BY Products.begin_date DESC
    ) AS A

Both variants need unique index on (code, begin_date DESC) include (value).

Note how the queries don't even look at end_date, because they assume that intervals don't have gaps. They will work in SQL Server 2008.

@Anton 2018-02-15 23:45:37

First you need to create a unique clustered index for (begin_date, end_date, code)

Then SQL engine will be able to do INDEX SEEK.

Additionally, you can also try to create a view for dbo.Products table to join that table with pre-populated dbo.Dates table.

select p.code, p.val, p.begin_date, p.end_date, d.[date]
    from dbo.Product as p
        inner join dbo.dates d on p.begin_date <= d.[date] and d.[date] <= p.end_date

Then in your function, you use that view as "where @date =". The result can be either better or slightly worse... it depends on the actual data.

You also can try to make that view indexed (depends on how often it is being updated).

Alternatively, you can have better performance if you populate dbo.Products table for every date in the [begin_date] .. [end_date] range.

@Shnugo 2016-11-10 17:02:50

A solution without gaps might be this:

INSERT INTO @tbl VALUES({d'2016-10-01'}),({d'2016-09-01'}),({d'2016-08-01'}),({d'2016-07-01'}),({d'2016-06-01'});


DECLARE @DateFilter DATE={d'2016-08-13'};

FROM @tbl
WHERE [start_date]<[email protected]
ORDER BY [start_date] DESC

Important: Be sure that there is an (unique) index on start_date

UPDATE: for different products

--product 1
--product 1

DECLARE @DateFilter DATE={d'2016-08-13'};

WITH PartitionedCount AS
    FROM @tbl
    WHERE [start_date]<[email protected]
FROM PartitionedCount

@Roman Pekar 2016-11-10 17:06:32

Yes, this is nice way of solving the task. However, the performance is still not optimal. What I'd like to have is certain approach which gives me basically the same performance as if I stored those products list on certain date and selected it with date = @date.

@Shnugo 2016-11-10 17:09:39

@RomanPekar There must be a difference whether you grab into an index based on equality or if you have to locate a value within intervalls. Nevertheless: If there is an index (check the EP if it is used!) the WHERE [start_date]<[email protected] should be lightning fast and the TOP 1 ORDER BY [start_date] DESC should pick the relevant row instantly... It makes me wonder, that you do not observe better performance...

@Roman Pekar 2016-11-10 17:12:16

well, for one row it works fine, but what if I want to get the list of valid products on certain date? I can of course explode my table into dates and use equality check, but I really would like to know if there's more beautiful solution

@Shnugo 2016-11-10 17:16:51

@RomanPekar If you have the same start-dates (first of month) for all your products, simply use SELECT TOP 1 WITH TIES... Does this help?

@Shnugo 2016-11-10 17:20:24

@RomanPekar I just added an approach using ROW_NUMBER which would get all products in one go. Just join this resultset to your main query...

@Shnugo 2016-11-17 07:51:04

@RomanPekar Is this question solved? Do you need further help? Please allow me one hint: If this question is solved, it would be very kind of you, to tick the acceptance check below the (best) answer's vote counter. This will 1) mark this issue as solved 2) make it easier for followers to find the best solution 3) pay points to the answerer and 4) pay points to you. Since you've crossed the 15 points border yourself, you are - additionally - asked to vote on contributions. This is the SO-way to say thank you. Happy Coding!

Related Questions

Sponsored Content

16 Answered Questions

[SOLVED] Select columns from result set of stored procedure

37 Answered Questions

[SOLVED] How to return only the Date from a SQL Server DateTime datatype

10 Answered Questions

[SOLVED] Best way to get identity of inserted row?

  • 2008-09-03 21:32:02
  • Oded
  • 641088 View
  • 863 Score
  • 10 Answer
  • Tags:   sql sql-server tsql

24 Answered Questions

4 Answered Questions

[SOLVED] Fastest way to split overlapping date ranges

7 Answered Questions

[SOLVED] What are the options for storing hierarchical data in a relational database?

3 Answered Questions

[SOLVED] SQL: Selecting between dates

  • 2014-06-30 13:23:24
  • Lehel
  • 59 View
  • 0 Score
  • 3 Answer
  • Tags:   sql database

1 Answered Questions

[SOLVED] SQL - Select Data Based on Date Range Efficiently

  • 2013-11-11 12:18:38
  • user2906488
  • 1967 View
  • 2 Score
  • 1 Answer
  • Tags:   sql sql-server

3 Answered Questions

Sponsored Content