Home > Blockchain >  Select top 1000 from (range of tables)
Select top 1000 from (range of tables)

Time:10-23

I have a db where I create a new table every day and filled with data. Each DB object is named "TESTdata_xxxxx_DB" where xxxxx is incremented by 1 every day. I wish to select top 1000 rows where a condition is met from a variable range of tables. For example from tables TESTdata_00005_DB to TESTdata_00015_DB

SELECT TOP (1000) [ItemIndex]
  ,[Data1]
  ,[Data2]
  ,[Data3]     
FROM [TESTDB1].[dbo].[TESTdata_00005_DB] (to...)  [TESTdata_00015_DB]
WHERE Data1 like 'High' OR Data1 like 'Medium'
ORDER BY Data1

Clearly adding 'TO' does not work. If I add each object separated by a comma, then this fails too.

Any help would be appreciated.

CodePudding user response:

A stored procedure using dynamic SQL is a cool approach for this sort of problem. Create a numbers table in memory, use it to create your UNIONed super table, and then execute dynamic SQL to get the result you need:

CREATE PROCEDURE [dbo].[Top1000]
(
    @startDatabaseNumber INT,
    @endDatabaseNumber INT
)
AS
SET NOCOUNT ON

-- inspired by https://stackoverflow.com/a/33146869/7806251
DECLARE @digits TABLE(d INT)
INSERT INTO @digits 
    SELECT 
        d
    FROM 
        (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) v(d)
 
DECLARE @numbers TABLE(DatabaseNumber VARCHAR(5))       
INSERT INTO @numbers
        SELECT
            LEFT('00000', 5 - LEN(n))   n AS DatabaseNumber
        FROM
            (
            SELECT 
                CAST(
                    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)
                    ) AS VARCHAR(10)
                ) n
            FROM 
                @digits ones, 
                @digits tens, 
                @digits hundreds,
                @digits thousands
            ) x
        WHERE
            n BETWEEN @startDatabaseNumber AND @endDatabaseNumber

DECLARE @unionedTable VARCHAR(MAX) = (      
    SELECT
        STRING_AGG(
            CAST(
                '(
                    SELECT
                        [ItemIndex],
                        [Data1],
                        [Data2],
                        [Data3]
                    FROM
                        [TESTDB1].[dbo].[TESTdata_' DatabaseNumber '_DB]
                )'
            AS VARCHAR(MAX)) -- circumvents STRING_AGG() 8000 character constraint
        , ' UNION ')
    FROM
        @numbers
    )

DECLARE @sql VARCHAR(MAX) = '
    SELECT TOP 1000 
        *
    FROM 
        (' @unionedTable ') UnionedTable
    WHERE 
        Data1 IN (''High'', ''Medium'')
    ORDER BY 
        Data1
    ;
    '
    
EXECUTE sp_executesql @sql

Once this is run and exists, call it with whatever start and end numbers you'd like:

EXEC Top1000 5, 15;

Finally, as other commenters have expressed, I'd suggest not creating a new table each day if you can help it. Inserting each day's data into a running table is more natural and better practice.

CodePudding user response:

If you want a union of tables, meaning all the rows of all the tables dumped into one row set, then use the SQL UNION. For example (this is suggested code, I have not tested it but UNION is very standard SQL and should work everywhere)

SELECT TOP (1000) [ItemIndex]
 ,[Data1]
 ,[Data2]
 ,[Data3]     
  FROM [TESTDB1].[dbo].[TESTdata_00005_DB] 
  UNION  [TESTdata_00006_DB]
  UNION  [TESTdata_00007_DB]
  UNION  [TESTdata_00008_DB]
  WHERE Data1 like 'High' OR Data1 like 'Medium'
  ORDER BY Data1

By the way, though UNION works quite well it is not good performance and in my experience doing a bunch of unions may grow to be difficult to support in the long run. At some point you might consider using alternate data architecture such as a single table but with multiple partitions, one for each day.

  •  Tags:  
  • sql
  • Related