insert or replace: INSERT rows while SELECT returning large volumes of data against same table

I'm taking over some legacy coding and am checking to see if there may be a better approach to some sql code I've inherited.
Any input would be appreciated.

Here was the problem that the implemented code was to address:

Given a high volume (10 to 50 thousand inserts daily) transaction table, it was observed that when executing a stored procedure that executed a
SELECT against the table while inserts were being attempted, there was considerable delay before the inserts were committed. It turns out the SELECT query was taking 100% CPU. Because of the business needs of the environment, the inserts were required to be committed in a very timely fashion.

Present Solution:

The currently implemented workaround implements a cursor that after so may reads will execute
the WAITFOR command in order to yield the CPU. This is all done within a stored procedure where there are actually seven
sections that parse through data tables in order to populate the temp table.

The below is just a example of the currently implemented code that invokes the waitfor command.

DECLARE @.a1 [datetime]
DECLARE @.b1 [int]
DECLARE @.c1 [int]
DECLARE @.rowCount [int]
DECLARE @.sleepRowCount [int]
DECLARE @.waitForDelay varchar(20)
DECLARE @.parameterDate1 smalldatetime

DECLARE @.parameterDate2 smallDateTime

SET @.parameterDate1 = '1/1/2006'

SET @.parameterDate2 = '1/31/2006'

SET @.sleepRowCount = 10000
SET @.waitForDelay = '0:0:0.500'
CREATE TABLE [#tempTable] (
a smalldatetime,
b int,
c int
)
DECLARE tableCursor CURSOR FOR
SELECT my.somedate, my.someId, count(*)
FROM myTable my with (nolock, index=IX_somedate)
WHERE my.somedate BETWEEN @.parameterDate1 AND @.parameterDate2
GROUP BY my.somedate, my.someId
OPEN tableCursor
FETCH NEXT FROM tableCursor INTO @.a1, @.b1, @.c1
WHILE @.@.FETCH_STATUS = 0
BEGIN
SET @.rowCount = @.rowCount + 1
INSERT INTO #tempTable VALUES( @.a1, @.b1, @.c1 )
FETCH next FROM tableCursor INTO @.a1, @.b1, @.c1
-- Yield CPU
IF (@.rowCount > @.sleepRowCount)
BEGIN
WAITFOR delay @.waitForDelay
SET @.rowCount=0
END
END
CLOSE tableCursor
DEALLOCATE tableCursor

Zymore:

What is the composition of the "IX_somedate" index? Does this index contain both (1) someDate and (2) someId?

Dave

|||

If you are directly inserting into the table without any modifications to the values retrieved you could directly dump the data into the temp table.

INSERT INTO #temptable

SELECT my.somedate, my.someId, count(*)
FROM myTable my with (nolock, index=IX_somedate)
WHERE my.somedate BETWEEN @.parameterDate1 AND @.parameterDate2
GROUP BY my.somedate, my.someId

|||

Mugambo

The sql code listed is just a rough example to give the reader an idea of the methodology that has been implemented to overcome CPU hogging. In the actual code, the date and id would be columns in a composite index that also contains additional columns.

Could high CPU use point to inefficient index usage from a query?

|||

Zymore:

Really, I think I would also implement pretty much the way Dinakar has laid it out. What I was puzzling over was the index hint -- wandering why the code needed to resort to an index hint. If you have a cover index -- and from your response it seems that you do -- then the index hint should not be necessary. My worry when I saw your example was that you were going to get bombarded with hundreds of thousands of bookmark lookups; and these could grieve you.

However, if you are getting your data out of a cover index and are not incurring any bookmark lookups then your insert and select should be very efficient. It would be a good idea to get an execution plan of the query you outlined and present it. Also, try Dinakar's query and get an execution for that particular query. Getting such execution plans would greatly help discussion. I will also put together a mock-up so that we can try to compare notes.

Dave

|||

Zymore:

I mocked up your table with about 98K records ranging in date from 1/1/2006 - 1/3/2006 and sprayed these records over 32K unique "someID" keys. I then ran your query and also ran Dinakar's query. I used two different nonclustered indexes for testing purposes; one version had only the "someDate" field and the other had both the "someDate" and the "someId" field. I ran the mock-up on my 4-way SQL Server 2000 machine. The information that I recorded included:

The particular index arrangement used|||

Mugambo wrote:
I left out something: After running the mockup I feel that Dinakar's query held up well. I would try Dinakar's query WITHOUT the INDEX HINT. And please make sure you thank Dinakar for his help.

Well, Mugambo has done all the (dirty) work. So credit goes to Mugambo.

|||You can improve the performance further by using SELECT...INTO instead of creating the temporary table first and doing INSERT. SELECT...INTO is the fastest way to create and populate a table. It does minimal logging so it doesn't have the same overhead as fully logged insert statement.

Friday, March 9, 2012

INSERT rows while SELECT returning large volumes of data against same table

No comments:

Post a Comment

insert or replace

Blog Archive

About Me