By Skippy Fastol


2012-06-27 15:48:23 8 Comments

Let's say we have a table Sales with 30 columns and 500,000 rows. I would like to delete 400,000 rows from it (those where toDelete='1').

But I have a few constraints:

  • the table is read from and written to "often", so I would not like a long DELETE to lock it for too long
  • I need to skip the transaction log (as a TRUNCATE would) while still using a "DELETE ... WHERE ..." condition, but I haven't found any way to do this

Any advice would be welcome to transform a

DELETE FROM Sales WHERE toDelete='1'

into something more batched and, if possible, lighter on the transaction log.

8 comments

@Marko Juvančič 2019-03-06 10:39:26

I'll leave my answer here, since I was able to test different approaches for mass deletes and updates (I had to update and then delete 125+ million rows; the server had 16 GB of RAM and a Xeon E5-2680 @ 2.7 GHz, running SQL Server 2012).

TL;DR: always update/delete by primary key. If you can't use the PK directly, create a temp table, fill it with PK values, and update/delete your table via that table. Index the temp table.

I started with the solution above (by @Kevin Aenmey), but this approach turned out to be inappropriate, since my database was live, handling a couple of hundred transactions per second, and there was some blocking involved (there was an index covering all three fields from the condition, and using WITH(ROWLOCK) didn't change anything).

So I added a WAITFOR statement, which allowed the database to process other transactions.

deleteMore:
WAITFOR DELAY '00:00:01'
DELETE TOP(1000) FROM MyTable WHERE Column1 = @Criteria1 AND Column2 = @Criteria2 AND Column3 = @Criteria3
IF @@ROWCOUNT != 0
    goto deleteMore

This approach was able to process ~1.6 million rows/hour for updates and ~0.2 million rows/hour for deletes.

Turning to temp tables changed things quite a lot.

deleteMore:
SELECT TOP 10000 Id 
  INTO #Temp 
  FROM MyTable WHERE Column1 = @Criteria1 AND Column2 = @Criteria2 AND Column3 = @Criteria3 

DELETE MT
  FROM MyTable MT
  JOIN #Temp T ON T.Id = MT.Id 

/* You can use the IN operator instead; it doesn't change anything:
   DELETE FROM MyTable WHERE Id IN (SELECT Id FROM #Temp)
*/
IF @@ROWCOUNT > 0 BEGIN
    DROP TABLE #Temp
    WAITFOR DELAY '00:00:01'
    goto deleteMore
END ELSE BEGIN
    DROP TABLE #Temp
    PRINT 'This is the end, my friend'
END

This solution processed ~25 million rows/hour for updates (15x faster) and ~2.2 million rows/hour for deletes (11x faster).
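Since the TL;DR above recommends indexing the temp table, here is a minimal sketch of that variant; the clustered index (the name IX_Temp_Id is my own) keeps the join back to MyTable cheap:

deleteMore:
SELECT TOP 10000 Id
  INTO #Temp
  FROM MyTable WHERE Column1 = @Criteria1 AND Column2 = @Criteria2 AND Column3 = @Criteria3

-- index the batch of keys before joining back to the base table
CREATE CLUSTERED INDEX IX_Temp_Id ON #Temp (Id)

DELETE MT
  FROM MyTable MT
  JOIN #Temp T ON T.Id = MT.Id

-- loop and DROP TABLE #Temp exactly as in the version above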

@Developer 2017-08-30 08:29:04

I assume the best way to delete a huge number of records is to delete them by primary key.

So you have to generate a T-SQL script that contains one DELETE statement per row, and then execute that script.

For example, the code below will generate that script:

GO
SET NOCOUNT ON

SELECT   'DELETE FROM  DATA_ACTION WHERE ID = ' + CAST(ID AS VARCHAR(50)) + ';' + CHAR(13) + CHAR(10) + 'GO'
FROM    DATA_ACTION
WHERE  YEAR(AtTime) = 2014

The output file will contain records like:

DELETE FROM  DATA_ACTION WHERE ID = 123;
GO
DELETE FROM  DATA_ACTION WHERE ID = 124;
GO
DELETE FROM  DATA_ACTION WHERE ID = 125;
GO

Now use the sqlcmd utility to execute this script:

sqlcmd -S [Instance Name] -E -d [Database] -i [Script]
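For example, with hypothetical instance, database, and script names filled in:

sqlcmd -S MYSERVER\SQL2012 -E -d SalesDb -i delete_data_action.sql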

You can find this approach explained here: https://www.mssqltips.com/sqlservertip/3566/deleting-historical-data-from-a-large-highly-concurrent-sql-server-database-table/

@Ankush 2017-01-09 05:47:58

I have used the following to delete around 50 million records:

DECLARE @BatchSize INT = 4999  -- pick a batch size; see the note below

BEGIN TRANSACTION
     DeleteOperation:
     DELETE TOP (@BatchSize)
     FROM  [database_name].[database_schema].[database_table]

     IF @@ROWCOUNT > 0
     GOTO DeleteOperation
COMMIT TRANSACTION

Please note that keeping the batch size below 5,000 is less expensive on resources, since SQL Server attempts to escalate to a table lock once a statement holds roughly 5,000 locks.
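If escalation still causes trouble even with small batches, SQL Server 2008 and later can also be told not to escalate locks on a particular table. A sketch, reusing the placeholder name from above; this is a setting worth weighing carefully before applying to a busy production table:

ALTER TABLE [database_schema].[database_table] SET (LOCK_ESCALATION = DISABLE)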

@WaitForPete 2016-11-23 14:37:07

My own take on this functionality would be as follows. This way there is no repeated code and you can manage your chunk size.

DECLARE @DeleteChunk INT = 10000
DECLARE @rowcount INT = 1

WHILE @rowcount > 0
BEGIN

  DELETE TOP (@DeleteChunk) FROM Sales WITH (ROWLOCK) WHERE toDelete = '1'

  SELECT @rowcount = @@ROWCOUNT
END

@Juan Ignacio Barisich 2018-07-26 17:17:44

Should that be SET @rowcount = @@ROWCOUNT?

@Kevin Aenmey 2012-06-27 15:55:40

Calling DELETE FROM TableName will do the entire delete in one large transaction. This is expensive.

Here is another option which will delete rows in batches:

deleteMore:
DELETE TOP (10000) FROM Sales WHERE toDelete = '1'
IF @@ROWCOUNT != 0
    goto deleteMore

@MatBailie 2012-06-27 16:05:05

Oh my! How did I never realise that you can put a TOP() expression in a DELETE statement? Much more concise than my functionally identical answer! +1 to you sir!

@ErikE 2012-06-27 18:37:46

Frankly, I didn't even remember you could use labels in SQL 2008. I'd rather see a WHILE statement: WHILE 1 = 1 BEGIN; DELETE ...; IF @@ROWCOUNT = 0 BREAK; END;. To me this makes it clearer to the next SQL writer who comes along that a loop is occurring, rather than making them figure out the awful GOTO.
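Written out against the question's table, the loop described in that comment looks like this:

WHILE 1 = 1
BEGIN
    DELETE TOP (10000) FROM Sales WHERE toDelete = '1'
    IF @@ROWCOUNT = 0 BREAK
END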

@Marcel N. 2012-06-27 15:53:50

You should try giving it a ROWLOCK hint so it will not lock the entire table. However, if you delete a lot of rows, lock escalation will occur anyway.

Also, make sure you have a non-clustered filtered index (covering only the '1' values) on the toDelete column; a sketch follows the statement below. If possible, make it a bit column rather than varchar (or whatever it is now).

DELETE FROM Sales WITH(ROWLOCK) WHERE toDelete='1'
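A sketch of that filtered index, assuming toDelete has been converted to a bit column (the index name is my own):

CREATE NONCLUSTERED INDEX IX_Sales_toDelete
    ON Sales (toDelete)
    WHERE toDelete = 1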

Ultimately, you can try to iterate over the table and delete in chunks.

Updated

Since while loops and chunk deletes are the new pink here, I'll throw in my version too (combined with my previous answer):

SET ROWCOUNT 100
DELETE FROM Sales WITH(ROWLOCK) WHERE toDelete='1'

WHILE @@rowcount > 0
BEGIN
  SET ROWCOUNT 100
  DELETE FROM Sales WITH(ROWLOCK) WHERE toDelete='1'  
END

@ErikE 2012-06-27 18:40:01

SET ROWCOUNT is deprecated in SQL 2012.
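For reference, a sketch of the same loop rewritten with DELETE TOP, the recommended replacement:

DELETE TOP (100) FROM Sales WITH (ROWLOCK) WHERE toDelete = '1'

WHILE @@ROWCOUNT > 0
BEGIN
    DELETE TOP (100) FROM Sales WITH (ROWLOCK) WHERE toDelete = '1'
END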

@Cylindric 2012-06-27 15:52:32

One way I have had to do this in the past is to have a stored procedure or script that deletes n records. Repeat until done.

DELETE TOP (1000) FROM Sales WHERE toDelete = '1'

@Joe Stefanelli 2012-06-27 15:55:08

@Cylindric 2012-06-27 15:55:33

To clarify (as it's a bit of a hack), I was doing this due to disk-space constraints on my logging directory at the time, rather than locking. I guess the locked-duration with this is the same, just spread out longer :)

@Aaron Bertrand 2012-06-27 15:56:30

Shrink the database? No, please don't do that. Hopefully you meant checkpoint or something...

@Cylindric 2012-06-27 15:56:35

@JoeStefanelli In an ideal world, perhaps. We don't all have infinite data-storage, and sometimes if you've deleted 90% of your database, it's okay to clean up your data files too. And if you're talking about index fragmentation, well just re-index.

@Aaron Bertrand 2012-06-27 15:57:58

it's okay to clean up your data files - after a complete operation that is atypical and in a case where you know you won't need that space again? Maybe, but that's very edge case. After every delete? I disagree 100%.

@Cylindric 2012-06-27 15:58:42

I wasn't talking about after every delete.

@Aaron Bertrand 2012-06-27 15:59:42

@Cylindric then fix the wording in your answer: "deletes n records, then shrinks the database. Repeat until done." This is horrible advice.

@Cylindric 2012-06-27 16:01:07

I will, but read the answer. I didn't say "do this". I said "this is how I've done it before", and I intentionally didn't give code because it's too domain-specific a question.

@MatBailie 2012-06-27 16:01:40

@AaronBertrand - I agree, I wouldn't shrink after every delete. I can see how it helped Cylindric slowly paint his way out of a lack of disk space, but it's not appropriate here.

@Aaron Bertrand 2012-06-27 16:02:27

@Cylindric I don't know how anyone will take your answer except as an answer. If you don't want it to be taken that way, then it should be a comment.

@aF. 2012-06-27 16:04:56

+1 now it seems legit :)

@Kevin Aenmey 2012-06-27 16:45:33

Group hug, everyone?

@Cylindric 2012-06-28 13:41:56

Far too soon :'(

@Chris Missal 2019-05-01 22:54:56

Here's an updated link for @JoeStefanelli's comment: sqlskills.com/blogs/paul/…

@HLGEM 2012-06-27 15:56:04

What you want is batch processing.

WHILE (SELECT COUNT(*) FROM sales WHERE toDelete = 1) > 0
BEGIN
    DELETE FROM sales WHERE SalesID IN
        (SELECT TOP 1000 SalesID FROM sales WHERE toDelete = 1)
END

Of course, you can experiment to find the best value to use for the batch; I've used anywhere from 500 to 50,000 depending on the table. If you use cascading deletes, you will probably need a smaller number, since there are child records to delete as well.

@MatBailie 2012-06-27 16:07:18

The repeated calls to COUNT(*) seem unnecessary when @@ROWCOUNT can be used. But it is a very portable solution for other RDBMSs...

@HLGEM 2012-06-27 17:01:40

True, I whipped this up in a hurry and didn't take time to optimize completely. And I am not really used to DELETE having the TOP keyword; it's hard to go against years of practice sometimes.

@Adrian Hedley 2016-06-27 07:46:38

I just removed the colon from the last deleteMore label, because it gave me a "label was already declared" error. It worked fine with this small correction.

@Jamie Lester 2016-10-19 19:01:35

What about WHILE EXISTS (SELECT * FROM sales WHERE toDelete = 1) as the condition?
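Formatted as a full loop, keeping the batch size from the answer above, that condition would look like:

WHILE EXISTS (SELECT * FROM sales WHERE toDelete = 1)
BEGIN
    DELETE FROM sales WHERE SalesID IN
        (SELECT TOP 1000 SalesID FROM sales WHERE toDelete = 1)
END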
