By Miguel E


2015-04-20 13:49:38

I have a table in MySQL that represents a queue of links to be processed. The links are processed by an external app, one by one, and deleted in the end. This is a high volume queue and I have multiple instances of the processing app, spread across several servers.

How can I ensure that each record is picked by one app only? Is there a way to flag/lock the record?

Right now, to avoid two or more instances picking up the same link, I'm allowing each instance to pick up only a certain set of records (based on the MOD of their ID), but this approach doesn't let me transparently increase queue processing speed just by adding new instances.


@Vlad Mihalcea 2019-04-16 12:51:25

As I explained in this article, MySQL 8 introduced support for both SKIP LOCKED and NO WAIT.

SKIP LOCKED is useful for implementing job queues (a.k.a. batch queues), because it lets you skip over rows that are already locked by other concurrent transactions.

NO WAIT is useful for avoiding waiting until a concurrent transaction releases the locks that we are also interested in acquiring. Without NO WAIT, we either have to wait until the transaction currently holding the locks releases them (at commit or rollback) or until the lock acquisition times out. Therefore, NO WAIT acts like a lock timeout with a value of 0.

For more details about SKIP LOCKED and NO WAIT, check out this article.
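A minimal sketch of both options, assuming a queue table shaped like the one in the question (table and column names are illustrative; note that in MySQL's SQL syntax the keyword is spelled NOWAIT):

```sql
-- MySQL 8.0+: claim the oldest pending row, skipping rows
-- already locked by other workers instead of blocking on them.
START TRANSACTION;

SELECT * FROM your_table
WHERE state != 'PROCESSING'
ORDER BY date_added ASC
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- mark the returned row as PROCESSING here, then:
COMMIT;

-- NOWAIT variant: the statement fails immediately with an error
-- instead of waiting if the row is already locked.
SELECT * FROM your_table
WHERE id = 1
FOR UPDATE NOWAIT;
```

With SKIP LOCKED, two workers running the same SELECT concurrently each get a different row, which is exactly the job-queue behavior the question asks for.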

@jynus 2015-04-20 19:44:19

First: MySQL is one of the worst possible pieces of software for implementing this, especially if the queue is very dynamic. The reason is that engines like MEMORY and MyISAM have only full-table locks, while more suitable engines like InnoDB pay a higher write penalty (to provide ACID properties) and are optimized for accessing records that are spatially and temporally close (those are kept in memory). MySQL also lacks a good change-notification system, so clients have to poll. There are dozens of pieces of software better optimized for this task.

Having said that, I have seen this kind of access implemented successfully when the performance/efficiency requirements are not very high. Many people cannot afford to introduce and maintain a completely separate piece of technology for just a small part of the business logic.

SELECT ... FOR UPDATE is what you are looking for: read serialization. While an UPDATE/DELETE will always lock the row for the duration of a MySQL transaction, you want to avoid holding a large transaction open while the external processing runs, so:

START TRANSACTION;
SELECT * FROM your_table WHERE state != 'PROCESSING'
  ORDER BY date_added ASC LIMIT 1 FOR UPDATE;
if (rows_selected = 0) { // the queue is empty: ROLLBACK and abort }
else {
UPDATE your_table SET state = 'PROCESSING' WHERE id = $row.id;
COMMIT;

// row is processed here, outside of the transaction, and it can take as much time as we want

// once we finish:
DELETE FROM your_table WHERE id = $row.id AND state = 'PROCESSING' LIMIT 1;
}

MySQL will take care of blocking all concurrent selects except one while a row is being claimed. As this can lead to a lot of blocked connections at the same time, keep the initial transaction as small as possible and try to process more than 1 row at a time.

@Miguel E 2015-04-20 19:53:04

Thanks. Do you think performance can benefit from a larger lock (by changing the LIMIT to say 10)?

@jynus 2015-04-20 19:56:39

@MiguelE In general, yes: the more time you spend processing per claim and the less often you collide with other transactions, the better. But it may depend on the case; it could also cause the opposite effect (more transactions being blocked). Always test it first. It is also important to index the table adequately, or you may end up with a full table lock under some isolation levels.
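For illustration, a batched version of the claim step with LIMIT 10 might look like this (same hypothetical table as in the answer above):

```sql
START TRANSACTION;

-- claim up to 10 rows in one short transaction
SELECT id FROM your_table
WHERE state != 'PROCESSING'
ORDER BY date_added ASC
LIMIT 10
FOR UPDATE;

-- mark every selected id in a single statement, then release the locks
UPDATE your_table
SET state = 'PROCESSING'
WHERE id IN (/* ids returned by the SELECT above */);

COMMIT;
```

The batch is then processed outside the transaction, and each row is deleted as it completes, just as in the single-row version.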

@Julian 2017-05-17 15:51:25

And it would probably be a good idea to keep track of the date you started processing the row just in case the process hangs and you would like to implement a timeout mechanism.
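One way to sketch that, assuming a hypothetical processing_started DATETIME column on the queue table ('QUEUED' below is a placeholder for whatever the pending state is):

```sql
-- when claiming a row, also record when processing began
UPDATE your_table
SET state = 'PROCESSING', processing_started = NOW()
WHERE id = $row.id;

-- a periodic sweeper can then requeue rows whose worker
-- presumably died (10 minutes is an arbitrary timeout)
UPDATE your_table
SET state = 'QUEUED', processing_started = NULL
WHERE state = 'PROCESSING'
  AND processing_started < NOW() - INTERVAL 10 MINUTE;
```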

@Chris Woods 2015-04-20 14:24:02

I have done something similar with offline DBCC checks (two servers doing backup restores and then a DBCC CHECKDB). One server gathers all 31 servers' backups from yesterday and puts them into a queue, and then that server and another pull from that queue. While that is not a lot of servers, the method should remain the same: have the application server run an UPDATE query against the queue, setting a date/time field and an "app server" field to that app server's name or, better yet, its numerical ID. This takes a lock; if another server already holds a lock from obtaining the next row, the UPDATE will block and wait for the other app to finish. The app then pulls back the most recent record from the queue for its app-server field and gets whatever information it needs from it. Using MySQL's internal locking mechanism (via the UPDATE statement) ensures that only one app server will get any one record.
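A sketch of that update-first approach, assuming hypothetical worker_id and claimed_at columns on the queue table (MySQL allows ORDER BY and LIMIT on single-table UPDATEs):

```sql
-- Step 1: atomically claim the next unclaimed row for worker 7.
-- Concurrent UPDATEs block each other, so only one worker wins each row.
UPDATE your_table
SET worker_id = 7, claimed_at = NOW()
WHERE worker_id IS NULL
ORDER BY date_added ASC
LIMIT 1;

-- Step 2: read back the row this worker just claimed.
SELECT * FROM your_table
WHERE worker_id = 7
ORDER BY claimed_at DESC
LIMIT 1;
```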
