By Aeveus


2017-02-14 10:45:15 8 Comments

I've inherited a fun table with some peculiar indexes (and no primary key), which looks as follows:

CREATE TABLE `my_wonky_table` (
  `id` bigint(20) unsigned NOT NULL,
  `login` varchar(127) DEFAULT NULL,
  `timestamp` int(10) unsigned NOT NULL,
  `ip` varchar(32) CHARACTER SET ascii DEFAULT NULL,
  `val_1` int(10) unsigned DEFAULT NULL,
  `val_2` varchar(127) DEFAULT NULL,
  `val_3` varchar(255) DEFAULT NULL,
  `val_4` varchar(127) DEFAULT NULL,
  `val_5` int(10) unsigned DEFAULT NULL,
  KEY `my_wonky_table_id_idx` (`id`),
  KEY `my_wonky_table_timestamp_idx` (`timestamp`),
  KEY `my_wonky_table_val_1_idx` (`val_1`,`id`),
  KEY `my_wonky_table_val_2_idx` (`val_2`,`id`),
  KEY `my_wonky_table_val_4_idx` (`val_4`,`id`),
  KEY `my_wonky_table_val_5_idx` (`val_5`,`id`),
  KEY `my_wonky_table_ip_idx` (`ip`(16),`id`),
  KEY `my_wonky_table_login_idx` (`login`,`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (`id`)
(PARTITION pdefault VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */

Now, because it has no primary key and no fully unique index, InnoDB adds a hidden GEN_CLUST_INDEX (which takes up quite some space, but that's a different matter).
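The presence of that hidden clustered index can be confirmed through the InnoDB data dictionary. A sketch, using the MySQL 5.7 table names (8.0 renames them to `INNODB_TABLES` / `INNODB_INDEXES`; `mydb` is a placeholder schema name):

```sql
-- Look up the generated clustered index for this table:
SELECT t.NAME AS table_name, i.NAME AS index_name
FROM INFORMATION_SCHEMA.INNODB_SYS_INDEXES i
JOIN INFORMATION_SCHEMA.INNODB_SYS_TABLES t USING (TABLE_ID)
WHERE t.NAME = 'mydb/my_wonky_table'
  AND i.NAME = 'GEN_CLUST_INDEX';
```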

Note that even though there is a column called id, it's not actually a unique entry. For this reason, I want to add an (auto incremented) column, that, combined with the id column, should act as the primary key.

However, I appear to be unable to add an AUTO_INCREMENT column. I suspect this is because of the 'secret' column MySQL adds, since the error message implies I already have an auto-increment column, even though I don't believe I do.

The following:

ALTER TABLE my_wonky_table 
    ADD COLUMN count SMALLINT NOT NULL AUTO_INCREMENT,
    ADD PRIMARY KEY (id, count);

yields:

ERROR 1075 (42000): Incorrect table definition; there can be only one auto column and it must be defined as a key

Any ideas?

2 Answers

@Rick James 2017-02-15 22:50:51

It is not the "hidden PK" that is in the way; ALTER TABLE ... ADD PRIMARY KEY(...) will replace it. The problem is that the partition key (`id` in your case) must be part of every UNIQUE key, and the PK is UNIQUE.
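That partitioning rule can be demonstrated on its own: any UNIQUE index that omits the partition column is rejected outright. A hypothetical statement against this table (the exact error wording may vary slightly by version):

```sql
-- `id` is the partition key, so a UNIQUE index that omits it is refused:
ALTER TABLE my_wonky_table ADD UNIQUE KEY (login);
-- ERROR 1503 (HY000): A UNIQUE INDEX must include all columns in
-- the table's partitioning function
```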

It is more efficient to do all ALTERs in a single statement. (There are a few exceptions.)

ALTER TABLE my_wonky_table
    ADD COLUMN count SMALLINT UNSIGNED AUTO_INCREMENT NOT NULL,
    ADD PRIMARY KEY(count, id);

count SMALLINT for the PK:

  • You are not expecting more than 32K rows in the table? In general you should not use PARTITIONing on a table with fewer than 1M rows.
  • Don't you want SMALLINT UNSIGNED?
  • "count" is an unusual name for the AI; that, plus having id not be the AI, will confuse readers; please rethink both column names.

"Index prefixing" is rarely good. Recommend removing "(16)".

KEY `my_wonky_table_ip_idx` (`ip`(16),`id`)

Do you really have only one partition? If so, why partition at all?

Let's see some of the SELECTs; then we can discuss the utility (or lack thereof) of the indexes.

@Aeveus 2017-02-16 07:08:09

You're correct; I misunderstood how AUTO_INCREMENT works. What I actually wanted was a count column that increments for each duplicate id occurrence, because id in this table is not unique. But that does not appear to be possible. If I add an AUTO_INCREMENT column as described above, I would need to make it a BIGINT, as I have more than 5 billion rows, and I'm not sure whether that's better or worse than the current hidden key MySQL uses. The prefix index is to be removed; I encountered a wonky bug with it a while ago. Partitioning is done by month; the timestamp is also embedded in the id value.

@Rick James 2017-02-16 16:35:44

MyISAM (deprecated) had the trick of an AUTO_INCREMENT as the second column of the PK. It is possible (though clumsy) to simulate such. 5B rows: BIGINT, OK. The hidden key is 6 bytes, but has several caveats, so I reluctantly recommend the larger 8-byte BIGINT.

@Rick James 2017-02-16 16:39:20

Your id is a timestamp concatenated with something else? And the second part does not "dedup" sufficiently? If it did, then id could be the PK. And can you deduce timestamp by splitting id? With 5B rows, shrinking the disk footprint is an important task.

@Rick James 2017-02-16 16:42:37

Since each secondary key implicitly includes the primary key, we are talking about 6 bytes / 8 bytes / etc. per row per index. But if id could be the PK, those indexes would go from +6 bytes to +0 bytes, thereby saving something like 300GB of disk space.

@Aeveus 2017-02-16 16:44:18

The timestamp column holds the exact same value as the one embedded in id: the timestamp bit-shifted 32 bits to the left. The column was simply kept for convenience, but now takes up quite some space. Adding a column is almost a no-go, as it looks like it would take too long. Many of the secondary keys could be merged into one big compound key, which would shave off a lot of space. But the lack of a primary key appears to be staying for a while.
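Assuming that layout (epoch seconds in the high 32 bits of `id`), the redundant column could be reconstructed on the fly rather than stored, for instance:

```sql
-- Recover the timestamp from `id` by shifting the high 32 bits back down:
SELECT id >> 32                AS epoch_seconds,
       FROM_UNIXTIME(id >> 32) AS human_readable
FROM my_wonky_table
LIMIT 5;
```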

@Rick James 2017-02-16 16:51:47

Combining indexes? Be careful: INDEX(a,b,id) is not the same as INDEX(a,id) plus INDEX(b,id); column order matters.
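A sketch of the leftmost-prefix rule behind that caution, using this table's columns for illustration (`combined_idx` is a hypothetical name):

```sql
ALTER TABLE my_wonky_table ADD KEY combined_idx (val_1, val_2, id);

-- The combined index can serve filters on its leftmost prefixes:
--   WHERE val_1 = ?
--   WHERE val_1 = ? AND val_2 = ?
-- but NOT a filter on val_2 alone, so it is no substitute for
-- the existing KEY (val_2, id).
```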

@Aeveus 2017-02-16 16:54:48

Very true, I'm still looking at the queries, but they appear to be quite bunched up at the moment. The application that generates the queries is a bit strange, I'm still figuring out what SELECTs come out of it. Initial benchmarks seem promising, though.

@Rick James 2017-02-16 21:33:46

Consider using the slowlog to identify the slow SELECTs.
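Roughly, enabling the slow log at runtime looks like this (the threshold and file path are illustrative; SET GLOBAL values revert on restart unless also placed in my.cnf):

```sql
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 1;   -- log statements slower than 1 second
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';
```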

@Aeveus 2017-02-17 05:01:34

Surprisingly enough, query speed is not an issue. The indexes are simply growing too rapidly, which is what I wanted to reduce using a primary key and compound indexes. Slow query log was helpful in getting a big set of queries to benchmark against, though.

@Rick James 2017-02-17 22:47:47

If id starts with a timestamp, then id would probably be quite adequate for chunking, as I just now recommended in your other question.

@Ahmad Abuhasna 2017-02-14 11:59:03

You can have an auto-incrementing column as long as there is an index (key) on it. To fix this issue, execute these commands:

ALTER TABLE my_wonky_table ADD COLUMN count SMALLINT NOT NULL;
ALTER TABLE my_wonky_table ADD INDEX (count);
ALTER TABLE my_wonky_table CHANGE count count SMALLINT NOT NULL AUTO_INCREMENT;
ALTER TABLE my_wonky_table ADD PRIMARY KEY (id, count);

@ypercubeᵀᴹ 2017-02-14 12:21:47

You can do it in one statement: ALTER TABLE my_wonky_table ADD COLUMN count SMALLINT NOT NULL AUTO_INCREMENT, ADD INDEX (count), ADD PRIMARY KEY (id, count);

@ypercubeᵀᴹ 2017-02-14 12:23:02

Or (with different indexing): ALTER TABLE my_wonky_table ADD COLUMN count SMALLINT NOT NULL AUTO_INCREMENT, ADD PRIMARY KEY (count, id);

@Ahmad Abuhasna 2017-02-14 12:26:33

@ypercubeᵀᴹ I know, sorry about that; I used to give tutorials to new employees, one command per statement, and I still have the old habit.
