By Prateek Gupta


2015-08-02 11:34:08 8 Comments

Lets say I have the following table

-----------------------------
| user_id   | comment       |
-----------------------------
| 2         | thats cool    |
| 2         | awesome       |
| 3         | i hate this   |
| 3         | okay          |
| 6         | this is weird |
| 6         | hello?        |
| 6         | what is it    |
| 9         | how are you   |
| 16        | too slow      |
| 16        | yes           |
| 17        | alrighty      |
-----------------------------

How can you select two rows per user_id? So my results would be:

-----------------------------
| user_id   | comment       |
-----------------------------
| 2         | awsome        |
| 2         | thats cool    |
| 3         | i hate this   |
| 3         | okey          |
| 6         | this is weird |
| 6         | hello?        |
| 9         | how are you   |
| 16        | too slow      |
| 16        | yes           |
| 17        | alrighty      |
-----------------------------

Is this possible with a single efficient query? Or are sub-selected necessary?

1 comments

@Lennart 2015-08-02 12:10:39

You can use the technique described in:

http://blog.sqlauthority.com/2014/03/09/mysql-reset-row-number-for-each-group-partition-by-row-number/

to mimic:

row_number() over (partition by ... order by ...)

In your case that would be something like:

SELECT user_id, comment, row_number 
FROM (
    SELECT @row_number:=CASE WHEN @user_id=user_id
                             THEN @row_number+1                                
                             ELSE 1                          
                        END AS row_number
         , @user_id:=user_id AS user_id
        , comment     
    FROM t        
       , (SELECT @row_number:=0,@user_id:='') AS u     
    ORDER BY user_id, comment 
) as v 
WHERE row_number <= 2;

@Martin Smith 2015-08-02 14:11:08

Is this a documented, guaranteed technique in MySQL?

@Lennart 2015-08-02 14:35:27

I don't think so, all and all user variables appears to be a hack that more or less happens to work. The docs at: dev.mysql.com/doc/refman/5.7/en/user-variables.html mentions that the order of evaluation for expressions involving user variables is undefined. In practice it appears to be working, but I would not bet my life on it.

@ypercubeᵀᴹ 2015-08-02 14:48:55

Exactly, in MariaDB, the differences in the optimizer may result in not the wanted outcome. It has to do with the ORDER BY being optimized out of the query.)

@Lennart 2015-08-02 18:22:27

I wonder whether it is possible to trick the optimizer by adding LIMIT <large number> in the inner select. In theory the ORDER BY can be optimized out anyway if it is guaranteed that the limit is larger than the size of the sub-select, but I doubt that any optimizer would do that.

Related Questions

Sponsored Content

2 Answered Questions

MySQL Correlated Subquery (with multiple columns) per row

  • 2016-09-03 22:03:52
  • Dan
  • 3053 View
  • 1 Score
  • 2 Answer
  • Tags:   mysql subquery

1 Answered Questions

[SOLVED] Eav model sql select only products with all 3 values?

1 Answered Questions

[SOLVED] How to enrich event based data per row based on future events

  • 2016-05-06 23:11:06
  • Max
  • 72 View
  • 1 Score
  • 1 Answer
  • Tags:   mysql redshift

1 Answered Questions

database design help with multiple table relationships

2 Answered Questions

1 Answered Questions

[SOLVED] SELECT LIMIT 1 per column value?

2 Answered Questions

1 Answered Questions

1 Answered Questions

[SOLVED] Several values from different tables

Sponsored Content