By Alex


2010-04-07 18:17:29 8 Comments

It's easy to find duplicates with one field:

SELECT name, COUNT(email) 
FROM users
GROUP BY email
HAVING COUNT(email) > 1

So if we have a table

ID   NAME   EMAIL
1    John   [email protected]
2    Sam    [email protected]
3    Tom    [email protected]
4    Bob    [email protected]
5    Tom    [email protected]

This query will give us John, Sam, Tom, Tom because they all have the same email.

However, what I want is to get duplicates with the same email and name.

That is, I want to get "Tom", "Tom".

The reason I need this: I made a mistake and allowed duplicate name and email values to be inserted. Now I need to remove/change the duplicates, so I need to find them first.

28 comments

@adesh 2019-06-25 16:30:23

You may want to try this

SELECT NAME, EMAIL, COUNT(*)
FROM USERS
GROUP BY 1,2
HAVING COUNT(*) > 1

@JIYAUL MUSTAPHA 2018-09-27 11:38:30

How to get duplicate records in a table:

 SELECT COUNT(EmpCode),EmpCode FROM tbl_Employees WHERE Status=1 
 GROUP BY EmpCode HAVING COUNT(EmpCode) > 1

@Parkofadown 2019-04-04 14:21:18

You can use the SELECT DISTINCT keyword to get rid of duplicates. You can also filter by name and get everyone with that name in a table.
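For example, a minimal sketch against the question's users table (the name 'Tom' is only an illustration):

-- collapse duplicate name/email pairs
SELECT DISTINCT name, email FROM users;

-- everyone with a given name
SELECT * FROM users WHERE name = 'Tom';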

@Arun Solomon 2019-03-18 17:32:16

To check for duplicate records in a table:

select * from users s 
where rowid < any 
(select rowid from users k where s.name = k.name and s.email = k.email);

or

select * from users s 
where rowid not in 
(select max(rowid) from users k where s.name = k.name and s.email = k.email);

To delete the duplicate records in a table:

delete from users s 
where rowid < any 
(select rowid from users k where s.name = k.name and s.email = k.email);

or

delete from users s 
where rowid not in 
(select max(rowid) from users k where s.name = k.name and s.email = k.email);

@Sheriff 2019-01-10 12:46:42

To delete records whose names are duplicates:

;WITH CTE AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY name ORDER BY name) AS T
    FROM @YourTable
)
DELETE FROM CTE WHERE T > 1
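Note that this partitions by name only. To match the question (duplicates on both name and email), a sketch of the same pattern over both columns, assuming an id column for a stable ordering:

;WITH CTE AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY name, email ORDER BY id) AS T
    FROM @YourTable
)
DELETE FROM CTE WHERE T > 1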

@Suraj Kumar 2018-10-26 16:44:01

We can use HAVING here, which works on aggregate functions, as shown below:

create table #TableB (id_account int, data int, [date] date)
insert into #TableB values (1 ,-50, '10/20/2018'),
(1, 20, '10/09/2018'),
(2 ,-900, '10/01/2018'),
(1 ,20, '09/25/2018'),
(1 ,-100, '08/01/2018')  

SELECT id_account , data, COUNT(*)
FROM #TableB
GROUP BY id_account , data
HAVING COUNT(id_account) > 1

drop table #TableB

Here the two fields id_account and data are used with COUNT(*), so the query returns all the records that have the same values in both columns more than once.

If for some reason we mistakenly forgot to add constraints to a SQL Server table and the front-end application inserted records that are duplicated in all columns, we can use the query below to delete the duplicate records from the table.

SELECT DISTINCT * INTO #TemNewTable FROM #OriginalTable
TRUNCATE TABLE #OriginalTable
INSERT INTO #OriginalTable SELECT * FROM #TemNewTable
DROP TABLE #TemNewTable

Here we copied all the distinct records of the original table into a new table, deleted the records of the original table, inserted the distinct values from the new table back into the original table, and then dropped the new table.

@gbn 2010-04-07 18:20:18

SELECT
    name, email, COUNT(*)
FROM
    users
GROUP BY
    name, email
HAVING 
    COUNT(*) > 1

Simply group on both of the columns.

Note: the older ANSI standard is to have all non-aggregated columns in the GROUP BY but this has changed with the idea of "functional dependency":

In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, functional dependency is a constraint that describes the relationship between attributes in a relation.

Support for this is not consistent across database engines.

@webXL 2013-03-18 19:22:42

The part I always blank on is HAVING. WHERE doesn't work!

@bjan 2013-04-11 04:27:36

@webXL WHERE works on single records; HAVING works on groups.
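A minimal sketch of the difference against the question's users table (the LIKE pattern is only an illustration):

SELECT name, email, COUNT(*)
FROM users
WHERE email LIKE '%@example.com'   -- WHERE filters rows before grouping
GROUP BY name, email
HAVING COUNT(*) > 1                -- HAVING filters the groups after aggregation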

@user797717 2014-06-10 09:36:29

@gbn Is it possible to include the Id in the results? Then it would be easier to delete those duplicates afterwards.

@gbn 2014-06-10 09:59:13

@user797717: you'd need to select MIN(ID) and then delete the rows whose ID values are not in the list of MIN(ID) values.
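A sketch of that approach against the question's table (assuming id is the unique key; exact DELETE syntax varies slightly by engine):

-- keep the lowest id in each name/email group, delete the rest
DELETE FROM users
WHERE id NOT IN (
    SELECT MIN(id)
    FROM users
    GROUP BY name, email
)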

@user797717 2014-06-10 10:35:08

@gbn Perfect. Thanks. I can see the IDs now.

@Ankit Dhingra 2016-09-16 06:38:58

What about cases where any of the columns have null values?

@gbn 2016-09-16 09:58:30

@AnkitDhingra The NULL values are grouped like "bob" or "gbn". No special behaviour
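So if you would rather not have rows with NULL email reported as duplicates of each other, filter them out before grouping; a sketch:

SELECT name, email, COUNT(*)
FROM users
WHERE email IS NOT NULL
GROUP BY name, email
HAVING COUNT(*) > 1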

@Chamnap 2016-10-14 14:09:36

@gbn, can you show me the code how to include ID?

@Rick Sanchez 2016-11-15 08:53:34

I get the error: #1055 - Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'tools.keywords.key_id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

@gbn 2016-11-21 20:39:37

@RickSanchez then add tools.keywords.key_id to the GROUP BY

@Antoine Pelletier 2018-04-11 20:28:51

@bjan That sentence could have been pronounced by some kind of SQL god. It's a piece of SQL knowledge that should be included in the answer. I'm extremely glad to have noticed your comment. It all makes sense now.

@KM. 2010-04-07 18:22:43

try this:

declare @YourTable table (id int, name varchar(10), email varchar(50))

INSERT @YourTable VALUES (1,'John','John-email')
INSERT @YourTable VALUES (2,'John','John-email')
INSERT @YourTable VALUES (3,'fred','John-email')
INSERT @YourTable VALUES (4,'fred','fred-email')
INSERT @YourTable VALUES (5,'sam','sam-email')
INSERT @YourTable VALUES (6,'sam','sam-email')

SELECT
    name,email, COUNT(*) AS CountOf
    FROM @YourTable
    GROUP BY name,email
    HAVING COUNT(*)>1

OUTPUT:

name       email       CountOf
---------- ----------- -----------
John       John-email  2
sam        sam-email   2

(2 row(s) affected)

if you want the IDs of the dups use this:

SELECT
    y.id,y.name,y.email
    FROM @YourTable y
        INNER JOIN (SELECT
                        name,email, COUNT(*) AS CountOf
                        FROM @YourTable
                        GROUP BY name,email
                        HAVING COUNT(*)>1
                    ) dt ON y.name=dt.name AND y.email=dt.email

OUTPUT:

id          name       email
----------- ---------- ------------
1           John       John-email
2           John       John-email
5           sam        sam-email
6           sam        sam-email

(4 row(s) affected)

to delete the duplicates try:

DELETE d
    FROM @YourTable d
        INNER JOIN (SELECT
                        y.id,y.name,y.email,ROW_NUMBER() OVER(PARTITION BY y.name,y.email ORDER BY y.name,y.email,y.id) AS RowRank
                        FROM @YourTable y
                            INNER JOIN (SELECT
                                            name,email, COUNT(*) AS CountOf
                                            FROM @YourTable
                                            GROUP BY name,email
                                            HAVING COUNT(*)>1
                                        ) dt ON y.name=dt.name AND y.email=dt.email
                   ) dt2 ON d.id=dt2.id
        WHERE dt2.RowRank!=1
SELECT * FROM @YourTable

OUTPUT:

id          name       email
----------- ---------- --------------
1           John       John-email
3           fred       John-email
4           fred       fred-email
5           sam        sam-email

(4 row(s) affected)

@Tanmay Nehete 2014-09-13 04:03:53

try this code

WITH CTE AS
(
    SELECT Id, Name, Age, Comments, RN = ROW_NUMBER() OVER (PARTITION BY Name, Age ORDER BY ccn)
    FROM ccnmaster
)
select * from CTE where RN > 1

@Martin Silovský 2017-02-22 15:02:29

This selects/deletes all duplicate records except one record from each group of duplicates. So, the delete leaves all unique records + one record from each group of the duplicates.

Select duplicates:

SELECT *
FROM table
WHERE
    id NOT IN (
        SELECT MIN(id)
        FROM table
        GROUP BY column1, column2
);

Delete duplicates:

DELETE FROM table
WHERE
    id NOT IN (
        SELECT MIN(id)
        FROM table
        GROUP BY column1, column2
);

Be aware that with larger numbers of records this can cause performance problems.

@Ali Azhar 2018-03-16 09:09:59

Error in delete query - You can't specify target table 'cities' for update in FROM clause

@Martin Silovský 2018-04-16 18:12:21

There is neither a table 'cities' nor an update clause. What do you mean? Where is the error in the delete query?

@thoroc 2019-02-05 08:48:52

How does that work with OP's data?

@Martin Silovský 2019-02-06 07:16:07

What does the "OP" mean?

@rahul kumar 2017-12-05 12:41:01

SELECT column_name, COUNT(*) FROM table_name GROUP BY column_name HAVING COUNT(*) > 1;

@AncAinu 2016-03-14 14:22:19

If you want to delete the duplicates, here's a much simpler way to do it than finding even/odd rows with a triple sub-select:

SELECT u.id, u.name, u.email 
FROM users u, users u2
WHERE u.name = u2.name AND u.email = u2.email AND u.id > u2.id

And so to delete:

DELETE FROM users
WHERE id IN (
    SELECT u.id/*, u.name, u.email*/
    FROM users u, users u2
    WHERE u.name = u2.name AND u.email = u2.email AND u.id > u2.id
)

Much easier to read and understand, IMHO.

Note: The only issue is that you have to execute the request until no rows are deleted, since you delete only one of each duplicate each time.

@Dickon Reed 2016-04-09 05:29:28

Nice and easy to read; I'd like to find a way that deleted multiple duplicate rows in one go though.

@Whitecat 2017-01-17 17:55:37

This doesn't work for me as I get You can't specify target table 'users' for update in FROM clause

@AncAinu 2017-01-17 23:12:52

@Whitecat seems like a simple MySQL problem: stackoverflow.com/questions/4429319/…
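The usual workaround for that MySQL error, as a sketch, is to wrap the subquery in a derived table so MySQL no longer sees the target table directly in the FROM clause:

DELETE FROM users
WHERE id IN (
    SELECT id FROM (
        SELECT u.id
        FROM users u, users u2
        WHERE u.name = u2.name AND u.email = u2.email AND u.id > u2.id
    ) AS t
)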

@Nigel Horne 2017-08-28 14:31:37

Fails for me. I get: "DBD::CSV::st execute failed: Use of uninitialized value $_[1] in hash element at /Users/hornenj/perl5/perlbrew/perls/perl-5.26.0/lib/site_perl/5.26.0/SQL/Eval.pm line 43"

@GiveEmTheBoot 2019-02-18 16:15:06

I think that where clause should be " u.name = u2.name AND u.email = u2.email AND (u.id > u2.id OR u2.id > u.id)" isn't it?

@AncAinu 2019-02-25 11:47:38

@GiveEmTheBoot since u and u2 are the same table, it doesn't really matter, but yeah, I guess doing u.name = u2.name AND u.email = u2.email AND u.id <> u2.id would work just as well

@Yusha 2019-04-15 18:17:24

@GiveEmTheBoot you're adding extra complication, really. I would just keep it the way AncAinu has put it. It really doesn't matter. Also, I just want to point out: props to AncAinu, this is really a genius explanation. If people have their tables laid out correctly with IDs generated for each record, this works flawlessly. Great, and easy to read and maintain!

@Debendra Dash 2016-09-26 12:23:30

We can also find duplicate values by using a CTE, like this:

with MyCTE
as
(
select Name,EmailId,ROW_NUMBER() over(PARTITION BY EmailId order by id) as Duplicate from [Employees]

)
select * from MyCTE where Duplicate>1

@Debendra Dash 2016-09-12 18:18:51

select id,name,COUNT(*) from India group by Id,Name having COUNT(*)>1

@Narendra 2016-09-08 06:41:06

select name, email,
    case 
        when ROW_NUMBER() over (partition by name, email order by name) > 1 then 'Yes'
        else 'No'
    end "duplicated ?"
from users

@Rich Benner 2016-09-08 07:36:40

Code only answers are frowned upon on Stack Overflow, could you explain why this answers the question?

@Narendra 2016-09-09 01:27:36

@RichBenner: I didn't find an answer that returns each and every row in the result and tells at a glance which rows are duplicates and which are not, without a GROUP BY; if we want to combine this query with any other query, GROUP BY is not a good option.

@Antoine Reinhold Bertrand 2019-07-16 16:15:31

Adding Id to the select statement and filtering on the duplicated column gives you the possibility to delete the duplicated ids and keep one of each.
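For example, a sketch of that (assuming the question's users table with an id column): wrap the query, filter on the duplicated flag, and you have the ids of the rows to delete.

select id, name, email, duplicated
from (
    select id, name, email,
        case
            when ROW_NUMBER() over (partition by name, email order by id) > 1 then 'Yes'
            else 'No'
        end as duplicated
    from users
) t
where duplicated = 'Yes'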

@Panky031 2016-07-22 20:29:29

SELECT * FROM users u 
where rowid = (select max(rowid) from users u1 where u.email = u1.email);

@Darrel Lee 2016-07-01 19:09:40

This is the easiest approach I've come up with. It uses a common table expression (CTE) and a partition window (I think these features are in SQL 2008 and later).

This example finds all students with duplicate name and dob. The fields you want to check for duplication go in the OVER clause. You can include any other fields you want in the projection.

with cte (StudentId, Fname, LName, DOB, RowCnt)
as (
SELECT StudentId, FirstName, LastName, DateOfBirth as DOB, SUM(1) OVER (Partition By FirstName, LastName, DateOfBirth) as RowCnt
FROM tblStudent
)
SELECT * from CTE where RowCnt > 1
ORDER BY DOB, LName

@veritaS 2016-04-14 23:02:15

This should also work; maybe give it a try.

  Select * from Users a
            where EXISTS (Select * from Users b 
                where (     a.name = b.name 
                        OR  a.email = b.email)
                     and a.ID != b.id)

Especially good in your case if you search for duplicates that have some kind of prefix or general change, e.g. a new domain in the email; then you can use replace() on these columns, as sketched below.
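For instance, a sketch of that idea; the domain names here are made up purely for illustration:

-- treat the old and new mail domains as equivalent when looking for duplicates
SELECT name,
       replace(email, '@old-domain.com', '@new-domain.com') AS normalized_email,
       COUNT(*)
FROM users
GROUP BY name, replace(email, '@old-domain.com', '@new-domain.com')
HAVING COUNT(*) > 1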

@gaurav singh 2013-12-31 10:07:33

Try the following:

SELECT * FROM
(
    SELECT Id, Name, Age, Comments, Row_Number() OVER(PARTITION BY Name, Age ORDER By Name)
        AS Rank 
        FROM Customers
) AS B WHERE Rank>1

@Joe Ruder 2016-09-14 00:42:10

A slight change to SELECT * helped me solve an hour-long search. I had never used OVER(PARTITION BY ...) before. I never cease to be amazed at how many ways there are to do the same thing in SQL!

@Indivision Dev 2015-11-17 10:21:14

A little late to the party but I found a really cool workaround to finding all duplicate IDs:

SELECT GROUP_CONCAT( id )
FROM users
GROUP BY email
HAVING ( COUNT(email) > 1 )

@Chef_Code 2016-01-31 20:36:00

Appears to be a syntactic-sugar workaround. Nice find.

@v010dya 2016-12-10 08:39:21

Keep in mind that GROUP_CONCAT will stop after some predetermined length, so you might not get all the ids.

@Clint Eastwood 2018-08-02 15:29:26

GROUP_CONCAT does not work in PostgreSQL
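In PostgreSQL the equivalent would be string_agg (or array_agg); a sketch:

SELECT string_agg(id::text, ',')
FROM users
GROUP BY email
HAVING COUNT(email) > 1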

@PRADEEPTA VIRLLEY 2015-07-22 07:12:07

 SELECT name, email 
    FROM users
    WHERE email in
    (SELECT email FROM users
    GROUP BY email 
    HAVING COUNT(*)>1)

@user4877838 2015-05-08 06:41:06

SELECT id, COUNT(id) FROM table1 GROUP BY id HAVING COUNT(id)>1;

I think this will work properly for finding repeated values in a particular column.

@Jeroen 2015-05-08 07:14:36

This doesn't quite add anything to the top answer, and technically doesn't even really differ from the code OP's posted in the question.

@Shygar 2017-03-30 18:23:40

This result showed up in a google search, so I'm voting for it since it solved my issue.

@Lauri Lubi 2015-01-01 14:43:33

If you want to find duplicate data (by one or several criteria) and select the actual rows:

with MYCTE as (
    SELECT DuplicateKey1
        ,DuplicateKey2 --optional
        ,count(*) X
    FROM MyTable
    group by DuplicateKey1, DuplicateKey2
    having count(*) > 1
) 
SELECT E.*
FROM MyTable E
JOIN MYCTE cte
ON E.DuplicateKey1=cte.DuplicateKey1
    AND E.DuplicateKey2=cte.DuplicateKey2
ORDER BY E.DuplicateKey1, E.DuplicateKey2, CreatedAt

http://developer.azurewebsites.net/2014/09/better-sql-group-by-find-duplicate-data/

@Muhammad Tahir 2014-12-11 10:28:52

How can we count the duplicated values, whether a value is repeated 2 times or more? Just count them, not group-wise.

as simple as

select COUNT(distinct col_01) from Table_01
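If the goal is a single number, a sketch (against the question's users table) that counts how many name/email combinations occur more than once:

select COUNT(*)
from (
    select name, email
    from users
    group by name, email
    having COUNT(*) > 1
) dups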

@Jeroen 2015-05-08 07:18:39

How would this work for the question as asked? This does not give rows that duplicate information in multiple columns (e.g. "email" and "name") in different rows.

@naveed 2014-10-15 15:38:41

select emp.ename, emp.empno, dept.loc
from emp
inner join dept
    on dept.deptno = emp.deptno
inner join
    (select ename, count(*)
     from emp
     group by ename, deptno
     having count(*) > 1) t
    on emp.ename = t.ename
order by emp.ename
/

@shekhar singh 2014-08-26 10:07:17

If you wish to see whether there are any duplicate rows in your table, I used the query below:

create table my_table(id int, name varchar(100), email varchar(100));

insert into my_table values (1, 'shekh', '[email protected]');
insert into my_table values (1, 'shekh', '[email protected]');
insert into my_table values (2, 'Aman', '[email protected]');
insert into my_table values (3, 'Tom', '[email protected]');
insert into my_table values (4, 'Raj', '[email protected]');


Select COUNT(1) As Total_Rows from my_table 
Select Count(1) As Distinct_Rows from ( Select Distinct * from my_table) abc 

@xDBA 2014-06-16 08:50:58

In case you work with Oracle, this way would be preferable:

create table my_users(id number, name varchar2(100), email varchar2(100));

insert into my_users values (1, 'John', '[email protected]');
insert into my_users values (2, 'Sam', '[email protected]');
insert into my_users values (3, 'Tom', '[email protected]');
insert into my_users values (4, 'Bob', '[email protected]');
insert into my_users values (5, 'Tom', '[email protected]');

commit;

select *
  from my_users
 where rowid not in (select min(rowid) from my_users group by name, email);

@Chris Van Opstal 2010-04-07 18:20:36

Try this:

SELECT name, email
FROM users
GROUP BY name, email
HAVING ( COUNT(*) > 1 )
