How to delete duplicate records in different ways in SQL.
for example:
Consider the combination of EMPNO & ENAME columns in EMP table is having duplicate records as below,
SELECT * FROM emp ORDER BY empno;
1001,Ash
1001,Ash
1002,Bash
1003,Cash
1004,Dash
1004,Dash
1. Using ROWID and simple SUB-QUERY
===================================
DELETE FROM emp
WHERE ROWID NOT IN (SELECT MAX(rowid) FROM emp GROUP BY empno,ename);
2. Using ROWID & corelated SUB-QUERY
====================================
DELETE FROM emp a
WHERE a.ROWID > (SELECT MIN(b.rowid) FROM emp b WHERE a.empno = b.empno AND a.ename = b.ename);
3. Using ROWID & analytical function
====================================
SELECT * FROM emp WHERE ROWID IN
( SELECT r1 FROM
( SELECT ROWID r1,ROW_NUMBER() OVER(PARTITION BY empno,ename ORDER BY empno)R FROM emp) WHERE R > 1
);
4. Create new table with DISTINCT values, Drop the original table and Alter the new table as original table
=============================================================================================================
CREATE TABLE new_table AS (SELECT DISTINCT empno,ename FROM emp ORDER BY empno);
DROP TABLE emp;
ALTER TABLE new_table RENAME TO emp;
5. Add one more column “row_num” and insert sequence of numbers for all records
===============================================================================
ALTER TABLE emp ADD row_num NUMBER;
UPDATE emp SET row_num = rownum;
DELETE FROM emp WHERE row_num NOT IN
( SELECT MIN(row_num) FROM emp GROUP BY empno,ename
);
Above methods 1,2,& 3 are identical methods and based on the performance of query it can be used.
Method 4 & 5 can be used at situation which requires to do that.