this is part four of sequence of
interview questions and answers video
series in this video we'll discuss how
to delete all duplicate rows except one
from a sequel server table this is a
very common interview question it is the
table with duplicate data notice that
mark record is duplicated three times
Mary record two times in Bendrick or
three times now the delete query that we
are going to write should delete all the
duplicate rows except one per employee
meaning after we execute the delete
query the employees table should be left
with just three records as you can see
here let's see how to achieve this let's
flip to sequence of management studio so
I've already created this employee's
table and populated it that some test
data here is the sequel script that can
do it
I'll have the script available on my
blog in case you need it now to write
the delete query we are going to make
use of a common table expression and
let's call this common table expression
employee CTE we are going to select all
the columns from employee's table along
with all the columns we also want row
number so we are going to make use of
row number function so row number / we
want to partition we want the data to be
partitioned by this ID column and then
we also want to order the data by ID
column and then based on that generate
the row number and let's give this row
number a name let's call it row number
and let's see what we get from this ete
so let's select you know whatever this
CTE produces so let's go ahead and
execute this query look at this look at
this row number column so first of all
look at these first three records so the
data is partitioned by ID and then the
row numbers are unique within that
partition when the new partition starts
look at that the wrong number starts
from one so here we are partitioning the
data basically by the ID column and you
know within the partition the row
numbers are unique when a new partition
starts you know the note row number is
going to start from one again now using
this column we can very easily delete
all the rows except one in the delete
query we can use this rule number as the
filter so instead of selecting all the
rows from employees what we can do is
delete from employees where row number
column is greater than one okay so what
is it going to do it's going to delete
anything that has got row number greater
than one meaning it's going to leave
just one record per employee let's
actually execute this and see what
output we get so now let's go ahead and
select all the rows from employees table
look at that now the table is left with
just three records
so basically partition by is going to
divide the query result set into
partitions and here we are partitioning
the result set by ID column that's it
for today thank you for listening have a
great day