foreign
[Music]
Technologies
in this video we will see how to remove
duplicates or duplicate rows using power
query in power bi
remove duplicates option is helpful
transformation in power bi power query
because it will give unique values list
however sometimes after removing
duplicates
we will not get unique list of values
let's see how to avoid this problem in
removing duplicates or duplicate rows
with the help of other transformation in
power bi power query
here we have an employee ID table with
different email IDs this table contains
all employee email IDs it also contains
the duplicate email IDs
these email IDs are repeated twice or
multiple times
let's say we want to remove these
duplicates or duplicate rows from this
column
we can easily
right click on this column
and select
remove duplicates or alternatively we
can go to Home tab
select remove rows
and choose remove duplicates
fourth will do the same thing
select remove duplicates
and it will remove the duplicates or
duplicate rows which contains repeated
email IDs as you can see now there is no
duplicate image in this column that's
how remove duplicates works very simple
and very effective
there are scenarios where remote
duplicates will not work properly or not
provide the correct result for example
let's look at employee details table
this table contains the information
about employee name and employee email
IDs employee details contains the
duplicate or duplicate records let's see
how to remove these duplicates with the
help of other transform in power bi
power query
select employee table
and right click on this column
and select remove duplicates
remove duplicate step is already added
and the script are for remove duplicates
using
table dot distinct
that is also added
but still we are seeing the duplicate
data
there are different reasons why remove
duplicates does not work
let's see one of the reasons you can see
that
two values are slightly different
this is with lowercase
and this is with uppercase
power query is a case sensitive language
power query will consider these two
values are different and unique so these
two are not considered as same values
as a result remove duplicates do not
consider as duplicate entries
one easy way to fix this issue is to
change all these column values to
uppercase
or lower case and do the remove
duplicates
for changing the
values to uppercase
select the steps previous to remove
duplicates
right click
transform
and select uppercase
sir
now we can see all values converted to
uppercase and now if you click on remove
duplicate steps
there is still something going on let's
have a look again
these data are existing twice after
converting to uppercase and remove
duplicate still it contains duplicates
it looks little strange
one of the other reasons remote
duplicates does not work
that the length of these two values
might be different
let's find the length of these values
exactly same or not
to find the length select the column
and go to add column tab
in extract
choose length to extract the length of
the selected column values
we can see that new column length with
the length of each value in this column
the length of this
two values
is different even though if it looks
same but the length is different it is
possible with some Extra Spaces in
beginning or
end of this value
probably in this scenario we might have
space in end of the value or there might
be some other characters we do not see
as well
let's see how we can handle this
right click on the column
select transform
and choose trim
to trim The Unwanted spaces from each
row values of this column
actions should be done before remove
duplicate step
so select uppercase text
right click
transform
and trim
insert
and now if you go to remove duplicates
we can see only one marks and there is
no
another one because it removed The
Unwanted space so it consider as only
one character sometimes we might have
junk characters in the values in such
scenarios we use clean to remove
unwanted junk characters
right click on the column
select transform and choose clean
and sometimes you want to remove some
extra unwanted characters in such
scenario
use replace values to replace those
values with blank
these are the easiest and effective ways
to remove duplicates or duplicate rows
in power bi power query
if this video helpful to you like this
video and share this video with others
ask your questions or provide your
feedback about this video in below
comment section
to watch more videos like this visit our
official YouTube channel
Technologies to book your slot for
online training visit our official
website petacarttechnologies.com
to get notifications on upcoming videos
subscribe to this Channel and activate
Bell icon
that's all for this video thanks for
watching
[Music]
thank you
[Music]
[Music]
thank you