What counts as "identical" depends on the user. Some want to get rid of every duplicate, no matter which CD it was published on. Others want to keep all copies of a song from different samplers. Some copies may originate from the same recording session and the same digital sampling but use different compression; others may come from the same analog tapes but differ in mixing and digital conversion (re-edit); still others are different recording sessions altogether (e.g. single and album versions). At the extreme, live sessions with greatly varying content and length may share the same filename and tags (artist - title).
As a consequence: give the user as much information as possible (tags, musical similarity, quality) and let them decide. It should be the user's decision whether to rely on automatically generated data when choosing what to throw away and what to keep.
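To make the "inform, don't decide" idea concrete, here is a minimal Python sketch. It only groups tracks whose tags match and reports their bitrate and length side by side; it never deletes anything. The `Track` fields, the function names, and the 2-second tolerance are all assumptions made for illustration, and the length comparison is only a crude stand-in for real musical similarity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    # Hypothetical record; a real tool would read these from the files.
    path: str
    artist: str
    title: str
    duration_s: float   # playing time in seconds
    bitrate_kbps: int   # rough proxy for encoding quality


def group_candidates(tracks):
    """Group tracks whose normalized (artist, title) tags match."""
    groups = defaultdict(list)
    for t in tracks:
        key = (t.artist.strip().lower(), t.title.strip().lower())
        groups[key].append(t)
    # Only groups with more than one member are duplicate candidates.
    return {k: v for k, v in groups.items() if len(v) > 1}


def report(group, tolerance_s=2.0):
    """Describe each candidate relative to the highest-bitrate copy.

    Returns text lines for the user; it does not pick a winner.
    The tolerance is an arbitrary assumption, not a tuned value.
    """
    reference = max(group, key=lambda t: t.bitrate_kbps)
    lines = []
    for t in group:
        delta = abs(t.duration_s - reference.duration_s)
        hint = "similar length" if delta <= tolerance_s else "different length"
        lines.append(
            f"{t.path}: {t.bitrate_kbps} kbps, {t.duration_s:.0f} s, {hint}"
        )
    return lines
```

A caller would print the lines of `report(...)` for every group returned by `group_candidates(...)` and leave the keep-or-delete choice to the user. A live version with the same tags would show up as "different length", which is exactly the kind of hint that should stop an automatic deletion.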