Similarity Forum » General Category » News » Version 1.0.0 released
Topic: Version 1.0.0 released (Read 77936 times)
Admin (Administrator, Hero Member, Posts: 664)
« on: November 16, 2009, 19:00:37 »
+ New tag editor (results tab)
+ Improvements to the comparison algorithm
+ Some visual improvements
- Many small visual/translation bugs fixed
Release version
If you are interested in translating Similarity into your language, write to the support email.
emn13 (Jr. Member, Posts: 7)
« Reply #1 on: November 16, 2009, 19:22:39 »
Interesting date there for the 1.0 release ;-).
Good work, though!
ferbaena (Guest)
« Reply #2 on: November 20, 2009, 04:37:18 »
Version 0936 used all 4 cores and 8 threads on an i7 860.
The new version 1.0 barely uses 4 threads, and the performance decrease is noticeable.
The edit option is nice, though.
Keep up the good work.
Admin (Administrator, Hero Member, Posts: 664)
« Reply #3 on: November 20, 2009, 18:20:40 »
ferbaena,
Yes, this is a bug: only a single thread was used. Please download the new version.
Thanks for your message.
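For illustration only (this is not Similarity's actual code): the difference between the buggy single-threaded behavior and the fixed version can be sketched as fanning the pairwise comparisons out over a pool of worker threads, so all cores stay busy. The `compare` function here is a hypothetical stand-in for the real audio-content comparison.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

def compare(a, b):
    # Hypothetical stand-in for a real audio-content comparison;
    # returns a similarity score in [0, 1].
    return 1.0 if a == b else 0.0

def scan(files, workers=8):
    # Generate every unordered pair of files, then run the
    # comparisons concurrently across a pool of worker threads.
    pairs = list(combinations(files, 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda p: compare(*p), pairs))
    return list(zip(pairs, scores))
```

With only one worker the same scan serializes onto a single core, which matches the slowdown reported above.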
ferbaena (Guest)
« Reply #4 on: November 23, 2009, 18:03:31 »
Yes, the new 101 now behaves like the 0936: all cores, all threads.
Problem:
I know for a fact that there are two duplicate files in two different folders (actually there are more, but this example is with two).
I create a new folder and make copies of the two duplicate files in that folder.
I run Similarity (content only) on this new folder with different settings, but it is only when content is down to 0.65 that it detects the duplicates.
Now the big problem:
I have around 72,000 mp3s to scan; the cache is 74150.
If I set the content at 0.65, there will be more than a couple of million duplicates by the end of the scan (right now I tried scanning 29404, and it has found 400016 duplicates, having checked only 7462), and the experimental algorithm will take a couple of days to finish.
Now the question:
Why does it take a setting of 0.65 to find these duplicates now, if running the program before with content settings between 0.85 and 0.95 found the majority of the others?
I know it is difficult, and it's not a Similarity-only problem.
I bought a license for Phelix from Phonome Labs a couple of years ago, and it does not find all of the duplicates either.
Thank you
Admin (Administrator, Hero Member, Posts: 664)
« Reply #5 on: November 24, 2009, 10:42:52 »
ferbaena,
It's an empirical value.
What does the experimental algorithm show on these 2 files? (Only these 2 files; there's no need to scan the others.)
surbaniak (Jr. Member, Posts: 57)
« Reply #6 on: November 26, 2009, 20:24:41 »
TAG EDITOR looks amazing!
(Will give more feedback after I work with it for a while)
surbaniak (Jr. Member, Posts: 57)
« Reply #7 on: November 26, 2009, 20:42:31 »
Found the first problem with the TagEditor (minor):
The Duration and Size fields are interchanged in the file list table below.
surbaniak (Jr. Member, Posts: 57)
« Reply #8 on: November 26, 2009, 20:47:38 »
Found a second problem with the TagEditor (minor):
First I applied the string "Test" to the Album field. <- Worked great.
Then I tried to apply the string "" (empty string) to the Album field. <- Did not work. The field remains populated with "Test".
So now I can't blank out that field, and likely all STR fields behave that way. I think <empty string> should be a valid entry... or did you reserve it as a special value?
gbowers (Guest)
« Reply #9 on: December 24, 2009, 23:59:25 »
Quote from ferbaena (Reply #4 above, the 0.65 content-threshold problem).
Quote from Admin (Reply #5 above).
Please explain how this problem was resolved, with reference to the quoted exchange above.
Thanks
Admin (Administrator, Hero Member, Posts: 664)
« Reply #10 on: December 26, 2009, 03:13:41 »
gbowers,
Simple: I need to know whether the experimental algorithm shows better results or not.
ferbaena (Guest)
« Reply #11 on: January 26, 2010, 03:03:19 »
The problem has not been solved...
I am trying the newest version, 110:
6 files in the folder (3 repeats).
Compare method: Content only; Experimental enabled.
1.00 down to 0.89: finds 0
0.88 down to 0.76: finds 2 (experimental: 63.1% each)
0.75 down to 0.72: finds 4 (experimental: 78.3% and 63.1% pairs)
0.71 and down: finds 6 (experimental: 63.1%, 78.3% and 5.9%)
Does the setting during the first pass, when the cache is created, affect the results for future comparisons at different settings?
Thank you
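The pattern reported above, where more pairs appear as the content setting drops, is what you would expect from a simple threshold filter over fixed pairwise similarity scores. A minimal sketch, assuming hypothetical pair labels and a `find_duplicates` helper that is illustrative rather than Similarity's actual API:

```python
# Illustrative pairwise similarity scores for three hypothetical pairs.
scores = {("A", "B"): 0.783, ("C", "D"): 0.631, ("E", "F"): 0.059}

def find_duplicates(scores, threshold):
    # A pair is reported as a duplicate only when its score
    # meets or exceeds the content threshold setting.
    return [pair for pair, s in scores.items() if s >= threshold]
```

Lowering the threshold can only grow the result set, which matches the staircase of 0, 2, 4, then 6 files found as the setting decreases.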
ferbaena (Guest)
« Reply #12 on: January 26, 2010, 05:45:48 »
... and speaking of the new version, 110:
I liked the previous two-column presentation better.
It was much easier to read the results.
Admin (Administrator, Hero Member, Posts: 664)
« Reply #13 on: January 26, 2010, 12:11:47 »
ferbaena,
Please send these 6 files to our email; we will check them and the algorithm.
P.S. The new version is designed to show multiple duplicates for 1 file (in time we remove unnecessary records, like 1,2 and 2,1, but we do not remove triples like 1,2,3 and 2,3,4).
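The record pruning described in that P.S., dropping a mirrored pair like (2,1) when (1,2) is already listed while keeping distinct triples, can be sketched by canonicalizing each record. This is an illustrative sketch, not Similarity's actual implementation:

```python
def prune_mirrored(records):
    # Keep each unordered record once: (2, 1) collapses into the
    # already-seen (1, 2), while distinct groups such as (1, 2, 3)
    # and (2, 3, 4) both survive because their sorted keys differ.
    seen = set()
    kept = []
    for rec in records:
        key = tuple(sorted(rec))
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept
```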
gbowers (Guest)
« Reply #14 on: February 16, 2010, 18:43:32 »
Quote from Admin (Reply #13 above).
Please indicate whether you have resolved the problem shown above.
Thanks