I bought Similarity Premium about a year ago. I am currently running version 1.7.1.
Over the years I have been digitizing my entire LP and CD collection. In some cases I have not only the original vinyl LP but also the first CD pressing and subsequent remasters. I am at the point where I would like to review similar albums, select the best audio version, and make only that one available for a jukebox-type application. Listening to each album and making a subjective analysis and selection is not possible. I purchased Similarity to give me a tool that would make a technical analysis of each track, provide an overall evaluation, and give me some guidance. In the end, before I remove the technically inferior versions, I listen to each track to confirm that the technical evaluation is in line with my subjective evaluation. Occasionally, a technically inferior version (e.g. excessive clipping, lower high-frequency range) may in fact be the better-sounding one.
For my purposes, looking for duplicate tracks or comparing individual tracks to cherry-pick the best is not a requirement. Cobbling together a single album by selecting the technically best tracks from several versions results in an uneven and confusing listening experience. Neither is processing or comparing pictures a requirement.
My music collection is organized by Artist/ReleaseYear-Album Title-ReissueYear/Track. So I simply want Similarity to process each track sequentially, analyze it, and log the result in alphabetic order (the same order it was processed in). This would result in each similar album being logged next to its predecessor, making visual comparison easy. Subsequent analysis using column sorting to identify particularly bad tracks, or some sort of search and filter function, would be useful.
Up to now I have been using Similarity to analyze the occasional album. I have been trying to learn how to interpret the results. I have only recently realized the relationship between the dynamic range analysis provided to me by a Foobar add-on and the Similarity Max (abs) field. I have reviewed the documentation for the Mean and Abs fields several times but still can't relate them to the listening experience.
I now want to use Similarity to process the complete collection to identify the best album versions. I am running the program on a backup computer in my home. It is an 'older' computer (older than my main computer but still sufficiently fast to provide 24/7 server functionality) that I have highly optimized for Similarity processing, but it still takes about two days to process 30,000 files. The time is not a big issue. I turn Similarity on, come back in two days, and all would be good, except when an instability occurs in the program or computer and I have to restart Similarity. I don't want to spend another two days going through the whole analysis again.
I have reviewed the forum for any similar experiences and came across a discussion here
http://www.similarityapp.com/forum/index.php?topic=841.0
If I understand the Admin response, Similarity analyses each file, puts results in memory, and does comparisons on the file. This naturally results in memory limitations, slowing down the process as more and more files are added, and if the results are not written to disk, all will be lost if the program fails. The point was that Similarity was not designed to process large collections.
The Admin also suggests that somehow, not all is lost, because when Similarity restarts it does not reanalyse the files in the cache.
I apologize for the long prologue to my questions, but I want to make sure you understand where I am coming from.
1. While Similarity is analysing, the status bar shows a total file count and a currently processed file count. Is this a count of audio files only, or does it also include images being processed?
2. The initial Similarity scan failed after processing about 15,000 files (the computer hit a blue screen of death, and Similarity may not have been the cause). When I restarted Similarity, I saw no update of the cache counter. The second run, which completed successfully, appeared to be a little faster, but that was probably because I raised the processing priority of Similarity and not due to any previously stored analysis results. I now have 30,000 files analysed, but I am afraid to turn off Similarity because I don't know how to get the files into the cache. I don't want to have to rerun the complete analysis every time I start up Similarity. So how do I get the analysis stored for future viewing?
3. I selected 30,000 files as an initial run, given the processing time constraints. I will now go in and remove bad albums, fix some file names and tags, and do other file editing. I then want to add a new batch of albums and do a similar run against them. I want Similarity to review the already scanned files for changes, remove deleted files from its database, re-analyse any that have changed, ignore those that have not changed but leave them in the analysis result lists, and then proceed to analyse the newly added files and add all changes and additions to the cache for subsequent runs. After several runs, I expect to have a viewable log of analyses for all files in the collection. It is not clear from the response to the above forum entry that Similarity can do this. Can it?
4. The Admin response to the forum question suggested that the processing limitations of Similarity are all due to all analysis results having to be maintained in memory, which slows down and destabilizes the program. If that is the case, as a Similarity customer I would prefer being able to choose which files to include in the analysis (audio only, images only, or both) and whether or not a comparison should take place in real time. In my case I would turn off image processing and real-time comparison. This would result in only a few analysis results being in memory at a time (just enough to optimize disk writes) and in faster processing, because all the program has to do is sequentially read the file list, determine changes, deletions, and additions, process as required, and write the results to disk. There is no need to do comparative analysis in this run. If the results are written to disk shortly after the analysis is performed, and the database is checkpointed properly, there would be no risk of losing the results of a long run. Am I misinterpreting the Admin's response? The suggestion seems to be that I can't tell Similarity not to keep analysis results in memory and not to do a comparison during the analysis.
If a time saving is possible by turning off image processing and real-time comparison, I would offset the saving by adding the option to do a spectrum and sonogram analysis during the analysis scan and adding that to the collection database. The time and space to do this may not be acceptable to all Similarity users, so they should be selectable options.
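To make questions 3 and 4 more concrete, here is a rough Python sketch of the kind of incremental, checkpointed scan I have in mind. I do not know how Similarity works internally, so none of this is its actual design: the `results` table, the `analyze` callback, the list of audio extensions, and the batch size are all made up for illustration. The sketch walks the collection in alphabetic order, drops cache entries for deleted files, skips files whose size and modification time are unchanged, analyses only new or changed files, and commits to disk every few files so that a crash loses at most one small batch.

```python
import os
import sqlite3

# Assumed set of audio extensions; anything else (e.g. cover images) is skipped.
AUDIO_EXTS = {".flac", ".mp3", ".wav"}

def open_cache(db_path):
    """Open (or create) the on-disk result cache. The table layout is hypothetical."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, result TEXT)"
    )
    con.commit()
    return con

def list_audio_files(root):
    """Return all audio files under root, sorted alphabetically by full path."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if os.path.splitext(name)[1].lower() in AUDIO_EXTS:
                found.append(os.path.join(dirpath, name))
    return sorted(found)

def incremental_scan(root, con, analyze, batch_size=50):
    """Analyse only new or changed files; no track-to-track comparison is done."""
    on_disk = list_audio_files(root)
    cached = {
        path: (size, mtime_ns)
        for path, size, mtime_ns in con.execute(
            "SELECT path, size, mtime_ns FROM results"
        )
    }

    # Remove cache entries for files that no longer exist.
    gone = set(cached) - set(on_disk)
    con.executemany("DELETE FROM results WHERE path = ?", [(p,) for p in gone])
    con.commit()

    analyzed = 0
    for path in on_disk:
        st = os.stat(path)
        signature = (st.st_size, st.st_mtime_ns)
        if cached.get(path) == signature:
            continue  # unchanged: keep the stored result as-is
        result = analyze(path)  # hypothetical per-track analysis, returns text
        con.execute(
            "INSERT OR REPLACE INTO results VALUES (?, ?, ?, ?)",
            (path, signature[0], signature[1], result),
        )
        analyzed += 1
        if analyzed % batch_size == 0:
            con.commit()  # checkpoint: a crash loses at most one batch
    con.commit()
    return {
        "removed": len(gone),
        "analyzed": analyzed,
        "unchanged": len(on_disk) - analyzed,
    }
```

With something like this, a restart after a blue screen would only redo the files that had not yet been committed, and a later run over an edited collection would only touch the files that were added or changed.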