Hi,
When comparing file contents, would it be possible to have a configuration option to compare only the first x bytes of each file (with x set manually)? I am currently using ViceVersa to back up & sync my music collection, which consists of hundreds of thousands of FLAC files, and the differences are always in the headers (the files differ when I update some tags, but the rest of the data stream is always identical). With ViceVersa I can compare only the headers, but ViceVersa is quite buggy and unstable... FreeFileSync would be a better alternative for me, but the lack of this option makes it many times slower than ViceVersa (my music collection is more than 10 TB in size).
Thanks!
Comparing only parts of files
- Posts: 5
- Joined: 27 Oct 2021
- Posts: 2451
- Joined: 22 Aug 2012
> files are different when I update some TAGs but the rest of the data stream is always identical
I suppose changing the tags in the header will also update the modified-date of your files.
So, why compare based on file content and not on file date (and size)?
That should make your comparison much faster.
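As a rough illustration of the date-and-size comparison suggested above, here is a minimal Python sketch. The function name `same_by_time_and_size` and the 2-second tolerance are assumptions made for this example (FAT file systems store modification times with 2-second resolution); it is not a description of how FreeFileSync works internally.

```python
import os


def same_by_time_and_size(src: str, dst: str, tolerance_s: float = 2.0) -> bool:
    """Treat two files as equal when their sizes match and their
    modification times differ by at most tolerance_s seconds.

    File contents are never read, which is why this check is fast.
    """
    a = os.stat(src)
    b = os.stat(dst)
    return a.st_size == b.st_size and abs(a.st_mtime - b.st_mtime) <= tolerance_s
```

The trade-off is that an edit which preserves both the size and the timestamp of a file goes unnoticed by this check.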
- Posts: 1038
- Joined: 8 May 2006
(There are duplicate file finders out there, & probably more so for audio than for video, that can choose to ignore tags, or to compare only tags. Something like that might be more appropriate?
AllDup can "Ignore the meta data of FLAC files".)
> I suppose changing the tags in the header will also update the modified-date of your files.
Some tagging programs have the option of maintaining the original file date/time.
- Posts: 5
- Joined: 27 Oct 2021
I always preserve the original file time when mass-updating tens or hundreds of files... otherwise this would trigger a massive rescan of my music library, which would take many hours, and also a resync of my backups... If I keep the original times unchanged, I can rescan my library and trigger backups incrementally at more convenient times. But of course then I have to use a comparison by file contents. I am currently using ViceVersa to compare only the first 32K of the files, which is very fast, but ViceVersa is very slow at copying files and very buggy too, it keeps crashing randomly... so I was hoping to find more reliable software, but unfortunately ViceVersa seems to be the only backup software on Earth that allows comparing only the headers...
FreeFileSync is much faster at copying files, but much slower at comparing file contents, as it always reads the full files, which takes ages...
Also, FreeFileSync often fills up my backup drives by saving all updated files to the RecycleBin tmp folder, which needlessly takes up TBs of space, and I need to clean it up manually from time to time... This means I can't leave FreeFileSync to back up my files at night, because it would get stuck after a while and I would need to take manual action in the morning to unblock it...
- Posts: 5
- Joined: 27 Oct 2021
> (There are duplicate file finders out there, & probably more so for audio than for video, that can choose to ignore tags, or to compare only tags. Something like that might be more appropriate?
> AllDup can "Ignore the meta data of FLAC files".)
> therube, 28 Oct 2021, 16:02
But I don't want to ignore metadata... on the contrary, I want to back up my files even if only the metadata has changed... of course the data stream stays the same when I edit tags...
- Posts: 1038
- Joined: 8 May 2006
Duplicate Cleaner 5 (pay version) can compare tags (only).
(The dropdown also has "Ignore content (match by tags or attributes)", & you're able to choose the particular tags you wish to compare.)
(I know these features exist, but haven't messed with them, cause I'm not really concerned about the particular feature.)
Oh, & different file formats, & different tag versions, can store their data at different places in a file.
With ID3, for example, an ID3v1 tag is stored at the end of the file, & an ID3v2 tag is stored at (towards) the beginning.
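To make the point about tag placement concrete for FLAC: in a native FLAC stream the metadata blocks (including the Vorbis comment block that holds the tags) sit at the start of the file, after a 4-byte `fLaC` marker. Each block begins with a 4-byte header: one byte carrying a "last block" flag in its top bit and the block type in the low 7 bits, followed by a 3-byte big-endian length. The Python sketch below walks those headers to find where the audio frames begin. The function name `flac_metadata_length` is made up for this example, and the sketch assumes a plain FLAC file with no ID3v2 tag prepended.

```python
def flac_metadata_length(path: str) -> int:
    """Return the byte offset at which audio frames start in a native FLAC file."""
    with open(path, "rb") as f:
        if f.read(4) != b"fLaC":
            raise ValueError("not a native FLAC stream (missing 'fLaC' marker)")
        offset = 4
        while True:
            header = f.read(4)
            if len(header) < 4:
                raise ValueError("truncated metadata block header")
            is_last = bool(header[0] & 0x80)                 # top bit: last-block flag
            block_len = int.from_bytes(header[1:4], "big")   # 24-bit data length
            offset += 4 + block_len
            if is_last:
                return offset
            f.seek(block_len, 1)                             # skip this block's data
```

Embedded cover art and padding blocks count as metadata too, so the returned offset can be much larger than a few KB. Any fixed byte limit for a partial comparison would need to be checked against the actual files.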
- Posts: 5
- Joined: 27 Oct 2021
But Duplicate Cleaner doesn't seem to be a backup tool... I want to back up all files for which TAGs are different, not find duplicates... in the description it doesn't say anything about backing up the files from one drive to another...
- Posts: 1038
- Joined: 8 May 2006
No, they are not backup tools.
But maybe they could be used in conjunction... (or maybe not).
---
Maybe you could automate something?
(As it is, FFS is not designed to do what you're looking for.)
Use the (UNIX-like) cmp command.
So if you know the tag has to be stored in the first 4096 bytes (I have no clue), you could do something like this (a rough batch sketch):
cmp.exe -n 4096 --quiet michaelgarrison_source.flac michaelgarrison_backup.flac
if errorlevel 1 echo michaelgarrison_source.flac>> i_need_to_back_this_up.TXT
You would run that in a loop, for each of your .flac file pairs,
& for any that fail the compare (of their first 4096 bytes),
their names are written to a file, that can then be parsed,
to copy from source to backup.
DiffUtils for Windows has cmp.exe.
(You'd need both the Binaries & Dependencies, ZIP [or the .exe installer].)
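The loop described above could also be sketched in Python without an external cmp.exe. This is only an illustration under stated assumptions: the names `prefixes_differ` and `files_needing_backup` are made up for this example, the 4096-byte limit is the same guess as above, and the backup is assumed to mirror the source folder layout. Unlike the cmp call, the sketch also treats a size mismatch as a difference, since that check costs nothing.

```python
import os


def prefixes_differ(src: str, dst: str, limit: int = 4096) -> bool:
    """True if the files differ in size or in their first `limit` bytes."""
    if os.path.getsize(src) != os.path.getsize(dst):
        return True
    with open(src, "rb") as a, open(dst, "rb") as b:
        return a.read(limit) != b.read(limit)


def files_needing_backup(src_root: str, dst_root: str,
                         limit: int = 4096, suffix: str = ".flac") -> list:
    """List source files that are missing from the backup or whose
    leading bytes differ from the backup copy."""
    pending = []
    for folder, _dirs, names in os.walk(src_root):
        for name in names:
            if not name.lower().endswith(suffix):
                continue
            src = os.path.join(folder, name)
            # Assumed layout: same relative path under the backup root.
            dst = os.path.join(dst_root, os.path.relpath(src, src_root))
            if not os.path.exists(dst) or prefixes_differ(src, dst, limit):
                pending.append(src)
    return sorted(pending)
```

The returned list plays the role of i_need_to_back_this_up.TXT: it can be written to a file or passed to a copy step. A difference located beyond the first `limit` bytes of two equally sized files is not detected, which is exactly the trade-off being asked for in this thread.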
- Posts: 5
- Joined: 27 Oct 2021
Thanks for the suggestion... yes, I could go for the custom script approach, and it seems to be the only way forward. I never imagined it would be so hard to find backup software that allows partial comparisons... For me this could be a nice performance enhancement, because if the files differ in the first KBs, then they are different and the comparison can stop there... Of course this would require users to know their files well before enabling it, but it could even be a paid option.
I myself would gladly pay to have something like that, as it would be a great time saver for me... and if ViceVersa were more stable, I would stick with it...