Idea: synchronization based on pre-generated checksum files

Discuss new features and functions
raven
Posts: 2
Joined: 4 Feb 2016

Post by raven

That is, I would specify in the program a pre-generated checksum file for the Destination or Source, and based on it the program would determine what to copy.
Why this is important: at this stage, when synchronizing over a network (FTP or Samba), you effectively have to transfer all the content over the network just to compare it.
With a low data transfer speed, this takes too much time.
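To illustrate what I mean, here is a minimal Python sketch (hypothetical helper names, nothing FFS actually does today): build a checksum manifest for each side once, then decide what to copy by comparing only the small manifests instead of the full file contents.

```python
import hashlib
import os

def build_manifest(base_dir):
    """Walk base_dir and map each relative file path to its SHA-256 digest."""
    manifest = {}
    for root, _dirs, files in os.walk(base_dir):
        for name in files:
            path = os.path.join(root, name)
            rel = os.path.relpath(path, base_dir)
            h = hashlib.sha256()
            with open(path, "rb") as f:
                # Hash in chunks so large files do not need to fit in memory.
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            manifest[rel] = h.hexdigest()
    return manifest

def files_to_copy(source_manifest, dest_manifest):
    """Paths that are new or whose checksum differs between the two sides."""
    return sorted(
        rel for rel, digest in source_manifest.items()
        if dest_manifest.get(rel) != digest
    )
```

The point of the proposal is that `build_manifest` would run where the data lives, so only the manifests travel over the slow link.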
Plerry
Posts: 1031
Joined: 22 Aug 2012

Post by Plerry

This has been suggested and discussed before.

An FFS sync runs only on one machine, e.g. the machine of your left base location.
In said case, determining the checksums of files in the left base location (and subtree) could be quick.
However, to determine the checksums of the files (and subtree) in the right, remote base location, all data still needs to travel to the local machine at your left base location, as that is where FFS is running.
Then you might just as well directly compare the (full) remote data with the local data, rather than first calculate the checksums of the local and remote data and then compare the checksums ...
If each base location had a local machine determining the checksums, that would be another story, but it would require software to run at all locations involved in the sync, not just one.
Perhaps OK if you "control" all locations involved.
But the advantage of FFS is that it only needs to run in one location, while allowing you to run syncs to, from and between locations where you have "just" file access.

Having pre-stored checksum files is generally a bad idea: it only tells you the checksum at the time it was calculated. Even if the checksum were based on first writing and then reading back the just-written data (something that is hard to guarantee, as it might be read from cache rather than from disk), the stored data might have become corrupted since the checksum was calculated and stored.
Fossie
Posts: 2
Joined: 3 Aug 2020

Post by Fossie

I'd like to add a vote for this to be implemented.

It would be a great way to allow for data scrubbing. Every time a file is updated, recalculate its checksum. Schedule periodic verification of the checksums. If a mismatch is found, we know the file has been corrupted, and we should be given the option to replace it with the copy in the other location (after verifying that that copy is good).
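As a rough illustration of the scrubbing idea (a hypothetical Python sketch, not an FFS feature; the manifest format is just a JSON map from relative path to hex digest that I made up for this example):

```python
import hashlib
import json
import os

def verify_manifest(base_dir, manifest_path):
    """Recompute each file's SHA-256 and compare against a stored manifest.

    Returns the relative paths whose current checksum differs from the
    stored one (possible silent corruption), including files that have
    gone missing since the manifest was written.
    """
    with open(manifest_path) as f:
        stored = json.load(f)  # {relative_path: hex_digest}
    bad = []
    for rel, expected in stored.items():
        path = os.path.join(base_dir, rel)
        if not os.path.exists(path):
            bad.append(rel)
            continue
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        if h.hexdigest() != expected:
            bad.append(rel)
    return bad
```

A scheduled task could run this on each side and flag anything it returns for repair from the other location.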

I do control both environments, and I expect most people do; if not, they are probably using a cloud service, which pretty much makes FreeFileSync obsolete as a backup storage solution.
Plerry
Posts: 1031
Joined: 22 Aug 2012

Post by Plerry

Your vote is for using checksums for validation, which is something different from using checksums to determine which files need to be copied, as was proposed by TS raven.
In either case, the first part of my earlier reply still holds.

As for your (Fossie's) proposal: as FFS only runs in one location, having FFS compare the checksums to the stored data requires all data to be transferred to the machine running FFS. Then FFS might just as well simply compare the data of the left and right locations directly rather than calculating the checksums.

Obviously, nothing prevents you from using a dedicated data verification tool based on checksums, running at either of the sync locations (if you have control over both ends).
But (at least in my view) it does not make sense to include this functionality in FFS, just like e.g. the frequently suggested duplicate-finder function.
Fossie
Posts: 2
Joined: 3 Aug 2020

Post by Fossie

Thanks for your reply, Plerry.

The way I see it, adding validation functionality would only improve the quality of what FFS is doing. Rather than just copying files and making sure they appear updated in two locations, one could argue that the user's primary concern and interest is in the contents of the files. This requires some level of validation/scrubbing.

If FFS is not the way to go to this end, do you know of any data verification tools that might be suitable? Googling only seems to lead me to database-cleaning products :/

Best regards
Plerry
Posts: 1031
Joined: 22 Aug 2012

Post by Plerry

It obviously differs per platform, but for Windows see e.g. here.