The first option, "1. Compare by File time and size", is fast, OK?
The second option, "2. Compare by File content", is reliable, right?
Why not have an option that does both?
Something like this: check all files using option 1, then re-check the contents of all files "considered not equal", to make sure they have in fact changed.
This approach might mitigate the performance issue while avoiding false positives (caused by date problems).
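To make the idea concrete, here is a minimal Python sketch of the two-pass check described above. It is only an illustration, not how FreeFileSync works internally; the function names, the result labels ("equal", "time-only", "different") and the 2-second time tolerance are all assumptions made for the example.

```python
import filecmp
from pathlib import Path


def same_time_and_size(a: Path, b: Path, tolerance_s: float = 2.0) -> bool:
    """Pass 1: cheap check on metadata only (tolerance is an assumed value)."""
    sa, sb = a.stat(), b.stat()
    return sa.st_size == sb.st_size and abs(sa.st_mtime - sb.st_mtime) <= tolerance_s


def two_pass_compare(left: Path, right: Path) -> str:
    """Classify a file pair; labels are hypothetical, chosen for this sketch."""
    if same_time_and_size(left, right):
        return "equal"
    # Different sizes can never have identical content, so skip the read.
    if left.stat().st_size != right.stat().st_size:
        return "different"
    # Pass 2: only pairs flagged by pass 1 get a byte-by-byte comparison.
    if filecmp.cmp(left, right, shallow=False):
        return "time-only"  # same bytes, only the modification time moved
    return "different"
```

Pairs reported as "time-only" are the false positives from my scenario: the content is unchanged, so only the timestamp is out of step.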
[FEATURE REQUEST] Comparison improvement
- Posts: 3
- Joined: 28 Jan 2024
- Posts: 2485
- Joined: 22 Aug 2012
Your proposal is remarkable.
> then, ... , re-check all "considered not equal" files contents, assuring that they have in fact changed.
The first Comparison (based on date/size) would already have shown the files are NOT equal,
so why verify that based on Content?
For such files, to become in sync they would need to be updated according to the set sync rules, irrespective of the outcome of the Content comparison.
More commonly, users want to be absolutely sure files are truly identical before deciding NO sync action is required. Thus, they prefer a Comparison on Content, even if file date and size are identical.
However, for locations that are largely in sync, a stepped comparison (first on date/size and, if those are equal, then on Content) would hardly save comparison time with respect to directly comparing by Content.
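A rough way to see why the saving is small is to count how many file pairs still need their content read. The sketch below is a toy model only; the `content_reads` helper and the 990/10 split are made-up numbers for illustration, not measurements.

```python
def content_reads(same_time_size_flags, stepped):
    """Count the file pairs whose content must be read.

    same_time_size_flags: one bool per pair, True if date and size match.
    stepped: if True, only pairs with matching date/size are read;
             if False, every pair is compared by content directly.
    """
    if not stepped:
        return len(same_time_size_flags)
    return sum(1 for flag in same_time_size_flags if flag)


# Hypothetical, largely in-sync location: 990 matching pairs, 10 changed ones.
flags = [True] * 990 + [False] * 10
direct = content_reads(flags, stepped=False)
stepped = content_reads(flags, stepped=True)
saved_fraction = (direct - stepped) / direct
```

In this made-up case the stepped comparison still reads 990 of the 1000 pairs, so it skips only 1% of the content reads.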
- Posts: 3
- Joined: 28 Jan 2024
> The first Comparison (based on date/size) would already have shown the files are NOT equal,
> so why verify that based on Content?
My scenario:
- Windows has a bug that keeps updating the "modify date" on some files (I did a little research and found it happens mostly with .eml files). FFS then syncs those files again and again. To address this issue I might turn off Windows indexing for .eml files. I will try that and see what else I get from here.
Thanks for the response!
https://drive.google.com/file/d/1bTkjl3stSvzlaSeqPjS-MKiTqq4-wp3K/view?usp=sharing
- Posts: 4106
- Joined: 11 Jun 2019
I can see how this would be useful! Of course, the original intention of comparing by content is to catch files that change while keeping the same date/time, and files that suffer corruption without the date/time changing. I think the issue is that the problem you have will only grow and grow. While running a content comparison will be quick for those 15 or so items now, what about 5,000 emails from now? 10,000?
> > The first Comparison (based on date/size) would already have shown the files are NOT equal,
> > so why verify that based on Content?
> My scenario:
> - Windows has a bug that keeps updating the "modify date" on some files (I did a little research and found it happens mostly with .eml files). FFS then syncs those files again and again. To address this issue I might turn off Windows indexing for .eml files. I will try that and see what else I get from here.
> Thanks for the response!
> https://drive.google.com/file/d/1bTkjl3stSvzlaSeqPjS-MKiTqq4-wp3K/view?usp=sharing
> rafaelbg, 28 Jan 2024, 19:25
Seems better to fix the root issue, which is what you said you are looking into
- Posts: 3
- Joined: 28 Jan 2024
Yes, agreed! Thanks again!
- Posts: 3
- Joined: 24 Dec 2021
I wonder if a hash like BLAKE3 is used for content comparison. While still slower than file date and size, BLAKE3 is definitely faster than, for example, the various flavors of SHA.
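For what it's worth, the thread does not say whether FFS uses a hash at all. When both files can be read, content can also be compared directly, chunk by chunk, which stops at the first difference; a digest mainly helps when it can be stored or sent instead of the data. Here is a small Python sketch of both, under stated assumptions: `hashlib.blake2b` stands in for BLAKE3 (which is not in the Python standard library), and the function names and the 1 MiB chunk size are arbitrary choices for the example.

```python
import hashlib
from pathlib import Path

CHUNK = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def file_digest(path: Path) -> str:
    """Hash a file in chunks. BLAKE2b is a stand-in here, not BLAKE3."""
    h = hashlib.blake2b()
    with path.open("rb") as f:
        while chunk := f.read(CHUNK):
            h.update(chunk)
    return h.hexdigest()


def same_content_direct(a: Path, b: Path) -> bool:
    """Compare two files byte by byte, without any hash."""
    if a.stat().st_size != b.stat().st_size:
        return False
    with a.open("rb") as fa, b.open("rb") as fb:
        while True:
            ca, cb = fa.read(CHUNK), fb.read(CHUNK)
            if ca != cb:
                return False  # stop at the first differing chunk
            if not ca:
                return True  # both files ended without a difference
```

Either way every byte of both files has to be read when they are identical, so disk or network speed, rather than the hash function, may well be the limiting factor.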