I just noticed that under some circumstances files that have hardlinks and get moved slip through FreeFileSync's 'moved file' detection. Instead of being moved in the target directory, they get deleted and copied again.
This happened on a large sync job and I didn't manage to reproduce this in a small example.
If this matters, they happened to be hardlinked files on both sides. I haven't synced this job yet, so the situation is still present. If you need to know more, please tell me.
Moved Hardlinks sometimes slip through FFS's automatic move detection.
- Posts: 14
- Joined: 14 Sep 2012
- Site Admin
- Posts: 7211
- Joined: 9 Dec 2007
The move detection in FFS is a little conservative when it finds duplicate file ids (e.g. followed symlink, or hard link) on one side and only detects a single move when it could detect two moves. Maybe it's related to such checks, but I really need a test case in order to analyze if there is something to improve.
- Posts: 14
- Joined: 14 Sep 2012
Do you have any idea what I could do to analyze the present situation to help with a test case?
- Site Admin
- Posts: 7211
- Joined: 9 Dec 2007
PS: I just verified that in the most recent versions of FFS, duplicate file ids on one side do not participate in move detection at all. The reasoning is that "move" should be an optimization of "copy + delete" with the same observable effect.
But if there is more than one item with the same file id (and size and date), one of them could be a regular file while the other is just an alias, e.g. a followed symlink. In that case FFS decides not to move, but to "copy + delete", to be on the safe side.
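To make the rule above concrete, here is a minimal Python sketch of such a conservative check. It is not FFS's actual code: the `Item` type, the `detect_moves` function, and the idea that the file id, size and date on both sides can be compared directly through one key are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical record for a file that exists on one side only."""
    path: str
    file_id: int  # e.g. an inode number; assumed comparable across both lists
    size: int
    mtime: int


def group_by_key(items):
    """Group items by (file id, size, date)."""
    groups = defaultdict(list)
    for item in items:
        groups[(item.file_id, item.size, item.mtime)].append(item)
    return groups


def detect_moves(to_delete, to_create):
    """Return (old_path, new_path) pairs, skipping every ambiguous key.

    A key that occurs more than once among the to-be-deleted or the
    to-be-created items is left alone, so those files fall back to
    "copy + delete".
    """
    old = group_by_key(to_delete)
    new = group_by_key(to_create)
    moves = []
    for key, sources in old.items():
        targets = new.get(key, [])
        if len(sources) == 1 and len(targets) == 1:
            moves.append((sources[0].path, targets[0].path))
    return moves


# One moved file: unambiguous, so it is detected as a move.
one = detect_moves([Item("a.txt", 7, 100, 1)],
                   [Item("sub/a.txt", 7, 100, 1)])

# Two moved files sharing one file id (hard links): no move is detected.
two = detect_moves([Item("a.txt", 7, 100, 1), Item("b.txt", 7, 100, 1)],
                   [Item("sub/a.txt", 7, 100, 1), Item("sub/b.txt", 7, 100, 1)])
```

In this sketch only items that exist on one side take part, so a hard link that is still in sync never makes a key ambiguous.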
- Posts: 14
- Joined: 14 Sep 2012
Okay, that is not what I experienced, at least not with 5.18.
Situation:
Folder 1: two files (hardlinked to each other)
Folder 2: same two files (doesn't matter if they're hardlinked to each other, too)
If I move one of the two hardlinked files, move detection works. If I move both of the files, they get copied & deleted.
Thanks for making such an awesome piece of "everyone should use it" :)
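For anyone who wants to rebuild this situation as a small test case, the following Python sketch creates two hardlinked files in a temporary folder, moves both, and checks that they still share one file id. The file names and the `moved` subfolder are made up for illustration, and the script only prepares the files; it does not run FFS.

```python
import os
import tempfile


def make_hardlinked_pair(folder):
    """Create a.txt and a hard link b.txt pointing at the same data."""
    first = os.path.join(folder, "a.txt")
    second = os.path.join(folder, "b.txt")
    with open(first, "w") as handle:
        handle.write("payload")
    os.link(first, second)  # second name for the same file
    return first, second


def same_file_id(path_a, path_b):
    """True if both paths refer to the same file on the same volume."""
    stat_a, stat_b = os.stat(path_a), os.stat(path_b)
    return (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)


with tempfile.TemporaryDirectory() as root:
    a, b = make_hardlinked_pair(root)

    # Move both hardlinked files into a subfolder, as in the report.
    moved = os.path.join(root, "moved")
    os.mkdir(moved)
    new_a = os.path.join(moved, "a.txt")
    new_b = os.path.join(moved, "b.txt")
    os.rename(a, new_a)
    os.rename(b, new_b)

    # A rename keeps the file id, so both names still share one id.
    still_linked = same_file_id(new_a, new_b)
    link_count = os.stat(new_a).st_nlink
```

Pointing the script at a real folder instead of a temporary one, syncing it once, and then moving both files before the next comparison should reproduce the "two moved files with one file id" case.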
- Site Admin
- Posts: 7211
- Joined: 9 Dec 2007
> If I move one of the two hardlinked files, move detection works. If I move both of the files, they get copied & deleted.
Yes, this is exactly the case I described where FFS deliberately does not "move". When I say "no duplicate file ids" I mean "no duplicate file ids for files that exist on one side only". So when you move only one file in your example there is no duplicate id, since FFS does not consider the other items that are still in sync.
I think this is a good decision in general: Imagine the following scenario where FFS would pick an arbitrary item for move instead:
Starting with both sides (and all four files' content) being equal:
Left:
--------------
file.txt ->a regular file
fileX.txt ->a regular file
Right:
---------------
file.txt ->a regular file
fileX.txt ->a symlink to file.txt on right
Then on the left side the user deletes "file.txt" and renames "fileX.txt" to "fileY.txt". When comparing with FFS, the following is to be done on the right side:
- file.txt and fileX.txt should be deleted and
- fileY.txt should be created
Move detection kicks in and finds that both of the to-be-deleted files match the to-be-created file (same id, size, date).
FFS decides to rename "fileX.txt" to "fileY.txt" and to delete "file.txt".
After sync both sides look like:
Left:
fileY.txt ->a regular file
Right:
fileY.txt ->a symlink to file.txt
Oops, let's hope the right-hand side is not your backup drive, since there is no data anymore, just a dangling symlink!
This example is just to illustrate that trying to manage file moves when there are duplicate file ids is tricky business. This does not mean there are no ways for optimization. E.g. in the example above where FFS has the chance to decide which of the two candidates it uses for move source, it could check if one is a symlink and prefer the regular file. But duplicate file ids are not a very common scenario, so it's better to start with a safe default logic and only improve if all corner cases can be reliably dismissed.
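The possible optimization mentioned above, preferring the regular file as move source, could look roughly like the following Python sketch. It is only an illustration of the idea, not something FFS does: the `Candidate` type and `pick_move_source` are hypothetical names, and the sketch still falls back to "copy + delete" whenever the choice is not clear-cut.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical to-be-deleted item that matches the to-be-created file."""
    path: str
    is_symlink: bool


def pick_move_source(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Pick the single regular file among the candidates, if there is one.

    Returns None when no choice is safe (no regular file, or more than one),
    which means "do not move, copy + delete instead".
    """
    regular = [c for c in candidates if not c.is_symlink]
    if len(regular) == 1:
        return regular[0]
    return None


# The right side from the example: a regular file and a symlink to it.
chosen = pick_move_source([
    Candidate("file.txt", is_symlink=False),
    Candidate("fileX.txt", is_symlink=True),
])

# Two regular files with the same id (hard links): still ambiguous.
ambiguous = pick_move_source([
    Candidate("a.txt", is_symlink=False),
    Candidate("b.txt", is_symlink=False),
])
```

With this choice "file.txt" would be renamed to "fileY.txt" and the symlink deleted, so the data survives on the right side.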
- Posts: 14
- Joined: 14 Sep 2012
Thank you for this explanation. I agree, symlinks can produce complex situations. The issue above already arises if the user simply decides to delete file.txt, and you can only let that happen. You never know for sure which file is the target of a symlink, especially if the symlink is outside the sync directory. So I'm not sure if FFS should consider the starting point as synchronized at all. But at the moment I can't think of similar problems caused by the usage of hardlinks. So please take my vote for an improvement in this area on the wishlist ;)