Hack FFS to detect moved files at first sync / How does move detection work? [answered]

Posts: 8 · Konzertheld 14 Apr 2020, 03:50

Hey there,

a topic that has come up in several forums, also here, still has no satisfying answer for me. Here's my use case:

I have a folder with a lot of big files. Ages ago I manually copied that folder to a PC in a different building to have a backup. Since then, I have renamed and moved some of the files. I would now like to reflect those changes on the remote PC without copying or deleting.

I know FFS can detect moved files in the second run. I also know that hashing files is slow and not even 100% reliable so apparently nobody uses it to detect moved files. So here are my thoughts and questions.

How does the FFS move detection work exactly? I know it creates a database, but I can't read it as it is binary. What is in that database and how is it used? And can I somehow trick FFS into creating a database for the origin folder as if it once looked like the remote folder - as if I had used FFS for the first copy? Because I think that is what it would need. FFS does not know my origin folder once actually looked like the remote folder so now it can't tell what changed. Is there no way to get this done? I tried hard links, but those do not work across anything and also even locally FFS did not seem impressed.

If any of this has already been answered, feel free to point me there; I haven't found it yet.

Thank you!

Posts: 2456 · Plerry 14 Apr 2020, 11:09

I am not the developer, but will give it a shot.
From the FFS manual page describing Detect Moved Files:

To make this work, FreeFileSync requires database files ("sync.ffs_db") to compare the current file system state against the time of the last synchronization.

This is based on the file-ID (if supported by the file-system and OS), which should not change when renaming or moving a file within the local file-system.
Each side involved in the sync must have its own (local) database file recording the local state as of the end of the previous sync, so that it can be compared with the local state at the time of the next compare/sync. If there was no previous sync, there is no such database.
Because file-IDs are not preserved across e.g. network locations, file-IDs (and thus the FFS database) can only be used to detect moves or renames locally, not between different (network) locations.
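
To illustrate the idea described above, here is a minimal Python sketch of file-ID based move detection. It is not FreeFileSync's actual implementation or database format. The function names (`snapshot`, `detect_moves`, `demo`) are hypothetical, and it assumes the OS and file system report a stable file ID through `os.stat()` (`st_dev`/`st_ino`). The first snapshot plays the role of the database saved at the end of the previous sync. The second is the state seen at the next compare.

```python
import os
import tempfile

def snapshot(root):
    """Map (device, file ID) -> path relative to root for every file under root."""
    snap = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            snap[(st.st_dev, st.st_ino)] = os.path.relpath(full, root)
    return snap

def detect_moves(old, new):
    """Return (old_path, new_path) pairs whose file ID is unchanged but whose path differs."""
    return sorted((old[k], new[k]) for k in old.keys() & new.keys() if old[k] != new[k])

def demo():
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        with open(os.path.join(root, "a.bin"), "wb") as f:
            f.write(b"payload")
        before = snapshot(root)   # stands in for the saved database
        os.rename(os.path.join(root, "a.bin"), os.path.join(root, "sub", "b.bin"))
        after = snapshot(root)    # state at the next compare
        return detect_moves(before, after)

if __name__ == "__main__":
    print(demo())
```

Without the `before` snapshot there is nothing to compare against, which is why a move cannot be recognized on a first sync. A file ID recorded on one machine also says nothing about the file IDs on another.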

Posts: 8 · Konzertheld 16 Apr 2020, 00:14

Thank you! Aaah okay it uses the filesystem's file IDs... that is actually already enough to answer my post and all the questions in my head. Where do you know that from? Is that written somewhere and I overlooked it?

Posts: 12 · florentine 16 Apr 2020, 06:49

"I also know that hashing files is... not even 100% reliable" (Konzertheld, 14 Apr 2020, 03:50)

wait, what? could you elaborate on that?

although i don't think FFS uses hashing anyway

Posts: 8 · Konzertheld 16 Apr 2020, 06:52

"I also know that hashing files is... not even 100% reliable" (Konzertheld, 14 Apr 2020, 03:50)
"wait, what? could you elaborate on that?" (florentine, 16 Apr 2020, 06:49)

I simply meant that hashes are not 100% duplicate-free, just 99%. So if you have a lot of files, you might end up with a hash accidentally created twice. With md5 for example "a lot of files" actually is not that big a number, like, tens of thousands, if I remember correctly.
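
For a rough sense of scale, the chance of an accidental collision can be estimated with the standard birthday-bound approximation. The sketch below assumes the hash behaves like a uniformly random function, which covers accidental collisions only and not deliberately crafted ones (MD5 is known to be broken against those). The function names are made up for illustration.

```python
import math

def collision_probability(n_files, hash_bits):
    """Birthday-bound approximation: p ~ 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = -n_files * (n_files - 1) / 2 ** (hash_bits + 1)
    return -math.expm1(exponent)  # expm1 keeps precision for tiny probabilities

def files_for_probability(p, hash_bits):
    """Approximate number of files needed to reach collision probability p."""
    return math.sqrt(2 * 2 ** hash_bits * math.log(1 / (1 - p)))

if __name__ == "__main__":
    print(collision_probability(50_000, 128))   # 128-bit hash such as MD5
    print(files_for_probability(0.5, 128))
```

Under that assumption, 50,000 files with a 128-bit hash give a collision probability of roughly 4e-30, and a 50% chance needs on the order of 2e19 files.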

Posts: 2456 · Plerry 16 Apr 2020, 10:42

"... Aaah okay it uses the filesystem's file IDs... Where do you know that from? Is that written somewhere and I overlooked it?" (Konzertheld, 16 Apr 2020, 00:14)

Straight from the horse's mouth: viewtopic.php?t=5935&p=19556#p19556

Posts: 8 · Konzertheld 16 Apr 2020, 21:20

Thanks Plerry!