Problem using the Mirror function

Posts: 9 · Leah 8 May 2021, 15:54

Hi! I am using FFS to back up a hard drive using the mirror option (the comparison option is set to file time and size). The source disk has 927.2 GB of data (176,892 files) but the backup disk has 934.5 GB of data (177,267 files). Thus, there is a 7.3 GB difference (375 files) between the two disks. What really bothers me is that there are more files on the backup disk than on the source disk; if anything, I would have expected it to be the other way around. When I do a folder-by-folder comparison, FFS indicates that all the folders are identical. The same thing happens if I do a complete disk-to-disk comparison. I have used other programs to check the size of each of the main folders in the source and destination drives and the sizes are exactly the same. If all the folders have the same size, why is there such a big difference in the size of the data and in the number of files between the two disks? Should I be running the comparison using other parameters? A list of the files excluded from comparison using the filter option is included below. Any help will be greatly appreciated!

P.S. Both disks are formatted in NTFS and I am running FFS on Mac OS X version 10.10.5 (Yosemite). I am using FFS version 11.0 because I noticed that later versions fail to compare files and folders with Spanish characters such as ñ, á, é, í, ó, ú etc., which are present in many of my files and folders.

Again, thanks a lot for your help.

Files excluded from comparison:

/.fseventsd/
/.Spotlight-V100/
/.Trashes/
/Thumbs.db
*/.DS_Store
*/._.*
*/._*

Posts: 4065 · xCSxXenon 8 May 2021, 18:10

Those excluded locations could easily be the reason the space usage is different. There is also the possibility that the two locations have different cluster sizes, which means the data size is the same, but the size on disk can vary.
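To illustrate the cluster-size point, here is a minimal sketch in Python. The file sizes and the 64 KB cluster size are made-up values for illustration only, and the model ignores filesystem metadata and any special handling of very small files.

```python
def size_on_disk(file_size: int, cluster_size: int) -> int:
    """Bytes allocated for a file when space is handed out in whole clusters."""
    # Round up to the next multiple of cluster_size using integer arithmetic.
    return -(-file_size // cluster_size) * cluster_size

# Hypothetical file sizes in bytes (not taken from the disks in this thread).
file_sizes = [100, 4096, 5000, 70_000]

data_size = sum(file_sizes)
on_disk_4k = sum(size_on_disk(s, 4096) for s in file_sizes)
on_disk_64k = sum(size_on_disk(s, 65536) for s in file_sizes)

print(f"data size:        {data_size} bytes")
print(f"on disk (4 KB):   {on_disk_4k} bytes")
print(f"on disk (64 KB):  {on_disk_64k} bytes")
```

With these hypothetical numbers the same 79,196 bytes of data occupy 90,112 bytes with 4 KB clusters and 327,680 bytes with 64 KB clusters, so two disks holding identical files can report different usage.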

Posts: 9 · Leah 9 May 2021, 19:31

Thank you very much for your reply. I followed your advice and I ran FFS again without any excluded locations, but except for a couple of files, FFS indicated that all the folders are identical, and the size difference between both disks remains (over 7 GB). I also looked at the disks and both of them have the same device block size (512 bytes) and the same allocation block size (4096 bytes). I don't know if these parameters are the cluster size that you are referring to (sorry, bit of a newbie here) but it was all the information I could get in MacOS. Is there anything else I could try? Also, should I report the problem I found when comparing folders with Spanish characters, and if so, where can I do it?

Again, thanks a lot for your help!

Posts: 4065 · xCSxXenon 11 May 2021, 04:57

Posting a help thread specifically about the Spanish characters issue would be best.
As for the original issue, I would probably not worry about it, but you could run "Disk Inventory X" on both disks to get a map of all the data that reside on each.
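If a graphical map is not enough, the same kind of comparison can be roughly scripted. The following is only a sketch in Python, not something FFS or Disk Inventory X provides: it walks both trees and lists files that exist on one side only, or whose sizes differ. The function names and the mount points in the usage comment are hypothetical. It applies no exclusion filters, so items such as .DS_Store will show up.

```python
import os

def index_tree(root: str) -> dict[str, int]:
    """Map each file's path (relative to root) to its size in bytes."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                index[os.path.relpath(full, root)] = os.path.getsize(full)
            except OSError:
                # Unreadable entries are simply skipped in this sketch.
                continue
    return index

def compare_trees(source: str, backup: str):
    """Return (only_in_source, only_in_backup, size_differs) as sorted lists."""
    src, dst = index_tree(source), index_tree(backup)
    only_in_source = sorted(src.keys() - dst.keys())
    only_in_backup = sorted(dst.keys() - src.keys())
    size_differs = sorted(p for p in src.keys() & dst.keys() if src[p] != dst[p])
    return only_in_source, only_in_backup, size_differs

# Example usage (hypothetical mount points, adjust to the real ones):
# missing, extra, changed = compare_trees("/Volumes/Source", "/Volumes/Backup")
# print(len(extra), "files exist only on the backup")
```

The "only on the backup" list is the one to look at when the backup reports more files than the source.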

Posts: 9 · Leah 12 May 2021, 16:36

Thanks a lot for your kind reply.

I will look into the program you suggested to see if I can find out anything else. As for the Spanish characters issue, I just tried with the 11.10 version which was released a few days ago and the problem is solved, so there is no need to report it. Again, thanks a lot for your help, I really appreciate it.

Posts: 1041 · therube 12 May 2021, 17:03

If you have access to a Windows machine (that can access your NTFS devices), using Everything, you can do a search something like:

!dupe: !sizedupe:

I believe that will show all files that are not duplicate & are not the same size.
(dupe:, finds duplicated file names. sizedupe: finds duplicated file sizes. ! negates the function.
Both work globally, against any indexed disk, so if there are other devices that Everything happens to see, you'd want to exclude those from its index.)
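For anyone without Everything at hand, the idea behind that search can be approximated in a few lines of Python. This is only a sketch of the behaviour as described above (keep entries whose name is not duplicated and whose size is not duplicated), not Everything's actual implementation. The entries are made-up examples, and comparing names case-insensitively is an assumption.

```python
from collections import Counter
from pathlib import PureWindowsPath

def unique_name_and_size(entries):
    """Keep (path, size) entries whose file name AND size each occur only once."""
    entries = list(entries)
    # Assumption: names are compared case-insensitively, as is usual on Windows.
    names = [PureWindowsPath(path).name.casefold() for path, _size in entries]
    name_counts = Counter(names)
    size_counts = Counter(size for _path, size in entries)
    return [
        (path, size)
        for (path, size), name in zip(entries, names)
        if name_counts[name] == 1 and size_counts[size] == 1
    ]

# Hypothetical index covering two drives.
entries = [
    ("C:/docs/report.txt", 1200),
    ("G:/docs/report.txt", 1200),
    ("C:/iso/Win7.iso", 5_000_000_000),
    ("G:/out/0neil/notes.txt", 7_000),
]

for path, size in unique_name_and_size(entries):
    print(path, size)
```

Only the entries that have no counterpart by name or by size survive the filter, which is why the result should point at whatever exists on one disk only.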

That should point you towards your 7G.
(Heh. Cell companies are just bringing out 5G & you're already on 7G.)

(I'm thinking you'll be better off with the Everything 1.5 Alpha version, as the above search, I believe, is more correct.

Depending on whether your NTFS drives are seen as fixed or removable, you may need to manually tell Everything to index it. Tools | Options | Indexes -> NTFS)

Posts: 9 · Leah 12 May 2021, 20:33

Thank you very much for your kind reply. I will try your suggestion and hopefully it will help me find out where those 7G (heh!) are hiding... I really appreciate your help!

Posts: 1041 · therube 14 May 2021, 17:49

I indexed my C: drive
I indexed my G: drive

C: is my new(er) computer
G: is my older (ancient) computer, networked

When I switched from one to another, I (mostly) copied what was on G: to C:.

I searched for items (files & directories) that were not duplicated between the 2 machines (or drives, if you will)
- !dupe:
I excluded (from displaying) 2 directories that are on C: (as they're not relevant)
(/lib is on an L: drive on the 'G' computer. /000 contains directories/files from G: that are not particularly relevant to my current computer)
- !c:/lib !c:/000
I'm only (displaying) the names through the 2nd directory level (like g:/out rather than something like G:\out\0tmp\2020.1221.b4.QT-0tmp\tmp)
- parents:2
Then I've limited the (display) to only directories (rather than a combination of directory & file names)
- folder:

What the above is showing me - assuming I'm understanding correctly...

there are directories in G:/out that are not on C:/out
- 0credit @ 0 MB, 0neil @ 7 MB, 0sonnypm @ 723 MB
there is a /7+ Taskbar Tweaker/ directory in C:/DEV/ that is not on G:
- G: is XP so is not as brain dead as MS made Win7, so the utility isn't needed ;-)
...

sorted by size, & I very quickly see that C:/Windows/winsxs holds the biggest difference between the two systems - 9 GB

now, change my filtering, say removing 'folder:' (which then causes files to also display), I then immediately see that a Win7.iso file is my largest single file, 5 GB

Note that the dupe: function deals with names rather than paths.
And since it is dealing with a name, it only matters that the name is duplicated (or not duplicated in this case).

IOW...

I have a G:\BASIS\MKRecover but no C:\BASIS\ directory; I do, however, have C:\000\BASIS\MKRecover. The fact that the same (directory, in this case) name exists, regardless of path, causes it to match the dupe: (duplicated name) function, and when I negate that (!dupe:), the MKRecover directory does not show up in my listing. The name is the same; it just happens to be at a different location.

Similarly, as this is name-based, the same name could exist but be totally unrelated to some other same-named (file or) directory.

So if you have c:/trains/red & g:/cathedrals/red & c:/fruits/artichokes/red, the (file or) directory named "red" would be matched by the dupe: function.

IOW, if you were after cathedrals & wanted to make sure that cathedrals were on both C: & G:, "red" would not be sufficient as "red" is also dup'd with trains & artichokes.
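The difference between matching on name and matching on path can be sketched in Python using the three "red" paths from above. This only illustrates the idea and says nothing about how Everything works internally; both helper functions are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath

paths = ["c:/trains/red", "g:/cathedrals/red", "c:/fruits/artichokes/red"]

def group_by_name(paths):
    """Group paths by their last component only (name-based matching)."""
    groups = defaultdict(list)
    for p in paths:
        groups[PurePosixPath(p).name].append(p)
    return dict(groups)

def group_by_path_without_drive(paths):
    """Group paths by everything after the drive (path-based matching)."""
    groups = defaultdict(list)
    for p in paths:
        parts = PurePosixPath(p).parts  # e.g. ('c:', 'trains', 'red')
        groups["/".join(parts[1:])].append(p)
    return dict(groups)

print(group_by_name(paths))                # one group: all three share the name "red"
print(group_by_path_without_drive(paths))  # three groups of one path each
```

By name, all three land in the same group, so "red" counts as duplicated even though the directories are unrelated; by path, none of them has a counterpart on the other drive.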

Posts: 4065 · xCSxXenon 14 May 2021, 18:38

Did you post in the wrong thread? What does any of that matter?

Posts: 1041 · therube 19 May 2021, 16:57

In this case, it may help the OP in discovering her (file size & number) discrepancy between her source & target - in an easy & concise manner.

It also points out, somewhat generally, how to use Everything, & how filtering affects what & how data is displayed.

And further, how Everything may help for those who have "moved files around" (& or renamed) since a backup has been made. Helping to point out where they've been moved to (& or renamed as).

Something like that could be used instead of, or in addition to (another) BRU sync which might potentially be copying (syncing) the same files again (only that they have been moved/renamed). (Multiple methods to a similar endpoint. Can be used singularly or in combination... depending on one's situation.)

As it is, I have a backup... of stuff. Been doing a lot of housecleaning. And the next time I run BRU (I'll know for sure how the above ideas pan out ;-)) it will show that I have a lot of "new" & or deleted files, where in fact, virtually all changes are moved/renamed (or outright) deletions.

Now I can simply let BRU do its thing, which would be fine, & then go in after the fact & delete duplicates, or I can examine the changes (which I hope Everything will point out), & make some judicious changes (moves/renames) on the target end, so that when I do throw BRU at it, it will find far fewer changes than it otherwise would. In the end, using a combination of both will make it easier on me.

Posts: 1041 · therube 9 Jun 2021, 20:36

Heh.
I was just reviewing this, & not quite sure where my head was.
In the above, any reference to BRU (quite a different utility) should be replaced by "FFS".
%s/BRU/FFS/g
;-).