Large number of small files

Posts: 3
Joined: 28 Mar 2023

Paul_FFS

My backups include a very large number of small sidecar files that are dragging the sync out to unreasonable lengths. Another post pointed me toward a key part of the problem, the "safe copy" option and its temp files, but I'm not fond of having to disable that completely. I'm wondering whether the behavior I'm seeing can be tweaked so I can keep using safe copy without sync times stretching out to weeks.

What brought this to my attention was a recent update of a cold storage backup. The update is only 32GB, but that's spread across 250k files. When it started, it estimated it would finish overnight, but when I came back in the morning the estimate had been pushed out to 24 days. Cancelling doesn't cancel right away, so I had to force quit and use Windows to delete the tmp folder on the destination, and even that took over two hours (skipping the recycle bin). So all night was spent just creating those files without making any progress on the sync. Restarting is going to be from scratch with zero progress made, so I'm wondering if there is a way to force the sync to do one file at a time rather than try to do them all at once. That way, if I have to cancel, the restart should be faster since some files are already done. This would also make a difference in disk space, since it wouldn't need so much room for a copy of everything and could reuse the space after deleting each file in turn.

This is less of a problem for the warm storage backing up to the NAS, but I think the process would still benefit if the sync was done one or just a few files at a time.
Posts: 5
Joined: 12 Dec 2021

SkillerAmSkillen

I don't know if you can force it to sync one file at a time, but you could try donating $1 or so to get the Donation Edition. With it you can copy multiple files in parallel, which is particularly handy for a large number of small files and will shorten the time to copy them.
Posts: 959
Joined: 8 May 2006

therube

If you attempt the same or a similar copy by other means, does it result in a significantly quicker copy time?

200K TINY files in 4 min, local SSD.
Posts: 3727
Joined: 11 Jun 2019

xCSxXenon

I don't know if you can force it to sync one file at a time, but you could try donating $1 or so to get the Donation Edition. With it you can copy multiple files in parallel, which is particularly handy for a large number of small files and will shorten the time to copy them. SkillerAmSkillen, 29 Mar 2023, 14:06
Note: Parallel threads do not magically make things faster all the time. They are designed for situations where each transfer thread is artificially throttled. They cannot increase speed if the bottleneck is hardware.

OP, what are you syncing from/to? HDDs or SSDs? 250k files will obviously take longer than 2k files of the same total size, but you may want to look into investing in higher-tier storage.
I think the process would still benefit if the sync was done one or just a few files at a time.
By default, FFS transfers one file at a time.

Now, ignoring any other advice or upgrades, you have a couple of options:
1) Run a compare first, select a subset of files (e.g. 50k files or 25k, whatever), then sync that subset only. Repeat until all data is synced.
2) Create multiple saved configurations where each configuration syncs a subset of data. This can be done in two ways:
a. Multiple configs that sync the whole source to the destination, where each config uses filters to sync only certain sets of data.
b. Multiple configs that each sync a set of subfolders in the source to the destination.
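
For option 2b, it can help to know how many files sit under each top-level subfolder before deciding how to split the configs. The following is only a rough planning sketch in Python, run outside of FFS. The function names and the `max_files` limit are made up for illustration, and nothing here is an FFS feature or API.

```python
import os
from pathlib import Path


def count_files_per_subfolder(root):
    """Count files (recursively) under each top-level subfolder of root.

    Files lying directly in root are not counted; they would need
    their own config or filter.
    """
    counts = {}
    for entry in sorted(Path(root).iterdir()):
        if entry.is_dir():
            counts[entry.name] = sum(len(files) for _, _, files in os.walk(entry))
    return counts


def group_subfolders(counts, max_files):
    """Greedily pack subfolders into groups of at most max_files files.

    A single subfolder larger than max_files still gets a group of its own.
    """
    groups, current, total = [], [], 0
    for name, n in counts.items():
        if current and total + n > max_files:
            groups.append(current)
            current, total = [], 0
        current.append(name)
        total += n
    if current:
        groups.append(current)
    return groups
```

Each resulting group would then correspond to one saved configuration covering those subfolders. The grouping is a simple greedy pass in folder-name order, so it is not guaranteed to produce the smallest possible number of configs.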
Posts: 3
Joined: 28 Mar 2023

Paul_FFS

Some transfers have to be done via USB3.x, but even on machines that can be opened and the destination drive plugged in via SATA it's far from ideal, and this transfer speed isn't the frustration that triggered the post. The frustration is that the process was left running for over 15 hours, that when cancelled it would have taken hours for the cancellation to actually free up system resources, and that absolutely zero progress was made in the time spent. All the process had done so far was create tens of thousands of tmp files on the destination disk (and it wasn't done yet), so when restarted it had to start over with the first file again. I would rather the sync make a tmp, copy the file, verify, delete the temp, and repeat for the next file, rather than make all temps, make all copies, verify all, delete all. Even if this made the total process time twice as long, I'd get some progress out of every start/stop and wouldn't need an abundance of free disk space to accommodate an extra copy of everything as tmp and recycle bin all at once.

As a short-term workaround I am breaking these into even smaller batches, but it opens up a lot of opportunity for mistakes and omissions and will still have hiccups for batches that can only get so small...
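
To make the per-file order described above concrete, here is a minimal Python sketch of a "copy to temp, verify, rename, next file" loop. It is not how FFS is implemented. The function name `copy_one_at_a_time` and the `.tmp` suffix are assumptions chosen for illustration, and the verification is a plain byte-for-byte comparison.

```python
import filecmp
import os
import shutil
from pathlib import Path


def copy_one_at_a_time(src_root, dst_root):
    """Copy a tree file by file; return the number of files copied.

    Each file is written to a temporary name, verified, and only then
    renamed into place, so at most one temp file exists at any time and
    an interrupted run can resume by skipping files that already match.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied = 0
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        dst = dst_root / src.relative_to(src_root)
        # Resume support: skip files finished by an earlier run.
        if dst.exists() and filecmp.cmp(src, dst, shallow=False):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        tmp = dst.with_name(dst.name + ".tmp")  # hypothetical temp suffix
        shutil.copy2(src, tmp)  # copy data and timestamps to the temp name
        if not filecmp.cmp(src, tmp, shallow=False):
            tmp.unlink()
            raise OSError(f"verification failed for {src}")
        os.replace(tmp, dst)  # move the verified copy into place
        copied += 1
    return copied
```

With this ordering, cancelling leaves at most a single stray temp file, and extra disk space is only needed for one file at a time. The full comparison of every existing destination file makes resuming slower than a timestamp check would be; that trade-off is a choice of this sketch only.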
Site Admin
Posts: 7089
Joined: 9 Dec 2007

Zenju

Let's repeat, to avoid further misunderstanding:
By default, FFS transfers one file at a time. xCSxXenon, 30 Mar 2023, 14:56
Posts: 3
Joined: 28 Mar 2023

Paul_FFS

Then how am I ending up with tens of thousands of temp files all at once at sync completion or if I hit stop? Why, if I cancel after 15 hours and restart, was the file count needing to be synced unchanged, yet all those temp files had to be cleaned up? I just did a sync of 64k files; it went through everything but hung during cleanup: the same "deleting <file123>" for over an hour, flat graphs, etc. 58k files were left in the temp folder to be deleted manually...
Site Admin
Posts: 7089
Joined: 9 Dec 2007

Zenju

Then how am I ending up with tens of thousands of temp files all at once at sync completion or if I hit stop? Why, if I cancel after 15 hours and restart, was the file count needing to be synced unchanged, yet all those temp files had to be cleaned up? I just did a sync of 64k files; it went through everything but hung during cleanup: the same "deleting <file123>" for over an hour, flat graphs, etc. 58k files were left in the temp folder to be deleted manually... Paul_FFS, 01 Apr 2023, 09:50
I don't know which temp files you are talking about.