Slower (on Mac) since 8.7?

Posts: 10
Joined: 27 Nov 2017

DerekPap

I've been using FreeFileSync for ages on both Windows and Mac and think it's excellent, but one issue I've been having is that the speed seems to have dropped since v8.7. Whenever I've tried updating to the latest version(s) since then, they've been almost 50% slower in transfer. In fact, I ran a test just now: with 8.7 I was getting around 370 MB/sec, which is what I normally get (internal Mac SSD to USB 3 SSD), but with the latest 9.5 I was only getting 220 MB/sec.

Has it got fundamentally slower since 8.7, or is there perhaps a new setting introduced later that I'm missing and that affects the speed?

Any help much appreciated as having to stick to 8.7 still.

Thanks Derek
Site Admin
Posts: 7282
Joined: 9 Dec 2007

Zenju

Can you reproduce a speed difference between 8.7 and 8.8?

DerekPap

I haven't checked recently, but when I first discovered the problem a few months back I did try each version and believe I found 8.7 was the last version where the speed was OK, so I presume 8.8 was the first slow one. When I have a moment I will download 8.8 and check again to make sure.

Zenju

Thanks! If there really is a reproducible performance break between these versions, it can be fixed.

DerekPap

OK, just done a check: 8.8 is OK but 8.9 isn't, so it appears the performance changed with something in 8.9?

DerekPap

Just for reference: 8.7 and 8.8 were getting about 370 MB/sec, but with 8.9 it was about 225 MB/sec.

Zenju

Are you able to regain the perf when going back from 8.9 to 8.8? (correct perf measurement is tricky...)

DerekPap

Yes indeed. For reference, I've done some more tests switching between 8.8 and 8.9 alternately, resuming/suspending a virtual machine in between, for a mix of large and small files, with the following results:

v8.8 - 51 files / 6.15 GB = 363 MB/sec, elapsed 17 seconds
v8.9 - 41 files / 5.50 GB = 227 MB/sec, elapsed 24 seconds
v8.8 - 36 files / 5.14 GB = 365 MB/sec, elapsed 14 seconds
v8.9 - 38 files / 5.23 GB = 228 MB/sec, elapsed 23 seconds
v8.8 - 42 files / 5.50 GB = 364 MB/sec, elapsed 15 seconds
v8.9 - 38 files / 5.34 GB = 225 MB/sec, elapsed 24 seconds

The numbers are as reported by FreeFileSync in summary screen at the end of sync.

As you can see, v8.8 is averaging about 364 MB/sec whereas v8.9 is averaging about 226 MB/sec, so there's a significant difference between the two!
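
For anyone who wants to check the averages, here is a minimal Python sketch of the arithmetic. It only uses the MB/sec figures from the six runs listed in this post; the variable names are made up for illustration.

```python
from statistics import mean

# Throughput figures (MB/sec) copied from the six runs listed in this post.
runs = {
    "8.8": [363, 365, 364],
    "8.9": [227, 228, 225],
}

averages = {version: mean(values) for version, values in runs.items()}

# Relative slowdown of 8.9 compared with 8.8, as a fraction of the 8.8 rate.
slowdown = 1 - averages["8.9"] / averages["8.8"]

for version, avg in averages.items():
    print(f"v{version}: {avg:.1f} MB/sec on average")
print(f"8.9 is about {slowdown:.0%} slower than 8.8")
```

With these numbers the slowdown works out to roughly 38%.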

Zenju

Perfect, you might indeed be onto something. I'll dig deep into the tons of changes now...

Zenju

In my testing I can see a speed difference, but it's not as large as in your case. Still, I believe I've found what's causing this. Your overall transfer speed is very high, which may be the reason why some additional memory copies that were added in 8.9 cause the slowdown.
I've created a new beta that gets rid of the file buffering (at least at the application level). Please let me know the results for your test case:
http://www.mediafire.com/file/dp758lrrheddz1p/FreeFileSync_9.6_beta_macOS.zip

DerekPap

Hi Zenju,

Well I think you've cracked it! :) Doing some tests (see below) comparing v8.8 with the v9.6 beta, not only has the problem gone away but it's now significantly faster (rather than slower) than v8.8, which I guess is due to the other improvements/updates made between 8.8 -> 9.6?

v8.8 - 43 files / 6.14 GB - 353 MB/sec / 18 seconds
v9.6 - 47 files / 6.46 GB - 449 MB/sec / 14 seconds
v8.8 - 48 files / 6.75 GB - 352 MB/sec / 19 seconds
v9.6 - 43 files / 6.09 GB - 448 MB/sec / 14 seconds
v8.8 - 47 files / 6.24 GB - 337 MB/sec / 19 seconds
v9.6 - 43 files / 6.16 GB - 449 MB/sec / 14 seconds

As you can see, I'm now getting around 448 MB/sec vs 352 MB/sec (with v8.9 it was about 226 MB/sec), so really good. I guess the question now is this: the "additional memory copies that were added" you mentioned were presumably added to improve something (?), so will removing the file buffering degrade other scenarios? Or is it possible that they did degrade other scenarios too, but at generally slower speeds (i.e. without SSD to SSD) it wasn't significant enough to be noticed until now?

Anyway thanks for finding the solution, good work!

Is it safe for me to continue using the 9.6 beta you posted or should I wait until an official 9.6 release?

Thanks again,

Derek

Zenju

"Well I think you've cracked it! :)" (DerekPap, 29 Nov 2017, 10:07)
The credit mostly goes to you, for reporting this issue in the first place and doing the testing. Perf bugs are among the most difficult ones to fix, since usually nobody reports them (FFS 8.9 is 9 months old!) and even when confirmed, you're lucky when there is only a single cause, like we have here.

During file copy, three types of buffers are involved:
1. buffering at the hardware level, managed by the hard disk
2. at the operating system level
3. at the application level

The last beta you tested basically got rid of the 3rd buffer and did what is called "unbuffered I/O". This would be slow if there weren't the other buffers 1 and 2. An application-level buffer 3 is standard procedure for file-copying purposes and reduces the potential overhead of calling into the OS too often and with too little data each time. In the case of file copying there is not really much benefit, since the data blocks that are copied are reasonably large, but what's really surprising is that 3 can actually be a perf pessimization. Considering that hard drives operate(d) at speeds orders of magnitude slower than memory, it shouldn't matter whether data is copied around for buffering purposes. Maybe the use of SSDs has changed that and uncovered a new bottleneck that would not have mattered at lower disk access speeds.
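
To make "unbuffered at the application level" concrete, here is a minimal Python sketch of a block-wise copy that hands each block straight from one OS read call to one OS write call, leaving buffering to the OS and the hardware. This is an illustration only, not FreeFileSync's actual code (which is not shown in this thread); the function name and the block size are assumptions.

```python
import os

BLOCK_SIZE = 128 * 1024  # hypothetical block size, not FreeFileSync's actual value
O_BINARY = getattr(os, "O_BINARY", 0)  # only exists (and matters) on Windows


def copy_unbuffered(src_path: str, dst_path: str, block_size: int = BLOCK_SIZE) -> int:
    """Copy src to dst with plain OS read/write calls and no application-level buffer.

    Returns the number of bytes copied.
    """
    copied = 0
    src_fd = os.open(src_path, os.O_RDONLY | O_BINARY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_BINARY, 0o644)
        try:
            while True:
                block = os.read(src_fd, block_size)  # one OS call per block
                if not block:  # an empty result means end of file
                    break
                view = memoryview(block)
                while view:  # os.write may write fewer bytes than requested
                    written = os.write(dst_fd, view)
                    view = view[written:]
                copied += len(block)
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
    return copied
```

With large blocks the number of OS calls stays small, which is why an extra application-level buffer adds little for plain file copying.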
"it's now significantly faster (rather than slower) than v8.8, which I guess is due to the other improvements/updates made between 8.8 -> 9.6?" (DerekPap, 29 Nov 2017, 10:07)
Exactly, there has been lots of meticulous fine-tuning after 8.8.
"Is it safe for me to continue using the 9.6 beta you posted or should I wait until an official 9.6 release?" (DerekPap, 29 Nov 2017, 10:07)
It *should* be safe, but if you want to be on the safe side, better wait for the official release. The code changes in the beta have not gone through a proper review yet.

PS: I'll have to do some perf testing on Linux as well.

Zenju

After testing on Linux and Windows, it seems the SSD-explanation does not hold:

1. Windows: Unbuffered write is 80% slower for variable/small data blocks (e.g. "export file list"). No perf drawback during file copy.

2. Linux: Unbuffered write is 22% slower for variable/small data blocks; when the target volume is a network share, it's 3400% [!] slower. No perf drawback during file copy!

3. macOS: Due to technical limitations I wasn't able to test "variable/small data blocks" yet. Unbuffered file copy is 30-40% faster.

But: even this faster version is still *twice* as slow as file copy via "Finder" and also twice as slow as the Linux file copy test case on the same network share; same results for local file copy. Other means of copying files, like "cp" or "native" copy commands such as "fcopyfile" (which technically are implemented similarly to "cp"), are equally slow.
This implies that the perf issue in general is not only related to the application-layer buffering (I still don't understand why this pessimization exists here, while it does not on Linux or Windows) but seems to be a general issue with the POSIX API providing C-level streams on macOS.
This could be explained if on macOS (which is built upon BSD) the POSIX API itself were suboptimal and built on top of more performant lower-level access routines, which "Finder" is using.
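
As a rough illustration of why unbuffered writes hurt for variable/small data blocks, the Python sketch below counts how many low-level write calls reach the target with and without an application-level buffer. It uses an in-memory stand-in instead of a real file or network share, so it shows call counts only, not timings; the class name, chunk size and buffer size are assumptions chosen for the example.

```python
import io


class CountingSink(io.RawIOBase):
    """In-memory stand-in for a file that counts how often write() is called."""

    def __init__(self) -> None:
        super().__init__()
        self.calls = 0
        self.total = 0

    def writable(self) -> bool:
        return True

    def write(self, data) -> int:
        self.calls += 1
        self.total += len(data)
        return len(data)


def count_writes(chunks, buffer_size=None):
    """Return (number of low-level writes, bytes written) for the given chunks."""
    sink = CountingSink()
    if buffer_size is None:
        for chunk in chunks:
            sink.write(chunk)  # unbuffered: every chunk hits the sink directly
    else:
        writer = io.BufferedWriter(sink, buffer_size=buffer_size)
        for chunk in chunks:
            writer.write(chunk)  # buffered: chunks are collected in memory first
        writer.flush()
    return sink.calls, sink.total


# Hypothetical workload: many short records, as in an "export file list" style output.
small_chunks = [b"x" * 50] * 10_000
print("unbuffered:", count_writes(small_chunks))
print("buffered:  ", count_writes(small_chunks, 64 * 1024))
```

Each low-level write stands for a call into the OS, or a round trip when the target is a network share, so the per-call cost is multiplied by the number of chunks in the unbuffered case.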

DerekPap

Hi Zenju,

If I read you right, the 9.6 beta (with the 3rd, application-level buffer removed), whilst much faster on macOS than with it, is far slower on both Windows and Linux?

Also, on macOS I haven't found it slower than Finder (must admit I haven't done real tests), and the speed I'm getting, as mentioned previously, of around 450 MB/sec is actually close to the manufacturer's specs for the external device I'm writing to, which is a Samsung T5.

So what are the thoughts then on going forward? Is it maybe something that could be added (i.e. removal of application buffering) as a switchable option in the sync settings, so users could see which works best for their setup and turn it on or off? Or do you see it being something that might differ between builds for macOS vs Windows vs Linux? I'm guessing that's not ideal, as I would've thought it best to keep as much as possible common across platforms.

Derek

Zenju

"If I read you right, the 9.6 beta (with the 3rd, application-level buffer removed), whilst much faster on macOS than with it, is far slower on both Windows and Linux?" (DerekPap, 30 Nov 2017, 18:26)
Windows/Linux: Far slower for variable block sizes, no difference for file copy.
"Also, on macOS I haven't found it slower than Finder (must admit I haven't done real tests)" (DerekPap, 30 Nov 2017, 18:26)
I've tested with a single 2 GB file: Finder is twice as fast as FFS unbuffered, but just as fast as FFS (no matter if buffered or not) on Windows and Linux. It would be great if you could test if you see this difference with Finder, too.
"So what are the thoughts then on going forward?" (DerekPap, 30 Nov 2017, 18:26)
Ideally, get the feedback of a few more FFS users on macOS to see how FFS compares against Finder.
The perf issue seems confirmed: viewtopic.php?t=2717
But it's not yet clear if unbuffered I/O is the right cure. At least it doesn't make sense considering that memory is copied at 2 GB/s, so the extra copies should not have such a large impact on an operation running at only 300 MB/s.
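
The doubt can be put into numbers with a back-of-the-envelope model. The Python sketch below assumes the simplest possible case: every byte is first transferred at the I/O rate and then copied once more in memory, strictly one after the other with no overlap. That serial model and the function names are assumptions for illustration; the rates are the ones quoted in this thread.

```python
def effective_rate(io_rate_mb_s: float, memcpy_rate_mb_s: float) -> float:
    """Throughput if each byte is transferred and then copied once in memory.

    Assumes the two steps run strictly one after the other (no overlap).
    """
    seconds_per_mb = 1 / io_rate_mb_s + 1 / memcpy_rate_mb_s
    return 1 / seconds_per_mb


def required_memcpy_rate(fast_mb_s: float, slow_mb_s: float) -> float:
    """Memory copy rate that would explain a drop from fast to slow in this model."""
    return 1 / (1 / slow_mb_s - 1 / fast_mb_s)


io_rate = 300.0       # MB/s, the transfer rate quoted in the post
memcpy_rate = 2000.0  # MB/s, the memory copy rate quoted in the post

rate = effective_rate(io_rate, memcpy_rate)
print(f"expected rate with one extra copy: {rate:.0f} MB/s")
print(f"expected slowdown: {1 - rate / io_rate:.0%}")

# Averages reported earlier in the thread: about 364 MB/sec (8.8) vs 226 MB/sec (8.9).
needed = required_memcpy_rate(364.0, 226.0)
print(f"memory copy rate needed to explain the measured drop: {needed:.0f} MB/s")
```

Under this model one extra copy costs about 13% (roughly 261 MB/s instead of 300 MB/s), while the measured drop would need a memory copy running at only about 600 MB/s, far below the quoted 2 GB/s.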

DerekPap

Hi Zenju,

OK, I've run some tests comparing v8.8, the v9.6 beta and Finder, copying a folder with files and sub-folders from internal to external SSD. Total size is approx. 14 GB, made up of 290 files: a few large files of 11 GB, 2 GB, 300 MB and 200 MB, with the remainder being small files. (It's a virtual machine folder, hence the large files, which is mainly what I copy as I use VMs a lot. Perhaps it's this that's making the difference, i.e. a few BIG files rather than lots of small ones?) The results are as follows:

FreeFileSync 8.8

36 secs - 369MB/sec
34 secs - 393MB/sec
35 secs - 387MB/sec

FreeFileSync 9.6 beta

28 secs - 480MB/sec
28 secs - 476MB/sec
28 secs - 479MB/sec

Finder

28 secs
28 secs
28 secs

As you can see, the unbuffered 9.6 beta is just as fast as Finder! This seems at odds with your findings of Finder being faster?

Hope this info helps, let me know if anything else I can do or test.

Derek

Zenju

Thanks for testing and reporting your numbers! It's quite possible that the FFS vs Finder difference I'm still seeing is due to peculiarities of my test setup.

After testing more code variants it has become clear that the additional memory copies are not what has caused the slowdown, so my doubts were justified. That said, I'm still not 100% certain what the trigger is, but I've narrowed it down to a few possibilities that are essentially implementation details (QOI issue of std::vector on clang/macOS? unexpected cost of zero-initializing memory? problem with function inlining?).

In any case, the implementation I've chosen until further notice is more elegant than the <= 9.5 variant and gets rid of a few operations, one of which I suspect correlates with the perf slowdown. So all that's left is for you to kindly test and let me know if this variant also fixes the perf issues in your scenario:

http://www.mediafire.com/file/iu5w67az0v5a5il/FreeFileSync_9.6_beta_macOS.zip

DerekPap

Hi Zenju,

Just run some tests with the original v8.8, the first 9.6 beta and the latest 9.6 beta you posted yesterday:

FreeFileSync 8.8

36 secs - 369MB/sec
34 secs - 393MB/sec
35 secs - 387MB/sec
37 secs - 366MB/sec

FreeFileSync 9.6 beta (1)

28 secs - 480MB/sec
28 secs - 476MB/sec
28 secs - 479MB/sec
28 secs - 480MB/sec
28 secs - 479MB/sec
28 secs - 474MB/sec

FreeFileSync 9.6 beta (build 5/12/17)

27 secs - 479MB/sec
30 secs - 450MB/sec
28 secs - 475MB/sec
29 secs - 464MB/sec
29 secs - 468MB/sec
28 secs - 475MB/sec
28 secs - 474MB/sec

Looks like the latest is pretty close to the first (unbuffered) one, although it seems slightly less consistent perhaps?

So I think we're close with your latest code tweaks. You'll know better than me from the code side, but would it be worth trying (if possible) combining your latest code tweaks with the unbuffered copy, just to see if that speeds things up even more? It might help to clarify exactly which things are causing/fixing the issue. But if not, then this latest version is looking good anyway. Have you had a chance to see how it performs under Windows/Linux yet?

Derek

Zenju

Your latest numbers look great! So now we have the same perf as the unbuffered blockwise copy of "FreeFileSync 9.6 beta (1)" for file copying, and in addition have also improved all other I/O that relies on streams (e.g. config save/load, export file list, FTP, SFTP, etc.).
"Might help to clarify exactly which things are causing/fixing the issue?" (DerekPap, 06 Dec 2017, 10:45)
The cost/benefit ratio is bad for such a task. It will not be easy to narrow down the trigger 100%, which is already known to be some platform- or compiler-dependent cause. Also, the new implementation is better from a theoretical point of view. So other than to satisfy curiosity (like "oh, the old memory access pattern confused the OS's prefetcher heuristics...") there is not much to win. That's my estimate according to Pareto.

DerekPap

Hi Zenju,

Sounds good, so do you see a 9.6 release coming soon now? Or is it OK to use the latest beta until then?

I quite understand it's not worth spending time to find the exact cause; as you say, looks like we have a good working version now. Must admit, though, it would still be interesting to know what changed between 8.8 and 8.9 that caused the slowdown in the first place, but it's not important :)

Thanks for all your help on this issue.

Derek

Zenju

I'll try to release today!
Posts: 2
Joined: 5 Oct 2018

jot

Hi Zenju,

I have a similar issue and found this post some time ago, but did not have the time to go into much detail thus far.

I have a speed issue between Linux (Ubuntu 16.04) and Windows (7 Prof.). The Linux system is actually the host, and Windows is running on it as a guest in VMware (v14). I would expect FreeFileSync to be slower in the Windows VM, but in fact it is way slower on the Linux host. It is syncing the exact same directories on both systems, but on Linux it is about a factor of 10 slower than on Windows (25-30 s vs. 2-3 s).

Syncing is done between directories on the Linux host and directories on a network drive. Windows uses the same directories on the host via "Shared Folders" in the VM.

First I thought it might be due to the older version (v9.6) I used, but a few days ago I installed the latest version (v10.4) and the behavior is still the same.

I checked the settings of both, but could not find a difference. Is there anything I might have overlooked?

This is not a major issue. It's more something I wonder about, and it would be nice to have it solved, but I do not always have the time to extensively test things myself, so I fully understand if you have other/better things to do yourself.

Thanks in advance for any ideas.

In case you need more "data", just let me know, and I'll see whether I can help.

Best Regards,
Jan