Faster scanning

Discuss new features and functions
Posts: 15
Joined: 25 Mar 2005

lee_jay

A possible idea for faster scanning:

If multiple volumes are to be scanned, start processes to scan each unique
volume all at the same time. I'm currently scanning 7 different volumes, and
FFS does them all sequentially. Since the scan takes around half-an-hour, a
significant savings could be had if they were all started at once.

Thanks for a great utility!
Site Admin
Posts: 7058
Joined: 9 Dec 2007

Zenju

This is an old topic which I had almost forgotten. I have opened a
corresponding tracker item, to have it in focus:
https://sourceforge.net/tracker/?func=detail&aid=3382880&group_id=234430&atid=1093083
Posts: 15
Joined: 25 Mar 2005

lee_jay

Thank you!

Also, it's not just source and target, since we can use many sources and many
targets. It would be nice to start many sources at the same time and many
targets at the same time provided they are all on different volumes.

Thanks again!
Site Admin
Posts: 7058
Joined: 9 Dec 2007

Zenju

Sure, this is the feature request, despite the imprecise statement. There
could be one thread per disk (remember that disk <-> volume is an N-to-N
relation).
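
To make the "one thread per disk" idea concrete, here is a minimal sketch in Python. It is not how FFS implements it. It assumes the device id reported by `os.stat` (`st_dev`) is a good enough stand-in for "disk", and the helper names (`group_by_device`, `count_files`, `scan_device`, `scan_all`) are made up for illustration. Folders on the same device are scanned one after another, and each device gets its own thread.

```python
import os
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_device(folders):
    """Group folder paths by device id (assumption: st_dev approximates 'disk')."""
    groups = defaultdict(list)
    for folder in folders:
        groups[os.stat(folder).st_dev].append(folder)
    return dict(groups)


def count_files(folder):
    """Count the files below a folder with a plain recursive walk."""
    return sum(len(files) for _root, _dirs, files in os.walk(folder))


def scan_device(folders):
    """Scan the folders of one device sequentially, so the device sees one scanner."""
    return {folder: count_files(folder) for folder in folders}


def scan_all(folders):
    """Run one worker thread per device and merge the per-folder file counts."""
    groups = group_by_device(folders)
    if not groups:
        return {}
    results = {}
    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        for partial in pool.map(scan_device, groups.values()):
            results.update(partial)
    return results
```

Grouping first matters because two folders on the same physical disk would otherwise compete for the same drive head. Network shares may all report different device ids even when they sit on one remote disk, so a real implementation would need a better notion of "disk" than this sketch has.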
Posts: 15
Joined: 25 Mar 2005

lee_jay

Thanks for implementing this! It seems to work!
Posts: 15
Joined: 25 Mar 2005

lee_jay

Hmmm...not so sure now. I'm having a little trouble. Scanning one of my
folders over the Gigabit network is at least one whole order of magnitude
slower than it was in .13 - possibly as much as 2 orders of magnitude over
wireless (300Mbps wireless N) on that same folder. Oddly the other folders
seem to scan at a reasonable speed. This one has the most subfolders by far
(tens of thousands) and it is therefore the last one to finish. I tried
different positions in the queue and it didn't seem to matter. Not sure what
to do about that.
Site Admin
Posts: 7058
Joined: 9 Dec 2007

Zenju

Are you on 64-bit or 32-bit Windows? I can assemble a few test versions to
identify the perf bottleneck.
Can you provide me with the total times (to the second) for your test case
with
1. v3.19 (= sequential directory scanning)
2. v3.20 (= parallel scanning)
Some description of the test case would be helpful, like the number of folder
pairs and the type of folders (e.g. network, local HDD, SSD, CD-ROM).
Posts: 15
Joined: 25 Mar 2005

lee_jay

I'm not sure how to do a parallel install so I'll answer what I know now
instead.

On 3.16 (I never had .19 installed), I had the same config file reading four
local folders and the corresponding four on another machine over wireless N
networking. This particular folder (one of the four on the network) would scan
at approximately 500-800 files per second on 3.16 and it scans at around 10-20
files per second on 3.20. The odd thing is that the other folders scan at
normal speed, just all in parallel instead of series. This is all on Win 7
64bit. This particular folder is a problem because it contains approximately
100,000 subfolders each with about one file in them, but like I said it was
faster before. If there's an easy way to do a parallel install, please let me
know and I'd be happy to do further testing for you.
Posts: 15
Joined: 25 Mar 2005

lee_jay

Okay...I figured out the parallel install thing but I couldn't get 3.16 to
scan at all (hit compare, screen flashes, that's it). Odd.

So then I went to time 3.20 and.....it's running at normal speed now! What the
heck? Could installing 3.16 separately have had any effect on anything? I
don't see how. Same computers, same network, same folders, same everything.
I'll keep an eye on it and report back if I find anything new.
Posts: 15
Joined: 25 Mar 2005

lee_jay

Okay 3.16 (from before, not today) about 670,000 elements 36:32, 3.20 - 16:48
(for compare only). That one odd folder with all the subfolders paces
everything.
Site Admin
Posts: 7058
Joined: 9 Dec 2007

Zenju

> Could installing 3.16 separately have had any effect on anything?
Practically impossible.

> 3.16 (from before, not today) about 670,000 elements 36:32, 3.20 - 16:48
> (for compare only)
So 3.20 is actually faster in your tests now??

> I figured out the parallel install thing
The simplest thing is to install v3.20 and 3.19 locally.
Posts: 15
Joined: 25 Mar 2005

lee_jay

> So 3.20 is actually faster in your tests now??

Yeah...it's weird. The day I posted the issue it was well on its way to a 4
hour compare. Then it did 16:48 a couple days later with no changes other than
installing 3.16 (which I agree should have nothing to do with anything). Like
I said, I'll keep an eye on it and if it acts up again I'll let you know. For
now, it's working well. Thanks!
Site Admin
Posts: 7058
Joined: 9 Dec 2007

Zenju

Good, so it seems this was some temporary issue then. I made a few tests
scanning two network drives in parallel (v3.20) compared to in sequence
(v3.19) and could reproduce a stable 30% perf improvement.
This indicates that the general multithreaded algorithm seems to work as
expected and traversing multiple network drives in parallel imposes no perf
drawback due to increased seeking times.
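
For anyone who wants to repeat this kind of comparison on their own folders, the sketch below times a sequential scan against a parallel one. It is only an illustration in Python, not the test harness used for the numbers above. All function names are hypothetical, and it uses one thread per folder, which assumes every folder sits on a different drive.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def count_files(folder):
    """Count the files below a folder with a plain recursive walk."""
    return sum(len(files) for _root, _dirs, files in os.walk(folder))


def scan_sequential(folders):
    """Scan the folders one after another."""
    return [count_files(folder) for folder in folders]


def scan_parallel(folders):
    """Scan all folders at once, one thread per folder."""
    if not folders:
        return []
    with ThreadPoolExecutor(max_workers=len(folders)) as pool:
        return list(pool.map(count_files, folders))


def timed(scan, folders):
    """Return the scan result together with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    counts = scan(folders)
    return counts, time.perf_counter() - start


def improvement_percent(sequential_seconds, parallel_seconds):
    """Time saved by the parallel run, as a percentage of the sequential time."""
    return 100.0 * (sequential_seconds - parallel_seconds) / sequential_seconds
```

Whichever scan runs second will usually benefit from directory data the operating system cached during the first run, so it is worth repeating each measurement a few times and alternating the order before trusting a percentage.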
Posts: 51
Joined: 13 May 2017

Lady Fitzgerald

> A possible idea for faster scanning:
>
> If multiple volumes are to be scanned, start processes to scan each unique
> volume all at the same time. I'm currently scanning 7 different volumes, and
> FFS does them all sequentially. Since the scan takes around half-an-hour, a
> significant savings could be had if they were all started at once.
>
> Thanks for a great utility!
> lee_jay, 29 Jul 2011, 20:01
Curious. I have scanned up to four volumes (four separate SSDs in this case) at a time using four separate instances of FFS running simultaneously.