Keep previous versions of files

Posts: 4
Joined: 21 Oct 2011

sm1987

Hello, I am new to this software. It looks really good, but I can't find how
to manage or keep previous versions of backed-up files.

I mean, I have a document, I run the sync, and it works OK. Then I change the
document, and in the next sync I want to get the new version while keeping
the older one...

How can I do that?
Site Admin
Posts: 7505
Joined: 9 Dec 2007

Zenju

2011-10-21 22:59:16 GMT

Set deletion handling to "versioning" in the synchronization settings. This
will keep older versions (= deleted or overwritten files) in a time-stamped,
user-defined directory.

sm1987

"Setup deletion handling to "versioning" in synchronization settings. This
will keep older versions (= deleted or overwritten files) in a time-stamped
user-defined directory."

Good thanks, working now.

1) The user-defined directory must be outside the folder that we back up.

2) It is a little confusing how the "deleted" files are saved. If I have a
folder named "EXAMPLE" containing a file named "FILE1", and every day I change
the content of the file and back it up with FreeFileSync, then when I need to
find a previous version of "FILE1" I don't have all the versions in the same
folder. I need to open every time-stamped folder and look inside to see
whether there is a version of "FILE1".

Good software anyway. Now I will check how it works with a large volume of
files.
Posts: 9
Joined: 21 Sep 2011

geoff987654

I agree with sm1987: having the old versions of files in individual
time-stamped folders is a problem.
Apart from the difficulty of finding the old copy, you don't know how long to
keep the date-stamped folders.
Some files are changed every day and some every week, month or year. If you
keep the folders for 6 months, you may have hundreds of copies of some files
and none of others.
I would like to be able to keep the last x copies of a file. Instead of just
writing the file to the appropriate folder, the program would check for a
previous one and rename it with something like a version number in the name,
preferably just before the extension.
I have been looking at writing a batch file to copy the files from the
date-stamped folders into one folder, renaming them on the way, but I am not a
programmer and I am struggling.
This would be a really useful feature.
Geoff
PS: I would be quite happy with a separate batch file, as this could run at
night or at some other quiet time.

sm1987

Good points geoff987654

"I would to be able to keep the last x copies of a file."

Zenju

> 1) The user-defined directory must be outside the folder that we back up
Or put it inside one of the folders, giving it some special name like
"OLD_VERSIONS", and use an exclude filter.

> when I need to find a previous version of "FILE1" I don't have all the
versions in the same folder
This task can be accomplished by using Explorer's search function (find all)
and sorting by date.
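
The same lookup can also be scripted. Below is a minimal sketch in Python, assuming the layout discussed so far (one time-stamped folder per sync directly under the versioning directory). The function name find_versions and the example paths are made up for illustration and are not part of FreeFileSync.

```python
from pathlib import Path


def find_versions(versioning_dir, relative_path):
    """Return every stored copy of relative_path, oldest first.

    Assumes each direct subfolder of versioning_dir is a time-stamped
    folder named like "2011-10-21 225916", so sorting the folder names
    as plain text also sorts them chronologically.
    """
    root = Path(versioning_dir)
    stamp_dirs = sorted(
        (p for p in root.iterdir() if p.is_dir()),
        key=lambda p: p.name,
    )
    hits = []
    for stamp_dir in stamp_dirs:
        candidate = stamp_dir / relative_path
        if candidate.is_file():
            hits.append(candidate)
    return hits
```

Calling find_versions(r"D:\OLD_VERSIONS", r"EXAMPLE\FILE1") would list each saved copy of FILE1 together with the time-stamped folder it lives in.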

sm1987

Great!

thanks Zenju for the tips!
Posts: 2
Joined: 9 Jan 2012

rdwil

It would be great if there were an option to set the number of versions
you want to keep and the base folder where the versions are stored. Something
like this, for example:

Number of versions: 3
Base folder: D:\old versions

So this would in the end create the following folders:
D:\old versions\version 1
D:\old versions\version 2
D:\old versions\version 3

Then D:\old versions\version 1\ would store all the files (with the folder
structure copied from the original) that were changed or deleted in the
original folder. If there is another change, the files are moved from D:\old
versions\version 1 to D:\old versions\version 2, etc.

I assume that you could make it so that when a file is changed, the same file
is moved from the current highest version (e.g. version 2) to the next version
(e.g. version 3) or deleted (if the maximum version has been reached). Then
this process repeats for the version next in rank until the original file can
be moved to version 1.

Versioning like this, with a folder for each version, would be very, very
useful.
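
The rotation proposed here can be sketched in a few lines of Python. This is only an illustration of the idea, not anything FreeFileSync does: the function name rotate_into_versions and its parameters are hypothetical, and the folder names follow the "version 1 .. version N" example above.

```python
import shutil
from pathlib import Path


def rotate_into_versions(changed_file, source_root, base_folder, max_versions=3):
    """Store changed_file in "version 1", shifting older copies up one slot."""
    rel = Path(changed_file).relative_to(source_root)
    base = Path(base_folder)

    # The copy in the highest slot has nowhere to go, so it is dropped.
    oldest = base / f"version {max_versions}" / rel
    if oldest.exists():
        oldest.unlink()

    # Shift from the highest remaining slot downwards so that no copy
    # is overwritten before it has been moved.
    for n in range(max_versions - 1, 0, -1):
        src = base / f"version {n}" / rel
        if src.exists():
            dst = base / f"version {n + 1}" / rel
            dst.parent.mkdir(parents=True, exist_ok=True)
            src.replace(dst)

    # Finally keep the current state of the file as "version 1".
    target = base / "version 1" / rel
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(changed_file, target)
    return target
```

The function would have to be called just before a file is overwritten or deleted, once per affected file.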

geoff987654

I have set FFS to put the deleted files in a temporary folder. I have written
a VBS script to move the files from the date-stamped folders into a permanent
folder with the same structure as the synced folder. It checks to see if the
file already exists and renames it if required. If more than the required
number of old copies exist, the oldest is deleted.
I call this from a .BAT file run as a scheduled task each day.
This has the advantage that it does not slow down the sync process.
The script is given below.
Usage - Script.vbs (Source folder) (Destination folder) (No. of versions; default is 5)
I am not a programmer, so I apologise for the state of the script. I am sure
there are redundant definitions and all sorts of other no-nos.
USE AT YOUR OWN RISK - PLEASE TEST WITH DUMMY FILES FIRST.
One problem I know of:
There is nothing in the script which makes sure the old files are numbered in
the correct order, but so far it has been OK.
I am more than happy for someone to tidy this up and repost it.
Geoff
' Moves files from FreeFileSync's date-stamped versioning folders into one
' permanent tree, keeping a limited number of numbered old copies per file.
Dim objFSO, objFolder, strOutput, strDestfolder, intBasepathlen
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set args = WScript.Arguments
If args.Count < 2 Then
    strOutput = " Source, Destination and No. versions required" & vbCrLf
    MsgBox strOutput
Else
    strRootFolder = args.Item(0)
    strDestfolder = args.Item(1)
    If args.Count > 2 Then
        ' Arguments arrive as strings; convert so the arithmetic below is numeric
        intVersions = CInt(args.Item(2))
    Else
        intVersions = 5
    End If
    If Not objFSO.FolderExists(strRootFolder) Then
        strOutput = "Source Folder " & vbCrLf & strRootFolder & vbCrLf & " Does not exist" & vbCrLf
        MsgBox strOutput
    Else
        ' Make sure the source path ends with a backslash
        If Right(strRootFolder, 1) <> "\" Then
            strRootFolder = strRootFolder & "\"
        End If
        If Not objFSO.FolderExists(strDestfolder) Then
            strOutput = "Destination Folder " & vbCrLf & strDestfolder & vbCrLf & " Does not exist" & vbCrLf
            MsgBox strOutput
        Else
            ' Make sure the destination path ends with a backslash
            If Right(strDestfolder, 1) <> "\" Then
                strDestfolder = strDestfolder & "\"
            End If
            intLevel = 0

            Call GetFolderSize (strRootFolder)

        End If
    End If
End If

' Walks strFolderPath recursively and transfers every file it finds to the
' matching place under strDestfolder, rotating existing copies first.
Function GetFolderSize (strFolderPath)
    Dim objCurrentFolder, colSubfolders, objFolder, colFiles, objFile, strOldfile
    Dim strNewfile, strFiletest, strNewfolder
    Dim strFileName, strFileExt, strFileRen, intcounter
    intLevel = intLevel + 1
    Set objCurrentFolder = objFSO.GetFolder(strFolderPath)
    Set colSubfolders = objCurrentFolder.SubFolders
    For Each objFolder In colSubfolders
        ' At the top level each subfolder is one date-stamped folder; remember
        ' where the relative path starts (after its name and the backslash).
        If intLevel = 1 Then
            intBasepathlen = Len(objFolder.Path) + 2
        End If
        strNewfolder = strDestfolder & Mid(objFolder.Path, intBasepathlen)
        If Not objFSO.FolderExists(strNewfolder) Then
            objFSO.CreateFolder(strNewfolder)
        End If
        Set colFiles = objFolder.Files
        For Each objFile In colFiles
            strOldfile = objFile.Path
            strNewfile = strDestfolder & Mid(objFile.Path, intBasepathlen)
            If objFSO.FileExists(strNewfile) Then
                ' Split into name and extension; a file without an extension
                ' simply gets the {n} counter appended to its name.
                strFileExt = objFSO.GetExtensionName(strNewfile)
                If strFileExt = "" Then
                    strFileName = strNewfile
                Else
                    strFileName = Left(strNewfile, Len(strNewfile) - Len(strFileExt) - 1)
                    strFileExt = "." & strFileExt
                End If
                ' Drop the oldest copy if the maximum number already exists
                strFileRen = strFileName & "{" & intVersions & "}" & strFileExt
                If objFSO.FileExists(strFileRen) Then
                    objFSO.DeleteFile strFileRen, True
                End If
                ' Rename in descending order: {n} becomes {n+1}
                For intcounter = intVersions - 1 To 1 Step -1
                    strFiletest = strFileName & "{" & intcounter & "}" & strFileExt
                    If objFSO.FileExists(strFiletest) Then
                        strFileRen = strFileName & "{" & intcounter + 1 & "}" & strFileExt
                        objFSO.MoveFile strFiletest, strFileRen
                    End If
                Next
                ' The existing unnumbered copy becomes {1}
                objFSO.MoveFile strNewfile, strFileName & "{1}" & strFileExt
            End If
            strNewfolder = Left(strNewfile, InStrRev(strNewfile, "\"))
            objFSO.CopyFile strOldfile, strNewfolder
        Next
        GetFolderSize (objFolder.Path)
        ' Once a date-stamped folder has been fully processed, remove it
        If intLevel = 1 Then
            objFSO.DeleteFolder objFolder, True
        End If
    Next
    intLevel = intLevel - 1
End Function

rdwil

Thank you geoff987654.

I am unfortunately not a programmer and do not understand how to implement
your solution. I understand that you need to run this script separately from
FFS? Could that not create the possibility of running the script too often
or too rarely, so that file versions are not handled correctly?

Thank you

geoff987654

I am using the programs as follows.
RealtimeSync monitors 2 folders (one local and one on a mapped network drive).
In the FreeFileSync profile that is triggered by a change to either folder,
versioning is set to put the deleted files in the folder
C:\synchistory\TempArch.
So at the end of the day TempArch contains a number of date-stamped folders
with names like 2012-01-10 183102, containing the deleted files in a folder
tree that matches their original location.
I have a simple batch file that contains the following:
"C:\Program Files\MyProgs\Singlefolder.vbs" "C:\synchistory\TempArch" "C:\synchistory\PermArch"
A Windows scheduled task runs this batch file at 3:00 am. It runs my VBS
script Singlefolder.vbs. I did it this way because I did not want to hard-code
the folder names into the script but could not get the Task Scheduler to
pass them to the script (messy, but it works).
The script works its way through the date-stamped folder till it finds a file.
It then looks at the corresponding point in the folder tree in the PermArch
folder. If there is no file of that name there, it moves the file from
TempArch to PermArch. If there is a file there, it checks how many copies
exist and deletes any beyond 5, then renames the files so that file becomes
file{1}, file{1} becomes file{2}, etc. (it actually has to be done in
descending order). Having made space, it now moves the file from TempArch to
PermArch.
It then goes back to working through the date-stamped folder, moving any files
it finds as above. When it reaches the end it finally deletes the folder and
starts on the next one.
By the end of the process the TempArch folder is empty and all the files have
been moved to PermArch. This is why you should test it with some dummy files.
Copy old date-stamped folders and use those until you are happy it is working.
To set up the script:
Copy the text following my name from my last post into a text file using
Notepad and save it as something like C:\Program
Files\MyProgs\Singlefolder.vbs
Copy the text from my batch file above, change the folder and file names to
suit your setup, then save it as something like C:\Program
Files\MyProgs\Singlefolder.bat. Then set up a Windows task to run the batch
file when it suits you. You need not worry about running the batch file too
often. If the temp folder is empty it just does nothing.

I hope this helps

Zenju

Maybe a "parsable" expression is the way to go. Some food for thought:

File to be deleted: C:\BaseDirectory\SubDirectory\SomeFile.txt

%version% = 2011-10-15 023455
%baseDir% = C:\BaseDirectory\
%relPath% = SubDirectory\
%shortName% = SomeFile
%extension% = .txt

G:\VERSIONING\%version%\%relPath%%shortName%%extension%
=> G:\VERSIONING\2011-10-15 023455\SubDirectory\SomeFile.txt

G:\VERSIONING\%relPath%%shortName%.%version%%extension%
=> G:\VERSIONING\SubDirectory\SomeFile.2011-10-15 023455.txt

%baseDir%%relPath%%shortName%.%version%%extension%
=> C:\BaseDirectory\SubDirectory\SomeFile.2011-10-15 023455.txt

%baseDir% could be used to keep files in their respective source volume
(currently not supported), instead of moving both sides to a single volume.
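
To make the substitution concrete, here is a small Python sketch of how such a pattern could be expanded. It is purely illustrative: the function name expand_versioning_pattern is invented, and the placeholder names are simply the ones proposed above, not an existing FreeFileSync feature.

```python
from pathlib import PureWindowsPath


def expand_versioning_pattern(pattern, file_path, base_dir, version):
    """Fill the %...% placeholders of pattern for one file to be versioned."""
    full = PureWindowsPath(file_path)
    rel = full.relative_to(PureWindowsPath(base_dir))
    rel_dir = str(rel.parent)  # "." when the file sits directly in base_dir
    values = {
        "%version%": version,
        "%baseDir%": str(PureWindowsPath(base_dir)).rstrip("\\") + "\\",
        "%relPath%": "" if rel_dir == "." else rel_dir + "\\",
        "%shortName%": full.stem,
        "%extension%": full.suffix,
    }
    for placeholder, value in values.items():
        pattern = pattern.replace(placeholder, value)
    return pattern
```

With the sample file above and the pattern G:\VERSIONING\%relPath%%shortName%.%version%%extension%, the function returns G:\VERSIONING\SubDirectory\SomeFile.2011-10-15 023455.txt, matching the second example.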

Anonymous

With regard to the comments by geoff987654 and sm1987 about the problem with
individual time-stamped folders as the versioning method, I agree. However,
I've come up with a script that creates hardlinks to the files in the
time-stamped folders, arranged in a single "tree view" of those files. Here is
the code, with the header explaining this in more detail.




@goto :BEGIN
=============================================================================================

HardLinks - create hardlinks of files under date/time-named directories created by FreeFileSync revisioning backup
to a single tree structure with file version references

For example, a FreeFileSync Versioning file tree structure like this:

.\BACKUP\REVISIONS
|
+---2012-07-28 151719
|   \---3
|           TestFile.txt
|
+---2012-07-28 152033
|   \---3
|       \---c
|               TestFile.txt
|
\---2012-07-28 152130
    \---3
        \---c
                TestFile.txt

creates the following tree structure with hardlinks to source files:


.\BACKUP\REVISIONS\_HARDLINKS
|
\---3
    |   TestFile_2012-07-28 151719.txt
    |
    \---c
            TestFile_2012-07-28 152033.txt
            TestFile_2012-07-28 152130.txt


This script should exist in the base directory where FreeFileSync creates date/time-named directories.
Directories processed will be moved to a REVISIONS subdirectory, and hardlinks are created under
REVISIONS\_HARDLINKS. For example:

.\REVISIONS
+---_HARDLINKS
|   +---1
|   |   \---a
|   +---2
|   |   \---b
|   \---3
|       \---c
|
+---2012-07-28 151106
|   \---1
|
+---2012-07-28 151520
|   +---1
|   |   \---a
|   \---2
|       \---b
|
+---2012-07-28 151719
|   \---3
|
+---2012-07-28 152033
|   +---1
|   |   \---a
|   +---2
|   |   \---b
|   \---3
|       \---c
|
\---2012-07-28 152130
    \---3
        \---c


- A text file of the tree structure and files can be found in REVISIONS\_Hardlinks\Tree.txt

- A text file containing a list of source files and corresponding hardlinks can be found in REVISIONS\_HardLinks.lst
This file is used by "HARDLINKS.CMD CLEAN" command to clean up unpaired files. Deleting a file from the date/time branch
or _Hardlinks branch does not delete the "paired" file. Running:
HARDLINKS.CMD CLEAN
will process the log file, and for each pair listed, if either of the files does not exist
(the link or source file was deleted, but not the other) then the remaining file will be deleted

the REVISIONS\_HardLinks.lst file is updated with the current existing files

Files deleted are recorded in REVISIONS\Deleted.lst

The clean process is automatically run each time you run HARDLINKS.CMD.
To disable the automatic CLEAN process each time HARDLINKS is run, set AUTOCLEAN=N below

You can manually run the CLEAN process by passing CLEAN as a command line parameter:
HARDLINKS CLEAN


==================================================================================================================
:BEGIN
@echo off & setlocal ENABLEDELAYEDEXPANSION

set REVSUBDIR=REVISIONS
set HARDLINKSDIR=_HARDLINKS
set AUTOCLEAN=Y

if NOT EXIST %REVSUBDIR% (
    fsutil>nul 2>&1
    if errorlevel 1 goto :FSUTILERR
    echo Fsutil OK
    fsutil fsinfo volumeinfo %~D0 | find /c "NTFS"
    if ERRORLEVEL 1 goto :NOTNTFS
    md %REVSUBDIR%
)

set PWD=%CD%

if /I "%~1"=="CLEAN" goto :CLEAN

for /F "tokens=*" %%A in ('dir 20*. /ad /b') do (
    TITLE Processing %%A
    echo.
    echo ================================================
    echo Processing %%A
    call :PROCESS "%%A"
)
tree /A /F %REVSUBDIR%\%HARDLINKSDIR% >%REVSUBDIR%\%HARDLINKSDIR%\Tree.txt
goto :EOF

:PROCESS
set REV=%~1
move "%REV%" %REVSUBDIR%>nul
set RELPATH=%PWD%\%REVSUBDIR%\%REV%
for /f "tokens=*" %%a in ('dir "%REVSUBDIR%\%REV%\*.*" /s /b /a-d') do (
    echo %%a
    set SRCFILE=%%a
    set RELSRCFILE=!SRCFILE:%RELPATH%=!
    call :HARDLINK "%%a"
)
if /I "%AUTOCLEAN%"=="Y" goto :CLEAN
if "%~1"=="" pause
goto :EOF

:HARDLINK
set FILEROOT=%~N1
set FILEEXT=%~X1
set RELSRCPATH=!RELSRCFILE:%~NX1=!
set RELSRCPATH=!RELSRCPATH:~1,-1!
if NOT EXIST "%PWD%\%REVSUBDIR%\%HARDLINKSDIR%\!RELSRCPATH!" md "%PWD%\%REVSUBDIR%\%HARDLINKSDIR%\!RELSRCPATH!"
echo.
fsutil hardlink create "%PWD%\%REVSUBDIR%\%HARDLINKSDIR%\!RELSRCPATH!\%FILEROOT%_%REV%%FILEEXT%" "!SRCFILE!"
echo %PWD%\%REVSUBDIR%\%HARDLINKSDIR%\!RELSRCPATH!\%FILEROOT%_%REV%%FILEEXT%^|!SRCFILE! >> %REVSUBDIR%\_HardLinks.lst
goto :EOF

:CLEAN
if EXIST "%TEMP%\_HardLinks.tmp" del "%TEMP%\_HardLinks.tmp"
for /F "tokens=1,2 delims=^|" %%A in (%REVSUBDIR%\_HardLinks.lst) do (
    set DEL=
    if NOT exist "%%A" set DEL=%%B
    if NOT exist "%%B" set DEL=%%A
    if NOT "!DEL!"=="" (
        echo Deleting !DEL!
        DEL "!DEL!"
        echo %DATE%: !DEL! >> %REVSUBDIR%\Deleted.lst
    ) ELSE (
        echo %%A^|%%B >> "%TEMP%\_HardLinks.tmp"
    )
)
type "%TEMP%\_HardLinks.tmp" > %REVSUBDIR%\_HardLinks.lst
if "%~1"=="" pause
goto :EOF

:FSUTILERR
echo.
fsutil
echo.
pause
goto :EOF

:NOTNTFS
echo.
echo %~D0 is not an NTFS volume
echo.
pause
goto :EOF

Zenju

@BKeadle: I really like your new file hierarchy! By restricting the new naming
convention to the "REVISIONS" folder there is no conflict with old versions
during scanning. Thus there is no need for some fancy exclusion rule filtering
out some "magic" phrase. In fact I do not see any reason why your naming
convention should not be the FFS default. It would even make a setting like
"max number of revisions" feasible in the first place, while with the current
hierarchy it is rather pointless.
Advocates of the current naming scheme had better tell me here. If there is no
compelling reason to keep it, I'll replace it with BKeadle's version
(*without* providing an option for the user to select).
Posts: 7
Joined: 1 Aug 2012

clh42

I JUST found this software today and looks like I found it just in time. I
also noticed the version handling as a bit short of the capabilities of some
other sync software I've tried. My biggest thing is that I also want the
capability to have FFS automatically delete older revisions, with options
for different criteria. Most of what I've read from others' posts have said
"after x revisions" but I'd also like to see "after x
days/weeks/months/years".

Two other comments I have regarding how FFS tracks versions. Now, I have to
admit that I have NOT actually done a sync yet so this is based on how I THINK
this works (because I don't want to actually mess up a previous sync from
other software I already have until I'm sure how it works). I'm also doing a
Mirror operation instead of a 2-way sync.

First, I'm assuming that my true specified destination directory still always
contains my most current version from my source. Is that correct?

So, my first comment is on how to store old revisions. I do like being able to
specify a custom revisions directory, but instead of storing every file in a
separate dated directory, why not just mimic the exact same subdirectory
structure and just append the revision date to the file NAME all within the
same directory? For example, filename.exe.YYYY-MM-DD
Then you'd have all your revisions easily identifiable in the same directory.
This is how another piece of sync software I found does it.

My other comment is that I agree with someone else who posted that if I were
doing a 2-way sync, I'd like to be able to specify separate custom versioning
directories for the left and right sides.

Thank you for your consideration.

clh42

I guess you need to keep the time in the revision info as well as the date, so
using the example of keeping the same directory and just appending the
revision date/time to the actual filename, you end up with
filename.ext.YYYY-MM-DD-HHMMSS
and I suppose HH should be in 24-hour format so that the files sort in the
correct order in Explorer.

And just to clarify, when I say "same directory", I don't mean that the old
revisions are literally kept in the same actual destination directory, I just
mean that the specified custom versioning directory keeps the same exact
structure as the actual destination directory and doesn't need date based
subdirectories within it.

Zenju

> assuming that my true specified destination directory still always contains
my most current version from my source


Yes. You should make a few tests first with sample folders if you are afraid
to damage your production data.


> I do like being able to specify a custom revisions directory, but instead...


This seems identical to BKeadle's suggestion.


> I'd like to be able to specify separate custom versioning directories for
the left and right sides.


Except for performance reasons, this is a minor requirement. Assuming that a
two-way sync models a shared-ownership data source, then there is logically
only the need for a single revisions directory.


> filename.ext.YYYY-MM-DD-HHMMSS


So far the plan for the new naming convention is:
<revisions directory>\<subdir>\<filename>.YYYY-MM-DD HHMMSS.<ext>
In contrast to current convention:
<revisions directory>\YYYY-MM-DD HHMMSS\<subdir>\<filename>.<ext>
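
The difference between the two conventions is just a rearrangement of the same pieces. A minimal Python sketch of the mapping, assuming paths relative to the revisions directory (the function name old_to_new_layout is made up for illustration):

```python
from pathlib import PureWindowsPath


def old_to_new_layout(old_relative_path):
    """Map a path in the current layout to the planned layout.

    Both paths are relative to the revisions directory:
      current:  YYYY-MM-DD HHMMSS / subdir / filename.ext
      planned:  subdir / filename.YYYY-MM-DD HHMMSS.ext
    """
    parts = PureWindowsPath(old_relative_path).parts
    if len(parts) < 2:
        raise ValueError("expected at least <time stamp>\\<file name>")
    stamp, subdirs, filename = parts[0], parts[1:-1], PureWindowsPath(parts[-1])
    # The time stamp moves from the top-level folder into the file name,
    # directly in front of the extension.
    new_name = f"{filename.stem}.{stamp}{filename.suffix}"
    return str(PureWindowsPath(*subdirs, new_name))
```

For example, "2012-07-28 152033\3\c\TestFile.txt" maps to "3\c\TestFile.2012-07-28 152033.txt".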

clh42

Sounds good to me. I kind of like putting the revision date and time before
the extension so the file still works with any file associations for it. I
look forward to trying this, but also still really want to see the ability to
set an automatic delete of old versions after a certain # of revisions and/or
date period.

I didn't quite follow BKeadle's post about the hardlinks before I posted my
previous reply but now that I reread it I guess he was doing the same thing
just in a different way.

I guess the biggest issue I'd see for existing FFS users is how they migrate
to the new revisioning structure if they upgrade their FFS version. They'd
apparently end up with the old structure for old versions for syncs performed
before the update and the new structure for syncs performed after the update.

Zenju

> migrate to the new revisioning structure


Hm, this problem could probably be solved by a script. I'm looking in
BKeadle's direction... ;)

Zenju

Here is the update implementing the new versioning scheme:
[404, Invalid URL: http://freefilesync.sourceforge.net/FreeFileSync_5.7_beta_setup.exe]

Let me know if you find any problems.

Anonymous

Hey! Sorry, I didn't know this conversation was going on. Though I have
"Monitor this" checked, I didn't receive any notifications about these
updates.

I had since updated the hardlinks.cmd to provide a more flexible choice of
delimiter between the file-root and file-extension - but yes, the date/version
reference should be a suffix of the file-root, preserving the file-extension
so that associations to the filetype stay intact. Having a choice of delimiter
allows for flexibility to generate a file listing and then act on the file
listing through other utilities/scripts, so that (file-root) (delim)
(time-stamped-version) (file-extension) is easily parsed for actions. Simply
modify the DELIM= variable just under the :BEGIN label to a valid delimiter of
your choice. I'm using "---" by default.

As for migrating to the "new revision structure", that's easy. Run
hardlinks.cmd against the old structure to create the new structure, then just
delete the old structure! When you delete the old structure, you're just
removing one link from each file's link count, not actually deleting any files
(unless the file is down to 1 remaining hardlink) - TEST FIRST !!! Do
I understand that correctly about the transition between the old and the new?
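
The link-count behaviour described here can be checked with a few lines of Python on any file system that supports hard links. This is a stand-alone demonstration with made-up file names, not part of hardlinks.cmd:

```python
import os
import tempfile


def demo_link_count():
    """Create a file plus one hard link, delete the original, report what is left."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "TestFile.txt")
        link = os.path.join(tmp, "TestFile---2012-07-28 151719.txt")
        with open(original, "w") as f:
            f.write("revision data")

        os.link(original, link)                 # second name for the same data
        count_with_both = os.stat(link).st_nlink

        os.remove(original)                     # removes one name only
        count_after_delete = os.stat(link).st_nlink
        with open(link) as f:
            survived = f.read()

        return count_with_both, count_after_delete, survived
```

The data only disappears once the last remaining name has been removed, which is why deleting the old date-stamped tree leaves the linked copies readable.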





@goto :BEGIN
==================================================================================

HardLinks - create hardlinks of files under date/time-named directories created by FreeFileSync revisioning backup
to a single tree structure with file version references

For example, a FreeFileSync Versioning file tree structure like this:

.\BACKUP\REVISIONS
|
+---2012-07-28 151719
|   \---3
|           TestFile.txt
|
+---2012-07-28 152033
|   \---3
|       \---c
|               TestFile.txt
|
\---2012-07-28 152130
    \---3
        \---c
                TestFile.txt

creates the following tree structure with hardlinks to source files:


.\BACKUP\REVISIONS\_HARDLINKS
|
\---3
    |   TestFile_2012-07-28 151719.txt
    |
    \---c
            TestFile_2012-07-28 152033.txt
            TestFile_2012-07-28 152130.txt


This script should exist in the base directory where FreeFileSync creates date/time-named directories.
Directories processed will be moved to a REVISIONS subdirectory, and hardlinks are created under
REVISIONS\_HARDLINKS. For example:

.\REVISIONS
+---_HARDLINKS
|   +---1
|   |   \---a
|   +---2
|   |   \---b
|   \---3
|       \---c
|
+---2012-07-28 151106
|   \---1
|
+---2012-07-28 151520
|   +---1
|   |   \---a
|   \---2
|       \---b
|
+---2012-07-28 151719
|   \---3
|
+---2012-07-28 152033
|   +---1
|   |   \---a
|   +---2
|   |   \---b
|   \---3
|       \---c
|
\---2012-07-28 152130
    \---3
        \---c


- A text file of the tree structure and files can be found in REVISIONS\_Hardlinks\Tree.txt

- A text file containing a list of source files and corresponding hardlinks can be found in REVISIONS\_HardLinks.lst
This file is used by "HARDLINKS.CMD CLEAN" command to clean up unpaired files. Deleting a file from the date/time branch
or _Hardlinks branch does not delete the "paired" file. Running:
HARDLINKS.CMD CLEAN
will process the log file, and for each pair listed, if either of the files does not exist
(the link or source file was deleted, but not the other) then the remaining file will be deleted

the REVISIONS\_HardLinks.lst file is updated with the current existing files

Files deleted are recorded in REVISIONS\Deleted.lst

The clean process is automatically run each time you run HARDLINKS.CMD.
To disable the automatic CLEAN process each time HARDLINKS is run, set AUTOCLEAN=N below

You can manually run the CLEAN process by passing CLEAN as a command line parameter:
HARDLINKS CLEAN


==================================================================================================================
:BEGIN
@echo off & setlocal ENABLEDELAYEDEXPANSION

set AUTOCLEAN=Y
set DELIM=---
set REVSUBDIR=REVISIONS
set HARDLINKSDIR=_HARDLINKS

if NOT EXIST %REVSUBDIR% (
    fsutil>nul 2>&1
    if errorlevel 1 goto :FSUTILERR
    echo Fsutil OK
    fsutil fsinfo volumeinfo %~D0 | find /c "NTFS"
    if ERRORLEVEL 1 goto :NOTNTFS
    md %REVSUBDIR%
)

set PWD=%CD%

if /I "%~1"=="CLEAN" goto :CLEAN

for /F "tokens=*" %%A in ('dir 20*. /ad /b') do (
    TITLE Processing %%A
    echo.
    echo ================================================
    echo Processing %%A
    call :PROCESS "%%A"
)
tree /A /F %REVSUBDIR%\%HARDLINKSDIR% >%REVSUBDIR%\%HARDLINKSDIR%\Tree.txt
goto :EOF

:PROCESS
set REV=%~1
move "%REV%" %REVSUBDIR%>nul
set RELPATH=%PWD%\%REVSUBDIR%\%REV%
for /f "tokens=*" %%a in ('dir "%REVSUBDIR%\%REV%\*.*" /s /b /a-d') do (
    echo %%a
    set SRCFILE=%%a
    set RELSRCFILE=!SRCFILE:%RELPATH%=!
    call :HARDLINK "%%a"
)
if /I "%AUTOCLEAN%"=="Y" goto :CLEAN
if "%~1"=="" pause
goto :EOF

:HARDLINK
set FILEROOT=%~N1
set FILEEXT=%~X1
set RELSRCPATH=!RELSRCFILE:%~NX1=!
set RELSRCPATH=!RELSRCPATH:~1,-1!
if NOT EXIST "%PWD%\%REVSUBDIR%\%HARDLINKSDIR%\!RELSRCPATH!" md "%PWD%\%REVSUBDIR%\%HARDLINKSDIR%\!RELSRCPATH!"
echo.
fsutil hardlink create "%PWD%\%REVSUBDIR%\%HARDLINKSDIR%\!RELSRCPATH!\%FILEROOT%%DELIM%%REV%%FILEEXT%" "!SRCFILE!"
echo %PWD%\%REVSUBDIR%\%HARDLINKSDIR%\!RELSRCPATH!\%FILEROOT%%DELIM%%REV%%FILEEXT%^|!SRCFILE! >> %REVSUBDIR%\_HardLinks.lst
goto :EOF

:CLEAN
if EXIST "%TEMP%\_HardLinks.tmp" del "%TEMP%\_HardLinks.tmp"
for /F "tokens=1,2 delims=^|" %%A in (%REVSUBDIR%\_HardLinks.lst) do (
    set DEL=
    if NOT exist "%%A" set DEL=%%B
    if NOT exist "%%B" set DEL=%%A
    if NOT "!DEL!"=="" (
        echo Deleting !DEL!
        DEL "!DEL!"
        echo %DATE%: !DEL! >> %REVSUBDIR%\Deleted.lst
    ) ELSE (
        echo %%A^|%%B >> "%TEMP%\_HardLinks.tmp"
    )
)
type "%TEMP%\_HardLinks.tmp" > %REVSUBDIR%\_HardLinks.lst
if "%~1"=="" pause
goto :EOF

:FSUTILERR
echo.
fsutil
echo.
pause
goto :EOF

:NOTNTFS
echo.
echo %~D0 is not an NTFS volume
echo.
pause
goto :EOF

Zenju

Very nice, if someone needs help on migration, I'll reference this batch file!
I just finished migrating my huge FFS revisions directory using the script and
have a few comments:


> ('dir 20*. /ad /b') do


Newer FFS versioning schemes name the sub directories "<batch job name>
<timestamp>", e.g. "SyncJob 2012-12-12 121212". I changed this to "dir *. /ad
/b" and used a mass file renamer to remove the batch job name from the target
hard links.


> dir "%REVSUBDIR%\%REV%\*.*"


I had the problem that some files without an extension located in sub
directories were moved to the "REVISIONS" directory, but no entries in
"_HARDLINKS" were created. I first thought changing "*.*" to "*" would be the
fix, but apparently this didn't help. Files without an extension are not
very important in my case, so I didn't investigate further.

The "CLEAN" step leaves empty folders behind; also, "Deleted.lst" and
"_HardLinks.lst" had to be removed manually. Well, not a big deal.

And a (well-known) performance hint: minimizing the console window during work
speeds up processing by orders of magnitude (some newbie programmer should
tell MS how to fix this; insects on the other hand do appreciate the 200 Hz
refresh rate at the expense of CPU).

I have yet again updated the versioning scheme, based on my first-hand
experience with the migration:
[404, Invalid URL: http://freefilesync.sourceforge.net/FreeFileSync_5.7_beta_setup.exe]

new scheme:
<revisions directory>\<subdir>\<filename>.<ext> YYYY-MM-DD HHMMSS.<ext>
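
As an illustration of this scheme, the Python sketch below builds a versioned name and splits it back into the original file name and time stamp. The helper names versioned_name and split_versioned_name are hypothetical, and the handling of files without an extension (the time stamp is simply appended) is an assumption, since the post does not say what happens in that case.

```python
import re
from datetime import datetime

STAMP_FORMAT = "%Y-%m-%d %H%M%S"

# <original name> <YYYY-MM-DD HHMMSS>[.<ext>]
_VERSIONED = re.compile(
    r"^(?P<name>.+) (?P<stamp>\d{4}-\d{2}-\d{2} \d{6})(?P<ext>\.[^.]*)?$"
)


def versioned_name(filename, when):
    """Build "<filename>.<ext> YYYY-MM-DD HHMMSS.<ext>" for one revision."""
    dot = filename.rfind(".")
    ext = filename[dot:] if dot > 0 else ""   # assumption: no extension, no suffix
    return f"{filename} {when.strftime(STAMP_FORMAT)}{ext}"


def split_versioned_name(versioned):
    """Return (original file name, time stamp), or None if the name does not match."""
    match = _VERSIONED.match(versioned)
    if match is None:
        return None
    return match.group("name"), datetime.strptime(match.group("stamp"), STAMP_FORMAT)
```

Because the original file name, extension included, is kept in front of the time stamp, all revisions of one file sort next to each other, and repeating the extension at the end keeps the file associations working.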

Anonymous

Thanks. Good points about some of the details in that script - I'll have to
take a second look. You point out that this script would become obsolete 88
years from now - we wouldn't want that to happen! :-)

However...after reviewing the purpose of my hardlinks.cmd script and this
forum thread with a peer, your pre-beta method of versions copy is still of
interest. I think I'd miss having your revision directory structure - I like
having both - a time-stamped tree of revisions as you have now (pre-beta)
*and* my hard linked structure (though that is dependent on a supported file
system). Could it be a configurable option?

Zenju

Generally I try to avoid options, whose existence most of the time
demonstrates a lack of understanding of the relevant scenarios. Why would you
like the old revisions structure in some cases? Programmatically the old
structure makes it easy (and efficient) to implement a "limit revisions"
parameter, but from a user's point of view the new layout seems to functionally
absorb the old one.

As a side note I plan to implement a "limit" of some kind. I see three
candidates:
1. revision limit per file
2. date limit per file (remove revisions older than x days)
3. limit by total file size (this would essentially be a Windows recycle bin re-implementation)

My design decisions are based on two key scenarios that "versioning" is
supposed to facilitate:
1. a recycle bin replacement for volumes that do not support it, to resolve accidental deletion.
2. a simple CVS replacement

1.) demands a "limit revisions" parameter with value of "1"
2.) does not need a limit

Are there any other key scenarios that I have failed to consider?

clh42

I'm only doing 1-way syncs to an external USB hard drive as a "backup". For
me, the revision history is a "just in case" scenario if I'd want to get back
an old version of a file, or a deleted file. I figure if I haven't needed to
get an old/deleted file in "x" days (which, based on my testing with a paid
application whose free trial I was using until it expired, would be 30
days), then I don't need it after that.

I've never used the Recycle Bin, all the way back to Windows 95. For some
reason, as a techy person, I just don't like the idea of it taking up space on
my active hard drive (granted, it's somewhat of an unreasonable dislike of the
Recycle Bin, but I've just never liked it as a tech person). I shift-delete
everything to bypass the Recycle Bin. But having an automatically purged,
"x"-day retention copy on my explicit backups, separate from my active drive,
I like.

However, our backups at work on all of our servers are based on x revisions,
not days, so I'm familiar with it either way.

Why not implement both as separate options, maybe via radio buttons to select
which we want? You're going to have to have a config option to specify the "x"
value either way, whether it's days or revisions. Why not go the extra step of
providing the option to choose whether we want it based on # of revisions or
based on time period?
Posts: 24
Joined: 25 Nov 2009

bkeadle

I like, and welcome, the limit revision ideas you offer. As for your
"previous" revision tree, it's nice to see in Explorer's tree view the files
that have changed on a given run of a sync. True, I could just enable logging
and refer to the log, but somehow the revision tree is easier. In the new
revision structure, I suppose I could do a search on the revision date/time
stamp and see a tree result that way (sorta) of changed files on a given sync
pass. If having 2 different options for revision copies (like file compare
Time & Size vs. File Content) is problematic, so be it. The new structure may
well be preferred by most.
User avatar
Site Admin
Posts: 7505
Joined: 9 Dec 2007

Zenju

> I could do a search on the revision date/time stamp


This will be a feasible way to solve the problem. FFS takes care to use the
same time stamp for all files revisioned within a sync session and for the log
file name as well. Generally, having a simple revisions hierarchy that matches
the user's source data seems more important than being able to quickly find
the "diff" of the last update.


> I figure if I haven't needed to get an old/deleted file in in "x" days that
I don't need it after that.


This is a good reason to require a limit on the date of deletion as it is
encoded in the time-stamp appended to the file name (as opposed to the last
file modification time).
I'm almost inclined to add two options "limit revision count" and "limit to
last x days". But there is a problem with limiting the time: In order to find
all the outdated files, FFS would need to scan the entire versioning directory
at the end of every sync. If the data is stored on a slow device like a USB
stick or on a network share, the performance cost may be prohibitive.
For "limit revision count" on the other hand it's easy and fast (at least it
scales with the number of deletions) to limit the number of a group of files
located in the same directory.
Also it seems a "limit revision count" could also solve your problem of
providing means to undo an accidental deletion? If this is true, we would only
need "limit revisions count" as an option.

There is also the question how to handle symlinks. For the old versioning
scheme, it seemed obvious that both file and directory symlinks should be
moved. I think this is also true with the new versioning in the case of file
symlinks. But directory symlinks may be a different story: Currently (in the
beta) I do create directories lazily when versioning, i.e. only when needed.
This simplifies the code internally, but may be actually even useful. Also the
time-stamp is not applied to the directories that are revisioned. So for
directory symlinks this seems to indicate that they should simply be deleted
without revisioning!? Any outrageous complaints about this plan?
Posts: 24
Joined: 25 Nov 2009

bkeadle

> But there is a problem with limiting the time: In order to find all the
outdated files, FFS would need to scan the entire versioning directory at the
end of every sync.



I don't understand why this would be any more difficult than the revision
count. For any given file you're making a revision copy, couldn't you just do
a wildcard search for that file, and delete any of the resulting files that
exceed the time limit? e.g.:



dir %filename-root%*.%extension%
for each file if older than x days, delete
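The wildcard idea above could be sketched in Python roughly as follows. This is only an illustration, not FFS code: it assumes the new naming syntax quoted in this post (`<filename>.<ext> YYYY-MM-DD HHMMSS.<ext>`), takes the directory listing as a plain list of names, and the function name `expired_revisions` is made up for the example. It judges age by the time stamp in the file name.

```python
import re
from datetime import datetime, timedelta

# Assumed suffix appended to a revisioned file: " YYYY-MM-DD HHMMSS" plus an optional extension.
STAMP_SUFFIX = re.compile(r" (\d{4}-\d{2}-\d{2} \d{6})(?:\.[^.]*)?")


def expired_revisions(names, original_name, now, max_age_days):
    """Return the revisions of original_name whose time stamp is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    expired = []
    for name in names:
        # Equivalent of the wildcard search: only look at revisions of this one file.
        if not name.startswith(original_name + " "):
            continue
        match = STAMP_SUFFIX.fullmatch(name[len(original_name):])
        if match is None:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H%M%S")
        if stamp < cutoff:
            expired.append(name)
    return expired
```

Note that this only ever looks at revisions of the file currently being versioned; revisions of files that are never synced again are not visited.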



Also, you mention the new versioning naming syntax:

> <revisions directory>\<subdir>\<filename>.<ext> YYYY-MM-DD HHMMSS.<ext>



I would suggest/prefer instead:

> <revisions directory>\<subdir>\<filename> YYYY-MM-DD HHMMSS.<ext>


with an optional delimiter between <filename> and date/time stamp, for the
reasons previously mentioned in my hardlinks.cmd?

Also, what are your thoughts about being able to *RESTORE* files from a
certain date - sync back? Are we left to our own devices for copying back a
revisioned file and having to handle the removal of the date/time suffix?
*THIS* actually may be a good reason to preserve (as an option/alternative)
your previous revision structure, so that if I needed to do a restore of files
from a previous backup, it's only a matter of:



xcopy <revisions directory>\YYYY-MM-DD HHMMSS\*.* <destination> /s



whereas with only the new proposed revision structure, restoring more than a
few files would be a tedious process or require a pretty sophisticated script
to make it happen.
User avatar
Site Admin
Posts: 7505
Joined: 9 Dec 2007

Zenju

> I don't understand why this would be any more difficult than the revision
count


In order to clean up files older than "last x days" you need to traverse the
complete revisions directory including subdirectories. On the other hand, if
you need to ensure that only a fixed number of revisions exist per file, you
only need to check a single directory, and only do this when you add a new
revision. The former could be a performance issue if the directory tree is
huge, and it doesn't scale with the number of deletions, as the second option
does.
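To make the "single directory" argument concrete, here is a minimal Python sketch of a revision-count limit. It is an assumption-laden illustration, not the FFS implementation: the listing of the one affected directory is passed in as a list of names, the naming scheme `<filename>.<ext> YYYY-MM-DD HHMMSS.<ext>` is assumed, and `revisions_over_limit` is a hypothetical name.

```python
import re

# Assumed suffix of a revisioned file: " YYYY-MM-DD HHMMSS" plus an optional extension.
STAMP_SUFFIX = re.compile(r" (\d{4}-\d{2}-\d{2} \d{6})(?:\.[^.]*)?")


def revisions_over_limit(names, original_name, limit):
    """Return the oldest revisions of original_name that exceed the allowed count."""
    prefix = original_name + " "
    stamped = []
    for name in names:
        if name.startswith(prefix):
            match = STAMP_SUFFIX.fullmatch(name[len(original_name):])
            if match:
                stamped.append((match.group(1), name))
    # "YYYY-MM-DD HHMMSS" sorts chronologically when compared as plain text.
    stamped.sort()
    surplus = max(len(stamped) - limit, 0)
    return [name for _, name in stamped[:surplus]]
```

The work done is proportional to the size of the one directory that just received a new revision, which is why this check scales with the number of deletions rather than with the size of the whole tree.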


> <revisions directory>\<subdir>\<filename> YYYY-MM-DD HHMMSS.<ext>


A lot of files have the same name but a different extension, e.g. "source.cpp,
source.h". The extension is therefore used as part of the name.


> with an optional delimiter between <filename> and date/time stamp, for the
reasons previously mentioned in my hardlinks.cmd?


What reasons do you mean exactly?


> what are your thoughts about being able to *RESTORE*


That's a difficult question, the lines between synchronization and versioning
are blurry in this area. FFS should stay focussed on the synchronization
aspect, but may satisfy a good deal of versioning tasks as long as it "doesn't
hurt" the design. The fact that FFS provides a user-readable revisions
structure almost implies that it will also be the user who does the restore,
just because he can, in contrast to a closed implementation that all the other
tools specialized on versioning do. I see the core benefit of being able to
restore individual files (easy with the new naming convention), rather than
restoring complete sets of files. Latter seems a less mainstream scenario,
biggest usage is probably source code versioning. But users who want a
sophisticated source code versioning may use GIT or SVN. Personally I use
FreeFileSync for versioning, and for small projects it seems the perfect
balance between ease of use and benefit. The scenario of restoring a connected
set of files for a given date in time, in my experience, has never been more
demanding than to restore the files for a specific version of FreeFileSync. So
I am having trouble seeing the demand for more sophistication. This is either
because I didn't have these specific problems yet, that require restoring a
version for an arbitrary point in time, or the framework I'm working in is
powerful enough to suit all relevant needs.
Posts: 7
Joined: 1 Aug 2012

clh42

I was working on this reply as Zenju posted his response a few minutes
ago, but as I already had this mostly written, here's my thoughts.



> Also it seems a "limit revision count" could also solve your problem of
providing means to undo an accidental deletion?


Well, yes in general, but it still affects the resulting files in the
revisions directory. Revision count means that if I modify a file every day,
or multiple times a day, I still can never have more than X revisions, which
means if I modify a file a lot, I might end up only a few days back of old
versions and how far back will vary depending on how often the file is
modified. And if I delete a file with revision count purging, that deleted
file will never get deleted from the revisions directory, ever.

With a date age purging, I know my often modified file is around for a fixed
period of time regardless of how often it's modified. And I know deleted files
are still purged out after that period of time to free up space (again, if I
haven't needed it back after I my specified period, I deem that I don't need
it).



> But there is a problem with limiting the time: In order to find all the
outdated files, FFS would need to scan the entire versioning directory at the
end of every sync.


Well, you make a good point. I was going to say, just handle it on a per-file
basis as you sync each file: look for revisions older than the date range and
delete them as you sync each file and copy over the new revisions for modified
files. You already know the main file name as you sync each folder, and I
don't think it will add a whole lot more time to filter and delete revisions
older than the date range right after you copy the new revisions in. But as I
reasoned through that, I saw the catch: that works for files that are being
synced, but it doesn't help for files that were modified and synced at one
time but don't get synced again, like my example of the deleted file, or even
just a file that gets modified several times in a short period and then not
again. So yeah, I guess at some point you'd have to go back and scan
everything just to be safe.

I don't know what to tell you, and I don't know how they do it, but I've
looked at 3 other sync apps and they all offer both a revision count and
number of days type of purging. My only other thought was that I saw other
discussion threads for FFS about whether or not to use a database for some or
all types of syncs. I can tell you that the other software I've checked out
does all use databases for all types of syncs, and maybe tracking revisions in
the database makes it easier to handle purging of old revisions with either
method. That's purely a guess though, I really have no idea.



> I don't understand why this would be any more difficult than the revision
count. For any given file you're making a revision copy, couldn't you just do
a wildcard search for that file, and delete any of the resulting files that
exceed the time limit?


bkeadle, as Zenju basically just said in the reply he wrote while I was
writing this, I don't think he meant that it would be difficult to code; it's
the performance hit of having to scan through everything as described above,
which doesn't have to be done with the revision count scheme.



> Also, what are your thoughts about being able to *RESTORE* files from a
certain date - sync back? Are we left to our own devices for copying back a
revisioned file and having to handle the removal of the date/time suffix?


I have to defend Zenju on this one from his comments in his previous
reply. I'll again point to what I've seen in other sync software. Software
marketed as "sync" software, at least the ones I've seen, do not provide a
"restore" function. If you truly want the full functionality of that, you need
to look at software that's marketed specifically as actual "Backup" software.
Although, bkeadle, could you maybe reverse your script to create a linked
directory structure mimicking the old way from the new way?


Last, I'll throw this out there just for comparison. I've rechecked the other
sync software that I've tried in how they handle versioning. 1 of them does it
the way FFS did originally, with a separate directory for each sync. The other
2 do it the way I suggested and has now been implemented in the FFS beta, by
using the same original source directory structure and renaming the old files
to include the date and time of the sync operation in the file name.

And with the original FFS date method that would actually make the date
revision method easier because you no longer have to scan through for dates,
but you just look at the date revision directories and delete entire date
directories (not caring about the individual files within the date directory)
older than x days. But on the # of revisions method, now it's harder because
you're not guaranteed that a specific file exists in all old revision date
directories, so you have to search back through every directory and count
revisions as you search, and track which one is the oldest.

BUT, as I mentioned above, all 3 of the other programs include the options to
purge either on # of revisions or on # of days, so they manage it somehow,
using either method of storing old revisions.
Posts: 24
Joined: 25 Nov 2009

bkeadle

> In order to cleanup files older than "last x days" you need to travese the
complete revisions directory including subdirectories. On the other hand if
you need to ensure that only a fixed number of revisions exist per file, you
only need to check a single directory, and only do this if you add a new
revision



Are you talking about the new structure here? In the old structure, yes, you would
need to "traverse the complete revisions directory", but in the new structure,
for each file copied, they're all in the same directory - thus whether you
want x number of revisions or revisions older than x days, they're all there
in the same directory to determine. Perhaps this debate is getting mixed up
with my suggestion that the old structure remain as an option. If the old
structure were selected, then yes, the version limit would cost too much in
terms of performance.


> A lot of files have the same name but a different extension, e.g.
"source.cpp, source.h". The extension is therefore used as part of the name.


Yes, I understand that. You said the new format for the filename is to be:


<filename>.<ext> YYYY-MM-DD HHMMSS.<ext>


And I'm suggesting that the middle <ext> reference be removed, and just keep
the extension at the end where it belongs:


<filename> YYYY-MM-DD HHMMSS.<ext>




> What reasons do you mean exactly?



"Having a choice of delimeter allows for flexibility to generate a file
listing and then act on the file listing through other utilities/scripts so
that <filename> (delim) <YYYY-MM-DD HHMMss> <ext> is easily parsed for
actions. Simply modify the DELIM= variable just under the :BEGIN label to a
valid delimeter of your choice. I've using "---" by default."
This would be especially useful for doing a restore with the new naming
convention, as I mentioned earlier - especially if the old revisioning
structure, which made restoring files from a certain date easy, is not an
option moving forward. As for the suggestion that FFS is intended as a sync
utility, not a backup utility: that doesn't make sense to me, especially in
the context of keeping file revisions.
User avatar
Site Admin
Posts: 7505
Joined: 9 Dec 2007

Zenju

> can never have more than X revisions, which means if I modify a file a lot,
I might end up only a few days back of old versions


Good point, a limit on "revision count" cannot avoid old versions getting
lost, if a specific file is updated and synced repeatedly in a short time
frame.
But the deeper question is what do you want to achieve? If you just want to
make sure you are able to restore a version, say 10 days back, then you just
would not set a limit at all. Maybe you want to conserve disk space? In this
case, perhaps an option "limit revision total to X bytes" may be more
suitable?
From a performance point of view it is the same if I traverse the revisions
directory including subdirectory in order to remove outdated files or if I
count the bytes and delete as many old files as are required to get below a
"bytes limit".


> filter and delete revisions older than the date range right after you copy
the new revisions in. But as I reasoned through that, I see the catch in this
is that, that works for files that are being synced


Yes. If the user sets option "limit to x days" he naturally expects this to be
applied to all files, not just the ones that are newly synced. If the
functionality would be weakened to just apply to files touched, it may not be
useful at all anymore; but at least the performance problem would be gone and
it would scale nicely ;)


> other software I've checked out does all use databases for all types of
syncs


Most of the performance considerations concerning scanning all sub-directories
could be invalidated if there were an index file, which contains the location
and date of all revisioned files. I don't like this idea too much, because the
information is redundant and can theoretically become out of sync with the
real representation. Perhaps this is even more tricky than it should be, since
it's an implementation detail leaking out for reasons not apparent for the
user.
Also the performance argument should not be overestimated: In my relatively
large revisions directory located on a non-buffered USB memory stick,
traversing the full tree structure takes less than a second. In conceivably
larger scenarios where this may become a real bottleneck I should show a
status in FFS like "scanning revisions directory". This will make it
transparent to the user what FFS is doing and allow him to just select "limit
revision count" instead, which does not show this behavior. So he can decide
himself if "last x days" is worth the performance drawback or not.


> Are talking about the new structure here?


Yes. The performance may become a problem if the requirement is to "remove all
outdated files" rather than only the files that are newly revisioned.


> And I'm suggesting that the middle <ext> reference be removed


In this case sorting would not work because files that differ in extension
only would be intermixed according to the date.


> choice of delimeter


I don't think a delimiter is really needed. The time stamp is special enough
to be detectable programmatically:
A simple filter that matches a versioned file could be:
* *-*-* *
or to be more safe:
* ????-??-?? ??????

or even with regex
.* \d{4}-\d{2}-\d{2} \d{6}
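For scripts that need to act on a file listing, the regex above can be turned into a small parser. The following Python sketch is only an illustration under the assumption that versioned files follow `<filename>.<ext> YYYY-MM-DD HHMMSS.<ext>`; the helper name `split_versioned_name` is hypothetical.

```python
import re
from datetime import datetime

# Original name, a space, the time stamp, then an optional repeated extension.
VERSIONED = re.compile(
    r"(?P<original>.+) (?P<stamp>\d{4}-\d{2}-\d{2} \d{6})(?P<ext>\.[^.]*)?"
)


def split_versioned_name(name):
    """Return (original name, time stamp) for a versioned file name, else None."""
    match = VERSIONED.fullmatch(name)
    if match is None:
        return None
    try:
        stamp = datetime.strptime(match.group("stamp"), "%Y-%m-%d %H%M%S")
    except ValueError:
        # Digits in the right places, but not a real date or time.
        return None
    return match.group("original"), stamp
```

For example, `split_versioned_name("source.cpp 2012-08-01 143005.cpp")` yields the original name `source.cpp` together with the time of the sync session, so no extra delimiter is needed to strip the suffix again.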

Just to make sure I don't miss something when thinking about requirements;
these are the goals a limit option (whatever that will be) should fulfill:

- limit number of revisions in order to have a clearly arranged, short, user-friendly list when viewed with a file manager
- limit disk space
Posts: 24
Joined: 25 Nov 2009

bkeadle

Ah! The light has gone on.

As for the limit in X days and traversing the directory discussion: yes I was
only considering the files that are being modified to be checked for limit X
days - I hadn't considered *all files* in the REVISIONS subdir to be evaluated
for x days - that being the case, yes, I agree that can be a performance
concern. So perhaps a check-box option for "Apply limit on all files" or
"Apply to only modified files" (or some better verbiage). Personally, I would
only use "apply only to modified files".

A "revision total to X bytes" sounds interesting - that would be a unique
feature, wouldn't it? If it's "cheap" to implement as an option, why not -
could prove to be useful.

And about the filename convention using your 2 references of <ext> vs. my
suggestion of just once (the suffix): I now see your point, and it makes good
sense.
Posts: 74
Joined: 17 Mar 2008

mfreedberg

Let me cast my vote on this very interesting discussion - and I am really
excited about this change, as I have been doing a lot of clean up of older
"monthly" folders based on a folder naming convention designed to let me do
the purging more easily.

My vote: keep it simple at first and implement a version limit. If necessary,
you could add this as a limit per folder pair, so that we could have some
flexibility to manage the version limits based on how often a particular file
set may change.

I understand the drive to include other filters, like "make sure this revision
folder is never bigger than..." but then the one time you increase the number
of files or make some larger changes and you then unexpectedly bump one of
your revisions, then this option will seem ill-advised.
User avatar
Site Admin
Posts: 7505
Joined: 9 Dec 2007

Zenju

> A "revision total to X bytes" sounds interesting


This would essentially be a Windows Recycle Bin reimplementation; I'm
beginning to like this idea... it would be simpler to grasp than "limit last x
days", have more clear semantics and do a similar thing.


> My vote: keep it simple at first and implement a version limit.


I agree to keep it simple. It's hard to pull features back later and it's
difficult to decide if users complaining miss a feature or are just unwilling
to use the alternative.
But I figured to first go with the "limit bytes" option, which looks like a
safe bet considering the Recycle Bin similarities.
I'm still uncertain under what scenarios a version limit is actually useful.
It doesn't guarantee that the total file size is kept in bounds and also does
not handle the scenario of frequent updates for a single file. I first thought
it might help to keep the revisions directory small and more manageable. But in
my tests, this isn't really a big deal. If you look for a specific file, you
enter the initial letters in explorer and find it instantly.


> add this as a limit per folder pair


In any case, the limit options will be at both local and global folder pair
level. Since they are part of the sync config, local configuration, if
existent, overwrites the global one.
Posts: 24
Joined: 25 Nov 2009

bkeadle

> But I figured to first go with the "limit bytes" option, which looks like a
safe bet considering the Recycle Bin similarities.



Boy, I sure don't know about this. "Limit bytes" per file or per directory?
That sure seems like a moving target, as one backup may be 10s of MB and
another 100s of MB (or GB as the case may be). Seems like the most practical
and common use would be number of revisions of any given/changed file. But in
the case of multiple changes to a single file in a single day, there may
also need to be a number-of-days modifier. And as mfreedberg points out:


> but then the one time you increase the number of files or make some larger changes and you then unexpectedly bump one of your revisions, then this option will seem ill-advised.
User avatar
Site Admin
Posts: 7505
Joined: 9 Dec 2007

Zenju

> "Limit bytes" per file or per directory?


This was about a limit on the total number of bytes of the whole revisions
directory.
Posts: 24
Joined: 25 Nov 2009

bkeadle

Ah, I guess that makes more sense. However, how would you determine that? That
sounds hairy. You would first need to scan the entire REVISIONS root to get
current size, right? That can be costly - unless you're keeping track of that
in some index file. Then how would you determine *what* to delete in order to
make room for the next backup? Say I'm 10MB from my limit. I have 9 1 MB files
to copy, and 1 11MB file to copy. Seems like it'd be difficult to reconcile
what to delete to make room.

Seems like x revisions and x revision-days is a simpler implementation and
less costly (or maybe my light has just gone off again. :-) )
Posts: 7
Joined: 1 Aug 2012

clh42

Gotta put my own big "no" vote in for limit by size, or at least not without
having it as an additional option to either (or both) by revisions or by days.
I want to know there's consistency in my retention.

And again, comparing to other sync software I've tried, they all offer by days
or by # of revisions, and none offer based on size.

If anything, I could see size being on top of # of revisions or # of days so
that regardless of those specs it never exceeds a certain size and doesn't
fill up the drive with revisions, but still keeps # of revisions or # of days
as long as the size isn't exceeded, but I'd still rather manage size myself
and adjust my # of revisions or # of days if I start filling up my drive with
revisions. And having ONLY the option by size wouldn't be used by me.
User avatar
Site Admin
Posts: 7505
Joined: 9 Dec 2007

Zenju

> That sounds hairy.


If that's hairy, then so is Windows Recycle Bin, it's the same approach.
Performance will be similar (but probably much better :)


> how would you determine *what* to delete


Delete as many old files (= use time-stamp at end of filename) until the total
is below the threshold.
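A rough Python sketch of this "delete oldest until below the limit" rule, purely for illustration: it assumes the full scan of the revisions directory has already produced a list of `(path, size in bytes)` pairs, that the time stamp sits at the end of each file name as `YYYY-MM-DD HHMMSS` before the extension, and the name `revisions_to_free` is invented for the example.

```python
import re

# Assumed time stamp at the end of the file name, followed by an optional extension.
STAMP_AT_END = re.compile(r" (\d{4}-\d{2}-\d{2} \d{6})(?:\.[^./\\]*)?$")


def revisions_to_free(revisions, limit_bytes):
    """Pick the oldest revisions to delete until the total size fits the limit.

    revisions is a list of (path, size_in_bytes) pairs for the whole tree.
    """
    stamped = []
    for path, size in revisions:
        match = STAMP_AT_END.search(path)
        if match:
            stamped.append((match.group(1), path, size))
    total = sum(size for _, _, size in stamped)
    doomed = []
    # Oldest time stamp first, regardless of which subdirectory the file is in.
    for _, path, size in sorted(stamped):
        if total <= limit_bytes:
            break
        doomed.append(path)
        total -= size
    return doomed
```

The selection is purely by age across all subdirectories, so a handful of old small files can be removed before a newer large one is touched.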

For all three options "limit versions count, limit by days, limit by size"
performance won't be the decisive aspect. But I certainly don't want to
implement all three options. As a mathematician, I'd like to find a formal
argument for one or a combination of two options, or maybe a third one not
considered yet. Seems this one needs deeper thinking.
Posts: 24
Joined: 25 Nov 2009

bkeadle

"As a mathematician..." - well that helps to explain why this probably makes
more sense to you. :-)

"Delete as many old files..." - from what directories? You say size is based
on the entire REVISIONS directory structure, so you would be deleting older
files from a random mix of subdirectories, the oldest in any given directory
under REVISIONS? Again, unless you're keeping an index of these files, I don't
know how you're going to efficiently do that - but, you're the mathematician
*AND* the programmer, so you clearly know better than me!
:-)

Also, if you're just deleting the oldest, say you have 10 MB free, and need to
copy a 100 MB file. How do you know you won't be whacking the only backup copy
of a bunch of small files in order to make room? If your oldest file
happens to be a 200MB file, then you're an easy one delete away from making
room. But if your oldest files happen to be a bunch of small files (like
valuable scripts!), you could run the risk of deleting some "precious" files.
For that matter, that oldest 200MB might be your only, last backup of that
file. Just sounds too arbitrary of what might get deleted.

However, if you implement an index, then all these concerns go away, as you'll
be able to more intelligently (I think) ensure that you're not deleting your
only backup of a particular file.

Of the 3 options, limit version count and limit days seems not only easier,
but more appropriate for FFS to manage as part of the process. Limit by size
seems it would be more of a storage management issue handled outside of FFS.
Personally, I can see where I would use the first 2 options, but can't
conceive of when I'd ever use the last option - limit by size.

Speaking of storage management...that might be better handled by using hard
links as posted here (hint, hint)!
:-)
Posts: 7
Joined: 1 Aug 2012

clh42

I agree with bkeadle, I think by size is not useful as it's not consistent
since you never know what's going to need to be synced and how that might
affect what files are left. I don't mean that from a programmer standpoint, I
just mean from an end user standpoint that I don't see how anybody would find
that useful.
User avatar
Posts: 73
Joined: 22 May 2006

Giangi

So far I was only reading this interesting topic. Personally I do not use
versioning because I do not agree with keeping the files on one "side" only...
I have written somewhere else that it makes no sense to move the "versioned"
files from one side to the other... :-)

Anyway, I too would like to give my "NO" vote for the limit-by-size version.
I agree that only limit-by-time and limit-by-number should be implemented! :-)

Ciao, Giangi
User avatar
Site Admin
Posts: 7505
Joined: 9 Dec 2007

Zenju

> You say size is based on the entire REVISIONS directory structure, so you
would be deleting older files from a random mix of subdirectories



> I don't see how anybody would find that useful.


Hm, Microsoft seems to think it's useful, and most users probably think
Windows Recycle Bin is useful, so a "limit by total size" cannot be a plain
stupid idea, at least. Generally, the problem of accidentally deleting
required data because you delete a single large file is less of a big deal
than one might think. This is because the limit for Windows Recycler is quite
large, like 10% of the volume's total size by default.

All three options do a similar thing, but have different trade-offs. I'd like
to get rid of this redundancy and have like two orthogonal features.
User avatar
Posts: 73
Joined: 22 May 2006

Giangi

> Hm, Microsoft seems to think it's useful, and most users probably think
Windows Recycle Bin is useful, so a "limit by total size" cannot be a plain
stupid idea, at least


Uhm... I think you are mixing apples with pears... :-)
The Recycle Bin is just ONE single container, while with the term
Versioning you mean a place to store different versions of the same
item.
So for the first it is correct to do a delete-by-size, while for the second it
is not! ...of course this is just my opinion... :-)
User avatar
Site Admin
Posts: 7505
Joined: 9 Dec 2007

Zenju

> Recycle Bin is just ONE single container


It's actually one container per volume, each with a limit on total size.


> with the term Versioning you mean


It's not fixed what "versioning" should mean for FFS. I picked the name
because I think it sounds nice and it creates "versions", i.e. files with a
time stamp. The goal is not so much to implement the perfect versioning
(whatever the exact definitions may be), but rather find the optimal
functionality for a third "deletion handling" option next to "delete
permanently" and "use recycler".
User avatar
Posts: 73
Joined: 22 May 2006

Giangi

Ok, I have omitted the "per volume"... but it's still one container "flat",
without a directory structure... :-)

I understand what you mean, but the word versioning gives the user a
"specific" idea of what it means! ...at least it did with me (and English
is not my mother language... :-)
Adding a time-stamp to the file makes it a "real" versioning system (of course
not as accurate as a CVS system can be!) :-)

Anyway I was just trying to not promote the limit-by-size logic... :-))))
User avatar
Posts: 2946
Joined: 22 Aug 2012

Plerry

I really support the new version approach!

Like Giangi I don't see the need for a limitation in size.

I like the approach of X-versions and Y-days.
However, I would suggest being able to choose between an AND and an OR of
these two conditions.
Programmatically this seems to be straightforward.
If not, I would personally prefer the AND condition; i.e. versions only get
deleted if they are at least Y-days old
and if there are at least X newer versions.
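The AND/OR combination could look something like this in Python. It is just a sketch of the rule as stated above, working on the time stamps of a single file's revisions; the function name `purgeable` and its parameters are made up for illustration and nothing here is taken from FFS.

```python
from datetime import datetime, timedelta


def purgeable(stamps, now, keep_count, min_age_days, combine_with_and=True):
    """Return the revision time stamps of one file that may be deleted.

    AND: delete only if at least min_age_days old and keep_count newer versions exist.
    OR:  delete if either condition holds.
    """
    cutoff = now - timedelta(days=min_age_days)
    result = []
    # Newest first, so the index equals the number of newer versions.
    for newer_versions, stamp in enumerate(sorted(stamps, reverse=True)):
        old_enough = stamp <= cutoff
        enough_newer = newer_versions >= keep_count
        if combine_with_and:
            delete = old_enough and enough_newer
        else:
            delete = old_enough or enough_newer
        if delete:
            result.append(stamp)
    return result
```

With AND, a file that changes many times a day still keeps everything younger than Y days, and a rarely changed file still keeps its last X versions however old they are.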

Next: the nice option to select a file in Windows Explorer and then via right
mouse click see (and restore?) the
available previous versions ...

Plerry
User avatar
Site Admin
Posts: 7505
Joined: 9 Dec 2007

Zenju

I'm a little surprised that there is no interest in a "limit by total size"
which is a "recycle bin" alternative for USB sticks and network shares. After
all this is the most prominent way of versioning unconsciously used by all
Windows users. And they seem to be okay with the limitations; I've rarely seen
requests for alternate ways to manage deleted files on Windows.

I've done some more research, the three options discussed so far "limit count,
limit total size, limit by days" are the standard limits used in versioning. I
had hoped there would be some alternate, more elegant solution, but it doesn't
seem to be the case. Most are variants or combinations of the three options
discussed. Still I was surprised to find one peculiar approach:

- keep all revisions that are no more than a fixed number of days old. Beyond that, keep only one version per week, one per month and one per year. In other words: the older the revision, the bigger the distance in time between two of them.
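To illustrate how such a thinning scheme might work, here is a small Python sketch. The tier boundaries (7 days for "keep all", 30 days for weekly, 365 days for monthly) and the choice to keep the newest revision in each week, month or year are assumptions made for the example; the approach as described does not specify them.

```python
from datetime import datetime, timedelta


def thin_out(stamps, now, keep_all_days=7, weekly_days=30, monthly_days=365):
    """Return the revision time stamps to keep; older ones are spaced further apart."""
    keep = []
    seen_buckets = set()
    for stamp in sorted(stamps, reverse=True):  # newest first
        age = now - stamp
        if age <= timedelta(days=keep_all_days):
            keep.append(stamp)  # recent revisions are all kept
            continue
        if age <= timedelta(days=weekly_days):
            iso = stamp.isocalendar()
            bucket = ("week", iso[0], iso[1])
        elif age <= timedelta(days=monthly_days):
            bucket = ("month", stamp.year, stamp.month)
        else:
            bucket = ("year", stamp.year)
        # Keep only the newest revision that falls into each bucket.
        if bucket not in seen_buckets:
            seen_buckets.add(bucket)
            keep.append(stamp)
    return sorted(keep)
```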
Posts: 24
Joined: 25 Nov 2009

bkeadle

"Recycle bin" is fine as a fail-safe to recover deleted files, but should not
be counted on as a backup "method". One should not "file" things in trash -
it's not good practice in the physical world, nor in the digital world. :-)
"Trash" implies unneeded, and should it get deleted, it should be of no
concern. But using FFS as a backup tool doesn't mean that what I keep in
backup is "trash", except for conditions of count or age (days).