Title: | Managing Expiration for Cache Directories |
---|---|
Description: | Implements an expiration system for access to versioned directories. Directories that have not been accessed by a registered function within a certain time frame are deleted. This aims to reduce disk usage by eliminating obsolete caches generated by old versions of packages. |
Authors: | Aaron Lun [aut, cre] |
Maintainer: | Aaron Lun <[email protected]> |
License: | GPL-3 |
Version: | 1.13.0 |
Built: | 2024-10-05 05:57:40 UTC |
Source: | https://github.com/bioc/dir.expiry |
Remove versioned directories that have passed an expiration limit.
clearDirectories(dir, reference = NULL, limit = NULL, force = FALSE)
dir |
String containing the path to a package cache containing any number of versioned directories. |
reference |
A package_version specifying a reference version to be protected from deletion. |
limit |
Integer scalar specifying the maximum number of days to have passed before a versioned directory expires. |
force |
Logical scalar indicating whether to forcibly re-examine dir for expired versioned directories. |
This function checks the last access date in the *_dir.expiry files in dir.
If the last access date is too old, the corresponding subdirectory in dir is treated as expired and is deleted.
The age threshold depends on limit, which defaults to the value of the environment variable BIOC_DIR_EXPIRY_LIMIT.
If this is not specified, it is set to 30 days.
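The threshold lookup described above can be sketched in base R as follows. This is only an illustration and not the package's implementation: resolve_limit and is_expired are hypothetical helpers, and treating a directory as expired only when strictly more than limit days have passed is an assumption.

```r
# Hypothetical helper: resolve the expiry limit (in days) as the text describes.
resolve_limit <- function(limit = NULL) {
    if (!is.null(limit)) {
        return(as.integer(limit))
    }
    env <- Sys.getenv("BIOC_DIR_EXPIRY_LIMIT", unset = NA_character_)
    if (is.na(env) || env == "") 30L else as.integer(env)
}

# Hypothetical helper: has the directory gone unaccessed for too long?
# Assumption: strictly more than 'limit' days counts as expired.
is_expired <- function(last.access, limit, today = Sys.Date()) {
    age <- as.numeric(difftime(today, last.access, units = "days"))
    age > limit
}

Sys.unsetenv("BIOC_DIR_EXPIRY_LIMIT")
resolve_limit()                          # falls back to 30
Sys.setenv(BIOC_DIR_EXPIRY_LIMIT = "7")
resolve_limit()                          # now taken from the environment
is_expired(Sys.Date() - 100, limit = resolve_limit())
Sys.unsetenv("BIOC_DIR_EXPIRY_LIMIT")    # leave the session as we found it
```

An explicit limit always takes precedence over the environment variable in this sketch, mirroring the order given in the text.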
If reference is specified, any directory of that name is protected from deletion.
In addition, directories with version numbers greater than (or equal to) reference are not deleted, even if their last access date was older than the specified limit.
This aims to favor the retention of newer versions, which is generally a sensible outcome when the aim is to stay up-to-date.
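The protection rule can be illustrated with base R version comparisons. The helper is_protected below is hypothetical and only mirrors the rule as stated in the text; it is not taken from the package source.

```r
# Hypothetical helper: is a versioned directory shielded from deletion by 'reference'?
is_protected <- function(version, reference = NULL) {
    if (is.null(reference)) {
        return(FALSE)  # no reference given, so nothing is protected
    }
    as.package_version(version) >= as.package_version(reference)
}

reference <- package_version("1.11.0")
is_protected("1.12.0", reference)  # TRUE: newer than the reference
is_protected("1.11.0", reference)  # TRUE: the reference itself
is_protected("1.9.0", reference)   # FALSE: 1.9.0 is older than 1.11.0
```

Note that the comparison is made on version objects rather than strings, so 1.9.0 sorts before 1.11.0.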
This function will acquire exclusive locks on the package cache directory and on each versioned directory before attempting to delete the latter.
Applications can achieve thread safety by calling lockDirectory prior to any operations on the versioned directory.
This ensures that clearDirectories will not delete a directory in use by another process, especially if the latter might update the last access time.
By default, this function will remember the values of dir that were passed in previous calls, and will avoid re-examining those same dirs for expired directories on the same day.
This avoids unnecessary file system queries and locks when this function is repeatedly called.
Advanced users can force a re-examination by setting force=TRUE.
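One way to picture the once-per-day behaviour is a session-level registry keyed by dir. The sketch below uses only base R; the registry examined and the helper should_examine are hypothetical names, and the package may track this state differently.

```r
# Hypothetical registry mapping each 'dir' to the date on which it was last examined.
examined <- new.env(parent = emptyenv())

# Hypothetical helper: decide whether 'dir' needs to be examined again.
should_examine <- function(dir, force = FALSE, today = Sys.Date()) {
    last <- examined[[dir]]
    if (!force && !is.null(last) && last == today) {
        return(FALSE)  # already checked today; skip file system queries and locks
    }
    examined[[dir]] <- today
    TRUE
}

should_examine("cache")                # TRUE on the first call
should_examine("cache")                # FALSE for a repeat call on the same day
should_examine("cache", force = TRUE)  # TRUE, as force overrides the registry
```

Because the registry lives in the R session, a fresh session would examine the same dir again even on the same day.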
Expired directories are deleted and NULL is invisibly returned.
Aaron Lun
unlockDirectory, which calls this function automatically when clear=TRUE.
# Creating the package cache.
cache.dir <- tempfile(pattern="expired_demo")
# Creating an older versioned directory.
version <- package_version("1.11.0")
version.dir <- file.path(cache.dir, version)
lck <- lockDirectory(version.dir)
dir.create(version.dir)
touchDirectory(version.dir, date=Sys.Date() - 100)
unlockDirectory(lck, clear=FALSE) # manually clear below.
list.files(cache.dir)
# Clearing them out.
clearDirectories(cache.dir)
list.files(cache.dir)
Mark directories as locked or unlocked for thread-safe processing, using a standard naming scheme for the lock files.
lockDirectory(path, ...)
unlockDirectory(lock.info, clear = TRUE, ...)
path |
String containing the path to a versioned directory.
The dirname should be the package cache, while the basename should be a version number. |
... |
For For |
lock.info |
The list returned by lockDirectory. |
clear |
Logical scalar indicating whether to remove expired versions via clearDirectories. |
lockDirectory actually creates two locks:
The first lock is applied to the versioned directory (i.e., basename(path)) within the package cache (i.e., dirname(path)).
This provides thread-safe read/write on its contents, protecting against other processes that want to write to the same versioned directory.
Concurrent read operations are also permitted by setting exclusive=FALSE in ... to define a shared lock.
The second lock is applied to the package cache and is always a shared lock, regardless of the contents of ....
This provides thread-safe access to the lock file used in the first lock, protecting it from deletion when the relevant directory expires in clearDirectories.
If dirname(path) does not exist, it will be created by lockDirectory.
clearDirectories is called in unlockDirectory as the former needs to hold an exclusive lock on the package cache.
Thus, the clearing can only be performed after the shared lock created by lockDirectory is released.
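A process that only reads from a versioned directory might request a shared lock along the following lines. This sketch assumes that the dir.expiry package is installed and relies only on the arguments shown in the usage above, with exclusive=FALSE forwarded through ... as described; the directory names are illustrative.

```r
library(dir.expiry)

cache.dir <- tempfile(pattern = "expired_demo")
version.dir <- file.path(cache.dir, "1.11.0")

# Request a shared lock on the versioned directory so other readers are not blocked.
# The package cache (dirname of the path) is created if it does not yet exist.
lck <- lockDirectory(version.dir, exclusive = FALSE)

# ... read-only operations on the contents of 'version.dir' go here ...
list.files(version.dir)

# Release both locks; clear=FALSE skips the expiry sweep for this demonstration.
unlockDirectory(lck, clear = FALSE)
```

Writers should keep the default exclusive lock so that they never overlap with readers of the same versioned directory.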
lockDirectory returns a list of locking information, including lock handles generated by the filelock package.
unlockDirectory unlocks the handles generated by lockDirectory.
If clear=TRUE, versioned directories that have expired are removed by clearDirectories.
It returns NULL invisibly.
Aaron Lun
# Creating the relevant directories.
cache.dir <- tempfile(pattern="expired_demo")
version <- package_version("1.11.0")
handle <- lockDirectory(file.path(cache.dir, version))
handle
unlockDirectory(handle)
list.files(cache.dir)
Touch a versioned directory to indicate that it has been successfully accessed in the recent past.
touchDirectory(path, date = Sys.Date(), force = FALSE)
path |
String containing the path to a versioned directory.
The dirname should be the package cache, while the basename should be a version number. |
date |
A Date object containing the current date. Only provided for testing. |
force |
Logical scalar indicating whether to forcibly update the access date for path. |
This function should be called after any successful access to the contents of a versioned directory, to indicate that said directory is still in use by expiry-aware processes.
A stub file is updated with the last access time to allow clearDirectories to accurately check for staleness.
For a given path and date, this function only modifies the files on its first call.
All subsequent calls with the same two arguments, in the same R session and on the same day, will have no effect.
This avoids unnecessary touching of the file system during routine use.
The caller should lock the target directory with lockDirectory before calling this function.
This ensures that another process calling clearDirectories does not delete this directory while its access time is being updated.
If the target directory is locked, any writes to the stub file itself are thread-safe, even for shared locks.
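The lock, access, touch and unlock sequence recommended above could be wrapped as follows. This is a sketch that assumes the dir.expiry package is installed; with_versioned_dir is a hypothetical wrapper and is not part of the package.

```r
library(dir.expiry)

# Hypothetical wrapper: run 'FUN' on a versioned directory under an exclusive lock.
with_versioned_dir <- function(path, FUN) {
    lck <- lockDirectory(path)
    on.exit(unlockDirectory(lck), add = TRUE)  # always release, even on error
    dir.create(path, showWarnings = FALSE)
    out <- FUN(path)
    touchDirectory(path)  # record the successful access while still locked
    out
}

cache.dir <- tempfile(pattern = "expired_demo")
version.dir <- file.path(cache.dir, "1.11.0")
with_versioned_dir(version.dir, function(p) {
    writeLines("cached result", file.path(p, "result.txt"))
})
readLines(file.path(version.dir, "result.txt"))
```

Calling touchDirectory only after FUN returns means a failed access does not refresh the last access date, and on.exit ensures that the locks are released in either case.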
By default, this function will remember the values of path that were passed in previous calls, and will avoid re-updating those same paths with the same date when called on the same day.
This avoids unnecessary file system writes and locks when this function is repeatedly called.
Advanced users can force an update by setting force=TRUE.
The <version>_dir.expiry stub file for path, which sits in the package cache alongside the versioned directory, is updated/created with the current date.
NULL is invisibly returned.
Aaron Lun
lockDirectory, which should always be called before this function.
# Creating the package cache.
cache.dir <- tempfile(pattern="expired_demo")
dir.create(cache.dir)
# Creating the versioned subdirectory.
version <- package_version("1.11.0")
version.dir <- file.path(cache.dir, version)
lck <- lockDirectory(version.dir)
dir.create(version.dir)
# Setting the last access time.
touchDirectory(version.dir)
list.files(cache.dir)
readLines(file.path(cache.dir, "1.11.0_dir.expiry"))
# Making sure we unlock it afterwards.
unlockDirectory(lck)