File downloads caching #367

Open
@silvae86

Description

Dendro Version if known (or site URL)

v0.3

Please describe the expected behaviour

Dendro should not produce a new temporary file for a download if the resource has not been modified since its temporary file was last produced.

This would also make it possible to validate a download (check whether an error occurs when trying to download a file) before actually downloading it, with minimal performance loss. Without caching, the temporary files are produced twice: once for the request that checks whether the download can be performed, and again for the actual streaming of the data.

The call to the download method in window_controller.js would become this:

    $(function ()
    {
        // First request: checks that the download can be produced without errors.
        $.ajax({
            url: url,
            // timeout: 5000,
            success: function ()
            {
                // Second request: pointing the hidden iframe at the URL starts the actual download.
                $("#" + hiddenIFrameID).attr("src", url);
            },
            error: function (err)
            {
                new PNotify({
                    title: "Failed to download resource",
                    text: "If you are using B2DROP, check your credentials and that the file exists in storage.",
                    type: "error",
                    opacity: 1.0,
                    delay: 5000,
                    addclass: "stack-bar-top",
                    cornerclass: "",
                    stack: stack_topright
                });
            }
        });
    });

Please describe the actual behaviour

Currently, temporary files are produced from MongoDB EVERY time a user requests a download.

Possible ways to fix the problem (programmers)

Implement proposed workflow (see attached pic).

First, implement a class, tempfiles_cache.js, which would connect to MongoDB and create a collection called "downloads_cache". Every document in this collection would have the following fields:

  • uri: the uri value of the resource (file or folder) that was cached for download
  • timestamp: the date when the temporary file was last produced for this resource
  • file_path: the path in the local filesystem where the temporary file resides
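
A minimal sketch of what tempfiles_cache.js could look like, assuming the official MongoDB Node.js driver and a collection handle obtained elsewhere (for example via db.collection("downloads_cache")). The class name TempFilesCache and the method names get and put are hypothetical; only the three fields come from the list above.

```javascript
// Hypothetical sketch of tempfiles_cache.js.
// `collection` is assumed to be a MongoDB driver Collection for "downloads_cache";
// only its findOne() and updateOne() methods are used.
class TempFilesCache
{
    constructor (collection)
    {
        this.collection = collection;
    }

    // Resolves to the cache record for `uri`, or null if nothing is cached yet.
    async get (uri)
    {
        return this.collection.findOne({ uri: uri });
    }

    // Creates or refreshes the record after a temporary file has been produced.
    async put (uri, filePath, timestamp = new Date())
    {
        const record = { uri: uri, timestamp: timestamp, file_path: filePath };
        await this.collection.updateOne({ uri: uri }, { $set: record }, { upsert: true });
        return record;
    }
}
```

Using an upsert keeps a single record per uri, so the first download and every later refresh go through the same code path.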

Second, create a new property in element.js, ddr:dateFSModified, which holds the date of the last modification that is relevant to the file system. ddr:lastModified holds the date of the last metadata modification, but it makes no sense to refresh temporary files just because the metadata was updated. This also means that backups cannot be cached for now, because a temporary backup has to be refreshed taking into account both the metadata and the file system modification timestamps.

Workflow

The workflow is as follows:

  1. A download of a file or folder is requested
  2. Verify in cache if there is a temporary file with that URI.
    2.1 If it is a folder, check whether the folder or any of its children is "dirty" (ddr:dateFSModified > timestamp in cache). This can be done with a single query on Virtuoso, using nie:isLogicalPart+ to check recursively for any dirty child, grandchild, etc.
    2.1.1a If any child is dirty (or the folder itself), it is necessary to refresh the temporary zip file for that folder. Run the zipping code as usual.
    2.1.1b Update cache record with the new zip file's location in the local filesystem
    2.1.1c Serve file that is in updated cache record (file_path)
    2.1.2a If no children are dirty, serve the file (file_path) in the current cache record
    2.2 If it is a file, check if it is "dirty" (ddr:dateFSModified > timestamp in cache)
    2.2.1a If the file is dirty, produce a new temporary file
    2.2.1b Update cache record with the temporary file's location in the local filesystem
    2.2.1c Serve file that is in updated cache record (file_path)
    2.2.2a If the file is not dirty, serve the file (file_path) in the current cache record
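
The steps above can be sketched as a single function. This is only an illustration under stated assumptions: cache is any object with get(uri) and put(uri, filePath, timestamp) methods, while getLatestFSModified and produceTempFile are hypothetical helpers standing in for the Virtuoso query and for the existing zipping / temporary file code. The workflow does not say what happens when no cache record exists yet; the sketch assumes that case is treated like a dirty resource.

```javascript
// Sketch of the proposed workflow. All helper names are hypothetical:
//  - cache.get(uri) / cache.put(uri, filePath, timestamp): downloads_cache access
//  - getLatestFSModified(uri): resolves to the newest ddr:dateFSModified (a Date)
//    of the resource itself and, for folders, of any descendant
//  - produceTempFile(uri): existing code that writes the temporary file or zip
//    and resolves to its path in the local filesystem
async function getDownloadPath (uri, cache, getLatestFSModified, produceTempFile)
{
    const record = await cache.get(uri);

    if (record !== null)
    {
        const latestChange = await getLatestFSModified(uri);
        const dirty = latestChange > record.timestamp;
        if (!dirty)
        {
            // Nothing changed since the temporary file was produced: reuse it.
            return record.file_path;
        }
    }

    // Dirty (or, by assumption, never cached): produce the file again,
    // update the cache record and serve the path stored in that record.
    const producedAt = new Date();
    const filePath = await produceTempFile(uri);
    const updated = await cache.put(uri, filePath, producedAt);
    return updated.file_path;
}
```

The timestamp is taken before the temporary file is produced, so a modification that lands while a large folder is being zipped still marks the record as dirty on the next request.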

Keeping track of changes

The last modification of a file or folder needs to be recorded whenever one of the following operations takes place:

  1. When creating a folder, set its ddr:dateFSModified to the current date
  2. When uploading a new file, set its ddr:dateFSModified to the current date
  3. When cutting a resource, update its parent folder's ddr:dateFSModified to the current date
  4. When renaming a resource, update its ddr:dateFSModified to the current date
  5. When deleting a resource, update its ddr:dateFSModified to the current date
  6. When uploading a file, update its parent folder's ddr:dateFSModified to the current date
  7. When restoring a folder from backup, update its ddr:dateFSModified to the current date
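
The rules above can be condensed into one lookup that says which resources need their ddr:dateFSModified set to the current date after each operation. The function resourcesToTouch and the operation names are hypothetical, chosen only for illustration; the actual write to the graph is left out.

```javascript
// Hypothetical helper: given an operation and the affected resource, return the
// URIs whose ddr:dateFSModified should be set to the current date.
function resourcesToTouch (operation, resourceUri, parentUri)
{
    switch (operation)
    {
        case "create_folder":   // rule 1
        case "rename":          // rule 4
        case "delete":          // rule 5
        case "restore_folder":  // rule 7
            return [resourceUri];
        case "upload_file":     // rules 2 and 6: the file and its parent folder
            return [resourceUri, parentUri];
        case "cut":             // rule 3: only the parent folder
            return [parentUri];
        default:
            throw new Error("Unknown operation: " + operation);
    }
}
```

Keeping the rules in one place would make it harder to forget a timestamp update when a new file operation is added later.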

Workflow diagram (drawn quickly while over-caffeinated)

img_20180420_1537335
