File attachment downloads stop after 1 day #1744
Labels
Bug | Bogue
Dev
Task for implementation of a technical solution
High Priority | Haute priorité
Refined | Affiné
Stories and tasks that are ready to be worked on
Describe the bug
Files cannot be downloaded after about one day (possibly extended by a few more hours), whereas we expect them to live for either 3 days or 7 days, depending on whether they were attached or linked, respectively.
Bug Severity
(SEV-1 Critical, SEV-2 Major, SEV-3 Minor, SEV-4 Low)
Major
To Reproduce
You could also try this out by identifying different files that were uploaded to the S3 bucket notification-canada-ca-production-document-download, matching each with its sent notification, and decoding the customization field in the database to get the download URL, which also contains the secret key needed to decode the file. Identify some files in the bucket that were sent within the last 24h and some that were sent between 48h and 72h ago: the former should download and the latter should fail, contrary to our own download policies (link mode = 7 days / attach mode = 3 days).
Expected behavior
The files that are linked should be available to download within 7 days. The files that are attached should be available to download within 3 days.
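When picking files to reproduce with, the policy above can be expressed as a small check. This is only a sketch: the function name and the example date are made up for illustration, and only the 7-day / 3-day windows come from this issue.

```python
from datetime import datetime, timedelta, timezone

# Retention per sending mode, as stated in this issue.
RETENTION = {"link": timedelta(days=7), "attach": timedelta(days=3)}


def should_be_downloadable(sent_at: datetime, mode: str, now: datetime) -> bool:
    """Return True if, per policy, a file sent at `sent_at` should still download."""
    return now - sent_at < RETENTION[mode]


# Arbitrary example date, not taken from production data.
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
sent_50h_ago = now - timedelta(hours=50)

# An attached file sent 50 hours ago is still inside its 3-day window.
print(should_be_downloadable(sent_50h_ago, "attach", now))  # True
```

Any file for which this returns True but whose download URL returns a 404 is an instance of the bug.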
Impact
Describe overall user/system impact to consider when prioritizing this issue.
Impact on Notify users:
The files they share with recipients won't be received if they are not opened in time.
Impact on Recipients:
They cannot download their file attachments after 1 day and a few hours (depending on when the applicable S3 lifecycle expiration runs).
Impact on Notify team:
None
Additional context
Cause
The root issue is how we check for malicious files with the file scanner and how we retrieve the scan status. We store the status in an S3 bucket (notification-canada-ca-production-document-download-scan-files) that is a duplicate of our file downloads S3 bucket (notification-canada-ca-production-document-download).
When we send a file to be scanned, the scanner reports the status back in the duplicate S3 bucket (notification-canada-ca-production-document-download-scan-files). Our DD-API layer checks this status before sending the file back to the API layer. If it cannot find the file and its status in the duplicate S3 bucket, it returns a 404 (not found error code).
This happens after 1 day because the duplicate S3 bucket is set to expire its files after only 1 day (screenshot 1), whereas our original bucket for file downloads keeps them for either 3 days (screenshot 2) or 7 days (screenshot 3), depending on whether the file is attached or linked, respectively.
Hence, when we check the status of a file that should still be there, we cannot retrieve the file scan status and simply return a 404. The attachments cannot be downloaded past 1 day.
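The interaction between the two buckets can be modelled roughly as follows. This is a simplified, hypothetical model and not the actual DD-API code: the function names are invented, and it treats lifecycle expiration as exact, whereas in practice S3 removes objects some hours after the nominal expiry.

```python
# Retention values taken from this issue.
DOWNLOAD_RETENTION_DAYS = {"attach": 3, "link": 7}
SCAN_RETENTION_DAYS = 1  # current lifecycle of the scan-files bucket


def object_exists(age_days: float, retention_days: int) -> bool:
    """An object is gone once the bucket lifecycle has expired it."""
    return age_days < retention_days


def download_status(age_days: float, mode: str,
                    scan_retention_days: int = SCAN_RETENTION_DAYS) -> int:
    """Hypothetical model of the check: scan status first, then the file."""
    if not object_exists(age_days, scan_retention_days):
        return 404  # scan status missing, even though the file may still exist
    if not object_exists(age_days, DOWNLOAD_RETENTION_DAYS[mode]):
        return 404  # the file itself has expired
    return 200


# A linked file that is 2 days old: the file exists, but its scan status does not.
print(download_status(2, "link"))                         # 404
# Same file if the scan-files bucket kept objects for 7 days.
print(download_status(2, "link", scan_retention_days=7))  # 200
```

In this model the effective download window is the shorter of the two retention periods, which is why both sending modes currently fail after about 1 day.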
Screenshot 1
Screenshot 2
Screenshot 3
Quick solution
A quick solution is to align the S3 bucket lifecycle management configuration of notification-canada-ca-production-document-download-scan-files with that of the bucket containing the original attachments, i.e. notification-canada-ca-production-document-download. We might need to ensure that a tmp folder exists within the S3 scan file bucket so we can apply its own lifecycle management configuration (the bucket won't have this folder on creation and we need to guarantee it for environment recovery purposes; we will hit this issue pretty quickly anyway, as we recreate our dev environment every weekend).
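As an illustration of what an aligned lifecycle configuration could look like, here is a minimal sketch that builds the rule set as a plain dictionary, in the shape S3 lifecycle configurations use (`Rules`, `Filter`, `Expiration`). The helper name and rule IDs are invented. The mapping of the tmp folder to attached files (3 days) and of everything else to linked files (7 days) is an assumption that should be checked against the existing rules on the download bucket.

```python
from typing import Dict


def build_lifecycle_rules(days_by_prefix: Dict[str, int]) -> dict:
    """Build a lifecycle configuration with one expiration rule per key prefix."""
    rules = []
    for prefix, days in days_by_prefix.items():
        name = prefix.strip("/") or "default"
        rules.append({
            "ID": f"expire-{name}-after-{days}-days",
            "Filter": {"Prefix": prefix},
            "Status": "Enabled",
            "Expiration": {"Days": days},
        })
    return {"Rules": rules}


# Assumption: attached files live under "tmp/" (3 days); all other keys are linked files (7 days).
config = build_lifecycle_rules({"tmp/": 3, "": 7})
print(config["Rules"][0]["ID"])  # expire-tmp-after-3-days
```

The same rule set would need to be applied to both buckets, whether through infrastructure-as-code or the S3 API, so that a scan status never expires before the file it describes.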