# AWS - S3 Lifecycle FAQ

FAQ:

Q: If we set up a lifecycle rule, will it be applied to existing files?

  • Ans: Yes. A lifecycle rule applies to all data present in the S3 bucket, whether it was uploaded before or after the rule was added.

  • For example, suppose you have two objects in your S3 bucket: one is 100 days old and the other was uploaded just 2 hours ago. If you now add a lifecycle rule that expires objects after 30 days, the 100-day-old object will not wait another 30 days; it will expire the next time the lifecycle rule runs. The recently uploaded object will expire only after it becomes 30 days old.

Q: Can we set up a rule to delete files under a specific folder?

  • Ans: Yes. If your files are stored under a particular prefix, you can scope the lifecycle rule to that prefix, so any other data uploaded to your S3 bucket would not be deleted; see the sketch below.
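
As a minimal sketch of a prefix-scoped expiration rule using boto3 (the bucket name, prefix, and retention period are placeholders; note that `put_bucket_lifecycle_configuration` replaces the bucket's entire existing lifecycle configuration):

```python
import boto3

s3 = boto3.client("s3")

# Expire only the objects under "logs/"; everything else is untouched.
# Caution: this call replaces any existing lifecycle configuration.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-logs-after-30-days",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Expiration": {"Days": 30},
            }
        ]
    },
)
```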

Q: Is it possible to read the expired objects in S3?

  • Ans: Possibly, but it is not guaranteed: an expired object may or may not still be readable in the window before S3 removes it.

Q: Is it possible to trigger the lifecycle rule as per timezone?

  • Ans: Lifecycle configurations are executed and fully managed by AWS. Rules run at midnight (00:00) UTC, and their execution cannot be scheduled for a specific time zone.

Q: Can we control the order in which the rules are invoked/executed?

  • Ans: No, this is not possible as of now; lifecycle rules are evaluated concurrently.

Q: How does expiration work in the case of buckets without versioning enabled?

  • Ans: In a bucket without versioning enabled, the expiration rule permanently deletes the object.

Q: Do we need a second rule to delete expired objects in buckets without versioning?

  • Ans: No. The second rule, which permanently removes delete markers, is not needed for a bucket without versioning enabled.

Q: Is it possible to recover the deleted objects?

  • Ans: No, deleted objects cannot be recovered.

Q: If the setup expires objects after 1 day and deletes them after 1 day, will the delete rule wait one more day after the object has expired?

  • Ans: In the case of a non-versioned bucket, this setup makes no difference: the object will simply be expired after 1 day.

  • When you are using a versioning-enabled bucket, the first rule creates a delete marker for the object after 1 day. The second rule waits for the delete marker to become 1 day old, and then permanently removes it.

  • So your object will be deleted after 2 days.

For example, if the period is set to 45 days:

  • In the case of a non-versioned bucket, the object is removed after 45 days.

  • In the case of a versioning-enabled bucket, the object is converted to a delete marker after 45 days, and the delete marker expires after another 45 days, so the total time to permanent deletion is 90 days.
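
For a versioning-enabled bucket, the two rules described above might be expressed with boto3 roughly as follows (the bucket name is a placeholder; the split into two rules reflects that S3 does not accept `Days` and `ExpiredObjectDeleteMarker` in the same `Expiration` action):

```python
import boto3

s3 = boto3.client("s3")

# Rule 1 turns current versions into delete markers after 45 days.
# Rule 2 removes noncurrent versions 45 days after they become
# noncurrent, and cleans up delete markers left with no versions.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-versioned-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-current-versions",
                "Filter": {"Prefix": ""},  # whole bucket
                "Status": "Enabled",
                "Expiration": {"Days": 45},
            },
            {
                "ID": "clean-up-old-versions-and-markers",
                "Filter": {"Prefix": ""},
                "Status": "Enabled",
                "NoncurrentVersionExpiration": {"NoncurrentDays": 45},
                "Expiration": {"ExpiredObjectDeleteMarker": True},
            },
        ]
    },
)
```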

Q: When expired data is deleted, the entire folder is deleted instead of just the files in it. How can we delete only the expired objects and keep the folder intact?

  • Ans: Amazon S3 has a flat structure with no hierarchy as you would see in a typical file system. However, for the sake of organizational simplicity, the Amazon S3 console supports the folder concept as a means of grouping objects, which it does by using key name prefixes. As of now, there is no way to keep the folder intact.

Q: What is the frequency of S3 lifecycle rules?

  • Ans: S3 lifecycle runs only once a day, at 00:00 UTC, and marks the objects that fall under its purview for the actions you have configured.

Q: Why are the expired objects not deleted immediately?

  • Ans: S3 removes expired objects from the bucket asynchronously on the backend. This can take some time to complete, as S3 performs the operation while ensuring that the service remains available.

Q: Will there be a delay in deleting the files?

  • Ans: Yes. If you have a large number of objects, there may be a delay before all of them are deleted.

Q: Is the expired object charged?

  • Ans: Once the lifecycle rule has marked the data for deletion, you do not incur any charges for the storage. For example, if an object is scheduled to expire and Amazon S3 does not immediately expire it, you won't be charged for storage after the expiration time.

Q: Is there any cost for the S3 life cycle?

  • Ans: There is no cost for applying the S3 lifecycle.

Q: What is a transition cost?

  • Ans: Transitioning data from S3 Standard to S3 Standard-Infrequent Access is charged at $0.01 per 1,000 lifecycle transition requests.
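
As a rough worked example at the rate quoted above (regional prices may differ): transitioning 1,000,000 objects would cost 1,000,000 / 1,000 × $0.01 = $10 in transition requests, independent of the size of the objects.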

Q: Is there a cost for expiration action?

  • Ans: No, there is no cost for deleting objects via lifecycle.

Q: Are there any charges for early deletion?

  • Ans: Yes. S3 offers a few storage classes, such as S3 Glacier Deep Archive and S3 One Zone-IA, which have a minimum storage duration constraint. If an object in such a storage class is deleted before its minimum storage duration has elapsed, you will incur an early-deletion charge.

Q: Are there any other costs involved in the S3 lifecycle?

  • Ans: There are per-request ingest charges when using PUT, COPY, or lifecycle rules to move data into any S3 storage class. Consider the ingest or transition cost before moving objects into any storage class.

Q: Can compressing the file help?

  • Ans: Compressing files can reduce the amount of storage your data claims, but this is only worthwhile if compression makes a significant difference in size. S3 has no native capability for compressing data; one alternative is to download the data, compress it, re-upload it, and then delete the uncompressed copy, as sketched below.
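
A minimal boto3 sketch of that download, compress, and re-upload flow (the bucket and key are placeholders; large objects would be streamed rather than buffered in memory):

```python
import gzip

import boto3

s3 = boto3.client("s3")
bucket, key = "my-example-bucket", "exports/report.json"  # placeholders

# Download the original object into memory.
body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

# Compress and re-upload under a new key.
s3.put_object(
    Bucket=bucket,
    Key=key + ".gz",
    Body=gzip.compress(body),
    ContentEncoding="gzip",
)

# Delete the uncompressed original once the upload has succeeded.
s3.delete_object(Bucket=bucket, Key=key)
```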

Q: Is there any other option to keep the files but reduce the cost?

  • Ans: You can consider transitioning your data to a storage class that offers a reduced storage rate. If your use case is archiving objects that you rarely access, consider S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, or S3 Glacier Deep Archive. Alternatively, if you can bear the loss of data in the event of the physical loss of an Availability Zone due to a disaster, S3 One Zone-IA is a cheaper alternative to the S3 Standard storage class.
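
As an illustration, a transition rule that archives data with boto3 might look like this (the names and the 90-day threshold are placeholders; pick the storage class that matches your retrieval needs):

```python
import boto3

s3 = boto3.client("s3")

# Move objects under "archive/" to Glacier Deep Archive after 90 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-old-data",
                "Filter": {"Prefix": "archive/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 90, "StorageClass": "DEEP_ARCHIVE"}
                ],
            }
        ]
    },
)
```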

Q: What are S3 Glacier and S3 Glacier Deep Archive storage types?

  • Ans: The S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive storage classes are designed for low-cost data archiving. They offer the same durability and resiliency as the S3 Standard and S3 Standard-IA storage classes, but at reduced storage rates. Note, however, that these classes carry retrieval charges. Objects in S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive are not available for real-time access; you must restore them before you can access them. You can reduce S3 Glacier Deep Archive retrieval costs by using bulk retrieval, which returns data within 48 hours.

  • Also, S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive come with minimum storage durations of 90 and 180 days respectively. Hence, if your use case is to archive rarely accessed data for a long duration, you may consider transitioning it to S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, or S3 Glacier Deep Archive, depending on your retrieval needs.
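
To make the restore step concrete, a boto3 sketch of a bulk restore request could look like this (the bucket and key are placeholders; the restored copy stays readable for the number of days requested):

```python
import boto3

s3 = boto3.client("s3")

# Ask S3 to restore an archived object using the cheaper Bulk tier;
# the temporary restored copy remains readable for 7 days.
s3.restore_object(
    Bucket="my-example-bucket",  # placeholder
    Key="archive/2020/report.parquet",  # placeholder
    RestoreRequest={
        "Days": 7,
        "GlacierJobParameters": {"Tier": "Bulk"},
    },
)
```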

Q: What prefix do we need to provide?

  • Ans: When specifying a prefix, provide only the prefix path (for example, `myfolder/`), not the full path including the S3 bucket name.

Q: Is there any cost for deleting the objects?

  • Ans: DELETE API requests are free and you are not charged for these.

Q: Can we set up a rule for a specific file type?

  • Ans: No, S3 lifecycle filters do not support matching on file type (object key suffix); see the `.csv` workaround below.

Q: How to delete specific files?

  • Ans: You can create a lifecycle rule that filters on a tag applied to the specific objects, or on the prefix (folder) you want the rule to apply to.

Q: How to create a lifecycle rule to expire only .csv files older than 45 days?
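
  • Ans: Lifecycle filters cannot match a `.csv` suffix directly, so one workaround is to tag the `.csv` objects and create a rule that expires objects carrying that tag after 45 days.

A hedged boto3 sketch of that workaround (the bucket name and the `Expire=true` tag are illustrative choices; newly uploaded `.csv` files would need to be tagged as well, for example at upload time):

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-example-bucket"  # placeholder

# Tag every .csv object so a tag-filtered lifecycle rule can expire it.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get("Contents", []):
        if obj["Key"].endswith(".csv"):
            s3.put_object_tagging(
                Bucket=bucket,
                Key=obj["Key"],
                Tagging={"TagSet": [{"Key": "Expire", "Value": "true"}]},
            )

# Expire the tagged objects 45 days after creation.
s3.put_bucket_lifecycle_configuration(
    Bucket=bucket,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-tagged-csv-files",
                "Filter": {"Tag": {"Key": "Expire", "Value": "true"}},
                "Status": "Enabled",
                "Expiration": {"Days": 45},
            }
        ]
    },
)
```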

Q: When we set up the days for the expiry of an object, which value is considered created or modified?

  • Ans: Please note that lifecycle actions such as expiration use the object creation date to calculate the object's lifespan; the object is deleted after the number of days specified in the "Days after object creation" field.

Q: What are the limitations of S3 filters?

  • Ans: At this time, S3 supports filters based only on prefix and/or object tags. For example, you can create a filter for the prefix "myfolder/" with the tag "Expire": "true" so that only objects under that prefix with the specified tag are expired.

  • The caveats of using tags are that you must tag each object individually, and that adding and maintaining tags incurs extra charges (see the next two questions).
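
The prefix-plus-tag filter above would look roughly like this in boto3 (the bucket name and the 30-day period are illustrative):

```python
import boto3

s3 = boto3.client("s3")

# Expire only objects that are both under "myfolder/" AND tagged Expire=true.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-tagged-objects-in-myfolder",
                "Filter": {
                    "And": {
                        "Prefix": "myfolder/",
                        "Tags": [{"Key": "Expire", "Value": "true"}],
                    }
                },
                "Status": "Enabled",
                "Expiration": {"Days": 30},  # illustrative period
            }
        ]
    },
)
```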

Q: Is there a cost for tagging an object?

  • Ans: Yes. To PUT the tags, you are charged $0.005 per 1,000 PUT requests.

Q: Is there a cost for maintaining the tags?

  • Ans: Yes. To maintain the tags, you are charged $0.01 per 10,000 tags per month.
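
As a worked example at the rates quoted above: tagging 1,000,000 objects costs 1,000,000 / 1,000 × $0.005 = $5 in PUT requests, plus 1,000,000 / 10,000 × $0.01 = $1 per month to maintain the tags (check current regional pricing for exact figures).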

Q: What else do we need to consider?

  • Ans: Please note that apart from storage, you also pay for requests made against your S3 buckets and objects, so take these into account when estimating your overall cost.
