Home > Software design >  What does EC2 store and why does it even need a storage solution like EBS or Instance Store?
What does EC2 store and why does it even need a storage solution like EBS or Instance Store?

Time:12-06

If you use EC2 and launch instances, you can add EBS volumes. So a storage option. However, what I still don't understand exactly is why. Why is there or does EC2 even need a storage option like EBS or Instance Store? What does EC2 store anyway? And why it makes sense that there is EBS?

I know that EBS volume is persistent block storage and data is not lost after exit, unlike instance store. I just don't really understand what EBS is useful for. For which cases and applications is EBS used? Or does using EBS have more to do with creating snapshots that you can create to cache data and then save it to S3?

I've already read a lot and tried to make it understandable somehow, but somehow I can't get any further here. I would be really happy if someone could shed some light on this for me. Thank you already!

CodePudding user response:

Think of an Amazon EC2 instance as a normal computer. Inside, there is CPU, RAM and (perhaps) a hard disk.

When an EC2 instance has a hard disk, it is called Instance Storage and it behaves just like a normal hard disk in a computer. However, when you turn off the instance and stop paying for it, the EC2 instance can give that computer to somebody else. Rather than giving your data to somebody else, the disk is erased. So, anything you stored on Instance Store is gone! (In truth, instance store is also a virtualised disk, but this is close enough.)

In fact, in the early days of EC2, this was the only storage available. If you wanted to keep data after the instance was turned off, you first had to copy it to Amazon S3. People didn't like this, so they invented Amazon EBS.

If you want to keep your data so that it is still there when you turn on the instance in future, it needs to be stored on a network disk and that is what Amazon EBS provides. Think of it a bit like a USB drive that you can plug into one computer, then disconnect it and plug it into another computer. However, rather than being a physical device, it uses a storage service that keeps multiple copies of the data (in case a disk fails) and lets you modify the size of the disk. You are charged based on the amount of storage space assigned and how long the data is kept ("GB-Month").

Amazon EBS Snapshots are simply a backup of the disk. A snapshot contains all the data currently on the disk, allowing you to create a new disk anytime that will contain an exact copy of the disk as it was when the snapshot was created. This is great for backups, but is also very useful for creating multiple EC2 instances with the same disk content. An Amazon Machine Image (AMI) is actually just an Amazon EBS Snapshot plus a bit of metadata. When a new EC2 instance is launched, it uses an AMI to populate the boot disk rather than loading the operating system from scratch every time.

It is possible to create an AMI that populates an Instance Store disk. This way, you don't actually need to use an Amazon EBS volume. This is good for instances that don't need to permanently keep any data -- they could simply store information in a database or Amazon S3 instead of saving it on disk. Instance Store disks can be very fast since they don't send data across the network, so this is very useful in some situations.

In summary:

  • Instance Store is a normal disk in a computer (but it gets erased when the instance turns off so nobody else sees your data)
  • Amazon EBS volumes are network-attached storage that stays around until you delete it
  • Related