The Cloud Computing buzz is everywhere. The idea of grid computing on the Internet that provides elasticity and virtualization of resources is quite appealing. It has prompted a lot of recent academic brainstorming, which in turn has produced abstract ideas on how cloud computing is destined to change the way technology resources are deployed and used.
Until now, small developers did not have the capital to acquire massive compute resources and ensure they had the capacity they needed to handle unexpected spikes in load. Amazon EC2 enables any developer to leverage Amazon’s own benefits of massive scale with no up-front investment or performance compromises. Developers are now free to innovate knowing that no matter how successful their businesses become, it will be inexpensive and simple to ensure they have the compute capacity they need to meet their business requirements.
The "Elastic" nature of the service allows developers to instantly scale to meet spikes in traffic or demand. When computing requirements unexpectedly change (up or down), Amazon EC2 can instantly respond, meaning that developers have the ability to control how many resources are in use at any given point in time. In contrast, traditional hosting services generally provide a fixed number of resources for a fixed amount of time, meaning that users have a limited ability to easily respond when their usage is rapidly changing, unpredictable, or is known to experience large peaks at various intervals.
I was able to go through the Getting Started Guide and I had myself a Linux environment in the Amazon cloud in no time:
Amazon EC2 presents a true virtual computing environment, allowing you to use web service interfaces to requisition machines for use, load them with your custom application environment, manage your network’s access permissions, and run your image using as many or few systems as you desire.
To use Amazon EC2, you simply:
* Create an Amazon Machine Image (AMI) containing your applications, libraries, data and associated configuration settings. Or use pre-configured, templated images to get up and running immediately.
* Upload the AMI into Amazon S3. Amazon EC2 provides tools that make storing the AMI simple. Amazon S3 provides a safe, reliable and fast repository to store your images.
* Use Amazon EC2 web service to configure security and network access.
* Start, terminate, and monitor as many instances of your AMI as needed, using the web service APIs.
* Pay only for the resources that you actually consume, like instance-hours or data transfer.
Based on my recent experience, here are some initial thoughts (with bias on security):
To add your AMI:
1. Make sure you’re logged in to the site.
2. Click the “Add a Document” link in the Tools box to the right.
3. Enter as much information as possible in the form, then click Preview. (Tip: You can use HTML in the listing body.)
4. If everything looks good, click Submit. Otherwise, click “Go Back/Edit” and make your corrections.
Important: Your listing will show up on the site after a quick review by AWS.
Interesting. I wonder what the "quick review" entails. What if someone submits an AMI with a back-door installed? Does the Amazon team have the resources and processes to identify malicious AMIs before sharing them with their customers?
Keys to the Cloud. The Amazon services are built on a key-based approach (you get your own key-pair to authenticate to Amazon and to sign your own AMIs) - this is good. It adds the burden of key management, but it is still a better approach than a static password-based system.
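To make the contrast with a static password concrete, here is a minimal sketch of signing a request with a secret key. It is an assumption-laden stand-in, not Amazon's actual scheme: it uses a symmetric HMAC from the Python standard library instead of a real key-pair, and the names `sign` and `verify` and the request strings are hypothetical. The point it illustrates is that the secret itself never travels with the request, and that a signature is only valid for the exact message it was computed over.

```python
import hashlib
import hmac


def sign(secret_key: bytes, message: bytes) -> str:
    """Return a hex signature for the message (HMAC-SHA256 stand-in)."""
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify(secret_key: bytes, message: bytes, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign(secret_key, message), signature)


# Hypothetical key and request, for illustration only.
key = b"example-secret-key"
request = b"action=RunInstances&image=ami-hypothetical"
signature = sign(key, request)

# Only the request and its signature are sent; the key stays with the caller.
print(verify(key, request, signature))
print(verify(key, b"action=TerminateInstances", signature))
```

A captured signature cannot be reused for a different request, which is one reason signed requests are easier to defend than a password that is replayed on every call.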
Firewall. Instances initially boot in a firewalled environment where you have to explicitly open up ports to allow inbound access. This is a good approach as well.
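A rough sketch of what "default deny, explicitly open" means in practice follows. This is a toy model, not the EC2 API: the `SecurityGroup` and `Rule` classes, the `authorize` and `allows` methods, and the example addresses are all hypothetical names chosen for illustration.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """One explicit allow rule: protocol, port, and permitted source range."""
    protocol: str
    port: int
    source_cidr: str


@dataclass
class SecurityGroup:
    """Hypothetical default-deny inbound firewall: no rules means no access."""
    rules: set = field(default_factory=set)

    def authorize(self, protocol: str, port: int, source_cidr: str) -> None:
        """Explicitly open one port to one source range."""
        self.rules.add(Rule(protocol, port, source_cidr))

    def allows(self, protocol: str, port: int, source_ip: str) -> bool:
        """Inbound traffic passes only if some rule matches it."""
        addr = ipaddress.ip_address(source_ip)
        return any(
            rule.protocol == protocol
            and rule.port == port
            and addr in ipaddress.ip_network(rule.source_cidr)
            for rule in self.rules
        )


group = SecurityGroup()
print(group.allows("tcp", 22, "203.0.113.7"))    # nothing is open yet
group.authorize("tcp", 22, "203.0.113.0/24")     # open SSH to one range
print(group.allows("tcp", 22, "203.0.113.7"))
print(group.allows("tcp", 80, "203.0.113.7"))    # other ports stay closed
```

The design choice worth noting is that there is no "deny" rule at all: anything not explicitly authorized is refused, so forgetting a rule fails closed.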
Dude, Where’s My Data? Data in the virtual instance only persists as long as the instance is running. From Amazon’s FAQ:
Q: What happens to my data when a system terminates?
The data stored on a specific instance persists only as long as that instance is alive. You have several options to persist your data:
1. Prior to terminating an instance, backup the data to persistent storage, either over the Internet, or to Amazon S3.
2. Run a redundant set of systems with replication of the data between them.
We recommend you should not rely on a single instance to provide reliability for your data.
This means that existing applications and systems need to be re-engineered to persist data in the cloud (outside of the virtual environment). This makes sense: in order to take advantage of the elasticity of cloud computing, your data has got to be ‘in the cloud’ and not tied to a single virtual instance. This may have some legal implications, and some organizations may (initially) be uncomfortable with the idea that their live data does not persist on their own hardware.
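The following toy model sketches the first option from the FAQ above: back up before terminating. Everything in it is hypothetical (the `Instance` and `PersistentStore` classes, the instance id, the file name); the in-memory dictionary merely stands in for a durable service such as S3.

```python
class PersistentStore:
    """Stand-in for durable storage outside the instance (a dict here)."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._objects[key]


class Instance:
    """Hypothetical virtual instance whose local disk dies with it."""

    def __init__(self, instance_id: str):
        self.instance_id = instance_id
        self.local_disk = {}
        self.running = True

    def write_local(self, path: str, data: bytes) -> None:
        self.local_disk[path] = data

    def backup_to(self, store: PersistentStore) -> None:
        """Copy every local file to the store, namespaced by instance id."""
        for path, data in self.local_disk.items():
            store.put(f"{self.instance_id}/{path}", data)

    def terminate(self) -> None:
        """Termination discards everything on the local disk."""
        self.local_disk.clear()
        self.running = False


store = PersistentStore()
instance = Instance("i-hypothetical")
instance.write_local("orders.db", b"order-1")
instance.backup_to(store)      # must happen before terminate()
instance.terminate()
print(instance.local_disk)     # local data is gone
print(store.get("i-hypothetical/orders.db"))
```

Swapping the order of `backup_to` and `terminate` loses the data silently, which is exactly the kind of mistake a re-engineered application has to guard against.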
I’m afraid we are likely to see cases of security issues arising from badly re-engineered application code as developers attempt to code their applications to persist data using services like S3 instead of local data stores.
Risk Based on Data. Organizations often fail to build a risk inventory of their assets based on the type of data each system reads and writes. The cloud computing paradigm will force organizations to think of data first when building a risk inventory. This is a good thing.
Security Principles. Obvious and well known security principles apply to cloud based services like EC2. You’ve got to ensure that your VMs are configured securely, that your applications are developed securely, and that you communicate securely - think about authentication, authorization, access control, cryptography, and monitoring on all layers and tiers of the system.
The Threat of Mono-culture. I’m reminded of Dan Geer’s words on the threat of mono-culture. If you start up hundreds of instances of a virtual image, a vulnerability in one instance will apply to all other instances of the same image. Imagine a situation where a remotely exploitable vulnerability is found in the generic kick start image Amazon recommends to its customers - suddenly, the security of a considerable amount of resources and data within the cloud will be at stake.
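The arithmetic behind the mono-culture concern is simple enough to sketch. The function name `exposure`, the instance ids, and the image ids below are all made up for illustration; the sketch only computes what share of a fleet runs a given image.

```python
from collections import Counter


def exposure(fleet: dict, vulnerable_image: str) -> float:
    """Fraction of instances in the fleet that run the vulnerable image.

    `fleet` maps a (hypothetical) instance id to the image id it booted from.
    """
    if not fleet:
        return 0.0
    return Counter(fleet.values())[vulnerable_image] / len(fleet)


# A fleet cloned from one image: a single flaw reaches every instance.
monoculture = {f"i-{n}": "ami-generic" for n in range(100)}
print(exposure(monoculture, "ami-generic"))

# A fleet split across two images: the same flaw reaches half of it.
mixed = {f"i-{n}": ("ami-generic" if n % 2 else "ami-custom") for n in range(100)}
print(exposure(mixed, "ami-generic"))
```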
Cloud Insecurity. Security issues within the Amazon web services will have an extremely high impact on EC2 customers. For example, suppose a malicious user is able to invoke the services behind ec2-terminate-instances to terminate instances outside of his or her role. Such a vulnerability could be abused to black-out the Amazon cloud.
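To illustrate the kind of check whose absence would cause the scenario above, here is a minimal sketch of an ownership check in front of a terminate call. It is not how Amazon implements its service; the `Cloud` class, the `NotAuthorized` exception, and the account and instance names are assumptions made for the example.

```python
class NotAuthorized(Exception):
    """Raised when a caller tries to act on an instance it does not own."""


class Cloud:
    """Hypothetical control plane that tracks which account owns each instance."""

    def __init__(self):
        self._owners = {}  # instance id -> owning account

    def launch(self, account: str, instance_id: str) -> None:
        self._owners[instance_id] = account

    def terminate_instances(self, caller: str, instance_ids: list) -> list:
        # Authorize the whole request first, so a partially
        # unauthorized request changes nothing.
        for instance_id in instance_ids:
            owner = self._owners.get(instance_id)
            if owner is None or owner != caller:
                raise NotAuthorized(f"{caller} may not terminate {instance_id}")
        for instance_id in instance_ids:
            self._owners.pop(instance_id, None)
        return list(instance_ids)

    def running(self) -> list:
        return sorted(self._owners)


cloud = Cloud()
cloud.launch("alice", "i-1")
cloud.launch("bob", "i-2")
try:
    cloud.terminate_instances("mallory", ["i-1", "i-2"])
except NotAuthorized as error:
    print("refused:", error)
print(cloud.running())
```

If the ownership comparison were missing or could be bypassed, a single caller could pass every instance id in the system, which is the black-out scenario in miniature.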
Perimeter? What Perimeter? The concept of relying on a network-based perimeter has been steadily losing ground. Cloud computing services like EC2 will be a catalyst for this shift - data and resources will be distributed in a shared cloud space. The concept of a network-based perimeter will no longer apply. Instead, security controls will need to be assured on all layers and tiers of the architecture. However, there are bound to be cases where organizations will try to build trust within the cloud to construct a virtual perimeter that imitates legacy designs.
Service Provider Liability. As the concept of cloud computing gains ground, it is likely that the service providers will seek to implement technical solutions that will allow them to provide resources in the cloud without the legal liability of hosting and computing secret or illegal data. For example, a consumer or legal requirement may call for the customer of the cloud to be able to compute or store data in the cloud without exposing the computation result or data to the provider. This may help tangible products emerge from academic concepts such as zero-knowledge-based solutions.
Single Point of Failure. Amazon provides the concept of Zones and Regions (currently limited to 1 region):
Amazon EC2 now provides the ability to place instances in multiple locations. Amazon EC2 locations are composed of regions and availability zones. Regions are geographically dispersed and will be in separate geographic areas or countries. Currently, Amazon EC2 exposes only a single region. Availability zones are distinct locations that are engineered to be insulated from failures in other availability zones and provide inexpensive, low latency network connectivity to other availability zones in the same region. Regions consist of one or more availability zones. By launching instances in separate availability zones, you can protect your applications from failure of a single location.
This is good - Amazon allows for instances to be booted in different zones to limit the impact of a failure at a particular location. But what about Amazon as a whole as the single point of failure? Distributing resources geographically makes this scenario less probable. As cloud offerings from other companies emerge, it may make sense for larger organizations to also host on other cloud services to further reduce the risk of a single point of failure. Doing so could be a little difficult since competing services may require adherence to specific programming languages and environments. For example, the Google App Engine SDK is currently limited to Python and is not based on the concept of allowing users to configure full-blown virtual environments. Perhaps I’ll write my thoughts on the Google App Engine in the near future.
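Returning to the availability zone idea, a small sketch shows why spreading instances matters. The functions `place_instances` and `survivors` and the zone names are hypothetical; this is a round-robin toy, not a description of how EC2 places instances.

```python
from collections import Counter
from itertools import cycle


def place_instances(count: int, zones: list) -> list:
    """Assign `count` instances to zones in round-robin order."""
    if not zones:
        raise ValueError("at least one zone is required")
    zone_cycle = cycle(zones)
    return [next(zone_cycle) for _ in range(count)]


def survivors(placement: list, failed_zone: str) -> int:
    """Count the instances that remain if one zone fails entirely."""
    return sum(1 for zone in placement if zone != failed_zone)


spread = place_instances(6, ["zone-a", "zone-b", "zone-c"])
single = place_instances(6, ["zone-a"])

print(Counter(spread))
print(survivors(spread, "zone-a"))   # most of the fleet keeps running
print(survivors(single, "zone-a"))   # the whole fleet is lost
```

The same reasoning applies one level up: zones protect against the loss of a location, but not against the loss of the provider, which is why the question of Amazon as a single point of failure remains.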
I’m excited about the concept of cloud based computing. It’s the future, and Amazon has done a good job of turning the hype into reality. I’ll be interested to see how Google’s offerings mature, and what Microsoft and IBM have up their sleeves.
These are just my initial thoughts on security implications of the emerging cloud computing paradigm. I’ll continue to post updates as I have time to think about it some more. If you’d like to share some ideas, I’d be interested to hear them.