What is Cloud Computing?
Until recently, computing meant a program that ran on a desktop or
laptop computer on your desk, or a server in your lab. Or, using the
internet, you could use a program that was running on a server somewhere
else in the world. But it was always a specific piece of hardware in a
specific location that was running the program.
In the context of cloud computing, "cloud" refers to the internet, and
cloud computing means that the computing is happening somewhere in the
cloud. You don't know where the computing is happening; most of the
time you can't know where it is happening (since it can keep moving
around); and, most importantly, you don't care.
Some service provider is providing you with virtual computers, or
virtual disks, or virtual file-systems, or virtual databases, or even
higher level constructs (to be described later), and guaranteeing that
they will take care of everything related to the virtual hardware that
you got - you just need to upload your program and run it.
To understand this better, consider milk. In the old days, everybody
had a cow. And they squirted milk out of the cow. And then made butter,
buttermilk, paneer, paneer pakodas, and ras-malai from it. But
more recently, businesses have sprung up who will deliver you milk in
plastic packets at your door, or even butter, paneer pakodas and
ras-malai. This model has proved to be so convenient to people
(especially those who hate the smell of cow-dung) that very few people
now have cows. (If you're a student, and have a cow in your hostel,
please let us know - we'd love to hear from you!)
The cow in this example is like the hardware. And milk is the
product. Having your own cow is equivalent to traditional computing
using your own labs. Getting milk delivered to your door is cloud
computing.
How is Cloud Computing Implemented?
Cloud computing largely depends upon virtualization technology.
Virtualization refers to the technique in which all the capabilities of a
piece of hardware are faithfully reproduced in a software program. So,
for example, a virtual machine has a virtual CPU, virtual memory and a
virtual disk. The virtual CPU might emulate, for example, an Intel x86
chip, and then it is able to take an executable file consisting of x86
instructions and execute them all. This virtual machine thus behaves
just like a real machine - you can install an Operating System on it,
you can boot into the OS, and then install other programs into the OS
that you just installed. You can reboot the machine, power it off, and
power it on again, just like real machines.
However, the virtual machine is just an executable program that is
stored on some large real machine somewhere. When the executable is run,
it behaves like a machine. When the program is shut down, it saves the
entire state of the machine (including contents of the virtual RAM,
contents of the virtual disk, contents of the CPU registers, etc.) to a
file on the real disk of the real machine.
The most interesting thing is that the program and the data of the
virtual machine can be copied to another real machine, and when the
program is run there, it will behave exactly like the first virtual
machine, and continue executing from exactly where it left off. Thus, a
key feature of virtual machines is that they can be moved from one real
machine to another, and in fact, from one geographic location to
another, with very little effort. Advanced virtualization techniques
allow this kind of virtual machine migration to be done without
requiring the virtual machines to be shut down - i.e. the virtual
machine can move from one location to another while the programs
inside it continue to run uninterrupted.
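The save-and-resume idea described above can be illustrated with a toy sketch. This is not how VMware or Xen work internally; `ToyVM`, its two opcodes, and the JSON snapshot format are all made up for illustration. The point is only that once the entire machine state is data, it can be copied elsewhere and execution can continue from where it left off.

```python
import json


class ToyVM:
    """A toy 'virtual machine': a program, a program counter, and one register."""

    def __init__(self, program, pc=0, acc=0):
        self.program = program  # list of (opcode, operand) pairs
        self.pc = pc            # virtual program counter
        self.acc = acc          # virtual accumulator register

    def step(self):
        """Execute one instruction; return False once the program has finished."""
        if self.pc >= len(self.program):
            return False
        op, arg = self.program[self.pc]
        if op == "ADD":
            self.acc += arg
        elif op == "MUL":
            self.acc *= arg
        else:
            raise ValueError(f"unknown opcode: {op}")
        self.pc += 1
        return True

    def snapshot(self):
        """Serialize the whole machine state, like saving it to a file on a real disk."""
        return json.dumps({"program": self.program, "pc": self.pc, "acc": self.acc})

    @classmethod
    def restore(cls, blob):
        """Rebuild a machine from a snapshot, possibly on a different real machine."""
        state = json.loads(blob)
        return cls(state["program"], state["pc"], state["acc"])


def run_to_completion(vm):
    """Keep stepping until the program ends, then return the register value."""
    while vm.step():
        pass
    return vm.acc


program = [("ADD", 2), ("MUL", 10), ("ADD", 1), ("MUL", 3)]

# Run the whole program on one "host".
uninterrupted = run_to_completion(ToyVM(program))

# Run half of it on host A, take a snapshot, and resume on host B.
vm_a = ToyVM(program)
vm_a.step()
vm_a.step()
blob = vm_a.snapshot()       # this string could be copied to any other machine
vm_b = ToyVM.restore(blob)
migrated = run_to_completion(vm_b)

print(uninterrupted, migrated)  # both runs end with the same register value
```

A real virtual machine has far more state (RAM, disk, device registers), and live migration copies that state while the machine keeps running, but the principle is the same.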
VMware is the leader in building virtual machine software, and Xen is
the most important open source alternative to it. There are of course
many other smaller players in this market.
Types of Cloud Computing - IaaS, PaaS and SaaS
There are different kinds of cloud computing, but before we
understand them, we need to understand what computing is. Computing
can really be broken up into these pieces:
- Hardware
  - CPU
  - Memory
  - Disk (File-System, Database)
- Software
  - Operating System (Linux, Windows, Solaris)
  - Software Development Environment (Visual Studio, Java+Eclipse, Ruby on Rails, Python)
  - The actual programs/applications that people use (Documents,
    Spreadsheets, Sales Management Software, Customer Relationship
    Management Software, Accounting Packages, etc.)
Each of the things mentioned above can be 'virtualized' and put in
the cloud independently. Thus, Amazon EC2 gives a CPU+memory in the
cloud. Amazon EBS gives a disk in the cloud, and S3 gives a file-system
in the cloud. Microsoft Azure gives a Visual Studio Development
environment in the cloud - so that apps developed using Visual Studio
can be run 'in the cloud' without you having to worry about the hardware
that they run on. Similarly, Google App Engine gives a Java or Python
environment in the cloud where you can run your Java/Python apps and
they take care of the hardware. Finally, Google Docs is an example of
software in the cloud - you directly create documents, presentations,
spreadsheets via your web browser. Microsoft too has Office 365, which
is their SaaS offering and includes SaaSified versions of Word, Excel,
etc. There is no hardware or software to install.
Depending upon what is being virtualized, we get three types of Cloud Computing:
- IaaS or Infrastructure as a Service: these are various services where
the hardware is being virtualized. Virtual machines (i.e. CPU + Memory),
virtual disks (e.g. Amazon EBS), virtual file-systems (e.g. Amazon S3),
virtual databases (e.g. Google BigTable, Amazon SimpleDB, SQL Azure) are
all examples of infrastructure. Basically, these are services that are
looking to replace all the hardware infrastructure that sits in your
server rooms and labs.
- PaaS or Platform as a Service: these are various services where the
software development platform (i.e. programming language, runtime
environment, etc.) is being virtualized. Google AppEngine (Java/Python),
Microsoft Azure (.NET/Visual Studio) are examples of PaaS. In the cow &
milk example, PaaS would be equivalent to getting paneer or khoya
delivered to your home. You can use this to cook your own delicious items.
- SaaS or Software as a Service: these are various
services that have decided to skip the hardware and software engineers
altogether and directly approach the end-user with software that s/he
wants to use. In IaaS you can install your own OS and software and use
it. In PaaS you can write programs in that platform and run them. In
SaaS you need to do nothing. There is ready-made software that you can
directly start using. An example is Salesforce, software used by sales
agents. In the cow & milk example, SaaS is equivalent to home-delivery of
cooked food (paneer makhanwala and ras-malai).
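One way to keep the three types straight is to ask how many layers of the computing stack the provider takes off your hands. The sketch below is a simplification under that assumption; the layer names follow the breakdown earlier in this section, and `PROVIDER_MANAGES` and `split_responsibility` are illustrative names, not part of any real service.

```python
# Layers of "computing", bottom to top, following the breakdown in this article.
LAYERS = [
    "hardware",
    "operating system",
    "software development environment",
    "application",
]

# How many layers, counted from the bottom, the provider takes care of.
# This is a simplification: real offerings blur these boundaries.
PROVIDER_MANAGES = {"IaaS": 1, "PaaS": 3, "SaaS": 4}


def split_responsibility(model):
    """Return (layers the provider handles, layers you handle) for a model."""
    count = PROVIDER_MANAGES[model]
    return LAYERS[:count], LAYERS[count:]


for model in ("IaaS", "PaaS", "SaaS"):
    provider, you = split_responsibility(model)
    yours = ", ".join(you) if you else "nothing"
    print(f"{model}: provider handles {', '.join(provider)}; you handle {yours}")
```

Read top to bottom, the output mirrors the text: with IaaS you still install your own OS and software, with PaaS you only write the application, and with SaaS there is nothing left for you to set up.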
Advantages of Cloud Computing
There are a number of advantages Cloud Computing has over the old way of doing things:
- Convenience: Cloud Computing is easy. Not having to deal with real machines, and disk failures, and electricity failures, etc
is a huge benefit. Anyone who has had to deal with cleaning cow-dung,
going to the vet for treating cow diseases, and complaining neighbours
will appreciate the huge convenience of milk packets over having a cow.
- Cost: There are two different cost advantages to cloud computing.
Sometimes it is cheaper than the physical alternative. At other times,
the advantage comes from the fact that you have to pay small
installments every month instead of a large chunk of money when you're
buying the infrastructure.
  - Cheaper: Usually cloud computing turns out to be cheaper. This is
  mainly because cloud computing providers are able to share their
  infrastructure across a large number of customers, giving them economies
  of scale, and higher utilization. You can't buy just half of a physical
  server, but the typical IaaS provider sells low-end virtual machines
  which are roughly equivalent to 1/10th of a server.
  - Pay-as-you-go: Imagine you're a startup. Buying a server will cost
  you $4000. And that's Rs. 2L that you don't really have right now. By
  contrast, buying compute cycles on Amazon EC2 might cost you $100 per
  month - which is much more manageable. And at times when you're not
  really using the server, you shut it off, and don't pay for it. If
  during a busy month, you need two servers, you get a second server for
  just one month, and then delete it at the end of the month. Much better
  than having to buy an entire second server that will be useless after
  the first month.
- Easy scalability: If you're a growing company, and the demand for
computing suddenly increases (for example, if your website is mentioned
in TechCrunch and you suddenly get 10,000 new customers), it is very
difficult to suddenly scale up your physical infrastructure. That would
involve buying new servers, migrating programs, files, and databases.
And a whole bunch of other setup. By contrast, IaaS providers provide
these services at the click of a button. PaaS and SaaS providers take
care of scaling completely, in a manner transparent to you, and you
don't even need to think about it.
- Location Independence: A cloud computing service can be used from
wherever you are, whereas most physical infrastructure ties you down
to one place.
There are a bunch of other advantages, but this should be enough to
keep you happy for now. Check out our further reading section if you
want more on this.
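To make the pay-as-you-go argument concrete, here is a rough back-of-the-envelope sketch using the $4000 and $100 figures from the example above. The usage pattern in `year` is hypothetical, and the calculation deliberately ignores electricity, maintenance, and resale value, so treat it as an illustration of the reasoning, not a pricing tool.

```python
SERVER_PRICE = 4000   # one-time cost of buying a server, in dollars (from the text)
CLOUD_MONTHLY = 100   # monthly cost of one cloud server, in dollars (from the text)


def owned_cost(servers_needed_per_month):
    """Buy enough servers up front to cover the busiest month."""
    return SERVER_PRICE * max(servers_needed_per_month)


def cloud_cost(servers_needed_per_month):
    """Pay only for the servers that are actually running each month."""
    return CLOUD_MONTHLY * sum(servers_needed_per_month)


# Hypothetical year: usually one server, one busy month that needs two,
# and two quiet months in which the server is switched off.
year = [1, 1, 1, 0, 1, 1, 2, 1, 1, 0, 1, 1]

owned = owned_cost(year)
cloud = cloud_cost(year)
break_even_months = SERVER_PRICE // CLOUD_MONTHLY

print("buying servers:", owned)
print("renting in the cloud:", cloud)
print("months before one always-on cloud server costs as much as buying:",
      break_even_months)
```

Under these assumptions the busy month forces you to buy a second physical server, while in the cloud it only adds one extra month of rent.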
Important platforms and players in Cloud Computing
In infrastructure as a service, the clear leader is Amazon. EC2 is
its service that gives you virtual machines in the cloud on which you
can install whatever operating system you want. Amazon has a bunch of
other infrastructure as a service offerings, including EBS (virtual hard
disks in the cloud), S3 (a simple service that allows you to store and
retrieve files), SimpleDB (a non-relational database in the cloud), and
Amazon Relational Database Service (did you guess that this is a relational
database in the cloud?). It has further offerings in the form of
messaging, queuing services, caching services, content delivery
services, monitoring services, load balancing services,
ecommerce/payments/billing services. But we'll leave the discussion
about those for another day.
The most well-known instance of Platform as a Service is Google's
AppEngine which allows programs to be written in Java or Python (or in
fact any language written for the JVM, like Scala, Clojure, JRuby), with
Google BigTable as the corresponding database offering. Salesforce,
which revolutionized cloud computing by showing that
software-as-a-service can actually make lots of money, is a force to be
reckoned with in the PaaS space because of their Force.com platform. This
requires programming in Apex (a Java-like programming language).
Salesforce has also bought Heroku (a Ruby+Rails PaaS provider) so expect
more Ruby/Rails here.
Lots of big players are entering the PaaS market. Microsoft is of
course in here with Azure that gives .NET in the cloud, and SQL Azure,
which essentially gives SQL Server in the cloud. VMware is jumping in
with Cloud Foundry which supports Java (Spring), Ruby (Rails and
Sinatra), and JavaScript (Node.js), and MySQL, MongoDB and Redis as
databases. Interestingly, the code that runs Cloud Foundry is open source.
Red Hat has OpenShift, another free PaaS offering that supports a large
number of frameworks and languages. Oracle, IBM, and a bunch of
startups (like dotCloud, Cumulogic/CloudBees) are in this market too.
Amazon is also entering this space with an offering called Elastic
Beanstalk - which allows Java deployment in the cloud.
Software as a Service has become so common that it is impossible to
make a listing. Pretty much any software that you can think of has a
SaaS alternative these days.
How should students get ready for Cloud Computing?
As a student who is soon going to be dumped in the big-bad world,
what are the best ways of picking up real-world cloud computing skills?
Here are some suggestions:
- Google AppEngine: This is totally free, and you can use Java (so no need to learn a new language). Just go to the Getting Started with Java on Google AppEngine
page, and follow the instructions there. In no time, you will have set
up your first AppEngine app. After this, build another app - in some
area that you find interesting. Maybe something to do with cricket
scores. Or Bollywood movie ratings. Or motorbikes. And build an
interesting website, fully in AppEngine. If you're lucky, your site will
go viral and Google will take care of the scaling, and you won't have
to look for a job!
- If you are OK with learning a new language, I would highly recommend building AppEngine apps in Python.
- Amazon EC2: Amazon gives free access to a basic EC2 machine in the cloud for one year.
That gives you a full machine of your own, where you are the root user.
Get one of these and then use web-based tutorials to do interesting
things with this machine - like experimenting with different operating
systems (various flavors of linux), installing a firewall, installing a
web-server, and other more advanced stuff.
- If you're into Microsoft Technologies, try out Windows Azure free
for 3 months (25 hours of compute time). The tools that you'll
need to develop the app (Visual Studio, and local Azure simulator for
testing your app) are available for free to students who register for
the DreamSpark program. Or, I would
suggest forming a student group in your college and getting in touch
with Microsoft, and convincing them to offer free Azure for students of
your college. I would suggest trying to convince Aditee Rele.
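Whichever of these options you pick, a good first exercise is a tiny web app of your own. The sketch below uses only Python's standard library (`wsgiref`), so it is not AppEngine-specific or Azure-specific code; the `SCORES` data is made up, and `app` and `serve` are illustrative names. It is meant as something you could run on your own laptop or on a free EC2 machine before moving on to a real platform.

```python
from wsgiref.simple_server import make_server

# Made-up data, for illustration only.
SCORES = {"India": 287, "Australia": 254}


def app(environ, start_response):
    """A minimal WSGI app: return the scores as plain text."""
    lines = [f"{team}: {runs}" for team, runs in sorted(SCORES.items())]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(port=8000):
    """Serve the app until interrupted. Call this on your own machine or VM."""
    with make_server("", port, app) as httpd:
        httpd.serve_forever()


# Exercise the app directly, without opening a network port.
captured = {}


def fake_start_response(status, headers):
    """Record what the app would have sent to a real web server."""
    captured["status"] = status
    captured["headers"] = dict(headers)


result = b"".join(app({}, fake_start_response)).decode("utf-8")
print(captured["status"])
print(result)
```

Calling `serve()` starts a real server on port 8000 that you can open in a browser; from there, the interesting part is replacing the hard-coded scores with something you actually care about.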
Further Reading
This article just scratches the surface. There are lots of interesting areas to do further research. These could include:
- Understanding the strengths and weaknesses of IaaS vs PaaS vs SaaS
- Things to worry about when migrating an in-house app to the cloud
- How to set up auto-scaling of your cloud app (because auto-scaling is really not so auto)
- Failure-proofing your app. How to ensure that you survive even if your cloud computing provider has a failure
- Data Portability and Vendor Lock-In
- How to estimate the costs of your cloud computing infrastructure (since pricing is a nightmare in the cloud)
- How to choose a cloud computing provider
- Worrying about security and local laws
- Private clouds
- Cross-platform cloud computing frameworks. These are libraries that
allow you to write apps in a way that the same app can run on different
cloud computing services.
- Mobile App Development + Cloud Computing = A match made in heaven
- Disadvantages of cloud computing
Here are some starting points for your further reading:
- More on The Basics
- Latest Trends in Cloud Computing (as of June 2011)
- Cloud Security: Threats and Mitigations - A broad guide on how to think about security in the cloud. This is still a developing field
- iCloud - Hype or Tipping Point? - An overview of Apple's iCloud
offering, analysis of competitive landscape and implications. Read,
because you should always watch Apple carefully.
- Further, further reading:
  - IndicThreads Conference on Cloud Computing, 2010, 2011.
  Conference with good selection of cloud topics - with detailed dives
  into specific topics. All presentations are online and accessible via
  the website.