Accelerating Time To Science: Transforming Research in The Cloud
aws.amazon.com/grants
Amazon Public Data Sets
AWS hosts “gold standard” reference data at our expense in order to catalyze rapid innovation and increased AWS adoption
A few examples:
1,000 Genomes ~250 TB
Common Crawl
OpenStreetMap
Actively Developing…
Cancer Genomics Data Sets ~2-6 PB
SKA Precursor Data 1PB+
Nepal Earthquake
Individuals around the world are analyzing before/after imagery of Kathmandu in order to more effectively direct emergency response and recovery efforts
Peering with all global research networks
Terms:
• AWS waives egress fees up to 15% of total AWS bill; customers are responsible for anything above this amount
• Majority of traffic must transit via NREN with no transit costs
• 15% waiver applies to aggregate usage when consolidated billing is used
• Does not apply to workloads for which egress is the service we are providing (e.g. live video streaming, MOOCs, web hosting)
• Available regardless of AWS procurement method (i.e. direct purchase or Internet2 Net+)
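The 15% waiver rule can be worked through as simple arithmetic. The sketch below is only an illustration of the terms as listed: the function name and the sample amounts are hypothetical, and it assumes that "total AWS bill" means the aggregate bill (egress included) before the waiver is applied.

```python
from decimal import Decimal

# From the terms above: egress fees are waived up to 15% of the total AWS bill.
WAIVER_RATE = Decimal("0.15")


def egress_after_waiver(total_bill: Decimal, egress_fees: Decimal) -> Decimal:
    """Return the egress amount the customer still pays after the waiver.

    Assumption: total_bill is the aggregate bill (egress included)
    before the waiver is applied.
    """
    waiver_cap = WAIVER_RATE * total_bill
    waived = min(egress_fees, waiver_cap)
    return egress_fees - waived


# Hypothetical bills: the waiver cap on a 10,000 bill is 1,500.
below_cap = egress_after_waiver(Decimal("10000"), Decimal("1200"))
above_cap = egress_after_waiver(Decimal("10000"), Decimal("2000"))
print("egress owed when below the cap:", below_cap)
print("egress owed when above the cap:", above_cap)
```

With consolidated billing, the same calculation would be applied to the aggregate bill rather than to each linked account.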
aws.amazon.com/genomics
Data-Intensive Computing
The Square Kilometer Array will link 250,000 radio telescopes together, creating the world’s most sensitive telescope. The SKA will generate zettabytes of raw data, publishing exabytes annually over 30-40 years.
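To put "exabytes annually" in perspective, a rough back-of-the-envelope calculation helps. The sketch below converts a yearly data volume into the average sustained throughput it implies. The yearly volumes in the loop are hypothetical inputs (the slide says only "exabytes"), and decimal (SI) units are assumed.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s, ignoring leap years
EXABYTE = 10**18  # bytes in one decimal (SI) exabyte


def sustained_rate_gb_per_s(exabytes_per_year: float) -> float:
    """Average throughput in GB/s needed to move the given yearly volume."""
    bytes_per_second = exabytes_per_year * EXABYTE / SECONDS_PER_YEAR
    return bytes_per_second / 10**9


# Hypothetical yearly volumes, in exabytes.
for volume in (1, 5, 10):
    print(f"{volume} EB/year ~ {sustained_rate_gb_per_s(volume):.1f} GB/s sustained")
```

Even a single exabyte per year works out to roughly 32 GB/s around the clock, which is why where the data lives relative to the compute matters so much for this kind of project.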
Regions
• Geographic area where AWS services are available
• Customers choose region(s) for their AWS resources
• Eleven regions worldwide
[Diagram: Availability Zones (AZ) and Transit]
Availability Zones (AZs)
• Low-latency links
• 1 of 28 AZs worldwide
• All regions have 2 or more AZs
• Each AZ is 1 or more data centers (DCs)
– No data center is in two AZs
– Some AZs have as many as 6 DCs
• DCs in an AZ are less than ¼ ms apart
[Diagram: Availability Zone 1 and its data centers]
Example AWS Data Center
VPC
• Logically isolated section of the AWS cloud, virtual network defined by the customer
• When launching instances and other resources, customers place them in a VPC
• All new customers
[Diagram: a region with Availability Zones 1, 2, and 3 containing EC2 instances]
UNDIFFERENTIATED HEAVY LIFTING
• And all of this while you don’t know where the capacity is
• Serve your customers
Making Spot Fleet Requests
• Simply specify:
– Target Capacity – The number of EC2 instances that you want in your fleet.
– Maximum Bid Price – The maximum bid price that you are willing to pay.
– Launch Specifications – Number and types of instances, AMI ID, VPC, subnets or AZs, etc.
– IAM Fleet Role – The name of an IAM role. It must allow EC2 to terminate instances on your behalf.
Spot Fleet
• Attempts to reach the desired target capacity using the options you specified
• Manages the capacity even as Spot prices change
• Launches instances using the launch specifications provided
Using Spot Fleet
• Create EC2 Spot Fleet IAM Role
• Requesting a fleet:
– aws ec2 request-spot-fleet --spot-fleet-request-config file://mySpotFleet.json
• Describe fleet:
– aws ec2 describe-spot-fleet-requests
– aws ec2 describe-spot-fleet-requests --spot-fleet-request-ids <sfr-………..>
• Describe instances within the fleet
– aws ec2 describe-spot-fleet-instances --spot-fleet-request-id <sfr-…………>
• Cancel Spot Fleet (with termination):
– aws ec2 cancel-spot-fleet-requests --spot-fleet-request-ids <sfr-…………..> --terminate-instances
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet.html
mySpotFleet.json
{
  "SpotPrice": "0.50",
  "TargetCapacity": 20,
  "IamFleetRole": "arn:aws:iam::123456789012:role/myspotfleetrole",
  "LaunchSpecifications": [
    {
      "ImageId": "ami-1a2b3c4d",
      "InstanceType": "cc2.8xlarge",
      "SubnetId": "subnet-a61dafcf"
    },
    {
      "ImageId": "ami-1a2b3c4d",
      "InstanceType": "r3.8xlarge",
      "SubnetId": "subnet-a61dafcf"
    }
  ]
}
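When fleets are requested repeatedly, it can be convenient to generate the request config from code rather than editing JSON by hand. The following is a minimal sketch under stated assumptions: the helper `build_spot_fleet_config` and its checks are hypothetical (not part of any AWS SDK), and the field names and placeholder IDs mirror the example config above, written with the usual `ami-` and `subnet-` prefixes.

```python
import json


def build_spot_fleet_config(spot_price, target_capacity, iam_fleet_role, launch_specs):
    """Assemble a Spot Fleet request config using the fields from the example above.

    Hypothetical helper: it performs only basic sanity checks and does not call AWS.
    """
    if target_capacity < 1:
        raise ValueError("TargetCapacity must be at least 1")
    if not launch_specs:
        raise ValueError("at least one launch specification is required")
    for spec in launch_specs:
        missing = {"ImageId", "InstanceType"} - spec.keys()
        if missing:
            raise ValueError(f"launch specification is missing {sorted(missing)}")
    return {
        "SpotPrice": str(spot_price),  # the example config gives the price as a string
        "TargetCapacity": target_capacity,
        "IamFleetRole": iam_fleet_role,
        "LaunchSpecifications": list(launch_specs),
    }


config = build_spot_fleet_config(
    spot_price="0.50",
    target_capacity=20,
    iam_fleet_role="arn:aws:iam::123456789012:role/myspotfleetrole",
    launch_specs=[
        {"ImageId": "ami-1a2b3c4d", "InstanceType": "cc2.8xlarge", "SubnetId": "subnet-a61dafcf"},
        {"ImageId": "ami-1a2b3c4d", "InstanceType": "r3.8xlarge", "SubnetId": "subnet-a61dafcf"},
    ],
)
print(json.dumps(config, indent=2))
```

The printed JSON could be saved to a file and passed to `aws ec2 request-spot-fleet` through `--spot-fleet-request-config`, as in the commands on the previous slide.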
Elastic File System
The AWS storage portfolio
• Content repositories
• Development environments
• Home directories
• Big data
Amazon Elastic Container Service
Key Components
Docker Daemon
Task Definitions
Containers
Clusters
Container Instances
Typical User Workflow
I have a Docker image, and I want to run the image on a cluster
• Push Image(s)
• Create Task Definition (Amazon ECS): declare resource requirements
• Run EC2 Instances: use custom AMI with Docker support and ECS Agent. Instances will register with default cluster.
• Describe Cluster (Amazon ECS): get information about cluster state and available resources
• Run Task (Amazon ECS)
• Describe Cluster (Amazon ECS): get information about cluster state and running containers
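The "declare resource requirements" step of the workflow amounts to writing a task definition. As a rough sketch, a single-container task definition can be assembled as a plain dictionary and serialized to JSON. The helper `build_task_definition`, the family name, and the image name below are hypothetical, and only a handful of commonly used task definition fields are shown.

```python
import json


def build_task_definition(family, image, cpu_units, memory_mib):
    """Build a minimal single-container ECS task definition as a plain dict.

    Hypothetical helper: it only assembles the document and does not call AWS.
    """
    if cpu_units <= 0 or memory_mib <= 0:
        raise ValueError("cpu_units and memory_mib must be positive")
    return {
        "family": family,
        "containerDefinitions": [
            {
                "name": family,
                "image": image,        # the Docker image pushed in the first step
                "cpu": cpu_units,      # CPU units; 1024 units correspond to one vCPU
                "memory": memory_mib,  # hard memory limit in MiB
                "essential": True,
            }
        ],
    }


# Hypothetical family and image names, for illustration only.
task_def = build_task_definition(
    family="genome-align",
    image="example/genome-align:latest",
    cpu_units=1024,
    memory_mib=2048,
)
print(json.dumps(task_def, indent=2))
```

A document like this could then be registered with Amazon ECS (for example with `aws ecs register-task-definition`) before running the task on the cluster.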
Thank you!
Jamie Kinney
jkinney@amazon.com
@jamiekinney
Additional resources…
• aws.amazon.com/big-data
• aws.amazon.com/compliance
• aws.amazon.com/datasets
• aws.amazon.com/grants
• aws.amazon.com/genomics
• aws.amazon.com/hpc
• aws.amazon.com/security