Best Amazon EC2 instance type for Ray Cloud Browser on metagenomes and bacterial genomes

Hi,

I am using spot instances on Amazon Elastic Compute Cloud (EC2) to deploy a few installations of Ray Cloud Browser. Initially, I opted for m1.small instances because the Ray Cloud Browser web service, which answers a bunch of HTTP GET API calls, was not optimized. Namely, the C++ back-end code was memory-mapping a huge file (~16-20 GiB). This bloated binary file was the index: the source of information about the huge graph describing a given biological sample. A recent patch packs information into every available bit of the binary file, which reduces the number of blocks by 75% and thereby improves performance.
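To give a rough idea of what packing information into every available bit can look like, here is a minimal Python sketch. It is not the actual index format of Ray Cloud Browser (the back-end is C++ and its layout is not described here). It only assumes, for illustration, that each DNA symbol is stored in 2 bits instead of a full byte. The names `pack` and `unpack` are hypothetical.

```python
# Hypothetical illustration only: 2-bit packing of DNA symbols.
# This is NOT the real Ray Cloud Browser index layout.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
SYMBOLS = "ACGT"


def pack(sequence: str) -> bytes:
    """Store 4 symbols per byte, 2 bits each, lowest bits first."""
    packed = bytearray((len(sequence) + 3) // 4)
    for i, symbol in enumerate(sequence):
        packed[i // 4] |= CODES[symbol] << (2 * (i % 4))
    return bytes(packed)


def unpack(packed: bytes, length: int) -> str:
    """Recover the first `length` symbols from the packed bytes."""
    return "".join(
        SYMBOLS[(packed[i // 4] >> (2 * (i % 4))) & 0b11]
        for i in range(length)
    )


sequence = "GATTACAGATTACA"
packed = pack(sequence)
print(len(sequence), "symbols stored in", len(packed), "bytes")
print(unpack(packed, len(sequence)) == sequence)
```

Under that assumption, one byte holds four symbols instead of one, so the same data needs roughly a quarter of the space. A smaller memory-mapped file means fewer blocks to read for each HTTP request.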

Recently, there were a lot of peaks in the m1.small spot pricing, and I figured out that my use case was all about bursts -- discrete HTTP API calls.


I then looked at the pricing history for the last 3 months, and these pesky peaks seem to be a 2013 thing.

The t1.micro spot instance pricing history shows the same sharp peaks.

With all these recent pricing surges, I decided to provision some standard Elastic Block Store (EBS) volumes so that my data stays in the cloud when the spot market price exceeds what I am willing to pay and my instances get terminated.

Before today, I was using ephemeral volumes (instance store), which are not EBS volumes: their content vanishes when the instance is stopped or terminated.
I don't use ephemeral volumes anymore as they are not useful for my user story.

Running one m1.small on-demand instance costs 47.48 $ per month, whereas a single 64-GiB EBS standard volume costs 6.40 $ per month (excluding input/output requests). Running one m1.small spot instance costs around 5.11 $ per month, whereas running one t1.micro spot instance costs around 2.19 $ per month.

Therefore, running Ray Cloud Browser on one t1.micro spot instance with one 64-GiB EBS volume costs 8.59 $ per month (2.19 $ + 6.40 $). Basically, it costs next to nothing considering the cost of other things in genomics research, such as sequencing runs on instruments, bioinformatician-hours, developer-hours, and so on.
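The arithmetic behind these monthly figures can be checked with a few lines of Python. This is only a sketch: the amounts are the ones quoted in this post, spot prices fluctuate, and the variable and function names are made up for the example.

```python
# Monthly costs in US dollars, as quoted in this post.
# Spot prices move over time, so treat these as a snapshot.
M1_SMALL_ON_DEMAND = 47.48
M1_SMALL_SPOT = 5.11
T1_MICRO_SPOT = 2.19
EBS_64_GIB = 6.40  # excludes input/output requests


def monthly_total(instance_cost: float, storage_cost: float) -> float:
    """Instance plus storage, rounded to the cent."""
    return round(instance_cost + storage_cost, 2)


t1_micro_total = monthly_total(T1_MICRO_SPOT, EBS_64_GIB)
m1_small_spot_total = monthly_total(M1_SMALL_SPOT, EBS_64_GIB)
m1_small_on_demand_total = monthly_total(M1_SMALL_ON_DEMAND, EBS_64_GIB)

print(f"t1.micro spot + EBS:      {t1_micro_total:.2f} $ per month")
print(f"m1.small spot + EBS:      {m1_small_spot_total:.2f} $ per month")
print(f"m1.small on-demand + EBS: {m1_small_on_demand_total:.2f} $ per month")
```

With the same 64-GiB volume attached, the m1.small spot setup comes to 11.51 $ per month and the on-demand one to 53.88 $ per month, so the instance type and pricing model drive most of the difference.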

And the software called Ray Cloud Browser costs nothing too -- I authored it and it's free software distributed under the GNU General Public License version 3.

I hope that people everywhere use Ray Cloud Browser to visualize their DNA samples!

The cloud is really a nice thing because it removes barriers. But you have to use it and gain experience in order to lower your costs.

