Geo-distributed web hosting – the Tronkle way

Background

The World Wide Web started with a simple web server, but pretty soon the simple web server became insufficient. So the good old web created some services to sustain its fast growth: dedicated servers, colocation and huge data centers, clusters and load balancers, content delivery networks (CDN), virtual dedicated servers (VDS), virtual private servers (VPS) and the fashionable cloud computing. In a simplified view, clusters replicate the same service over several computers so that they can sustain higher loads for that particular service. Content delivery networks bring part of the data, and sometimes some services, closer to the user, so that the user benefits from better connectivity and some of the load is spread across several servers, this time in different locations throughout a continent or even the entire world. VPS and VDS (arguably the same thing) offer much of what good old dedicated servers and colocation services provide at much lower prices. The trendy cloud computing brings fast scalability to the VPS: you can start or stop many VPSs at the same time.

If you look closely at all these services, something is missing. Why not a cluster spread around the world in which the balancing decision is made according to the user’s proximity to the server? Why not a content delivery network that distributes not just part of the data or some services but all data and all services across the world? Why not bring back the simplicity of good old web hosting to such a service? Why not be able to choose the locations of the nodes? Why not make it affordable? We couldn’t see why none of this existed yet, so we created Tronkle, which is all of these put together.

What is Tronkle

In short: a geographically distributed web hosting service. It keeps an entire copy of your website on several servers around the world. Each time a user visits your site, they are served by the server that is closest to them. The best part is that you don’t have to change anything in your site to use Tronkle: you just bring your site to Tronkle and use it like any other traditional web hosting service. In addition, you get to choose the locations from a list of available servers around the world, according to your site demographics. Last but not least, we promote honesty in our relationship with our customers: in our minds this translates to giving our customers an unlimited number of domains and an unlimited number of databases, and charging for traffic and storage. In other words, you pay for what we pay for and you don’t pay for things that we get virtually for free.

So far Tronkle hosting offers PHP / MySQL services only. We might consider adding other platforms (maybe RoR?) if the demand is very high.

Why use Tronkle

Using a geo-distributed service has several advantages:

  • speed – if the site is served from a location closer to the user it should benefit from better network connectivity (lower latencies and higher bandwidth); this is the very reason the highly successful content delivery networks exist
  • load balancing – the load is automatically distributed over several servers, which means that your site can be visited by more users at the same time; this is the very reason clusters exist; for some sites (with a small user base) this might not sound like a great feature, but if your site gets “digged” or “slashdotted” you will see how useful it is
  • automatic fail-over – if one server fails your site will still be served from the other servers; everyone offers lots of nines in their availability, but the truth is that every server and every piece of hardware does fail from time to time; if you do the math you will see that the nines can still translate to anything from a few hours to a few days of downtime per year; our servers fail too, but the chances of several servers failing at the same time are lower
  • ease of use – Tronkle is as easy to use as a traditional web hosting service; you don’t have to modify your site; unlike CDNs and other services, there is no API to use with Tronkle and everything is performed automatically and transparently; unlike dedicated servers, VPSs, VDSs and cloud instances, you don’t need to perform any setup / maintenance / admin work
  • flexibility – choosing your server locations is not only something innovative (you don’t have something like that in CDNs or other services) but it also brings your costs down and might also improve performance for some setups
  • low cost – compared to what it offers Tronkle is very cheap; you might actually find it cheaper than some “professional” web hosting services
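
To make the “do the math” remark in the fail-over point concrete, here is a minimal sketch that converts an availability figure into yearly downtime and estimates the availability of several servers taken together. The function names are made up for illustration, and the combined figure assumes that servers fail independently of each other, which real outages only approximate.

```python
HOURS_PER_YEAR = 365 * 24  # ignoring leap years


def downtime_hours_per_year(availability: float) -> float:
    """Expected yearly downtime, in hours, for an availability such as 0.99."""
    return (1.0 - availability) * HOURS_PER_YEAR


def combined_availability(availability: float, servers: int) -> float:
    """Availability of a site that stays up while at least one server is up.

    Assumption: failures are independent, so the site is down only when
    every server is down at the same moment.
    """
    return 1.0 - (1.0 - availability) ** servers


if __name__ == "__main__":
    for nines in (0.99, 0.999, 0.9999):
        hours = downtime_hours_per_year(nines)
        print(f"{nines:.2%} available: about {hours:.1f} hours down per year")
    print(f"3 servers at 99% each: {combined_availability(0.99, 3):.4%}")
```

Under these assumptions two nines (99%) on a single server is roughly three and a half days of downtime per year, while three such servers would all be down at the same time far less often.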

Downsides

It’s great to talk about how good a service is, but every service has its disadvantages and Tronkle is no exception. Tronkle has been designed to speed things up on “the read side”, and there’s a penalty on “the write side”. This penalty comes from the fact that each time you write to a file or to a database, the change has to be propagated across several servers in different parts of the world. This is obviously slower than writing on a single server or on several servers in the same location (data center), and it also increases traffic.

The first good news is that the write penalty is not that bad; most of the time it’s not noticeable for the user. Part of this comes from the approach taken in our software implementation: we’ve worked a lot on optimizing database writes and we have plans for taking this further. The even better news is that users can minimize the impact by wisely choosing the number of servers and their locations, and also by eliminating unnecessary writes from their site. A classic example of a rather unnecessary write is the “analytics” write: writing a row in a database or a line in a file for each page access in order to keep track of site visitors. Such analytics services are offered for free by a number of companies, and they usually do a much better job than anything that could be developed in-house and hosted on a single server.

However, the best news is that most websites perform many more reads than writes. Consider a few examples: in a forum a message is written once but viewed many times, in an online shop the user looks at several products but places only one order, in a corporate site the updates are relatively infrequent, and so on.
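
The read/write trade-off can be illustrated with a toy model. This is only a sketch: the latency figures below are invented for the example and are not measurements of Tronkle, and the model assumes that every request is either a read or a write with a fixed cost.

```python
def average_latency_ms(read_fraction: float, read_ms: float, write_ms: float) -> float:
    """Average request latency for a given share of reads (0.0 to 1.0)."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    return read_fraction * read_ms + (1.0 - read_fraction) * write_ms


if __name__ == "__main__":
    # Hypothetical numbers: a distant single server costs 120 ms either way,
    # a nearby replica reads in 30 ms but a replicated write takes 300 ms.
    for share in (0.5, 0.8, 0.95):
        single = average_latency_ms(share, read_ms=120.0, write_ms=120.0)
        distributed = average_latency_ms(share, read_ms=30.0, write_ms=300.0)
        print(f"{share:.0%} reads: single {single:.1f} ms, distributed {distributed:.1f} ms")
```

With these made-up figures the distributed setup is slower on average at 50% reads and clearly faster at 95% reads, which is the pattern the forum and online shop examples describe.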

What’s so revolutionary about Tronkle technology

First of all, we consider Tronkle to be revolutionary because it’s the first service of its kind that we know of. The technology is also revolutionary because we developed our own software, which does things in a different way in order to provide the kind of flexibility, ease of use and automation that we wanted for our services. You will better understand the revolution behind Tronkle by reading the technical section below.

How Tronkle does it – some technical stuff

When creating a geo-distributed web hosting service like Tronkle the main issues to solve are: load-balancing, distributing files and distributing databases. For all these we use a combination of our very own in-house developed software together with free open source tools. All communication between servers is encrypted.

The load-balancing is done by DNS. When resolving a domain hosted on Tronkle, the user only gets the closest server that hosts that particular domain and is up and running at the moment. If the closest server is down, the DNS returns the next closest server that is up.
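
The selection rule described above can be sketched in a few lines. This is not Tronkle’s actual DNS code: the `Server` type and the `pick_server` function are hypothetical, and “closest” is approximated here by great-circle distance, whereas a real deployment might use network proximity instead. The caller is assumed to pass only the servers that host the requested domain.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt
from typing import Iterable, Optional

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Server:
    name: str
    lat: float
    lon: float
    up: bool = True


def distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points (haversine formula)."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def pick_server(servers: Iterable[Server], user_lat: float, user_lon: float) -> Optional[Server]:
    """Return the nearest server that is up, or None if none is available."""
    live = [s for s in servers if s.up]
    if not live:
        return None
    return min(live, key=lambda s: distance_km(user_lat, user_lon, s.lat, s.lon))


if __name__ == "__main__":
    # Approximate coordinates, for illustration only.
    servers = [
        Server("new-york", 40.71, -74.01),
        Server("los-angeles", 34.05, -118.24),
        Server("dallas", 32.78, -96.80),
        Server("london", 51.51, -0.13, up=False),  # simulate an outage
    ]
    choice = pick_server(servers, 48.86, 2.35)  # a visitor near Paris
    print(choice.name if choice else "no server available")
```

Filtering out servers that are down before taking the minimum gives the fall-back behaviour for free: when the nearest server is unavailable, the next closest one is returned.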

When it comes to replicating files there are several possible approaches. We chose a rather simple setup which detects when a file or directory has been modified and replicates it automatically to all servers where that file or directory is supposed to exist. There are some downsides to this approach, such as a rather big overhead and replication delays (up to a few seconds), but since these files are supposed to be html, css, php, js and the like, we expect them not to change very often, and in this scenario the approach is acceptable.
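
One simple way to detect modified files is to compare two snapshots of a directory tree. The sketch below illustrates that idea only; it is an assumption about how such detection could work, not a description of Tronkle’s software, and the function names are invented. It reports which paths would need to be pushed to or removed from the other servers, and leaves the transfer itself out.

```python
import os
from typing import Dict, Set, Tuple

# Relative path -> (modification time, size in bytes)
Snapshot = Dict[str, Tuple[float, int]]


def take_snapshot(root: str) -> Snapshot:
    """Record modification time and size for every file under root."""
    snapshot: Snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            info = os.stat(full_path)
            snapshot[os.path.relpath(full_path, root)] = (info.st_mtime, info.st_size)
    return snapshot


def diff_snapshots(old: Snapshot, new: Snapshot) -> Tuple[Set[str], Set[str]]:
    """Return (changed_or_new, deleted) relative paths between two snapshots."""
    changed = {path for path, meta in new.items() if old.get(path) != meta}
    deleted = set(old) - set(new)
    return changed, deleted
```

Taking a snapshot every few seconds and diffing it against the previous one would produce the kind of overhead and delay mentioned above; operating system change notifications are a common alternative when that polling cost matters.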

The most difficult part of the entire project is database replication. We considered some solutions before creating our state-of-the-art software, which works on top of MySQL to do the replication while ensuring data consistency. The first obvious solution is to use MySQL’s built-in replication (or even clustering) capabilities. However, with this solution we couldn’t let the users choose the locations for their servers. Trying to build such flexibility on top of MySQL’s replication capabilities would be quite difficult and would also bring a lot of overhead: a complicated setup that would be difficult to install and maintain and, in the end, a more expensive solution because it wastes resources.

After not being able to find anything that would distribute a MySQL database and suit our needs, we started building our very own piece of software that would do just that. We tried several approaches, ranging from some very simple and naive ones to some setups that were so complicated that they were almost impossible to stabilize and also had poor performance. The current approach is a rather elegant one, with pretty good performance, quite stable and with little compromise. That’s why we consider it to be a revolutionary, brand new technology, and we are proud of it.

Current status and roadmap for the future

Tronkle was launched in April 2010 in public beta, and we plan to keep it in a public beta state for quite some time, because a technology like this always needs time to become fully stable. However, beta does not mean that it’s unstable. Quite the opposite: it works even better than expected.

In the near future we will work on fixing bugs and on making some performance improvements to our core software. We also plan to add some new features to the administration part and to make all necessary adjustments according to the usage patterns of our clients, so that everyone benefits from maximum performance. We have many plans for developing new services based on the Tronkle platform as well as optimizing the current ones.

In the current deployment you can choose from 4 different locations: 3 in the United States (New York, Los Angeles and Dallas) and 1 in Europe (London), but this is just a humble beginning. We plan to add more locations in the near future according to demand: we are currently considering California, Florida, the Chicago area, Germany, the Netherlands and possibly some Asian locations. Let us know what location would be useful for you and we’ll take it into consideration.
