Bittorrent: The Slow Download Method?

So, last week I needed to download the Fedora Core 4 DVD. It's about 2.4GB. I started by just clicking on the main download link. The download rate on this was really slow.  Having recently downloaded the same thing at work for a work project, I knew that many of the mirror sites were no faster, though. So, I decided to try out the torrent.

I didn't have a torrent client installed, so I did the natural thing. I downloaded Azureus, arguably the most popular client, since it's always near the top of the popularity rankings for SourceForge projects. I got it up and running and grabbed the torrent. Within a few minutes it was transferring at 50KBps (B is for byte and b is for bit). This wasn't too bad, but for 2.4GB it was going to take a long, long time. I gave it another 10 minutes and it was still only hovering around 60-70KBps. On a typical site, I normally get well over 300KBps, with peaks up into the 600KBps range.

So, I decided to go and try out one of the mirrors for the hell of it. What would it hurt? Instead of the typical HTTP mirror, I decided to try out an FTP mirror, figuring it might have less protocol overhead (is this true?). Well, the top link in the western US section fit the bill; it was an Oregon university link. So I clicked. It immediately started downloading at over 500KBps and went up to 550KBps. All of this while the torrent was still running. Already I was getting 10x the speed. I killed the torrent and after just about an hour I had downloaded the entire 2.4GB at a sustained transfer rate of 622KBps. Why would I ever use a torrent if it's less than a tenth the speed of a decent site?
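To put the two rates side by side, here is a rough sketch of the arithmetic. The function name `download_minutes` is made up for illustration, 65KBps is just the midpoint of the 60-70KBps the torrent was hovering at, and decimal units (1 GB = 1,000,000 KB) are an assumption.

```python
def download_minutes(size_gb: float, rate_kBps: float) -> float:
    """Minutes needed to move size_gb gigabytes at rate_kBps kilobytes/second.

    Assumes decimal units: 1 GB = 1,000,000 KB.
    """
    size_kb = size_gb * 1_000_000
    return size_kb / rate_kBps / 60


# 65 KBps is an assumed midpoint of the observed torrent rate;
# 622 KBps is the sustained FTP rate quoted in the post.
for label, rate in [("torrent", 65), ("FTP mirror", 622)]:
    print(f"{label}: about {download_minutes(2.4, rate):.0f} minutes")
```

Under those assumptions the torrent works out to roughly ten hours, against a little over an hour for the FTP mirror.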

I've been doing a lot of thinking this last week on why the torrent was so slow. The idea behind torrents, of course, is that the bandwidth is distributed and the entire universe of people downloading the particular torrent adds to the overall download bandwidth of the torrent. So why would it be so much slower?

Well, there are a number of factors. One factor for many people is that they sit behind a firewall on which they can't open up the ports. This is not the case for me; we run a WRT54G with OpenWRT on it. The ports were open and configured properly.

Another factor is that the universe of people isn't always large enough to give everyone full theoretical bandwidth. In particular, we have a fairly fast connection (by US standards) and so we expect faster transfer rates than normal. For a normal user, getting 80KBps is roughly half of their expected 1.5Mbps connection.

The other big problem as I see it, though, is that the vast majority of people are on connections where the uplink is a tiny fraction of their downlink. A typical US DSL line will have 1.5Mbps down and 128Kbps up. Our connection is around 6Mbps down and 608Kbps up, a similar ratio. That is, the downlink is typically about 10x the speed of the uplink. So, from here we can do some simple math.
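Since the post mixes bits (b) and bytes (B), a small sketch of the conversion and of the down/up ratios may help. The helper names are hypothetical, and the sketch assumes 1 Mbps = 1000 Kbps and 8 bits per byte with no protocol overhead.

```python
def kbit_to_kbyte(kilobits_per_s: float) -> float:
    """Convert kilobits/second to kilobytes/second (8 bits per byte)."""
    return kilobits_per_s / 8


def down_up_ratio(down_kbit: float, up_kbit: float) -> float:
    """How many times faster the downlink is than the uplink."""
    return down_kbit / up_kbit


# (downlink, uplink) in kilobits/second, taken from the figures in the post.
connections = {"typical US DSL": (1500, 128), "our connection": (6000, 608)}

for name, (down, up) in connections.items():
    print(
        f"{name}: {kbit_to_kbyte(down)} KBps down, {kbit_to_kbyte(up)} KBps up, "
        f"ratio {down_up_ratio(down, up):.1f}x"
    )
```

The ratios come out a little under 12x and a little under 10x, which is why 10x is a reasonable round number for the math that follows.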

Let's say everyone in this hypothetical torrent universe has the same connection speeds. For simplicity, we'll use 1MBps down and 100KBps up. So, now let's say we have 10 clients in the universe downloading the same torrent. We now have a bandwidth demand of 10MBps but only 1MBps of bandwidth being provided. Why would the eleventh client join in when they'll just get one tenth of the current available bandwidth (they don't download from themselves)? Interestingly, one tenth of the bandwidth happens to be equal to the speed of the uplink everyone has: 100KBps. In fact, the bandwidth available to each client when everyone is downloading is exactly the average uplink speed: the sum of all of the uplink speeds divided by the number of clients. Since, on average, everyone's uplink is a tenth of their downlink, this means that, on average, each person will get a download speed of about one tenth of their maximum download speed. Funny enough, that's exactly what I was getting.
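The simple math above can be sketched in a few lines. This is only the idealized model from the paragraph, not how any real client allocates bandwidth: it assumes upload capacity is shared perfectly evenly, and the function name `average_download_rate` is made up for illustration.

```python
def average_download_rate(uplinks_kBps: list[float]) -> float:
    """Average download rate when every client is still downloading.

    Idealized assumption: the swarm's total upload capacity is split
    evenly among all downloaders, with no seeds and no overhead.
    """
    total_supply = sum(uplinks_kBps)
    return total_supply / len(uplinks_kBps)


# Ten identical clients, each with a 100 KBps uplink and a 1000 KBps downlink.
uplinks = [100] * 10
downlink = 1000

rate = average_download_rate(uplinks)
print(f"each client downloads at {rate} KBps, "
      f"{rate / downlink:.0%} of its downlink")
```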

Now you might be asking, "What about seeds?" Well, this is the single piece that separates reality from theory. The theory is that when a client is done downloading it will continue to upload but it will no longer have any download demand for that particular file. So, after our first ten theoretical clients finish downloading, say another 10 come on. Now we still only have a download demand of 10MBps. However, our available upload bandwidth has doubled to 2MBps. This means that the new 10 clients will each get to download at 200KBps. So, if this follows, we need 100 clients in the system, with only 10 downloading and everyone uploading. This gives us a total bandwidth demand of 10MBps and a total bandwidth supply of 10MBps. (You'll actually need more since there is some protocol overhead in finding peers, but we'll leave that out for now as it's relatively irrelevant.)
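Extending the same idealized model with seeds gives the numbers in this paragraph. Again this is a sketch under the stated assumptions (identical clients, perfectly even sharing, no overhead), and `rate_with_seeds` is a hypothetical name.

```python
def rate_with_seeds(downloaders: int, seeds: int,
                    up_kBps: float = 100, down_kBps: float = 1000) -> float:
    """Per-downloader rate when seeds and downloaders all upload.

    Supply comes from everyone in the swarm; demand comes only from the
    downloaders. No client can exceed its own downlink.
    """
    supply = (downloaders + seeds) * up_kBps
    return min(down_kBps, supply / downloaders)


# 10 downloaders, with a growing number of finished clients left seeding.
for seeds in (0, 10, 90):
    print(f"{seeds:>2} seeds: {rate_with_seeds(10, seeds)} KBps per downloader")
```

With no seeds each downloader gets 100KBps, with 10 seeds 200KBps, and only at 90 seeds (100 clients in total) does everyone reach the full 1MBps downlink.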

Unfortunately, reality doesn't quite work that way. And, in fact, I don't think it can realistically work that way. First off, a typical person downloads a file and gets off the network. Yes, this means their bandwidth demand goes away. However, they take their bandwidth supply with them. There are a few reasons why someone might want to do this. One reason is that they may be turning their machine off. A more typical reason is that they need the bandwidth for something else. It's common that when your uplink bandwidth is saturated your overall net experience will suffer. Response times get worse, other people on the network complain, and you get booted. (QoS can help solve some of these problems, but it isn't available in most routers.) Another, probably more common, reason might be that the client is going to go off and start downloading another torrent. If they left their old torrent in seed mode and started a new one, the upload bandwidth would likely be split between the two. This means that they would only be giving a twentieth of their download demand back as bandwidth supply. This makes it even harder for this next torrent universe to build up the supply it needs, even if this particular client goes into seed mode.

Take, for example, a torrent for a daily podcast. If a particular client decides to leave each torrent they download going in seed mode indefinitely, after only two weeks each torrent will only get 10KBps of supply, or a hundredth of their download demand. If everyone did this, then the universe would need 1000 clients to provide enough bandwidth supply for only 10 clients. This is not realistic. So it's actually to the next torrent's benefit that a client stop seeding the previous torrent. However, the previous torrent suffers for this.
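Here is a small sketch of how splitting one uplink across several torrents plays out. It assumes the uplink is divided evenly between active torrents and treats "two weeks" as ten torrents, which is what the 10KBps figure implies; both function names are hypothetical.

```python
def supply_per_torrent(up_kBps: float, active_torrents: int) -> float:
    """Upload bandwidth each torrent gets if the uplink is split evenly."""
    return up_kBps / active_torrents


def clients_needed(downloaders: int, down_kBps: float,
                   supply_each_kBps: float) -> float:
    """Uploading clients needed to saturate every downloader's downlink."""
    return downloaders * down_kBps / supply_each_kBps


# 100 KBps uplink, 1000 KBps downlink, 10 downloaders per torrent.
for torrents in (1, 2, 10):
    each = supply_per_torrent(100, torrents)
    needed = clients_needed(10, 1000, each)
    print(f"{torrents:>2} torrents: {each} KBps each, "
          f"{needed:.0f} clients needed per torrent")
```

Two torrents give the twentieth-of-demand case (50KBps against 1MBps), and ten torrents give the 1000-client case.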

Now, when you compare to a single server feeding 10 clients, you only need a server with a 10MBps connection to feed them. This is fairly trivial for a single server to provide. Now why doesn't that server just join the torrent as a seed? Well, sometimes they do. In our example, each server that can seed at 10MBps is equivalent to 100 clients' worth of bandwidth supply. This is also a great way to load balance between multiple servers rather than having to do (relatively) complex routing, bandwidth sharing, and load balancing. But do 10 servers seeding at 10MBps actually provide the same amount of bandwidth as a single 100MBps server?

Usually not, since there is protocol overhead, and for clients there is often a loss of bandwidth when a lot of ports are in use. Not only that, since all bandwidths get averaged out, those that could have downloaded off a server and gotten off the network more quickly won't be able to. This goes back to my real example. I could have been demanding 622KBps of bandwidth for hours on the torrent, but instead I used that much bandwidth for about an hour on an FTP server and went away, freeing up the bandwidth much sooner.

I don't think torrents are the solution to overall network bandwidth. I think they are a great solution for small sites to distribute the bandwidth demands to others. This is because torrents scale. In our example, each additional client adds demand to the tune of 10x their uplink; however, I've shown that they will typically only get bandwidth equivalent, on average, to their uplink bandwidth. This means that 10 people can download at the same rate as 10000 or even 1000000 people. They don't get to use their full download bandwidth, of course. In the end, the bandwidth actually becomes symmetric in a particular torrent universe.
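To see the scaling argument in numbers, the sketch below compares a single 10MBps server with a seedless swarm as the number of downloaders grows. Both models are idealized (even sharing, no overhead, identical clients), and the function names are invented for the example.

```python
def server_rate(clients: int, server_kBps: float = 10_000,
                down_kBps: float = 1000) -> float:
    """Per-client rate from one server whose capacity is split evenly."""
    return min(down_kBps, server_kBps / clients)


def swarm_rate(clients: int, up_kBps: float = 100,
               down_kBps: float = 1000) -> float:
    """Per-client rate in a swarm with no seeds: total uplink / clients."""
    return min(down_kBps, clients * up_kBps / clients)


for n in (10, 100, 10_000, 1_000_000):
    print(f"{n:>9} clients: server {server_rate(n):g} KBps, "
          f"swarm {swarm_rate(n):g} KBps")
```

In this model the server wins comfortably at 10 clients, the two break even at 100, and past that the swarm's flat 100KBps beats an ever-shrinking share of the server.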

So, the reason torrents are so popular isn't that a particular client can download faster. It's that they scale with the downloading universe. And anyone who can stay behind as a seed, for however long, helps the situation, by however little. So, if you need to download quickly, finding a fast FTP server will always win out unless it takes you hours to find that server. For Fedora Core DVDs, I'll stick with finding an FTP server since that will almost always be faster for me -- and I'd end up taking more than my fair share of bandwidth out of the universe of torrent clients anyway, since I won't stick around to seed a file I need to burn to DVD and delete off my system.

(There is, of course, another reason people like torrents.  Sometimes they are the only way to download something.  But that's a completely different topic.)

Posted by Shane on February 27, 2006 9:50 AM
