What really happens when you navigate to a URL

Posted by Tushar Bedekar

1. You enter a URL into the browser 




 2. The browser looks up the IP address for the domain name 


Now the question comes in your mind What is DNS??????????
Answer is->Full Form of DNS is Domain Name System.The DNS translates Internet domain and host names to IP addresses. DNS automatically converts the names we type in our Web browser address bar to the IP addresses of Web servers hosting those sites.

The DNS lookup proceeds as follows:
  • Browser cache – The browser caches DNS records for some time. Interestingly, the OS does not tell the browser the time-to-live for each DNS record, and so the browser caches them for a fixed duration (varies between browsers, 2 – 30 minutes).
  • OS cache – If the browser cache does not contain the desired record, the browser makes a system call (gethostbyname in Windows). The OS has its own cache.
  • Router cache – The request continues on to your router, which typically has its own DNS cache.
  • ISP DNS cache – The next place checked is the cache ISP’s DNS server. With a cache, naturally.
  • Recursive search – Your ISP’s DNS server begins a recursive search, from the root nameserver, through the .com top-level nameserver, to Facebook’s nameserver. Normally, the DNS server will have names of the .com nameservers in cache, and so a hit to the root nameserver will not be necessary.

3. The browser sends a HTTP request to the web server



You can be pretty sure that Facebook’s homepage will not be served from the browser cache because dynamic pages expire either very quickly or immediately (expiry date set to past). So, the browser will send this request to the Facebook server: GET http://facebook.com/ HTTP/1.1 Accept: application/x-ms-application, image/jpeg, application/xaml+xml, [...] User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; [...] Accept-Encoding: gzip, deflate Connection: Keep-Alive Host: facebook.com Cookie: datr=1265876274-[...]; locale=en_US; lsd=WW[...]; c_user=2101[...] The GET request names the URL to fetch: “http://facebook.com/”. The browser identifies itself (User-Agentheader), and states what types of responses it will accept (Accept and Accept-Encoding headers). TheConnection header asks the server to keep the TCP connection open for further requests. The request also contains the cookies that the browser has for this domain. And so the cookies store the name of the logged-in user, a secret number that was assigned to the user by the server, some of user’s settings, etc. The cookies will be stored in a text file on the client, and sent to the server with every request.

 4. The facebook server responds with a permanent redirect 



This is the response that the Facebook server sent back to the browser request: imageHTTP/1.1 301 Moved Permanently Cache-Control: private, no-store, no-cache, must-revalidate, post-check=0, pre-check=0 Expires: Sat, 01 Jan 2000 00:00:00 GMT Location: http://www.facebook.com/ P3P: CP="DSP LAW" Pragma: no-cache Set-Cookie: made_write_conn=deleted; expires=Thu, 12-Feb-2009 05:09:50 GMT; path=/; domain=.facebook.com; httponly Content-Type: text/html; charset=utf-8 X-Cnection: close Date: Fri, 12 Feb 2010 05:09:51 GMT Content-Length: 0 The server responded with a 301 Moved Permanently response to tell the browser to go to “http://www.facebook.com/” instead of “http://facebook.com/”. 

 5. The browser follows the redirect 


The browser now knows that “http://www.facebook.com/” is the correct URL to go to, and so it sends out another GET request: GET http://www.facebook.com/ HTTP/1.1 Accept: application/x-ms-application, image/jpeg, application/xaml+xml, [...] Accept-Language: en-US User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; [...] Accept-Encoding: gzip, deflate Connection: Keep-Alive Cookie: lsd=XW[...]; c_user=21[...]; x-referer=[...] Host: www.facebook.com The meaning of the headers is the same as for the first request.


 6. The server ‘handles’ the request 


 The server will receive the GET request, process it, and send back a response. imageThis may seem like a straightforward task, but in fact there is a lot of interesting stuff that happens here – even on a simple site like my blog, let alone on a massively scalable site like facebook. Web server softwareThe web server software (e.g., IIS or Apache) receives the HTTP request and decides which request handler should be executed to handle this request. A request handler is a program (in ASP.NET, PHP, Ruby, …) that reads the request and generates the HTML for the response. Request handlerThe request handler reads the request, its parameters, and cookies. It will read and possibly update some data stored on the server. Then, the request handler will generate a HTML response. 

 7. The server sends back a HTML response 


 Here is the response that the server generated and sent back: HTTP/1.1 200 OK Cache-Control: private, no-store, no-cache, must-revalidate, post-check=0, pre-check=0 Expires: Sat, 01 Jan 2000 00:00:00 GMT P3P: CP="DSP LAW" Pragma: no-cache Content-Encoding: gzip Content-Type: text/html; charset=utf-8 X-Cnection: close Transfer-Encoding: chunked Date: Fri, 12 Feb 2010 09:05:55 GMT The entire response is 36 kB, the bulk of them in the byte blob at the end that I trimmed. The Content-Encoding header tells the browser that the response body is compressed using the gzip algorithm. After decompressing the blob, you’ll see the HTML you’d expect: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" id="facebook" class=" no_js"> <head> <meta http-equiv="Content-type" content="text/html; charset=utf-8" /> <meta http-equiv="Content-language" content="en" /> ... In addition to compression, headers specify whether and how to cache the page, any cookies to set (none in this response), privacy information, etc. 

 8. The browser begins rendering the HTML 


 Even before the browser has received the entire HTML document, it begins rendering the website: 

 9. The browser sends requests for objects embedded in HTML 



 As the browser renders the HTML, it will notice tags that require fetching of other URLs. The browser will send a GET request to retrieve each of these files. Here are a few URLs that my visit to facebook.com retrieved: Imageshttp://static.ak.fbcdn.net/rsrc.php/z12E0/hash/8q2anwu7.gif http://static.ak.fbcdn.net/rsrc.php/zBS5C/hash/7hwy7at6.gif … CSS style sheetshttp://static.ak.fbcdn.net/rsrc.php/z448Z/hash/2plh8s4n.css http://static.ak.fbcdn.net/rsrc.php/zANE1/hash/cvtutcee.css … JavaScript files http://static.ak.fbcdn.net/rsrc.php/zEMOA/hash/c8yzb6ub.js http://static.ak.fbcdn.net/rsrc.php/z6R9L/hash/cq2lgbs8.js 

 10. The browser sends further asynchronous (AJAX) requests 


 In the spirit of Web 2.0, the client continues to communicate with the server even after the page is rendered. For example, Facebook chat will continue to update the list of your logged in friends as they come and go. To update the list of your logged-in friends, the JavaScript executing in your browser has to send an asynchronous request to the server. The asynchronous request is a programmatically constructed GET or POST request that goes to a special URL. In the Facebook example, the client sends a POST request to http://www.facebook.com/ajax/chat/buddy_list.php to fetch the list of your friends who are online. This pattern is sometimes referred to as “AJAX”, which stands for “Asynchronous JavaScript And XML”, even though there is no particular reason why the server has to format the response as XML. For example, Facebook returns snippets of JavaScript code in response to asynchronous requests. 

3 comments:

back to top