Thursday, 6 September 2007

JavaScript HTTP Status detection

I was using Selenium to do some SEO testing, and for political reasons the tests had to run in the Selenium IDE.
Some of the tests were for HTTP Status: certain pages should return a 301 redirect. So I was asked to find a way to extend the Selenium user-extensions.js file so that it could find the HTTP Status for a page.
I found that it's not possible to do this directly in JavaScript. There is a way to read an HTTP status code (via XMLHttpRequest), but unfortunately the browser follows 301 redirects automatically, so JavaScript only ever sees the status of the final page and never the 301 itself.
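To illustrate the limitation, here is a minimal sketch of the in-browser approach. It is not the code we originally tried: the function name `getHttpStatus` is made up for illustration, it assumes a same-origin URL, and it uses a synchronous request only to keep the example short.

```javascript
// Hypothetical helper: returns the HTTP status the browser reports for a URL.
// Assumes a same-origin URL and an environment that provides XMLHttpRequest.
function getHttpStatus(url) {
  var xhr = new XMLHttpRequest();
  // Third argument false = synchronous request (short, but blocks the page).
  xhr.open('GET', url, false);
  xhr.send(null);
  // For a redirected URL this is the status of the final page, because the
  // browser follows the 301 before the script ever sees a response.
  return xhr.status;
}
```

Statuses such as 404 or 500 come back as expected with this approach; only the redirect itself is hidden from the script.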
In the end, we implemented our own web service (written in C#) that reads the HTTP status for a URL, and simply called it from the Selenium JavaScript. Because the C# code makes the request itself and can switch off automatic redirects, it picks up 301s fine (and we could also code special cases like 404s easily).
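A rough sketch of what the Selenium side of this could look like. The service URL, its `url` query parameter and the plain-text response format are all assumptions for illustration, not the real service. Defining a `getXxx` method on `Selenium.prototype` is the user-extensions.js convention for adding an accessor, from which Selenium derives commands such as `assertHttpStatus` and `verifyHttpStatus`.

```javascript
// Assumed endpoint: responds with the status code as plain text, e.g. "301".
var STATUS_SERVICE_URL = '/StatusService/GetStatus';

// Hypothetical helper: asks the server-side service for the status of pageUrl.
function fetchStatusFromService(pageUrl) {
  var xhr = new XMLHttpRequest();
  var query = STATUS_SERVICE_URL + '?url=' + encodeURIComponent(pageUrl);
  xhr.open('GET', query, false); // synchronous, to keep the sketch short
  xhr.send(null);
  if (xhr.status !== 200) {
    throw new Error('Status service failed with HTTP ' + xhr.status);
  }
  return parseInt(xhr.responseText, 10);
}

// Register the accessor only when running inside Selenium.
if (typeof Selenium !== 'undefined') {
  Selenium.prototype.getHttpStatus = function (pageUrl) {
    // Selenium compares accessor results as strings.
    return String(fetchStatusFromService(pageUrl));
  };
}
```

A test row could then read something like `assertHttpStatus | http://example.com/old-page | 301`, where the URL is a placeholder.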

Update:
At Angelblade's request, I'm posting some of the code we used to find the HTTP status of a URL. I'm not sure how much use it will be in Java though, since a lot of the processing is handled for us in the .NET objects!
It's probably not the best code, and I'm a bit hesitant about posting it, but it could at least provide a starting point for anyone else who wants to accomplish something similar.


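// Requires: using System; using System.Net;
public Int32 GetStatus(String encodedUrl)
{
    const Int32 RequestTimeoutMs = 90000;
    HttpWebRequest request = (HttpWebRequest) WebRequest.Create(encodedUrl);
    request.Timeout = RequestTimeoutMs;
    // Don't follow redirects, so a 301 is reported instead of the status of its target.
    request.AllowAutoRedirect = false;
    try
    {
        // Read the status code before the response is closed.
        using (HttpWebResponse response = (HttpWebResponse) request.GetResponse())
        {
            return (Int32) response.StatusCode;
        }
    }
    catch (WebException webException)
    {
        // 4xx/5xx responses surface as a ProtocolError with the response attached.
        // Check the status code rather than the (locale-dependent) message text.
        HttpWebResponse errorResponse = webException.Response as HttpWebResponse;
        if (webException.Status == WebExceptionStatus.ProtocolError &&
            errorResponse != null &&
            errorResponse.StatusCode == HttpStatusCode.Gone)
        {
            return 410;
        }
        // Anything else (other error codes, timeouts, DNS failures) is rethrown
        // with its original stack trace.
        throw;
    }
}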

9 comments:

Adamm said...

Is there a method in the Selenium API that returns the HTTP status code?

Marv said...

Hi Adamm, no, unfortunately not - that's why we had to look at extending it.
If you're interested, we eventually gave up using the Selenium IDE as it was too cumbersome to keep tests maintained. I'm now experimenting with using the Selenium RC (C# version) from within some Unit Tests. The C# service we wrote earlier is now used directly (i.e. not via a web service) to check status codes and is working well.

Angelblade said...

@Marv - Hi, I came across your blog when doing a Google search. Is it possible to share your code and what you did to get the status codes? I'm using Selenium RC (Java), and if I could use your changes that would be a great help.

Marv said...

Hi Angelblade,

I've added my code to the post, but I don't know whether it will help!

Cheers,

Marv

Angelblade said...

Correct me if I'm wrong, but it isn't getting the status of the request made by the browser. Rather, it makes another HTTP request through your method and checks whether that request (not the one made by the browser) is successful?

Marv said...

Hi Angelblade,

Yes, you are right - we had to fire off a separate request. For our purposes, it was enough to assume that every call to the same URL would result in the same HTTP status code. Hooking into the browser event model was too difficult, though I guess it might be possible in Firefox if you write/extend a Firefox extension. Writing a generic one for multiple browsers would, again, be quite tricky!

Cheers,
Marv

Angelblade said...

Ahh OK, thanks. I'm looking for the HTTP status of the request the browser itself made. We want to know whether it was, say, a 500, so that we can reload the page. Yeah, I realize it will be a bit tough to do, hence when I came across your blog I was a bit curious :D

quinn said...

Could you post the javascript code that worked for detecting statuses other than 301 redirects?

Marv said...

Hi quinn,

Sorry for not replying sooner. The JavaScript code we attempted is lost in the sands of time, but we basically followed the same approach of making another request to the same page via XMLHttpRequest. These two links look useful...
How to detect HTTP Status from Javascript
Ajax help