How bad is a query string?

In common web development people use query strings to pass parameters to the receiving web page. This technology is available in basically every language dealing with the web, such as ASP.NET, PHP, JSP, JavaScript etc.
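
For readers who have not worked with one directly, here is a minimal Python sketch of what "passing parameters" means in practice. The URL and the `read_parameters` helper are made up for illustration; only `urlsplit` and `parse_qs` come from the standard library.

```python
from urllib.parse import urlsplit, parse_qs

def read_parameters(url):
    """Return the query-string parameters of a URL as a dict of lists."""
    # Everything after the "?" (and before any "#") is the query string.
    query = urlsplit(url).query
    # parse_qs returns a list per key because a key may appear more than once.
    return parse_qs(query)

# Hypothetical URL, used only as an example.
print(read_parameters("http://example.com/article.php?id=3&lang=en"))
```

Every web platform mentioned above offers an equivalent of this lookup, which is a large part of why query strings became the default way to address dynamic pages.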

Sure, query strings aren’t always the best way to do things; it depends on the situation. But in my opinion there are a lot of cases where they are a justified and good approach. There are definitely many scenarios where posting a form can’t achieve the same effect and one has to resort to query strings instead, for instance, when a page needs to be directly available for bookmarking.

And yes, one can implement so-called friendly URLs, but from what I’ve seen it isn’t really the best approach either.

However, as most web developers are aware by now, using query strings will negatively affect your search engine ranking. My question is: why? Should we change a common web development standard just because search engines have a hard time dealing with it? And who are they to judge, when they use query strings extensively themselves?

22 Comments

  • I think there is one more fundamental question about this: why exactly do search engines have problems with query strings? Can someone point me to any resource on this?

  • I think it's about semantics. I don't think search engines have problems with query strings; they are just interpreting the definition of a resource on the internet.

    URL stands for "Uniform Resource Locator". In the days of static systems with a separate file for each document, the location of the resource was the location of the document, and the resource remained alive as long as the document existed.

    With query strings you're modifying a single document to produce certain content. The document itself is not sufficient enough to be a resource for content, as it cannot provide any without the parameters.

    For example, look at the difference between search functionality and a weblog. Entering parameters for a search script results in unpredictable content. Each time they are entered, the content can be different. As such, I would not consider it a uniform resource for specific data.

    A weblog with articles on the other hand is perfectly capable of always providing the same data for the same parameters. Due to this predictability, I would say it's possible to consider it a uniform resource.

    So, the reason search engines do not handle URLs with query strings is that they cannot possibly know whether the URL they are sending the user to still contains the same information as it did when it was indexed. It is not a uniform resource for the data they indexed.

    Excuse me if I mixed up the different acronyms and all. 😉

  • "And yes, one can implement so-called friendly URLs, but from what I’ve seen it isn’t really the best approach either."

    I forgot to ask, but why isn't it?

  • I think one of the reasons search engines (SEs) are putting less weight on dynamic pages is because of spammers. It's fairly easy to make a script that feeds random words onto a page based on a parameter. That way you can make hundreds of pages just by changing the query string. SEs know this, and the chance of hitting a spam page is lower on a static page than on a dynamic one.

  • Arní says:

    I try to minimize the query string as much as I can. When I look in the browser and see a short, pretty URL instead of a long, parameter-loaded URL, strangely I feel I have done a good job.

    That is not to say that I don't use them at all or go out of my way to avoid QS parameters, but I prefer to find other ways of passing parameters.

  • Marco says:

    I also fail to see what's wrong with non-crufty URLs. What exactly are the problems you're hinting at?

  • 4rn0 says:

    I use Apache's mod_rewrite a lot to generate 'friendly URLs' on my content-managed websites, and I was curious as to why you think this is not 'the best approach'?

    For me and my customers it's worked out great so far: excellent indexing on Google, addresses are easier to remember, etc.
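
The idea behind such a rewrite can be sketched in a few lines of Python. This is not mod_rewrite itself and not 4rn0's configuration; the `/articles/<number>` pattern, the `article.php` script name and the `rewrite` function are assumptions chosen purely to illustrate the mapping.

```python
import re
from urllib.parse import urlencode

# Hypothetical rule: /articles/<number> is served by article.php?id=<number>.
ARTICLE_PATH = re.compile(r"^/articles/(?P<id>\d+)/?$")

def rewrite(path):
    """Translate a friendly path into the internal query-string URL."""
    match = ARTICLE_PATH.match(path)
    if match is None:
        return path  # no rule applies, so the path is left untouched
    return "/article.php?" + urlencode({"id": match.group("id")})

print(rewrite("/articles/3"))
```

The visitor and the search engine only ever see the friendly path; the query string still exists, but only on the server side.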

  • It is perfectly sensible that there should be less emphasis on query strings, since a query string is the part of a URL that conveys parametric data to the server.

    I suppose it depends upon what type of string, though.

  • Robert Nyman says:

    Dejan,

    I think, nowadays at least, it's a conscious choice by them, actually, and not a technical problem.

    Jeroen,

    Interesting perspective! I definitely understand the data-integrity approach and the idea of being a uniform resource, but it is also about everyday web development.

    "So, why search engines do not handle URLs with query strings is because they cannot possibly know whether the URL they are sending the user to still contains the same information as it did at the point it was indexed."

    This argument can easily be turned around. Whether the information has changed has nothing to do with whether a query string is used; it applies to every URL. What is important to me is that the search engine conveys when the information was indexed; naturally, it can change after that date.

    Emil,

    The spam argument sounds very reasonable, and maybe one of the reasons.

    Arní,

    That sounds like a good standpoint.

    Jeroen, Marco, 4rn0,

    I have no problems with friendly URL/rewrites per se, but in my eyes it's still a workaround to achieve the same thing, and also a solution that isn't the default implementation in all the web developing languages.

    Robert W,

    Well, most web pages are dynamic, so whether that behavior is achieved through a query string or some other way (say, a friendly URL), I don't think it should matter.

  • Joel says:

    I agree with Jeroen that the problem has to do with static vs dynamic content. It used to be that a URL located a document, but now a URL can point to a program (or part of one) that produces different output based on the commands sent to it (via the query string). Since the SEs don't know how the program behaves (and they all tend to behave differently) they can't effectively index its output. If there were some document you could place in your app root (like an appindex.xml) that specified which query string parameters produced new documents and which had minimal effect on the output, then a SE could probably do a better job of indexing URLs with query string parameters.

    I recently worked on a URL-rewriting project (part1, part2) and the goal was to give static-looking URLs to the pages that contained the content while not worrying about the other pages. I do have to say, though, that URL-rewriting in .Net 1.1 is a bit of a headache.

  • Tommy Olsson says:

    I don't see a problem with query strings when they are used properly. SEs may (or may not) adjust their ranking when query strings are used, but once the site is judged to be authoritative, it doesn't matter. (I used query strings on my blog, and it did fairly well in Google.)

    What worries me is that some developers use query strings (HTTP GET requests) for operations that modify or delete information on the back-end. That is in violation of the HTTP specification, which says (IIRC) that GET requests must be read-only.
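
A minimal Python sketch of the rule Tommy describes, under stated assumptions: HTTP classifies GET, HEAD, OPTIONS and TRACE as "safe" methods, and the `handle_request` function, the `action` parameter and the in-memory `articles` store below are all hypothetical, not part of any framework.

```python
# Methods that HTTP defines as "safe", i.e. expected to be read-only.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}

def handle_request(method, action, articles, article_id):
    """Return an HTTP-style status code for a hypothetical article store."""
    if action == "delete":
        if method.upper() in SAFE_METHODS:
            return 405  # refuse: a safe method must not change state
        articles.pop(article_id, None)
        return 204  # deleted, nothing to return
    # Any other action is treated as a plain read.
    return 200 if article_id in articles else 404

articles = {3: "How bad is a query string?"}
print(handle_request("GET", "delete", articles, 3))   # refused, article is kept
print(handle_request("POST", "delete", articles, 3))  # accepted, article is removed
```

The practical risk with a link such as `article.php?id=3&action=delete` is that crawlers and prefetchers follow links with GET, so a read-only guarantee protects the data as well as the specification.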

  • Robert Nyman says:

    Joel,

    While it is indeed important to differentiate those two from a meta perspective, what says that the content in one has a higher value than the content in the other?

    Tommy,

    I agree, it has to be used with care and in the correct situations.

  • Jules says:

    Tommy wrote:

    "What worries me is that some developers use query strings (HTTP GET requests) for operations that modify or delete information on the back-end. That is in violation of the HTTP specification, which says (IIRC) that GET requests must be read-only."

    Good to know! Thanks.

  • Joel says:

    Ah, sorry, I wasn't trying to imply that one or the other had more value, just that one (items with 'clean' URLs) was easier to index.

  • Basically, what Jeroen said, but I would phrase it as follows:

    1) index.php?page=portfolio

    2) /portfolio/

    Query strings don't identify themselves as definite content, but as dynamic content. A portfolio page isn't dynamic content, but definite content.

    search.php?query=CSS+example

    This, however, clearly indicates dynamic content (as ?query=XHTML should be expected to provide a completely different content).

    I find that Query strings should not be used for definite pages. Especially bad is this:

    index.php?pageid=235

    Doesn't tell me jack sh!t about what it is, unlike this:

    /log/archive/2006/01/17/easy-css-themes

    Then, there's the other reason of great importance: Cool URIs Don't Change.

    What if your system is replaced by one in a different language? index.php indicates that it runs on PHP, but now your new system is ASP.NET, and not only is the extension no longer appropriate, but all GET vars used by the new system are completely different. With a system that relies on query strings for URLs, you are now officially f*cked. 🙂

    So, Query Strings are just not the smartest approach.

  • Robert Nyman says:

    Joel,

    Why would it be easier to index? Because the friendly URL has a more understandable text?

    Faruk,

    I buy the argument about being semantically correct, but it still doesn't change whether the page is dynamic or not.

    If the system changes from, for instance, PHP to ASP.NET, I don't think the URLs would be the biggest problem in that transition… 🙂

    I'm all for friendly URLs, but I'm just saying that that is indeed a workaround compared to how the programming languages are constructed initially. So, should the languages themselves change then?

  • Robert, interesting question. Should they change? Doesn't Ruby on Rails allow you to build an application with friendly URLs out of the box?

    As I've said to you before, I don't think it's a real workaround. It's just a matter of having your application reflect the way the internet is defined and understood by other applications.

  • Robert Nyman says:

    Jeroen,

    Well, maybe languages should have built-in support for that from scratch. Will it happen? Probably not in the foreseeable future.

  • I'm no expert, but if you write a crawler, you'll find that you need to normalize URLs. That is, you need to make them unique.

    You may find links on a page like:

    http://www.jumpsocial.com/index.php
    http://jumpsocial.com/index.php/
    index.php

    These links can all refer to the same place. There are a lot of ways to rewrite a link.

    There is a way of normalizing these links so that you don't fail to resolve different rewrites into the same address.

    With querystrings, it's harder to normalize.

    http://www.JumpSocial.com/index.php?ooga=1&bo
    http://www.JumpSocial.com/index.php?booga=2&o

    Are the same URL. If you have lots of querystring variables, I can imagine that it would be even harder.

    That's one possibility why they say that crawlers don't like querystrings.

    Personally I don't understand why that would downgrade your rating or confuse a crawler. I check my Web logs and I find that Google and other crawlers completely explore the querystring space.

    I'd like to see some evidence that it downgrades your page rank or findability.

    Offhand, it comes to mind that one could order the querystring variables alphabetically as a way of normalizing.

    Darcy Whyte
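
Darcy's suggestion of ordering the variables alphabetically can be sketched in Python as follows. This is only an illustration under stated assumptions: the `normalize` function is hypothetical, real crawlers apply many more rules, and the two URLs are written-out variants of the example in the comment above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def normalize(url):
    """Return a canonical form of `url` so that equivalent links compare equal."""
    parts = urlsplit(url)
    # Host names are case-insensitive, so compare them in lower case.
    host = parts.netloc.lower()
    # Sort the (name, value) pairs so that parameter order no longer matters.
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((parts.scheme.lower(), host, parts.path, urlencode(pairs), ""))

a = normalize("http://www.JumpSocial.com/index.php?ooga=1&booga=2")
b = normalize("http://www.jumpsocial.com/index.php?booga=2&ooga=1")
print(a == b)
```

Sorting only helps when the order of parameters carries no meaning for the application, which a crawler cannot know for certain; that uncertainty is one plausible reason query strings are harder to normalize than plain paths.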

  • Even if there is no benefit to SEO (which is questionable), there's a definite benefit for the user.

    For example,

    /articles/3 hints at there being at least 2 more articles to view

    /article.php?id=3 would be scary to most web users.

    So why not make it easier for users to find related pages without searching?

  • Darcy says:

    I agree, Allain. But shouldn't the application give links to the other articles (perhaps next/previous or a list)?

    Darcy

  • Allain Lalonde says:

    Sure but that doesn't really have much to do with SEO and urls. I guess the point is intuitive urls are better for their own sake.
