This project is read-only.

URL Argument Parsing Failing on encoded '&' -- %26 or &

Jul 28, 2009 at 7:53 PM

In HttpHelper.cs : 103, the IsAmp function is used to determine when a new URL argument is starting and the current one should simply be added to the collection.

There are two special checks for encoded '&' characters (URL Encoded=%26 and HTML Encoded=&) that advance the string index automatically. While I don't quite fully understand why these checks are necessary, I do know that they currently cause the IsAmp function to return true and therefore that they are the start of a new argument (this is certainly not the case for %26, & is debatable, although I don't necessary agree with that special case).

Here is the new and improved IsAmp function (which I will submit as a patch) which simply adds 'return false;' to the code blocks that run within he two special case 'if' statements.

        private static bool IsAmp(string queryStr, ref int index, out int outIndex)
        {
            outIndex = index;
            if (queryStr[index] == '%' && queryStr.Length > index + 2 && queryStr[index + 1] == '2' &&
                queryStr[index + 2] == '6')
            {
                outIndex += 2;
                return false;
            }
            else if (queryStr[index] == '&')
            {
                if (queryStr.Length > index + 4
                    && (queryStr[index + 1] == 'a' || queryStr[index + 1] == 'A')
                    && (queryStr[index + 2] == 'm' || queryStr[index + 2] == 'M')
                    && (queryStr[index + 3] == 'p' || queryStr[index + 3] == 'P')
                    && queryStr[index + 4] == ';')
                {
                    outIndex += 4;
                    return false;
                }
            }
            else
                return false;

            return true;
        }
Does anyone not agree with this change? As it curently is (only one return false; after that lone else), I can't see how someone could ever provide '&' as a part of a URL argument value.
Aug 10, 2009 at 10:50 AM

Can you provide a testcase when this change is needed? As I see it, it would break the url decoding 

Aug 10, 2009 at 6:53 PM

I wasn't really sure the use case for the & check (I assumed you wanted to handle the case where someone HTML encoded it instead).. so I'll focus on the %26 case.

FYI: I am able to reproduce this even on today's latest source (25767).

I loaded up the trunk solution and ran Tutorial1 with a small modification to the /hello response logic:

            if (request.Uri.AbsolutePath == "/hello")
            {
                System.Text.StringBuilder sb = new System.Text.StringBuilder();
                sb.AppendLine("Hello to you too!<br/><br/>");

                sb.AppendLine("Params:<br/>");
                foreach (HttpInputItem param in request.Param)
                    sb.AppendLine(param.ToString() + "<br/>");

                context.Respond(sb.ToString());
            }

For the following URL you would expect to see    testparam = here&there    printed out (because that is the output format of the ToString() method on HttoInputItem) but that is not what happens.

http://localhost:8081/hello?testparam=here%26there

Here is the actual output:

Hello to you too!

Params:
testparam = here
= there

The parsing code (as it) treats the encoded & (%26) as an actual & and therefore starts parsing a new parameter (which results in the broken testparam parameter and the blank name for the there value).

The code change I made above was simply to returns false; after the %26 and &amp; checks (meaning this character (and those I'm advancing in outIndex are not specifying the start of a new parameter).

I hope this helps.