Strange hex characters appear in html source code (check_http and chunks)

Published on March 10th 2015 - Listed in Linux Monitoring Nagios


A website monitored with the Nagios plugin check_http kept raising an alert because the string being searched for in the source code (footerContent) was not found:

./check_http -I 192.168.4.50 -H www.example.com -s footerContent
HTTP CRITICAL: HTTP/1.1 200 OK - string footerContent not found on http://www.example.com:80/ - 21322 bytes in 2.500 second response time

check_http offers a verbose option (-v) which displays the full response, including the HTML source code. The verbose output revealed something strange:

./check_http -I 192.168.4.50 -H www.example.com -v
[...]
                <div id="footer">
                        <div class="footerCo
305
ntent cfx">
                                <p>&copy; 2015 EXAMPLE</p>
                                 <ul><li class=""><a href="/en/site-links/terms-conditions/" target="_self" class="">Terms & conditions
[...]

Note the "305" string in the source code, which sits in the middle of the string "footerContent". That explains why the Nagios HTTP check failed: the full string never appears in one piece in the response.
At first we suspected a bug in the check_http plugin, but the same behavior could be seen with curl's --raw option:

curl http://192.168.4.50 -H "Host: www.example.com" -v --raw

So the question was: where does this strange number come from? Another look at the full source code output of check_http revealed several more of these stray values.

A tcpdump also revealed that the website was delivered in several TCP packets. One of the packets started with the content "ntent cfx", so right after the 305 number. It seemed that Nginx was splitting up the source code into several packets, and we guessed that "305" marked the beginning of a new packet together with its size. But why did this happen?

I enabled the Nginx debug log and launched the check_http command again.
Meanwhile I had tried adding and removing several Nginx proxy parameters, so the stray value in the source code had changed both its position and its value:

./check_http -I 192.168.4.50 -H www.example.com -v
[...]
                <div id="footer">
                        <div class="footerContent cfx">
                                <p>&copy; 2015 EXAMPLE</p>
                                 <ul
2d4
><li class=""><a href="/en/site-links/terms-conditions/" target="_self" class="">Terms & conditions
[...]

In the huge debug output I found the following information:

[...]
[debug] 43063#0: *531 HTTP/1.1 200 OK
Server: nginx/1.7.8
Date: Tue, 10 Mar 2015 12:04:21 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: close
X-Powered-By: PHP/5.5.9-1ubuntu4
Set-Cookie: CONCRETE5=hq9tb7d4kvqqn8bfdv9uvofgm1; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Vary: Accept-Encoding
[...]

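One very interesting line caught my eye: Transfer-Encoding: chunked
Nginx therefore took the response from the upstream server and split it into several pieces (chunks). In chunked transfer encoding, each chunk is preceded by a line containing its size in hexadecimal, which is exactly what the "305" and "2d4" values were.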
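
To make the symptom easier to picture, here is a minimal Python sketch of how a body is framed in chunked transfer encoding. It is only an illustration: the function name encode_chunked and the chunk size of 20 bytes are assumptions chosen to reproduce the split seen above, and real servers pick chunk boundaries based on their own buffering.

```python
def encode_chunked(body: bytes, chunk_size: int) -> bytes:
    """Frame a body as HTTP/1.1 chunked transfer coding (no trailers).

    Hypothetical helper for illustration; chunk_size is an arbitrary choice.
    """
    out = bytearray()
    for start in range(0, len(body), chunk_size):
        chunk = body[start:start + chunk_size]
        # Each chunk: size in hexadecimal, CRLF, the data itself, CRLF
        out += f"{len(chunk):x}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    # A zero-sized chunk terminates the body
    out += b"0\r\n\r\n"
    return bytes(out)


html = b'<div class="footerContent cfx">'
raw = encode_chunked(html, 20)

print(raw)
print(b"footerContent" in html)  # True: the plain body contains the string
print(b"footerContent" in raw)   # False: a chunk size line splits it
print(int("305", 16), int("2d4", 16))  # the values seen above, in bytes
```

With a chunk size of 20, the first chunk ends after "footerCo", so a plain substring search on the raw data fails in the same way check_http did.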

I modified the Nginx vhost and disabled chunked_transfer_encoding:

    location / {
        proxy_pass http://upstream;
        chunked_transfer_encoding off;
    }

As soon as I reloaded Nginx, I tested it with check_http again:

./check_http -I 192.168.4.50 -H www.example.com -v
[...]
                <div id="footer">
                        <div class="footerContent cfx">
                                <p>&copy; 2015 EXAMPLE</p>
                                 <ul><li class=""><a href="/en/site-links/terms-conditions/" target="_self" class="">Terms & conditions

No strange characters and no more broken-up source code! I went through the full source code and couldn't find any of these "code splits".
In the Nginx debug log, the output changed, too:

[debug] 43196#0: *655 HTTP/1.1 200 OK
Server: nginx/1.7.8
Date: Tue, 10 Mar 2015 12:11:46 GMT
Content-Type: text/html; charset=UTF-8
Connection: close
X-Powered-By: PHP/5.5.9-1ubuntu4
Set-Cookie: CONCRETE5=it08ppn4bgjsjg1qvsvleo4432; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Vary: Accept-Encoding

The "Transfer-Encoding" header disappeared. So this is the reason why these "strange characters" showed up in the HTML source code.

These characters don't show up in a browser because the browser correctly decodes the chunked response. However, when the "raw" HTTP data is read, the chunk size lines split up your source code, causing an alert if your search string happens to be cut by a chunk boundary. So the issue here comes from the check_http plugin, which sends HTTP/1.1 requests but does not understand chunked transfer encoding. By definition, every client using HTTP/1.1 must understand Transfer-Encoding: chunked.
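
What a compliant client does before looking at the body can be sketched in a few lines of Python. This is a simplified illustration, not the code of check_http or of any browser: the function name decode_chunked is hypothetical, the sample data is made up to mirror the split seen earlier, and trailers after the final chunk are ignored.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Reassemble a chunked HTTP/1.1 body (hypothetical helper, no trailers)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # The size line may carry chunk extensions after ';', which we drop
        size_field = raw[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # the zero-sized chunk marks the end of the body
        body += raw[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


# Made-up sample: the search string is cut by a chunk boundary
raw = b'14\r\n<div class="footerCo\r\nb\r\nntent cfx">\r\n0\r\n\r\n'

print(b"footerContent" in raw)                  # False on the raw data
print(b"footerContent" in decode_chunked(raw))  # True once decoded
```

Searching the decoded body rather than the raw bytes is what makes the string check reliable, regardless of where the server happens to place the chunk boundaries.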


Comments (newest first)

ck from Geneva, Switzerland wrote on Mar 10th, 2015:

So in the end it all points to a bug in the check_http plugin, because it requests an HTTP/1.1 connection yet cannot handle chunked data. A bug has been open since 2009 (http://sourceforge.net/p/nagiosplug/bugs/446/):

Fixing 2638765 uncovered this bug (actually, most HTTP servers respect Host: headers in HTTP/1.0 requests, even though it isn't part of the spec).
Anyway: if a client sends a HTTP/1.1 request, it must be able to handle the incoming HTTP/1.1 reply. Which includes coping with a chunked one.


This should however be fixed in the upcoming release 2.1 of the nagios-plugins.