[Inteproxy-commits] r359 - in trunk: . inteproxy test
scm-commit at wald.intevation.org
Tue Mar 6 18:18:24 CET 2012
Author: aheinecke
Date: 2012-03-06 18:18:24 +0100 (Tue, 06 Mar 2012)
New Revision: 359
Added:
trunk/inteproxy/decompressstream.py
trunk/test/test_decompressstream.py
Modified:
trunk/
trunk/ChangeLog
trunk/inteproxy/httpmessage.py
trunk/inteproxy/proxycore.py
trunk/test/test_inteproxy.py
Log:
Merged revisions 347-357 via svnmerge from
svn+ssh://aheinecke@svn.wald.intevation.org/inteproxy/branches/compression
........
r347 | aheinecke | 2012-02-22 17:06:21 +0000 (Wed, 22 Feb 2012) | 8 lines
Add support for gzip/deflate compression in HTTPMessage
The read method now uncompresses the data if necessary and always
returns the uncompressed value. Both compression methods use
a zlib decompression object with a different Window Size.
This also allows decompressing chunks of the stream without
loading the complete stream into memory.
........
r348 | aheinecke | 2012-02-22 17:08:53 +0000 (Wed, 22 Feb 2012) | 2 lines
Add Accept-Encoding headers to http requests
........
r349 | aheinecke | 2012-02-22 17:58:14 +0000 (Wed, 22 Feb 2012) | 3 lines
Break lines at 80 characters in the docstring and
change indentation a bit.
........
r350 | aheinecke | 2012-02-23 15:22:26 +0000 (Thu, 23 Feb 2012) | 4 lines
Add TestInteProxyCompressedConnection to test content-encodings
A test for compression on a chunked connection is still missing.
........
r351 | aheinecke | 2012-02-23 15:26:28 +0000 (Thu, 23 Feb 2012) | 5 lines
Check for the need to decompress a response on initialization
of the response.
This allows the headers (Content-Encoding/Content-Length) to be
changed correctly before forwarding the response.
........
r352 | aheinecke | 2012-02-23 17:53:21 +0000 (Thu, 23 Feb 2012) | 10 lines
Move compression logic out of the httpmessage classes.
The decompression will still be transparent for the transfer_data
functions but is no longer handled by the HTTPMessage class.
Also, the Accept-Encoding header is no longer overwritten and is
only added if the client did not request gzip or deflate.
If the request already contained Accept-Encoding and no rewrite
is necessary, the response will stay encoded.
........
r353 | aheinecke | 2012-02-24 09:33:54 +0000 (Fri, 24 Feb 2012) | 10 lines
Add exception handling to compression, return raw data
if it cannot be decompressed.
Change decompressed read to allow chained read wrappers.
Add test to TestInteProxyCompressedConnection for handling
an invalid compressed response.
Improve comments.
........
r354 | aheinecke | 2012-02-24 16:04:28 +0000 (Fri, 24 Feb 2012) | 12 lines
* M inteproxy/proxycore.py:
Add method decompressed_read to read decompressed
data from a compressed response and use it where
data from a httpresponse is read.
* M test/test_inteproxy.py:
Disable test in TestInteProxyCompressedConnection for handling
an invalid compressed response, as InteProxy will now crash in
that case.
* M inteproxy/httpmessage.py:
Add decompressor parameter to read_entire_message to decompress
the body with it.
........
r355 | aheinecke | 2012-02-29 09:57:31 +0000 (Wed, 29 Feb 2012) | 11 lines
Refactor compression. HTTPMessage now reads from an
internal _body_stream, which is a DecompressStream
if do_decompress is called before reading.
do_decompress now also modifies the headers to remove
Content-Encoding and Content-Length.
The body stream in httpmessage is now also set
when the body is set.
........
r356 | aheinecke | 2012-03-02 10:09:39 +0000 (Fri, 02 Mar 2012) | 2 lines
Fix indentation
........
r357 | aheinecke | 2012-03-02 10:23:30 +0000 (Fri, 02 Mar 2012) | 8 lines
Fix bug in decompressstream reading that caused a read
with a negative parameter to return incomplete data
when parts of the stream had already been read.
Added unit test for decompressstream reading.
Some cleanup in proxycore
........
Property changes on: trunk
___________________________________________________________________
Modified: svnmerge-integrated
- /branches/compression:1-346
+ /branches/compression:1-358
Modified: trunk/ChangeLog
===================================================================
--- trunk/ChangeLog 2012-03-06 17:17:41 UTC (rev 358)
+++ trunk/ChangeLog 2012-03-06 17:18:24 UTC (rev 359)
@@ -1,3 +1,83 @@
+2012-03-02 Andre Heinecke <aheinecke at intevation.de>
+ * A test/test_decompressstream.py:
+ Added test for decompressed reading
+ * M inteproxy/decompressstream.py:
+ Fix reading of the complete stream after starting to read
+ small chunks.
+ * M inteproxy/proxycore.py:
+ Remove the response parameter again for the transfer_data
+ functions.
+
+2012-02-29 Andre Heinecke <aheinecke at intevation.de>
+ * M inteproxy/proxycore.py:
+ Remove decompressed_read method.
+ Remove method get_decompress_object
+ * M inteproxy/httpmessage.py:
+ Added do_decompress method to httpresponse to select decompression
+ of the response.
+ read now creates a decompression object if necessary to decompress
+ the input stream of the response.
+ * A inteproxy/decompressstream.py:
+ Add decompress stream class to wrap around an input stream for
+ decompressed reading.
+
+2012-02-24 Andre Heinecke <aheinecke at intevation.de>
+ * M inteproxy/proxycore.py:
+ Add method decompressed_read to read decompressed
+ data from a compressed response and use it where
+ data from a httpresponse is read.
+ * M test/test_inteproxy.py:
+ Disable test in TestInteProxyCompressedConnection for handling
+ an invalid compressed response, as InteProxy will now crash in
+ that case.
+ * M inteproxy/httpmessage.py:
+ Add decompressor parameter to read_entire_message to decompress
+ the body with it.
+
+2012-02-24 Andre Heinecke <aheinecke at intevation.de>
+
+ * M inteproxy/proxycore.py:
+ Add exception handling to compression, return raw data
+ if it can not be decompressed.
+ Change decompressed read to allow chained read wrappers
+ * M test/test_inteproxy.py:
+ Add test to TestInteProxyCompressedConnection for handling
+ an invalid compressed response
+
+2012-02-23 Andre Heinecke <aheinecke at intevation.de>
+
+ * M inteproxy/httpmessage.py:
+ Remove compression handling but still use the read function
+ of HTTPMessage for reading the entire body.
+ * M inteproxy/proxycore.py:
+ Move compression handling into the proxycore. Add method
+ get_decompress_object to select the correct decompression
+ algorithm. Only do decompression if the client has not
+ requested a compressed response or if we need to rewrite urls.
+
+2012-02-23 Andre Heinecke <aheinecke at intevation.de>
+
+ * M inteproxy/httpmessage.py:
+ Only decompress responses, remove the Content-Encoding
+ header, decide compression on initialization
+ of a httpresponse.
+
+2012-02-23 Andre Heinecke <aheinecke at intevation.de>
+
+ * M test/test_inteproxy.py:
+ Add TestInteProxyCompressedConnection to test
+ messages with content-encoding deflate and gzip
+
+2012-02-22 Andre Heinecke <aheinecke at intevation.de>
+
+ * M inteproxy/proxycore.py:
+ Add Accept-Encoding header to http_request
+
+2012-02-22 Andre Heinecke <aheinecke at intevation.de>
+
+ * M inteproxy/httpmessage.py:
+ Added support for decompressing HTTPMessages on read
+
2012-01-21 Bjoern Schilberg <bjoern.schilberg at intevation.de>
* M setup.py:
Copied: trunk/inteproxy/decompressstream.py (from rev 357, branches/compression/inteproxy/decompressstream.py)
===================================================================
--- trunk/inteproxy/decompressstream.py (rev 0)
+++ trunk/inteproxy/decompressstream.py 2012-03-06 17:18:24 UTC (rev 359)
@@ -0,0 +1,51 @@
+# Copyright (C) 2012 by Intevation GmbH
+# Authors:
+# Bernhard Herzog <bh at intevation.de>
+# Andre Heinecke <aheinecke at intevation.de>
+#
+# This program is free software under the GPL (>=v2)
+# Read the file COPYING coming with the software for details.
+
+""" On the fly decompression of a data stream """
+
+class DecompressStream(object):
+    """A class to wrap around a data stream that contains compressed data.
+
+    The decompression object can be given on initialization.
+    """
+
+    def __init__(self, infile, decompressobj):
+        """Initialize the DecompressStream object.
+
+        The parameter infile is used as input stream
+        and the parameter decompressobj to provide the decompression.
+        """
+        self.infile = infile
+        self.decompressor = decompressobj
+
+    def read(self, amount = -1):
+        """Decompress the stream and return the uncompressed data.
+
+        A negative parameter for amount indicates that the complete
+        stream should be decompressed.
+        """
+        decompressed_chunks = []
+        count = 0
+
+        if amount < 0:
+            compressed = self.decompressor.unconsumed_tail
+            compressed += self.infile.read()
+            return self.decompressor.decompress(compressed)
+
+        while count < amount:
+            max_read = amount - count
+            compressed = self.decompressor.unconsumed_tail
+            if not compressed:
+                compressed = self.infile.read(amount)
+            if not compressed:
+                break
+            deflated = self.decompressor.decompress(compressed, max_read)
+            count += len(deflated)
+            decompressed_chunks.append(deflated)
+
+        return "".join(decompressed_chunks)
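The bounded read in DecompressStream relies on two features of zlib decompression objects: the max_length argument of decompress() and the unconsumed_tail attribute that holds input withheld by that limit. The following is a minimal sketch of the same idea in Python 3 (the committed code targets Python 2 and str data). The class name DecompressStreamSketch is an assumption made for illustration and is not part of InteProxy.

```python
import io
import zlib


class DecompressStreamSketch:
    """Python 3 sketch of a read() wrapper around a zlib decompression object.

    Hypothetical illustration only, not the class from the commit.
    """

    def __init__(self, infile, decompressor):
        self.infile = infile
        self.decompressor = decompressor

    def read(self, amount=-1):
        if amount < 0:
            # Feed any withheld input plus the rest of the stream in one call.
            compressed = self.decompressor.unconsumed_tail + self.infile.read()
            return self.decompressor.decompress(compressed)

        chunks = []
        count = 0
        while count < amount:
            # Input held back by an earlier max_length limit is used first.
            compressed = self.decompressor.unconsumed_tail
            if not compressed:
                compressed = self.infile.read(amount)
            if not compressed:
                break
            # max_length caps the output so that at most `amount` bytes
            # are returned in total.
            data = self.decompressor.decompress(compressed, amount - count)
            count += len(data)
            chunks.append(data)
        return b"".join(chunks)


if __name__ == "__main__":
    text = b"It's still magic even if you know how it's done."
    stream = DecompressStreamSketch(io.BytesIO(zlib.compress(text)),
                                    zlib.decompressobj())
    print(stream.read(5), stream.read(1), stream.read())
```

Prepending unconsumed_tail in the negative-amount branch is what the r357 fix is about: without it, input withheld by an earlier bounded read would be skipped.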
Modified: trunk/inteproxy/httpmessage.py
===================================================================
--- trunk/inteproxy/httpmessage.py 2012-03-06 17:17:41 UTC (rev 358)
+++ trunk/inteproxy/httpmessage.py 2012-03-06 17:18:24 UTC (rev 359)
@@ -8,6 +8,8 @@
"""Abstractions for http request and response messages"""
from StringIO import StringIO
+from inteproxy.decompressstream import DecompressStream
+from zlib import decompressobj, MAX_WBITS
class HTTPMessage(object):
@@ -66,6 +68,7 @@
if content_type is not None:
self.headers["Content-type"] = content_type
self._body = body
+ self._body_stream = StringIO(self.body)
def get_body(self):
self.read_entire_message()
@@ -77,12 +80,11 @@
raise NotImplementedError
def read(self, amount):
- if self._body_stream is None and self.body_has_been_read():
- self._body_stream = StringIO(self.body)
if self._body_stream is not None:
- return self._body_stream.read(amount)
+ data = self._body_stream.read(amount)
else:
- return self.infile.read(amount)
+ data = self.infile.read(amount)
+ return data
class HTTPRequestMessage(HTTPMessage):
@@ -144,6 +146,8 @@
self.version = version
self.status = status
self.reason = reason
+ self.__decompress = False
+ self.started_reading = False
def debug_log_message(self, log_function):
log_function("HTTPResponseMessage: %s %s %s",
@@ -151,6 +155,78 @@
super(HTTPResponseMessage, self).debug_log_message(log_function)
     def read_entire_message(self):
-        if self.body_has_been_read():
+        """Read the entire message and set the message's body.
+
+        If do_decompress was called before, the body will be
+        decompressed.
+        """
+        if not self.body_has_been_read():
+            self.set_body(self.read())
+
+    def do_decompress(self):
+        """
+        Decompress the input stream on read if possible.
+
+        If the content-encoding of the message is either gzip
+        or deflate the Content-Encoding and Content-Length
+        headers will also be removed.
+        """
+        if self.__decompress:
             return
-        self.set_body(self.infile.read())
+
+        if not self.headers.get("Content-Encoding"):
+            return
+
+        if self.headers.get("Content-Encoding") == "deflate":
+            self.__decompress = "deflate"
+        elif self.headers.get("Content-Encoding") == "gzip":
+            self.__decompress = "gzip"
+
+        if self.__decompress:
+            # Can decompress the input
+            if self.started_reading:
+                self.__decompress = False
+                raise Exception("do_decompress called after first read")
+
+            del self.headers["Content-Encoding"]
+            if self.headers.get("Content-Length"):
+                del self.headers["Content-Length"]
+
+    def read(self, amount = -1):
+        """
+        Read up to amount bytes of the message and return them as a
+        data string.
+        If do_decompress was called before this will return the data in
+        a decompressed form if possible.
+        """
+
+        if self.started_reading == False and amount != 0:
+            self.started_reading = True
+
+        if self._body_stream is None:
+            if self.__decompress:
+                # Create the decompression object
+                #
+                # For deflate decompression -zlib.MAX_WBITS is given to ensure
+                # that non-RFC-conforming responses, as they are sent by most
+                # http servers, are decompressed correctly by ignoring a
+                # possibly invalid header.
+
+                # To decompress gzip with zlib, 16 needs to be added to the
+                # wbits parameter.
+
+                # See the documentation of inflateInit2 at
+                # http://zlib.net/manual.html
+
+                if self.__decompress == "deflate":
+                    self._body_stream = DecompressStream(self.infile,
+                                            decompressobj(-MAX_WBITS))
+                elif self.__decompress == "gzip":
+                    self._body_stream = DecompressStream(self.infile,
+                                            decompressobj(16 + MAX_WBITS))
+                else:
+                    raise Exception("Invalid decompress method"
+                                    " in HTTPResponse")
+            else:
+                self._body_stream = self.infile
+        return self._body_stream.read(amount)
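The wbits values passed to decompressobj in read() select the container format that zlib expects. The sketch below, written for Python 3, shows the three formats side by side; the helper names compress_as and decompress_as and the constants RAW, GZIP, ZLIB and AUTO are assumptions introduced for illustration. It also shows that a raw-deflate decompressor rejects zlib-wrapped input for this payload, and that 32 + MAX_WBITS lets zlib detect a gzip or zlib header automatically.

```python
import zlib

PAYLOAD = b"foo"

RAW = -zlib.MAX_WBITS        # raw deflate stream, no header or checksum
GZIP = 16 + zlib.MAX_WBITS   # gzip header and trailer
ZLIB = zlib.MAX_WBITS        # zlib header and Adler-32 trailer
AUTO = 32 + zlib.MAX_WBITS   # decompression only: detect gzip or zlib header


def compress_as(wbits):
    """Compress PAYLOAD in the container format selected by wbits."""
    compressor = zlib.compressobj(wbits=wbits)
    return compressor.compress(PAYLOAD) + compressor.flush()


def decompress_as(data, wbits):
    """Decompress data with a decompression object created for wbits."""
    return zlib.decompressobj(wbits).decompress(data)


if __name__ == "__main__":
    for name, wbits in (("raw", RAW), ("gzip", GZIP), ("zlib", ZLIB)):
        print(name, decompress_as(compress_as(wbits), wbits))
    try:
        decompress_as(compress_as(ZLIB), RAW)
    except zlib.error as exc:
        print("raw decompressor rejected zlib-wrapped data:", exc)
```

The committed code uses the raw setting for "deflate" and the gzip setting for "gzip"; the AUTO value is shown only for comparison and is not used by the commit.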
Modified: trunk/inteproxy/proxycore.py
===================================================================
--- trunk/inteproxy/proxycore.py 2012-03-06 17:17:41 UTC (rev 358)
+++ trunk/inteproxy/proxycore.py 2012-03-06 17:18:24 UTC (rev 359)
@@ -75,6 +75,11 @@
client_request = self.read_client_request()
#
+ # Make sure that it requests compressed data
+ #
+ self.ensure_encoding_header(client_request)
+
+ #
# Determine the transcoder to use
#
transcoder = self.server.transcoder_map.get_transcoder(self.command,
@@ -222,6 +227,7 @@
extra_headers = [("Host", "%s:%d" % remote_address)]
+
sock = None
if scheme == "http":
@@ -299,6 +305,23 @@
return response_message
+    def ensure_encoding_header(self, client_request):
+        """
+        Request compression even if the client did not.
+        This will modify the headers of client_request.
+        """
+        if not client_request.headers.get("Accept-Encoding"):
+            client_request.headers["Accept-Encoding"] = "gzip, deflate"
+            self.should_decompress_response = True
+        elif ( not "gzip" in client_request.headers["Accept-Encoding"] and
+               not "deflate" in client_request.headers["Accept-Encoding"] ):
+            client_request.headers["Accept-Encoding"] = \
+                ", ".join([client_request.headers["Accept-Encoding"],
+                           "gzip", "deflate"])
+            self.should_decompress_response = True
+        else:
+            self.should_decompress_response = False
+
def send_headers(self, response):
"""Write the HTTP headers to the output stream."""
for header, value in response.headers.items():
@@ -318,6 +341,9 @@
do_rewrite = self.have_to_rewrite()
do_chunked = response.headers.get("Transfer-encoding") == "chunked"
+ if do_rewrite or self.should_decompress_response:
+ response.do_decompress()
+
if do_chunked and do_rewrite:
self.send_headers(response)
self.transfer_data_rewrite_chunked(response)
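The decision made by ensure_encoding_header can be sketched as a pure function over a header dictionary, which makes the three cases easy to check in isolation. The function name ensure_accept_encoding and the plain dict used for the headers are assumptions for illustration; the committed method modifies client_request.headers in place and stores the flag on the handler.

```python
def ensure_accept_encoding(headers):
    """Return (new_headers, should_decompress) for a request header dict.

    Hypothetical helper mirroring the three cases of the committed method:
    no Accept-Encoding, one without gzip/deflate, one that already has them.
    """
    headers = dict(headers)  # work on a copy, leave the caller's dict alone
    accepted = headers.get("Accept-Encoding", "")
    if not accepted:
        # The client asked for nothing: request compression ourselves and
        # remember to decompress before answering the client.
        headers["Accept-Encoding"] = "gzip, deflate"
        return headers, True
    if "gzip" not in accepted and "deflate" not in accepted:
        # Keep what the client asked for and append our encodings.
        headers["Accept-Encoding"] = ", ".join([accepted, "gzip", "deflate"])
        return headers, True
    # The client understands gzip or deflate itself: pass the body through.
    return headers, False


if __name__ == "__main__":
    print(ensure_accept_encoding({}))
    print(ensure_accept_encoding({"Accept-Encoding": "identity"}))
    print(ensure_accept_encoding({"Accept-Encoding": "gzip"}))
```

Like the committed code, this uses a substring test, so quality values such as "gzip;q=0" are not interpreted.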
Copied: trunk/test/test_decompressstream.py (from rev 357, branches/compression/test/test_decompressstream.py)
===================================================================
--- trunk/test/test_decompressstream.py (rev 0)
+++ trunk/test/test_decompressstream.py 2012-03-06 17:18:24 UTC (rev 359)
@@ -0,0 +1,44 @@
+# Copyright (C) 2012 by Intevation GmbH
+# Authors:
+# Andre Heinecke <aheinecke at intevation.de>
+#
+# This program is free software under the GPL (>=v2)
+# Read the file COPYING coming with the software for details
+
+"""Tests for the inteproxy.decompressstream module"""
+
+import unittest
+import StringIO
+import zlib
+
+from random import Random
+
+from inteproxy.decompressstream import DecompressStream
+
+class DecompressStreamTest(unittest.TestCase):
+
+    def test_read_amounts(self):
+        """Test reading different amounts from a DecompressStream"""
+
+        DATA = "It's still magic even if you know how it's done."
+
+        compressed_data = zlib.compress(DATA)
+        compressed_stream = StringIO.StringIO(compressed_data)
+        decompressobj = zlib.decompressobj()
+
+        dstream = DecompressStream(compressed_stream, decompressobj)
+
+        # Test reading small "bites"
+        result = dstream.read(5)
+        self.assertEqual(result, DATA[:5])
+        result2 = dstream.read(1)
+        self.assertEqual(result2, DATA[5])
+        result3 = dstream.read(-123)
+        self.assertEqual(result + result2 + result3, DATA)
+
+        # Test reading everything
+        compressed_stream = StringIO.StringIO(compressed_data)
+        decompressobj = zlib.decompressobj()
+
+        dstream = DecompressStream(compressed_stream, decompressobj)
+        self.assertEqual(dstream.read(), DATA)
Modified: trunk/test/test_inteproxy.py
===================================================================
--- trunk/test/test_inteproxy.py 2012-03-06 17:17:41 UTC (rev 358)
+++ trunk/test/test_inteproxy.py 2012-03-06 17:18:24 UTC (rev 359)
@@ -521,3 +521,48 @@
self.assertEquals(response.status, 200)
data = response.read()
self.assertEquals(data, "some text")
+
+class TestInteProxyCompressedConnection(ServerTest):
+    remote_contents = [
+        ("/plain", [("Content-Type", "text/plain")], "not encoded"),
+        ("/gzip", [("Content-Type", "text/plain"),
+                   ("Content-Encoding", "gzip")],
+         base64.b64decode("H4sICNwRRk8AA2Zvby50eHQAS8vP5wIAqGUyfgQAAAA=")),
+        ("/deflate", [("Content-Type", "text/plain"),
+                      ("Content-Encoding", "deflate")],
+         base64.b64decode("S8vPBwA="))]
+        # ("/invalid", [("Content-Type", "text/plain"),
+        #               ("Content-Encoding", "deflate")], "foo")]
+
+
+    def test_plain(self):
+        http = httplib.HTTPConnection("localhost", self.server.server_port)
+        http.request("GET", self.remote_server_base_url + "plain")
+        response = http.getresponse()
+        self.assertEquals(response.status, 200)
+        data = response.read()
+        self.assertEquals(data, "not encoded")
+
+    def test_deflate(self):
+        http = httplib.HTTPConnection("localhost", self.server.server_port)
+        http.request("GET", self.remote_server_base_url + "deflate")
+        response = http.getresponse()
+        self.assertEquals(response.status, 200)
+        data = response.read()
+        self.assertEquals(data, "foo")
+
+    def test_gzip(self):
+        http = httplib.HTTPConnection("localhost", self.server.server_port)
+        http.request("GET", self.remote_server_base_url + "gzip")
+        response = http.getresponse()
+        self.assertEquals(response.status, 200)
+        data = response.read()
+        self.assertEquals(data, "foo\n")
+
+    #def test_invalid_data(self):
+    #    http = httplib.HTTPConnection("localhost", self.server.server_port)
+    #    http.request("GET", self.remote_server_base_url + "invalid")
+    #    response = http.getresponse()
+    #    self.assertEquals(response.status, 200)
+    #    data = response.read()
+    #    self.assertEquals(data, "foo")
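The base64 fixtures in TestInteProxyCompressedConnection can be decoded directly with zlib to see what the proxy is expected to hand to the client. The Python 3 sketch below does that, and also illustrates the fallback described for r353, where raw data is returned if it cannot be decompressed (the commit itself leaves the corresponding test disabled). The helper name decode_or_raw is an assumption made for illustration.

```python
import base64
import zlib

# Fixture bodies copied from the test class above.
GZIP_BODY = base64.b64decode("H4sICNwRRk8AA2Zvby50eHQAS8vP5wIAqGUyfgQAAAA=")
DEFLATE_BODY = base64.b64decode("S8vPBwA=")
INVALID_BODY = b"foo"  # labelled "deflate" in the disabled fixture


def decode_or_raw(data, wbits):
    """Decompress data, or return it unchanged if zlib rejects it.

    Hypothetical helper sketching a "return raw data" fallback.
    """
    try:
        return zlib.decompressobj(wbits).decompress(data)
    except zlib.error:
        return data


if __name__ == "__main__":
    print(decode_or_raw(GZIP_BODY, 16 + zlib.MAX_WBITS))
    print(decode_or_raw(DEFLATE_BODY, -zlib.MAX_WBITS))
    print(decode_or_raw(INVALID_BODY, -zlib.MAX_WBITS))
```

A fallback of this kind only works when the whole body is decoded in one call; with chunked forwarding, part of the output may already have been sent before the error is detected.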