Description
@sigmavirus24 I wrote a test for a function that downloads a large zip file using the requests module. I found a discrepancy in the body length when comparing a test run with Betamax against a run without it: the Content-Length header is the same in both cases, but with Betamax the byte string in response.content is far larger (288055 bytes versus 159406). I also need to pass that byte string to BytesIO and then to zipfile.ZipFile, but doing so raises zipfile.BadZipFile: Bad magic number for central directory.
My test setup:
import betamax
from betamax.fixtures import unittest
import os

mode = os.getenv('BETAMAX_RECORD_MODE')
with betamax.Betamax.configure() as config:
    config.cassette_library_dir = 'tests/test_funcs/cassettes'
    config.default_cassette_options['record_mode'] = mode
    print(f'Using record mode <{mode}>')


def the_function(session):
    # session = requests.Session()
    from io import BytesIO
    from zipfile import ZipFile
    response = session.get("https://ww2.stj.jus.br/docs_internet/processo/dje/xml/stj_dje_20211011_xml.zip")
    zip_in_memory = BytesIO(response.content)
    try:
        my_zip = ZipFile(zip_in_memory, 'r')
        my_zip.testzip()
        result = True
    except Exception:
        result = False
    return result


class BaseTest(unittest.BetamaxTestCase):
    custom_headers = None
    custom_proxies = None
    _path_to_ignore = None
    _no_generator_return_search = False

    def setUp(self):
        super(BaseTest, self).setUp()
        if self.custom_headers:
            self.session.headers.update(self.custom_headers)
        if self.custom_proxies:
            self.session.proxies.update(self.custom_proxies)
        self.worker_under_test = self.worker_class()
        self.worker_under_test._session = self.session

    def test_search(self):
        result = the_function(self.session)
        assert result
I pass self.session to the function under test and use it to request an endpoint. That endpoint returns the zip file as a byte string (response.content). The test runs without errors if I don't use the Betamax session.
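A check along these lines can make the discrepancy explicit before the bytes ever reach ZipFile. It is only a sketch: `describe_body` is a hypothetical helper name, not part of requests or Betamax, and it uses an in-memory archive instead of the real download so that it runs without network access. It also assumes the response carries no Content-Encoding, as in the headers shown in this report; with a compressed transfer the declared and decoded lengths would legitimately differ.

```python
from io import BytesIO
from zipfile import ZipFile


def describe_body(headers, body):
    """Compare the declared Content-Length with the bytes actually received.

    `headers` is any mapping of header names to values (hypothetical helper,
    not a requests or Betamax API); `body` is the raw byte string.
    """
    declared = int(headers.get("Content-Length", -1))
    return {
        "declared_length": declared,
        "actual_length": len(body),
        "lengths_match": declared == len(body),
        # A non-empty zip archive begins with the local file header signature.
        "starts_with_zip_signature": body[:4] == b"PK\x03\x04",
    }


# Stand-in for the download: build a tiny archive in memory.
buffer = BytesIO()
with ZipFile(buffer, "w") as archive:
    archive.writestr("example.xml", "<root/>")
payload = buffer.getvalue()

report = describe_body({"Content-Length": str(len(payload))}, payload)
print(report)
```

In the test, the same helper could be called as `describe_body(response.headers, response.content)` right after `session.get(...)`, so a mismatch is reported at the point where it first appears.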
Test execution (with Betamax)
Session headers
{'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
Response headers
{'Accept-Ranges': 'bytes', 'ETag': 'W/"159406-1633990217000"', 'Last-Modified': 'Mon, 11 Oct 2021 22:10:17 GMT', 'Content-Type': 'application/zip', 'Content-Length': '159406', 'Date': 'Thu, 21 Oct 2021 14:37:27 GMT', 'Set-Cookie': 'BIGipServerpool_wserv=973081866.20480.0000; path=/; Httponly, TS01dc523b=016a5b383346ca02628a7c1dd47ef26e8cadf4a1b22fa9261c6b9ac1de8ac5665e99bd4a42c5b1d0af72b97105f57020b5e0f78fa7452df6080bf5ea3ee7a85d2de98968a2; Path=/; Domain=.www.stj.jus.br', 'Strict-Transport-Security': 'max-age=604800; includeSubDomains', 'Content-Security-Policy': "upgrade-insecure-requests; frame-ancestors 'self' https://*.stj.jus.br https://*.web.stj.jus.br https://stjjus.sharepoint.com/"}
Actual content length
len(response.content) == 288055
Script execution (without Betamax)
Session headers
{'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
Response headers
{'Accept-Ranges': 'bytes', 'ETag': 'W/"159406-1633990217000"', 'Last-Modified': 'Mon, 11 Oct 2021 22:10:17 GMT', 'Content-Type': 'application/zip', 'Content-Length': '159406', 'Date': 'Thu, 21 Oct 2021 14:39:24 GMT', 'Set-Cookie': 'BIGipServerpool_wserv=973081866.20480.0000; path=/; Httponly, TS01dc523b=016a5b3833746a54a2d1276a2b3de87f48f672e9cd7c18c4dad842ddddeac244bcbcf1a470b59eecf83bd6a3bdeffc7c7017210981de929d01df6c054118625399d2b04ad2; Path=/; Domain=.www.stj.jus.br', 'Strict-Transport-Security': 'max-age=604800; includeSubDomains', 'Content-Security-Policy': "upgrade-insecure-requests; frame-ancestors 'self' https://*.stj.jus.br https://*.web.stj.jus.br https://stjjus.sharepoint.com/"}
Actual content length
len(response.content) == 159406
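One possible explanation for a body that grows while the Content-Length header stays the same is that the binary content passes through a text encoding step somewhere between recording and playback. This is an unconfirmed assumption, not something established in this report. The sketch below only demonstrates the general effect with the standard library: `can_open_zip` is a hypothetical helper, and the decode/encode round trip is a stand-in for whatever lossy step might be involved, not Betamax's actual behaviour.

```python
from io import BytesIO
from zipfile import ZIP_STORED, ZipFile


def can_open_zip(data):
    """Return True only if the bytes open as a zip and testzip() finds no bad member."""
    try:
        with ZipFile(BytesIO(data), "r") as archive:
            return archive.testzip() is None
    except Exception:  # same broad catch as the_function in this report
        return False


# Build a small archive whose payload contains every byte value, stored uncompressed.
buffer = BytesIO()
with ZipFile(buffer, "w", compression=ZIP_STORED) as archive:
    archive.writestr("data.bin", bytes(range(256)) * 16)
original = buffer.getvalue()

# Hypothetical lossy step: treat the binary body as UTF-8 text, then encode it again.
# Bytes that are not valid UTF-8 become U+FFFD, which takes three bytes when re-encoded.
round_tripped = original.decode("utf-8", errors="replace").encode("utf-8")

print(len(original), len(round_tripped))
print(can_open_zip(original), can_open_zip(round_tripped))
```

The round-tripped data is longer than the original and no longer opens as a valid archive, which resembles the symptom described here, but the resemblance alone does not prove the cause.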
I'm using Python 3.8.2, Betamax 0.8.1, Requests 2.25.1, and Pytest 5.4.1 to run the tests.
Related question: https://stackoverflow.com/questions/69653406/how-to-mock-a-function-that-downloads-a-large-binary-content-using-betamax
Related issue: #122