Igor Kromin |   Consultant. Coder. Blogger. Tinkerer. Gamer.

Not all stories have a good ending, and this one certainly doesn't. If you try to extract the contents of a ZIP file in a PHP app that runs in Google's App Engine Standard Environment, you are bound to get a lot of headaches and, ultimately, failure. This is exactly what happened to me over the weekend: something that started off as a simple new feature turned into a day of debugging and hitting a brick wall.

My goal was to be able to upload a ZIP file to my webapp and have it automatically unzipped immediately after. This was something I'd implemented previously, although that code ran on a standalone Apache server. This should have been an easy task! After I hit my first lot of issues, I came across this entry in the Google App Engine Issue Tracker: "PHP zip extension can't create zip file on production server". The issue was logged in 2014 and, 4 years later, still hasn't been resolved!

Before going into details, yes the zip and zlib extensions were enabled in the standard environment...
Enabled extensions
The following extensions have been enabled in the PHP runtime for App Engine:
...
zip
zlib


One of the checks I added during debugging was to make sure that PHP was able to read the file I was trying to unzip. This was done using fopen() like so...
 PHP
$fileName = 'gs://#default#/test.zip';
$file = fopen($fileName, 'r');
var_dump($file);
fclose($file);


...which gave me a nice resource stream. The file was readable.
 Output
resource(40) of type (stream)
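
A successful fopen() only shows that the stream can be opened, not that the bytes coming back look like a ZIP file. A slightly stricter check, sketched below, reads the first four bytes and compares them with the ZIP local file header signature (PK\x03\x04). The helper name looksLikeZip() is hypothetical, and whether a gs:// path behaves the same as a local file here is an assumption.

```php
<?php
// Hypothetical helper (not part of any library): returns true if the file
// starts with the ZIP local file header signature "PK\x03\x04".
// Note: an empty archive starts with a different signature and would
// be reported as false by this simple check.
function looksLikeZip($path)
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false; // the file could not be opened at all
    }
    $signature = fread($handle, 4);
    fclose($handle);

    return $signature === "PK\x03\x04";
}

// Same path as in the post; outside App Engine this simply reports false.
var_dump(looksLikeZip('gs://#default#/test.zip'));
```

If this reports false for a file that is known to be a valid archive, the problem is more likely in how the bytes are read than in the zip code itself.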


OK, so let's see where I started: probably the most obvious place, the PHP ZipArchive class.
 PHP
$zipFile = new ZipArchive();
$status = $zipFile->open($fileName);
var_dump($status);


...which ended up failing with...
 Output
int(11)


This of course was the "ZipArchive::ER_OPEN - Can't open file" constant.
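
Rather than looking the number up by hand, the status code can be mapped back to a constant name with reflection. This is only a sketch: zipErrorName() is a hypothetical helper, and it assumes the zip extension is loaded so that the ZipArchive class exists.

```php
<?php
// Hypothetical helper: maps a ZipArchive::open() status code to the
// name of the matching ZipArchive::ER_* constant.
function zipErrorName($code)
{
    $reflection = new ReflectionClass('ZipArchive');
    foreach ($reflection->getConstants() as $name => $value) {
        // Only the ER_* constants are error codes.
        if (strpos($name, 'ER_') === 0 && $value === $code) {
            return 'ZipArchive::' . $name;
        }
    }

    return 'unknown status ' . var_export($code, true);
}

echo zipErrorName(11), "\n";
```

Logging the name instead of the raw integer makes it easier to tell an open failure apart from, say, a file that is not recognised as a ZIP archive.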

I then thought I would try out the plain zip_open() function...
 PHP
$zipFile = zip_open($fileName);
var_dump($zipFile);


This gave me the same '11' error as before, so at least I knew the source of the issue for ZipArchive.



At this point I figured that the out-of-the-box PHP zip functionality was not usable inside App Engine, so I started to look for alternative libraries that could extract ZIP files without relying on the built-in zip functions. The first I tried was PhpZip.
 PHP
$zipFile = new \PhpZip\ZipFile();
$zipFile->openFile($fileName);
$listFiles = $zipFile->getListFiles();
var_dump($listFiles);


That gave me some very promising results! The ZIP central directory seemed to be readable and the file list inside the ZIP was being displayed!
 Output
array(1) { [0]=> string(10) "test.dat" }


Naturally I tried to extract the file...
 PHP
$zipFile->extractTo('gs://#default#/test');


...only to get this error:
 Output
'PhpZip\Exception\Crc32Exception' with message 'test.dat (expected CRC32 value 0x1ea42f40, but is actually 0x0)'


After a lot of searching around, I found a workaround: since the CRC32 check was failing, the hack was to bypass it. The change was in the readEntryContent() function in the Stream/ZipInputStream.php file. All I did was change $skipCheckCrc = false; to $skipCheckCrc = true;. Very dodgy, but it made the error go away, only to give me this warning...
 Output
Warning: gzinflate(): data error in /Volumes/.../vendor/nelexa/zip/src/PhpZip/Stream/ZipInputStream.php on line 443


I was also left with a zero-length 'test.dat' file in the directory I tried to extract to. So PhpZip didn't seem to work either.

Next I decided to try out the PclZip library.
 PHP
$zipFile = new PclZip($fileName);
$listFiles = $zipFile->listContent();
var_dump($listFiles);
error_log($zipFile->errorInfo(true));


This failed as well. The $listFiles array was empty and the error was...
 Output
PCLZIP_ERR_BAD_FORMAT (-10) : Unable to go to the end of the archive 'gs://#default#/test.zip'


This seemed to be an issue with GAE's fseek() behaviour. I decided to move on.
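
One way to test that suspicion is to ask the stream whether it claims to be seekable and then try to seek to its end. The sketch below uses a hypothetical helper, checkSeekToEnd(), and demonstrates it on an in-memory stream; pointing it at a handle opened on the gs:// path would be the real experiment, and what that returns on App Engine is not something this example can show.

```php
<?php
// Hypothetical helper: reports whether a stream claims to be seekable
// and whether seeking to its end actually succeeds.
function checkSeekToEnd($handle)
{
    $meta = stream_get_meta_data($handle);
    $seekResult = fseek($handle, 0, SEEK_END); // 0 on success, -1 on failure

    return array(
        'seekable' => $meta['seekable'],
        'seek_ok'  => $seekResult === 0,
        'position' => ftell($handle),
    );
}

// Local demonstration with an in-memory stream standing in for the real file.
$handle = fopen('php://memory', 'w+b');
fwrite($handle, 'PK-placeholder');
var_dump(checkSeekToEnd($handle));
fclose($handle);
```

A stream that reports seekable as false, or where seek_ok is false, would be consistent with the "Unable to go to the end of the archive" error above.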

The next library I tried was TbsZip...
 PHP
$zipFile = new clsTbsZip();
$zipFile->Open($fileName);
var_dump($zipFile->CdFileLst);


The output from that seemed promising!
 Output
array(1) { [0]=> array(20) { ["vers_used"]=> int(798) ["vers_necess"]=> int(20) ["purp"]=> string(18) "b:0000000000000000" ["meth"]=> int(8) ["time"]=> int(17377) ["date"]=> int(19588) ["crc32"]=> int(514076480) ["l_data_c"]=> int(32142) ["l_data_u"]=> int(54898) ["l_name"]=> int(10) ["l_fields"]=> int(24) ["l_comm"]=> int(0) ["disk_num"]=> int(0) ["int_file_att"]=> int(0) ["ext_file_att"]=> int(2176057344) ["p_loc"]=> int(0) ["v_name"]=> string(10) "test.dat" ["v_fields"]=> string(24) "UT..." ["v_comm"]=> string(0) "" ["bin"]=> string(80) "PK..." } }


It was telling me I had 1 file in the archive, so I tried to extract it (by index)...
 PHP
$zipFile->FileRead(0, true);


I was rewarded with the same warning I saw previously...
 Output
Warning: gzinflate(): data error in /Volumes/.../vendor/seblucas/tbszip/tbszip.php on line 296


Damn!

So it looked like some of the libraries could read the ZIP central directory and knew which files were inside the ZIP, but for some reason gzinflate() was failing. I tried to find a solution to that but unfortunately didn't get anywhere.
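
To narrow down whether gzinflate() itself or the data being fed to it is at fault, one option is to load the whole archive into a string and inflate a single entry by hand. The sketch below is a diagnostic only, not a fix. inflateLocalEntry() is a hypothetical helper; it assumes the archive fits in memory, that the entry has no data descriptor (so the sizes in the local header are valid), and that the offset passed in is the entry's local header offset (p_loc in the TbsZip output above).

```php
<?php
// Hypothetical helper: inflates the entry whose local file header starts
// at $offset inside $zipBytes (the whole archive held in memory).
// Returns the uncompressed data, or false if anything does not add up.
function inflateLocalEntry($zipBytes, $offset = 0)
{
    // The fixed part of a ZIP local file header is 30 bytes long.
    $fixed = (string) substr($zipBytes, $offset, 30);
    if (strlen($fixed) < 30) {
        return false;
    }
    $h = unpack(
        'Vsig/vversion/vflags/vmethod/vtime/vdate/Vcrc/Vcsize/Vusize/vnamelen/vextralen',
        $fixed
    );
    if ($h['sig'] !== 0x04034b50 || ($h['flags'] & 0x08)) {
        return false; // not a local header, or sizes live in a data descriptor
    }

    // Compressed data follows the file name and the extra field.
    $dataStart = $offset + 30 + $h['namelen'] + $h['extralen'];
    $raw = (string) substr($zipBytes, $dataStart, $h['csize']);
    if (strlen($raw) !== $h['csize']) {
        return false; // fewer bytes available than the header promises
    }

    if ($h['method'] === 0) {
        $data = $raw;               // stored, no compression
    } elseif ($h['method'] === 8) {
        $data = @gzinflate($raw);   // raw deflate stream
    } else {
        return false;               // other methods are out of scope here
    }
    if ($data === false || crc32($data) !== $h['crc']) {
        return false;
    }

    return $data;
}

// Demonstration on a synthetic single-entry header built in memory.
$payload  = str_repeat('hello zip ', 20);
$deflated = gzdeflate($payload);
$name     = 'test.dat';
$entry    = pack(
    'VvvvvvVVVvv',
    0x04034b50, 20, 0, 8, 0, 0,
    crc32($payload), strlen($deflated), strlen($payload), strlen($name), 0
) . $name . $deflated;

var_dump(inflateLocalEntry($entry) === $payload);
```

Running the same helper on file_get_contents($fileName) would show whether the bytes read from the gs:// stream inflate correctly once they are all in memory. If they do, that points at partial or misaligned reads from the stream rather than at zlib.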

So what did I end up doing instead? I changed my file upload functionality to include a drag-and-drop area that could accept multiple files and upload them sequentially. Not the exact solution I was after, but it did the trick!

If you happen to have better luck with getting ZIP extraction working in App Engine, do let me know!

-i

A quick disclaimer...

Although I put great effort into researching all the topics I cover, mistakes can happen. Use of any information from my blog posts should be at your own risk and I do not hold any liability towards any information misuse or damages caused by following any of my posts.

All content and opinions expressed on this Blog are my own and do not represent the opinions of my employer (Oracle). Use of any information contained in this blog post/article is subject to this disclaimer.