ChrisNorstrom 2 days ago [-]
I know you meant well but...
"It deletes empty folders" and "Let me know if this is a problem for you"
NEVER DO THAT. I know you meant well, but the first rule of any program is to NEVER automatically delete something without informing the user. NEVER. Users keep empty folders for structure, reminders, or placeholders because software will dump files into it later when it's run. If it was there when they zipped it up, it should be there when they unzip it. Otherwise they'll check the before and after and it will show some folders missing, create confusion, and the user will run off trying to find out if anything else is missing.
Example: A user zips up a program. Some programs are coded to look for a folder and dump files into it, if the folder is missing the program will fail. I've had that occasionally over the years. Not all programs will recreate a missing folder.
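To make the point concrete, here is a minimal sketch (not the tool's code) showing that an empty folder is an ordinary entry in a ZIP and survives a round trip unless something strips it. The archive layout and the names `app/output/` and `app/readme.txt` are made up for illustration.

```python
import io
import zipfile


def zip_with_empty_dir() -> bytes:
    """Build an in-memory ZIP holding one file and one empty directory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("app/readme.txt", "hello")
        # A directory entry is a zero-length member whose name ends in "/".
        zf.writestr("app/output/", "")
    return buf.getvalue()


def directory_entries(data: bytes) -> list[str]:
    """Return the names of all directory entries in the archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return [info.filename for info in zf.infolist() if info.is_dir()]


print(directory_entries(zip_with_empty_dir()))
```

A program that expects `app/output/` to exist after extraction only keeps working if that entry is left in place.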
Ekaros 2 days ago [-]
One thing I dislike about git is that it really does not support empty folders well, even though they often make sense, either now or for the future. There are decent reasons to have empty folders.
svth 2 days ago [-]
I just work around it with a .gitkeep file.
ebolyen 2 days ago [-]
Seems we need a .zipkeep file then.
Just kidding, I don't see how the overhead of the directory entry is even remotely enough to warrant removal. Most of the magic can be left to efficient DEFLATE compatible blocks and removing entries not in the central directory in the first place (ZIP files can support concatenation of new data so long as you re-write the central directory at the end of the file).
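The parenthetical about appending can be sketched with Python's standard `zipfile` module, whose append mode writes new members where the old central directory started and then writes a fresh central directory at the end. This is only an illustration of that property, with made-up member names, not how any particular shrinker works.

```python
import io
import zipfile


def member_offsets(buf: io.BytesIO) -> dict[str, int]:
    """Map each member name to the offset of its local header."""
    with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
        return {info.filename: info.header_offset for info in zf.infolist()}


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", "first")
    zf.writestr("b.txt", "second")
before = member_offsets(buf)

# Append mode leaves existing member data alone and rewrites only the
# central directory that sits at the end of the file.
with zipfile.ZipFile(buf, "a", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("c.txt", "appended later")
after = member_offsets(buf)

print(before)
print(after)
```

The offsets of `a.txt` and `b.txt` are the same before and after, which is why the existing data never has to be rewritten.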
sumtechguy 2 days ago [-]
Yeah, that probably should just be an option. Basically, the default should mangle the zip file as little as possible, with the most extreme changes turned on by flags. One of those could be 'remove empty folders'.
lifthrasiir 2 days ago [-]
While not very popular, ECT [1] is (still?) the best solution in this space and has been my go-to tool for this purpose.
I had not heard of ECT, but I'm not impressed.
I've just benchmarked it against two other PNG optimizers, and here are the file sizes for default and max levels:
BTW, I could not compile ECT on my Linux system, because its CMake config was too old. I used the Windows release through Wine, but it shouldn't change the results above.
I tried to apply ECT to a few .gz files, but it complained it was not compatible, and I did not dig further.
[edited for a typo s/I/it/]
futune 2 days ago [-]
I use ect on a monthly basis, at least. Especially for png files. It's pretty great!
zamadatix 2 days ago [-]
Yeah, for how well it does with PNGs it really doesn't get nearly as much attention as the other tools for the same do.
useyourloaf 2 days ago [-]
Thank you for the pointer!
Wowfunhappy 2 days ago [-]
Obviously, the purpose of this tool isn't to preserve 100% compatibility. Things like removing empty directories makes that clear.
But, why would you remove comments? Presumably, if those are there, they were added for a specific reason. And the author acknowledges the space savings are minimal.
gwbas1c 2 days ago [-]
> Things like removing empty directories makes that clear.
I hope that's disabled by default. Something like: "turning this option on may reduce file size by a small percent, but could impact compatibility."
I suspect the option will be much more useful with file formats that are zip under the hood, where it's easier to test the small subset of applications that read those files and/or update the file specification.
MrDOS 2 days ago [-]
Ken Silverman (of Build Engine fame) has written a few deflate-centric compression utilities[0]. The PNGOUT recompressor is the most famous of these (and is very good – it practically always beats OptiPNG), but the suite also includes a .zip archive recompressor called KZIP. I'd be curious to see how ZIP Shrinker compares to this tool.
> Typically, other archives like .tar.bz2 can be smaller. But those aren’t backwards-compatible!
Is there any point for (new) .bz2 archives in the era of Zstd?
j16sdiz 2 days ago [-]
Tooling ?
It took years for bzip2 to be in every Linux distro, and we're _still_ doing gzip.
LZMA / xz tools are starting to get more support, but they are nowhere near universal.
No idea how long zstd will need.
strenholme 2 days ago [-]
xz is pretty universal across POSIX and clones though. It comes with any modern Linux distro, Busybox even has an .xz decompressor, so `tar xvJf file.tar.xz` does the right thing in *NIX land, which I presume includes macOS with Brew.
For Windows systems, 7-zip (.7z, similar compression to .xz) is a free download for Windows 10, and Windows 11 can open up a .7z file with a simple double click.
.zip and .gz no longer need to be used here in 2026.
lstodd 2 days ago [-]
.zip is used as a seekable container with some compression. There is no replacement comparable in simplicity. 7z is overcomplicated, compressed tar is not seekable.
.gz/deflate is used when something very cheap and very fast is needed. xz/lzma is quite often too slow or requires too much memory even on decompression.
so no, .zip and .gz are very much needed in 2026.
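A minimal sketch of what "seekable container" means in practice, using Python's standard `zipfile` module: a single member is located through the central directory and decompressed on its own, without touching the other 999. The member names and payloads are invented for the example.

```python
import io
import zipfile


def build_archive() -> bytes:
    """Create an in-memory ZIP with many small deflated members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for i in range(1000):
            zf.writestr(f"part-{i:04d}.txt", f"payload {i}\n" * 100)
    return buf.getvalue()


def read_one(data: bytes, name: str) -> bytes:
    """Read a single member by name."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        info = zf.getinfo(name)  # looked up in the central directory
        # header_offset is where the reader seeks to; every member is
        # compressed independently, so nothing before it is decoded.
        return zf.read(info)


archive = build_archive()
print(len(read_one(archive, "part-0500.txt")))
```

With a compressed tar, reaching the same member would mean decompressing everything stored before it.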
adapiz 2 days ago [-]
Compared to xz and even parallel xz, gzip and parallel gzip are just better if speed is more important. The compression is not superior, but it is already good compared to the uncompressed data. For long-term storage it makes sense to invest the extra time for better compression, but if it's about transfer time, you might end up with an overall longer processing time instead of just a longer transfer time because of a worse compression ratio.
It's like with image formats: Pick the right one for your use case.
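If you want to check the trade-off on your own data, something like the following sketch works with only the Python standard library. The sample input is a made-up, highly repetitive log line, so the numbers it prints say nothing general. Treat it as a harness to run on real files, and the presets as arbitrary picks.

```python
import gzip
import lzma
import time


def compare(data: bytes) -> dict[str, tuple[int, float]]:
    """Return {label: (compressed size in bytes, seconds taken)}."""
    candidates = (
        ("gzip -6", lambda d: gzip.compress(d, compresslevel=6)),
        ("xz -6", lambda d: lzma.compress(d, preset=6)),
    )
    results = {}
    for label, compress in candidates:
        start = time.perf_counter()
        packed = compress(data)
        results[label] = (len(packed), time.perf_counter() - start)
    return results


# Hypothetical sample input; replace with the bytes of a real file.
sample = b"2026-01-01 12:00:00 INFO request served in 12 ms\n" * 20000
for label, (size, seconds) in compare(sample).items():
    print(f"{label}: {size} bytes in {seconds:.4f} s")
```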
MrDrMcCoy 2 days ago [-]
If you add zstd to the comparison matrix, it wins on both speed and compression ratio. Its adoption is quickly catching up to xz as a result, and I expect it to approach gzip in availability in a few years.
Dwedit 2 days ago [-]
GitHub won't let you upload a 7z file as an attachment in the issue tracker, thus forcing me to use an inferior and obsolete compression format.
jgalt212 2 days ago [-]
gzip is very fast, universally supported, and good enough. It will be around forever.
you need python 3.14 for zstd.
yjftsjthsd-h 2 days ago [-]
Zstd is implemented in C?
Am4TIfIsER0ppos 2 days ago [-]
Debian? Did they discover it yet?
sigio 2 days ago [-]
I think it's been in since debian 11... at least 12, it's been in my default ansible playbooks for a while.
jurgenkesker 2 days ago [-]
APKs need to be zipaligned, I don't see that mentioned.
Ooh, with btrfs you could reflink an uncompressed zip entry to its own file on disk.
teddyh 2 days ago [-]
…and you’ve rediscovered .a (ar) files.
KerrickStaley 2 days ago [-]
You can also make ZIP files smaller by switching the compression from Deflate to Zstandard. In the one case I tried this, this resulted in a 60% file size decrease [1]. Unfortunately Info-ZIP which provides the unzip command hasn't had a release in 18 years, so it doesn't support this newer compression/decompression method. You have to use 7-Zip instead.
As far as I know, the ISO standard for zip only specifies two compression methods: "store" (no compression) and "deflate". If I follow that, when I create a zip file, I know it's not performant, but at least it's almost universal (except for file ownership, permissions, character encoding and anything modern).
The corporate PKWARE has added other compressions to their original zip software, but those are not in the standard. They will not work for an EPUB, a LibreOffice file, etc. If I want a good compression, I reach for zstd (often through `tar`) or 7z if I want more portability.
Dwedit 2 days ago [-]
Then it's not a zip file anymore.
Just like if you modified PNG files to use zstandard instead of deflate, but otherwise be identical, it's still not a PNG file anymore.
tiagod 2 days ago [-]
That's not true. Zip files have supported other compression algorithms since the late 90s.
giancarlostoro 2 days ago [-]
I guess it's PNG v2 then? ;)
billpg 2 days ago [-]
Do any formats using ZIP as the underlying format use ZIP comments for metadata? Unless there's a lot of compressors leaving "Zip file generated by MySuperZipper™" then I imagine any comments left were probably done for a good reason.
ebolyen 2 days ago [-]
I'm not aware of any, but it wouldn't be insane to build a seekable deflate implementation by defining offsets in a zip comment. This would leave the zip file backwards compatible to usual decompression while allowing internal seeking within an individual file if the decompressor was aware of this index.
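A rough sketch of the mechanism, assuming a made-up index format: the archive comment is just a byte string (at most 65,535 bytes) that ordinary tools ignore, so an aware reader can parse it while everyone else extracts the files as usual. The JSON layout below is hypothetical and only records uncompressed sizes; a real seekable-deflate index would need offsets into each compressed stream, which this does not attempt.

```python
import io
import json
import zipfile


def write_with_index(members: dict[str, bytes]) -> bytes:
    """Write members and stash a small JSON index in the archive comment."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, payload in members.items():
            zf.writestr(name, payload)
        # Hypothetical index format: {member name: uncompressed size}.
        index = {info.filename: info.file_size for info in zf.infolist()}
        zf.comment = json.dumps(index).encode("utf-8")
    return buf.getvalue()


def read_index(data: bytes) -> dict[str, int]:
    """Parse the index back out of the archive comment."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return json.loads(zf.comment.decode("utf-8"))


archive = write_with_index({"a.txt": b"x" * 10, "b.txt": b"yy"})
print(read_index(archive))
```

A tool that strips comments would silently discard such an index, which is the compatibility risk being discussed.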
mxmlnkn 2 days ago [-]
For seekable gzip indexes in zip, there is SOZip: https://github.com/sozip/sozip-spec . However, it stores the indexes as files succeeding the actual file entry. To hide these index files and avoid extraction, they are not listed in the central directory, but a linear scan of the local headers, which some wrongly-behaved ZIP tools do, or which might be necessary for recovering broken ZIP files, would find those hidden indexes.
luzifer42 2 days ago [-]
I once made a Maven plugin that reprocesses JAR files.
It can remove extra content such as comments and directories.
In addition, it handles nested zip files to increase their compressibility.
And all the features can be toggled individually.
> This has the side effect of removing empty directories
yeah, this will inevitably break things. excluding those from the directory stripping shouldn't be too hard (TM)
Sweepi 2 days ago [-]
so this tool:
- Strips away comments, metadata and directories(!!)
- re-compresses the data with deflate (on presumably higher setting)
It makes me uneasy that something which does lossy compression (metadata is lost) is called "ZIP Shrinker". Hope nobody gets surprised by this.
The real solution is to use lzma(2).
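For contrast, a lossless variant of the second bullet could look like the sketch below: re-deflate every member at a higher level while keeping directory entries, per-entry comments and the archive comment. This is not the tool's implementation. It uses plain zlib level 9 through Python's `zipfile` instead of a stronger Deflate encoder, does not carry over extra fields, and the `recompress` helper and sample names are invented for illustration.

```python
import io
import zipfile


def recompress(data: bytes, level: int = 9) -> bytes:
    """Re-deflate all members, preserving names, directories and comments."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        dst.comment = src.comment  # keep the archive comment
        for info in src.infolist():
            entry = zipfile.ZipInfo(info.filename, info.date_time)
            entry.external_attr = info.external_attr  # permissions, dir flag
            entry.comment = info.comment
            # Directory entries carry no data, so store them as-is.
            entry.compress_type = (
                zipfile.ZIP_STORED if info.is_dir() else zipfile.ZIP_DEFLATED
            )
            dst.writestr(entry, src.read(info), compresslevel=level)
    return out.getvalue()


# Build a sample archive at the fastest level, then recompress it.
source = io.BytesIO()
with zipfile.ZipFile(source, "w", zipfile.ZIP_DEFLATED, compresslevel=1) as zf:
    zf.comment = b"built by hand"
    zf.writestr("docs/notes.txt", "some text that repeats. " * 500)
    zf.writestr("docs/empty/", "")
original = source.getvalue()
smaller = recompress(original)
print(len(original), "->", len(smaller))
```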
etrez 2 days ago [-]
Cool project!
Now, Zip-Ada's ReZip does much better, even if you stick with the Deflate compression scheme.
For Zip archives, you have more compression schemes available (BZip2, LZMA, ...) and even much better results.
[1] https://github.com/fhanau/Efficient-Compression-Tool
[0]: https://advsys.net/ken/utils.htm
The APK number looks impressive if you don't know some files are uncompressed on purpose: https://developer.android.com/tools/zipalign
There's also no mention that this likely breaks signatures? https://developer.android.com/tools/apksigner
The weird thing is that they say they worked on Signal ... so, what's going on?
[1] http://web.archive.org/web/20031018072659/http://msdn.micros...
[1] https://github.com/UKGovernmentBEIS/inspect_ai/pull/3145
https://luccappellaro.github.io/2015/03/01/ZopfliMaven.html