Describe the bug
During extraction of at least 29 NETGEAR firmware images, unblob may try to create the same output file twice, triggering an exception. As a result, some files that should be extracted are not.
To Reproduce
Steps to reproduce the behavior:
Download a sample firmware image that triggers the bug: wget https://www.downloads.netgear.com/files/GDC/M4100/M4100-V10.0.2.20.zip
Launch unblob with command unblob -v M4100-V10.0.2.20.zip
See error:
2024-02-10 23:39.13 [error ] Unknown error happened while extracting chunk pid=2295991
Traceback (most recent call last):
  File "/unblob/unblob/processing.py", line 607, in _extract_chunk
    if result := chunk.extract(inpath, extract_dir):
  File "/unblob/unblob/models.py", line 115, in extract
    return self.handler.extract(inpath, outdir)
  File "/unblob/unblob/models.py", line 452, in extract
    return self.EXTRACTOR.extract(inpath, outdir)
  File "/unblob/unblob/handlers/archive/cpio.py", line 384, in extract
    parser.dump_entries(fs)
  File "/unblob/unblob/handlers/archive/cpio.py", line 215, in dump_entries
    fs.carve(entry.path, self.file, entry.start_offset, entry.size, mode=entry.mode & 0o777)
  File "/unblob/unblob/file_utils.py", line 511, in carve
    carve(safe_path, file, start_offset, size, mode=mode)
  File "/unblob/unblob/file_utils.py", line 294, in carve
    with carve_path.open("xb") as f:
  File "/usr/lib/python3.10/pathlib.py", line 1119, in open
    return self._accessor.open(self, mode, buffering, encoding, errors,
FileExistsError: [Errno 17] File exists: '/tmp/tmp1151iav4/M4100_V10.0.2.20.zip_extract/m4100v10.0.2.20.stk_extract/1201148-2097967.lzma_extract/lzma.uncompressed_extract/lib/libthread_db-1.0.so'
Expected behavior
This error should not be raised; instead, the additional files should be extracted. I made a simple change in file_utils.py's carve method (see below) to return early if the target file already exists, and with this change an extra 75 files are created in [extract_dir]/m4100v10.0.2.20.stk_extract/1201148-2097967.lzma_extract/lzma.uncompressed_extract. I doubt this is the right fix, but it shows that this bug prevents some files from being extracted.
Environment information:
OS: Ubuntu 22.04
Docker
Linux b4935d734f27 6.2.2 #3 SMP PREEMPT_DYNAMIC Wed Mar 8 12:03:22 EST 2023 x86_64 x86_64 x86_64 GNU/Linux
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.3 LTS"
The following executables, needed by unblob, were found installed:
7z ✓
debugfs ✓
jefferson ✓
lz4 ✓
lziprecover ✓
lzop ✓
sasquatch ✓
sasquatch-v4be ✓
simg2img ✓
ubireader_extract_files ✓
ubireader_extract_images ✓
unar ✓
zstd ✓
Additional context
I found this bug while doing some large-scale evaluations of filesystems produced by binwalk and unblob using fw2tar.
My (likely-incorrect) patch that results in additional files being created:
diff --git a/unblob/file_utils.py b/unblob/file_utils.py
index 21e887b..3db4b98 100644
--- a/unblob/file_utils.py
+++ b/unblob/file_utils.py
@@ -291,6 +291,9 @@ def carve(carve_path: Path, file: File, start_offset: int, size: int):
     """Extract part of a file."""
     carve_path.parent.mkdir(parents=True, exist_ok=True)
+    if carve_path.exists():
+        print(f"Warning not replacing {carve_path}")
+        return
     with carve_path.open("xb") as f:
         for data in iterate_file(file, start_offset, size):
             f.write(data)
After fixing this, I got another error in the same vein in file_utils, which I patched with:
diff --git a/unblob/file_utils.py b/unblob/file_utils.py
index 21e887b..3db4b98 100644
--- a/unblob/file_utils.py
+++ b/unblob/file_utils.py
@@ -579,7 +582,8 @@ class FileSystem:
         if safe_link:
             dst = safe_link.dst.absolute_path
             self._ensure_parent_dir(dst)
-            dst.symlink_to(src)
+            if not dst.exists():
+                dst.symlink_to(src)

     def create_hardlink(self, src: Path, dst: Path):
         """Create a new hardlink dst to the existing file src."""
It looks like only a couple of binaries got patched during the build process (and maybe some previously non-existent ones were added), so I think the proper solution would be to overwrite duplicate entries when extracting cpio archives.
I would also limit the solution to the cpio extractor and would not make a general behavior change in FileSystem.