@LucaPalumbo

Fix HTML byte display issue and improve NBI handling (Closes #181)

This PR fixes issue #181, where byte data was displayed in HTML reports as Python byte literals (b'...').
It also includes improvements to Network-Based Indicators (NBI) processing, display formatting, and shared listener utilities.

Summary of Changes

1. Improved HTTP POST body handling

  • Detects whether POST data is printable text or binary.

  • Printable data is now decoded and displayed as text.

  • Binary data is shown as a truncated hexdump (first 16 lines, consistent with RawListener).
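For reviewers, the printable-versus-binary decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names `is_printable`, `hexdump_lines`, and `render_post_body` are hypothetical, the definition of "printable" (ASCII printable characters plus whitespace) is an assumption, and the dump layout is only an approximation of what `hexdump_table()` produces.

```python
import string

# Assumption: "printable" means every byte is an ASCII printable character,
# including whitespace such as \r, \n and \t. The real check may differ.
_PRINTABLE = frozenset(string.printable.encode("ascii"))


def is_printable(data: bytes) -> bool:
    """Return True if every byte in data is ASCII printable."""
    return all(b in _PRINTABLE for b in data)


def hexdump_lines(data: bytes, width: int = 16, max_lines: int = 16) -> list:
    """Format at most max_lines rows of width bytes each as a hexdump."""
    lines = []
    limit = min(len(data), width * max_lines)
    for offset in range(0, limit, width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Show non-printable bytes as dots in the ASCII column.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:04X}: {hex_part.ljust(width * 3 - 1)}  {ascii_part}")
    return lines


def render_post_body(data: bytes) -> str:
    """Decode printable bodies as text; fall back to a truncated hexdump."""
    if is_printable(data):
        return data.decode("ascii")
    return "\n".join(hexdump_lines(data))
```

With this sketch, `render_post_body(b"user=alice&id=1")` returns the text unchanged, while a body containing bytes such as `\xff` is rendered as hexdump rows capped at 16 lines.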

2. Consolidated shared utilities

  • ListenerBase.py was not a true base listener but contained only utility functions.

  • Renamed listeners/ListenerBase.py → listeners/utils.py to better reflect its purpose.

  • Moved hexdump_table() from RawListener.py to utils.py so it can be shared across listeners.

  • Updated all listener modules (__init__.py, HTTP, Raw, FTP, SMTP, POP, TFTP) to import from utils.py.

  • Removed an unused ListenerBase import from ssl_utils/__init__.py.

3. HTML report fixes and improvements

  • Enabled hexdump display for both Data Hexdump (from RawListener) and Request Body (Hexdump) (from HTTPListener).

  • Rewrote the JavaScript function copyNbiData(), which previously failed when the data contained special characters such as \n.

    • Before, only the content up to the first newline was copied; now the full content is copied correctly.
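The template code is not reproduced in this description, so the following is only an illustration of the general class of problem, written in Python for the report-generation side. It assumes that multi-line values end up inside a quoted JavaScript string; `js_string_literal` and the `copy(...)` call are hypothetical names, and the actual fix in copyNbiData() may work differently.

```python
import json


def js_string_literal(text: str) -> str:
    """Return text as a quoted, escaped string literal.

    json.dumps escapes newlines, quotes and backslashes, and its output
    is also a valid JavaScript string literal.
    """
    return json.dumps(text)


nbi_value = "POST /upload HTTP/1.1\nHost: example.test"

# Naive embedding: the raw newline lands inside the quoted literal,
# which a JavaScript parser does not accept in a plain quoted string.
naive = "copy('" + nbi_value + "')"

# Escaped embedding: the newline becomes the two characters \ and n.
safe = "copy(" + js_string_literal(nbi_value) + ")"
```

If the resulting string is placed inside an HTML attribute, it would still need HTML escaping on top of this.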

Screenshots

New behavior:

If raw bytes are present, a hexdump is shown.

Otherwise, the content is displayed as a decoded string.

[Screenshot: after]

Previous behavior:

Raw bytes caused a Python traceback because they were decoded as UTF-8 even when they were not valid UTF-8.

[Screenshot: before]

Feedback

I am open to comments or suggestions!
Let me know if you would prefer these changes split into separate PRs or adjusted in any way.

Resolves

Closes #181

This commit addresses issue mandiant#181 where bytes data was being displayed as
`b'...'` literals in HTML reports, and implements several related
improvements to Network-Based Indicators (NBI) collection and display.

Changes:

1. Improve HTTP POST body handling:
   - Detect printable vs. binary POST data
   - Display printable data as decoded text
   - Generate hexdump for binary content (showing first 16 lines)

2. Consolidate shared utilities:
   - Renamed `listeners/ListenerBase.py` to `listeners/utils.py` since it contains only utility functions
   - Moved `hexdump_table()` from `listeners/RawListener.py` to `listeners/utils.py`
   - Updated all listener files (__init__, HTTP, Raw, FTP, SMTP, POP, TFTP) to import from utils

3. Fix HTML report template:
   - Enabled hexdump display for both "Data Hexdump" (from the TCP listener)
     and "Request Body (Hexdump)" (from the HTTP listener)
   - Rewrote the JavaScript `copyNbiData()` function, which did not parse its
     input correctly when special characters (like \n) were present

4. Clean up SSL utilities:
   - Removed unused `ListenerBase` import from `ssl_utils/__init__.py`

Resolves: mandiant#181
@google-cla

google-cla bot commented Nov 16, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@tinajohnson
Contributor

Hi @LucaPalumbo

Thank you for the PR! I will review and merge this when I get a chance which is probably going to be early January.

@tinajohnson added this to the v3.6 milestone on Dec 18, 2025
@LucaPalumbo
Author

Thanks for the update! Sounds good.

Development

Successfully merging this pull request may close these issues.

Use chardet to detect encoding
