Overhaul stats: Add the PeerClient and version as labels to metrics #1456


Closed
josecelano opened this issue Apr 14, 2025 · 4 comments


josecelano commented Apr 14, 2025

It depends on:

We are receiving many bad requests from BitTorrent clients in the tracker demo. For example:

It would be invaluable to include the clients' software version as a label in the metrics so we can check what BitTorrent clients are producing those errors. If they are known clients with public repositories, we could open issues.

We should include the new labels in all metrics where the information is available.

We are using the aquatic PeerId, which includes the PeerClient info.

https://github.com/greatest-ape/aquatic/blob/master/crates/peer_id/src/lib.rs

This is the current metric where we collect UDP errors:

udp_tracker_server_errors_total{server_binding_ip="0.0.0.0",server_binding_port="6969",server_binding_protocol="udp"} 1066214

The new one could include these new labels:

  • peer_client_software_name
  • peer_client_software_version

NOTICE: We have to generate a key value for the name.
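
As a rough illustration of what the extended metric line could look like, here is a minimal sketch that renders one sample in the Prometheus text exposition format with the two proposed labels. The function names `render_sample` and `escape_label_value` are hypothetical, and the label values (`qB`, `4.3.9`) are only placeholders; this is not the tracker's actual metrics code.

```rust
/// Escape a label value for the Prometheus text exposition format
/// (backslash, double quote and newline must be escaped).
fn escape_label_value(value: &str) -> String {
    value
        .replace('\\', "\\\\")
        .replace('"', "\\\"")
        .replace('\n', "\\n")
}

/// Render a single sample line: `name{label="value",...} value`.
/// Hypothetical helper, for illustration only.
fn render_sample(metric: &str, labels: &[(&str, &str)], value: u64) -> String {
    let rendered: Vec<String> = labels
        .iter()
        .map(|(key, val)| format!("{key}=\"{}\"", escape_label_value(val)))
        .collect();
    format!("{metric}{{{}}} {value}", rendered.join(","))
}
```

Calling it with the existing labels plus `peer_client_software_name` and `peer_client_software_version` yields one line per distinct label combination, which is why the number of label values matters for the number of time series.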

@josecelano josecelano added - Admin - Enjoyable to Install and Setup our Software Enhancement / Feature Request Something New labels Apr 14, 2025
@josecelano josecelano self-assigned this Apr 14, 2025

josecelano commented Apr 23, 2025

The list of peer IDs included in BEP 20:

https://www.bittorrent.org/beps/bep_0020.html

The list of peer IDs from:

AI model: GPT-4o


📋 BitTorrent Client Peer ID Prefixes

| Client Name | Peer ID Format | Software Identifier | Example Peer ID |
| --- | --- | --- | --- |
| ABC | Axxxxx--- | A | A2A3B--- |
| Ares | -AGYYYY- or -A~YYYY- | AG or A~ | -AG1234- |
| Arctic | -ARYYYY- | AR | -AR1234- |
| Azureus/Vuze | -AZYYYY- | AZ | -AZ2060- |
| BitBuddy | -BBYYYY- | BB | -BB1234- |
| BitComet | exbc\xYY\xZZ | exbc | exbc\x01\x23... |
| BitFlu | -BFYYYY- | BF | -BF1100- |
| BitLet | -WTYYYY- | WT | -WT1234- |
| BitLord | exbc\xYY\xZZLORD | exbc | exbc\x01\x23LORD... |
| BitPump | -AXYYYY- | AX | -AX1234- |
| BitRocket | -BRYYYY- | BR | -BR1000- |
| BitSpirit | \0\3BS or \0\2BS | BS | \0\3BS... |
| BitTornado | Txxxxx--- | T | T03A3--- |
| BitTorrent (Mainline) | Mx-y-z-- | M | M4-3-6-- |
| BitTorrent X | -BXYYYY- | BX | -BX1234- |
| BitTyrant | AZYYYYBT | AZ | AZ2500BT |
| Bits on Wheels | -BOWxxx-yyyyyyyyyyyy | BOW | -BOWA0C-ABCDEFGHIJKL |
| BTG | -BGYYYY- | BG | -BG1234- |
| BTQueue | Qxxxxx--- | Q | Q1A2B--- |
| BTSlave | -BSYYYY- | BS | -BS1234- |
| Deluge | -DEYYYY- | DE | -DE1230- |
| Electric Sheep | -ESYYYY- | ES | -ES1234- |
| Enhanced CTorrent | -CDYYYY- | CD | -CD1234- |
| EBit | -EBYYYY- | EB | -EB1234- |
| FireTorrent | -WYYYYY- | WY | -WY1234- |
| FlashGet | -FGYYYY | FG | -FG0180 |
| FoxTorrent | -FTYYYY- | FT | -FT1234- |
| Freebox BitTorrent | -FXYYYY- | FX | -FX1234- |
| FrostWire | -FWYYYY- | FW | -FW1234- |
| G3 Torrent | -G3xxxxxxxxx | G3 | -G3nickname |
| GSTorrent | -GSYYYY- | GS | -GS1234- |
| Halite | -HLYYYY- | HL | -HL1000- |
| Hydranode | -HNYYYY- | HN | -HN1234- |
| KGet | -KGYYYY- | KG | -KG1234- |
| KTorrent | -KTYYYY- | KT | -KT2200- |
| LH-ABC | -LHYYYY- | LH | -LH1234- |
| libTorrent | -ltYYYY- | lt | -lt1234- |
| libtorrent | -LTYYYY- | LT | -LT1234- |
| LimeWire | -LWYYYY- | LW | -LW1234- |
| Lphant | -LPYYYY- | LP | -LP1234- |
| MLdonkey | -MLx.y.z- | ML | -ML2.7.2-kgjjfkd |
| MonoTorrent | -MOYYYY- | MO | -MO1234- |
| MoonlightTorrent | -MTYYYY- | MT | -MT1234- |
| MooPolice | -MPYYYY- | MP | -MP1234- |
| Miro | -MRYYYY- | MR | -MR1234- |
| Net Transport | -NXYYYY- | NX | -NX1234- |
| Opera | OPxxxx | OP | OP1234 |
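
Most entries in the list follow the `-XXYYYY-` layout (a dash, a two-character client code, four version characters, a dash). A minimal sketch of extracting the client code and version from that layout could look like the following. The function name `parse_azureus_style` is hypothetical, and this is not how the aquatic `PeerId` type does it; other layouts from the list (for example `M4-3-6--` or `exbc...`) are simply rejected here.

```rust
/// Split a peer ID of the form `-XXYYYY-...` into (client code, version).
/// Returns `None` for any other layout. Hypothetical helper for illustration.
fn parse_azureus_style(peer_id: &[u8]) -> Option<(String, String)> {
    // Need at least the 8-byte prefix, delimited by dashes at positions 0 and 7.
    if peer_id.len() < 8 || peer_id[0] != b'-' || peer_id[7] != b'-' {
        return None;
    }

    let code = &peer_id[1..3];
    let version = &peer_id[3..7];

    // Assumption: the code is printable ASCII (the list contains `A~`)
    // and the version characters are ASCII alphanumeric.
    if !code.iter().all(u8::is_ascii_graphic) || !version.iter().all(u8::is_ascii_alphanumeric) {
        return None;
    }

    Some((
        String::from_utf8_lossy(code).into_owned(),
        String::from_utf8_lossy(version).into_owned(),
    ))
}
```

The two returned strings are the kind of values that could feed a client name and a client version counter.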



josecelano commented Apr 23, 2025

Since there are many clients, I won't use labeled metrics to count requests from these clients. That would generate a lot of time series in Prometheus.

The reason for adding this label was to identify peers that are sending the wrong connection ID. I'm assuming there could be some bad implementations. I only want to confirm if that happens.

Alternative Solution

I think we could change the BanService to include two new HashMaps to count:

  • Number of bans per client software (only the software name)
  • Number of bans per client software + version

That would help to identify whether a specific client or version is sending the wrong connection ID.
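
A minimal sketch of the two counters, assuming a standalone struct rather than the real `BanService` (the name `BanCounters` and its methods are hypothetical):

```rust
use std::collections::HashMap;

/// Two independent counters: bans per client, and bans per (client, version).
/// Hypothetical type; the real BanService is not shown in this issue.
#[derive(Default)]
struct BanCounters {
    per_client: HashMap<String, u64>,
    per_client_version: HashMap<(String, String), u64>,
}

impl BanCounters {
    /// Record one ban for the given client name and version.
    fn record_ban(&mut self, client: &str, version: &str) {
        *self.per_client.entry(client.to_owned()).or_insert(0) += 1;
        *self
            .per_client_version
            .entry((client.to_owned(), version.to_owned()))
            .or_insert(0) += 1;
    }

    /// Total bans for a client, across all versions.
    fn bans_for_client(&self, client: &str) -> u64 {
        self.per_client.get(client).copied().unwrap_or(0)
    }

    /// Bans for one specific client version.
    fn bans_for_version(&self, client: &str, version: &str) -> u64 {
        self.per_client_version
            .get(&(client.to_owned(), version.to_owned()))
            .copied()
            .unwrap_or(0)
    }
}
```

Keeping the per-client map separate avoids summing over versions on every read, at the cost of updating two maps per ban.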

We could expose these counters via the stats tracker API in a new route /stats/banning/info:

/stats
/stats/banning/banned/ips  <- returns the list of banned IPs.
/stats/banning/info        <- returns the counters for clients and client-versions 

/metrics

Separately, I increased the ban period from 1 hour to 24 hours. It seems the number of banned IPs does not have an upper limit.

cc @da2ce7


josecelano commented May 5, 2025

After discussing this issue again today with @da2ce7, I've realised that it does not matter whether I use labelled metrics or an independent data structure. The real problem arises if I add a new label to the udp_tracker_server_errors_total metric: that multiplies the number of time series of that metric by N, where N is the number of combinations of client software + version. However, if I use a new metric for this, the number of time series would be the same as if I stored the same information in an independent HashMap. For example, having NC clients and NV versions per client:

Number of metric values = NC * NV

The same count applies to a HashMap where the key is the client name and the value maps each version of that client to its number of bans. Something like:

use std::collections::HashMap;
use std::sync::{atomic::AtomicUsize, Arc};

type ClientMap = HashMap<String, VersionMap>;
type VersionMap = HashMap<String, Arc<AtomicUsize>>;
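
A short sketch of how such a nested map could be updated and how its entry count relates to NC * NV. The type aliases are repeated so the snippet is self-contained; the helper names `increase_bans` and `total_series` are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

type VersionMap = HashMap<String, Arc<AtomicUsize>>;
type ClientMap = HashMap<String, VersionMap>;

/// Increase the ban counter for a client version, creating the entries on
/// first use. Returns the counter value after the increment.
fn increase_bans(map: &mut ClientMap, client: &str, version: &str) -> usize {
    let counter = map
        .entry(client.to_owned())
        .or_default()
        .entry(version.to_owned())
        .or_default();
    // fetch_add returns the previous value, so add 1 to get the new one.
    counter.fetch_add(1, Ordering::Relaxed) + 1
}

/// Number of (client, version) entries, i.e. the number of time series an
/// equivalent labelled metric would produce.
fn total_series(map: &ClientMap) -> usize {
    map.values().map(|versions| versions.len()).sum()
}
```

Because each counter sits behind an `Arc<AtomicUsize>`, a handle to it can be cloned and incremented elsewhere without holding a mutable borrow of the whole map.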

If I create an independent metric like this:

udp_tracker_server_client_bans_total{client_software_name="qB",client_software_version="4.3.9"}

The number of time series generated in Prometheus would be the same as if we pull the JSON. Assuming, on average, 15 different clients and 5 versions per client, we would have only 75 new entries in the metrics collection. With my original idea of adding a new label to udp_tracker_server_errors_total, we would have to multiply the current number of time series by 75.
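
To make the difference concrete, here is the arithmetic as a small sketch. The function names are hypothetical, and the current series count used in the example below is an assumed figure, not one taken from the demo.

```rust
/// Series count if the client labels are added to the existing metric:
/// every existing series is multiplied by the number of client/version pairs.
fn series_with_extra_labels(current: u64, clients: u64, versions_per_client: u64) -> u64 {
    current * clients * versions_per_client
}

/// Series count if a separate metric is used: the existing series are kept
/// and one new series is added per client/version pair.
fn series_with_separate_metric(current: u64, clients: u64, versions_per_client: u64) -> u64 {
    current + clients * versions_per_client
}
```

With 15 clients and 5 versions each, and assuming the existing metric currently has 4 series, the first approach gives 4 * 75 = 300 series while the second gives 4 + 75 = 79.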

The key point is that even if it is related to requests, we disregard the rest of the request and focus solely on the relevant information for IP bans, specifically the client software.

josecelano added a commit to josecelano/torrust-tracker that referenced this issue Jun 2, 2025
…t sendable

The error will be included in the UdpError event and sent via tokio
channel.
josecelano added a commit to josecelano/torrust-tracker that referenced this issue Jun 2, 2025
Not exposing the original complex error type because:

- It's too complex.
- It forces all errors to be "Send", "PartialEq".
- It would expose a lot of internals.
josecelano added a commit that referenced this issue Jun 2, 2025
a8f3a97 refactor: [#1551] extract event handler for each udp event (Jose Celano)
89ac87c refactor: [#1551] extract methods in udp event handler" (Jose Celano)
07c6e89 refactor: rename UDP tracker server error variants (Jose Celano)
21bea5b refactor: [#1456] increase ban counters asyncronously (Jose Celano)
ad1b19a feat: trigger UDP error event when there is no transaction ID too (Jose Celano)
525ab73 refactor: [#1456] extract methods (Jose Celano)
f485501 refactor: [#1456 clean code (Jose Celano)
0108c26 fix: test. Error message changed (Jose Celano)
d7902f1 refactor: [#1456] remove unused enum variant in udp server error (Jose Celano)
8f3c22a feat: [#1456] expose error kind in the UdpError event (Jose Celano)
52b9660 feat: [#1456] wrapper over aquatic RequestParseError to make it sendable (Jose Celano)

Pull request description:

  I will add a new metric to count which client software is in use when a UDP error is produced (only for connection ID errors, for now).

  This PR makes some changes before adding the new metric.

  ### Subtasks

  - [x] Wrapper for aquatic parse error. It's not sendable.
  - [x] Add the error to the `Event::UdpError` event.
  - [x] Move logic to increase the number of wrong connection IDs per IP to the event listener.
  - [x] Rename errors.
  - [x] Refactor event handler.

ACKs for top commit:
  josecelano:
    ACK a8f3a97

Tree-SHA512: faa185a8050ef9e45f317ef06aa74e52bb385c31167fa0f199411bb1a47a573429daa31b2cccdd024f8fd75c91227618350d10e41d12e8b4062fb1e8d7f7bfdc

Implemented. I'm deploying it to the tracker demo.
