Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Working on new TransportTuple.hash implementation #839

Merged
merged 14 commits into from
Jun 21, 2022

Conversation

ibc
Copy link
Member

@ibc ibc commented Jun 20, 2022

This branch should be applied on webrtcserver branch or in v3 once the former is merged.

Not happy at all yet because:

  • SetHash() implementation in this branch is too simple. We should consider local IP:port and protocol as well, but we cannot just "sum" those numbers since there is a chance to collide.
  • And if we collide (if we have 2 tuples with same hash in the same WebRtcServer) then we'll have an assertion problem:
    RTC::WebRtcServer::OnWebRtcTransportTransportTupleRemoved() | failed assertion `this->mapTupleWebRtcTransport.find(tuple->hash) != this->mapTupleWebRtcTransport.end()': tuple not handled
    
  • So IMHO we need hash to be int64_t and we should concatenate remote IP:port first then local IP:port plus protocol somewhere. But summing protocol cannot be just about doing + 1 or +2. imagine this:
    - Client A connects from udp:1.2.3.4:5001 to udp:9.9.9.9:9000.
    - Client B connects from tcp:1.2.3.4:5000 to tcp:9.9.9.9:9000.
    - If tcp protocol just means "adding 1 to the hash" then both hashes will be the same (BUMPP!).
    

This branch should be applied on `webrtcserver` branch or in `v3` once the former is merged.

Not happy at all yet because:

- `SetHash()` implementation in this branch is too simple. We should consider local IP:port and protocol as well, but we cannot just "sum" those numbers since there is a chance to collide.
- And if we collide (if we have 2 tuples with same `hash` in the same `WebRtcServer`) then we'll have an assertion problem:
  ```
  RTC::WebRtcServer::OnWebRtcTransportTransportTupleRemoved() | failed assertion `this->mapTupleWebRtcTransport.find(tuple->hash) != this->mapTupleWebRtcTransport.end()': tuple not handled
  ```
- So IMHO we need `hash` to be  `int64_t` and we should concatenate remote IP:port first then local IP:port plus protocol somewhere. But summing protocol cannot be just about doing + 1 or +2. imagine this:
  ```
  - Client A connects from udp:1.2.3.4:5001 to udp:9.9.9.9:9000.
  - Client B connects from tcp:1.2.3.4:5000 to tcp:9.9.9.9:9000.
  - If tcp protocol just means "adding 1 to the hash" then both hashes will be the same (BUMPP!).
@ibc ibc requested a review from jmillan June 20, 2022 13:26
@ibc ibc mentioned this pull request Jun 20, 2022
@ibc
Copy link
Member Author

ibc commented Jun 20, 2022

For example, assuming that the client IP is IPv6 1:2:3:4:A:B:C:D, hash value is 5360, 6006 and numbers like that :(

This is the current (not completed of course) code:

auto* remoteSockAddrIn6 = reinterpret_cast<const struct sockaddr_in6*>(remoteSockAddr);
auto* a = reinterpret_cast<const uint32_t*>(std::addressof(remoteSockAddrIn6->sin6_addr));

this->hash += a[0] ^ a[1] ^ a[2] ^ a[3];
this->hash += ntohs(remoteSockAddrIn6->sin6_port);

However when client IP is IPv4 192.168.1.100, hash value is 3232300617, 3232297556, etc, which are LONGER :(

auto* remoteSockAddrIn = reinterpret_cast<const struct sockaddr_in*>(remoteSockAddr);

this->hash += ntohl(remoteSockAddrIn->sin_addr.s_addr);
this->hash += ntohs(remoteSockAddrIn->sin_port);

ok, with client's IPv6 9999:8888:777:6:AAAA:BBB:CC:D it produces hashes like 948198608, 948199393, so it's ok.

@ibc
Copy link
Member Author

ibc commented Jun 20, 2022

I've changed hash to be uint64_t so there is lot of room to put "concatened" values.

const auto address = ntohl(remoteSockAddrIn->sin_addr.s_addr);
const uint64_t port = (ntohs(remoteSockAddrIn->sin_port));

std::cout << "address " << std::bitset<64>{address} << std::endl;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add include bitset at the top of the file if we go this way

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, though the log is temporal

worker/src/RTC/TransportTuple.cpp Outdated Show resolved Hide resolved
@jmillan
Copy link
Member

jmillan commented Jun 21, 2022

TCP example:

 mediasoup:Worker (stdout) address  0000000000000000000000000000000000001010000000000000000000000001 +1ms
 mediasoup:Worker (stdout) port     0000000000000000000000000000000000000000000000001110001011011101 +0ms
 mediasoup:Worker (stdout) tmp hash 0000000000000000111000101101110100001010000000000000000000000001 +0ms
 mediasoup:Worker (stdout) fin hash 0000000000000000111000101101110100001010000000000000000000000001 +0ms

UDP example:

 mediasoup:Worker (stdout) address  0000000000000000000000000000000000001010000000000000000000000001 +20ms
 mediasoup:Worker (stdout) port     0000000000000000000000000000000000000000000000001100011011111100 +0ms
 mediasoup:Worker (stdout) tmp hash 0000000000000000110001101111110000001010000000000000000000000001 +0ms
 mediasoup:Worker (stdout) fin hash 1000000000000000110001101111110000001010000000000000000000000001 +0ms

@nazar-pc
Copy link
Collaborator

New implementation is much more complex, what are we getting by going this route?

@jmillan
Copy link
Member

jmillan commented Jun 21, 2022

New implementation is much more complex, what are we getting by going this route?

Why do you say it's more complex? it's just shifting numbers into a uint64_t type?

@jmillan
Copy link
Member

jmillan commented Jun 21, 2022

what are we getting by going this route

Which route, getting the hash out of the remote socket? Sorry @nazar-pc, I don't follow.

@nazar-pc
Copy link
Collaborator

So before we just concatenated a string and hashed it. Simple enough, we can compare the strings later to know if things are identical.

Now we're doing a bunch of bit manipulations, which for me is more complex. But if there any significant performance or memory usage difference of doing any of this comparing to just having a string?

@jmillan
Copy link
Member

jmillan commented Jun 21, 2022

One of the concerns of using strings is the fact that they would be used as key in a map in the new WebRtcServer class. And the costs of the lookup on such map would cost more than using numbers.

@nazar-pc
Copy link
Collaborator

I get that, it was my original concern as well, just not clear to me how expensive that actually is.

@ibc
Copy link
Member Author

ibc commented Jun 21, 2022

I get that, it was my original concern as well, just not clear to me how expensive that actually is.

Generating a string which is a concatenation of numbers converted into strings plus using std:hash() is way more expensive than having a uint64_t number computed by reading other numbers. Take into account that TransportTuple::SetHash() is executed for every received packet. So this is important.

@ibc ibc marked this pull request as ready for review June 21, 2022 17:34
@ibc
Copy link
Member Author

ibc commented Jun 21, 2022

I think we can merge this and keep working in the other branch.

@ibc
Copy link
Member Author

ibc commented Jun 21, 2022

Let's merge this.

@ibc ibc merged commit 4671cac into webrtcserver Jun 21, 2022
@ibc ibc deleted the transporttuple-new-hash branch June 21, 2022 21:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants