Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DNS response time in delayshell not deterministic #105

Open
worenga opened this issue Nov 22, 2016 · 12 comments · May be fixed by #106
Open

DNS response time in delayshell not deterministic #105

worenga opened this issue Nov 22, 2016 · 12 comments · May be fixed by #106

Comments

@worenga
Copy link

worenga commented Nov 22, 2016

When replaying pages i noticed that dns timings are sometimes indeterministically off by 1 RTT. I assume that is some kind of pipelinine stall somewhere.

Example:


#replay a recording that has dns recordings for foo.com and google.com
$ mm-webreplay /tmp/recording-of-foo-com mm-delay 50 
[replay] [delay 50 ms] $ dig google.com foo.com google.com

; <<>> DiG 9.10.3-P4-Ubuntu <<>> google.com foo.com google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 33771
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;google.com.			IN	A

;; Query time: 102 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Tue Nov 22 18:36:42 CET 2016
;; MSG SIZE  rcvd: 28

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54282
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1280
;; QUESTION SECTION:
;foo.com.		IN	A

;; ANSWER SECTION:
foo.com.	0	IN	A	<ip-address-omitted-for-this-example>

;; Query time: 102 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Tue Nov 22 18:36:42 CET 2016
;; MSG SIZE  rcvd: 64

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 22859
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;google.com.			IN	A

;; Query time: 204 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Tue Nov 22 18:36:42 CET 2016
;; MSG SIZE  rcvd: 28

Notice that the 3rd query time is twice the rtt.

I found that passing "--all-servers" to dnsmasq in dns_server.cc will avoid the issue (at least it has not happened in a testrun with 90 tries). However, i do not really understand why.

@keithw
Copy link
Collaborator

keithw commented Nov 22, 2016

Hmm, I'm having trouble replicating this. I'm also more troubled by the "status: REFUSED" on your third query than the fact that it took 200 milliseconds. I wonder why dnsmasq is refusing the query. (We are running dnsmasq with --no-resolv, so I'd be surprised if --all-servers made a difference! We only want it to be answering queries locally, from its hosts file.) I couldn't get the problem to happen with just a very simply replay directory after recording "wget foo.com" and "wget google.com".

Can you supply a tarball of the replay directory that produces this problem?

@worenga
Copy link
Author

worenga commented Nov 22, 2016

@keithw , ok sorry, more background, and new information, as i do understand the problem better now.

I have created a mahimahi HTTP/1 capture from a HTTP/1 webserver running in a VM on my host.
The hostname of the vm is h2pushsrv.
The website i recorded can be found under https://h2pushsrv/editorial/.
I am using chrome (upstream version build from the master branch in release mode) with a fresh profile and incognito-mode, so completely cold-cache.
I am recording from the VM-host and in the hosts /etc/hosts i have h2pushsrv set to the VM's ip address.

Before replaying i remove h2pushsrv from /etc/hosts and restart networking so that mahimahi's dnsmasq serves it.

Now in replay mode with 50ms delay using chrome, when I enable query logging in dnsmask (-q --log-facility=-) I can see the following queries during replay.

dnsmasq[4999]: query[A] h2pushsrv.eduroam.rwth-aachen.de from 127.0.0.1
dnsmasq[4999]: forwarded h2pushsrv.eduroam.rwth-aachen.de to 100.64.0.2
dnsmasq[4992]: query[A] h2pushsrv.eduroam.rwth-aachen.de from 127.0.1.1
dnsmasq[4999]: forwarded h2pushsrv.eduroam.rwth-aachen.de to 100.64.0.2
dnsmasq[4992]: query[A] h2pushsrv.eduroam.rwth-aachen.de from 127.0.1.1
dnsmasq[4999]: query[A] h2pushsrv.eduroam.rwth-aachen.de from 127.0.0.1
dnsmasq[4999]: forwarded h2pushsrv.eduroam.rwth-aachen.de to 100.64.0.2
dnsmasq[4992]: query[A] h2pushsrv.eduroam.rwth-aachen.de from 127.0.1.1
dnsmasq[4999]: query[A] h2pushsrv from 127.0.0.1
dnsmasq[4999]: forwarded h2pushsrv to 100.64.0.2
dnsmasq[4992]: query[A] h2pushsrv from 127.0.1.1
dnsmasq[4992]: /tmp/replayshell_hosts.NY6yKS h2pushsrv is 192.168.56.101
dnsmasq[4999]: reply h2pushsrv is 192.168.56.101

Since i am connected to eduroam.rwth-aachen.de via wi-fi chrome also issues dns request for
h2pushsrv.eduroam.rwth-aachen.de

Now I noticed that sometimes the DNS-Lookup part of the replay in chrome is sometimes off by 1 RTT.

Note that there is no capture file for h2pushsrv.eduroam.rwth-aachen.de since there wasn't actually any req/response served over this domainname.

When i dig h2pushsrv.eduroam.rwth-aachen.de i get REFUSED, however with 100ms RTT.

[replay] [delay 50 ms] $ dig h2pushsrv.eduroam.rwth-aachen.de
dnsmasq[7846]: query[A] h2pushsrv.eduroam.rwth-aachen.de from 127.0.0.1
dnsmasq[7846]: forwarded h2pushsrv.eduroam.rwth-aachen.de to 100.64.0.2
dnsmasq[7839]: query[A] h2pushsrv.eduroam.rwth-aachen.de from 127.0.1.1

; <<>> DiG 9.10.3-P4-Ubuntu <<>> h2pushsrv.eduroam.rwth-aachen.de
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 61970
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;h2pushsrv.eduroam.rwth-aachen.de. IN A

;; Query time: 102 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Tue Nov 22 21:04:04 CET 2016
;; MSG SIZE rcvd: 50

This is correct.

Now when i do dig h2pushsrv.eduroam.rwth-aachen.de h2pushsrv h2pushsrv.eduroam.rwth-aachen.de

[replay] [delay 50 ms] $ dig h2pushsrv.eduroam.rwth-aachen.de h2pushsrv h2pushsrv.eduroam.rwth-aachen.de
dnsmasq[7846]: query[A] h2pushsrv.eduroam.rwth-aachen.de from 127.0.0.1
dnsmasq[7846]: forwarded h2pushsrv.eduroam.rwth-aachen.de to 100.64.0.2
dnsmasq[7839]: query[A] h2pushsrv.eduroam.rwth-aachen.de from 127.0.1.1

; <<>> DiG 9.10.3-P4-Ubuntu <<>> h2pushsrv.eduroam.rwth-aachen.de h2pushsrv h2pushsrv.eduroam.rwth-aachen.de
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 2243
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;h2pushsrv.eduroam.rwth-aachen.de. IN A

;; Query time: 102 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Tue Nov 22 21:06:30 CET 2016
;; MSG SIZE rcvd: 50

dnsmasq[7846]: query[A] h2pushsrv from 127.0.0.1
dnsmasq[7846]: forwarded h2pushsrv to 100.64.0.2
dnsmasq[7839]: query[A] h2pushsrv from 127.0.1.1
dnsmasq[7839]: /tmp/replayshell_hosts.82Aj8O h2pushsrv is 192.168.56.101
dnsmasq[7846]: reply h2pushsrv is 192.168.56.101
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20387
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1280
;; QUESTION SECTION:
;h2pushsrv. IN A

;; ANSWER SECTION:
h2pushsrv. 0 IN A 192.168.56.101

;; Query time: 101 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Tue Nov 22 21:06:30 CET 2016
;; MSG SIZE rcvd: 54

dnsmasq[7846]: query[A] h2pushsrv.eduroam.rwth-aachen.de from 127.0.0.1
dnsmasq[7846]: forwarded h2pushsrv.eduroam.rwth-aachen.de to 100.64.0.2
dnsmasq[7839]: query[A] h2pushsrv.eduroam.rwth-aachen.de from 127.0.1.1
dnsmasq[7846]: forwarded h2pushsrv.eduroam.rwth-aachen.de to 100.64.0.2
dnsmasq[7839]: query[A] h2pushsrv.eduroam.rwth-aachen.de from 127.0.1.1
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 3491
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;h2pushsrv.eduroam.rwth-aachen.de. IN A

;; Query time: 201 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Tue Nov 22 21:06:30 CET 2016
;; MSG SIZE rcvd: 50

Suddenly, query time is 201msec for the last req, which is 2xRTT.
Note: You may have to repeat this measurement 1-2 times before you can see the effect.

Here is the recording:
https://drive.google.com/open?id=0B8Lb-nOxGTpDa0Q0Q09vTXZGalU

I guess the entire problem can be avoided by requesting
https://h2pushsrv./editorial/ instead of
https://h2pushsrv/editorial/ (note the . after the domainname)
However, the added double RTT still feels like a bug.

@keithw
Copy link
Collaborator

keithw commented Nov 23, 2016

I've looked at the tcpdump output, and it looks to me like dig is retrying the last query twice -- once with an OPT field in the request (having to do with DNSSEC) and then once without.

I haven't dived into the dig source code, but this seems like it might be peculiar to dig, and perhaps particular to its behavior when you put multiple queries on one command-line.

Are you seeing this on a more normal gethostbyname or getaddrinfo call? Or at all outside of dig? Or even outside of running dig with multiple hostnames on one command line?

@worenga
Copy link
Author

worenga commented Nov 23, 2016

Are you seeing this on a more normal gethostbyname or getaddrinfo call? Or at all outside of dig? Or
even outside of running dig with multiple hostnames on one command line?

Yes. In fact, the way i spotted this was using chrome dev tools. Because i am doing Page Load Time Measurements, i noticed the DNS timings we're off in some runs in chrome.

See Screenshots here (first-screenshot: start of a fresh chrome in mm-webreplay with delay 50, then i quitted chrome inside the replayshell and started it fresh again. the second time dns lookup is 300ms)
https://imgur.com/a/swzUn

Note to test this with other browsers you may have to disable ua checking in replayserver

@keithw
Copy link
Collaborator

keithw commented Nov 24, 2016

Okay, can you get a tcpdump capture of the DNS requests? My suspicion is that Chrome (or the stub resolver) is making two requests, just like dig was doing.

My random guess at the moment is that it's possible that the root cause here is that dnsmasq is returning REFUSED instead of NXDOMAIN for nonexistent names (causing the resolvers to have some reason to want to retry the request). But first let's try to figure out if the reason for the extra time is that it's indeed making a second request, just as with dig.

@worenga
Copy link
Author

worenga commented Nov 25, 2016

Thank you. I think the problem is that i am still connected to our university network, where i am part of a domain.

See Screenshots here: https://imgur.com/a/8VVPo
As you can see from the screenshot (where i enabled dns logging in dnsmask on the right) it also queries h2pushsrv.comsys.rwth-aachen.de from inside mahimahi.

First case:
Pcap: https://drive.google.com/open?id=0B8Lb-nOxGTpDVUtCNXpnTEI2blU
SSL Key Logfile for pcap: https://drive.google.com/open?id=0B8Lb-nOxGTpDRDdnTF9JTFBKcnc

Second Case:
https://drive.google.com/open?id=0B8Lb-nOxGTpDQmY4XzRHVXNQTWc
SSL Key Logfile for pcap: https://drive.google.com/open?id=0B8Lb-nOxGTpDbGRETUYxZ25rZmM

Note that there is alot of garbage due to the initial start of chrome.

@keithw
Copy link
Collaborator

keithw commented Nov 26, 2016

Okay, and just to confirm, does adding --all-servers to the dnsmasq command line still fix the problem (and make these two cases the same)?

@worenga
Copy link
Author

worenga commented Nov 26, 2016

@keithw From my following experiment --all-servers seems to play some role, but does not solve the issue entirely. So sorry for the confusion about this flag.

I am connected to my WiFi at home now so now instead of h2pushsrv.comsys.rwth-aachen.de it will query h2pushsrv.local if i am connected to wifi it stops doing that when i am disconnected.

See the following set of screenshots: https://imgur.com/a/hiKDZ

The issue seems so simply be that whenever the machine i run mahimahi on is connected to an outside network, there are additional DNS queries issued to $query_hostname.$domainname which cannot be answered by dnsmasq and those extra requests will incur the extra RTTs.

A more minimalist example would be:

[replay] [delay 100 ms] ➜  ~ /usr/bin/time python -c 'import socket; socket.gethostbyname("h2pushsrv")'
dnsmasq[16403]: query[A] h2pushsrv.local from 127.0.0.1
dnsmasq[16403]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[16396]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[16403]: query[A] h2pushsrv.local from 127.0.0.1
dnsmasq[16403]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[16396]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[16403]: query[A] h2pushsrv from 127.0.0.1
dnsmasq[16403]: forwarded h2pushsrv to 100.64.0.2
dnsmasq[16396]: query[A] h2pushsrv from 127.0.1.1
dnsmasq[16396]: /tmp/replayshell_hosts.hKsVIn h2pushsrv is 192.168.56.101
dnsmasq[16403]: reply h2pushsrv is 192.168.56.101
0.02user 0.00system 0:00.63elapsed 4%CPU (0avgtext+0avgdata 9716maxresident)k
0inputs+0outputs (0major+1068minor)pagefaults 0swaps```

@worenga
Copy link
Author

worenga commented Nov 26, 2016

The problem seems to be that ubuntu adds a search directive to my /etc/resolv.conf.
This search directive is responsible for issuing the additional dns queries.
Mahimahi only spawns a dnsmasq on the nameservers that are retrieved from resolvconf but ignores the search thing.

Maybe we could generate a resolf.conf specifically only for the replay namespace, but this requires a named network namespace i think.

Instead of unshare(CLONE_NEWNET) the Mahimahi replayshell could create a named network namespace like mahimhai.$pid using ip netns add mahimhai.$pid and then use setns to use that namespace. Having a named network namespace would allow for the creation of a /etc/netns/$namespacename/resolv.conf that is utilized within the network namespace only.

This might also eliminate the need to start multiple dnsmasq instances if there are multiple dns servers configured in the user's system.

@keithw
Copy link
Collaborator

keithw commented Nov 27, 2016

That's an interesting idea, but to the best of my knowledge, per-network-namespace resolv.conf files just aren't implemented (e.g. it doesn't look like glibc knows anything about them).

I think this may have to be a "performing as designed" as far as mahimahi is concerned. Your userspace programs are doing multiple DNS lookups, apparently because that's what's in your (userspace) resolver configuration. What you're seeing is the effects of that behavior.

@worenga
Copy link
Author

worenga commented Nov 27, 2016

That's an interesting idea, but to the best of my knowledge, per-network-namespace resolv.conf files just aren't implemented (e.g. it doesn't look like glibc knows anything about them).

I cannot confirm that. I've fiddled around with this yesterday and got a working mahimahi prototype where a named network namespace is created with corresponding /etc/netns/$namespacename/resol.conf. (Adopted from the iproute2 source for ip netns $namespace exec )
However, it lacks proper cleanup yet, which is why i havent created a PR so far.

I think this may have to be a "performing as designed" as far as mahimahi is concerned. Your userspace programs are doing multiple DNS lookups, apparently because that's what's in your (userspace) resolver configuration. What you're seeing is the effects of that behavior.

I am not sure about the design philosophy of mahimahi, however it is confusing that the timing of a local replay of a website is dependent of the outside dns configuration (i.e. whether i am part of a domain or not).

See this log of a replay i did from the host h2pushsrv:

➜  ~ mm-webreplay test5 mm-delay 100
dnsmasq[8901]: started, version 2.75 cachesize 150
dnsmasq[8901]: compile time options: IPv6 GNU-getopt DBus i18n IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth DNSSEC loop-detect inotify
dnsmasq[8901]: warning: no upstream servers configured
dnsmasq[8901]: read /tmp/replayshell_hosts.3p4N1l - 49 addresses
dnsmasq[8908]: started, version 2.75 cachesize 150
dnsmasq[8908]: compile time options: IPv6 GNU-getopt DBus i18n IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth DNSSEC loop-detect inotify
dnsmasq[8908]: using nameserver 100.64.0.2#45299
dnsmasq[8908]: cleared cache

[replay] [delay 100 ms] ➜  ~ /usr/bin/time python -c 'import socket; socket.gethostbyname("h2pushsrv")'
dnsmasq[8908]: query[A] h2pushsrv.local from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[8908]: query[A] h2pushsrv.local from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[8908]: query[A] h2pushsrv from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv from 127.0.1.1
dnsmasq[8901]: /tmp/replayshell_hosts.3p4N1l h2pushsrv is 192.168.56.101
dnsmasq[8908]: reply h2pushsrv is 192.168.56.101
0.02user 0.00system 0:00.64elapsed 4%CPU (0avgtext+0avgdata 9592maxresident)k
0inputs+0outputs (0major+1067minor)pagefaults 0swaps

[replay] [delay 100 ms] ➜  ~ /usr/bin/time python -c 'import socket; socket.gethostbyname("h2pushsrv")'
dnsmasq[8908]: query[A] h2pushsrv.local from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[8908]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[8908]: query[A] h2pushsrv.local from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[8908]: query[A] h2pushsrv from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv from 127.0.1.1
dnsmasq[8901]: /tmp/replayshell_hosts.3p4N1l h2pushsrv is 192.168.56.101
dnsmasq[8908]: reply h2pushsrv is 192.168.56.101
0.02user 0.00system 0:00.83elapsed 3%CPU (0avgtext+0avgdata 9756maxresident)k
0inputs+0outputs (0major+1072minor)pagefaults 0swaps

[replay] [delay 100 ms] ➜  ~ /usr/bin/time python -c 'import socket; socket.gethostbyname("h2pushsrv")'
dnsmasq[8908]: query[A] h2pushsrv.local from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[8908]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[8908]: query[A] h2pushsrv.local from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[8908]: query[A] h2pushsrv from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv from 127.0.1.1
dnsmasq[8901]: /tmp/replayshell_hosts.3p4N1l h2pushsrv is 192.168.56.101
dnsmasq[8908]: reply h2pushsrv is 192.168.56.101
0.01user 0.01system 0:00.83elapsed 3%CPU (0avgtext+0avgdata 9688maxresident)k
0inputs+0outputs (0major+1066minor)pagefaults 0swaps

[replay] [delay 100 ms] ➜  ~ /usr/bin/time python -c 'import socket; socket.gethostbyname("h2pushsrv")'
dnsmasq[8908]: query[A] h2pushsrv.local from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[8908]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[8908]: query[A] h2pushsrv.local from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[8908]: query[A] h2pushsrv from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv from 127.0.1.1
dnsmasq[8901]: /tmp/replayshell_hosts.3p4N1l h2pushsrv is 192.168.56.101
dnsmasq[8908]: reply h2pushsrv is 192.168.56.101
0.02user 0.00system 0:00.83elapsed 3%CPU (0avgtext+0avgdata 9760maxresident)k
0inputs+0outputs (0major+1072minor)pagefaults 0swaps

[replay] [delay 100 ms] ➜  ~ /usr/bin/time python -c 'import socket; socket.gethostbyname("h2pushsrv")'
dnsmasq[8908]: query[A] h2pushsrv.local from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[8908]: query[A] h2pushsrv.local from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv.local to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv.local from 127.0.1.1
dnsmasq[8908]: query[A] h2pushsrv from 127.0.0.1
dnsmasq[8908]: forwarded h2pushsrv to 100.64.0.2
dnsmasq[8901]: query[A] h2pushsrv from 127.0.1.1
dnsmasq[8901]: /tmp/replayshell_hosts.3p4N1l h2pushsrv is 192.168.56.101
dnsmasq[8908]: reply h2pushsrv is 192.168.56.101
0.02user 0.00system 0:00.64elapsed 4%CPU (0avgtext+0avgdata 9648maxresident)k
0inputs+0outputs (0major+1066minor)pagefaults 0swaps

Obviously, sometimes the time spend resolving DNS is 1 RTT higher, which yields to different page load times inside the same mm-webreplay session.
Also note mm-webreplay dnsmasq does not serve the h2pushsrv.local domain (REFUSED) but the h2pushsrv domain, because this is the domain that was included in the protobuf recordings.

Since the .local depends on the current local network configuration (.local is just my home network, when i am at work its another domainname), eliminating the .local by introducing a mm-webreplay-specific resolv.conf seems like a clean way to solve this problem.

@worenga worenga linked a pull request Nov 27, 2016 that will close this issue
@keithw
Copy link
Collaborator

keithw commented Nov 27, 2016

(commented on #106)

worenga added a commit to worenga/mahimahi-h2o that referenced this issue May 23, 2017
Noticed a discrepancy in dns timings dns timings were off sometimes a rtt.
Not entirely sure why this is, checked chrome dns queries and even in incognito chrome sometimes sends a deterministic bunch of dns queries too chromium-i18n.appspot.com and google.com,
 either by chromedrivier issuing the request due to url entering (prefetching google search results) or some google stuff to protect one from malware etc.
anyhow, timings should not never differ across runs.
i think its some request pipelining issue in replayshell passing --all-servers to dnsmasq seems to resolve this, allthough i have no idea why, which is kinda dangerous.

Observe Ticket: ravinet/mahimahi#105
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants