Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

would like dcmd for heuristically finding back references #107

Closed
davepacheco opened this issue Feb 9, 2018 · 8 comments
Closed

would like dcmd for heuristically finding back references #107

davepacheco opened this issue Feb 9, 2018 · 8 comments

Comments

@davepacheco
Copy link
Contributor

davepacheco commented Feb 9, 2018

While debugging issues related to memory usage (including memory leaks), it's common to look at a particular object and want to know what other references exist for that object. findjsobjects -r can help with this, though it's extremely time-consuming. (On large core files -- and issues related to memory usage often result in large core files -- each iteration can take tens of minutes, and you often want to make dozens of iterations.) It also doesn't currently find objects referenced via closures -- see #105.

In the meantime, I discovered that I could find back references reasonably reliably using a combination of ugrep and scouring nearby memory. It basically works like this: I'd find an interesting representative object from findjsobjects and look for references:

> 400bc4e08189::findjsobjects -r
400bc4e08189 is not referred to by a known object.

But using ugrep, I'd find that it was referred to elsewhere:

> 400bc4e08189::ugrep
400bc4e08c18

Then I'd try to summarize the region of memory ahead of that reference like this:

> 400bc4e08c18-0t64,0t9=p8+n | ::eval ".=nn; ./p; *.::v8type"
                
                
0x400bc4e08bd8: 0x7a9cfc104101  
0x7a9cfc104101: FixedArray
                
                
0x400bc4e08be0: 0x4b79a3904121  
0x4b79a3904121: Oddball: "undefined"
                
                
0x400bc4e08be8: 0x497ed1e05141  
0x497ed1e05141: Map
                
                
0x400bc4e08bf0: 0x900000000     
0x900000000: SMI: value = 9
                
                
0x400bc4e08bf8: 0x79785ffef4b9  
0x79785ffef4b9: JSFunction
                
                
0x400bc4e08c00: 0x6b8ee5b34791  
0x6b8ee5b34791: FixedArray
                
                
0x400bc4e08c08: 0               
0x0: SMI: value = 0
                
                
0x400bc4e08c10: 0x4b79a3906b71  
0x4b79a3906b71: JSGlobalObject
                
                
0x400bc4e08c18: 0x400bc4e08189  
0x400bc4e08189: JSFunction

Observing that virtually every V8 heap object has as its first word a pointer to a Map, I'd walk back from my pointer to the first address that points to a Map. In this case, that's 400bc4e08be8. To get the heap object's address, we have to take addr|1 -- 400bc4e08be9 in this case. Then I'd try to print that out, and sure enough:

> 0x400bc4e08be9::v8print
400bc4e08be9 FixedArray {
    400bc4e08be9 FixedArrayBase {
        400bc4e08be9 HeapObject < Object  {
            400bc4e08be8 map = 497ed1e05141 (Map)
        }
        400bc4e08bf0 length = 900000000 (SMI: value = 9)
    }
    400bc4e08bf8 data = 79785ffef4b9 (JSFunction)
}

It's a FixedArray, and our original value is one of its elements:

> 0x400bc4e08be9::v8array
79785ffef4b9
6b8ee5b34791
0
4b79a3906b71
400bc4e08189      <------ our original pointer
400bc4e081d1
400bc4e081f1
4b79a3904161
400bc4e08211

I used this approach iteratively to reconstruct a tree that included 4 JSObjects, 1 JSArray, 6 closures, and a handful of FixedArrays that were used as intermediate values (e.g., for arrays or closure contexts). It was tedious and time-consuming, especially when it branched (i.e., there was more than one reference), but it was pretty reliable, and it can be done automatically quite quickly.

I'd propose a couple of dcmds like:

ADDR::jsfindrefs [-s SIZE]
ADDR::v8findrefs [-s SIZE]

which would essentially follow this heuristic:

  • Use ugrep to find candidates addresses.
  • For each candidate address, walk backwards up to SIZE bytes and attempt to interpret each pointer-aligned address as a V8 heap object. If we get through SIZE bytes and find nothing, give up. If we find a Map, stop -- we almost certainly don't need to look further than this point.
  • As an extra check, use our knowledge of heap objects to attempt to determine whether the the reference we found via ugrep is actually contained in the object. Many heap classes are fixed-size, which makes it pretty easy. Others are variable size (like strings, FixedArrays, and JSObjects), but we can tell how big they are from the information in them. This check should eliminate many false positives.
  • If we make it this far, for v8findrefs, we're done -- report the reference and move onto the next ugrep candidate.
  • For jsfindrefs, we could do a little more interpretation: if we find a FixedArray, we should probably take the next step and find where that FixedArray is referenced. It's likely to be a closure or a JSArray.

I'm envisioning that v8findrefs would iterate just once and report the V8 heap objects that immediately refer to the given address, while jsfindrefs would keep following each path until it finds the next JS-level object along each path. Either of these could take an option to repeat the process a number of times.

As for an appropriate default for SIZE: the maximum you'd have to walk back for fixed-size objects is the size of the largest fixed-size object that we know about. Fortunately, we have that information. For strings and arrays, you might have to walk back arbitrarily further than that, since those are arbitrary-sized objects. But since we can always stop when we find the first Map, and false positives can usually be eliminated with the check above, there's no reason not to set this limit fairly high (e.g., even a few hundred KB).

Although this was mostly useful here because we don't yet have #105, and the performance issues associated with findjsobjects -r could also be addressed with the export-to-sqlite path, I think these dcmds would still be useful tools to have.

I will add a detailed analysis for the original problem to MORAY-455, in case that's helpful.

@davepacheco
Copy link
Contributor Author

I just discovered that mdb has a private function called mdb_whatis_register that allows dmods to plug into the ::whatis dcmd. That may be useful, too.

@davepacheco
Copy link
Contributor Author

I implemented the behavior I described for ::v8findrefs as a new dcmd called ::v8whatis under #111. I've prototyped the change for ::jsfindrefs -- I've only tried it on a synthetic program, but it seems quite promising. For reference, the branch is here:
https://github.com/davepacheco/mdb_v8/tree/dev-jsfindrefs

At this point, it still needs some cleanup, documentation, and tests.

@davepacheco
Copy link
Contributor Author

davepacheco commented Mar 19, 2018

I've put up a change for ::jsfindrefs here: https://cr.joyent.us/#/c/3691

Some notes for reviewers:

  • This change refactors the ::v8whatis implementation I added under want ::v8whatis #111 into a first-class interface declared in "mdb_v8_dbg.h" for use here. The obj_contains() function was similarly first-classed, now called v8contains().
  • This change re-implements the equivalent of mdb's ::ugrep dcmd in its most basic form. While unfortunate, the implementation is pretty straightforward, and the comments explain why it's necessary. I've put this under a new "dbi" subsystem for "debugger interface". The intent is to support putting other debugger-provided facilities here (like iterating mappings and reading from the address space).
  • This change includes updates to CHANGES.md and version in preparation for an immediate release.
  • This is unrelated, but I fixed a typo I noticed in the ::v8print help output.
  • I've also added a new environment variable used by the test suite called MDBV8_TEST_KEEPCORE. If set to "true", "1", or "yes", this causes the standaloneTest facility to preserve the core file even if the test succeeds. Normally, the core file is removed when a test passes.

I have a copy at the the same GitHub branch as before, so you can see the docs for the new dcmd here:
https://github.com/davepacheco/mdb_v8/blob/dev-jsfindrefs/docs/usage.md#jsfindrefs

@davepacheco
Copy link
Contributor Author

PS3 of the above change passes the runtests_node tests:

0.10.48 sunos x86: success
0.10.48 sunos x64: success
0.12.17 sunos x86: success
0.12.17 sunos x64: success
4.8.4   sunos x86: success
4.8.4   sunos x64: success
6.11.2  sunos x86: success
6.11.2  sunos x64: success

@davepacheco
Copy link
Contributor Author

I've updated the Gerrit CR in response to code review feedback. I'm rerunning the test suite on the same Node versions as before.

@davepacheco
Copy link
Contributor Author

Here are the test results from PS4:

$ time ./tools/runtests_node run $HOME/node-versions
...
node v6.11.2 ia32
Configuration:
    SRC:                         /home/dap/mdb_v8
    Output directory:            /var/tmp/catest.31582
    Temp directory:              /var/tmp/catest.31582_tmpfiles
    Keep successful test output: false
    Found 8 test(s) to run

===================================================

Executing test test/standalone/tst.arrays.js ... success.
Executing test test/standalone/tst.jsclosure.js ... success.
Executing test test/standalone/tst.jsfindrefs.js ... success.
Executing test test/standalone/tst.postmortem_basic.js ... success.
Executing test test/standalone/tst.postmortem_details.js ... success.
Executing test test/standalone/tst.postmortem_findjsobjects.js ... success.
Executing test test/standalone/tst.postmortem_jsstack.js ... FAILED.
>>> failure details in /var/tmp/catest.31582/failure.0

Executing test test/standalone/tst.v8whatis.js ... success.

===================================================

Results:
	Tests passed:	 7/ 8
	Tests failed:	 1/ 8

===================================================
Cleaning up output from successful tests ... done.
node v6.11.2 x64
Configuration:
    SRC:                         /home/dap/mdb_v8
    Output directory:            /var/tmp/catest.36172
    Temp directory:              /var/tmp/catest.36172_tmpfiles
    Keep successful test output: false
    Found 8 test(s) to run

===================================================

Executing test test/standalone/tst.arrays.js ... success.
Executing test test/standalone/tst.jsclosure.js ... success.
Executing test test/standalone/tst.jsfindrefs.js ... success.
Executing test test/standalone/tst.postmortem_basic.js ... success.
Executing test test/standalone/tst.postmortem_details.js ... success.
Executing test test/standalone/tst.postmortem_findjsobjects.js ... success.
Executing test test/standalone/tst.postmortem_jsstack.js ... FAILED.
>>> failure details in /var/tmp/catest.36172/failure.0

Executing test test/standalone/tst.v8whatis.js ... success.

===================================================

Results:
	Tests passed:	 7/ 8
	Tests failed:	 1/ 8

===================================================
Cleaning up output from successful tests ... done.

Summary:
0.10.48 sunos x86: success
0.10.48 sunos x64: success
0.12.17 sunos x86: success
0.12.17 sunos x64: success
4.8.4   sunos x86: success
4.8.4   sunos x64: success
6.11.2  sunos x86: fail
6.11.2  sunos x64: fail


real	41m29.672s
user	14m48.500s
sys	26m45.323s

More specifically, across all Node versions, there were two test failures. They both appear to be instances of #38:

dap@f0c3d2d4 mdb_v8 $ cat /var/tmp/catest.31582/failure.0/README 
/home/dap/mdb_v8/test/standalone/tst.postmortem_jsstack.js failed: test returned 1
dap@f0c3d2d4 mdb_v8 $ cat /var/tmp/catest.31582/failure.0/*.err

assert.js:81
  throw new assert.AssertionError({
  ^
AssertionError: unexpected frame where stalloogle was expected; core retained as /var/tmp/node.35574
    at ChildProcess.<anonymous> (/home/dap/mdb_v8/test/standalone/tst.postmortem_jsstack.js:140:11)
    at emitTwo (events.js:106:13)
    at ChildProcess.emit (events.js:191:7)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:215:12)
dap@f0c3d2d4 mdb_v8 $
dap@f0c3d2d4 mdb_v8 $ cat /var/tmp/catest.36172/failure.0/README 
/home/dap/mdb_v8/test/standalone/tst.postmortem_jsstack.js failed: test returned 1
dap@f0c3d2d4 mdb_v8 $ cat /var/tmp/catest.36172/failure.0/*.err

assert.js:81
  throw new assert.AssertionError({
  ^
AssertionError: unexpected frame where stalloogle was expected; core retained as /var/tmp/node.40373
    at ChildProcess.<anonymous> (/home/dap/mdb_v8/test/standalone/tst.postmortem_jsstack.js:140:11)
    at emitTwo (events.js:106:13)
    at ChildProcess.emit (events.js:191:7)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:215:12)

(This also looks like #97, but this PS has the fix for that issue.)

@davepacheco
Copy link
Contributor Author

As described in #38, this issue does seem intermittent:

dap@f0c3d2d4 mdb_v8 $ node -v
v6.11.2
(reverse-i-search)`': ^C
dap@f0c3d2d4 mdb_v8 $ make -n test
tools/catest -a
dap@f0c3d2d4 mdb_v8 $ ./tools/catest test/standalone/tst.postmortem_jsstack.js 
Configuration:
    SRC:                         /home/dap/mdb_v8
    Output directory:            /var/tmp/catest.73874
    Temp directory:              /var/tmp/catest.73874_tmpfiles
    Keep successful test output: false
    Found 1 test(s) to run

===================================================

Executing test test/standalone/tst.postmortem_jsstack.js ... FAILED.
>>> failure details in /var/tmp/catest.73874/failure.0


===================================================

Results:
	Tests passed:	 0/ 1
	Tests failed:	 1/ 1

===================================================
Cleaning up output from successful tests ... done.
dap@f0c3d2d4 mdb_v8 $ ./tools/catest test/standalone/tst.postmortem_jsstack.js 
Configuration:
    SRC:                         /home/dap/mdb_v8
    Output directory:            /var/tmp/catest.75980
    Temp directory:              /var/tmp/catest.75980_tmpfiles
    Keep successful test output: false
    Found 1 test(s) to run

===================================================

Executing test test/standalone/tst.postmortem_jsstack.js ... success.

===================================================

Results:
	Tests passed:	 1/ 1
	Tests failed:	 0/ 1

===================================================
Cleaning up output from successful tests ... done.
dap@f0c3d2d4 mdb_v8 $ 

I think this can count as a clean test run. Obviously, we should still fix #38, but I don't think this work should be blocked on that.

joyent-automation pushed a commit that referenced this issue Apr 3, 2019
Reviewed by: Cody Peter Mello <[email protected]>
Approved by: Cody Peter Mello <[email protected]>
@davepacheco
Copy link
Contributor Author

Integrated. Thanks @melloc for the review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant