Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix StackOverflowError when a wire is connected to itself #41

Merged
merged 1 commit into from
Dec 9, 2022

Conversation

aphedges
Copy link
Contributor

@aphedges aphedges commented Dec 8, 2022

During each tick, electrical components are checked for whether they connect to ground with the help of the method Terminal.connectsToGround(). The method contains a maxSteps parameter that is decremented on each recursive call to prevent infinite recursion. However, the value of maxSteps was never checked, so the method would recurse infinitely until the stack overflowed when one terminal of a wire was connected to the other terminal. Adding checks to stop recursing when maxSteps reaches 0 prevents the error.


I used the following script for testing:

from scienceworld import ScienceWorldEnv

env = ScienceWorldEnv("")
env.load("task-2a-test-conductivity-of-unknown-substances", 380, "")
*_, info = env.step("connect orange wire terminal 2 to orange wire terminal 1")
print(repr(info["look"]))

Before this commit, I get the following output:

'java.lang.StackOverflowError'

After this commit, I get the following output:

'This room is called the workshop. In it, you see: \n\tthe agent\n\ta substance called air\n\ta green box (containing nothing)\n\ta red box (containing nothing)\n\ta table. On the table is: a battery, a blue wire, a green light bulb, which is off, a orange wire, a red wire, a switch, which is off, a violet light bulb, which is off, a yellow light bulb, which is off.\n\ta ultra low temperature freezer. The ultra low temperature freezer door is closed. \n\tunknown substance R\nYou also see:\n\tA door to the hallway (that is closed)\n'


To debug the issue, I created a new patch to replace the patch in #30 (comment) that worked before #30 was merged. This specific patch uses 6ba4730 as a base.

diff --git a/scienceworld/scienceworld.py b/scienceworld/scienceworld.py
index 227c6b5..6911444 100644
--- a/scienceworld/scienceworld.py
+++ b/scienceworld/scienceworld.py
@@ -30,7 +30,13 @@ class ScienceWorldEnv:

         # Launch Java side with dynamic port and get back the port on which the
         # server was bound to.
-        port = launch_gateway(classpath=serverPath, die_on_exit=True, cwd=BASEPATH)
+        import sys, time
+        port = launch_gateway(classpath=serverPath, die_on_exit=True, cwd=BASEPATH,
+                  javaopts=['-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005,quiet=y'],
+                  redirect_stdout=sys.stdout, redirect_stderr=sys.stderr)
+        print("Attach debugger")
+        time.sleep(10)  # Give time for user to attach debugger
+

         # Connect python side to Java side with Java dynamic port and start python
         # callback server with a dynamic port

During each tick, electrical components are checked for whether they
connect to ground with the help of the method
`Terminal.connectsToGround()`. The method contains a `maxSteps`
parameter that is decremented on each recursive call to prevent infinite
recursion. However, the value of `maxSteps` was never checked, so the
method would recurse infinitely until the stack overflowed when one
terminal of a wire was connected to the other terminal. Adding checks to
stop recursing when `maxSteps` reaches 0 prevents the error.
@MarcCote MarcCote added the bug Something isn't working label Dec 9, 2022
@MarcCote
Copy link
Collaborator

MarcCote commented Dec 9, 2022

I think this fix will probably solve cognitiveailab/drrn-scienceworld#4 as well.

@aphedges very interested in making the debugging process builtin. Would you be willing to add those options and timeout if a given ENV variable is set? For instance, something like

if bool(os.environ.get("SCIENCEWORLD_DEBUG")):
    port = launch_gateway(classpath=serverPath, die_on_exit=True, cwd=BASEPATH,
                          javaopts=['-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005,quiet=y'],
                          redirect_stdout=sys.stdout, redirect_stderr=sys.stderr)
    print("Attach debugger")
    time.sleep(10)  # Give time for user to attach debugger
else:
    port = launch_gateway(classpath=serverPath, die_on_exit=True, cwd=BASEPATH)

@PeterAJansen
Copy link
Collaborator

Thanks for fixing this @aphedges . I vaguely remember losing a few commits when I was developing the electrical engine, it's possible these checks were what ended up getting lost. As Marc said, I think you've probably also fixed cognitiveailab/drrn-scienceworld#4 too. :)

Re: debugging, I agree, the process is very challenging and laborious. I typically made either small simulations to test something specific, or added println() calls in critical spots, then just ran the simulator manually in a terminal window and watched what would happen when the python client got to the critical point. (I think that last debug method might be slightly more challenging with the new auto-port-gateway, so redirecting the stdout as above should allow that debugging method to return again).

@PeterAJansen PeterAJansen merged commit bf40b47 into allenai:main Dec 9, 2022
@aphedges
Copy link
Contributor Author

@MarcCote:

@aphedges very interested in making the debugging process builtin. Would you be willing to add those options and timeout if a given ENV variable is set?

Sure! I actually wrote some very similar code in my agent, so I'll combine our two approaches and create a PR.

@PeterAJansen:

Re: debugging, I agree, the process is very challenging and laborious. I typically made either small simulations to test something specific, or added println() calls in critical spots, then just ran the simulator manually in a terminal window and watched what would happen when the python client got to the critical point.

I usually use those two techniques as well. I always create a small simulation program for a minimal reproduction and try to use println() statements, but if the bug is too difficult to solve with just those, then I attach a debugger.

@aphedges aphedges mentioned this pull request Dec 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants