Description
General issue:
We're using the R extension in a model, and everything works well in single runs. However, when we set up BehaviorSpace experiments with several runs in parallel, NetLogo crashes unexpectedly, with an error message indicating a memory access problem (full error message is attached).
In Detail:
We've so far found out that the crashes occur under different conditions:
- We've tested it with both a real model and a small dummy model that only puts one variable to R, evaluates one expression, and returns the result to NetLogo (see code below). The crashes occur with both models.
- The crashes occur in both Windows and Mac environments: we've tested it on a 4 core Windows notebook, a 6 core MacBook Pro, and a 40 core Windows compute cluster.
- The number of parallel runs seems to be very important: e.g. on the Windows compute cluster 4 parallel runs were fine, while 5 parallel runs already made NetLogo crash. On my MacBook Pro, 8 parallel runs were fine, while 9 always made NetLogo crash.
- Memory availability does not seem to be the (main) problem, as the Windows compute cluster has 768 GB of memory and the MacBook Pro has 32 GB of memory available. We also tried increasing the Java heap size (using NetLogo.cfg, or the command line options in headless mode).
- The error occurs both when starting the experiment from the GUI and in headless mode.
We've used NetLogo 6.1 and R 3.5.3 and 3.5.2 to run the models on the three different machines. The Windows notebook is running Windows 7, the Windows compute cluster is running Windows Server 2012 R2 Standard, and the MacBook Pro is running macOS 10.14.6.
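To narrow down the thread count at which the crash first appears, the headless runs can be generated systematically. The following is a minimal sketch, not the procedure used for the tests above: the launcher path `./netlogo-headless.sh`, the model file name `NetLogo_R_Test.nlogo`, and the experiment name `experiment` are assumptions for illustration. It only builds the command lines, using the `--model`, `--experiment`, and `--threads` options of NetLogo's headless launcher.

```python
from typing import List


def headless_command(launcher: str, model: str, experiment: str, threads: int) -> List[str]:
    """Build the argument list for one headless BehaviorSpace invocation."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        launcher,
        "--model", model,
        "--experiment", experiment,
        "--threads", str(threads),
    ]


def sweep_commands(launcher: str, model: str, experiment: str, max_threads: int) -> List[List[str]]:
    """One command per thread count, from 1 up to and including max_threads."""
    return [
        headless_command(launcher, model, experiment, n)
        for n in range(1, max_threads + 1)
    ]


if __name__ == "__main__":
    # Hypothetical launcher path, model file, and experiment name.
    for cmd in sweep_commands("./netlogo-headless.sh", "NetLogo_R_Test.nlogo", "experiment", 9):
        print(" ".join(cmd))
```

Each argument list can be passed to `subprocess.run` one at a time; the first thread count whose process exits abnormally marks the threshold on that machine.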
The sample model that we used is the following (full .nlogo file is attached, renamed to .txt):
extensions [r]
globals [
  data
  data-result
]
to setup
  set data []
  repeat n
  [
    set data fput random 100 data
  ]
end
to go
  r:put "dats" data
  r:eval "dats_mean <- mean(dats)"
  set data-result r:get "dats_mean"
  print ( word "the mean is " result )
  stop
end
to-report result
  report data-result
end
The crash report on the Windows compute cluster is:
Problem signature:
Problem Event Name: APPCRASH
Application Name: NetLogo.exe
Application Version: 0.0.0.0
Application Timestamp: 5c10d881
Fault Module Name: R.dll
Fault Module Version: 3.52.10334.0
Fault Module Timestamp: 5c1b62d0
Exception Code: c00000fd
Exception Offset: 0000000000128c81
OS Version: 6.3.9600.2.0.0.16.7
Locale ID: 1033
Additional Information 1: f94c
Additional Information 2: f94c8bdfb5de375694f8855243970f35
Additional Information 3: 9eed
Additional Information 4: 9eed6c337d597a4fceca8d0a6d67e101
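For reference, the exception code in the report above can be decoded against the standard Windows NTSTATUS values. The sketch below uses a small hand-picked lookup table (the function name and the selection of codes are illustrative, not part of any tool used here). The value c00000fd corresponds to STATUS_STACK_OVERFLOW, which would point at a stack overflow in the thread executing R.dll rather than at heap exhaustion. This is an interpretation of the code only and has not been confirmed against the full crash log.

```python
# A few well-known NTSTATUS exception codes; the table is deliberately incomplete.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def decode_exception_code(code: str) -> str:
    """Map a hex exception code from a Windows crash report to its NTSTATUS name."""
    value = int(code, 16)  # accepts "c00000fd" as well as "0xC00000FD"
    return NTSTATUS_NAMES.get(value, "unknown (0x%08X)" % value)


if __name__ == "__main__":
    # "Exception Code" field from the APPCRASH report above.
    print(decode_exception_code("c00000fd"))
```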
The crash report on my MacBook starts as follows (full crash report attached):
Process: NetLogo [88321]
Path: /Applications/NetLogo 6.1.0/NetLogo 6.1.0.app/Contents/MacOS/NetLogo
Identifier: org.nlogo.NetLogo
Version: 6.1.0 (6.1.0)
Code Type: X86-64 (Native)
Parent Process: ??? [1]
Responsible: NetLogo [88321]
User ID: 501
Date/Time: 2019-08-09 10:25:45.436 +0200
OS Version: Mac OS X 10.14.6 (18G84)
Report Version: 12
Bridge OS Version: 3.6 (16P6568)
Anonymous UUID: 7F799A0C-F48F-02C1-E29E-BF2F8763DF14
Sleep/Wake UUID: 8CB42D94-58A5-4AB6-9916-BB24F48A2706
Time Awake Since Boot: 190000 seconds
Time Since Wake: 2600 seconds
System Integrity Protection: enabled
Crashed Thread: 45 Java: JobThread
Exception Type: EXC_BAD_ACCESS (SIGABRT)
Exception Codes: EXC_I386_GPFLT
Exception Note: EXC_CORPSE_NOTIFY
Application Specific Information:
abort() called
Thread 0:: AppKit Thread Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x00007fff6f5c122a mach_msg_trap + 10
1 libsystem_kernel.dylib 0x00007fff6f5c176c mach_msg + 60
2 com.apple.CoreFoundation 0x00007fff4353b1ee __CFRunLoopServiceMachPort + 328
3 com.apple.CoreFoundation 0x00007fff4353a75c __CFRunLoopRun + 1612
4 com.apple.CoreFoundation 0x00007fff43539ebe CFRunLoopRunSpecific + 455
5 com.apple.HIToolbox 0x00007fff427991ab RunCurrentEventLoopInMode + 292
6 com.apple.HIToolbox 0x00007fff42798ee5 ReceiveNextEventCommon + 603
7 com.apple.HIToolbox 0x00007fff42798c76 _BlockUntilNextEventMatchingListInModeWithFilter + 64
8 com.apple.AppKit 0x00007fff40b3179d _DPSNextEvent + 1135
9 com.apple.AppKit 0x00007fff40b3048b -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 1361
10 libosxapp.dylib 0x0000000113e38328 -[NSApplicationAWT nextEventMatchingMask:untilDate:inMode:dequeue:] + 124
11 com.apple.AppKit 0x00007fff40b2a5a8 -[NSApplication run] + 699
12 libosxapp.dylib 0x0000000113e37f72 +[NSApplicationAWT runAWTLoopWithApp:] + 156
13 libawt_lwawt.dylib 0x0000000113dc30bf -[AWTStarter starter:] + 905
14 com.apple.Foundation 0x00007fff45832742 __NSThreadPerformPerform + 328
15 com.apple.CoreFoundation 0x00007fff43557683 CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION + 17
16 com.apple.CoreFoundation 0x00007fff43557629 __CFRunLoopDoSource0 + 108
17 com.apple.CoreFoundation 0x00007fff4353afeb __CFRunLoopDoSources0 + 195
18 com.apple.CoreFoundation 0x00007fff4353a5b5 __CFRunLoopRun + 1189
19 com.apple.CoreFoundation 0x00007fff43539ebe CFRunLoopRunSpecific + 455
20 libjli.dylib 0x00000001106e78fe CreateExecutionEnvironment + 871
21 libjli.dylib 0x00000001106e34cc JLI_Launch + 1952
22 libpackager.dylib 0x000000010ec77844 JavaLibrary::JavaVMCreate(unsigned long, char**) + 180
23 libpackager.dylib 0x000000010ec75b27 JavaVirtualMachine::StartJVM() + 3703
24 libpackager.dylib 0x000000010ec74bf0 RunVM() + 16
25 libpackager.dylib 0x000000010ec7fc7f start_launcher + 1631
26 org.nlogo.NetLogo 0x000000010ec31cec main + 220
27 libdyld.dylib 0x00007fff6f48c3d5 start + 1...
hs_err_pid88321.log
NetLogo_R_Extension_Crashlog.txt
NetLogo_R_Test.txt