Fix for nasty `rvmCreateVM SIGABRT` crash
21 Apr 2022 | fix debugger vmThis issue was mentioned multiple times over all channels and all reports have the following in common:
- issue usually was seen in crash reporters;
- no steps or information how to reproduce;
- no simple project to reproduce;
- production source code can’t be shared for investigation;
- crash is related to heavy network activities;
- crash log looks similar to bellow:
rvmCreateVM
SIGABRT ABORT 0x00000001d5219414
Crashed: Thread
0 libsystem_kernel.dylib 0x1d5219414 __pthread_kill + 8
1 libsystem_pthread.dylib 0x1f2d74b50 pthread_kill + 272
2 libsystem_c.dylib 0x1b06f7b74 abort + 104
3 RoboVMMobile 0x1060a77b0 rvmCreateVM + 4386699184
4 RoboVMMobile 0x1060a51ec rvmInitialize + 4386689516
5 RoboVMMobile 0x10609d9a8 _bcInitializeClass + 4386658728
Things changed once Benjamin Schulte @MrStahlfelge
reported unknown rare crash that was happening approximately once per 100 launches and ErgoWallet is opensources.
1 crash per 100 launches sound good to track the bug.
Root case
Issue was reproduced and stack trace for debug code looks more informative than production one (there is an issue for this):
Assertion failed: (CLASS_IS_STATE_LOADED(clazz)), function rvmInitialize, file class.c, line 1173.
And its actual state is CLASS_STATE_ALLOCATED
which looks odd. Considering that bcInitializeClass
logical steps looks as following:
bcInitializeClass
ldcClass
rvmFindClassUsingLoader
java.lang.ClassLoader.loadClass()
rvmFindLoadedClass
getLoadedClass (from hash)
rvmFindClassInClasspathForLoader
synchronized {
loadUserClass-loadClass-createClass()
rvmAllocateClass -> CLASS_STATE_ALLOCATED
rvmRegisterClass()
addLoadedClass (to hash map)
-> CLASS_STATE_LOADED
}
rvmInitialize
synchronized {
// expects CLASS_STATE_LOADED
// no ineresting as crash is happening here
}
Call flow looks straight forward and from logical flow it not expected to have CLASS_STATE_ALLOCATED
in rvmInitialize
.
Heavy load is concurrency sensitive and classic synchronization problem is here.
The issue is triggered by concurrent loading of same class from multiple threads.
Scenario is following:
Thread1:
- didn’t find loaded class;
- goes creating one in
createClass()
; - move class state to
CLASS_STATE_ALLOCATED
; - and register class into loaded classes hashmap by
addLoadedClass
; - moves class to
CLASS_STATE_LOADED
; - goes to
rvmInitialize()
Thread2 and trouble:
- tries to
rvmFindLoadedClass
while Thread1 somewhere between 4. and 5. - as class in hash map it picks it from there and takes a shortcuts directly to
rvmInitialize()
. - Assertion failed: (CLASS_IS_STATE_LOADED(clazz))
Problems is here:
rvmFindLoadedClass
allows not synchronized access to loaded class hash map. As result class in middle of initialization is returned as loaded (this actually a root case);rvmFindLoadedClass
should not return classes in state other thanCLASS_STATE_INITIALIZED
otherwisejava.lang.ClassLoader.loadClass()
might return not initialized classes.
The fix
Class* rvmFindLoadedClass(Env* env, const char* className, Object* classLoader) {
+ obtainClassLock();
Class* clazz = getLoadedClass(env, className);
+ releaseClassLock();
if (rvmExceptionOccurred(env)) return NULL;
+ if (clazz && !CLASS_IS_STATE_INITIALIZED(clazz)) return NULL;
return clazz;
}
Code
The fix was proposed as PR648
Tips how to catch such kinds of bugs
Using debug version of RoboVM
Native code has to be build in debug mode:
compiler/vm/build.sh –build=debug
To use debug libraries RoboVM to be run in debug mode by specifying -DROBOVM_DEV_ROOT={path to compiler dir}
dev root home;
Ability to reproducing
Crash rate 1 from 100 might be too boring for manual reproduction (which might not happen in debug mode). To get proof is useful to run an automated batch to see if it reproducible. Following run and restart app in simulator multiple times:
for i in {1..500}; do echo “-“; echo “-“; echo $i ; timeout 4 xcrun simctl launch –console 7F276E93-4A20-49F8-944B-227386A0FA80 org.ergoplatform.ios ; done
NP: timeout
is part of coreutils
, so brew install coreutils
.
Logging
It’s not 100% reproducible, good approach is to put printf
/asserts in lines of interest and launch batch to get the crash. And analyze the output. Then put same in better places :)
Read code
When possible problem and code area is localised it might be enough to read the code.
Crash log symbolication
Console often fails to symbolicate crash as it messes with DSYMs in indexed location. It’s quicker to symbolicate crashes manually.
But apple moved to .ips
json-like format of crash logs and sadly symbolicatecrash
script is not able to handle it.
Quickest way is to use atos
:
Open .ips
crash report in Console.app
, find stack trace of interest:
Thread 20 Crashed:
0 libsystem_kernel.dylib 0x1cba39e60 __pthread_kill + 8
1 libsystem_pthread.dylib 0x1cba8d3c0 pthread_kill + 256
2 libsystem_c.dylib 0x1801023f4 abort + 124
3 libsystem_c.dylib 0x180101910 __assert_rtn + 268
4 ErgoWallet 0x103dba5c8 0x100858000 + 55977416
5 ErgoWallet 0x103dac4ac 0x100858000 + 55919788
6 ErgoWallet 0x103dac41c 0x100858000 + 55919644
7 ErgoWallet 0x102172554 0x100858000 + 26322260
8 ErgoWallet 0x10217245c 0x100858000 + 26322012
9 ErgoWallet 0x1014bd54c 0x100858000 + 12997964
10 ErgoWallet 0x10215af84 0x100858000 + 26226564
Here 0x100858000
in second column is the image load address, and first column is a stacktrace addresses.
Switch to robovm-build/tmp/Unnamed/ios/arm64-simulator
where ErgoWallet.app.dSYM
is located and run:
atos -o ErgoWallet.app.dSYM/Contents/Resources/DWARF/ErgoWallet -l {load_address} {stacktrace1} {stacktrace2} …
e.g.
atos -o ErgoWallet.app.dSYM/Contents/Resources/DWARF/ErgoWallet -l 0x100858000 0x103dba5c8 0x103dac4ac 0x103dac41c 0x102172554
Using Xcode
It will catch abort
and will allow to evaluate stacktraces/thread/data content. Very useful.
Option one: attach to running process
- Open/Create any dummy iOS project in Xcode;
- Make sure robovm app is running in simulator;
- Check that exactly same simulator is selected in Xcode;
- use menu “Debug/Attach to Process by PID or Name”;
- enter image name, e.g.
ErgoWallet
. And attach; - wait for crash, set symbolic breakpoints whatever.
Doesn’t work when app crashes on startup.
Option two: make xcode to run pre-build application
- build and run robovm app on simulator (e.g. binary to be build);
- open Product->Scheme->Edit scheme… dialog;
- in info tab, click
Executable
, selectOther
; - in directory picked dialog navigate to
robovm-build/tmp/Unnamed/ios/arm64-simulator
and select application, e.g.ErgoWallet.app
.
Now when Run
is hit, Xcode will build dummy project but deploy and launch robovm one.
Happy coding !
Comments