Tracking slipped NPE, but in Crashlytics v11.15.0
06 Oct 2025 | signal npe robovm bug crashlyticsFew months since last Tracking slipped NPE and here we go again:
- same user reported;
- same scenario (code that handles a lot of NPE in try/catch);
- this time with Crashlytics.
Time for dissection…
TLTR (spoiler)
- Crashlytics installs signals in background thread, so
Signal.install()
doesn’t preserve proper chain; - Due to ^^^^ it will report NPE inside try/catch as crash (and will not report crash after it);
- Its Mach exception handler is not protected against simultaneous events, and will crash if this happens;
- Its Signal handler is not protected against simultaneous events, will mess signal handler, and crash.
Environment
Well known and simple scenario:
Runnable NPE = () -> {
try {
((String) null).equals("hello");
} catch (NullPointerException ignored) {}
};
Thread t1 = new Thread(NPE);
Thread t2 = new Thread(NPE);
t1.start();
t2.start();
And well know setup code:
Signals.installSignals(() -> {
FIRApp.configure();
}, true);
Case 1: Signal handler
Once Signals.installSignals
is called with preservePorts = true
, all Mach handlers installed inside lambda are rolled back (sometimes, but details bellow).
In this case Signal handler code will handle NPE, and its code available in FIRCLSSignal.c.
It does:
- FIRCLSSignalHandler - handler itself
- –> FIRCLSSignalSafeRemoveHandlers – replaces signal handlers with
SIG_DFL
; - –> FIRCLSSignalRecordSignal – saves crash;
- –> FIRCLSSignalSafeInstallPreexistingHandlers:
- –> —> again calls FIRCLSSignalSafeRemoveHandlers
- –> —> restores previous signal handlers;
What wrong here:
- there is no multi-thread lock on handling signals, can happen simultaneous;
- non-atomic and messy restore to previous signal chain:
SIG_DFL
first, then previous;
And there is a crash in case of simultaneous NPE events, log looks as bellow:
FIRCLSSignalHandler 11
FIRCLSSignalHandler 11
FIRCLSSignalSafeRemoveHandlers 11
FIRCLSSignalSafeRemoveHandlers 11
FIRCLSSignalRecordSignal
FIRCLSSignalRecordSignal
FIRCLSSignalSafeInstallPreexistingHandlers 11
<crash>
and crash is inside FIRCLSSignalRecordSignal
as it is surprised by simultaneous nature of event and just kills the app (!!!!):
static void FIRCLSSignalRecordSignal(int savedErrno, siginfo_t *info, void *uapVoid) {
if (FIRCLSContextMarkAndCheckIfCrashed()) {
FIRCLSSDKLog("Error: aborting signal handler because crash has already occurred");
exit(1); /// <<<<<< CRASH HERE
return;
}
}
Case 2: Mach handler
Once Signals.installSignals
is called with preservePorts = false
Crashlytics will handle everything in Mach exception handler that has higher priority than signals FIRCLSMachException.c.
Due to nature of Mach message handling it is handled in one thread one-by-one and simultaneous mess as in case of FIRCLSSignalHandler
but again there is no protection against simultaneous event:
- FIRCLSMachExceptionServer - runs message receive loop;
- FIRCLSMachExceptionDispatchMessage - handles received message;
- –> FIRCLSMachExceptionUnregister - unregisters from next messages;
- –> FIRCLSSignalSafeInstallPreexistingHandlers - restores signals (same non-atomic as above);
- –> FIRCLSMachExceptionRecord - records crash.
Whats wrong here:
- simultaneous event will produce multiple messages that will be sent to server;
- server process these in single thread so no run condition will occur;
- by handling first message it unsubscribes from future messages due events;
- (and not expect anymore messages)
- as simultaneous happened before unsubscription it will be delivered to server.
- server will process unexpected second event and crash.
Crash happens in similar code due unexpected conditions (and it just kills the app):
static bool FIRCLSMachExceptionRecord(FIRCLSMachExceptionReadContext* context,
MachExceptionMessage* message) {
...
if (FIRCLSContextMarkAndCheckIfCrashed()) {
FIRCLSSDKLog("Error: aborting mach exception handler because crash has already occurred\n");
exit(1); /// <<<<<< CRASH HERE
return false;
}
...
}
Case 3: Why at all it handles JAVAs NPE as RoboVM should be the one ?
Following setup should restore RoboVMs power:
Signals.installSignals(() -> {
FIRApp.configure();
}, true);
But it doesn’t and randomly Crashlytics rule signals/mach. This happens due the way it installs handlers: FIRCLSContext.m
FBLPromise* FIRCLSContextInitialize(FIRCLSContextInitData* initData,
FIRCLSFileManager* fileManager) {
...
if (!_firclsContext.readonly->debuggerAttached) {
#if CLS_SIGNAL_SUPPORTED
dispatch_group_async(group, queue, ^{
FIRCLSSignalInitialize(&_firclsContext.readonly->signal);
});
#endif
#if CLS_MACH_EXCEPTION_SUPPORTED
dispatch_group_async(group, queue, ^{
FIRCLSMachExceptionInit(&_firclsContext.readonly->machException);
});
#endif
dispatch_group_async(group, queue, ^{
FIRCLSExceptionInitialize(&_firclsContext.readonly->exception,
&_firclsContext.writable->exception);
});
} else {
FIRCLSSDKLog("Debugger present - not installing handlers\n");
}
}
Everything is async, and probably will happen when Signals.installSignals
exits and will override RoboVM
handlers…
Bottom line and fix:
Crashlytics
is kind of strange
, have reported few issues there (without any hope):
Currently, it can be workaround with simple delay:
Signals.installSignals(() -> {
FIRApp.configure();
// FIXME: dirty workaround -- wait till Crashlytics install handler async
try { Thread.sleep(200); } catch (Exception e) {}
}, true);
Comments