Crash on Android N, SQLCipher 3.5.2


#1

I received a crash report from a user on Android N (Nexus 5X) that I believe is a SQLCipher issue. We’ve seen some other I/O issues (networking-related) in our app related to resume from app-standby or doze mode on Android M and N, so I would guess that this is likely related.

Not much else to add, but happy to answer any other questions I can. Thanks!

Exception java.util.concurrent.TimeoutException: android.content.ContentResolver$CursorWrapperInner.finalize() timed out after 10 seconds
java.lang.Object.wait (Object.java)
java.lang.Thread.parkFor$ (Thread.java:2127)
sun.misc.Unsafe.park (Unsafe.java:325)
java.util.concurrent.locks.LockSupport.park (LockSupport.java:161)
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt (AbstractQueuedSynchronizer.java:840)
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued (AbstractQueuedSynchronizer.java:873)
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire (AbstractQueuedSynchronizer.java:1197)
java.util.concurrent.locks.ReentrantLock$FairSync.lock (ReentrantLock.java:200)
java.util.concurrent.locks.ReentrantLock.lock (ReentrantLock.java:262)
net.sqlcipher.database.SQLiteDatabase.lock (SQLiteDatabase.java:496)
net.sqlcipher.database.SQLiteProgram.close (SQLiteProgram.java:294)
net.sqlcipher.database.SQLiteQuery.close (SQLiteQuery.java:136)
net.sqlcipher.database.SQLiteCursor.close (SQLiteCursor.java:510)
android.database.CursorWrapper.close (CursorWrapper.java:50)
android.database.CursorWrapper.close (CursorWrapper.java:50)
android.content.ContentResolver$CursorWrapperInner.close (ContentResolver.java:2630)
android.content.ContentResolver$CursorWrapperInner.finalize (ContentResolver.java:2641)
java.lang.Daemons$FinalizerDaemon.doFinalize (Daemons.java:222)
java.lang.Daemons$FinalizerDaemon.run (Daemons.java:209)
java.lang.Thread.run (Thread.java:761)


#2

Hi @ncantelmo

Some additional information would be helpful in diagnosing the problem you are seeing. A few questions:

  • What query operation is taking place within your application when this occurs?
  • Is your SQL operation running within a transaction where the exception occurs?
  • If you are using transactions at this point, how many transactions are operating concurrently when this error occurs?
  • How much data are you operating on when this error occurs?
  • Are you able to reproduce this issue with a test case within the SQLCipher for Android test suite?

Thanks!


#3

Thanks for the quick response!

I don’t have many exact details about the surrounding application state at the moment, but I do know that this was a user with a lot of historic data signing in on a new device for the first time. That would result in a big data sync in the background, including a large (10000+ record) batch inserts within a transaction. Unfortunately, I can’t say for sure that this was the particular query running at that time.

Not sure if that’s enough detail to start with, but I’ll see what I can do to reproduce the issue regardless.


#4

Hi again,

TL;DR: The original issue was probably not caused by SQLCipher, but I can’t say for sure.

I was able to capture a method trace today for some slow app behavior that I believe could help to explain the earlier crash.

The issue I dug into today was caused by a background service iteratively saving a bunch of newly received record updates, causing a ContentProvider to send out data-changed notifications on each one, which in turn caused a CursorAdapter/Loader to immediately restart a query to get the latest view on the data. This produced a sort of trashing behavior, causing the background service to run hard, the database updates to take a long time, and the UI to not be updated efficiently.

The method trace revealed that the background service was spending a bunch of time on the code path:

net.sqlcipher.database.SQLiteDatabase.update
net.sqlcipher.database.SQLiteDatabase.updateWithOnConflict
net.sqlcipher.database.SQLiteDatabase.lock

java.lang.Object.wait

Each wait was ~200-350ms, and was invoked 24x over a 12-second trace window.

Meanwhile, the main thread produced a trace that included 23 instances of:

android.support.v4.content.CursorLoader.onCanceled
android.content.ContentResolver$CursorWrapperInner.close
android.database.CursorWrapper.close
net.sqlcipher.database.SQLiteCursor.close

net.sqlcipher.database.SQLiteDatabase.lock

java.lang.Object.wait

On the main thread, the waits were each about 25ms.

The data set for the behavior shown above was a ~40000-row, indexed table joined with a few other tables using a view, queried using both a LIMIT and OFFSET clause to limit the results to the 100 most recent records. This is a smaller data set than I believe would have been encountered in the previous (crash) scenario from a few weeks ago, and in that version of the app the LIMIT and OFFSET clauses were not included in the query.

So although the Object.wait() times for the SQLiteCursor.close() code path was relatively small in this instance, it could have been substantially longer in the previous case (although the 10 second wait timeout that caused the original crash still seems like a stretch).

That said, I don’t yet have an explanation for how the background/foreground thrashing described above could have caused a finalizer daemon to run into an issue. The CursorLoader explicitly closes the cursors it uses, and we certainly try to do so everywhere as well. My original though that the issue could have been caused by app-standby on Marshmallow is complicated by the fact that this thrashing scenario shouldn’t happen unless the app is in the foreground. I failed to reproduce the original issue using the test suite, so I don’t have much else to work with.

For now I’m going on the theory that the original issue was caused by inefficiencies in our app’s DB use paired with a cursor that is somehow not explicitly being closed. Since it’s a very rare issue with clear steps that we can take to improve our DB availability, I’m not planning to spend more time on it (directly) unless we start seeing it show up again in our crash logs.