Reliable Identification of SQLCipher Databases

Occasionally we receive questions from developers that have a need to conclusively identify a SQLCipher database in order to support some specific functionality in their application. It’s quite easy to identify standard SQLite databases because each database file starts with a header string of “SQLite format 3\000”. However, identification of SQLCipher databases is not trivial because the database header is 16 bytes of random data that varies between individual databases. This data is used as a random salt for SQLCipher’s key derivation process. Therefore our typical response is that it’s not possible to conclusively identify a SQLCipher database. A recent question from a SQLCipher developer prompted us to brainstorm an alternate approach.

Based on some work we committed in SQLCipher 3 to facilitate encrypted database attachment, it is possible to pass an extended key specification to SQLCipher using raw hex key semantics. In this case, the first 64 characters of the raw key string are used directly as the key value without key derivation, while the final 32 characters of the raw key string are used as the salt header. This can be used to control the first 16 bytes of the SQLCipher database file to ensure that a database has a predictable header value. Here is an example:

sjlombardo@vlindev2:~/Documents/code/sqlcipher$ ./sqlcipher test.db
sqlite> PRAGMA key = "x'2DD29CA851E7B56E4697B0E1F08507293D761A05CE4D1B628663F411A8086D993D761A05CE4D1B628663F411A8086D99'";
sqlite> create table t1(a,b);
sqlite> insert into t1(a,b) values (1,2);
sqlite> .q

sjlombardo@vlindev2:~/Documents/code/sqlcipher$ ./sqlcipher test2.db
sqlite> PRAGMA key = "x'7293D761A05CE4D1B628663F411A8086D992DD29CA851E7B56E4697B0E1F08503D761A05CE4D1B628663F411A8086D99'";
sqlite> create table t1(a,b);
sqlite> insert into t1(a,b) values (1,2);
sqlite> .q

sjlombardo@vlindev2:~/Documents/code/sqlcipher$ ./sqlcipher test.db
sqlite> PRAGMA key = "x'2DD29CA851E7B56E4697B0E1F08507293D761A05CE4D1B628663F411A8086D99'";
sqlite> .schema
CREATE TABLE t1(a,b);
sqlite> select * from t1;
1|2
sqlite> .q

sjlombardo@vlindev2:~/Documents/code/sqlcipher$ ./sqlcipher test2.db
sqlite> PRAGMA key = "x'7293D761A05CE4D1B628663F411A8086D992DD29CA851E7B56E4697B0E1F0850'";
sqlite> .schema
CREATE TABLE t1(a,b);
sqlite> select * from t1;
1|2
sqlite> .q

Notice that after creation, even though the databases have different keys, the header values are identical:

sjlombardo@vlindev2:~/Documents/code/sqlcipher$ hexdump -C test.db | head -2
00000000  3d 76 1a 05 ce 4d 1b 62  86 63 f4 11 a8 08 6d 99  |=v...M.b.c....m.|
00000010  a6 dc 8b 2f 31 3b 76 52  e4 fd 2e 8c e4 ad 80 63  |.../1;vR.......c|

sjlombardo@vlindev2:~/Documents/code/sqlcipher$ hexdump -C test2.db | head -2
00000000  3d 76 1a 05 ce 4d 1b 62  86 63 f4 11 a8 08 6d 99  |=v...M.b.c....m.|
00000010  ff db 83 8e d7 a1 99 41  c9 2d 53 92 3d ac c4 89  |.......A.-S.=...|

We’d envision that this approach could be used in an implementation to ensure that encrypted databases created by a given application have a know header. This could then be inspected by the application, without invoking SQLCipher, to determine whether a given file is an application encrypted database.

This approach should work well for applications that need this functionality, but it’s important to note that the use of the raw key string in this approach will bypass SQLCipher’s key derivation process. This means that an application would be responsible for performing key derivation on its own, or have some other a means to provide suitable key material.

Finally, it’s worth noting that experimental changes that may be incorporated into SQLCipher 4 would add a predictable header and metadata to SQLCipher files directly, so this approach may become unnecessary in future versions.