How Core Data Saves Data in SQLite

Core Data is an object graph framework with data persistence capabilities. The way the same object graph is organized can vary greatly between persistent store types (SQLite, XML). If you have ever browsed through the SQLite database file generated by Core Data, you will have seen many strange tables and fields. This article introduces these tables and fields, which may help clear up some common points of confusion, such as why Core Data does not require primary keys, how NSManagedObjectID is constructed, and what save conflict detection is based on.

How to get the SQLite database file of Core Data

There are several ways to get the SQLite database file generated by Core Data:

  • Get the storage location of the file directly

Printing the location of the persistent store directly from code is the most direct and efficient way to obtain it. This code usually lives in the Core Data stack (for more information about the stack, please refer to Mastering Core Data Stack):

Swift
container.loadPersistentStores(completionHandler: { _, error in
    if let error = error as NSError? {
        fatalError("Unresolved error \(error), \(error.userInfo)")
    }
})

#if DEBUG
// If you have multiple stores saved in different directories, you need to print them out one by one
if let url = container.persistentStoreCoordinator.persistentStores.first?.url {
    print(url)
}
#endif

https://cdn.fatbobman.com/image-20220528103822780.png

Use the shortcut key (⇧⌘ G) or the menu command (Go to Folder) in Finder to directly navigate to the location of the file.

https://cdn.fatbobman.com/image-20220528103959218.png

  • Enable Debugging Parameters

If Core Data’s debugging output is enabled in your project (for example, through a launch argument such as the one below), the database path appears near the top of the debugging output.

Swift
-com.apple.CoreData.CloudKitDebug 1

For more information about debugging parameters, please refer to Core Data with CloudKit: Troubleshooting.

  • Find by breakpoint

During the application execution, pause the program execution with any breakpoint, and enter the following command in the debugging window to obtain the root path of the application in the sandbox.

Swift
po NSHomeDirectory()

  • Third-party tools

Some third-party tools (such as RocketSim) offer the functionality to directly access the App directory in the simulator.

https://cdn.fatbobman.com/rocketSim_get_URL.png

It is recommended that readers continue reading with an open SQLite database file generated by Core Data.

Basic Tables and Fields

Basic tables and fields are those that Core Data creates in the SQLite database to provide its basic functionality when no additional features (e.g. persistent history tracking, Core Data with CloudKit) are enabled: the non-entity tables, and the special fields added to entity tables.

Tables Corresponding to Entities

The following figure shows the database structure of a project created with Xcode Core Data template (with only one entity “Item” and one attribute “timestamp”), where the table corresponding to the entity “Item” in SQLite is ZITEM.

https://cdn.fatbobman.com/tableAndFieldInCoreData_tableList1.png

Core Data applies the following rules when converting entities in the data model to SQLite format:

  • The name of the table corresponding to an entity is Z + the entity name (all uppercase). In this example, it is ZITEM.
  • The field corresponding to an attribute in the entity is Z + the attribute name (all uppercase). In this example, it is ZTIMESTAMP.
  • If several attributes share the same uppercase name (attribute names are case-sensitive in the data model), a numeric suffix is appended to all but one of them. For example, if Item has two attributes called timestamp and timeStamp, two fields will be created in the table: ZTIMESTAMP and ZTIMESTAMP1.
  • Three special fields are added to each entity table: Z_PK, Z_ENT, and Z_OPT (all of INTEGER type).
  • If the entity definition contains a relationship, a corresponding field will be created in the entity table or a corresponding intermediate relationship table will be created (see details below).
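
The naming rules above can be sketched in a few lines of Swift. This only illustrates the rules as described here and is not Core Data's actual implementation: tableName(forEntity:) and columnNames(forAttributes:) are hypothetical helpers, and the order in which numeric suffixes are handed out is an assumption.

```swift
/// Hypothetical helper: SQLite table name for an entity ("Z" + uppercased name).
func tableName(forEntity name: String) -> String {
    return "Z" + name.uppercased()
}

/// Hypothetical helper: SQLite column names for a list of attribute names.
/// Names that collide after uppercasing receive a numeric suffix (1, 2, ...).
/// Assumption: suffixes are handed out in the order the attributes are listed.
func columnNames(forAttributes names: [String]) -> [String] {
    var seen: [String: Int] = [:]   // uppercased column name -> times seen so far
    return names.map { name -> String in
        let base = "Z" + name.uppercased()
        let count = seen[base, default: 0]
        seen[base] = count + 1
        return count == 0 ? base : base + String(count)
    }
}

print(tableName(forEntity: "Item"))                            // ZITEM
print(columnNames(forAttributes: ["timestamp", "timeStamp"]))  // ["ZTIMESTAMP", "ZTIMESTAMP1"]
```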

Z_ENT Field

Each entity table is registered in the Z_PRIMARYKEY table (details below). The Z_ENT value of a record equals the Z_ENT registered there for its entity, so it can be viewed as the ID of the table.

Z_PK Field

An integer that starts from 1 and increments by 1. It can be viewed as the primary key of the table. Z_PK + Z_ENT (primary key + table ID) is the key for Core Data to find specific entries in a particular SQLite data file.

Z_OPT Field

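The version number of the data record. Each modification to the data increments this value by one. Comparing this value at save time lets Core Data detect that a record was changed elsewhere after it was read, which is the basis for determining save conflicts.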
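
To make the role of a record version number more concrete, here is a minimal sketch of version-based (optimistic) conflict detection. It assumes that a save is only accepted when the stored version still equals the version the caller read earlier. VersionedRow, SaveOutcome and save(_:expectedOpt:into:) are hypothetical names, and this is a simplified model rather than Core Data's implementation.

```swift
/// Simplified stand-in for one row of an entity table.
struct VersionedRow {
    var opt: Int        // plays the role of Z_OPT
    var value: String   // stands in for the attribute columns
}

enum SaveOutcome: Equatable {
    case saved(newOpt: Int)
    case conflict(storedOpt: Int)
}

/// Hypothetical save: succeeds only if the row still has the version the
/// caller originally read, then increments the version by one.
func save(_ newValue: String, expectedOpt: Int, into row: inout VersionedRow) -> SaveOutcome {
    guard row.opt == expectedOpt else {
        return .conflict(storedOpt: row.opt)
    }
    row.value = newValue
    row.opt += 1
    return .saved(newOpt: row.opt)
}

var row = VersionedRow(opt: 1, value: "a")
let firstSave = save("b", expectedOpt: 1, into: &row)    // .saved(newOpt: 2)
let secondSave = save("c", expectedOpt: 1, into: &row)   // .conflict(storedOpt: 2)
```

The second call fails because it still expects version 1, while the first call has already moved the row to version 2.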

Z_PRIMARYKEY Table

The Z_PRIMARYKEY table is the foundation for locating data via Z_PK + Z_ENT. Its main functions are:

  • Registering tables created by Core Data in SQLite (all tables that need to be located via Z_PK + Z_ENT, excluding Z_PRIMARYKEY, Z_METADATA, and Z_MODELCACHE)
  • Marking the inheritance relationship between a parent (abstract) entity and its sub-entities
  • Recording the names of entities (as defined in the data model)
  • Recording the current maximum Z_PK value used for each registered table

Z_ENT Field

The ID of the table. Entity tables start from number 1, while tables created for other system functions start from number 16000. The following diagram shows the correspondence between Z_ENT in the Memo table and the Z_ENT field recorded in the Z_PRIMARYKEY table.

https://cdn.fatbobman.com/tableAndFieldInCoreData_z_ent_1.png

https://cdn.fatbobman.com/tableAndFieldInCoreData_z_ent_2.png

Z_NAME Field

The name of the entity in the data model (case-sensitive), used for reverse lookup of corresponding data from the URL (see specific application below).

Z_SUPER Field

If the entity is a sub-entity of another entity (a parent or abstract entity), this value is the Z_ENT of its parent entity. 0 indicates that the entity has no parent entity. The following figure shows the values of Z_SUPER when Item is an abstract entity and ItemSub is its sub-entity.

https://cdn.fatbobman.com/tableAndFieldInCoreData_z_super_1.png

https://cdn.fatbobman.com/tableAndFieldInCoreData_z_super_2.png

Z_MAX Field

Marks the last used Z_PK value for each registered table. When creating a new record, Core Data looks up the entity’s last used Z_PK value (Z_MAX) in the Z_PRIMARYKEY table, adds one to it, uses the result as the new record’s Z_PK, and writes it back as the entity’s new Z_MAX.
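
The allocation step can be sketched with a small in-memory model of the Z_PRIMARYKEY table. PrimaryKeyEntry and PrimaryKeyRegistry are hypothetical types used only for illustration. The real work happens inside Core Data and SQLite.

```swift
/// Simplified stand-in for one row of Z_PRIMARYKEY.
struct PrimaryKeyEntry {
    let ent: Int        // Z_ENT
    let name: String    // Z_NAME
    let superEnt: Int   // Z_SUPER (0 = no parent entity)
    var max: Int        // Z_MAX
}

/// Hypothetical in-memory model of the Z_PRIMARYKEY table, keyed by Z_NAME.
struct PrimaryKeyRegistry {
    var entries: [String: PrimaryKeyEntry] = [:]

    mutating func register(_ entry: PrimaryKeyEntry) {
        entries[entry.name] = entry
    }

    /// Returns Z_MAX + 1 for the entity and stores it back as the new Z_MAX.
    mutating func nextPrimaryKey(forEntity name: String) -> Int? {
        guard var entry = entries[name] else { return nil }
        entry.max += 1
        entries[name] = entry
        return entry.max
    }
}

var registry = PrimaryKeyRegistry()
registry.register(PrimaryKeyEntry(ent: 1, name: "Item", superEnt: 0, max: 0))
let firstPK = registry.nextPrimaryKey(forEntity: "Item")    // 1
let secondPK = registry.nextPrimaryKey(forEntity: "Item")   // 2
```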

Z_METADATA Table

The Z_METADATA table records information about the current SQLite file, including version, identifier, and other metadata.

Z_UUID Field

The identifier (UUID type) of the current database file. This value can be obtained through the persistent store coordinator. When an NSManagedObjectID is converted to a storable URL, this value represents the corresponding persistent store.
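
As a rough sketch, the identifier can be read from the store's metadata under the NSStoreUUIDKey key. The helper name storeUUID(of:in:) is hypothetical, and the usage comment assumes the same container variable as the other snippets in this article.

```swift
import CoreData

/// Hypothetical helper: returns the UUID recorded in the metadata of a persistent store.
func storeUUID(of store: NSPersistentStore,
               in coordinator: NSPersistentStoreCoordinator) -> String? {
    // NSStoreUUIDKey is the metadata key that holds the store's UUID.
    return coordinator.metadata(for: store)[NSStoreUUIDKey] as? String
}

// Usage (assuming `container` is a loaded NSPersistentContainer):
// let coordinator = container.persistentStoreCoordinator
// if let store = coordinator.persistentStores.first {
//     print(storeUUID(of: store, in: coordinator) ?? "no UUID found")
// }
```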

Z_PLIST Field

Stores metadata about persistent storage in Plist format (excluding the persistent storage UUID identifier). Developers can read or add data through the persistent storage coordinator. If necessary, developers can also save data unrelated to the database in it (which can be considered as an alternative usage of the Core Data database file to save program configurations).

Swift
let coordinator = container.persistentStoreCoordinator
guard let store = coordinator.persistentStores.first else {
    fatalError("No persistent store is loaded")
}
var metadata = coordinator.metadata(for: store) // Get metadata (Z_PLIST + Z_UUID)
metadata["Author"] = "fat" // Add new metadata
store.metadata = metadata

// Except when a new persistent store is being created, metadata changes are only
// persisted by an explicit call to the context's save method
try! container.viewContext.save()

The following figure shows the situation where the data in Z_PLIST (in BLOB format) is exported in Plist format:

https://cdn.fatbobman.com/tableAndFieldInCoreData_z_plist.png

Z_VERSION Field

Its specific purpose is unknown (presumably the version of Core Data’s SQLite format). The value is always 1.

Z_MODELCACHE Table

Core Data keeps the signature of the current data model version in Z_PLIST (in the Z_METADATA table). However, because the content of Z_PLIST can be changed, Core Data also saves a cached copy of the data model that corresponds to the current SQLite data in the Z_MODELCACHE table (a variant of mom or omo), so that it can make sure the data model used by the application is completely consistent with the SQLite file.

The cache data in Z_MODELCACHE and the data model signature in metadata together provide assurance for data model version validation and version migration.

Gains from Database Structure

With a basic understanding of the tables and fields in SQLite, some questions that puzzle Core Data developers can now be answered.

Why Primary Key is Not Required

Core Data automatically assigns an auto-incrementing primary key (Z_PK) to each new record, using the Z_MAX value registered for the entity table. Therefore, when defining a data model in Core Data, developers do not need to define a primary key attribute for the entity (in fact, they cannot create an auto-increment primary key either).

The Composition of NSManagedObjectID

The NSManagedObjectID of a managed object is composed of the database ID, table ID, and primary key in the entity table. In SQLite, these correspond to the Z_UUID, Z_ENT, and Z_PK fields. By converting the NSManagedObjectID to a URL that can be stored, its composition can be clearly displayed.

Swift
let url = itemSub.objectID.uriRepresentation()

https://cdn.fatbobman.com/tableAndFieldInCoreData_nsmanagedObjectID_url.png
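
To make the three parts easier to see, the sketch below splits such a URL string into store UUID, entity name, and primary key. ObjectURIParts and parseObjectURI are hypothetical names. The parser only handles the x-coredata://UUID/Entity/pN shape used in this article and is meant for inspection only; in real code, let the persistent store coordinator do the conversion.

```swift
/// The three components visible in a stored object URI.
struct ObjectURIParts: Equatable {
    let storeUUID: String    // corresponds to Z_UUID
    let entityName: String   // corresponds to Z_NAME (and through it, Z_ENT)
    let primaryKey: Int      // corresponds to Z_PK
}

/// Hypothetical parser for URIs shaped like x-coredata://<UUID>/<Entity>/p<PK>.
/// Returns nil for anything that does not match this shape.
func parseObjectURI(_ uri: String) -> ObjectURIParts? {
    let prefix = "x-coredata://"
    guard uri.hasPrefix(prefix) else { return nil }
    let parts = uri.dropFirst(prefix.count).split(separator: "/")
    guard parts.count == 3,
          parts[2].hasPrefix("p"),
          let pk = Int(parts[2].dropFirst()) else { return nil }
    return ObjectURIParts(storeUUID: String(parts[0]),
                          entityName: String(parts[1]),
                          primaryKey: pk)
}

let parsed = parseObjectURI("x-coredata://E8B22CEA-8316-45E7-BC08-3FBA516F962C/ItemSub/p1")
// parsed?.storeUUID == "E8B22CEA-8316-45E7-BC08-3FBA516F962C"
// parsed?.entityName == "ItemSub", parsed?.primaryKey == 1
```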

The combination of information from files (persistent storage), tables, and rows will also help Core Data convert from a URL to the corresponding managed object.

Swift
let url = URL(string:"x-coredata://E8B22CEA-8316-45E7-BC08-3FBA516F962C/ItemSub/p1")!

if let objectID = container.persistentStoreCoordinator.managedObjectID(forURIRepresentation: url) {
    if let itemSub = container.viewContext.object(with: objectID) as? ItemSub {
        ...
    }
}

For more information on converting from URL to managed object, please refer to Showcasing Core Data in Applications with Spotlight.

How to identify relationships in a database

Because Z_ENT + Z_PK is enough to locate a record within the same database, Core Data uses this pair to mark relationships between entities. To save space, Core Data stores only the Z_PK of each related record, while the Z_ENT is looked up in the Z_PRIMARYKEY table according to the data model.

The rules for creating relationships in the database are:

  • One-to-Many

    No new fields are created on the “one” side, while a new field is created on the “many” side, corresponding to the Z_PK value of the “one” side. The field name is Z + relationship name (uppercase).

  • One-to-One

    New fields are added on both ends of the relationship, corresponding to the Z_PK values of the corresponding data.

  • Many-to-Many

    No new fields are added on either end of the relationship. Instead, a new table representing the many-to-many relationship is created, and the Z_PK values of the two sides of the relationship are added to the table row by row.

    In the figure below, Item and Tag have a many-to-many relationship, and Core Data creates the Z_2TAGS table to manage the relationship data.

https://cdn.fatbobman.com/image-20220528162005978.png

When abstract entities are enabled, in addition to recording the Z_PK value that corresponds to the relationship data, a field is also added to record which Z_ENT the data belongs to specifically (parent entity or a certain sub-entity).
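
The two most common storage shapes can be modeled with plain Swift values. This is a simplified sketch: ChildRow, JoinRow and their property names are hypothetical and do not reflect the real column names that Core Data generates.

```swift
// One-to-many: each row on the "many" side carries the Z_PK of its owner.
struct ChildRow {
    let pk: Int          // Z_PK of the row on the "many" side
    let parentPK: Int?   // Z_PK of the related row on the "one" side (nil = no relation)
}

// Many-to-many: a separate table holds one pair of Z_PK values per row.
struct JoinRow {
    let itemPK: Int      // Z_PK of the row on one side (e.g. Item)
    let tagPK: Int       // Z_PK of the row on the other side (e.g. Tag)
}

/// Resolves a one-to-many relationship from the "one" side.
func children(ofParent pk: Int, in rows: [ChildRow]) -> [Int] {
    return rows.filter { $0.parentPK == pk }.map { $0.pk }
}

/// Resolves a many-to-many relationship through the intermediate table.
func tagPKs(forItem pk: Int, in joinTable: [JoinRow]) -> [Int] {
    return joinTable.filter { $0.itemPK == pk }.map { $0.tagPK }
}

let childRows = [ChildRow(pk: 10, parentPK: 1), ChildRow(pk: 11, parentPK: 2), ChildRow(pk: 12, parentPK: 1)]
let joinRows = [JoinRow(itemPK: 1, tagPK: 3), JoinRow(itemPK: 2, tagPK: 3), JoinRow(itemPK: 1, tagPK: 4)]
print(children(ofParent: 1, in: childRows))   // [10, 12]
print(tagPKs(forItem: 1, in: joinRows))       // [3, 4]
```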

Table for Persistent History Tracking

In Core Data, if the data is stored in SQLite (which is what most developers use) and persistent history tracking is enabled, any change to the data in the database (deletion, addition, modification, etc.) triggers a “database changes” system notification to every application that uses the database and has registered for the notification.

In recent years, with the adoption of App Groups, widgets, Core Data with CloudKit, Core Data in Spotlight, and other features, more and more Core Data applications have actively or passively enabled the persistent history tracking option. After this option is enabled (desc.setOption(true as NSNumber, forKey: NSPersistentHistoryTrackingKey)), Core Data creates three new tables in SQLite to manage and record transactions, and registers these three tables in the Z_PRIMARYKEY table.

For more detailed information about persistent history tracking, please refer to Using Persistent History Tracking in CoreData.

https://cdn.fatbobman.com/tableAndFieldInCoreData_persistent_history_tracing_tables.png

https://cdn.fatbobman.com/image-20220528172620831.png

Z_ATRANSACTIONSTRING Table

In order to distinguish the source of transactions, the creator of a transaction needs to set the transaction author for the managed object context. Core Data gathers all transaction author information in the Z_ATRANSACTIONSTRING table.

Swift
container.viewContext.transactionAuthor = "fatbobman"

If the developer has also set a name for the context, Core Data will create a record for that context name.

Swift
container.viewContext.name = "viewContext"

https://cdn.fatbobman.com/tableAndFieldInCoreData_atransactionString.png

Core Data also creates default author records for some other system functions. These transactions generated by system authors should be ignored when handling transactions.

The meanings of Z_PK and Z_ENT are consistent with those mentioned above and will not be repeated in the following text.

Z_ATRANSACTION Table

A transaction in persistent history tracking can be understood as one persistence operation in Core Data (such as a call to a context’s save method). Core Data saves information related to a transaction in the Z_ATRANSACTION table. The most important pieces of information are the time the transaction was created and the transaction author.

https://cdn.fatbobman.com/image-20220528174541292.png

ZAUTHORTS Field

Corresponds to Z_PK of transaction author in Z_ATRANSACTIONSTRING table. In the above image, it corresponds to fatbobman whose Z_PK in Z_ATRANSACTIONSTRING is 1.

ZCONTEXTNAMETS Field

If a name is set for the context that created the transaction, this field corresponds to the Z_PK record of the context name in the Z_ATRANSACTIONSTRING table. In the above image, it corresponds to viewContext.

ZTIMESTAMP Field

The creation time of the transaction.

ZQUERYGEN Field

If the managed object context is pinned to a query generation (NSQueryGenerationToken), the transaction record also saves the query generation token of that moment in the ZQUERYGEN field (BLOB type).

Swift
try? container.viewContext.setQueryGenerationFrom(.current)

Z_ACHANGE Table

A transaction usually contains several data operations (create, modify, delete). Core Data stores each data operation in the Z_ACHANGE table and associates it with a specific transaction through that transaction’s Z_PK.

https://cdn.fatbobman.com/tableAndFieldInCoreData_change.png

ZCHANGETYPE Field

Data operation type: 0 for new, 1 for update, 2 for delete

ZENTITY Field

Z_ENT of the corresponding entity table for the operation

ZENTITYPK Field

Z_PK of the corresponding data record in the entity table for the operation

ZTRANSACTIONID Field

Z_PK of the transaction corresponding to the operation in the Z_ATRANSACTION table
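
Put together, one Z_ACHANGE row can be read roughly as in the sketch below. ChangeKind, ChangeRow and describe(_:entityNames:) are hypothetical names. Only the four fields discussed above are modeled, and the entityNames dictionary stands in for a lookup of Z_NAME by Z_ENT in Z_PRIMARYKEY.

```swift
/// Values of ZCHANGETYPE as listed above.
enum ChangeKind: Int {
    case insert = 0
    case update = 1
    case delete = 2
}

/// Simplified stand-in for one row of Z_ACHANGE.
struct ChangeRow {
    let changeType: Int      // ZCHANGETYPE
    let entity: Int          // ZENTITY: Z_ENT of the affected entity table
    let entityPK: Int        // ZENTITYPK: Z_PK of the affected record
    let transactionID: Int   // ZTRANSACTIONID: Z_PK of the owning transaction

    var kind: ChangeKind? { return ChangeKind(rawValue: changeType) }
}

/// Hypothetical helper that turns a row into a readable sentence.
func describe(_ row: ChangeRow, entityNames: [Int: String]) -> String {
    let name = entityNames[row.entity] ?? "unknown entity"
    let verb: String
    switch row.kind {
    case .some(.insert): verb = "inserted"
    case .some(.update): verb = "updated"
    case .some(.delete): verb = "deleted"
    case .none: verb = "changed (unknown type)"
    }
    return "\(name) p\(row.entityPK) \(verb) in transaction \(row.transactionID)"
}

let sampleChange = ChangeRow(changeType: 2, entity: 1, entityPK: 5, transactionID: 3)
print(describe(sampleChange, entityNames: [1: "Item"]))   // Item p5 deleted in transaction 3
```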

Understanding Persistent History Tracking from the SQLite Perspective

Creating Transactions

In Core Data, transactions for persistent history tracking are created automatically. The process is roughly as follows:

  • Get Z_MAX from Z_PRIMARYKEY table for Z_ATRANSACTION
  • Create a new transaction record in Z_ATRANSACTION using Z_PK (Z_MAX + 1) + Z_ENT (the corresponding Z_ENT in Z_PRIMARYKEY for the transaction table) + author ID + timestamp, and update Z_MAX
  • Get Z_MAX from Z_PRIMARYKEY table for Z_ACHANGE
  • Create data operation records one by one in Z_ACHANGE
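
The steps above can be sketched with an in-memory model. Everything here is hypothetical and simplified: HistoryStore, the "ATRANSACTION" and "ACHANGE" dictionary keys, and the Z_ENT value 16001 in the usage lines are placeholders for illustration, not values taken from a real database.

```swift
import Foundation

struct TransactionRecord {
    let pk: Int; let ent: Int; let authorPK: Int; let timestamp: Date
}
struct ChangeRecord {
    let pk: Int; let changeType: Int; let entity: Int; let entityPK: Int; let transactionID: Int
}

/// Hypothetical in-memory model of the tables touched when a transaction is recorded.
struct HistoryStore {
    var maxPK: [String: Int] = ["ATRANSACTION": 0, "ACHANGE": 0]   // plays the role of Z_MAX
    var transactions: [TransactionRecord] = []
    var changes: [ChangeRecord] = []

    /// Takes Z_MAX + 1 for the named table and writes it back as the new Z_MAX.
    mutating func nextPK(for table: String) -> Int {
        let next = maxPK[table, default: 0] + 1
        maxPK[table] = next
        return next
    }

    /// Records one transaction plus its operations.
    /// Each operation is a (changeType, entity Z_ENT, entity Z_PK) tuple.
    mutating func record(authorPK: Int, transactionEnt: Int, at date: Date,
                         operations: [(Int, Int, Int)]) -> Int {
        let txPK = nextPK(for: "ATRANSACTION")
        transactions.append(TransactionRecord(pk: txPK, ent: transactionEnt,
                                              authorPK: authorPK, timestamp: date))
        for (kind, entity, entityPK) in operations {
            let changePK = nextPK(for: "ACHANGE")
            changes.append(ChangeRecord(pk: changePK, changeType: kind, entity: entity,
                                        entityPK: entityPK, transactionID: txPK))
        }
        return txPK
    }
}

var history = HistoryStore()
let txPK = history.record(authorPK: 1, transactionEnt: 16001,   // 16001 is a placeholder Z_ENT
                          at: Date(timeIntervalSince1970: 0),
                          operations: [(0, 1, 1), (1, 1, 1)])   // one insert, one update of Item p1
```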

Querying Transactions

Since only the transaction creation timestamp is saved in the database, every query method (Date, NSPersistentHistoryToken, NSPersistentHistoryTransaction) is ultimately converted into a comparison of timestamps. A typical query looks for transactions that meet the following conditions and then collects their changes:

  • Timestamp later than the last query time of the current application
  • Author is not the author of the current app or other system function author
  • Get all Z_ACHANGE records that meet the above conditions
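
Expressed over simplified in-memory rows, the query could look like the sketch below. TrackedTransaction, TrackedChange and changesToMerge are hypothetical names. In a real application the filtering is done through Core Data's persistent history API rather than by reading the tables directly.

```swift
import Foundation

struct TrackedTransaction {
    let pk: Int; let author: String; let timestamp: Date
}
struct TrackedChange {
    let pk: Int; let transactionID: Int
}

/// Returns the changes that belong to transactions newer than `lastQuery`
/// whose author is not listed in `ignoredAuthors`
/// (the current app's own author and system function authors).
func changesToMerge(transactions: [TrackedTransaction],
                    changes: [TrackedChange],
                    after lastQuery: Date,
                    ignoredAuthors: Set<String>) -> [TrackedChange] {
    let wantedTransactionPKs = Set(
        transactions
            .filter { $0.timestamp > lastQuery && !ignoredAuthors.contains($0.author) }
            .map { $0.pk }
    )
    return changes.filter { wantedTransactionPKs.contains($0.transactionID) }
}

let sampleTransactions = [
    TrackedTransaction(pk: 1, author: "widget", timestamp: Date(timeIntervalSince1970: 100)),
    TrackedTransaction(pk: 2, author: "app", timestamp: Date(timeIntervalSince1970: 200)),
    TrackedTransaction(pk: 3, author: "widget", timestamp: Date(timeIntervalSince1970: 300))
]
let sampleChanges = [
    TrackedChange(pk: 1, transactionID: 1),
    TrackedChange(pk: 2, transactionID: 2),
    TrackedChange(pk: 3, transactionID: 3),
    TrackedChange(pk: 4, transactionID: 3)
]
let toMerge = changesToMerge(transactions: sampleTransactions, changes: sampleChanges,
                             after: Date(timeIntervalSince1970: 150), ignoredAuthors: ["app"])
// Only the changes of transaction 3 remain: it is newer than the last query and not authored by "app".
```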

Merging Transactions

The data operation records (Z_ACHANGE) extracted from the transaction contain complete operation types, corresponding instance data positions, and other information. Entity data (Z_PK + Z_ENT) is extracted from the database according to the information and merged (converted to NSManagedObjectID) into the specified context.

Deleting Transactions

  • Query and extract transactions with a timestamp earlier than the last query time of all authors (including the current application author, but excluding system function authors)
  • Delete the above transactions (Z_ATRANSACTION) and their corresponding operation data (Z_ACHANGE).
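
A minimal sketch of this cleanup over in-memory rows follows. The names deletionCutoff, purge, StoredTransaction and StoredChange are hypothetical, and the model assumes each author's last query time is tracked somewhere by the application.

```swift
import Foundation

struct StoredTransaction { let pk: Int; let timestamp: Date }
struct StoredChange { let pk: Int; let transactionID: Int }

/// Earliest "last query time" among all relevant authors. Transactions older
/// than this have already been processed by every author.
func deletionCutoff(lastQueryTimes: [String: Date]) -> Date? {
    return lastQueryTimes.values.min()
}

/// Removes transactions older than `cutoff` together with their change rows.
func purge(transactions: inout [StoredTransaction],
           changes: inout [StoredChange],
           before cutoff: Date) {
    let expiredPKs = Set(transactions.filter { $0.timestamp < cutoff }.map { $0.pk })
    transactions.removeAll { expiredPKs.contains($0.pk) }
    changes.removeAll { expiredPKs.contains($0.transactionID) }
}

var storedTransactions = [
    StoredTransaction(pk: 1, timestamp: Date(timeIntervalSince1970: 100)),
    StoredTransaction(pk: 2, timestamp: Date(timeIntervalSince1970: 300))
]
var storedChanges = [
    StoredChange(pk: 1, transactionID: 1),
    StoredChange(pk: 2, transactionID: 2)
]
let lastQueryTimes = ["app": Date(timeIntervalSince1970: 400),
                      "widget": Date(timeIntervalSince1970: 200)]
if let cutoff = deletionCutoff(lastQueryTimes: lastQueryTimes) {
    // cutoff is the widget's time (200), so only transaction 1 is old enough to delete
    purge(transactions: &storedTransactions, changes: &storedChanges, before: cutoff)
}
```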

Understanding the above process is very helpful for understanding the code of Persistent History Tracking Kit.

Other

If your application uses Core Data with CloudKit, you will get further surprises (😱) when browsing the SQLite data structure. Core Data will create more tables to handle synchronization with CloudKit. Considering the complexity and length of the tables, we will not continue to expand on them. However, with the foundation above, it is not very difficult to understand their purpose.

The following figure shows the system tables added to SQLite after enabling private database synchronization:

https://cdn.fatbobman.com/image-20220528201143040.png

These tables mainly record information about the CloudKit private domain, last synchronization time, last synchronization token, export operation log, import operation log, data to be exported, Core Data relationship mapping table with CloudKit, CKRecordName corresponding to local data, complete CKRecord mirror image of local data in the shared public database, and so on.

As Core Data functionality continues to increase, we may see more system function tables in the future.

Conclusion

The main purpose of this article is to summarize my scattered research from recent months for future reference. Even if you have completely mastered the external storage structure of Core Data, it is still best to avoid manipulating the database directly, as Apple may change the underlying implementation at any time.
