Page 78 of 98
Data migration
Sqlite_orm supports automatic schema migration to a certain degree but there are a few caveats. Currently you can make changes to the storage
schema and call storage.sync_schema(true) which will attempt to apply the current schema to the database . There are limits to what it can do and
currently we do not support data migration primitives like add_column(), drop_column(), etc. However, the sync_schema method will attempt to detect the
changes between the storage schema and the database schema and try to reconcile without data loss most of the time. There are however very clear
exceptions, and we are working on them.
First, not all attributes of a column nor attributes of a table are compared to the database schema. This is currently a limitation of the table_xinfo
pragma of sqlite3, not of sqlite_orm itself. Second, there is a common source of problems with the foreign key checking mechanism that makes it very difficult
to do backups of tables that need to be dropped and recreated. As it happens, when a table has dependent rows, it cannot normally be dropped. There are
only two solutions that we know of at present: one is to remove all foreign key constraints of tables towards the table at hand temporarily using a sqlite client
like SqliteStudio or DB Browser for Sqlite (see the section on tools). This will allow the backing up of the current table since that process requires a drop table
as one of its steps. The other method is controversial but used by these tools and some developers to simplify the process. It has to do with disabling FK
checking before doing the backup and restoring it immediately afterwards. This does not require to remove the foreign key constraints that target the table at
hand. For more detailed information about how to automate the schema migration process, please feel free to get in contact with the author (see my details at
the end of the document). If data preservation and schema evolution are important to you, you need this additional information.
What are the aspects of a column that are comparable to the database schema? First, whether the column is part of the primary key of the table.
Second, whether the column has a default value. Third, whether the column is nullable (i.e. if it accepts null values). Fourth, whether the column is hidden
(meaning the column is generated_always_as()). Period. All other changes, like what the default value of the column is, or what the generated value of the
column is, or whether the column is unique, or has a check constraint, are simply not detected when compared with the physical database schema. For a
change in any of these properties to be incorporated at the physical database, we require to drop and recreate the table. The easiest and more secure way to
provoke this behavior is to remove the primary key constraint of a table temporarily: this will ensure the drop and recreation of the table using a backup,
preserving the data. Putting aside how we deal with foreign key constraints, be it by means of sqlite client tools or at the database configuration level, the
essence of data preservation and schema evolution deals with provoking a drop and recreate of the table at hand with a backup process in place.
What actions on the storage schema are detected by sync_schema and how exactly does it respond? This next section deals with this topic.
Schema Actions Detected by sync_schema()
1. Adding a column to the storage schema that does not exist in the database
a. If the column has no default value nor is nullable nor is generated, then there is no way data in that table can be preserved. Just think about
it: what value could be inserted in that column for each existing row?
i. This is therefore strictly prohibited unless you do not care about losing the table’s data. If you wish to add such a column you must
first add a lossless column (see next) and then tweak its properties as you like (thus a two-step process is unavoidable)
b. If the column has a default value, is nullable or is generated
21
, then there will be an ALTER TABLE ADD COLUMN command that is efficient
and effective. There will be no loss of table’s data and no backup will be needed
i. sync_schema_simulate(true) will return
sync_schema_result::new_columns_added for that table
2. Removing a column from storage schema that exists in the database
a. An ALTER TABLE DROP COLUMN command will be issued and no data loss will occur
21
If generated_always_as().stored() then a drop and recreated will be triggered with backup so no data loss either