New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove SQLite and support PostgreSQL only #355
Conversation
I'd really prefer not to change the entire schema (giving a huge amount of code churn) just because the moniker mapper doesn't work properly. After all, the mapper is customizable right? |
@edolstra: The mapper is customizable, but we'd still have a major inconsistency when it comes to identifier naming, because the code would refer to camelCased names while the database and schema would map them to lower-case names. So every time you work with database identifiers you'd have to look up the right name from the mapper and the SQL schema in case of raw SQL. Also we have a lot of code which bypasses the mapper (not only C++ code but also some Perl code). |
We now longer support SQLite (see 78974ab) and we already use a temporary PostgreSQL database during shell setup. So let's remove all conditionals about SQLite and all dependencies. However, we now have the problem that DBIx classes are now generated all-lowercase when running a DBIx update. The reason for this is that if you don't quote table names in SQL, the identifier is folded to lower-case and to upper-case according to the SQL standard[1]. SQLite on the other hand only matches case- insensitively but on a schema dump the tables are with cases. Let's illustrate this by example, on PostgreSQL we have this: hydra=# CREATE TABLE TestFooBar (foo int, bar int[]); CREATE TABLE hydra=# \dt test* List of relations Schema | Name | Type | Owner --------+------------+-------+-------- public | testfoobar | table | aszlig (1 row) And on SQLite we have the following: sqlite> CREATE TABLE TestFooBar (foo int, bar int[]); sqlite> .schema CREATE TABLE TestFooBar (foo int, bar int[]); While it's perfectly fine to reference tables using lower-case identifiers, DBIx's make_schema_at() will pick up the tables from the schema. So in the end, we need to fix all of our identifiers to get upgrades working properly again. Note that there still is a dependency on DBDSQLite in release.nix, because it is needed for the Perl bindings of Nix. [1] https://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS Signed-off-by: aszlig <aszlig@redmoonstudios.org>
We want to map underscore-separated tables to CamelCase so that we still have the same class names as we had so far. Unfortunately we can't use the default moniker, because it tries to make pluralized names singular. Signed-off-by: aszlig <aszlig@redmoonstudios.org>
With the new naming scheme for database identifiers, we're now going to have build_steps instead of buildSteps, so we need to adjust the relation name mapping accordingly. Signed-off-by: aszlig <aszlig@redmoonstudios.org>
We have rules to convert everything from the a_b variants to Perl class names, but because "SchemaVersion" will become "schemaversion", we'd get a class name called "schemaversion" instead of "SchemaVersion". This is in preparation of the conversion of all database identifiers. Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Having CamelCase names for tables and columns doesn't really work out well, because we'd need to not only quote those names during creation but also all queries would need to use double quotes around these identifiers. Now every database identifier is consistent and lower-case only, which should make it easier to use within raw statements (and inside the SQL shell) and also less confusing in terms of schema definition versus raw statements or other code references. Of course, this isn't a very trivial change, so I wrote a small script which makes it easy to rebase this very commit should there be changes to master in the meantime. The Nix expression generating the script can be found here (yes, it's gory and a variant of "write-once-never-touch-again"): https://gist.github.com/aszlig/59ea813c28731d339b9764c46f363200 Note that I've left "userName" as-is (just making it lower-case in the schema) because there were a lot of references in the code that overlap with the column name. Another table I didn't touch except from making it lower-case (which it is anyway on PostgreSQL) is "schemaversion", because that table is needed for upgrading existing instances. Moreover, the reason why there now is no relation from jobset_inputs to jobsets (and back) anymore in the DBIx schema is that because it is now created using PostgreSQL instead of SQLite, that foreign key however was defined for SQLite only: > #ifdef SQLITE > , > foreign key (project, name, nixExprInput) > references JobsetInputs(project, jobset, > name) > #endif I've ran tests.install, tests.api and the tests in build.x86_64-linux and they all have passed. In addition to running those tests, I've dumped my running Hydra instance with database schema version 45 and tested the change. The upgrade worked successfully and the web interface is still working after a few tests. Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Originally from 203a9b9. We want to track whether upgrades are consistent by using this as the base SQL file and iterating through all of the update SQL files, comparing the result against src/sql/hydra.sql. Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Whenever we make a new upgrade-N.sql file, we need to make sure that we also add it to hydra.sql. That's exactly the part where human error can happen, so in order to mitigate this we now have a checker that can be called by: make -C src/sql check-db The update-dbix target has a dependency on check-db, so whenever we update the DBIx schema, we first check whether the singleton schema in hydra.sql is consistent with base.sql plus every update. We should now be able to find and fix already existing inconsistencies. Signed-off-by: aszlig <aszlig@redmoonstudios.org>
The check constraint on jobsets (jobsets_check) has been explicitly named with underscores in upgrade-41.sql while the name within hydra.sql hasn't been set, so PostgreSQL will automatically create a name for the constraint, which luckily overlaps with our new naming scheme. Unfortunately however if a new Hydra instance has been set up between upgrade 41 and 50, the constraint would be implicitly created with the name jobsets_schedulingshares_check. In order to deal with this, I've added two statements in upgrade 50 that explicitly remove both variations of the constraint and recreate it with the name now defined in hydra.sql. Signed-off-by: aszlig <aszlig@redmoonstudios.org>
These indices were added between introducing the schema upgrade system and were missed adding to upgrade-2.sql back then. The relevant commit is 8d65ab6. Signed-off-by: aszlig <aszlig@redmoonstudios.org>
This once was added in upgrade-12.sql as IndexBuildOutputsOnPath, but never made it into hydra.sql, so let's make sure that upgrade-50.sql fixes this for existing systems that may or may not lack this index. Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Both of these weren't properly picked up by apgdiff, so let's rename them as well. I think especially the rule isn't supported by apgdiff. Signed-off-by: aszlig <aszlig@redmoonstudios.org>
PostgreSQL version 9.2 doesn't support renaming rules, so let's just drop rule idempotentinsert rule and recreate it with the new name idempotent_insert instead. Signed-off-by: aszlig <aszlig@redmoonstudios.org>
I've missed this in 28d10b4 mainly because it didn't show up/failed in the tests. Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Rebased against current master. |
Given that it's just a rename (so it's reversible in any case), I'd say merge this as it removes tech debt with sqlite :) |
I spent a few hours today trying to get a dev hydra install working, |
We already have a few stored procedures and other incompatibilities between the two database backends and after asking @edolstra it's clear that we don't want to support SQLite anymore.
Unfortunately there are a few problems for an easy switch, mainly that if we don't use SQLite anymore for generating the DBIx schema, we will end up with a schema with all-lowercase table names, because PostgreSQL folds these names to lower-case.
Another issue with this identifier folding is already something which shows up in the code base: While in the SQL files every identifier is
CamelCased
, the actual database and the code referring to the identifiers is not.So to avoid confusion and also make naming of these identifiers more consistent I changed every
camelCased
identifier to useunder_scores
instead.This is somewhat related to #353, because that PR makes it easier to spin up a temporary PostgreSQL cluster.
Cc: @edolstra