Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove SQLite and support PostgreSQL only #355

Closed
wants to merge 13 commits into from

Conversation

aszlig
Copy link
Member

@aszlig aszlig commented Jul 7, 2016

We already have a few stored procedures and other incompatibilities between the two database backends and after asking @edolstra it's clear that we don't want to support SQLite anymore.

Unfortunately there are a few problems for an easy switch, mainly that if we don't use SQLite anymore for generating the DBIx schema, we will end up with a schema with all-lowercase table names, because PostgreSQL folds these names to lower-case.

Another issue with this identifier folding is already something which shows up in the code base: While in the SQL files every identifier is CamelCased, the actual database and the code referring to the identifiers is not.

So to avoid confusion and also make naming of these identifiers more consistent I changed every camelCased identifier to use under_scores instead.

This is somewhat related to #353, because that PR makes it easier to spin up a temporary PostgreSQL cluster.

Cc: @edolstra

@edolstra
Copy link
Member

edolstra commented Jul 7, 2016

I'd really prefer not to change the entire schema (giving a huge amount of code churn) just because the moniker mapper doesn't work properly. After all, the mapper is customizable right?

@aszlig
Copy link
Member Author

aszlig commented Jul 7, 2016

@edolstra: The mapper is customizable, but we'd still have a major inconsistency when it comes to identifier naming, because the code would refer to camelCased names while the database and schema would map them to lower-case names. So every time you work with database identifiers you'd have to look up the right name from the mapper and the SQL schema in case of raw SQL.

Also we have a lot of code which bypasses the mapper (not only C++ code but also some Perl code).

We now longer support SQLite (see 78974ab) and we already use a
temporary PostgreSQL database during shell setup.

So let's remove all conditionals about SQLite and all dependencies.

However, we now have the problem that DBIx classes are now generated
all-lowercase when running a DBIx update.

The reason for this is that if you don't quote table names in SQL, the
identifier is folded to lower-case and to upper-case according to the
SQL standard[1]. SQLite on the other hand only matches case-
insensitively but on a schema dump the tables are with cases.

Let's illustrate this by example, on PostgreSQL we have this:

hydra=# CREATE TABLE TestFooBar (foo int, bar int[]);
CREATE TABLE
hydra=# \dt test*
          List of relations
 Schema |    Name    | Type  | Owner
--------+------------+-------+--------
 public | testfoobar | table | aszlig
(1 row)

And on SQLite we have the following:

sqlite> CREATE TABLE TestFooBar (foo int, bar int[]);
sqlite> .schema
CREATE TABLE TestFooBar (foo int, bar int[]);

While it's perfectly fine to reference tables using lower-case
identifiers, DBIx's make_schema_at() will pick up the tables from the
schema.

So in the end, we need to fix all of our identifiers to get upgrades
working properly again.

Note that there still is a dependency on DBDSQLite in release.nix,
because it is needed for the Perl bindings of Nix.

[1] https://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
We want to map underscore-separated tables to CamelCase so that we still
have the same class names as we had so far.

Unfortunately we can't use the default moniker, because it tries to make
pluralized names singular.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
With the new naming scheme for database identifiers, we're now going to
have build_steps instead of buildSteps, so we need to adjust the
relation name mapping accordingly.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
We have rules to convert everything from the a_b variants to Perl class
names, but because "SchemaVersion" will become "schemaversion", we'd get
a class name called "schemaversion" instead of "SchemaVersion".

This is in preparation of the conversion of all database identifiers.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Having CamelCase names for tables and columns doesn't really work out
well, because we'd need to not only quote those names during creation
but also all queries would need to use double quotes around these
identifiers.

Now every database identifier is consistent and lower-case only, which
should make it easier to use within raw statements (and inside the SQL
shell) and also less confusing in terms of schema definition versus raw
statements or other code references.

Of course, this isn't a very trivial change, so I wrote a small script
which makes it easy to rebase this very commit should there be changes
to master in the meantime.

The Nix expression generating the script can be found here (yes, it's
gory and a variant of "write-once-never-touch-again"):

https://gist.github.com/aszlig/59ea813c28731d339b9764c46f363200

Note that I've left "userName" as-is (just making it lower-case in the
schema) because there were a lot of references in the code that overlap
with the column name.

Another table I didn't touch except from making it lower-case (which it
is anyway on PostgreSQL) is "schemaversion", because that table is
needed for upgrading existing instances.

Moreover, the reason why there now is no relation from jobset_inputs to
jobsets (and back) anymore in the DBIx schema is that because it is now
created using PostgreSQL instead of SQLite, that foreign key however was
defined for SQLite only:

> #ifdef SQLITE
>    ,
>    foreign key   (project, name, nixExprInput)
>       references JobsetInputs(project, jobset,
>       name)
> #endif

I've ran tests.install, tests.api and the tests in build.x86_64-linux
and they all have passed.

In addition to running those tests, I've dumped my running Hydra
instance with database schema version 45 and tested the change.

The upgrade worked successfully and the web interface is still working
after a few tests.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Originally from 203a9b9.

We want to track whether upgrades are consistent by using this as the
base SQL file and iterating through all of the update SQL files,
comparing the result against src/sql/hydra.sql.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Whenever we make a new upgrade-N.sql file, we need to make sure that we
also add it to hydra.sql. That's exactly the part where human error can
happen, so in order to mitigate this we now have a checker that can be
called by:

make -C src/sql check-db

The update-dbix target has a dependency on check-db, so whenever we
update the DBIx schema, we first check whether the singleton schema in
hydra.sql is consistent with base.sql plus every update.

We should now be able to find and fix already existing inconsistencies.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
The check constraint on jobsets (jobsets_check) has been explicitly
named with underscores in upgrade-41.sql while the name within hydra.sql
hasn't been set, so PostgreSQL will automatically create a name for the
constraint, which luckily overlaps with our new naming scheme.

Unfortunately however if a new Hydra instance has been set up between
upgrade 41 and 50, the constraint would be implicitly created with the
name jobsets_schedulingshares_check.

In order to deal with this, I've added two statements in upgrade 50 that
explicitly remove both variations of the constraint and recreate it with
the name now defined in hydra.sql.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
These indices were added between introducing the schema upgrade system
and were missed adding to upgrade-2.sql back then.

The relevant commit is 8d65ab6.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
This once was added in upgrade-12.sql as IndexBuildOutputsOnPath, but
never made it into hydra.sql, so let's make sure that upgrade-50.sql
fixes this for existing systems that may or may not lack this index.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Both of these weren't properly picked up by apgdiff, so let's rename
them as well. I think especially the rule isn't supported by apgdiff.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
PostgreSQL version 9.2 doesn't support renaming rules, so let's just
drop rule idempotentinsert rule and recreate it with the new name
idempotent_insert instead.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
I've missed this in 28d10b4 mainly because it didn't show up/failed in
the tests.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
@aszlig
Copy link
Member Author

aszlig commented Aug 4, 2016

Rebased against current master.

@domenkozar
Copy link
Member

Given that it's just a rename (so it's reversible in any case), I'd say merge this as it removes tech debt with sqlite :)

@grahamc
Copy link
Member

grahamc commented Oct 23, 2016

I spent a few hours today trying to get a dev hydra install working, and came to the conclusion that sqlite support is actually broken. Getting this support out of here would be much appreciated by the next person who sets up a dev env :) Seems I may have just been hitting bugs in the master version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants