Normalize nixexpr{input,path} from builds to jobsetevals. #848

Merged
merged 1 commit into from Jan 25, 2021

Conversation

grahamc
Member

@grahamc grahamc commented Jan 22, 2021

I meant to open this as a draft.

Storing this data on every record of the builds table cost
approximately 4G of duplication.

Note that the database migration included took about 4h45m on an
untuned server which uses very slow rotational disks in a RAID5 setup,
with not a lot of RAM. I imagine in production it might take an hour
or two, but not 4. If this should become a chunked migration, I can do
that.

Note: Because of the question about chunked migrations, I have NOT
YET tested this migration thoroughly enough for merge.
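
For reference, a chunked migration of this kind might look roughly like the sketch below. It is not the migration in this PR: it uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table and column names (`builds`, `jobsetevals`, `jobsetevalmembers`, `nixexprinput`, `nixexprpath`) as well as the helpers `make_demo_db` and `migrate_in_chunks` are assumptions made for illustration. The idea is to backfill the eval rows one id range at a time and commit after each range, so no single transaction covers the whole table.

```python
import sqlite3


def make_demo_db(n_evals):
    """Build a tiny in-memory database with an assumed, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE builds (
            id INTEGER PRIMARY KEY, nixexprinput TEXT, nixexprpath TEXT);
        CREATE TABLE jobsetevals (
            id INTEGER PRIMARY KEY, nixexprinput TEXT, nixexprpath TEXT);
        CREATE TABLE jobsetevalmembers (eval INTEGER, build INTEGER);
        """
    )
    for e in range(1, n_evals + 1):
        conn.execute("INSERT INTO jobsetevals (id) VALUES (?)", (e,))
        # Two builds per eval, each carrying the same duplicated values.
        for _ in range(2):
            cur = conn.execute(
                "INSERT INTO builds (nixexprinput, nixexprpath) VALUES (?, ?)",
                (f"input{e}", f"path{e}.nix"),
            )
            conn.execute(
                "INSERT INTO jobsetevalmembers (eval, build) VALUES (?, ?)",
                (e, cur.lastrowid),
            )
    conn.commit()
    return conn


def migrate_in_chunks(conn, chunk_size=1000):
    """Backfill jobsetevals from builds, one id range per transaction.

    Returns the number of chunks processed.
    """
    (max_id,) = conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM jobsetevals"
    ).fetchone()
    chunks = 0
    for start in range(1, max_id + 1, chunk_size):
        end = start + chunk_size - 1
        with conn:  # commits on success, rolls back on error
            conn.execute(
                """
                UPDATE jobsetevals
                SET nixexprinput = (
                        SELECT b.nixexprinput FROM builds b
                        JOIN jobsetevalmembers m ON m.build = b.id
                        WHERE m.eval = jobsetevals.id LIMIT 1),
                    nixexprpath = (
                        SELECT b.nixexprpath FROM builds b
                        JOIN jobsetevalmembers m ON m.build = b.id
                        WHERE m.eval = jobsetevals.id LIMIT 1)
                WHERE id BETWEEN ? AND ? AND nixexprinput IS NULL
                """,
                (start, end),
            )
        chunks += 1
    return chunks


if __name__ == "__main__":
    demo = make_demo_db(5)
    print(migrate_in_chunks(demo, chunk_size=2))
    print(demo.execute("SELECT * FROM jobsetevals ORDER BY id").fetchall())
```

Because each chunk only touches rows where `nixexprinput IS NULL`, an interrupted run can be restarted without redoing finished ranges. Dropping the old columns from `builds` would be a separate, final step once every range has been backfilled.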

@grahamc grahamc closed this Jan 22, 2021
@grahamc grahamc reopened this Jan 22, 2021
Member

@Ericson2314 Ericson2314 left a comment

Seeing Hydra's poor track record on having the appropriate indices (per your earlier PR), I wholeheartedly endorse this. Hydra needs to prove it can follow the rules (well-planned joins) before it earns the right to break them (denormalization).

Thanks again for your work with this stuff, @grahamc.

@grahamc
Member Author

grahamc commented Jan 22, 2021

For what it is worth, I find Hydra's schema to be pretty good :). It has scaled to over a hundred million builds, and its largest table has 1.5 billion (1,500,000,000) rows. There are a few cleanups to be done, but all in all it does a really great job for the amount of maintenance it requires.

@edolstra
Member

edolstra commented Jan 25, 2021

> denormalization

FWIW, this is because the JobsetEvals table is a later addition (see 7daca03). There is some more cruft in the DB from the time that we didn't have JobsetEvals, like BuildInputs (which still exists but isn't maintained anymore).

@edolstra edolstra merged commit d0b3f2d into NixOS:master Jan 25, 2021
@grahamc grahamc deleted the normalize-nixexprinputpath branch January 25, 2021 14:22