Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LB-225: Add support for importing data dumps. #274

Merged
merged 11 commits into from Nov 15, 2017

Conversation

paramsingh
Copy link
Collaborator

Based off #272, so will need to be rebased after that.

There are a bunch of things more to do here, for example the stats table has a foreign key to the user table so the stats tables can't be imported alone, it needs a corresponding entry in the user table. Not sure how to solve this, other than sanitizing users and putting it in the stats dump too.

@paramsingh paramsingh changed the title LB-212: Add support for importing data dumps. LB-225: Add support for importing data dumps. Oct 17, 2017
@paramsingh
Copy link
Collaborator Author

Added tests too.

@mayhem
Copy link
Member

mayhem commented Nov 10, 2017

Rebase, please

@paramsingh paramsingh force-pushed the import-data-dumps branch 2 times, most recently from ab1e9f3 to ce03530 Compare November 11, 2017 10:01
@paramsingh
Copy link
Collaborator Author

I would love opinions on the problem I mentioned in the OP. The stats dump currently cannot be imported alone, it needs a corresponding private dump.

the stats table has a foreign key to the user table so the stats tables can't be imported alone, it needs a corresponding entry in the user table. Not sure how to solve this, other than sanitizing users and putting it in the stats dump too.

@mayhem
Copy link
Member

mayhem commented Nov 11, 2017

Oh sorry, didn't see that question:

Not sure how to solve this, other than sanitizing users and putting it in the stats dump too.

That is exactly how MB does it -- lets do that too.

@paramsingh
Copy link
Collaborator Author

LB-240 and #293

@paramsingh
Copy link
Collaborator Author

Rebased over #293 and added logic to only import user table from stats dump when needed.

@paramsingh paramsingh force-pushed the import-data-dumps branch 2 times, most recently from ee2c37d to a555f57 Compare November 14, 2017 16:01
Copy link
Member

@mayhem mayhem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one thing, the rest looks sane.

pxz = subprocess.Popen(pxz_command, stdout=subprocess.PIPE)

# create a transaction and start importing
with db.engine.begin() as connection:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad you decided to do this in a transaction, which at first glance is the opposite of importing. :) However, there are subtle differences -- namely when importing you can't have a live site running. That isn't a valid or desired use case. Therefore there is no need to run inside a transaction -- in fact doing a full data import in a transaction could cause some interesting problems. So, please remove this transaction.

Copy link
Member

@mayhem mayhem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now. It looks too good, actually. This means that alastairp will find something the second I merge this. :) 🚜 !!

@mayhem mayhem merged commit b190e34 into metabrainz:master Nov 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants