ceph: 12.2.7 -> 14.2.1 #65470

Merged
merged 2 commits into from Sep 4, 2019

Conversation

johanot
Contributor

@johanot johanot commented Jul 27, 2019

Motivation for this change

Make ceph up-to-date and not broken.
This PR probably has a better chance of being merged before 19.09 than #49866.
It includes all commits from #49866 , plus:

  • 13.2.2 -> 14.2.1 upgrade
  • python2 -> python3
  • more cherrypy version check patching
  • more pythonpath hardcoding
  • use of in-tree rocksdb, since ceph 14 is not compatible with rocksdb 6, introduced here:
    b54b5f9

Result:
It has been rebased onto master, and it builds! 🎉

Dashboard:
Not included yet. Working on it. That's the WIP part. If we decide not to include that, we can probably skip all the cherrypy patches.

Testing:
Real-life testing has been performed with 14.2.1 already.
The use of in-tree rocksdb however has not.
Will deploy this addition to my own ceph environment soon.

A bunch of squashing is probably also needed before merging.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nix-review --run "nix-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@srhb
Contributor

srhb commented Jul 27, 2019

Good work! (And to all of the people involved earlier as well!)

Refresh my memory though; if we decided not to ship the dashboard for now, do we even need the cherrypy hackery?

@lejonet
Contributor

lejonet commented Jul 31, 2019

Since 14.2.0, the dashboard has been split out into its own package, so we shouldn't need to have anything to do with it in the ceph build henceforth. (Reference: "Also, the Ceph Dashboard is now split into its own package named ceph-mgr-dashboard." 1)

I'm actually going to prime and install a new ceph node this weekend, I'll do it with this PR to give it some more testing.

@srhb
Contributor

srhb commented Jul 31, 2019

<itshappening.gif>

@johanot
Contributor Author

johanot commented Aug 5, 2019

@srhb

Refresh my memory though; if we decided not to ship the dashboard for now, do we even need the cherrypy hackery?

Module 'prometheus' has failed dependency: No module named 'cherrypy' #sadface :(

@johanot johanot changed the title [WIP] ceph: 12.2.7 -> 14.2.1 ceph: 12.2.7 -> 14.2.1 Aug 27, 2019
@johanot
Contributor Author

johanot commented Aug 27, 2019

Update: We've just deployed this branch to our production env. at my day job, today. So far no issues.

Ok. Even though we can't get rid of cherrypy, we can still decide to merge this for 19.09? What do you say @lejonet @srhb? :-) I removed the "WIP" in the title here. What else is required before we can merge this?

@srhb
Contributor

srhb commented Aug 27, 2019

The two of you use it actively and I'll be okay with your decision. However, it looks like the dashboard is now enabled by default -- is that safe? I don't recall whether the interface is protected initially or whether it even matters. Other than that: Are any release notes required?

@johanot
Contributor Author

johanot commented Aug 28, 2019

However, it looks like the dashboard is now enabled by default

Ah.. You mean, in the test? (https://github.com/NixOS/nixpkgs/pull/65470/files#diff-07c1a40dbc7cbf287840ba0dbbf4cd10R44)
I suspect that what happens silently in the mgr-daemons is similar to what happens when you try enabling it manually:

$ ceph mgr module enable dashboard
Error ENOENT: module 'dashboard' reports that it cannot run on the active manager daemon: No module named 'jwt' (pass --force to force enablement)

and then the dashboard just stays disabled, with the cluster still being healthy.
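
The failure mode described above (a mgr module reports a missing Python dependency and stays disabled) can be triaged by pulling the missing module name out of the message. This is only a minimal sketch, assuming the message keeps the `No module named '...'` form quoted in this thread; `missing_python_module` is a hypothetical helper for illustration, not part of ceph or nixpkgs.

```python
import re
from typing import Optional

# Matches the Python import error embedded in the ceph messages quoted in this thread.
# Assumption: the wording "No module named '<name>'" stays stable.
_MISSING_MODULE = re.compile(r"No module named '([^']+)'")


def missing_python_module(message: str) -> Optional[str]:
    """Return the missing Python module named in the message, or None if there is none."""
    match = _MISSING_MODULE.search(message)
    return match.group(1) if match else None


if __name__ == "__main__":
    dashboard_error = (
        "Error ENOENT: module 'dashboard' reports that it cannot run on the "
        "active manager daemon: No module named 'jwt' "
        "(pass --force to force enablement)"
    )
    print(missing_python_module(dashboard_error))
```

The same helper would apply to the earlier `prometheus` message, which names `cherrypy` in the same way.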

What I don't know, though, is why the dashboard code is still in-tree and still exists in the mgr-module namespace. I've read the same release note as @lejonet, so I would have expected something like: Error ENOENT: all mgr daemons do not support module 'dashboard'.

Anyway. We can of course remove the attempted dashboard enabling from the test, and we should be able to remove the cherrypy patches on dashboard/module.py as well, since we don't use/support the dashboard.

Also, the NixOS test fails for other reasons. Pretty embarrassing. I'll have a look at that as well. :)

@lejonet
Contributor

lejonet commented Aug 28, 2019

I never managed to deploy a ceph node with this PR because systemd-242 (the version "locked in" on the PR branch) doesn't like my EPYC SoC.
I'm going to fiddle with that this weekend, to retrofit 239 and hope for the best.

I'm also a bit confused about why the dashboard is still in the codebase. The dashboard present in the 11-12 releases was completely without auth and such, but iirc the biggest change in 14.2.x, apart from breaking it out into its own package, is that they've fitted a much more comprehensive dashboard, with auth, ACLs and other bells and whistles.

Thus I would defer the decision to @johanot, because he has the version actually running. Though it would be prudent to note the cherrypy problem as a heads-up to those who actually use modules.

Edit:
It seems like the new dashboard in 14 actually has username/password as default according to this

@johanot
Contributor Author

johanot commented Aug 29, 2019

Added 3 new commits, including test fixes and release note text. Also rebased with master. But after the rebase, ceph doesn't build anymore because of #67676. Don't have time to try and upgrade kinetic-cpp-client today.

@globin
Member

globin commented Aug 30, 2019

Merged the kinetic-cpp-client openssl pinning to not block this.

@johanot
Contributor Author

johanot commented Aug 30, 2019

Almost certain this will timeout. But let's try.
@GrahamcOfBorg test ceph

@srhb
Contributor

srhb commented Sep 2, 2019

I'm inclined to drop kinetic support due to the potential headache of mixed openssl versions.
Other than that, I think I'm good with this. I would like to get it rolling so we don't accumulate even more at this point. Can you squash it down as well? 😄

@srhb
Contributor

srhb commented Sep 2, 2019

Looks like there are some bootstrap permission errors in the test as well.

@johanot
Contributor Author

johanot commented Sep 3, 2019

Ok @srhb. Rebased with master again, conflicts fixed. Squashed from 20 down to 11 commits. Have a look at the latest 3 commits on the branch, those are the fixes for the newest issues.

I believe the decision to remove kinetic is correct, since the project seems to be abandoned by Seagate anyway. I wrote on the ceph mailing list yesterday trying to get a current status of kinetic support in ceph - still waiting for a response on that one.

Ceph still doesn't build, now because of cherrypy. Feel free to cherry-pick (no pun intended) this: #68001 for testing.

@srhb
Contributor

srhb commented Sep 3, 2019

Maybe we should just throw pyjwt at the dashboard anyway, and see what sticks.

@globin
Member

globin commented Sep 3, 2019

Rebased again, as a fix for cherrypy was pushed to master, tests succeed locally.

@ofborg ofborg bot requested a review from krav September 4, 2019 04:32
@srhb
Contributor

srhb commented Sep 4, 2019

Dashboard working now, will squash, rearrange and merge tonight, hopefully. :)

@srhb srhb self-assigned this Sep 4, 2019
krav and others added 2 commits September 4, 2019 16:01
* maintain only one version
* ceph-client: init
* include ceph-volume python tool in output

nixos/ceph: extraConfig, fix test, wait for ceph-mgr to become active

* run ceph with disk group permission
* add extraConfig option for the global section
needed per cluster
* clear up how ceph.conf is generated
* fix ceph testcase
* remove kinetic
* release note
* add johanot as maintainer

nixos/ceph: create option for mgr_module_path
  - since the upstream default is no longer correct in v14

* fix module, default location for libexec has changed
* ceph: fix test
@srhb srhb merged commit 55256d6 into NixOS:master Sep 4, 2019
@srhb
Contributor

srhb commented Sep 4, 2019

Thanks everyone involved for the hard work!

@johanot johanot deleted the ceph-14-upstream branch September 4, 2019 15:56
@srhb srhb mentioned this pull request Sep 4, 2019
@ivan
Member

ivan commented Sep 4, 2019

@srhb this broke the manual in master

# nixos-rebuild switch --builders '' --keep-going
building Nix...
building the system configuration...
these derivations will be built:
  /nix/store/0b7w7s7x56imv25spsb8kf1xxsg2s7z5-nixos-manual-combined.drv
  /nix/store/cxdad32g4iayp0syaph0bdr13dy8z57p-manual-olinkdb.drv
  /nix/store/xdr72sjjz5p875g4rljlakidn660ag9r-nixos-manual-html.drv
  /nix/store/lyfajhgsp5b1sglx3cnk3wd9xyb53rnf-nixos-help.drv
  /nix/store/r3wmc21fcpaj4pwic2w45499119y7kkl-nixos-manpages.drv
  /nix/store/xah07113shrf245qs98xv1zqy0j0lbpd-nixos-manual.desktop.drv
  /nix/store/kf2474dr1jflnm32i6yj4yb3v32r5385-system-path.drv
  /nix/store/07q8i3gf2i25sbc0z7jiyf7wii0h7kw0-dbus-1.drv
  /nix/store/8l60l5h40q097qkzd0ladiklnnwvrzdx-unit-dbus.service.drv
  /nix/store/a4mpj2mg91b46knp82ps60ls7cqppg7z-unit-systemd-fsck-.service.drv
  /nix/store/lz1frqf79f2gn4n9vxj6dmy0njzbklj3-unit-polkit.service.drv
  /nix/store/s0zql3ii9plkijlsb23fav9c1lz3ihg9-unit-accounts-daemon.service.drv
  /nix/store/08vh2g65vd444kw9qfdgcrhs2nk56mf9-system-units.drv
  /nix/store/r82h49yag17rmm45b39nwx2pbkx23mfa-unit-dbus.service.drv
  /nix/store/65hhbjxn75k4g4rkhhiix0d24763dxsp-user-units.drv
  /nix/store/r9h4r11y5vrgwj2dcwkw35i82wh2cz3g-etc.drv
  /nix/store/y48lplsama5qb5hmn0ihavp0m87xfdg6-nixos-system-ra-19.09.git.76d3b14.drv
building '/nix/store/0b7w7s7x56imv25spsb8kf1xxsg2s7z5-nixos-manual-combined.drv'...

manual-combined.xml:73339: element listitem: Relax-NG validity error : Did not expect element text there
 73335       </para>
 73336     </listitem>
 73337     <listitem>
 73338       Ceph has been upgraded to v14.2.1.
 73339       See the <link xlink:href="https://ceph.com/releases/v14-2-0-nautilus-released/">release notes</link> for details.
 73340       The mgr dashboard as well as osds backed by loop-devices is no longer explicitly supported by the package and module.
 73341       Note: There's been some issues with python-cherrypy, which is used by the dashboard

manual-combined.xml:72647: element section: Relax-NG validity error : Did not expect element section there
 72643    This section lists the release notes for each stable version of NixOS and
 72644    current unstable revision.
 72645   </para>
 72646   <section xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xi="http://www.w3.org/2001/XInclude" version="5.0" xml:id="sec-release-19.09">
 72647   <title>Release 19.09 (&#x201C;Loris&#x201D;, 2019/09/??)</title>
 72648
 72649   <section xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xi="http://www.w3.org/2001/XInclude" version="5.0" xml:id="sec-release-19.09-highlights">

manual-combined.xml:72647: element section: Relax-NG validity error : Element appendix has extra content: section
 72643    This section lists the release notes for each stable version of NixOS and
 72644    current unstable revision.
 72645   </para>
 72646   <section xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xi="http://www.w3.org/2001/XInclude" version="5.0" xml:id="sec-release-19.09">
 72647   <title>Release 19.09 (&#x201C;Loris&#x201D;, 2019/09/??)</title>
 72648
 72649   <section xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xi="http://www.w3.org/2001/XInclude" version="5.0" xml:id="sec-release-19.09-highlights">

manual-combined.xml:3: element info: Relax-NG validity error : Element book has extra content: info
     1  <?xml version="1.0"?>
     2  <book xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xi="http://www.w3.org/2001/XInclude" version="5.0" xml:id="book-nixos-manual">
     3   <info>
     4    <title>NixOS Manual</title>
     5    <subtitle>Version 19.09
     6    </subtitle>
     7   </info>

manual-combined.xml fails to validate
builder for '/nix/store/0b7w7s7x56imv25spsb8kf1xxsg2s7z5-nixos-manual-combined.drv' failed with exit code 3
cannot build derivation '/nix/store/cxdad32g4iayp0syaph0bdr13dy8z57p-manual-olinkdb.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/r3wmc21fcpaj4pwic2w45499119y7kkl-nixos-manpages.drv': 2 dependencies couldn't be built
cannot build derivation '/nix/store/xdr72sjjz5p875g4rljlakidn660ag9r-nixos-manual-html.drv': 2 dependencies couldn't be built
cannot build derivation '/nix/store/lyfajhgsp5b1sglx3cnk3wd9xyb53rnf-nixos-help.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/xah07113shrf245qs98xv1zqy0j0lbpd-nixos-manual.desktop.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/kf2474dr1jflnm32i6yj4yb3v32r5385-system-path.drv': 4 dependencies couldn't be built
cannot build derivation '/nix/store/07q8i3gf2i25sbc0z7jiyf7wii0h7kw0-dbus-1.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/s0zql3ii9plkijlsb23fav9c1lz3ihg9-unit-accounts-daemon.service.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/lz1frqf79f2gn4n9vxj6dmy0njzbklj3-unit-polkit.service.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/a4mpj2mg91b46knp82ps60ls7cqppg7z-unit-systemd-fsck-.service.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/8l60l5h40q097qkzd0ladiklnnwvrzdx-unit-dbus.service.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/r82h49yag17rmm45b39nwx2pbkx23mfa-unit-dbus.service.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/08vh2g65vd444kw9qfdgcrhs2nk56mf9-system-units.drv': 4 dependencies couldn't be built
cannot build derivation '/nix/store/65hhbjxn75k4g4rkhhiix0d24763dxsp-user-units.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/r9h4r11y5vrgwj2dcwkw35i82wh2cz3g-etc.drv': 4 dependencies couldn't be built
cannot build derivation '/nix/store/y48lplsama5qb5hmn0ihavp0m87xfdg6-nixos-system-ra-19.09.git.76d3b14.drv': 2 dependencies couldn't be built
error: build of '/nix/store/y48lplsama5qb5hmn0ihavp0m87xfdg6-nixos-system-ra-19.09.git.76d3b14.drv' failed
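
The first validity error in the log above points at a `listitem` whose content is bare text. One plausible reading, stated here as an assumption rather than a confirmed diagnosis, is that the release note text was not wrapped in a block element such as `para`. The sketch below uses only Python's standard `xml.etree.ElementTree` to flag such list items before a full manual build; `listitems_with_bare_text` is a hypothetical helper, and it is not a substitute for the Relax-NG validation that the manual build performs.

```python
import xml.etree.ElementTree as ET
from typing import List

DOCBOOK_NS = "http://docbook.org/ns/docbook"


def listitems_with_bare_text(xml_source: str) -> List[str]:
    """Return the flattened text of every DocBook listitem that holds text directly."""
    root = ET.fromstring(xml_source)
    offenders = []
    for item in root.iter(f"{{{DOCBOOK_NS}}}listitem"):
        # Text directly inside <listitem>, before its first child element.
        has_text = bool(item.text and item.text.strip())
        # Text that follows a child element but is still directly inside <listitem>.
        has_tail_text = any(child.tail and child.tail.strip() for child in item)
        if has_text or has_tail_text:
            offenders.append(" ".join("".join(item.itertext()).split()))
    return offenders


if __name__ == "__main__":
    # Hypothetical fragment modelled on the excerpt in the log; the URL is a placeholder.
    sample = (
        '<itemizedlist xmlns="http://docbook.org/ns/docbook" '
        'xmlns:xlink="http://www.w3.org/1999/xlink">'
        "<listitem>Ceph has been upgraded to v14.2.1. See the "
        '<link xlink:href="https://example.invalid/">release notes</link> '
        "for details.</listitem>"
        "</itemizedlist>"
    )
    for text in listitems_with_bare_text(sample):
        print("bare text in listitem:", text)
```

Wrapping the same text in a `para` element inside the `listitem` makes the helper return an empty list.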

@srhb
Contributor

srhb commented Sep 4, 2019

My bad. I'll fix it right away.

@lordcirth
Contributor

Has anyone looked at updating this to 14.2.3? Both .2 and .3 had critical fixes that we needed at $WORK (Not running NixOS)

@johanot
Contributor Author

johanot commented Sep 16, 2019

@lordcirth it is done :) #68138

@lordcirth
Contributor

@johanot Thanks! I guess I forgot to search for closed issues.
