jemalloc450: enable to default option disable-initial-exec-tls #82711

Closed
wants to merge 1 commit into from


Izorkin
Contributor

@Izorkin Izorkin commented Mar 16, 2020

Motivation for this change

When environment.memoryAllocator.provider = "jemalloc"; is set, the mariadb service crashes:

200316  8:15:47 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.3.22-MariaDB-log
key_buffer_size=16777216
read_buffer_size=2097152
max_used_connections=0
max_threads=65537
thread_count=6
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 268479022 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x70fb51115b88
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x70fbf15aad28 thread_stack 0x49000
/nix/store/ljfk18bib2h3ir1hif6rpq8slqw4rbzn-mariadb-server-10.3.22/bin/mysqld(my_print_stacktrace+0x29)[0x638370d6ae49]
/nix/store/ljfk18bib2h3ir1hif6rpq8slqw4rbzn-mariadb-server-10.3.22/bin/mysqld(handle_fatal_signal+0x53d)[0x63837085b10d]
/nix/store/9rabxvqbv0vgjmydiv59wkz768b5fmbc-glibc-2.30/lib/libpthread.so.0(+0x12f70)[0x70fbf3d88f70]
/nix/store/6hy3n14bm4f8jld1bnvr5lxzssmbh3vi-malloc-provider-jemalloc/lib/libjemalloc.so(+0x7ce5a)[0x70fbf40bce5a]
/nix/store/6hy3n14bm4f8jld1bnvr5lxzssmbh3vi-malloc-provider-jemalloc/lib/libjemalloc.so(+0x14366)[0x70fbf4054366]
/nix/store/6hy3n14bm4f8jld1bnvr5lxzssmbh3vi-malloc-provider-jemalloc/lib/libjemalloc.so(_ZdlPvm+0xe)[0x70fbf40c1cbe]
/nix/store/ljfk18bib2h3ir1hif6rpq8slqw4rbzn-mariadb-server-10.3.22/bin/mysqld(_ZN11MDL_context27release_locks_stored_beforeE17enum_mdl_durationP10MDL_ticket+0x37)[0x638370768d47]
/nix/store/ljfk18bib2h3ir1hif6rpq8slqw4rbzn-mariadb-server-10.3.22/bin/mysqld(_Z21mysql_execute_commandP3THD+0x4895)[0x63837067c0d5]
/nix/store/ljfk18bib2h3ir1hif6rpq8slqw4rbzn-mariadb-server-10.3.22/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x1f4)[0x63837067feb4]
/nix/store/ljfk18bib2h3ir1hif6rpq8slqw4rbzn-mariadb-server-10.3.22/bin/mysqld(+0x5df260)[0x638370680260]
/nix/store/ljfk18bib2h3ir1hif6rpq8slqw4rbzn-mariadb-server-10.3.22/bin/mysqld(_Z19do_handle_bootstrapP3THD+0xc2)[0x638370680752]
/nix/store/ljfk18bib2h3ir1hif6rpq8slqw4rbzn-mariadb-server-10.3.22/bin/mysqld(handle_bootstrap+0x35)[0x638370680825]
/nix/store/9rabxvqbv0vgjmydiv59wkz768b5fmbc-glibc-2.30/lib/libpthread.so.0(+0x7edd)[0x70fbf3d7dedd]
/nix/store/9rabxvqbv0vgjmydiv59wkz768b5fmbc-glibc-2.30/lib/libc.so.6(clone+0x3f)[0x70fbf34d9a4f]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x70fb4de22ba0): insert into help_category (help_category_id,name,parent_category_id,url) values (1,'Geographic',0,'');
Connection ID (thread ID): 6
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/data/db/mysql
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             7911                 7911                 processes
Max open files            5151                 5151                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       7911                 7911                 signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Core pattern: |/nix/store/isjw585167wip2jwk9l38ir3fyvafq5d-systemd-243.7/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
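
For readers trying to reproduce the crash, a minimal configuration along these lines might be enough. Only the allocator option is quoted in the report; the `services.mysql` part is an assumption about how MariaDB was enabled.

```nix
# Reproduction sketch, assuming MariaDB is enabled through the NixOS mysql module.
{ pkgs, ... }:
{
  # Quoted in the report: preload jemalloc system-wide.
  environment.memoryAllocator.provider = "jemalloc";

  # Assumption: the report does not show how the service was enabled.
  services.mysql = {
    enable = true;
    package = pkgs.mariadb;
  };
}
```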

cc @flokli @aanderse

Updated PR
The latest changes are needed to load the TokuDB plugin correctly:

      ### TokuDB
      malloc-lib = ${pkgs.jemalloc450}/lib/libjemalloc.so
Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@Izorkin Izorkin force-pushed the mariadb-fix branch 2 times, most recently from dd4bd86 to 9453684 Compare March 22, 2020 08:44
@Izorkin Izorkin changed the title nixos/malloc: add jemalloc450-mysql memoryAllocator nixos/malloc: add jemalloc-mariadb memoryAllocator Mar 22, 2020
@Izorkin
Contributor Author

Izorkin commented Mar 22, 2020

@Mic92 thanks, fixed.

@Izorkin
Contributor Author

Izorkin commented Mar 22, 2020

Fixed.

@Izorkin
Contributor Author

Izorkin commented Mar 22, 2020

Found my error. It conflicted with this configuration:

  systemd.services.mysql = with pkgs; {
    environment = {
      LD_PRELOAD = "${pkgs.jemalloc450}/lib/libjemalloc.so";
    };

@Izorkin Izorkin closed this Mar 22, 2020
@Izorkin Izorkin deleted the mariadb-fix branch March 22, 2020 19:13
@Izorkin Izorkin restored the mariadb-fix branch March 22, 2020 19:23
@Izorkin Izorkin reopened this Mar 22, 2020
@Izorkin
Contributor Author

Izorkin commented Mar 22, 2020

Updated the PR. @flokli @Mic92 please recheck.
The change is needed to load the TokuDB plugin correctly:

      ### TokuDB
      malloc-lib = ${pkgs.jemalloc450}/lib/libjemalloc.so

@ofborg ofborg bot removed the 6.topic: nixos label Mar 22, 2020
@Izorkin Izorkin changed the title nixos/malloc: add jemalloc-mariadb memoryAllocator jemalloc450: enable to default option disable-initial-exec-tls Mar 22, 2020
@flokli
Contributor

flokli commented Mar 23, 2020

@Izorkin Sorry, I have a hard time understanding. Can you explain what you figured out?

This PR now seems to flip the default of "disableInitExecTls", instead of the override already present in nixpkgs?

@Mic92
Member

Mic92 commented Mar 23, 2020

This is the implication according to the upstream documentation:

  • --disable-initial-exec-tls
    Disable the initial-exec TLS model for jemalloc's internal thread-local storage (on those platforms that support explicit settings). This can allow jemalloc to be dynamically loaded after program startup (e.g. using dlopen). Note that in this case, there will be two malloc implementations operating in the same process, which will almost certainly result in confusing runtime crashes if pointers leak from one implementation to the other.
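
As a rough illustration of how a boolean package argument such as `disableInitExecTls` could map onto this configure flag, here is a small self-contained Nix expression. The helper name `configureFlagsFor` is hypothetical and the real jemalloc expression in nixpkgs may be structured differently.

```nix
# Sketch only: `configureFlagsFor` is a made-up helper, not nixpkgs code.
let
  # Turn the boolean argument into the upstream configure flag.
  configureFlagsFor = { disableInitExecTls ? false }:
    if disableInitExecTls
    then [ "--disable-initial-exec-tls" ]
    else [ ];
in
{
  # Default build: initial-exec TLS stays enabled, no extra flag.
  default = configureFlagsFor { };

  # Build meant to be loaded after program startup (e.g. via dlopen).
  dlopenFriendly = configureFlagsFor { disableInitExecTls = true; };
}
```

Evaluating the file with `nix-instantiate --eval --strict` shows the two flag lists side by side.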

@Izorkin
Contributor Author

Izorkin commented Mar 23, 2020

> @Izorkin Sorry, I have a hard time understanding. Can you explain what you figured out?

Two libraries were being loaded automatically through LD_PRELOAD:

"${pkgs.jemalloc450}/lib/libjemalloc.so";
"${pkgs.jemalloc}/lib/libjemalloc.so";

This caused a conflict between the two versions.

> This PR now seems to flip the default of "disableInitExecTls", instead of the override already present in nixpkgs?

Yes.
The TokuDB plugin is built only with initial-exec-tls disabled.
Jemalloc 4 is now used only in mariadb, so initial-exec-tls can be disabled.
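
The double preload described at the start of this comment could arise from a configuration roughly like the following. This is a sketch pieced together from the two settings quoted in this thread, assuming `pkgs.jemalloc450` as it existed in nixpkgs at the time; it is not a copy of the actual configuration.

```nix
# Sketch of the conflicting setup, combining two settings quoted in this thread.
{ pkgs, ... }:
{
  # Preloads the default pkgs.jemalloc into every process
  # (through /etc/ld-nix.so.preload).
  environment.memoryAllocator.provider = "jemalloc";

  # Additionally preloads jemalloc 4.5.0 into mysqld only,
  # so the process ends up with two jemalloc versions.
  systemd.services.mysql.environment.LD_PRELOAD =
    "${pkgs.jemalloc450}/lib/libjemalloc.so";
}
```

Dropping either of the two settings leaves a single preloaded allocator.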

@flokli
Contributor

flokli commented Mar 23, 2020

You can't just flip the default in pkgs/development/libraries/jemalloc/common.nix, it might break elsewhere.

@Izorkin
Contributor Author

Izorkin commented Mar 23, 2020

Oops, fixed.

Contributor

@flokli flokli left a comment


I don't understand what this PR is supposed to do.

This moves the override of jemalloc that was passed into mariadb to a global override for every reference of jemalloc450. It also doesn't seem to be a no-op for the mariadb package itself.

@@ -12353,7 +12353,7 @@ in

   jemalloc = callPackage ../development/libraries/jemalloc { };

-  jemalloc450 = callPackage ../development/libraries/jemalloc/jemalloc450.nix { };
+  jemalloc450 = callPackage ../development/libraries/jemalloc/jemalloc450.nix { disableInitExecTls = true; };
Contributor

This still changes pkgs.jemalloc450 for all packages referring to it?

@@ -15875,7 +15875,7 @@ in
   mariadb = callPackage ../servers/sql/mariadb {
     # As per mariadb's cmake, "static jemalloc_pic.a can only be used up to jemalloc 4".
     # https://jira.mariadb.org/browse/MDEV-15034
-    jemalloc = jemalloc450.override ({ disableInitExecTls = true; });
+    jemalloc = jemalloc450;
Contributor

Why do you remove the override from here and move it to pkgs.jemalloc450? This looks like a no-op to me (for pkgs.mariadb), but a breakage for many different packages.

Contributor Author

jemalloc450 is not used anywhere else.
Should I use this variant?

  mariadb = callPackage ../servers/sql/mariadb {
    # As per mariadb's cmake, "static jemalloc_pic.a can only be used up to jemalloc 4".
    # https://jira.mariadb.org/browse/MDEV-15034
    jemalloc = jemalloc450-mariadb;

...
    jemalloc450 = callPackage ../development/libraries/jemalloc/jemalloc450.nix { };
    jemalloc450-mariadb = callPackage ../development/libraries/jemalloc/jemalloc450.nix { disableInitExecTls = true; };

Contributor

If jemalloc450 isn't used anywhere, I'd propose to do the callPackage to jemalloc450.nix entirely inside the mysql derivation.

Contributor

Still, I don't see how your PR changes anything. You just move the override from one place to another - the same derivation is passed as jemalloc to the mysql derivation.

Contributor Author

So is it correct to load a library built with disableInitExecTls = true; like this?

malloc-lib = ${pkgs.jemalloc450}/lib/libjemalloc.so

Contributor

If you only callPackage jemalloc450.nix inside the mariadb derivation, you can call it with disableInitExecTls = true;, yes.
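
The three variants discussed in this review thread can be compared with a toy model in plain Nix. Everything here is a stand-in: `mkJemalloc450` and `mkMariadb` are hypothetical functions that return attribute sets instead of derivations, so the sketch only illustrates which attributes change, not how the real packages build.

```nix
# Toy model, not nixpkgs code: packages are plain attribute sets here.
let
  # Stand-in for jemalloc450.nix with its defaulted argument.
  mkJemalloc450 = { disableInitExecTls ? false }: { inherit disableInitExecTls; };
  # Stand-in for the mariadb package function.
  mkMariadb = { jemalloc }: { inherit jemalloc; };

  # Before the PR: the override is applied only where mariadb is defined.
  before = rec {
    jemalloc450 = mkJemalloc450 { };
    mariadb = mkMariadb { jemalloc = mkJemalloc450 { disableInitExecTls = true; }; };
  };

  # The PR as reviewed: the shared attribute itself is changed.
  reviewed = rec {
    jemalloc450 = mkJemalloc450 { disableInitExecTls = true; };
    mariadb = mkMariadb { jemalloc = jemalloc450; };
  };

  # The reviewer's suggestion: no shared attribute, mariadb builds its own copy.
  suggested = {
    mariadb = mkMariadb { jemalloc = mkJemalloc450 { disableInitExecTls = true; }; };
  };
in
{
  mariadbSameInReviewed = before.mariadb == reviewed.mariadb;          # true
  sharedAttrSameInReviewed = before.jemalloc450 == reviewed.jemalloc450; # false
  mariadbSameInSuggested = before.mariadb == suggested.mariadb;        # true
  suggestedExportsJemalloc450 = suggested ? jemalloc450;               # false
}
```

In this model the reviewed change leaves mariadb's input identical while altering the shared `jemalloc450`, which matches the objection raised above.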

@Izorkin Izorkin closed this Mar 28, 2020
@Izorkin Izorkin deleted the mariadb-fix branch March 28, 2020 19:34
@flokli
Contributor

flokli commented Mar 28, 2020

Can you explain why this was closed?

@Izorkin
Contributor Author

Izorkin commented Mar 29, 2020

> Can you explain why this was closed?

I do not know the correct way to load malloc-lib.

I added this variant to my config:

  nixpkgs.overlays = [
    (self: super: {
      # Refer to the previous package set (super) inside the overlay.
      jemalloc-mariadb = super.jemalloc450.override {
        disableInitExecTls = true;
      };
    })
  ];

       ### TokuDB
-      malloc-lib = ${pkgs.jemalloc450}/lib/libjemalloc.so
+      malloc-lib = ${pkgs.jemalloc-mariadb}/lib/libjemalloc.so

@Izorkin
Contributor Author

Izorkin commented Mar 31, 2020

@flokli I found a method to ignore environment.memoryAllocator.provider:

diff --git a/nixos/modules/services/databases/mysql.nix b/nixos/modules/services/databases/mysql.nix
index f9e657f5774..7dc313b7239 100644
--- a/nixos/modules/services/databases/mysql.nix
+++ b/nixos/modules/services/databases/mysql.nix
@@ -378,6 +378,7 @@ in
           RuntimeDirectoryMode = "0755";
           Restart = "on-abort";
           RestartSec = "5s";
+          InaccessiblePaths = "/etc/ld-nix.so.preload";
           # The last two environment variables are used for starting Galera clusters
           ExecStart = "${mysql}/bin/mysqld --defaults-file=/etc/my.cnf ${mysqldOptions} $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION";
           ExecStartPost =

Is this variant correct?
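
The same setting could presumably be tried from a system configuration without patching the module, using the generic `serviceConfig` option. This is only a sketch of that idea; whether hiding the preload file is the right approach is the open question in this comment.

```nix
# Sketch: apply the setting from the diff above through the system
# configuration instead of editing mysql.nix.
{
  # Hide the system-wide preload file from the mysql service only.
  systemd.services.mysql.serviceConfig.InaccessiblePaths = [
    "/etc/ld-nix.so.preload"
  ];
}
```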

@flokli
Contributor

flokli commented Mar 31, 2020

I have no idea if that's correct or not. I propose reaching out to @joachifm and @delroth, authors of nixos/modules/config/malloc.nix.

Also, will this all still be necessary when a version of mysql supporting latest jemalloc is released?

@Izorkin
Contributor Author

Izorkin commented Mar 31, 2020

Upstream issue "Remove the TokuDB storage engine": https://jira.mariadb.org/browse/MDEV-19780
Is the TokuDB plugin still needed in 10.5?

@flokli
Contributor

flokli commented Mar 31, 2020

Hm, if upstream decided to deprecate TokuDB in favor of MyRocks, I don't think we should add more special cases for it.

However, this issue/PR has diverged quite a bit from the initial description.

Please open new issues if new things arise.
