nix repl: Provide documentation from comment when evaluating to lambda #1652

roberth · 2017-10-31T16:18:56Z

This provides limited support for Python-like docstrings in the nix repl. When the user evaluates an expression to a lambda, nix repl will now print the contents of a documentation comment, as long as the comment is written right before the attribute and the attribute value is an actual lambda.

Current limitations:

Overriding the documentation requires redundant lambdas (eta abstraction to be precise)
No docstring support for anything other than functions

Demo:

nix-repl> pkgs.lib.concatMapStringsSep ":"
«lambda @ /nix/store/ksi72626r14035xkndbn584pmb7l703r-nixos-17.09.1535.1fdca25ee8/nixos/lib/strings.nix:64:30»

| concatMapStringsSep
| -------------------
| 
| NOTE: This function has already been applied!
|       You should ignore the first 1 parameter(s) in this documentation,
|       because they have already been applied.
|
| First maps over the list and then concatenates it.
| 
| Example:
|    concatMapStringsSep "-" (x: toUpper x)  ["foo" "bar" "baz"]
|    => "FOO-BAR-BAZ"


nix-repl>

Changes:

A new test set, nix-repl.sh
A new module, comment.cc for the documentation retrieval logic
Additions to repl.cc

Todo:

choose a comment syntax
hide the documentation behind a hint that :doc displays the documentation for the function

thufschmitt · 2017-11-01T08:22:59Z

Really nice improvement :)

It may be more natural to have this printed only with the :t command (or maybe a new :doc one) rather than when evaluating, what do you think of this?

grahamc · 2017-11-01T08:36:49Z

!!!


gilligan · 2017-11-01T11:18:08Z

Oops.. i didn't realize the PR was already open and commented on your fork instead.. well you can see it anyway ;)

gilligan · 2017-11-01T11:19:13Z

src/nix/comment.cc

+
+    std::string rawComment = matches[1];
+    std::string name = matches[2];
+    int timesApplied = countLambdas(matches[3]);


I think matches could be of length 3 here and you are accessing the 4th element? Since you check for < 3 in line 126..

gilligan · 2017-11-01T11:20:00Z

src/nix/comment.cc

+}
+
+// SLOW, probably O(n^2)
+std::string stripPrefix(std::string prefix, std::string s) {


How about something like this:

std::string stripPrefix(std::string prefix, std::string s) {
    std::string::size_type index = s.find(prefix);
    return (index == 0) ? s.erase(0, prefix.length()) : s;
}

Untested but I guess that should work?! ;-)
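For comparison, a minimal sketch of a variant that only inspects the first prefix.size() characters instead of searching the whole string with find. It keeps the stripPrefix name and argument order from the diff; the const-reference parameters are an assumption, and this is not necessarily what ended up in the PR.

```cpp
#include <string>

// Sketch only: remove `prefix` from the start of `s` if it is there.
// compare() looks at no more than prefix.size() characters, whereas
// s.find(prefix) may scan all of `s` before finding a match elsewhere.
std::string stripPrefix(const std::string & prefix, const std::string & s)
{
    if (s.size() >= prefix.size() && s.compare(0, prefix.size(), prefix) == 0)
        return s.substr(prefix.size());
    return s;
}
```

With this version, stripPrefix("  ", "  foo") gives "foo", and a string that does not start with the prefix is returned unchanged.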

gilligan · 2017-11-01T11:20:53Z

src/nix/comment.cc

+    regex_search(sourcePrefix, matches, e);
+
+    std::stringstream buffer;
+    if (matches.length() < 3) {


I really don't know what kind of standards the nix code base has in general.. i usually try to avoid "magic numbers" and would introduce some #define here or so.. same for the indices to access matches below. Just a thought.
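A rough sketch of what named group indices could look like, assuming matches is a std::smatch. The enum names, the Parts struct, the splitDefinition function and the regex itself are all hypothetical stand-ins, since the real pattern in comment.cc is not shown in this thread. Note that in std::match_results, size() is the number of sub-matches while length(n) is the character length of sub-match n, so the sketch checks size().

```cpp
#include <cstddef>
#include <optional>
#include <regex>
#include <string>

// Hypothetical names for the capture groups of the stand-in regex below.
enum Group : std::size_t { Whole = 0, Comment = 1, Name = 2, Lambdas = 3, GroupCount = 4 };

struct Parts {
    std::string comment, name, lambdas;
};

// Sketch only: split "/* comment */ name = rest" into its three pieces.
std::optional<Parts> splitDefinition(const std::string & sourcePrefix)
{
    // Stand-in pattern: a block comment, an attribute name, '=', then the rest.
    static const std::regex e(R"(/\*([\s\S]*?)\*/\s*([A-Za-z_][A-Za-z0-9_'-]*)\s*=\s*([\s\S]*)$)");
    std::smatch matches;
    if (!std::regex_search(sourcePrefix, matches, e) || matches.size() < GroupCount)
        return std::nullopt;
    return Parts{matches[Comment].str(), matches[Name].str(), matches[Lambdas].str()};
}
```

An enum rather than #define keeps the indices scoped and typed, and GroupCount documents how many groups the pattern is expected to have.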

gilligan · 2017-11-01T11:34:51Z

@roberth I wonder if this should perhaps not apply to just any kind of /* .. */ delimited block, and whether we should instead introduce some marker?

{
  /*nixdoc
  Proofs that P = NP
  */
  proofThingy = p: np: 42;
}

Otherwise we might end up with arbitrary things like TODO: rewrite this or FIXME or whatever else. Since we (NixOsDocs folks) are starting off with improving nix functions documentation we could pick this up from the get go.

This would definitely also help for automatically extracting docstrings like these from nixpkgs. In fact - maybe we could go even further and use nixdoc:<function> (So nixdoc:proofThingy for the example above). That would make it really easy to create things like ctags files for navigating nixpkgs. I would love that.

roberth · 2017-11-07T17:01:47Z

@gilligan , thank you for your review. I have implemented all of your suggestions except the nixdoc:.

I'm also a bit concerned about the ugly comments that might show up. I'm not sure what the special syntax should be. nixdoc:myfunc will inevitably lead to copy paste errors and bitrot. A 'small syntax' like /** */ might be better than nixdoc, but that leaves no room for metadata in the comment that isn't normally shown...

gilligan · 2017-11-07T21:28:45Z

@roberth right, having the function name duplicated in the comment might easily go out of sync. Then again: doesn’t that also apply to any kind of comment written for any function anyway?

So the function documentation can always be wrong but if we have some tag we could maybe limit the output to the “right” kind of comment blocks?

In short: IMHO looking for “/** nixdoc” might be preferable.

roberth · 2017-12-02T16:16:00Z

@gilligan I agree that we need some kind of marker to indicate that it's a documentation comment.
I think /** or ## should be sufficient, but it's also nice to be forward compatible, so we can switch to RST/Markdown/whatever when we want to generate HTML docs. My suggestion:

 * `/**` followed by plain text, to be shown verbatim in the repl
 * `/**sometoken ` followed by text in some format

We can remove the sometoken in the repl, for forward compatibility. When we're going to generate HTML docs, we can decide on a format without causing breakage when switching. So in the future, docs may look like

/**md
 # Introduction

 This function bla bla
 */

What I like about this is that it is very clear that we're using markdown here, it will be shown in the repl without the md part, and can be processed in a future repl, knowing that it's the 'Markdown+Nix' format.
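To make the proposal concrete, here is a minimal sketch of how a repl might split such a comment into an optional format token and the text to display. The DocComment struct and parseDocComment function are hypothetical names, and treating any alphanumeric run directly after /** as the token is an assumption about how the rule would be pinned down.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

struct DocComment {
    std::string format; // empty means plain text; e.g. "md" for `/**md`
    std::string body;   // text with the opener, token and closer removed
};

// Sketch only. Returns nullopt for anything that is not a `/** ... */`
// comment, so ordinary `/* FIXME */` blocks are not treated as documentation.
std::optional<DocComment> parseDocComment(std::string_view raw)
{
    constexpr std::string_view open = "/**", close = "*/";
    if (raw.size() < open.size() + close.size()
        || raw.substr(0, open.size()) != open
        || raw.substr(raw.size() - close.size()) != close)
        return std::nullopt;
    raw.remove_prefix(open.size());
    raw.remove_suffix(close.size());

    // An alphanumeric run directly after `/**` is taken as the format token.
    std::size_t n = 0;
    while (n < raw.size() && std::isalnum(static_cast<unsigned char>(raw[n])))
        n++;

    DocComment doc;
    doc.format = std::string(raw.substr(0, n));
    doc.body = std::string(raw.substr(n));
    return doc;
}
```

One consequence of this rule is that plain text must not start immediately after /**, otherwise its first word would be read as a format token.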

I'm not saying that we should do Markdown in the future, though. There's no standard for it. At least Restructured Text has a better specification (from what I've heard). Markdown is rather popular though.
Let's not go for docbook at least :)

dtzWill · 2017-12-20T21:08:36Z

Sounds good to me! Can't wait! ^_^

roberth · 2018-01-10T15:15:36Z

Before the holidays, I put out a twitter poll about the syntax for documentation comments. The results:

It got 24 votes:

33% /** always plain text */ (8 people)
33% /** always markdown */
17% /**md to pick md format (4 people)
17% ## let's discuss\n=======
One person mentioned doxygen, but it didn't gain support from others.
On IRC, grahamc mentioned docbook

So, there is no clear winner, but a combination of /** plain text by default and /**md to 'upgrade' to markdown can be argued to have 50% of the votes (ignoring the possibility that some plain text voters hate markup languages at all cost).

If anyone is going to decide the color of this bike shed I think it should be @edolstra because I don't want to ruin his awesome creation with ugly comments.

edolstra · 2018-01-10T15:41:08Z

I'm not in favor of Markdown.

roberth · 2018-01-10T16:12:35Z

@edolstra good.

What about an extensible syntax marked by a keyword right after /**?
Or do you want to 'disallow' plain text (most comments now) and standardize on a single format? If so, which one? Some possibilities are restructured text (has a specification; python doc format), docbook and doxygen.

edolstra · 2018-01-10T16:26:12Z

src/nix/comment.cc

+
+        return parseDoc(buffer.str());
+    } catch (std::exception e) {
+        std::cout << "Caught exception: " << e.what() << std::endl;


=> ignoreException().

edolstra · 2018-01-10T16:27:36Z

src/nix/comment.cc

+             i++) {
+            buffer << line << "\n";
+        }
+        buffer << line.substr(0, pos.column-1);


This can be simplified to something like for (auto & line : tokenizeString(readFile(pos.file), "\n")) { ... }.

tokenizeString didn't work because it doesn't give back the empty lines. I didn't find another suitable function in util either, so I have factored the thing out instead. The other review items are now solved.
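A sketch of what such a factored-out helper might look like, using std::getline so that empty lines are kept and the line count stays aligned with the position. The readUpToPos name and the (line, column) parameters are hypothetical; the actual helper in the PR takes a Pos and is not reproduced here.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Sketch only: return the text before (line, column), both 1-based.
// std::getline yields empty lines too, so line counting stays correct.
std::string readUpToPos(std::istream & in, std::size_t line, std::size_t column)
{
    std::ostringstream buffer;
    std::string current;
    for (std::size_t i = 1; std::getline(in, current); i++) {
        if (i < line) {
            buffer << current << "\n";
        } else {
            // Keep only the part of the target line before the column.
            buffer << current.substr(0, column > 0 ? column - 1 : 0);
            break;
        }
    }
    return buffer.str();
}
```

Taking a std::istream rather than a file name means the same function works on a std::ifstream or on a std::istringstream wrapped around text that is already in memory.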

edolstra · 2018-01-10T16:30:39Z

src/nix/comment.hh

+//
+// Will return empty values if nothing can be found.
+// For its limitations, see the docs of the implementation.
+struct Doc lookupDoc(Pos & pos);


Should be const Pos & pos.

edolstra · 2018-01-10T16:31:10Z

src/nix/comment.cc

+#include "comment.hh"
+#include "util.hh"
+
+// This module looks for documentation comments in the source code.


We don't use // in the Nix code base except for single-line comments.

roberth · 2018-01-11T14:39:53Z

Before this can be merged we have to make decisions about syntax, because this code will return FIXME comments and such.

stale · 2021-02-12T05:13:32Z

I marked this as stale due to inactivity.

edolstra · 2022-10-11T13:01:15Z

I'm not really fond of abusing comments for documentation. That's what you do when you don't control the language and don't have a way to add documentation in a more structured way (e.g. C++ and doxygen). But we have the freedom to add documentation in a "proper" way (e.g. as annotations).

roberth · 2022-10-11T13:17:38Z

What makes use abuse? Aren't comments equally controllable?

By turning comments into annotations, you enable static introspection, so that you don't have to burden the interpreter with it. Evaluation performance is already a problem and __functor has an overhead.

Adding documentation at the value level is what you do when you don't control the language tooling and don't have a way to add documentation in a way that works without having to evaluate code. After all, documentation is not a runtime concept but one that relates directly to the source code.

piegamesde · 2022-10-11T13:23:02Z

I'm really in favor of this change (yay for more static and less evaluation, yay for less functors), but I really think we should have a dedicated team of people that discuss the syntax, independently of the implementation.

edolstra · 2022-10-11T13:41:08Z

Shoving documentation into comments and then trying to correlate the comment with some nearby identifier is the sort of hack you do when you can't add documentation as first-class syntax (e.g. Python annotations). At the least, documentation comments should have some distinguishing syntax (like doxygen comments), since it's not obvious in

  # Just a function.
  f = x: x;

that the comment # Just a function is intended as documentation. (E.g. it could also be # FIXME: bla bla.)

In Nix, documentation can only be done at the value level (as in the NixOS module system) because we don't have a proper module system. So we can't actually statically extract from a source file how the functions in that file are supposed to be called. E.g. we have no way to figure out from an attribute like

  /* Bla */
  concatMapStringsSep = ...;

that this defines a function that can be called as lib.concatMapStringsSep.

piegamesde · 2022-10-11T14:07:17Z

Even if you want to do an implementation that is more value/run time based, there are alternative proposals that would not make it a functor-exclusive feature: #5527 (comment)

blaggacao · 2022-10-11T20:34:54Z

We need this (native doc support). A diversity of solutions isn't adding ecosystem (research) value any more at this point in time. To the contrary.

Ericson2314 · 2022-10-12T22:12:44Z

We should do this. We were also talking in the documentation team about the need for better API docs in the code so they do not get out of sync. CC @fricklerhandwerk @infinisil.

@edolstra I understand your concerns about hijacking comments being ugly, but I think it is a good place for iteration. Consider these examples:

In Rust, a /// variation on comments is commonplace, even though there are other annotation options
Typescript, mypy, and other post-hoc type checkers evolved from special comments to dedicated syntax.

Basically, I think it will take a few rounds of experimentation to figure out exactly the system we want. The fact that we are so expression-oriented means that we cannot easily steal ideas from other languages with a "static top-level" as-is. If we go for dedicated syntax immediately, we tie our hands with compatibility concerns. Conversely, if we "steal" some comment syntax we can be sure that we are backwards compatible.

In this manner we can have a low-impact unstable feature that will both allow us to start writing extremely important API documentation immediately, and also give us plenty of flexibility to experiment with different variations. Once we decide what we want, we can introduce dedicated syntax, and only stabilize that --- not the interim comment syntax. Existing API documentation (which is hopefully much more comprehensive at that point than it is today!) can then be transitioned over to using the new system.

This is the lowest risk plan which allows us to accelerate documentation contribution the soonest. Let's do it!

Ericson2314 · 2022-10-12T22:15:17Z

CC @mightyiam who was looking to work on this very problem!

mightyiam · 2022-10-13T04:43:49Z

Thank you, @Ericson2314 .

Tracking issue seems to be #228 (comment). Will provide updates there. If anyone is willing to answer some questions we may have, please subscribe to that issue.

nixos-discourse · 2023-01-13T16:12:48Z

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-generate-documentation-from-arbitrary-nix-code/22292/9

fricklerhandwerk · 2023-02-13T14:47:46Z

Discussed in the Nix team meeting 2023-01-20:

@fricklerhandwerk: there are multiple approaches to the broad problem of documenting Nix language code
@thufschmitt: while the implementation is small, it's a substantial change to the language
@roberth: don't have time to work on this
agreement: this should be an RFC
- @NixOS/documentation-team should coordinate this

nixos-discourse · 2023-02-13T14:51:14Z

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-01-20-nix-team-meeting-minutes-25/25432/1

inclyc · 2023-09-26T03:23:22Z

src/nix/comment.hh

+    // this is crucial information.
+    int timesApplied;
+
+    Doc(std::string rawComment, std::string comment, std::string name, int timesApplied) {


I would prefer the C++ way of writing ctors

Suggested change

Doc(std::string rawComment, std::string comment, std::string name, int timesApplied) {

Doc(std::string rawComment, std::string comment, std::string name, int timesApplied) : rawComment(std::move(rawComment)), comment(std::move(comment)), name(std::move(name)), timesApplied(timesApplied) {

and remove this-> ... = ... assignments.

We are copy-assigning each std::string here. This is inefficient because C++ strings are value types, not references: the buffer is copied each time we do

this->name = name
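Written out in full, the suggestion could look like the sketch below. The member names and the constructor signature are taken from the diff; the order of the members in the struct is an assumption.

```cpp
#include <string>
#include <utility>

struct Doc {
    std::string rawComment;
    std::string comment;
    std::string name;
    int timesApplied;

    // Each by-value parameter is moved into its member, so the string
    // buffer is handed over instead of being copied a second time.
    // Inside the initializer parentheses the name refers to the parameter.
    Doc(std::string rawComment, std::string comment, std::string name, int timesApplied)
        : rawComment(std::move(rawComment))
        , comment(std::move(comment))
        , name(std::move(name))
        , timesApplied(timesApplied)
    { }
};
```

Callers are unaffected: Doc("/* Bla */", "Bla", "f", 1) constructs the same object as before, only without the extra copies.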

inclyc · 2023-09-26T03:27:08Z

src/nix/comment.cc

+
+/* Try to recover a Doc by looking at the text that leads up to a term
+   definition */
+static struct Doc parseDoc(std::string sourcePrefix) {


I'd rather preserve comment information in our lexer than use regexes here. That is, we store pointers to the comments there, and construct the doc text and strip indentation after the parsing stage.

nix/src/libexpr/lexer.l

Lines 305 to 306 in 9428d7d

\#[^\r\n]* /* single-line comments */

\/\*([^*]|\*+[^*/])*\*+\/ /* long comments */

This has no overhead but we must carefully deal with the lifetime of these references/pointers.

I think we should do this in the lexer because it will be flexible for further changes & keep consistent with the lexer.
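As an illustration of the idea only (this is a hand-written scanner, not flex, and not how lexer.l is structured), the sketch below records where comments are while scanning. It stores offsets into the source buffer rather than pointers, which is one possible way to sidestep part of the lifetime concern. CommentSpan and collectComments are hypothetical names.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Offsets into the source buffer instead of pointers, so the records stay
// meaningful for as long as the caller keeps the source text itself.
struct CommentSpan {
    std::size_t begin;  // offset of '#' or of the opening "/*"
    std::size_t length; // length including the delimiters
};

// Sketch only: recognises `# ...` and `/* ... */`. It does not know about
// string literals, which a real lexer has to skip over.
std::vector<CommentSpan> collectComments(std::string_view src)
{
    std::vector<CommentSpan> spans;
    std::size_t i = 0;
    while (i < src.size()) {
        if (src[i] == '#') {
            std::size_t end = src.find_first_of("\r\n", i);
            if (end == std::string_view::npos) end = src.size();
            spans.push_back({i, end - i});
            i = end;
        } else if (src.compare(i, 2, "/*") == 0) {
            std::size_t close = src.find("*/", i + 2);
            if (close == std::string_view::npos) break; // unterminated comment
            spans.push_back({i, close + 2 - i});
            i = close + 2;
        } else {
            i++;
        }
    }
    return spans;
}
```

A later pass can then look up the span that ends right before a definition and strip its indentation, without re-reading or re-matching the file.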

inclyc · 2023-09-26T03:28:10Z

src/nix/comment.cc

+   This module does not support tab ('\t') characters. In some places
+   they are treated as single spaces. They should be avoided.
+*/
+namespace nix::Comment {


Suggested change

namespace nix::Comment {

namespace nix::comment {

nit pick (NFC)

inclyc · 2023-09-26T03:28:49Z

src/nix/comment.cc

+}
+
+/* See lambdas in parseDoc */
+static int countLambdas(std::string piece) {


Suggested change

static int countLambdas(std::string piece) {

static int countLambdas(const std::string &piece) {

inclyc · 2023-09-26T03:31:04Z

src/nix/repl.cc

    std::ostream &  printValue(std::ostream & str, Value & v, unsigned int maxDepth, ValuesSeen & seen);
+
+    // Only prints if a comment is found preceding the position.


Suggested change

// Only prints if a comment is found preceding the position.

/// Only prints if a comment is found preceding the position.

inclyc · 2023-09-26T03:31:47Z

src/nix/comment.hh

+
+struct Doc {
+
+    // Name that the term is assigned to


Suggested change

// Name that the term is assigned to

/// Name that the term is assigned to

inclyc · 2023-09-26T03:33:50Z

src/nix/comment.cc

+
+static std::string readFileUpToPos(const Pos & pos) {
+
+    std::ifstream ifs(static_cast<const std::string>(pos.file));


Can we read from the buffer instead? For static analysis tooling there may be no such file on the filesystem; this is a problem we deal with in #6530.

nixos-discourse · 2024-07-15T11:23:26Z

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-07-08-nix-team-meeting-minutes/49099/1

gilligan reviewed Nov 1, 2017


roberth force-pushed the lambda-docstring branch 2 times, most recently from aa530fb to d05552b Compare November 7, 2017 16:56

edolstra reviewed Jan 10, 2018


nix repl: Provide documentation from comment when evaluating to lambda


c9f618d

roberth force-pushed the lambda-docstring branch from d05552b to c9f618d Compare January 11, 2018 13:40

shlevy added the backlog label Apr 1, 2018

shlevy self-assigned this Apr 1, 2018

This was referenced Oct 27, 2018

Comment constructor haskell-nix/hnix#57

Open

Markup syntax in comments nix-community/nixdoc#5

Closed

domenkozar removed the backlog label Apr 30, 2020

roberth mentioned this pull request Aug 9, 2020

Feature: add :doc command to nix repl #3904

Closed

shlevy removed their assignment Oct 22, 2022

fricklerhandwerk added feature language UX labels Jan 10, 2023

roberth mentioned this pull request Feb 5, 2023

nix __dump-builtins does not include "derivation" #7753

Open

hsjobeki mentioned this pull request Apr 8, 2023

[RFC 0145] Doc-comments NixOS/rfcs#145

Merged

roberth added significant repl labels Jun 2, 2023

thufschmitt marked this pull request as draft June 23, 2023 11:47

Ericson2314 added RFC and removed RFC labels Jun 23, 2023

sternenseemann mentioned this pull request Aug 3, 2023

PoC for RFC145: dynamic documentation for lambdas #8778

Closed


inclyc reviewed Sep 26, 2023


inclyc mentioned this pull request Sep 27, 2023

libexpr: support :doc for nixpkgs lambdas #9054

Closed

roberth mentioned this pull request Jun 10, 2024

Add :doc for lambdas to repl with nix-doc #10771

Closed

roberth mentioned this pull request Jul 9, 2024

Doc comments #11072

Merged

roberth closed this in #11072 Jul 15, 2024
