mnist: init at 2018-11-16 #50448
I set the version to today's date because the package doesn't have versions, unless you count the paper (1999) as the "version". I'm not even sure if this paper: LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86, 2278–2324.
Related: #46922
I'm also wondering if this is the right way of doing multi-source derivations. I've just copied the style from the OCaml derivations. I also wonder about [...]
I've discovered another way to create this. It's actually more minimal, because the current derivation leaves the "srcs" still in the [...] The below just creates symlinks to the individual data.

    let
      srcs = {
        train-images = fetchurl {
          url = "http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz";
          sha256 = "029na81z5a1c9l1a8472dgshami6f2iixs3m2ji6ym6cffzwl3s4";
        };
        train-labels = fetchurl {
          url = "http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz";
          sha256 = "0p152200wwx0w65sqb65grb3v8ncjp230aykmvbbx2sm19556lim";
        };
        test-images = fetchurl {
          url = "http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz";
          sha256 = "1rn4vfigaxn2ms24bf4jwzzflgp3hvz0gksvb8j7j70w19xjqhld";
        };
        test-labels = fetchurl {
          url = "http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz";
          sha256 = "1imf0i194ndjxzxdx87zlgn728xx3p1qhq1ssbmnvv005vwn1bpp";
        };
      };
    in
      linkFarm "mnist-2018-11-16" [
        { name = srcs.train-images.name; path = srcs.train-images; }
        { name = srcs.train-labels.name; path = srcs.train-labels; }
        { name = srcs.test-images.name; path = srcs.test-images; }
        { name = srcs.test-labels.name; path = srcs.test-labels; }
      ]

However, I suspect this does not fit the conventions because there is no [...]
@c0bw3b What do you think of using [...]
I think it's preferable to [...] The only other approach I can think of would be to use [...]
The files should not be decompressed.
Oh.. Then I guess your current derivation is the better approach.
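
The PR's actual derivation is not reproduced in this thread. As a rough sketch only, a multi-source derivation that keeps the files compressed might look something like the following. The URLs, hashes, and package name are taken from the `linkFarm` snippet above; the use of `stdenvNoCC`, the `phases` list, and the install script are assumptions for illustration, not the code under review.

```nix
# Hypothetical sketch; not the derivation submitted in this PR.
{ stdenvNoCC, fetchurl }:

let
  # fetchurl downloads each file as-is, so every source stays a .gz file.
  srcs = {
    train-images = fetchurl {
      url = "http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz";
      sha256 = "029na81z5a1c9l1a8472dgshami6f2iixs3m2ji6ym6cffzwl3s4";
    };
    train-labels = fetchurl {
      url = "http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz";
      sha256 = "0p152200wwx0w65sqb65grb3v8ncjp230aykmvbbx2sm19556lim";
    };
    test-images = fetchurl {
      url = "http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz";
      sha256 = "1rn4vfigaxn2ms24bf4jwzzflgp3hvz0gksvb8j7j70w19xjqhld";
    };
    test-labels = fetchurl {
      url = "http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz";
      sha256 = "1imf0i194ndjxzxdx87zlgn728xx3p1qhq1ssbmnvv005vwn1bpp";
    };
  };
in
stdenvNoCC.mkDerivation {
  name = "mnist-2018-11-16";

  # Only the install phase runs, so nothing is unpacked or built.
  phases = [ "installPhase" ];

  # Symlink each fetched file into $out under its original file name.
  installPhase = ''
    mkdir -p $out
    ln -s ${srcs.train-images} $out/${srcs.train-images.name}
    ln -s ${srcs.train-labels} $out/${srcs.train-labels.name}
    ln -s ${srcs.test-images} $out/${srcs.test-images.name}
    ln -s ${srcs.test-labels} $out/${srcs.test-labels.name}
  '';
}
```

Compared with `linkFarm`, this form goes through `mkDerivation`, which would also give a natural place to attach further attributes if the conventions call for them.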
@c0bw3b ready to merge?
Would it make sense to mark it with [...]
@veprbl why would we do that when none of the data packages do that?
Tested locally. LGTM.
Motivation for this change
Adding the MNIST dataset. This is useful for any ML application that depends on this dataset, and tests can be run against this small dataset.
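
As a minimal sketch of how a downstream project might consume the dataset, assuming the package ends up exposed as `pkgs.mnist` (an assumed attribute name) and using `MNIST_DIR` as a purely illustrative environment variable:

```nix
# Hypothetical consumer; `pkgs.mnist` and MNIST_DIR are assumptions.
{ pkgs ? import <nixpkgs> { } }:

pkgs.mkShell {
  # Expose the store path holding the four .gz files to tools run in the shell.
  MNIST_DIR = "${pkgs.mnist}";
}
```

Inside the resulting `nix-shell`, a test suite could read the compressed files from `$MNIST_DIR` instead of downloading them at run time.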
Things done
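Tested using sandboxing (option `sandbox` in `nix.conf` on non-NixOS)
Tested compilation of all pkgs that depend on this change using `nix-shell -p nox --run "nox-review wip"`
Tested execution of all binary files (usually in `./result/bin/`)
Determined the impact on package closure size (by running `nix path-info -S` before and after)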