Enable OCR for ccextractor #76534
LGTM, thanks! Whoever merges this, please run a build of the package on CI first; I don't currently have permissions to trigger it.
This could also be added as an option instead of having it for everyone.
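A per-user opt-out along these lines might look like the sketch below. It assumes the package takes a boolean `ocrSupport` argument, as in this PR's diff, and that `<nixpkgs>` resolves to a checkout containing that change; the argument name in whatever eventually gets merged may differ.

```nix
# Sketch only: `ocrSupport` is the argument name used in this PR's diff
# and is an assumption about the final package interface.
let
  pkgs = import <nixpkgs> { };
in
# Rebuild ccextractor with the OCR CMake flag switched off.
pkgs.ccextractor.override { ocrSupport = false; }
```

With a default of `true`, users who do nothing get OCR, and only those who override the argument pay for a rebuild.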
Thank you for your contributions. This has been automatically marked as stale because it has had no activity for 180 days. If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.
Please squash all commits together and update the PR title and commit message to "ccextractor: enable ocr support".
Done. However, there's a problem: ccextractor is unable to locate the tesseract data. CCExtractor/ccextractor#1170 would fix it, but unfortunately it was rejected.
@@ -1,5 +1,8 @@
 { stdenv, fetchFromGitHub, pkgconfig, cmake
-, glew, glfw3, leptonica, libiconv, tesseract3, zlib }:
+, glew, glfw3, zlib, libiconv
+, ocrSupport ? true, leptonica ? null, tesseract4 ? null }:
-, ocrSupport ? true, leptonica ? null, tesseract4 ? null }:
+, ocrSupport ? true, leptonica, tesseract4 }:
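With the `? null` defaults dropped, the `assert` guarding against null inputs is arguably no longer needed, and the OCR libraries could be pulled in only when the flag is set. A minimal sketch of that pattern, not the PR's final code: it returns just the two affected attributes rather than a full `stdenv.mkDerivation` call, and making `buildInputs` conditional with `lib.optionals` is an assumption about how one might finish the change, not something the diff does.

```nix
# Sketch only: returns the two OCR-related attributes instead of a full derivation.
{ lib, stdenv
, glew, glfw3, zlib, libiconv
, ocrSupport ? true, leptonica, tesseract4 }:

{
  # Same flag as in the diff: ON when ocrSupport is true, OFF otherwise.
  cmakeFlags = [
    "-DWITH_OCR=${if ocrSupport then "ON" else "OFF"}"
  ];

  # Assumption: the OCR libraries are only added when OCR is enabled.
  buildInputs = [ glew glfw3 zlib ]
    ++ lib.optionals ocrSupport [ leptonica tesseract4 ]
    ++ lib.optional (!stdenv.isLinux) libiconv;
}
```

Keeping the inputs non-nullable means `callPackage` always supplies them, while `lib.optionals` keeps them out of the build when `ocrSupport` is false.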
This is a semi-automatic executed nixpkgs-review which does not build all packages (e.g. lumo, tensorflow or pytorch) Result of 1 package built:
@@ -17,7 +20,11 @@ stdenv.mkDerivation rec {

nativeBuildInputs = [ pkgconfig cmake ];
-nativeBuildInputs = [ pkgconfig cmake ];
+nativeBuildInputs = [ pkg-config cmake ];
@@ -1,5 +1,8 @@
{ stdenv, fetchFromGitHub, pkgconfig, cmake
-{ stdenv, fetchFromGitHub, pkgconfig, cmake
+{ stdenv, lib, fetchFromGitHub, pkg-config, cmake
+, glew, glfw3, zlib, libiconv
+, ocrSupport ? true, leptonica ? null, tesseract4 ? null }:

+assert ocrSupport -> leptonica != null && tesseract4 != null;

 with stdenv.lib;
-with stdenv.lib;
"-DWITH_OCR=${if ocrSupport then "ON" else "OFF"}"
];

buildInputs = [ glew glfw3 leptonica tesseract4 zlib ] ++ stdenv.lib.optional (!stdenv.isLinux) libiconv;

meta = {
-meta = {
+meta = with lib; {
@@ -29,6 +36,6 @@ stdenv.mkDerivation rec {
'';
platforms = platforms.unix;
license = licenses.gpl2;
gpl2Only or gpl2Plus?
Applied suggestions. I'm ignoring the other suggestions because they belong to another PR.
Update to tesseract4 Co-authored-by: Sandro <sandro.jaeckel@gmail.com>
@@ -1,5 +1,6 @@
 { lib, stdenv, fetchFromGitHub, pkgconfig, cmake
-, glew, glfw3, leptonica, libiconv, tesseract3, zlib }:
+, glew, glfw3, zlib, libiconv
+, ocrSupport ? true, leptonica ? null, tesseract4 ? null }:

 with lib;
-with lib;
Please apply it where it is required.
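One reading of "where it is required" is to scope `with lib;` to the `meta` attribute set, as the `meta = with lib; {` suggestion does, and to spell out `lib.` everywhere else. A small sketch under that assumption; it returns a plain attribute set rather than a derivation, and the `license` attribute is left out because the thread leaves `gpl2Only` versus `gpl2Plus` open.

```nix
# Sketch only: illustrates scoping, not the package's full expression.
{ lib, stdenv, libiconv }:

{
  # Outside meta, qualify library functions explicitly.
  buildInputs = lib.optional (!stdenv.isLinux) libiconv;

  # Inside meta, `with lib;` brings `platforms` (and `licenses`) into scope.
  meta = with lib; {
    platforms = platforms.unix;
  };
}
```

Limiting `with` to `meta` keeps it clear which names in the rest of the file come from function arguments and which come from `lib`.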
Please apply the changes in a separate commit.
I marked this as stale due to inactivity.
Superseded by #131849
Motivation for this change
Although tesseract and leptonica are included as buildInputs, they are unused as OCR is disabled. Enabling it.
Things done
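Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
Tested execution of all binary files (usually in ./result/bin/)
Determined the impact on package closure size (by running nix path-info -S before and after)
Notify maintainers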
cc @titanous