Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to hidapi #79

Merged
merged 12 commits into from
Jun 7, 2017
Merged

Switch to hidapi #79

merged 12 commits into from
Jun 7, 2017

Conversation

cyrozap
Copy link
Collaborator

@cyrozap cyrozap commented Mar 2, 2017

Apologies for the giant diff, but I had to get rid of the hdevice definition to get everything to build, and that required me to replace it everywhere. Also, the CMake code to find the library is not robust, but it's no less robust than the way it was before regarding how it checked for libusb (it didn't). We should probably fix that at some point.

Also, I simplified the udev rules to one line and changed the device permissions to 660, setting the group to "plugdev" so not just anyone logged in to the system can talk to the connected programmer.

@whitequark
Copy link
Collaborator

Why not just typedef hid_device* hdevice;?

@cyrozap
Copy link
Collaborator Author

cyrozap commented Mar 2, 2017

I tried that originally, but I just got a bunch of compiler errors. Something about "hid_device*" not being a type.

@whitequark
Copy link
Collaborator

Well which errors exactly?

@whitequark
Copy link
Collaborator

This works for me:

$ cat t.cpp
#include <hidapi/hidapi.h>
typedef hid_device* hdevice;
$ gcc -c t.cpp

@cyrozap
Copy link
Collaborator Author

cyrozap commented Mar 2, 2017

I just pushed an amended version of that commit and it works now--not sure why it didn't before.

@whitequark
Copy link
Collaborator

OK this is much better. Let me try it on the buildbot...

@whitequark
Copy link
Collaborator

whitequark commented Mar 2, 2017

//Set the device configuration
//If this fails, with LIBUSB_ERROR_BUSY, poll every 100 ms until the device is free
if(0 != (err = libusb_set_configuration(hdev, 1)))
LogVerbose("Using device at %s\n", cur_dev->path);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there is an obvious use-after-free here because you are using cur_dev but you've already called hid_free_enumeration(devs);. However, fixing this issue still doesn't make things work because mode switching just doesn't work for some reason and does nothing (Linux x64, hidapi-libusb).

@ArcaneNibble
Copy link
Collaborator

So a whole bunch of funny stuff seems to be going on here. First of all, the command to switch from "white" mode to "orange" mode doesn't seem to be a real HID report at all. It is just 64 bytes of 0 with a random 0x09 at offset 60. For whatever reason, the hidapi-libusb code skips the first byte if the first byte of the buffer is zero. If I change SwitchMode to have a 65-byte buffer and shift the 0x09 to offset 61, hidapi-libusb seems to start working (I can detect a part). However, hidapi-hidraw does not work at all. It can correctly switch from "white" mode to "orange" mode (with the SwitchMode change), but once in "orange" mode it always reports INTERNAL ERROR: hid_get_indexed_string failed.

@ArcaneNibble
Copy link
Collaborator

Er, hid_get_indexed_string just outright isn't implemented on the hidraw backend apparently. Not sure if this is worth fixing?

@ArcaneNibble
Copy link
Collaborator

Windows can switch from "white" to "orange" mode, but interactions in "orange" mode just result in ERROR: Unexpected reply and ERROR: Fault condition detected during initial check, exiting. Also, whitequark is probably going to tell me I'm doing it wrong, but getting cmake to statically link hidapi is immensely frustrating (er, more so than my usual build system butchery) because it requires setupapi to be linked in. The only way I could get this to work is to hack the gpdevboard CMakeLists.txt to contain target_link_libraries(gpdevboard ${HIDAPI_LIBRARY} log setupapi). Previously I could get away with just dumping a large pile of -L and -framework options in CMAKE_EXE_LINKER_FLAGS.

@whitequark
Copy link
Collaborator

The only way I could get this to work is to hack the gpdevboard CMakeLists.txt to contain target_link_libraries(gpdevboard ${HIDAPI_LIBRARY} log setupapi). Previously I could get away with just dumping a large pile of -L and -framework options in CMAKE_EXE_LINKER_FLAGS.

Um. You are doing it right. And previously you were doing it wrong.

@ArcaneNibble
Copy link
Collaborator

But setupapi should only be linked on Windows, so somehow this needs fixing of some kind.

@whitequark
Copy link
Collaborator

whitequark commented Mar 2, 2017

if(WIN32)

@ArcaneNibble
Copy link
Collaborator

Alright, prepending a 0x00 byte in SendInterruptTransfer seems to more-or-less fix things. You also need to include <cstdlib> for wcstombs on macOS.

TL;DR:

diff --git a/src/gpdevboard/usb.cpp b/src/gpdevboard/usb.cpp
index 9b29b3d..bd41123 100644
--- a/src/gpdevboard/usb.cpp
+++ b/src/gpdevboard/usb.cpp
@@ -25,6 +25,8 @@
 #include <log.h>
 #include <gpdevboard.h>
 #include <unistd.h>
+#include <cstring>
+#include <cstdlib>
 
 using namespace std;
 
@@ -33,7 +35,10 @@ using namespace std;
 
 bool SendInterruptTransfer(hdevice hdev, const uint8_t* buf, size_t size)
 {
-       if(hid_write(hdev, buf, size) < 0)
+        uint8_t *new_buf = (uint8_t *)malloc(size + 1);
+        new_buf[0] = 0x00;
+        memcpy(&new_buf[1], buf, size);
+        if(hid_write(hdev, new_buf, size + 1) < 0)
        {
                LogError("hid_write failed (%ls)\n", hid_error(hdev));
                return false;
@@ -100,14 +105,16 @@ hdevice OpenDevice(uint16_t idVendor, uint16_t idProduct, int nboard)
                cur_dev = cur_dev->next;
                dev_index++;
        }
-       hid_free_enumeration(devs);
 
        hdevice hdev;
-       if (!cur_dev)
+       if (!cur_dev) {
+               hid_free_enumeration(devs);
                return NULL;
+       }
 
        LogVerbose("Using device at %s\n", cur_dev->path);
        hdev = hid_open_path(cur_dev->path);
+       hid_free_enumeration(devs);
        if (!hdev)
        {
                LogError("hid_open_path failed\n");

@ArcaneNibble
Copy link
Collaborator

cyrozap: Please verify that prepending a 0x00 (Report ID?) is indeed a proper fix and not a flaky hack. I'll test everything again tomorrow (later today).

@cyrozap
Copy link
Collaborator Author

cyrozap commented Mar 2, 2017

@rqou Please try the latest patches and see if they fixed what they were supposed to. As for the libusb-backend-weirdness issue, can you please check if the patch below works? If it doesn't, that means the programmer is not ignoring the report ID byte and we'll have to use the hack you described.

diff --git a/src/gpdevboard/protocol.cpp b/src/gpdevboard/protocol.cpp
index 42a5415..5eafd47 100644
--- a/src/gpdevboard/protocol.cpp
+++ b/src/gpdevboard/protocol.cpp
@@ -181,6 +181,7 @@ bool DataFrame::Roundtrip(hdevice hdev)
 bool SwitchMode(hdevice hdev)
 {
        uint8_t data[64] = {};
+       data[0] = 0x01;
        data[60] = 0x09;
        return SendInterruptTransfer(hdev, data, sizeof(data));
 }

@cyrozap
Copy link
Collaborator Author

cyrozap commented Mar 2, 2017

@rqou:

But setupapi should only be linked on Windows, so somehow this needs fixing of some kind.

That's trivially fixable--we can do that when fixing the rest of the build system.

@whitequark
Copy link
Collaborator

This PR now passes the same number of tests on Linux as master: https://openfpga-dashboard.antikernel.net/builders/openfpga/builds/87

@ArcaneNibble
Copy link
Collaborator

Alright, testing again:

Linux, libusb, no data[0] patch - cannot get out of "white" mode
Linux, libusb, with data[0]=0x01 patch - works
Linux, hidraw, no data[0] patch - cannot get out of "white" mode
Linux, hidraw, with data[0]=0x01 patch - cannot get out of "white" mode
macOS, no data[0] patch - cannot get out of "white" mode
macOS, with data[0]=0x01 patch - works
Windows, no data[0] patch - cannot get out of "white" mode
Windows, with data[0]=0x01 patch - cannot get out of "white" mode

so something funny is definitely going on.

@@ -26,6 +26,11 @@
#include <gpdevboard.h>
#include <unistd.h>

#if defined(__APPLE__)
#include <cstdlib>
Copy link
Collaborator

@ArcaneNibble ArcaneNibble Mar 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<cstdlib> is always supposed to be included when calling wcstombs. It only happens to break on macOS due to transitive dependencies in other headers. Likewise, <cstring> is always supposed to be included when calling memcpy (in my hack). These shouldn't be in the __APPLE__ guard.

@ArcaneNibble
Copy link
Collaborator

From the hidapi documentation: "The first byte of data[] must contain the Report ID. For devices which only support a single report, this must be set to 0x0. The remaining bytes contain the report data. Since the Report ID is mandatory, calls to hid_write() will always contain one more byte than the report contains. For example, if a hid report is 16 bytes long, 17 bytes must be passed to hid_write(), the Report ID (or 0x0, for devices with a single report), followed by the report data (16 bytes). In this example, the length passed in would be 17."

So it seems you really do need to do my "prepend a zero" method. I have no idea why some of the cases do end up working though.

@ArcaneNibble
Copy link
Collaborator

But on the other hand "Input reports are returned to the host through the INTERRUPT IN endpoint. The first byte will contain the Report number if the device uses numbered reports." So the API is asymmetric (at least for us where I'm pretty sure the programmer doesn't use numbered reports).

@cyrozap
Copy link
Collaborator Author

cyrozap commented Mar 4, 2017

@rqou

From the hidapi documentation: ...

That's only in the context of communicating using the "real" HID protocol, where the report descriptor says you need to send 16 bytes, and then you prepend the report ID. If you aren't doing that, i.e., if you're just sending raw buffers of data to the device (like we are), then you just give hid_write() a buffer and its length and that's what you'll see on the wire (barring any workaround nonsense, of course).

This is the commit where the report-skipping code was added: signal11/hidapi@89d84b1

I have no idea why some of the cases do end up working though.

If it isn't too much trouble, would you mind taking some USB packet captures for the "without any patch", "with data[0]=0x01 patch", and "with prepend-a-zero patch" cases on those four library/OS combinations you tested? I'd like to see what's actually being sent over the wire.

@whitequark
Copy link
Collaborator

If you look at the hidapi source it looks like sometimes it prepends a fake report ID (on Windows). I've only gave it a cursory look though.

That said... hidapi is gross. Can we just directly use hid.dll on Windows, libusb on Linux and [whatever is the system library] on macOS?

@cyrozap
Copy link
Collaborator Author

cyrozap commented Mar 4, 2017

Can we just directly use hid.dll on Windows, libusb on Linux and [whatever is the system library] on macOS?

Certainly, that's always an option. Of course, someone would have to write the code to do that, and I don't really feel like writing a bunch of code to use a dev kit I don't own, for two operating systems I don't use.

A middle-ground solution may be to vendor hidapi and fix the parts that are causing trouble.

Note: You actually only need to use libusb on very old Linux versions and FreeBSD--modern Linux has hidraw (this is different from hidapi) which allows you to communicate with the device without detaching the HID kernel driver.

@ArcaneNibble
Copy link
Collaborator

So from a combination of Google and poking around the Linux source code (combined with the fact that it works), I'm pretty sure you do need to prepend a 0 byte. The Linux hidraw source code contains the same report-ID-skipping logic as the hidapi libusb backend.

From what I understand, we actually are using the "real" HID protocol when communicating with the devkit. The only strange thing is that the devkit's descriptor for what should be in the "real" HID protocol specifies that it should contain 0x40 bytes of undefined usage in both its input and output reports. Because the descriptor does not contain a REPORT_ID entry, the data that appears on the wire is thus only these 64 bytes without a report ID. This is why when reading data we do not have to do anything. The reason we have to do more work when writing data is because the kernel API just works that way (always containing a report ID that will not be sent on the wire). hidapi then makes the other platforms behave the same way as Linux.

Incidentally, the reason that other people were reporting bugs against hidapi regarding report IDs might just be because the devices themselves are getting it wrong. For example, the kernel contains a HID_QUIRK_SKIP_OUTPUT_REPORT_ID quirk to never send report IDs.

@ArcaneNibble
Copy link
Collaborator

According to this blog post, report IDs only appear on the wire iff there is a possibility of having more than one (as determined by the presence of REPORT_ID). If there isn't, you just send on the wire the bytes described by the descriptor which in our case is "64 bytes of undefined contents."

Sorry, something went wrong.

@ArcaneNibble
Copy link
Collaborator

So what seems to be happening in the data[0]=0x01 case (at least on Linux) is that the kernel layers end up seeing a report ID of 1 followed by 63 report bytes. The kernel doesn't care that the descriptor doesn't specify that a report ID of 1 exists and just concatenates these (implemented by never splitting them in the first place) and sends 64 bytes onto the wire. Without the patch, the kernel sees a "do not send a report ID" followed by 63 report bytes and sends only those 63 bytes on the wire. The reason everything else after the "go from 'white' to 'orange' mode" command seems to work fine is because they start with a sequence number that seems to start at 1. Therefore, other packets never end up starting with zero and don't fail.

I'm guessing (haven't tested) that the reason Windows isn't happy in either case is because Windows actually does more parsing of report descriptors and won't correctly send a packet specifying a report ID that the descriptor doesn't claim is in use.

Sorry, something went wrong.

@whitequark
Copy link
Collaborator

So... with absolutely no disrespect to @cyrozap, whose work I appreciate, I'm not sure if merging this PR is a good idea. It adds a dependency that's rather annoying to build on Windows (because Windows) and macOS (because it requires autotools), and libhidraw does not have particularly good code (just try reading some of it). I would much rather see a clean reimplementation of libhidraw that can be easily vendored into our codebase, and I'm probably going to take a stab at it when I am back in the same country as my devkit.

Sorry, something went wrong.

@ArcaneNibble
Copy link
Collaborator

Hmm, I had no trouble at all cross-building hidapi for Windows from a Linux host. Building it for macOS didn't work with either autocrap or the manual makefile, but one of them managed to produce a .o file that you can pretend is the .a file and it will still work. The code is a little bit disgusting, but IMHO so would "working with IOKit" and "working with hid.dll/the DDK" become after a while. We can definitely vendor the three files (one for each platform) from hidapi and improve it though.

@ArcaneNibble
Copy link
Collaborator

ArcaneNibble commented May 22, 2017

Per some meatspace discussions @azonenberg and I had, I am going to try to vendor hidapi (it's just one .c file per platform) and slowly deleting/cleaning parts of the code we don't need.

@ArcaneNibble
Copy link
Collaborator

This code has now been tested to still work on Linux and Windows and macOS

@azonenberg azonenberg merged commit 4960e43 into azonenberg:master Jun 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants