Implement XDR buffer factories for all platforms #78

Closed
5 tasks done
whitequark opened this issue Jun 3, 2019 · 3 comments

@whitequark
Contributor

whitequark commented Jun 3, 2019

This issue is primarily informational; it collects design choices for the XDR buffer factories on all platforms of interest.

The XDR buffer factory cannot possibly cover every conceivable permutation of modes that the platform can provide. It also shouldn't. Validating an abstraction over even 2-3 different XDR mode combinations would be very hard, and the abstraction would not be particularly useful for writing robust code: inevitably some combinations of platforms, modes and gearings will be unsupported, and portable code would have to avoid them.

Instead, I choose a single specific mode that will be the only one supported when using platform.request(..., xdr=n). This mode is selected so that it is supported on the largest number of platforms (hopefully, every single one) and presents the fewest timing difficulties.

Specifically, XDR factories should instantiate a buffer that:

  • for output buffers, captures o0, o1, ... at the rising clock edge;
  • for input buffers, outputs i0, i1, ... at the rising clock edge, with one cycle of latency.

This way, no additional timing constraints are added: o* need to be valid for one cycle before the edge, and i* are valid for one cycle after the edge.
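
The edge behaviour described above can be illustrated with a small behavioural model in plain Python. This is only a sketch, not nMigen code: the function names are hypothetical, and two details are assumptions rather than statements from this issue, namely that o0 is driven before o1 within a cycle, and that the input registers hold zeros before the first capture.

```python
def simulate_ddr_output(words):
    """Model a DDR output buffer.

    words: list of (o0, o1) pairs, one pair presented per clock cycle.
    Returns the flat sequence of values seen on the pad, two per cycle.
    """
    pad = []
    for o0, o1 in words:
        # Both halves are captured together at the rising edge...
        captured = (o0, o1)
        # ...and then driven in order: o0 first, o1 second (assumed ordering).
        pad.extend(captured)
    return pad


def simulate_ddr_input(pad, reset=(0, 0)):
    """Model a DDR input buffer with one cycle of latency.

    pad: flat sequence of pad samples, two per clock cycle.
    Returns the (i0, i1) pairs visible to the fabric after each rising edge.
    """
    pairs = [(pad[n], pad[n + 1]) for n in range(0, len(pad) - 1, 2)]
    visible = []
    held = reset  # assumed reset value of the capture registers
    for pair in pairs:
        visible.append(held)  # the fabric sees the previous cycle's samples
        held = pair
    visible.append(held)  # the last pair appears one cycle after it was sampled
    return visible
```

Feeding the output model's pad sequence into the input model returns the original pairs shifted by one cycle, which is the single cycle of latency mentioned above.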

This can be implemented as follows for each FPGA family we support, considering only DDR (XDR=2) primitives:

  • iCE40: re-register D_OUT_1, and re-register D_IN_0 as well as D_IN_1 in fabric.
  • ECP5: the only available mode.
  • MachXO2: the only available mode.
  • Series 6: use DDR_ALIGNMENT=C0 and re-register Q0 in fabric.
  • Series 7: use DDR_CLK_EDGE=SAME_EDGE_PIPELINED for IDDR, and use DDR_CLK_EDGE=SAME_EDGE for ODDR.
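
For reference, the list above can be written down as a lookup table. The sketch below is plain Python and only transcribes the list; the dictionary layout, the key names (`attrs`, `reregister_in_fabric`) and the helper function are hypothetical and do not correspond to any nMigen API. The `IDDR.`/`ODDR.` prefixes are just a notation for "attribute on that primitive".

```python
# Per-family DDR (XDR=2) recipes, transcribed from the list above.
# "attrs" holds primitive attributes to set; "reregister_in_fabric" lists
# the primitive ports that need an extra register in fabric.
DDR_RECIPES = {
    "iCE40": {
        "attrs": {},
        "reregister_in_fabric": ("D_OUT_1", "D_IN_0", "D_IN_1"),
    },
    "ECP5": {"attrs": {}, "reregister_in_fabric": ()},
    "MachXO2": {"attrs": {}, "reregister_in_fabric": ()},
    "Series 6": {
        "attrs": {"DDR_ALIGNMENT": "C0"},
        "reregister_in_fabric": ("Q0",),
    },
    "Series 7": {
        "attrs": {
            "IDDR.DDR_CLK_EDGE": "SAME_EDGE_PIPELINED",
            "ODDR.DDR_CLK_EDGE": "SAME_EDGE",
        },
        "reregister_in_fabric": (),
    },
}


def families_needing_fabric_registers(recipes=DDR_RECIPES):
    """Return the families whose recipe includes extra fabric registers."""
    return sorted(
        name for name, recipe in recipes.items()
        if recipe["reregister_in_fabric"]
    )
```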

See also this IRC discussion.

@whitequark
Contributor Author

I've looked at several FPGA families (S7, ECP5, XO2) to see how they implement XDR for X>2. Two motifs are common: an additional clock is used, and there is a clock domain crossing (CDC) from the low-speed clock to the high-speed clock, with an associated latency. This is probably the only realistic way to implement it, so I'd expect all other families to work the same way.

So, a minimal change to the buffer factories to enable this would be to add i/o_clk_fast to pin_layout and, since the CDC latency is both inevitable and variable, to let the platform communicate the latency back to user code.

Most realistic applications will also need fixed or configurable delay elements, which will also need to be exposed via pin_layout. This is not entirely trivial because these delay elements often have platform-specific limitations, like "input and output delay may not be used on the same pin", so those limitations will need to be addressed somehow, rather than by always instantiating every possible delay element.
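
As a rough illustration of the proposed layout change, here is a plain-Python sketch of a layout function that adds the fast clock only for X>2. It is not the real pin_layout: the function names are hypothetical, the field list is represented as simple (name, width) tuples, and the field names `i_clk_fast`/`o_clk_fast` are taken from the shorthand in this comment rather than from any released API. Latency reporting is not modelled.

```python
def _data_fields(prefix, width, xdr):
    """Fields for one direction ("i" or "o") at the given gearing."""
    fields = []
    if xdr > 0:
        fields.append((f"{prefix}_clk", 1))
    if xdr > 2:
        # Proposed addition: a second, high-speed clock for X>2.
        fields.append((f"{prefix}_clk_fast", 1))
    if xdr in (0, 1):
        fields.append((prefix, width))
    else:
        fields.extend((f"{prefix}{n}", width) for n in range(xdr))
    return fields


def sketch_pin_layout(width, dir, xdr=0):
    """Hypothetical stand-in for pin_layout, returning (name, width) tuples."""
    if dir not in ("i", "o", "oe", "io"):
        raise ValueError(f"unsupported direction: {dir!r}")
    if xdr < 0:
        raise ValueError("xdr must be non-negative")
    fields = []
    if dir in ("i", "io"):
        fields += _data_fields("i", width, xdr)
    if dir in ("o", "oe", "io"):
        fields += _data_fields("o", width, xdr)
    if dir in ("oe", "io"):
        fields.append(("oe", 1))
    return fields
```

With this sketch, `sketch_pin_layout(8, "o", xdr=2)` has no fast clock, while `sketch_pin_layout(1, "i", xdr=4)` gains `i_clk_fast` next to `i_clk`.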
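
One possible way to handle such limitations is to check the requested delay elements per pin before instantiating anything. The sketch below is purely illustrative: the function name, the request format and the `input_and_output_exclusive` flag are all hypothetical, and the only rule it encodes is the example limitation quoted above.

```python
def find_delay_conflicts(requests, input_and_output_exclusive=True):
    """Return the pins whose delay requests violate the platform rule.

    requests: dict mapping a pin name to the set of delay kinds requested
              for it, using "i" for input delay and "o" for output delay.
    input_and_output_exclusive: hypothetical platform flag meaning "input and
              output delay may not be used on the same pin".
    """
    conflicts = []
    for pin, kinds in sorted(requests.items()):
        if input_and_output_exclusive and {"i", "o"} <= set(kinds):
            conflicts.append(pin)
    return conflicts
```

A platform without this limitation would pass `input_and_output_exclusive=False` and get an empty list back.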

@whitequark
Contributor Author

Huge thanks to @jfng for doing the work on Xilinx series 6 and 7.

@whitequark
Contributor Author

whitequark commented Jun 25, 2019

I've implemented the ECP5 buffers. MachXO2 buffers are basically the same, so I think this is all done! Of course, we can look at more platforms in the future, but I think three families demonstrate the viability of the concept quite well.
