Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please consider an optional reset on MultiReg #37

Closed
Wren6991 opened this issue Feb 27, 2019 · 9 comments
Closed

Please consider an optional reset on MultiReg #37

Wren6991 opened this issue Feb 27, 2019 · 9 comments
Milestone

Comments

@Wren6991
Copy link
Contributor

I was browsing the code to learn the standard library, and bumped into this in cdc.py:

class MultiReg:
    def __init__(self, i, o, odomain="sync", n=2, reset=0):

     ...

        self._regs = [Signal(self.i.shape(), name="cdc{}".format(i),
                             reset=reset, reset_less=True, attrs={"no_retiming": True})
                      for i in range(n)]

I understand there are caveats with resets on CDC capture flops, but sometimes you need it. For example, a pulse-to-pulse synchroniser (or toggle synchroniser) contains a toggling NRZ signal crossing the domains, and the capture flops need a capture-domain reset to avoid spurious toggling observed by the logic when deasserting reset.

platform.get_multi_reg(self) may contain considerable magic for some synthesis flows, and should only need writing once; ideally, something like a pulse-to-pulse synchroniser should be implementable using the existing MultiReg.

A ligh-touch option could be to just add a reset_less=True to MultiReg.__init__, and pass this down.

Happy to be argued with, told to go implement it, or just told I'm wrong :)

@Wren6991
Copy link
Contributor Author

On a slightly-related note, are there any plans to flesh out the CDC library with e.g. pulse-to-pulse, Gray code, or ready-ack + multibit? Batteries included!

@whitequark
Copy link
Contributor

I understand there are caveats with resets on CDC capture flops, but sometimes you need it. For example, a pulse-to-pulse synchroniser (or toggle synchroniser) contains a toggling NRZ signal crossing the domains, and the capture flops need a capture-domain reset to avoid spurious toggling observed by the logic when deasserting reset.

Weak yes. I will need to dig in and understand the context here much better. You can help this by providnig compelling real-world examples and extensive documentation. I.e. teach me the use cases.

On a slightly-related note, are there any plans to flesh out the CDC library with e.g. pulse-to-pulse, Gray code, or ready-ack + multibit? Batteries included!

Yes, all this was in Migen and it will get ported. Gray counters are already in.

@Wren6991
Copy link
Contributor Author

Wren6991 commented Feb 27, 2019

Weak yes. I will need to dig in and understand the context here much better. You can help this by providnig compelling real-world examples and extensive documentation. I.e. teach me the use cases.

Okay I will try! I don't guarantee any of the below code works, but it's conceptually sound

Pulse to pulse is used to signal the occurrence of some event from one domain to another. The idea is that each 1-clock-wide pulse on the input results in a 1-clock wide pulse on the output, even when the clocks are different speeds. For slow->fast you can do 100% duty cycle at the input, otherwise you require some level of sparseness at the input to avoid dropping pulses.

One use for this is a credit-based DREQ on e.g. a UART with a DMA interface. If you signal a single pulse for each transfer you require, you can guarantee that a DMA, in the fast system clock domain, will never overflow your FIFO, in the slow peripheral clock domain. OTOH if you use the normal level-sensitive DREQ scheme, you need to deassert the DREQ before completely filling your FIFO, to account for the transfers which may be in-flight in the bus CDC between the DMA and the UART. This wastes FIFO space.

(You could alternatively use an async FIFO, but on most SoCs it's normal to run the entire peripheral bus at a much lower frequency than your main interconnect, to save power etc)

In Verilog it looks something like this:

module sync_pulse (
	input wire i_clk,
	input wire i_rst_n,
	input wire i_pulse,

	input wire o_clk,
	input wire o_rst_n,
	output wire o_pulse
);

// Generate toggle on pulse in launching domain

reg i_toggle;

always @ (posedge i_clk or negedge i_rst_n)
	if (!i_rst_n)
		i_toggle <= 1'b0;
	else if (i_pulse)
		i_toggle <= !i_toggle;

// Cross the toggle signal

wire o_toggle;

// equivalent to MultiReg
sync_1bit (
	.clk   (o_clk),
	.rst_n (o_rst_n),
	.i     (i_toggle),
	.o     (o_toggle)
);

// Generate pulse on toggle in capturing domain

reg o_toggle_prev;

always @ (posedge o_clk or negedge o_rst_n)
	if (!o_rst_n)
		o_toggle_prev <= 1'b0;
	else
		o_toggle_prev <= o_toggle;

assign o_pulse = o_toggle != o_toggle_prev;

endmodule

Without a reset on the MultiReg, you may randomly get a pulse on o_pulse when o_rst_n is removed.

If you need full-duty-cycle at the input for fast-slow, and you need backpressure, then one tool is a credit FIFO, which is an async FIFO with no data path. It's used kind of like a semaphore which you up() in one clock domain and down() in the other. This might be a handy special-case of the existing async fifo!

Another trick which requires reset on a MultiReg is a req-ack handshake. This allows a multi-bit signal to cross domains safely, but with some delay and infrequent updates. For example, suppose you have an RTC which is clocked off some fixed slow clock, and you want to sample a raw counter value from a processor whose clock varies.

module sync_sample #(
	parameter W_DATA = 8
) (
	input wire i_clk,
	input wire i_rst_n,
	input wire [W_DATA-1:0] i_samp,

	input wire o_clk,
	input wire o_rst_n,
	output reg [W_DATA-1:0] o_samp
);

reg [W_DATA-1:0] samp_cross;

reg i_req;  // i -> o
wire i_ack; // i <- o

reg o_ack;  // o -> i
wire o_req; // o <- i

// Launching domain

sync_1bit (
	.clk   (i_clk),
	.rst_n (i_rst_n),
	.i     (o_ack),
	.o     (i_ack)
);

always @ (posedge i_clk or negedge i_rst_n) begin
	if (!i_rst_n) begin
		samp_cross <= {W_DATA{1'b0}};
		i_req <= 1'b0;
	end else begin
		if (i_req && i_ack) begin
			i_req <= 1'b0;
		end else if (!(i_req || i_ack)) begin
			i_req <= 1'b1;
			samp_cross <= i_samp;
		end
	end
end

// Capturing domain

sync_1bit (
	.clk   (o_clk),
	.rst_n (o_rst_n),
	.i     (i_req),
	.o     (o_req)
);

always @ (posedge o_clk or negedge o_rst_n) begin
	if (!o_rst_n) begin
		o_samp <= {W_DATA{1'b0}};
		o_ack <= 1'b0;
	end else begin
		o_ack <= o_req;
		if (o_req && !o_ack)
			o_samp <= samp_cross;
	end
end

endmodule

The idea with this one is that you avoid metastability and bit skew by guaranteeing that samp_cross doesn't transition for two o_clks before it is sampled in the o_clk domain. You don't have multiple flops on the output of samp_cross, and in fact these would be harmful. Metastabilities still occur inside of the sync_1bit modules, but that's safe due to the way the handshake progresses (it's effectively a 2-bit gray counter with one bit in each clock domain).

An async FIFO is unsuitable here: you are interested in sampling a single value of the signal now-ish, not reading in some buffered sequence.

Note that this is robust wrt the ordering of reset removal:

  • If i_rst_n is removed last, there is no req, so o does not ack
  • If o_rst_n is removed last, reqs are not acked

However, if there is no reset on the MultiReg, there may be a period after reset removal where the synchronisers contain garbage, and the req and ack signals will toggle somewhat randomly. This may cause samp_cross to transition too close to being captured by o_samp, which is fatal. It's cured by giving the block a few clocks whilst it's in reset, but still a dangerous footgun.

Sorry for the verbosity :( this (PDF) is a fairly concise document that gives a bit of a who's who of clock crossing tricks, and has diagrams.

@Wren6991
Copy link
Contributor Author

Yes, all this was in Migen and it will get ported. Gray counters are already in.

Okay, noted, I will go check out the Migen standard library and COMPAT_SUMMARY.md before bothering you further :)

Would working on the libraries be helpful, or are people mostly focusing on the core language at the moment? I would be happy to put time into implementing missing features from Migen, or verifying/reviewing the parts that currently exist in nMigen but are unverified. I saw that nMigen has some first-class FV support, which is one reason for my interest!

@Wren6991
Copy link
Contributor Author

Wren6991 commented Mar 3, 2019

Reading back I think I misunderstood what you asked for, and that you already knew 99% of what I said. That's pretty bad, sorry.

However, even if e.g. the Migen PulseSynchroniser was FPGA-proven, there are ways for this kind of issue to hide:

  • The MultiReg flops may have been initialised to default 0 during FPGA config
  • The designs may have had a simple power-on-only reset structure, without a processor resetting and unresetting blocks at its whim
  • There may have always been sufficient clocks during reset for the MultiReg to reach its quiescent state

None of these are universally true. Also note page 17 on the Cummings paper referenced in nmigen/fifo.py: he explicitly resets the pointer sync registers, whereas the AsyncFIFO class uses a non-resettable MultiReg.

@whitequark
Copy link
Contributor

Would working on the libraries be helpful, or are people mostly focusing on the core language at the moment? I would be happy to put time into implementing missing features from Migen, or verifying/reviewing the parts that currently exist in nMigen but are unverified. I saw that nMigen has some first-class FV support, which is one reason for my interest!

Working on the libraries is definitely helpful! There's a lot of stuff that needs to be ported from Migen, documentation improved, etc. I will likely have opinions on how it should be implemented, but help is very welcome here!

Reading back I think I misunderstood what you asked for, and that you already knew 99% of what I said. That's pretty bad, sorry.

That's OK. I will now try to summarize what you said. "MultiReg is used for safely sampling signals across clock domains. When the sampled value is used within a sequence of events triggered by something else (which does not happen shortly after a reset), this is OK. But when the sampled value is used itself to trigger other events, this can result in misfires." Is this correct?

@Wren6991
Copy link
Contributor Author

Wren6991 commented Mar 3, 2019

Working on the libraries is definitely helpful! There's a lot of stuff that needs to be ported from Migen, documentation improved, etc. I will likely have opinions on how it should be implemented, but help is very welcome here!

Okay cool! I will start digging into Migen a little bit more. There are plenty of other things which are ubiquitous but kind of tricky, like clock control.

Opinionated design is good!

That's OK. I will now try to summarize what you said. "MultiReg is used for safely sampling signals across clock domains. When the sampled value is used within a sequence of events triggered by something else (which does not happen shortly after a reset), this is OK. But when the sampled value is used itself to trigger other events, this can result in misfires." Is this correct?

Yes you summed that up much better than I could. MultiReg is dangerous if the output value is observed immediately after reset.

@whitequark
Copy link
Contributor

Thanks. Could you send a PR for reset on MultiReg, together with an explanation of the cases where it would be useful, perhaps excerpted from the above? I would appreciate it.

In general, explaining how to use nMigen primitives soundly is very important, and all documentation work on that front is greatly appreciated. You are also good at following existing style, so I feel comfortable knowing that the patches you submit would be quick to review, which I'm very grateful for, given the amount of workload and backlog I have.

@Wren6991
Copy link
Contributor Author

Sorry, I spaced out for an entire week there. Brain has now rebooted. Will add a PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants