Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vivado CDC constraints #227

Merged
merged 19 commits into from
Sep 24, 2019
Merged

Vivado CDC constraints #227

merged 19 commits into from
Sep 24, 2019

Conversation

dlharmon
Copy link
Contributor

Related: #212

This tags the first register in each MultiReg or ResetSynchronizer
with the attribute nmigen_async_ff and then applies a false path and
max delay constraint to all registers tagged with that attribute in
the .xdc file.

The max delay defaults to 5 ns and has an override, max_delay where
it can be changed for the > whole project. It's possible to make this
an argument to MultiReg instead, but is more complex. > git commit
-m "add clock domain crossing constraints on Vivado This tags the
first register in each MultiReg or ResetSynchronizer with the
attribute nmigen_async_ff and then applies a false path and max
delay constraint to all registers tagged with that attribute in the
.xdc file.

The max delay defaults to 5 ns and has an override, max_delay where
it can be changed for the whole project. It's possible to make this an
optional argument to MultiReg instead, but is more complex. It would
probably work to set nmigen_async_ff to the desired delay rather
than just TRUE. I'm not sure how hard it would be to extract that in
the TCL or if it would be easier to keep a dict of all used delay
values and put a line for each into the .xdc file.

Vivado was inferring an SRL16 from a MultiReg in some cases
This tags the first register in each `MultiReg` or `ResetSynchronizer`
with the attribute `nmigen_async_ff` and then applies a false path and
max delay constraint to all registers tagged with that attribute in
the `.xdc` file.

The max delay defaults to 5 ns and has an override, `max_delay` where
it can be changed for the > whole project. It's possible to make this
an argument to `MultiReg` instead, but is more complex.  > git commit
-m "add clock domain crossing constraints on Vivado This tags the
first register in each `MultiReg` or `ResetSynchronizer` with the
attribute `nmigen_async_ff` and then applies a false path and max
delay constraint to all registers tagged with that attribute in the
`.xdc` file.

The max delay defaults to 5 ns and has an override, `max_delay` where
it can be changed for the whole project. It's possible to make this an
optional argument to `MultiReg` instead, but is more complex. It would
probably work to set `nmigen_async_ff` to the desired delay rather
than just `TRUE`. I'm not sure how hard it would be to extract that in
the TCL or if it would be easier to keep a dict of all used delay
values and put a line for each into the `.xdc` file.
@codecov
Copy link

codecov bot commented Sep 22, 2019

Codecov Report

Merging #227 into master will decrease coverage by 0.05%.
The diff coverage is 50%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #227      +/-   ##
==========================================
- Coverage   82.38%   82.32%   -0.06%     
==========================================
  Files          33       33              
  Lines        5483     5489       +6     
  Branches     1172     1174       +2     
==========================================
+ Hits         4517     4519       +2     
- Misses        834      836       +2     
- Partials      132      134       +2
Impacted Files Coverage Δ
nmigen/lib/cdc.py 84% <50%> (-6.91%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update da53048...b78ef18. Read the comment docs.

@dlharmon
Copy link
Contributor Author

References:
https://forums.xilinx.com/t5/Timing-Analysis/how-to-constrain-a-CDC/td-p/748174

I expect a few more constraint requirements once #40 is merged due to a timing path between the write clock and async outputs on the LUT RAMs. It's a challenge to deal with: https://forums.xilinx.com/t5/Vivado-TCL-Community/set-disable-timing-for-hierarchical-cell/td-p/495808 Alternately, set the clocks to be asynchronous, but getting correct clock names is a challenge.

@dlharmon dlharmon changed the title Xilinx cdc constraints Vivado CDC constraints Sep 22, 2019
@dlharmon
Copy link
Contributor Author

Xilinx UG903 is self contradictory and the set_false_path constraint appears to override the set_max_delay constraint. I'm looking into how to solve this.

the false path had precedence over the max_delay despite the docs

the large negative min delay effectively disables hold path timing
which was the intended goal of the false path.
@dlharmon
Copy link
Contributor Author

Ideally, we would use set_max_delay -datapath_only, but that option requires providing -from with the source clock greatly complicating things.

I also noticed that there is a "critical warning" generated when the added lines are present in the .xdc file and there are no flip flops with the nmigen_async_ff attribute.

Direction would be appreciated.

@whitequark
Copy link
Contributor

This tags the first register in each MultiReg or ResetSynchronizer
with the attribute nmigen_async_ff

This isn't directly related to the PR at hand (I'll look more into how Vivado works before commenting here), but I have a question you might answer regarding attributes. nMigen emits some internal attributes like nmigen.hierarchy, nmigen.decoding, src (which should really be nmigen.src), etc. Vivado complains about these. Is here a way to tell it "just ignore any attribute that starts with nmigen", other than the nmigen_async_ff and other semantically important ones?

@whitequark
Copy link
Contributor

Ideally, we would use set_max_delay -datapath_only, but that option requires providing -from with the source clock greatly complicating things.

Can you elaborate on the exact nature of complication here?

@whitequark
Copy link
Contributor

Vivado complains about these

Actually, I was wrong. Vivado doesn't mind. ISE complains:

WARNING:Xst:37 - Detected unknown constraint/property "src". This constraint/property is not supported by the current software release and will be ignored.
WARNING:Xst:37 - Detected unknown constraint/property "src". This constraint/property is not supported by the current software release and will be ignored.
WARNING:Xst:37 - Detected unknown constraint/property "nmigen.hierarchy". This constraint/property is not supported by the current software release and will be ignored.

@dlharmon
Copy link
Contributor Author

Ideally, we would use set_max_delay -datapath_only, but that option requires providing -from with the source clock greatly complicating things.

Can you elaborate on the exact nature of complication here?

We would either need an i_domain argument to MultiReg and to know the final Verilog name of the clock or some way to find the clock source of the i argument (may not exist if it came from a pin which is OK or could be combinatorial logic from multiple domains). I assumed the latter to be difficult and possibly would add much more complexity than required. If that's not the case, it may provide a bit more flexibility, especially for things like a LUTRAM based FIFO. It might be worth an optional i_domain argument for the LUTRAM FIFO case.

I did find that instead of the set_min_delay -1000 -to <REG>, a better way to do it is to add set_false_path -hold -to <REG>. I'll push that change to this branch after testing it.

I think I'll need to add a dict keyed with (source_clk, max_delay) to platform for keeping track of which constraints need to be added to the .xdc file. Of course, if we don't do per CDC max_delay or set_max_delay -datapath_only -from <X> -to <Y> it could simply be a bool to enable adding to the constraints file.

@dlharmon
Copy link
Contributor Author

nMigen emits some internal attributes like nmigen.hierarchy, nmigen.decoding, src (which should really be nmigen.src), etc. Vivado complains about these. Is here a way to tell it "just ignore any attribute that starts with nmigen", other than the nmigen_async_ff and other semantically important ones?

As you found, Vivado does seem to ignore any attribute it doesn't use. It seems safe enough to assume it won't use anything starting with nmigen. Perhaps, I should swap my nmigen_async_ff to nmigen.async_ff for consistency.

I do see Vivado complaining about the init attribute on Verilog wires.

@whitequark
Copy link
Contributor

We would either need an i_domain argument to MultiReg and to know the final Verilog name of the clock or some way to find the clock source of the i argument (may not exist if it came from a pin which is OK or could be combinatorial logic from multiple domains). I assumed the latter to be difficult and possibly would add much more complexity than required.

We'll have this information once #4 is implemented, but right now this isn't viable.

If that's not the case, it may provide a bit more flexibility, especially for things like a LUTRAM based FIFO. It might be worth an optional i_domain argument for the LUTRAM FIFO case.

I don't understand this statement. We already have r_domain and w_domain in AsyncFIFO. Or do you mean some other kind of FIFO?

I think I'll need to add a dict keyed with (source_clk, max_delay) to platform for keeping track of which constraints need to be added to the .xdc file. Of course, if we don't do per CDC max_delay or set_max_delay -datapath_only -from <X> -to <Y> it could simply be a bool to enable adding to the constraints file.

I would personally opt for a simpler solution: grab the attribute with Tcl and set it, again, with Tcl. I'll throw together a snippet soon so you don't have to do it yourself.

@whitequark
Copy link
Contributor

Perhaps, I should swap my nmigen_async_ff to nmigen.async_ff for consistency.

Sounds good.

I do see Vivado complaining about the init attribute on Verilog wires.

This is #220. Unfortunately an upstream issue.

@dlharmon
Copy link
Contributor Author

dlharmon commented Sep 23, 2019

If that's not the case, it may provide a bit more flexibility, especially for things like a LUTRAM based FIFO. It might be worth an optional i_domain argument for the LUTRAM FIFO case.

I don't understand this statement. We already have r_domain and w_domain in AsyncFIFO. Or do you mean some other kind of FIFO?

I was referring to MultiReg/FFSynchronizer which is used within AsyncFIFO and doesn't currently get passed the input clock. It's simple enough to add an argument i_domain = None to FFSynchronizer if it's not None add the -from and -datapath_only arguments to set_max_delay, omit set_false_path.

The LUTRAM on 7series has a timing path from the write clock to the async read data. I want to make that a false path / max delay while keeping the path from read address to read data. Having the from clock makes that possible.

@dlharmon
Copy link
Contributor Author

I would personally opt for a simpler solution: grab the attribute with Tcl and set it, again, with Tcl. I'll throw together a snippet soon so you don't have to do it yourself.

My knowledge of Tcl is quite limited, so I'll leave that to you.

MultiReg renamed to FFSynchronizer
This is effectively equivalent without the ugly constant
There is now no hold path in the timing report, but it retains
the max_delay.
@whitequark
Copy link
Contributor

@dlharmon I think I understand how to do this in Tcl, but could you please explain what set_max_delay actually does here? I've been looking at various documentation for quite some time and I still do not understand why it is useful.

@jordens
Copy link
Member

jordens commented Sep 23, 2019

This one is good and what I based the migen async reset synchronizer constraints on.

There are more good posts by that guy in the xilinx forums: http://www.colmryan.org/avrums-clock-domain-crossing-widsom.html

@dlharmon
Copy link
Contributor Author

@dlharmon I think I understand how to do this in Tcl, but could you please explain what set_max_delay actually does here? I've been looking at various documentation for quite some time and I still do not understand why it is useful.

The first link mentioned by @jordens does a decent job explaining it.

Simply adding a false path to the flip flop is the easiest, but allows the place and route unlimited delay in that net which is problematic in some cases. For instance, I have a bunch of 250 MHz same frequency, unknown phase shift crossings that need to come out of reset within a few cycles of each other, so a 5 ns max delay works well there.

@dlharmon
Copy link
Contributor Author

Screenshot from 2019-09-23 11-26-25

@whitequark
Copy link
Contributor

For instance, I have a bunch of 250 MHz same frequency, unknown phase shift crossings that need to come out of reset within a few cycles of each other, so a 5 ns max delay works well there.

Do I understand it correctly that set_max_delay reduces (such that it meets the constraint) the uncertainty in phase between different synchronizers?

@dlharmon
Copy link
Contributor Author

It looks like you can't use arbitrary Tcl commands in XDC files (AR#54842), so I moved it to the main script. Please let me know if this works properly.

I merged this with my local copy and built a bitstream. The timing report and constraints look good. I like how this works. Thanks.

@dlharmon
Copy link
Contributor Author

dlharmon commented Sep 23, 2019

This seems to work, but does generate a warning for each input with no clock.

foreach cell [get_cells -hier -filter {nmigen.vivado.max_delay != ""}] {
    set clock [get_clocks -quiet -of_objects [all_fanin -flat -startpoints_only [get_pin $cell/D]]]
    if {[llength $clock] != 0} { 
        set_max_delay -datapath_only -from $clock \
       -to [get_cells $cell] [get_property nmigen.vivado.max_delay $cell]
    }
}

Edit: add -quiet to get_clocks. It's silences the warning. It might also be worth considering for get_cells

I suppose we could add an else with a simple max delay if there is no source clock.

@whitequark
Copy link
Contributor

This seems to work, but does generate a warning for each input with no clock.

Yep, I was about to suggest something like that. I think you want get_clocks -include_generated_clocks, too.

But isn't this behavior correct? If you set max_delay, then you're expecting the input to be driven by something with a clock, and if it isn't, then a warning is appropriate.

@dlharmon
Copy link
Contributor Author

But isn't this behavior correct? If you set max_delay, then you're expecting the input to be driven by something with a clock, and if it isn't, then a warning is appropriate.

Not necessarily. What about the case of something like a reset pushbutton on an input? I'm not sure that should trigger a warning.

@whitequark
Copy link
Contributor

What about the case of something like a reset pushbutton on an input? I'm not sure that should trigger a warning.

What would be the purpose of setting max_input_delay on an FFSynchronizer connected to a pushbutton?

@dlharmon
Copy link
Contributor Author

What would be the purpose of setting max_input_delay on an FFSynchronizer connected to a pushbutton?

Nothing whatsoever. I'd forgotten about the plan to make the default max delay None, so a max delay would not be applied in this case. There may be some cases (async SRAM like interface) where signals are not a clock, need to be synchronized and need timing specified. That's rare and what we have implemented covers the vast majority of use cases.

@dlharmon
Copy link
Contributor Author

I like the idea of removing the -quiet, not adding an else.

@whitequark
Copy link
Contributor

whitequark commented Sep 23, 2019

There may be some cases (async SRAM like interface) where signals are not a clock, need to be synchronized and need timing specified. That's rare and what we have implemented covers the vast majority of use cases.

Are you covering this use case by adding -quiet and using if {[llength $clock] != 0}? Or is that code serving some other purpose?

Also, I understand that set_max_delay overrides set_false_path. However, should a false path not be added if there is no max delay, i.e. in the default case? (Did you mis-merge the PR?)

@whitequark
Copy link
Contributor

Ah, I missed your second comment somehow. Yes. Removing -quiet and the else branch seems good to me.

@dlharmon
Copy link
Contributor Author

Are you covering this use case by adding -quiet and using if {[llength $clock] != 0}? Or is that code serving some other purpose?

Also, I understand that set_max_delay overrides set_false_path. However, should a false path not be added if there is no max delay, i.e. in the default case? (Did you mis-rebase the PR?)

Good idea on doing set_false_path when there is no max delay set. I completely missed that possibility.

I need to rework the PR and could have mis-rebased too.

@dlharmon
Copy link
Contributor Author

I pushed what I think we agreed on.

It does add -quiet to the initial find cells lines so that if there are no nmigen.vivado.max_delay or nmigen.vivado.false_path it won't complain. It will warn and I think error if there is a nmigen.vivado.max_delay on a signal with no source clock.

I added the max_input_delay parameter to lib/cdc.py. I'm not sure of your preferred way of handling the asserts, but an informative message like "max_input_delay is not supported on this platorm" would be a good thing.

@whitequark
Copy link
Contributor

Thanks for the update. It looks like I did not post my work-in-progress patch for max_input_delay that has nicer error messages and some additional comments. Could you integrate this please?

diff --git a/nmigen/lib/cdc.py b/nmigen/lib/cdc.py
index d21df79..3c62b8f 100644
--- a/nmigen/lib/cdc.py
+++ b/nmigen/lib/cdc.py
@@ -29,15 +29,15 @@ class FFSynchronizer(Elaboratable):
         Signal connected to synchroniser output.
     o_domain : str
         Name of output clock domain.
-    stages : int
-        Number of synchronization stages between input and output. The lowest safe number is 2,
-        with higher numbers reducing MTBF further, at the cost of increased latency.
     reset : int
         Reset value of the flip-flops. On FPGAs, even if ``reset_less`` is True,
         the :class:`FFSynchronizer` is still set to this value during initialization.
     reset_less : bool
-        If True (the default), this :class:`FFSynchronizer` is unaffected by ``o_domain`` reset.
-        See "Note on Reset" below.
+        If ``True`` (the default), this :class:`FFSynchronizer` is unaffected by ``o_domain``
+        reset. See "Note on Reset" below.
+    stages : int
+        Number of synchronization stages between input and output. The lowest safe number is 2,
+        with higher numbers reducing MTBF further, at the cost of increased latency.
 
     Platform override
     -----------------
@@ -61,7 +61,8 @@ class FFSynchronizer(Elaboratable):
 
     :class:`FFSynchronizer` is reset by the ``o_domain`` reset only.
     """
-    def __init__(self, i, o, *, o_domain="sync", reset=0, reset_less=True, stages=2):
+    def __init__(self, i, o, *, o_domain="sync", reset=0, reset_less=True, stages=2,
+                 max_input_delay=None):
         _check_stages(stages)
 
         self.i = i
@@ -72,10 +73,16 @@ class FFSynchronizer(Elaboratable):
         self._o_domain   = o_domain
         self._stages     = stages
 
+        self._max_input_delay = max_input_delay
+
     def elaborate(self, platform):
         if hasattr(platform, "get_ff_sync"):
             return platform.get_ff_sync(self)
 
+        if self._max_input_delay is not None:
+            raise NotImplementedError("Platform {!r} does not support constraining input delay "
+                                      "for FFSynchronizer".format(platform))
+
         m = Module()
         flops = [Signal(self.i.shape(), name="stage{}".format(index),
                         reset=self._reset, reset_less=self._reset_less)
@@ -117,7 +124,7 @@ class ResetSynchronizer(Elaboratable):
     Define the ``get_reset_sync`` platform method to override the implementation of
     :class:`ResetSynchronizer`, e.g. to instantiate library cells directly.
     """
-    def __init__(self, arst, *, domain="sync", stages=2):
+    def __init__(self, arst, *, domain="sync", stages=2, max_input_delay=None):
         _check_stages(stages)
 
         self.arst = arst
@@ -125,10 +132,16 @@ class ResetSynchronizer(Elaboratable):
         self._domain = domain
         self._stages = stages
 
+        self._max_input_delay = None
+
     def elaborate(self, platform):
         if hasattr(platform, "get_reset_sync"):
             return platform.get_reset_sync(self)
 
+        if self._max_input_delay is not None:
+            raise NotImplementedError("Platform {!r} does not support constraining input delay "
+                                      "for ResetSynchronizer".format(platform))
+
         m = Module()
         m.domains += ClockDomain("reset_sync", async_reset=True, local=True)
         flops = [Signal(1, name="stage{}".format(index), reset=1)
diff --git a/nmigen/vendor/lattice_ecp5.py b/nmigen/vendor/lattice_ecp5.py
index f18e87c..89143ab 100644
--- a/nmigen/vendor/lattice_ecp5.py
+++ b/nmigen/vendor/lattice_ecp5.py
@@ -533,3 +533,6 @@ class LatticeECP5Platform(TemplatedPlatform):
                 io_B=p_port[bit],
             )
         return m
+
+    # CDC primitives are not currently specialized for ECP5. While Diamond supports the necessary
+    # attributes (TBD); nextpnr-ecp5 does not.
diff --git a/nmigen/vendor/lattice_ice40.py b/nmigen/vendor/lattice_ice40.py
index bc1df8f..ffbbd0c 100644
--- a/nmigen/vendor/lattice_ice40.py
+++ b/nmigen/vendor/lattice_ice40.py
@@ -564,3 +564,6 @@ class LatticeICE40Platform(TemplatedPlatform):
 
     # Tristate and bidirectional buffers are not supported on iCE40 because it requires external
     # termination, which is incompatible for input and output differential I/Os.
+
+    # CDC primitives are not currently specialized for iCE40. It is not known if iCECube2 supports
+    # the necessary attributes; nextpnr-ice40 does not.
diff --git a/nmigen/vendor/xilinx_7series.py b/nmigen/vendor/xilinx_7series.py
index b25c4e9..f0a4f19 100644
--- a/nmigen/vendor/xilinx_7series.py
+++ b/nmigen/vendor/xilinx_7series.py
@@ -78,6 +78,10 @@ class Xilinx7SeriesPlatform(TemplatedPlatform):
             {% endfor %}
             {{get_override("script_after_read")|default("# (script_after_read placeholder)")}}
             synth_design -top {{name}} -part {{platform.device}}{{platform.package}}-{{platform.speed}}
+            set_false_path -hold -to [get_cells -hier -filter {nmigen.vivado.false_path == TRUE}]
+            foreach cell [get_cells -hier -filter {nmigen.vivado.max_delay != ""}] {
+                set_max_delay [get_property nmigen.vivado.max_delay $cell] -to $cell
+            }
             {{get_override("script_after_synth")|default("# (script_after_synth placeholder)")}}
             report_timing_summary -file {{name}}_timing_synth.rpt
             report_utilization -hierarchical -file {{name}}_utilization_hierachical_synth.rpt
@@ -367,7 +371,28 @@ class Xilinx7SeriesPlatform(TemplatedPlatform):
                         reset=ff_sync._reset, reset_less=ff_sync._reset_less,
                         attrs={"ASYNC_REG": "TRUE"})
                  for index in range(ff_sync._stages)]
+        flops[0].attrs["nmigen.vivado.false_path"] = "TRUE"
+        if ff_sync.max_input_delay is not None:
+            flops[0].attrs["nmigen.vivado.max_delay"] = str(ff_sync.max_input_delay)
         for i, o in zip((ff_sync.i, *flops), flops):
             m.d[ff_sync._o_domain] += o.eq(i)
         m.d.comb += ff_sync.o.eq(flops[-1])
         return m
+
+    def get_reset_sync(self, reset_sync):
+        m = Module()
+        m.domains += ClockDomain("reset_sync", async_reset=True, local=True)
+        flops = [Signal(1, name="stage{}".format(index), reset=1,
+                        attrs={"ASYNC_REG": "TRUE"})
+                 for index in range(reset_sync._stages)]
+        flops[0].attrs["nmigen.vivado.false_path"] = "TRUE"
+        if reset_sync.max_input_delay is not None:
+            flops[0].attrs["nmigen.vivado.max_delay"] = str(reset_sync.max_input_delay)
+        for i, o in zip((0, *flops), flops):
+            m.d.reset_sync += o.eq(i)
+        m.d.comb += [
+            ClockSignal("reset_sync").eq(ClockSignal(reset_sync._domain)),
+            ResetSignal("reset_sync").eq(reset_sync.arst),
+            ResetSignal(reset_sync._domain).eq(flops[-1])
+        ]
+        return m
diff --git a/nmigen/vendor/xilinx_spartan_3_6.py b/nmigen/vendor/xilinx_spartan_3_6.py
index 284cfab..eb0aac0 100644
--- a/nmigen/vendor/xilinx_spartan_3_6.py
+++ b/nmigen/vendor/xilinx_spartan_3_6.py
@@ -412,6 +412,10 @@ class XilinxSpartan3Or6Platform(TemplatedPlatform):
         return m
 
     def get_ff_sync(self, ff_sync):
+        if ff_sync._max_input_delay is not None:
+            raise NotImplementedError("Platform {!r} does not support constraining input delay "
+                                      "for FFSynchronizer".format(self))
+
         m = Module()
         flops = [Signal(ff_sync.i.shape(), name="stage{}".format(index),
                         reset=ff_sync._reset, reset_less=ff_sync._reset_less,
@@ -423,6 +427,10 @@ class XilinxSpartan3Or6Platform(TemplatedPlatform):
         return m
 
     def get_reset_sync(self, reset_sync):
+        if reset_sync._max_input_delay is not None:
+            raise NotImplementedError("Platform {!r} does not support constraining input delay "
+                                      "for ResetSynchronizer".format(self))
+
         m = Module()
         m.domains += ClockDomain("reset_sync", async_reset=True, local=True)
         flops = [Signal(1, name="stage{}".format(index), reset=1,

error messages and comments
did not merge xilinx_7series.py
@dlharmon
Copy link
Contributor Author

dlharmon commented Sep 24, 2019

Thanks for the update. It looks like I did not post my work-in-progress patch for max_input_delay that has nicer error messages and some additional comments. Could you integrate this please?

First commit reverts lib/cdc.py and vendor/xilinx_spartan_3_6.py and applies the above patch except for vendor/xilinx_7series.py. 2nd commit adds comments for max_input_delay.

I think we want to not apply a false path if there is a max delay set as it overrides the max delay in Vivado.

I have -quiet on the TCL commands to search for our attributes as it will be common for there to be none in simple single clock designs. There is no -quiet on the max delay find clock part.

@whitequark
Copy link
Contributor

@dlharmon Everything looks good to me now. I wrote a comment explaining how this all works. If it looks accurate to you I'll merge.

@dlharmon
Copy link
Contributor Author

@dlharmon Everything looks good to me now. I wrote a comment explaining how this all works. If it looks accurate to you I'll merge.

Looks good. Thanks for all your help on this one.

@whitequark whitequark merged commit f3a8880 into m-labs:master Sep 24, 2019
@whitequark
Copy link
Contributor

Thanks for all your help on this one.

I very much appreciate assistance with CDC primitives, especially on platforms I haven't worked with much.

Speaking of which, I tried to add false path support for ISE. I came up with this patch:

diff --git a/nmigen/vendor/xilinx_spartan_3_6.py b/nmigen/vendor/xilinx_spartan_3_6.py
index 284cfab..ba3d53f 100644
--- a/nmigen/vendor/xilinx_spartan_3_6.py
+++ b/nmigen/vendor/xilinx_spartan_3_6.py
@@ -127,9 +127,12 @@ class XilinxSpartan3Or6Platform(TemplatedPlatform):
                 {% endfor %}
             {% endfor %}
             {% for signal, frequency in platform.iter_clock_constraints() -%}
-                NET "{{signal|hierarchy("/")}}" TNM_NET="PRD{{signal|hierarchy("/")}}";
-                TIMESPEC "TS{{signal|hierarchy("/")}}"=PERIOD "PRD{{signal|hierarchy("/")}}" {{1000000000/frequency}} ns HIGH 50%;
+                NET "{{signal|hierarchy("/")}}" TNM_NET = "PERIOD_{{signal|hierarchy("/")}}";
+                TIMESPEC "TS_clk_{{signal|hierarchy("/")}}" = PERIOD
+                    "PERIOD_{{signal|hierarchy("/")}}" {{1000000000/frequency}} ns HIGH 50%;
             {% endfor %}
+            NET "*/async_ff$stage1$*" TPSYNC = "TPSYNC_async_ff";
+            TIMESPEC "TS_async_ff" = TO "TPSYNC_async_ff" TIG;
             {{get_override("add_constraints")|default("# (add_constraints placeholder)")}}
         """
     }
@@ -413,7 +416,7 @@ class XilinxSpartan3Or6Platform(TemplatedPlatform):
 
     def get_ff_sync(self, ff_sync):
         m = Module()
-        flops = [Signal(ff_sync.i.shape(), name="stage{}".format(index),
+        flops = [Signal(ff_sync.i.shape(), name="async_ff$stage{}$".format(index),
                         reset=ff_sync._reset, reset_less=ff_sync._reset_less,
                         attrs={"ASYNC_REG": "TRUE"})
                  for index in range(ff_sync._stages)]
@@ -425,7 +428,7 @@ class XilinxSpartan3Or6Platform(TemplatedPlatform):
     def get_reset_sync(self, reset_sync):
         m = Module()
         m.domains += ClockDomain("reset_sync", async_reset=True, local=True)
-        flops = [Signal(1, name="stage{}".format(index), reset=1,
+        flops = [Signal(1, name="async_ff$stage{}$".format(index), reset=1,
                         attrs={"ASYNC_REG": "TRUE"})
                  for index in range(reset_sync._stages)]
         for i, o in zip((0, *flops), flops):

Would you be able to confirm that it works? I'm also not entirely sure about using NET "*/async_ff$stage1$*" TPSYNC. It seems like NET "*/async_ff$stage0$*" TPSYNC would work more like what we have for Vivado now, but the documentation is not very clear on exactly which paths the constraint will apply to.

@dlharmon
Copy link
Contributor Author

Would you be able to confirm that it works? I'm also not entirely sure about using NET "*/async_ff$stage1$*" TPSYNC. It seems like NET "*/async_ff$stage0$*" TPSYNC would work more like what we have for Vivado now, but the documentation is not very clear on exactly which paths the constraint will apply to.

I'm far less confident with ISE than Vivado. It's been a while since I've done more than minor changes to old projects.

I made this test case:

from nmigen import *
from nmigen.lib.cdc import FFSynchronizer
from nmigen.build import *
from nmigen.vendor.xilinx_spartan_3_6 import *

class TestPlatform(XilinxSpartan6Platform):
    device      = "xc6slx9"
    package     = "tqg144"
    speed       = "2"
    default_clk = "clk100"
    resources   = [
        Resource("clk100", 0, Pins("P126", dir="i"), Clock(100e6), Attrs(IOSTANDARD="LVCMOS33")),
        Resource("clk97", 0, Pins("P123", dir="i"), Clock(97e6), Attrs(IOSTANDARD="LVCMOS33")),
        Resource("led", 0, Pins("P119", dir='o'), Attrs(IOSTANDARD="LVCMOS33")),
    ]
    connectors  = []

class Blinky(Elaboratable):
    def elaborate(self, platform):
        clk97    = platform.request("clk97")
        user_led = platform.request("led", 0)
        counter  = Signal(20)
        blink_97 = Signal()
        m = Module()
        m.domains.clk97 = ClockDomain("clk97")
        m.d.comb += ClockSignal("clk97").eq(clk97.i)
        m.submodules.mreg = FFSynchronizer(i=counter[-1], o=blink_97, o_domain="clk97", stages=2, reset=1)
        m.d.sync += counter.eq(counter + 1)
        m.d.comb += user_led.o.eq(blink_97)
        return m

if __name__ == "__main__":
    platform = TestPlatform()
    platform.build(Blinky(), do_program=False)

With or without your patch, it passes timing. I'd forgotten that ISE by default ignores timing between unrelated clocks. I think just the ASYNC_REG attribute is enough here. I'm not 100% sure which flip flop the TIG gets applied to with your patch. The reporting in ISE is rather lacking compared with Vivado.

I ran trce -intstyle ise -v 3 -s 2 -n 3 -fastpaths -xml top.twx top_par.ncd -o top.twr top.pcf to generate a detailed timing report, top.twr It looks like the TIG is getting attached too late in the synchronizer chain even if I move it to stage 0.

@whitequark
Copy link
Contributor

Thanks. Then I think it is fine as it is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants
Contributor Author dlharmon commented It looks like you can't use arbitrary Tcl commands in XDC files (AR#54842), so I moved it to the main script. Please let me know if this works properly. I merged this with my local copy and built a bitstream. The timing report and constraints look good. I like how this works. Thanks.