Implement a sanitizer for memory port combinations #12

Open
whitequark opened this issue Dec 21, 2018 · 5 comments

@whitequark
Contributor

There are many possible combinations of memory ports, and many of them are not legal; which ones are legal depends on the vendor. In this issue we collect such behaviors to eventually implement a sanitizer, either as a part of nMigen or as a Yosys pass.

@whitequark
Contributor Author

whitequark commented Dec 21, 2018

Lattice iCE40

DRAM

No distributed RAM.

BRAM

Configurable as 2048x2, 1024x4, 512x8 or 256x16.

Block RAM rules:

  • Port A: read-only.
  • Port B: write-only.
  • All reads and writes are synchronous.
  • If A and B have the same address, after a simultaneous RCLK and WCLK edge, port A returns the data that was there before the write (non-transparent).
  • TN1256 is unclear about the availability of RE: it claims that RE is only available in the 256x16 configuration, but it still lists the primitives for all other configurations with an RE port.
  • WE with sub-word granularity (i.e. MASK) only available in the 256x16 configuration.
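As a rough illustration of how the rules above could feed a sanitizer, here is a minimal Python sketch. The function name, its parameters, and the message strings are all hypothetical; nothing here is an existing nMigen or Yosys API.

```python
# Hypothetical sketch: none of these names exist in nMigen or Yosys.
# Native iCE40 block RAM shapes, as (depth, width), from the list above.
ICE40_BRAM_SHAPES = {(2048, 2), (1024, 4), (512, 8), (256, 16)}


def check_ice40_bram(depth, width, uses_write_mask=False, async_read=False):
    """Return a list of problems with a requested iCE40 block RAM port setup."""
    problems = []
    if (depth, width) not in ICE40_BRAM_SHAPES:
        problems.append(f"{depth}x{width} is not a native block RAM shape")
    # Sub-word write enable (MASK) exists only in the 256x16 configuration.
    if uses_write_mask and (depth, width) != (256, 16):
        problems.append("sub-word write mask requires the 256x16 configuration")
    # All block RAM reads and writes are synchronous.
    if async_read:
        problems.append("asynchronous read is not supported by block RAM")
    return problems


print(check_ice40_bram(512, 8, uses_write_mask=True))
```

An empty list means the request fits the rules collected so far; a real sanitizer would presumably also have to decide whether to split or pad non-native shapes rather than reject them.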

Yosys behavior

  • Yosys will add logic to make a transparent port, if requested.
  • Yosys does not use RE or WE at all. Instead, RCLKE and WCLKE are used. This is due to a silicon bug. This should not affect anything as these ports are equivalent in functionality, and in fact it is not clear why RE and WE even exist in the first place.
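To make the difference between a non-transparent and a transparent port concrete, here is a small behavioral model in Python. It is only a sketch: the class names are made up, and the bypass shown is one common way to build a transparent port, not necessarily the logic Yosys emits.

```python
# Hypothetical behavioral model; class names are illustrative only.
class NonTransparentRam:
    """One read port, one write port, both synchronous, sharing one clock."""

    def __init__(self, depth):
        self.mem = [0] * depth

    def tick(self, raddr, waddr=None, wdata=None):
        # The read samples the old contents before the write commits,
        # so a same-address read returns the previous value.
        rdata = self.mem[raddr]
        if waddr is not None:
            self.mem[waddr] = wdata
        return rdata


class TransparentWrapper:
    """Adds a bypass so that a same-address read returns the new data."""

    def __init__(self, depth):
        self.ram = NonTransparentRam(depth)

    def tick(self, raddr, waddr=None, wdata=None):
        old = self.ram.tick(raddr, waddr, wdata)
        # Bypass: on an address collision, forward the written data.
        if waddr is not None and waddr == raddr:
            return wdata
        return old


plain = NonTransparentRam(4)
print(plain.tick(raddr=1, waddr=1, wdata=5))    # old value
bypass = TransparentWrapper(4)
print(bypass.tick(raddr=1, waddr=1, wdata=5))   # new value
```

In hardware the bypass costs an address comparator, a registered copy of the write data, and a mux on the read data path, which is why it matters whether the toolchain adds it silently.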

iCECube behavior

  • It is unknown if iCECube will add logic to make a transparent port.
  • iCECube does not use RE or WE at all, similarly to Yosys.

@nakengelhardt
Contributor

nakengelhardt commented Dec 21, 2018

Xilinx (7 Series and UltraScale)

DRAM

  • supports asynchronous read
  • if synchronous read is used, write is transparent to all other ports
  • a memory description matching the SP/SDP/TDP pattern is mapped to DRAM below a certain size, or when there isn't much other logic in the design and the synthesizer doesn't feel like trying very hard; increasing the size of the memory may lead the synthesizer to put the same description in BRAM instead
  • sometimes it will find a read register somewhere even if you intended to write an asynchronous read port

BRAM

Configurable as 32768x1, 16384x2, 8192x4, 4096x9, 2048x18, 1024x36, 512x72 (36 Kb) or 16384x1, 8192x2, 4096x4, 2048x9, 1024x18, 512x36 (18 Kb) in SDP mode.

  • the underlying BRAM hard block has two ports which can do both reads and writes.
  • using the same address in the same cycle on both ports is considered an aberration. There are only two permissible cases:
    • both ports are reading, or
    • one port is writing AND it is in READ_FIRST mode, in which case the other port will read the old data. (Note that it is the setting of the write port that defines the behavior of the read port. Setting of the read port is irrelevant.)
  • all other cases are user error and have undefined behavior (observing with ILA, the other port usually reads the old data).
  • a port that writes will also always read simultaneously. In fact, there is no "read enable" signal; there is only a "port enable" signal and a "write enable" signal. In the Xilinx examples for how to infer memories, what Migen calls re is called ena/enb, and it always gates writes as well.
  • the setting of READ_FIRST/WRITE_FIRST/NO_CHANGE affects which data ends up on the read data signal of the writing port.
  • there is no way to get the "fresh" data on the other port.
  • when inferring any memory, unless you used the exact READ_FIRST template, Vivado will assume you know that the behavior is undefined despite having written very well-defined Verilog, and it can pick any behavior it wants for collisions.
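The collision rules above can be summarized as a small decision function. This is a sketch under the assumptions stated in this comment only; the function name, argument names, and result strings are hypothetical and do not correspond to any Vivado or nMigen API.

```python
# Hypothetical sketch of the same-address, same-cycle rules listed above.
def xilinx_bram_collision(a_writes, b_writes,
                          a_mode="WRITE_FIRST", b_mode="WRITE_FIRST"):
    """Classify an access where both ports use the same address in one cycle."""
    if not a_writes and not b_writes:
        return "ok: both ports read"
    if a_writes and b_writes:
        return "undefined: both ports write"
    # Exactly one port writes. Only the mode of the writing port matters;
    # the mode of the reading port is irrelevant.
    writer_mode = a_mode if a_writes else b_mode
    if writer_mode == "READ_FIRST":
        return "ok: reading port sees old data"
    return "undefined: writer is not in READ_FIRST mode"


print(xilinx_bram_collision(True, False, a_mode="READ_FIRST"))
print(xilinx_bram_collision(True, False, a_mode="WRITE_FIRST"))
```

A sanitizer could use a check like this to reject, or at least warn about, any memory whose ports may collide without the writer being in READ_FIRST mode.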

@daveshah1

daveshah1 commented Dec 21, 2018

Lattice ECP5

Distributed RAM

Uses 3 SLICEs, two for memory and one to feed in the write port, to implement a 16x4 RAM. One synchronous write port with enable and one asynchronous read port. The SLICE flip flops can be used to make the read port synchronous. Written data propagates in about 800ps from write clock edge to async read port (if addresses are the same).

Block RAM

Configurable as 16384x1, 8192x2, 4096x4, 2048x9 or 1024x18 true dual port (DP16KD) or 512x36 pseudo dual port (this mode not implemented in Yosys/nextpnr yet, it uses a different PDPW16KD primitive).

Rules:

  • Port A read and write (DP16KD) or write only (PDPW16KD)
  • Port B read and write (DP16KD) or read only (PDPW16KD)
  • Both ports configurable to three write modes (Yosys only uses the latter two):
    • NORMAL: read data when an address is also written to is undefined
    • READBEFOREWRITE: read data when an address is also written to is the previous value
    • WRITETHROUGH: read data when an address is also written to is the new value
  • Both ports have synchronous read and write only. An additional output register can be enabled per
    port, increasing read latency to 2 clocks.
  • "Byte" (actually 9 bit bytes) enables are available in 18 and 36 bit modes, actually
    using the otherwise unused lower address bits.
  • BRAMs are given an ID (WID) which is then used to initialise them by a command
    referencing this ID after the main bitstream. Assigning WIDs at synthesis, although
    not currently implemented, could make swapping out BRAM content easier.
  • Each port has a clock enable (CE), write enable (WE) and output register clock enable (OCE).
  • Each port has three chip select pins, the port is only enabled when these match the port's
    CSDECODE parameter.
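The three write modes can be captured in a tiny model, which may be handy as a reference when comparing vendors. This is a sketch only: the function name and the use of None for "undefined" are illustrative choices, not part of any Lattice or Yosys interface.

```python
# Hypothetical sketch of the three ECP5 block RAM write modes listed above.
# Per the list above, Yosys only uses the latter two.
YOSYS_USED_MODES = ("READBEFOREWRITE", "WRITETHROUGH")


def ecp5_read_during_write(mode, old, new):
    """Return the read data seen when the read address is also being written.

    None stands for "undefined"; a sanitizer would treat it as an error.
    """
    if mode == "NORMAL":
        return None
    if mode == "READBEFOREWRITE":
        return old
    if mode == "WRITETHROUGH":
        return new
    raise ValueError(f"unknown write mode: {mode}")


for mode in ("NORMAL",) + YOSYS_USED_MODES:
    print(mode, ecp5_read_during_write(mode, old=0x11, new=0x22))
```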

Yosys Behaviour

  • As Yosys doesn't support true dual port inference, port A is always write and port B is always read.
  • WE is used to enable write ports, CE is used to enable read ports
  • CSDECODE, extra output registers, and the PDPW16KD primitive are not currently inferred

Diamond Behaviour

TBC

@whitequark
Contributor Author

@nakengelhardt Can you explain why UG473 specifies the BRAM dimensions for SDP mode specifically? Is it different for TDP mode?

@nakengelhardt
Contributor

Do you mean this?

Each 18 Kb block and 36 Kb block can also be configured in a simple dual-port RAM mode. In this mode, the block RAM port width doubles to 36 bits for the 18 Kb block RAM and 72 bits for the 36 Kb block RAM.

The aspect ratios supported are different for SDP and TDP mode, yes. Tables 1-11 to 1-14 on pages 29-30 have the valid combinations.
