FPGA interchange lookahead improvements #260

Open
litghost opened this issue Mar 29, 2021 · 4 comments
Labels
enhancement New feature or request

Comments

@litghost

The current lookahead in the FPGA interchange implementation is disabled by default because it is slow to compute and requires significant memory, both while it is being computed and while it is in use. This issue covers how the lookahead might be improved and the goals in that direction.

Current status

The current lookahead is data-driven, so in theory it should be robust across architectures and work in both timing-driven and non-timing-driven situations. It is derived from the extended map lookahead (https://github.com/verilog-to-routing/vtr-verilog-to-routing/blob/master/vpr/src/route/router_lookahead_extended_map.h) developed as part of the VPR flow for https://github.com/symbiflow/symbiflow-arch-defs/ .

Current problems

  • The time to compute the lookahead is fairly high for the A35T (6 min on a 56-core machine). In theory, computation time should not increase with larger fabrics, but this has not been verified.
  • Memory consumption is fairly high both while the lookahead is being computed (~15 GiB) and while it is in use (~6 GiB).

Potential solutions

  • To lower the time spent computing the lookahead, consider shrinking the search space, either geometrically or by limiting the depth of the expansion. Doing so will need to be traded off against router performance.
  • Profile the lookahead computation and see whether further optimizations are possible.
  • Explore ways to shrink the number and size of the CostMap objects (which consume the majority of the disk and RAM).
  • Currently the lookahead has no duplicate detection for wire pairs with similar cost maps. Potentially detect (or supply in the chipdb) wire similarity information. There is a trade-off in requiring architectures to supply this information, so prefer solutions driven by the routing graph alone rather than by architecture-specific input.
  • Explore parametric equations to represent the CostMap objects. It is likely that many or all of the cost maps could be reduced to a set of parametric equations and/or lower-resolution table lookups. See if level-of-detail techniques can be applied.
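
As a rough illustration of the duplicate-detection idea above, here is a minimal Python sketch. It is not taken from nextpnr: the representation of a cost map as a 2D grid of floats, the `(src, dst)` wire-type keys, and the names `quantize` and `dedupe_cost_maps` are all assumptions made for this example. It rounds every entry to a multiple of `step` and shares one stored map between wire pairs whose rounded grids are identical.

```python
def quantize(cost_map, step):
    """Round every entry to the nearest multiple of `step`; return a hashable key."""
    return tuple(tuple(round(v / step) for v in row) for row in cost_map)


def dedupe_cost_maps(cost_maps, step=0.05):
    """Share one stored map between wire pairs whose quantized maps match.

    `cost_maps` maps a hypothetical (src_wire_type, dst_wire_type) key to a
    2D grid of costs. Returns (unique_maps, index), where index[key] is the
    position in unique_maps of the map to use for that wire pair.
    """
    seen = {}          # quantized grid -> position in unique_maps
    unique_maps = []   # one representative map per distinct quantized grid
    index = {}
    for key, cmap in cost_maps.items():
        q = quantize(cmap, step)
        if q not in seen:
            seen[q] = len(unique_maps)
            unique_maps.append(cmap)
        index[key] = seen[q]
    return unique_maps, index
```

This only uses the cost maps themselves, so it needs no architecture-specific input. Its weakness is that two nearly equal values can fall on opposite sides of a rounding boundary and be treated as different, so a real implementation would probably want a distance-based comparison instead.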
@litghost litghost added this to To Do in FPGA interchange bootstrapping via automation Mar 29, 2021
@issuelabeler issuelabeler bot added the duplicate This issue or pull request already exists label Mar 29, 2021
@litghost litghost added enhancement New feature or request and removed duplicate This issue or pull request already exists labels Mar 29, 2021
@mithro
Member

mithro commented Mar 29, 2021

Can the lookahead be generated once and then saved / loaded?

@litghost
Author

> Can the lookahead be generated once and then saved / loaded?

That's already being done. However, consider that GitHub Actions only has 4 threads, so if it takes ~6 minutes on a 56-core machine, worst case it could take ~1 hr to compute in GitHub Actions.
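
For reference, the back-of-the-envelope arithmetic behind that estimate can be written out as below. It assumes the computation scales perfectly linearly with thread count, which has not been measured; the helper name `scaled_minutes` is made up for this example.

```python
def scaled_minutes(minutes, threads_measured, threads_target):
    """Estimate runtime on fewer threads, assuming perfect linear scaling."""
    return minutes * threads_measured / threads_target


# 6 minutes on 56 threads, rescaled to the 4 threads available in CI.
estimate = scaled_minutes(6, 56, 4)
print(estimate)  # 84.0
```

Under that assumption the run would take about 84 minutes, which is in the same range as the figure quoted above.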

@mithro
Member

mithro commented Mar 30, 2021

How big is the file, could it just be shipped with the interchange files in some way?

@litghost
Author

> How big is the file, could it just be shipped with the interchange files in some way?

So there is additional complexity in your suggestion. The lookahead is generated by nextpnr, so it depends on the chipdb binary, which is downstream of the interchange device database. While we might be able to generate the lookahead from the device database, there is a risk of a mismatch between nextpnr and the lookahead. That is why the current logic is bound to the chipdb, not the device database.
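
To illustrate what binding a saved lookahead to a specific chipdb could look like, here is a hypothetical Python sketch. It does not describe how nextpnr actually does this: the file naming scheme and the functions `chipdb_digest` and `lookahead_cache_path` are assumptions for this example. The idea is to derive the cache file name from a hash of the chipdb binary, so a lookahead built against one chipdb is never picked up for another.

```python
import hashlib
from pathlib import Path


def chipdb_digest(chipdb_path):
    """Return the SHA-256 hex digest of the chipdb binary, read in chunks."""
    h = hashlib.sha256()
    with open(chipdb_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def lookahead_cache_path(chipdb_path, cache_dir):
    """Hypothetical cache location keyed on the chipdb contents."""
    return Path(cache_dir) / f"lookahead-{chipdb_digest(chipdb_path)[:16]}.bin"
```

A changed chipdb yields a different path, so a stale lookahead is simply not found and gets regenerated rather than being loaded against the wrong database.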
