Instability due to small changes to memory initialization #1776
Comments
@acomodi There are a couple of things going wrong here. I would not expect a huge change in routing time from memory init changes?
@acomodi This kind of sounds like the BRAM is being converted into flops?
@mithro Hmm, I doubt this is the issue. I am pretty sure that, if a BRAM gets inferred as logic, a much higher number of SLICEMs will be used, and DRAMs would get inferred instead, which is not the case here: Pack log:
I have compared the two resource utilizations from the pack.log and they are exactly the same in both runs. This means that the circuit is the same and it gets implemented on the same number and types of resources. I think this might be an issue with the initial placement, and with placement in general, specific to the BRAMs. Given that we currently support only BRAM_Ls, some BRAMs may end up placed far from the core logic of the design, leading to the reported differences in CPD and run-time. This is not actually specific to BRAMs but applies to all tiles; however, given the huge choice of CLBs, the placer should optimize their placement correctly, while the scarcity of BRAMs can result in bad placements (with consequently bad routing results).
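To illustrate why a far-away BRAM placement hurts, here is a toy half-perimeter wirelength (HPWL) estimate. This is a minimal sketch, not VPR's actual cost model, and the grid coordinates below are made up for illustration:

```python
# Toy half-perimeter wirelength (HPWL) estimate for a net connecting
# CLB logic pins to a BRAM. Coordinates are hypothetical grid
# locations; this is NOT VPR's real placement cost, just an
# illustration of why a distant BRAM inflates the bounding box.

def hpwl(points):
    """Half-perimeter wirelength of the bounding box around the pins."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

clb_pins = [(10, 10), (11, 10), (10, 12)]   # core logic cluster
near_bram = clb_pins + [(14, 11)]           # a BRAM_L close to the logic
far_bram = clb_pins + [(40, 3)]             # only available BRAM_L is far away

print(hpwl(near_bram))  # -> 6
print(hpwl(far_bram))   # -> 39, much larger span -> worse routing / CPD
```

A larger bounding box means longer nets for the router to resolve, which is consistent with the degraded CPD and run-time reported above.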
@acomodi - So this has nothing to do with the BRAM contents then and should happen with different seeds?
@mithro At the moment these are only theories. I am performing additional tests and also trying different seeds to see if this might be the real issue here. I'll post some additional data soon.
Update: It seems that the issue is indeed the initial placement. Using the same packer output and changing the placement seed to 1000, I got the following routing iterations: default seed:
custom seed (1000):
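As a side note, the seed's effect on the starting point can be illustrated with a toy seeded random placer. This is a loose sketch, not VPR code; the block names and site grid are hypothetical:

```python
# Toy demonstration that the RNG seed alone changes an "initial
# placement". This loosely mimics a seeded random placer; it is not
# VPR code, just an illustration of why two runs with different
# --seed values anneal from different starting points.

import random

def initial_placement(blocks, sites, seed):
    """Randomly assign each block to a distinct site, driven by the seed."""
    rng = random.Random(seed)
    shuffled = sites[:]
    rng.shuffle(shuffled)
    return dict(zip(blocks, shuffled))

blocks = ["bram0", "slice0", "slice1"]
sites = [(x, y) for x in range(4) for y in range(4)]

p_default = initial_placement(blocks, sites, seed=1)
p_custom = initial_placement(blocks, sites, seed=1000)
print(p_default)
print(p_custom)
# The same seed always reproduces the same starting placement;
# a different seed generally yields a different one, and the
# annealer then converges to a different final QoR.
```

Because the placer is deterministic for a fixed seed, re-running with the same seed is a quick way to separate seed sensitivity from genuine input-dependent instability.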
This is using the current symbiflow-arch-defs master (1d92154) and its conda VTR package. I have also double-checked the packer utilization results between two different runs, and the utilization actually changes from run to run: Control run:
Test run:
There is a variation in the SLICEL count.
The variation in the SLICEL count might also be a packing issue, rather than a placer issue. Still worth investigating.
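A quick way to flag this kind of run-to-run packing drift is to diff the per-block-type counts between the two pack.log reports. A minimal sketch, where the dictionaries are hypothetical stand-ins for numbers parsed out of pack.log:

```python
# Compare resource-utilization counts from two packing runs and
# report any block type whose count drifted. The dictionaries below
# are hypothetical stand-ins for values parsed out of pack.log.

def utilization_drift(control, test):
    """Return {block_type: (control_count, test_count)} where counts differ."""
    keys = set(control) | set(test)
    return {k: (control.get(k, 0), test.get(k, 0))
            for k in keys
            if control.get(k, 0) != test.get(k, 0)}

control_run = {"SLICEL": 1200, "SLICEM": 310, "BRAM_L": 16}
test_run    = {"SLICEL": 1204, "SLICEM": 310, "BRAM_L": 16}

print(utilization_drift(control_run, test_run))
# -> {'SLICEL': (1200, 1204)}
```

An empty result would mean the packer output is stable at the utilization level, pointing the blame back at placement.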
@litghost Indeed. What bugs me, though, is the huge difference in the eblifs: a memory initialization change should not alter the synthesized output this much. In any case, this issue will for now affect all the LiteX test results from CI, so two different CI runs cannot be compared at the moment.
While testing the RR graph base costs fixes I came across a possible instability issue that causes run-time and QoR to change drastically from one build to another.
With the newly added LiteX designs autogeneration, the resulting LiteX-generated design files are exactly the same from one build to another, except for the mem.init files. The difference in the mem.init files is due to the LiteX BIOS, which reports a different timestamp for each design generation. This is expected to happen.
The small changes in the mem.init files, however, produce a great difference between the final outputs of the two runs. In fact, the difference in the generated eblif files is huge, even though they should differ only in specific BRAM init value fields. This leads to VPR producing different outputs, step by step, with a kind of non-deterministic behaviour.
An example is the minilitex test:
Difference in litex-generated files:
Difference in resulting routing run-time and QoR:
Control run:
Test run:
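The claim that the mem.init difference is localized to the timestamp can be checked mechanically by counting how many init words actually differ between the two runs. A minimal sketch, where the sample words are made up (in a real check you would read the two mem.init files and split them into words):

```python
# Quantify how localized the difference between two LiteX-generated
# mem.init files is. The sample words below are hypothetical; in a
# real check you would load the two files and split their contents.

def count_diff_words(words_a, words_b):
    """Number of positions at which the two init-word lists differ."""
    n = max(len(words_a), len(words_b))
    pad_a = words_a + [""] * (n - len(words_a))
    pad_b = words_b + [""] * (n - len(words_b))
    return sum(1 for a, b in zip(pad_a, pad_b) if a != b)

# Hypothetical init contents: identical except for the two words
# that encode the BIOS build timestamp.
control = ["deadbeef", "0000cafe", "20200101", "120000", "aabbccdd"]
test    = ["deadbeef", "0000cafe", "20200102", "093000", "aabbccdd"]

print(count_diff_words(control, test))  # -> 2
```

If only a handful of words differ at the mem.init level while the eblif diff is huge, the amplification is happening in synthesis, which narrows down where to look.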
NOTE: this test has been performed using the RR graph base costs fixes, but I am running experiments using the master+wip version on Symbiflow baseline, and I do expect to see a similar behavior.