Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bitarray implementation to eliminate external dependency #265

Closed
wants to merge 16 commits into from

Conversation

mndza
Copy link

@mndza mndza commented Jan 24, 2021

This PR is an attempt to address #179. This bitarray implementation follows the interface established by glasgow.support.bits while changing the backing storage to an array() object. Currently the element size of this array is a byte and not a machine word.

I am not particularly happy with the byte_reverse / byte_reversed method names to imply in-place modification or not. Also, I am not sure about the correct behavior in those functions when the number of stored bits is not a multiple of the byte size. Right now it drops the bits that exceed the bit length after the byte reversal.

Please note that for initializing the bitarray from a bytes object you need to provide the explicit length in bits. This is because I copied the constructor interface from glasgow.support.bits, but I do not know if that is the desired operation.

Most of the test cases were copied from glasgow.support.bits.

Copy link
Member

@attie attie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mndza thanks very much for doing this!

I've had a rummage, and found a few issues - sorry!

... this should be a "request changes" review, I'm still learning GitHub's interface.

Comment on lines +201 to +204
def setall(self, value):
value_int = ~(-1 << self._len_) if bool(value) else 0
value_bytes = value_int.to_bytes(len(self._array_), "little")
self._array_ = array('B', value_bytes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation makes me nervous as it stands...

It seems to work with absurdly large integers, but did I just not go high / complex enough?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still haven't found a way to do it faster than this. Regarding your concern, this code should in fact work for arbitrarily large numbers, as Python's int implementation is only limited by the amount of available memory.

I also tried some other approaches. The following runs at least 10x slower in my machine, testing with 1GiB bitarray:

def setall(self, value):
    byte = 0xFF if bool(value) else 0
    self._array_ = array('B', bytes(byte for _ in range(len(self._array_))))
    if self._len_ % 8:
        self._array_[-1] &= ~(-1 << (self._len_ % 8))

self.fuse = bitarray(int(count, 10), endian="little")
self.fuse = bitarray(0, int(count, 10))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again re avoiding changes...

Do you handle endianness at all at the moment? Is it important?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding endianness, the class currently only works in "little-endian" mode: least significant bytes are stored at the smallest address, and bits are ordered from LSB to MSB inside a byte. This is also consistent with glasgow.support.bits.

I am not sure if this is a problem though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Endianness should emphatically not be a property of the new bitarray class. However, it must be possible to specify endianness when serializing/deserializing to/from buffer objects. E.g. SPI and JTAG have opposite endianness and we are actually going to hit this when working on gateware.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I can see that glasgow.support.bits does not have a way to do so. It's fine to extend both of them at the same time in a different PR, provided that existing bitarray-using code can be ported as it is.

software/glasgow/protocol/jesd3.py Show resolved Hide resolved
software/glasgow/support/bitarray.py Show resolved Hide resolved
software/glasgow/support/bitarray.py Show resolved Hide resolved
software/glasgow/support/bitarray.py Outdated Show resolved Hide resolved
software/glasgow/support/bitarray.py Outdated Show resolved Hide resolved
Comment on lines +320 to +345
def byte_reverse(self):
""" In-place byte reversal """
if not hasattr(self.byte_reverse, "table"): # only build once
self.byte_reverse.__func__.table = self.__build_revbyte_table()
table = self.byte_reverse.table
for i in range(len(self._array_)):
self._array_[i] = table[self._array_[i]]
if self._len_ % 8:
self._array_[-1] &= ~(-1 << (self._len_ % 8))
return self

def byte_reversed(self):
new = self.__class__(self._array_, self._len_)
return new.byte_reverse()

def reversed(self):
if self._len_ % 8:
# Adding zero padding in the low bits allows easier bit reversal
# for the cases where bit length is not multiple of byte size
out = self.__class__(0, 8-(self._len_%8)) + self
else:
out = self.__class__(self._array_, self._len_)
out._array_.reverse() # Reverse byte order (last to first)
out.byte_reverse() # Reverse bit order in every byte
out._len_ = self._len_ # Truncate possible length extension due to padding
return out
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The distinction between bitarray.byte_reverse(), bitarray.byte_reversed() and bitarray.reversed() is very subtle on first glance... perhaps a comment?

Also, some don't behave properly, or observe self._len_ properly: 😬

>>> x = bitarray.from_bytes(b'abc', 8*4)
>>> x
bitarray('00000000011000110110001001100001')
>>> x.byte_reverse()
bitarray('00000000110001100100011010000110')
>>> x.byte_reversed()
bitarray('00000000011000110110001001100001')
>>> x.reversed()
bitarray('00000000011000010110001001100011')

Copy link
Author

@mndza mndza Jan 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I know the distinction is not very clear (see my original PR message). I will add some docstrings, but maybe we could remove one of the methods (byte_reverse() or byte_reversed()) ?

The wrong result in reversed() was caused by the bug fixed in b99b66f.

Regarding self._len_:

  • In reversed(), self._len_ is used to determine how much padding should be added before the actual reverse process. Not seeing the problem here.
  • In byte_reverse() / byte_reversed(), entire bytes are reversed, so the only thing I do when self._len_ is not multiple of 8 is mask the final byte. I also stated my doubts about the correct operation here in the original PR. Could you explain what behavior are you expecting in this case? Then I may modify the method to comply.


def __bitop(self, other, op, clear_top=False):
other = self.__class__(other)
# Only perform operation in-place when the other bitarray is smaller
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a wrong condition to pick between mutating and copying operations.

Copy link
Author

@mndza mndza Jan 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the proper response when other's bit length is larger? Maybe raise an exception like @attie (#265 (comment)) mentioned? If so, should that only happen for the in-place / mutating operations?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the behavior in commit ffc277a

Base automatically changed from master to main February 27, 2021 09:59
Copy link
Member

@whitequark whitequark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution. Unfortunately, while the basic approach to the task was sound, this particular implementation includes a substantial degree of complexity (e.g. the __copy_bit_slice method) and numerous issues (see inline comments), and as a result I cannot convince myself that we should merge it. Again unfortunately, I also do not expect that I will have the review bandwidth necessary to bring this implementation to the point where it would be mergeable.

I apologize for causing you to spend your time in this way. I think that some of the code in this PR could be reused by a future developer handling the same task, so hopefully it will have some utility in the end.

@whitequark whitequark closed this Jul 1, 2023
@mndza
Copy link
Author

mndza commented Jul 25, 2023

I do agree with you: looking at this code now (it's been a while) and I'm not happy with it. I should have targeted for simplicity and correctness first. My apologies.

@whitequark
Copy link
Member

Thank you for understanding. I'm happy to work with you on any future contributions, if you would like to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants