Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rework std.base64 api #621

Merged
merged 2 commits into from Nov 21, 2017
Merged

rework std.base64 api #621

merged 2 commits into from Nov 21, 2017

Conversation

thejoshwolfe
Copy link
Sponsor Contributor

  • rename decode to decodeExactUnsafe.
  • add decodeExact, which checks for invalid chars and padding.
  • add decodeWithIgnore, which also allows ignoring chars.
  • alphabets are supplied to the decoders with their
    char-to-index mapping already built, which enables it to be
    done at comptime.
  • all decode/encode apis except decodeWithIgnore require dest
    to be the exactly correct length. This is calculated by a
    calc function corresponding to each api. These apis no longer
    return the dest parameter.
  • for decodeWithIgnore, an exact size cannot be known a priori.
    Instead, a calc function gives an upperbound, and a runtime
    error is returned in case of overflow. decodeWithIgnore
    returns the number of bytes written to dest.

closes #611

@thejoshwolfe
Copy link
Sponsor Contributor Author

Now that i'm looking at it, i want to move the decode functions into the "Alphabet" structs, and rename the "Alphabet" structs to "Decoder".

@andrewrk
Copy link
Member

Looks good to me, feel free to merge. However I'd like to suggest not using "unsafe" to communicate that something could be an undefined value. "unsafe" usually communicates that something could be undefined behavior which is very different.

@thejoshwolfe
Copy link
Sponsor Contributor Author

would "unchecked" be better than "unsafe"?

@andrewrk
Copy link
Member

yes

@thejoshwolfe
Copy link
Sponsor Contributor Author

actually calcDecodedSizeExactUnsafe() uses @divExact() to assume the input length is a multiple of 4. That is actually undefined behavior if that assumption is not met. And that assumption is transitively included in all the other Unsafe methods, so Unsafe is a correct description, albeit just for that one assumption. Invalid characters are still defined behavior with undefined values.

@andrewrk
Copy link
Member

Ok sounds like it's good to merge then

@thejoshwolfe
Copy link
Sponsor Contributor Author

Do you have an opinion on this new OOP-looking API? Seems like the namespace is cleaner, and I like the new comptime asserts for encoding.

* rename decode to decodeExactUnsafe.
* add decodeExact, which checks for invalid chars and padding.
* add decodeWithIgnore, which also allows ignoring chars.
* alphabets are supplied to the decoders with their
  char-to-index mapping already built, which enables it to be
  done at comptime.
* all decode/encode apis except decodeWithIgnore require dest
  to be the exactly correct length. This is calculated by a
  calc function corresponding to each api. These apis no longer
  return the dest parameter.
* for decodeWithIgnore, an exact size cannot be known a priori.
  Instead, a calc function gives an upperbound, and a runtime
  error is returned in case of overflow. decodeWithIgnore
  returns the number of bytes written to dest.

closes #611
@thejoshwolfe
Copy link
Sponsor Contributor Author

I just started trying to write a simple base64 transform command line program using this api, and i've immediately run into a design limitation 😞. The existing API is hostile to streaming.

New idea. Replace the Base64DecoderWithIgnore class with a streaming decoder. It's already behaving a lot like a streaming decoder internally anyway. That can be an improvement after this PR is merged though. And I probably won't have time to do that in the immediate future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

std.base64 should support reporting errors and ignoring non alphabet characters
2 participants