Data processor block/upgrade. #1077
Comments
General question: why an extra component for this? For the codec/compression methods that would make sense, but data loading/dumping from/to bytes should be part of the base API.
|
I'd like this to be an extra component to keep the base API small. Changing the base API also potentially breaks persistence (i.e. it will force computers to reboot when loaded from an old save), so I'd like to avoid it for that reason, too. |
Those proposals are only the bare basics. As a separate component, further additions are easily possible. @fnuecke this is why I proposed a separate component with more functions. |
Maybe
Regarding endianness, there is no situation in which it would be necessary to know the real computer's endianness. None of the functions usable from Lua manipulate bytes in a way that endianness matters. The data processor component would be the only exception, and its default endianness should be consistent, no matter what the host computer uses. |
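To illustrate the point about a consistent default byte order, here is a minimal sketch using Lua 5.3's `string.pack`, which is an assumption here since the thread had not yet settled on it. The `hex` helper is hypothetical and only there for display. With an explicit `<` or `>` prefix in the format string, the packed bytes do not depend on the host's endianness.

```lua
-- Requires Lua 5.3 or newer (string.pack / string.unpack).
local value = 1234567  -- 0x0012D687

local little = string.pack("<i4", value)  -- least significant byte first
local big    = string.pack(">i4", value)  -- most significant byte first

-- Hypothetical helper: render a binary string as space-separated hex bytes.
local function hex(s)
  local out = {}
  for i = 1, #s do
    out[i] = string.format("%02X", s:byte(i))
  end
  return table.concat(out, " ")
end

print(hex(little)) --> 87 D6 12 00
print(hex(big))    --> 00 12 D6 87

-- Unpacking with the matching prefix restores the value on any host.
print((string.unpack("<i4", little))) --> 1234567
print((string.unpack(">i4", big)))    --> 1234567
```

A fixed prefix like this is what a network protocol would typically need, since both ends agree on the byte order up front.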
@Kubuxu I know you know ;-) My post was in reply to @dgelessus's question why it should be an extra component. |
It is necessary when communicating over a low-level internet protocol. And also there is There will be no Look into my edit of initial post I proposed that |
@Kubuxu I didn't mean that
A few examples below. (There's no need to understand what the "data" strings mean; they are just the packed binary data for the values.)
-- Un/packing a string
format, data = tobytes("string with a \0 byte")
print(format, data) --> 20s string with a \0 byte
print(frombytes(format, data)) --> string with a \0 byte
-- Packing a double
format, data = tobytes(math.pi)
print(format, data) --> d \024-DT\251!\009@
-- Packing multiple values
format, data = tobytes(42, 24, "string", false)
print(format, data) --> 2b6s? *\024string\000
print(frombytes(format, data)) --> 42 24 string false
-- Packing using a format string
data = tobytesf("5si6s", "Hello", 1234567, "World!")
print(data) --> Hello\135\214\018\000World!\000
print(frombytes("5si6s", data)) --> Hello 1234567 World! |
for tobytes/frombytes are y'all looking for something like Lua 5.3's string.pack and string.unpack (http://www.lua.org/manual/5.3/manual.html#pdf-string.pack) ? |
I must say that people from Lua thought it out great. What about:
-- Un/packing a string
data, format = tobytes("string with a \0 byte")
print(format, data) --> c20 string with a \0 byte
print(unpack(format, data)) --> string with a \0 byte
-- Packing a double
data, format = tobytes(math.pi)
print(format, data) --> n \024-DT\251!\009@
-- Packing multiple values
data, format = tobytes(42, 24, "string", false)
print(format, data) --> nnc6b *\024string\0
print(frombytes(format, data)) --> 42 24 string false
-- Packing using a format string
data = pack("zi6z", "Hello", 1234567, "World!")
print(data) --> Hello\0\135\214\018\000World!\0
print(frombytes("zi6z", data)) --> Hello 1234567 World!
|
Those functions from Lua 5.3 are basically what I'm suggesting. (I bet there are some underlying C functions that do the same thing.) |
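For comparison with the proposals above, this is roughly how the same values look with the real Lua 5.3 functions. It is a sketch that assumes a Lua 5.3+ interpreter, and the `<` prefix (little-endian) is chosen here only to make the bytes reproducible. Note that `i4` is used for a 4-byte integer, since `i6` in Lua 5.3 would mean a 6-byte integer.

```lua
-- Requires Lua 5.3 or newer.

-- Zero-terminated strings around a 4-byte little-endian integer.
local fmt = "<zi4z"
local data = string.pack(fmt, "Hello", 1234567, "World!")
print(#data) --> 17  (6 + 4 + 7 bytes)

local a, n, b = string.unpack(fmt, data)
print(a, n, b) --> Hello 1234567 World!

-- A double packs to the same 8 bytes shown earlier in the thread.
local pi_bytes = string.pack("<d", math.pi)
print(pi_bytes == "\024-DT\251!\009@") --> true

-- "z" cannot hold embedded zero bytes; a length-prefixed string ("s1") can.
local ok = pcall(string.pack, "z", "a\0b")
print(ok) --> false
local packed = string.pack("<s1", "a\0b")
print(#packed) --> 4  (1 length byte + 3 data bytes)
print((string.unpack("<s1", packed)) == "a\0b") --> true
```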
The |
@dgelessus You'd want to skip the format string when you already know what it is and have coded it in - or cached it - on the receiving end. I'm curious as to what the intended use case for this is. But other than these external services, when would you use (un)pack in preference to http://ocdoc.cil.li/api:serialization ? |
My intended use was to be able to compress data, save binary data in database w/o ugly hacks. |
I've kind of lost track, what's the current consensus? That this would provide Lua 5.3's Would someone else like to take a shot at this? That'd be highly appreciated. If so, for reference have a look at the unicode API for example. Keep in mind this needs to more or less be implemented twice, once for the native arch and once for the LuaJ arch. |
String.pack would fix one (the most significant) part of the request. It does request compression and base64 utilities, but I think those were secondary, as in "if we make a block anyway, may as well add this too". So yes, string.pack and family would provide what's requested. Presuming 5.3 is coming anyway it would be less work to simply wait for it and require it for this functionality 😈 |
@MaHuJa Getting OC onto Lua 5.3 is only a matter of fixing one bug in Eris for Lua 5.3, but we can't reproduce it outside of OC, and OC itself is a huge place to look for it. |
I saw it - but my argument is basically that resolving #811 would resolve this with no added work. |
All right, since it's in now: if you want the string packing stuff, use Lua 5.3; for other things, there's also the data card now, so I'll be closing this. |
Data processor would allow advanced data processing (pun not intended).
Its main purpose would be to provide functions for bulk data storage or lower level functions on data.
Proposed functions:
tobytes(data1: Primitive[, data2: Primitive[, ...]]) -> ByteArray, format: String
- Returns the byte-encoded primitives. Throws an error if there is more than one argument and one of them is a string with a null byte in it.
tobytesf(format: String, ...) -> ByteArray, format: String
- Accepts a format string and a variable number of arguments. Returns a ByteArray and the same format string, for consistency with tobytes.
frombytes(format/type: String, bytes: ByteArray) -> Primitive (of type 'type') or ..., depending on whether a type or a format was specified
- Returns the data decoded from bytes.
- d: for double
- b: for boolean
- s: for string (the string can't contain a 0 byte)
- r (like raw): for byte array. r is required to be followed by a decimal number of bytes.
deflate(bytes: ByteArray) -> ByteArray
- Applies DEFLATE compression.
inflate(bytes: ByteArray) -> ByteArray
- Applies INFLATE decompression.
endianness() -> String
- Returns "big" on big-endian systems and "little" on little-endian systems.
encode64(bytes: ByteArray) -> ByteArray
- Returns bytes encoded to base64.
decode64(bytes: ByteArray) -> ByteArray
- Returns bytes decoded from base64.
Almost all of those functions can be implemented from inside Lua, but the implementation is never perfect and is usually very slow; Example.
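As a rough idea of what such an in-Lua implementation looks like, here is a minimal sketch of a base64 encoder in plain Lua. It is not the code linked above, and the name `encode64` is only borrowed from the proposal. It does table lookups and arithmetic per three input bytes, which hints at why a native implementation would be much faster on large inputs.

```lua
-- Standard base64 alphabet (RFC 4648), indexed from 0 via sub(i + 1, i + 1).
local ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

-- Sketch of the proposed encode64, written with plain arithmetic so it
-- does not depend on bitwise operators.
local function encode64(bytes)
  local out = {}
  for i = 1, #bytes, 3 do
    -- b2 and b3 are nil when the input length is not a multiple of 3.
    local b1, b2, b3 = bytes:byte(i, i + 2)
    -- Combine up to three bytes into one 24-bit number; missing bytes count as 0.
    local n = b1 * 65536 + (b2 or 0) * 256 + (b3 or 0)
    -- Split the 24 bits into four 6-bit indices.
    local c1 = math.floor(n / 262144) % 64
    local c2 = math.floor(n / 4096) % 64
    local c3 = math.floor(n / 64) % 64
    local c4 = n % 64
    out[#out + 1] = ALPHABET:sub(c1 + 1, c1 + 1)
    out[#out + 1] = ALPHABET:sub(c2 + 1, c2 + 1)
    -- Pad with "=" where the input ran out.
    out[#out + 1] = b2 and ALPHABET:sub(c3 + 1, c3 + 1) or "="
    out[#out + 1] = b3 and ALPHABET:sub(c4 + 1, c4 + 1) or "="
  end
  return table.concat(out)
end

print(encode64("Man")) --> TWFu
print(encode64("Ma"))  --> TWE=
print(encode64("M"))   --> TQ==
```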
EDIT2: Edited the main part of post to incorporate changes.