author Yorhel <git@yorhel.nl> 2012-03-21 19:41:03 +0100
committer Yorhel <git@yorhel.nl> 2012-03-21 19:41:03 +0100
commit 7cdb326621e532727e1faff6085b90ac39ffca7a
tree 2e3b46c1b786906d3b8ed32c55cde85632d77b3e
parent 3efce78eb69898dac2166afe169844eb8795b2c9
Sync spec.pod and proto.pod with behaviour of Perl and C implementation
Well, almost. Neither the Perl nor the C implementation completely follows the current spec, but their overall semantics are what I've decided on. The most important changes from my previous ideas are that tuples are now dynamically typed and that the boolean type has been removed. This is because it's much easier (and much more "natural") to implement dynamic typing semantics on top of a statically typed language than the other way around.
-rw-r--r-- proto.pod 53
-rw-r--r-- spec.pod 119
2 files changed, 89 insertions, 83 deletions
diff --git a/proto.pod b/proto.pod
index f6004a5..bd3f2a0 100644
--- a/proto.pod
+++ b/proto.pod
@@ -14,9 +14,10 @@ once the connection has been established.
After the connection has been established, both sides immediately send a
handshake message. This message is a space-separated list of parameters,
-followed by a newline character (\n). Parameters may themselves contain
-multiple items by separating them with commas. Parameters must be
-self-describing, the order in which they appear does not matter.
+followed by a newline character (0x0A) or CRLF sequence (0x0D 0x0A). Parameters
+may themselves contain multiple items by separating them with commas.
+Parameters must be self-describing, the order in which they appear does not
+matter.
The following is an example handshake.
@@ -205,7 +206,11 @@ delimiter.
=head3 Tuples & patterns
-I<TODO>
+A tuple is represented as an array in JSON. The elements of the tuple map quite
+naturally to JSON: C<null> represents a wildcard, a number represents an
+integer or a float, a string represents a string, an array represents an array
+and an object represents a map. The value C<true> should be taken as an alias
+of the integer C<1> and C<false> as an alias for C<0>.
=head3 Messages
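
The JSON mapping added in the hunk above can be illustrated with a minimal Python sketch. This is not taken from the Perl or C implementation; the names WILDCARD, normalize and decode_tuple are hypothetical, and applying the true/false aliasing inside nested arrays and maps is an assumption.

```python
import json

# Assumption: the wildcard is represented in Python by None, mirroring JSON null.
WILDCARD = None


def normalize(value):
    """Map a decoded JSON value to a tuple element, aliasing true/false to 1/0."""
    # bool must be tested before anything else: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return 1 if value else 0
    if isinstance(value, list):
        return [normalize(v) for v in value]
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items()}
    # None (wildcard), int, float and str pass through unchanged.
    return value


def decode_tuple(text):
    """Decode a JSON array into a tuple (a Python list of elements)."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("a tuple must be encoded as a JSON array")
    return [normalize(v) for v in data]
```

For instance, decode_tuple('[null, true, "x"]') yields a list whose first element is the wildcard and whose second element is the integer 1.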
@@ -263,43 +268,3 @@ Message type 7. Second element is the $tid, encoded as a number.
=back
-
-=head2 Gob
-
-=head3 Tuples & Patterns
-
- type Tuple []interface{}
- type Pattern []interface{}
-
-Allowed elements and their behaviour are as defined in the Go implementation.
-
-=head3 Messages
-
-I<TODO:> This section is outdated.
-
-Each message is communicated as a different struct. The type of a I<message> is
-therefore C<interface{}>. The following structs are used, corresponding to each
-of the messages.
-
- type msgRegister struct {
- pid int
- pattern Pattern
- }
-
- type msgUnregister struct {
- pid int
- }
-
- type msgTuple struct {
- tid int
- tuple Tuple
- }
-
- type msgReply struct {
- tid int
- pattern Pattern
- }
-
- type msgClose struct {
- tid int
- }
diff --git a/spec.pod b/spec.pod
index 5d20adc..8f8df72 100644
--- a/spec.pod
+++ b/spec.pod
@@ -36,10 +36,6 @@ Each element may be of a different type. Allowed types are:
=item * Strings
-=item * Booleans
-
-I<TODO:> Get rid of booleans and stick with integers instead?
-
=item * Arrays with elements of allowed types
The elements of the array do not have to be of the same type. A tuple itself is
@@ -62,23 +58,59 @@ A I<pattern> is similar to a tuple, but has a smaller set of allowed types:
=item * Strings
-=item * Booleans
-
=item * Wildcards
=back
-That is: Floating point numbers, arrays and maps are not allowed in patterns.
+That is: Arrays and maps are not allowed in patterns. Neither are floating
+point types, but see below on how they are handled.
+
+A string should always be encoded in UTF-8.
+
+An integer must be representable in a signed (two's complement) 32-bit integer.
+Thus anything in the range of C<-2^31> to C<2^31-1> constitutes a valid integer
+value. It is recommended that implementations extend this definition to 64-bit
+integers, but portable applications should not assume these to be available.
+
+The maximum precision and range of floating point numbers is dictated by the
+implementation, but should at least be a double precision IEEE 754 floating
+point. The special values C<NaN>, C<-Inf> and C<+Inf> are not allowed, and
+C<+0> and C<-0> should be considered as equivalent (i.e. the sign of the zero
+value may get lost in transmission).
+
+Even though the above string, integer and float types are specified separately,
+tuples are dynamically typed and conversion between the types is done depending
+on what the application expects to receive.
+
+Integer to float conversion should be obvious. Float to integer conversion may
+be done either by rounding the floating point number to the nearest integer or
+by throwing away the non-integer part (flooring), depending on the
+implementation. What happens when the number is out of the range of the integer
+type is also implementation-defined.
+
+String to integer conversion should at least be supported when the string
+matches the following regular expression: C<^-?([1-9][0-9]*|0)$>, in which case the
+string should be interpreted as a decimal number. Other formats may be allowed
+as well, but this is implementation-defined and should not be relied upon by
+applications that attempt to be portable. Behaviour when the number is out of
+range for the chosen integer type is again implementation-defined.
-All integers should be representable in a signed (two's complement) 64-bit
-integer. Thus anything in the range of C<-2^63> to C<2^63-1> (both inclusive)
-are valid integer values.
+String to float conversion follows the same behaviour as string to integer
+conversion. Strings matching the following regular expression must be supported
+for conversion: C<-?([1-9][0-9]*|0)(\.[0-9]+)?([eE]?[+-]?[0-9]+)?>.
-The maximum precision of floating point numbers is dictated by the
-implementation. In practice, this can be expected to be either single precision
-or double precision IEEE 754 floating points.
+Float to string and integer to string conversions are implementation-defined,
+as long as they are reversible by the string to float and string to integer
+conversions used in the same implementation.
+
+Conversions to and from array and map types should not be allowed.
+
+There is no special boolean type, but if a boolean value is required then the
+following values should evaluate to I<false>: The integer or floating point
+with a value equal to 0, the zero-length string and the one-length string
+containing only the ASCII C<'0'> character. Any other value should be
+considered I<true>.
-Strings should be encoded in UTF-8.
=head2 Matching
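
As a rough illustration of the conversion and truthiness rules added in the hunk above, here is a minimal Python sketch. It makes the implementation-defined choices explicit: floats are floored rather than rounded, only the minimal decimal string format is accepted, and no 32-bit range check is performed. The function names to_int and is_true are hypothetical.

```python
import math
import re

# The minimal string format that must be convertible to an integer.
# fullmatch() is used instead of ^...$ so a trailing newline is not accepted.
INT_RE = re.compile(r"-?([1-9][0-9]*|0)")


def to_int(value):
    """Convert an integer, float or string element to an integer."""
    if isinstance(value, int):
        return int(value)
    if isinstance(value, float):
        # Assumption: this sketch floors; rounding to nearest is equally allowed.
        # NaN and infinities are not valid tuple values and raise here.
        return math.floor(value)
    if isinstance(value, str):
        if INT_RE.fullmatch(value):
            return int(value)
        raise ValueError("string is not a supported decimal integer")
    # Conversions from arrays and maps are not allowed.
    raise TypeError("value cannot be converted to an integer")


def is_true(value):
    """Evaluate an element as a boolean following the rules above."""
    if isinstance(value, (int, float)):
        return value != 0
    if isinstance(value, str):
        return value not in ("", "0")
    # Any other value is considered true.
    return True
```

Note that under these rules the string "00" is true, since only the zero-length string and the exact one-character string "0" are false.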
@@ -96,39 +128,48 @@ either of the following holds:
=item 1. Either element is a wildcard
-=item 2. The elements are of the same type, and their value is equivalent.
+If the element of either the pattern or the tuple is a wildcard, then the other
+element can be of any type and value (this includes arrays and maps).
-=back
+=item 2. Both elements have the same integer value
-The first rule is trivial: if the element of either the pattern or the tuple is
-a wildcard, then other element can be of any type and value. The second rule is
-slightly less obvious, and requires the type definitions given in the previous
-section to be interpreted properly. One implication of the second rule is that
-it is impossible to do a non-wildcard match on anything other than integers,
-strings and booleans, since values of other types are not allowed in patterns.
-Another thing to keep in mind is that both values have to be of the same type,
-so the integer C<0> is not equivalent to the boolean C<false>. Nor is the
-string C<"15"> equivalent to the integer C<15>. However, integers are always
-equal if their numeric value is equivalent (provided they are within the
-specified range). This means that if, within an implementation, a tuple with an
-element of type uint8 with value C<5> is matched against a pattern with an
-element at the same location of type int64 and with the same value, then these
-elements match. Two strings are equal if they are of the same length and all of
-their characters have the same unicode values. Strings are thus matched in a
-case-sensitive fashion. The above rules also imply that the tuple must hold at
-least as many elements as the pattern.
+This implies that both elements can be converted to an integer type, as
+specified in the previous section. This also implies that, when matching on
+floating point types, only the integer representation (either rounded or
+floored) is considered. Since this is implementation-defined, matching against
+a floating point number for which the floored number is not equivalent to the
+rounded number is not portable.
-I<TODO:> Allow integer-formatted strings to do match integers of the same value?
+=item 3. Both elements have the same string value
-I<TODO:> Allow the integer C<10> in a pattern to match a float C<10.0> in a tuple?
+Two strings are equivalent if they have the same length and their byte
+representations are equivalent. Strings are thus matched in a case-sensitive
+fashion.
+=back
+Rules 2 and 3 never apply to array or map types, as those can be converted to
+neither integers nor strings. Thus matching on these types is impossible except
+in rule 1, using a wildcard.
-=head1 Communication
+The above rules also imply that the tuple must hold at least as many elements
+as the pattern, but the number of elements does not have to be equal.
-I<TODO:> Describe the terms "network" and "session"
+Note that matching on the previously mentioned notion of a boolean type is not
+possible. That is, you can't specify that you wish to match on all elements
+that are considered I<true> or I<false>. If this is required, it is recommended
+to standardise on the integers C<1> and C<0> to represent true or false,
+respectively.
-I<TODO:> Refactor with the return-path being an argument of I<send>.
+Also keep in mind that the above matching rules have a lot of
+implementation-defined behaviour. For example, the integer C<1234> will match
+on the string C<"02322"> if the implementation accepts octal numbers as a valid
+candidate for integer conversion. This may lead to unexpected behaviour, but I
+do not expect this to be a problem in practice.
+
+
+
+=head1 Communication
Send and register are the two simplest forms of communication. A session uses
the I<register> primitive to indicate that it is interested in receiving tuples
@@ -154,4 +195,4 @@ session sent them. Besides that, there are no ordering restrictions: Messages
sent by two different sessions may be received by an other session in any
order.
-I<TODO:> More ordering constraints on message receival?
+I<TODO:> Specify the return-path.
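
The three matching rules introduced in the spec.pod changes above can be sketched in Python as follows. This is an illustration under stated assumptions, not the behaviour of the Perl or C implementation: floats are floored, only the minimal decimal string format converts to an integer, rule 3 is applied only when both elements are already strings, and float values are assumed to be finite. All names (WILDCARD, as_int, element_matches, matches) are hypothetical.

```python
import math
import re

WILDCARD = None  # assumption: the wildcard is represented by None
INT_RE = re.compile(r"-?([1-9][0-9]*|0)")


def as_int(value):
    """Return the integer value of an element, or None if it has none."""
    if isinstance(value, int):
        return int(value)
    if isinstance(value, float):
        return math.floor(value)  # assumes a finite value; flooring is one allowed choice
    if isinstance(value, str) and INT_RE.fullmatch(value):
        return int(value)
    return None  # arrays, maps and non-numeric strings


def element_matches(p, t):
    # Rule 1: either element is a wildcard.
    if p is WILDCARD or t is WILDCARD:
        return True
    # Rule 2: both elements have the same integer value.
    pi, ti = as_int(p), as_int(t)
    if pi is not None and pi == ti:
        return True
    # Rule 3: both elements have the same string value (byte-wise comparison).
    return (isinstance(p, str) and isinstance(t, str)
            and p.encode("utf-8") == t.encode("utf-8"))


def matches(pattern, tup):
    """A tuple may be longer than the pattern, but never shorter."""
    if len(tup) < len(pattern):
        return False
    return all(element_matches(p, t) for p, t in zip(pattern, tup))
```

Because this sketch does not accept octal strings, matches([1234], ["02322"]) is false here; an implementation that does accept them would report a match, which is exactly the portability caveat the spec describes.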