The Packed Encoding Rules (PER) are the most compact encoding rules. The goal of PER is to create smaller encodings. A key feature of PER is the way it encodes data by examining constraints.
Age ::= INTEGER (0..7)
firstGrade Age ::= 6 -- C0
In the following example, integer A has four possible states, and you only need two bits to represent all possible states. Since we need two bits, there's no need to encode a length; it will always be two bits:
A ::= INTEGER (1234567..1234570)
a A ::= 1234568 -- encoded in 2 bits
On the other hand, B in the example below is unbounded and therefore has an infinite number of states. Since we cannot know its length in advance, we must have a length field. The length field takes 8 bits and the value field takes 24 bits.
B ::= INTEGER
b B ::= 1234568 -- encoded in 32 bits
                -- 8 bit length + 24 bit value
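The bit counts in the two examples above can be checked with a short sketch. The function names are hypothetical and the unbounded case assumes a single-octet length field, as in the example; this is an illustration, not a PER encoder.

```python
def constrained_bits(lower, upper):
    """Bits needed to distinguish every value in lower..upper."""
    return (upper - lower).bit_length()


def unconstrained_bits(value):
    """One length octet plus the minimal two's-complement octets of value."""
    magnitude = value if value >= 0 else ~value
    # +1 bit for the sign, rounded up to whole octets (at least one octet).
    octets = max(1, (magnitude.bit_length() + 8) // 8)
    return 8 + 8 * octets


print(constrained_bits(1234567, 1234570))  # four states
print(unconstrained_bits(1234568))         # length octet + value octets
```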
There are four variants of PER: BASIC-PER aligned, BASIC-PER unaligned, CANONICAL-PER aligned, and CANONICAL-PER unaligned.
Note that a "canonical" encoding form ensures a consistent encoding for each message; it is therefore useful for comparing binary streams, for digital signatures, etc.
Some transformations permitted with BER are not allowed with PER:
Constraints that affect PER encodings are called PER-visible constraints. The following subtype constraints are PER-visible:
Other subtype constraints (e.g., inner type constraints, or single value constraints on an OCTET STRING type) are not PER-visible. However, although these constraints do not affect a PER encoding, they affect a set of abstract values (i.e., decoded values) that are considered to be valid.
Note: PER seeks to minimize the size of constrained types only for typical uses of the subtype constraint notation. Use of constraints that are normally PER-visible can be rendered not PER-visible by combining them with other constraints that are not PER-visible, or by the use of set operators. For example, D ::= OCTET STRING ('FE'H | SIZE (100..200)) does not have a PER-visible subtype constraint because the single value constraint on OCTET STRING is not PER-visible.
A ::= OCTET STRING (SIZE(100..120))
B ::= INTEGER (25..30)
C ::= INTEGER (40 | 55)
A ::= PrintableString (FROM ("0".."9"))
E ::= IA5String (SIZE(10) | FROM ("0".."9"))
Unlike BER, where lengths are always in octets, PER lengths can be in different units.
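Length may stand for the number of octets, bits, characters, or iterations (elements of a SEQUENCE OF or SET OF), depending on the type being encoded.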
No length is encoded if the size is known:
A ::= PrintableString (SIZE(5))
greeting A ::= "Hello" -- encoded as "Hello" (Aligned PER)
Length is present if the size varies:
B ::= PrintableString (SIZE(1..5))
salutations B ::= "hi" -- encoded as 206869 (Aligned PER)
With UNALIGNED PER, the length is encoded in the minimum number of bits if the range is known and the upper bound is less than 64K.
If the range is unbounded or the upper bound is greater than or equal to 64K, the length is encoded as:
0 - 127       0LLLLLLL
128 - 16K-1   10LLLLLL LLLLLLLL
>=16K         11000nnn            fragmented; nnn (1-4) = # of 16K multiples in fragment
Note that the first bit specifies whether the (0) short form or (1) long form is used. When it's the long form, the second bit specifies whether it's (0) unfragmented or (1) fragmented.
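A minimal sketch of the short and long forms follows, assuming the layout in the table above. The names length_determinant and length_form are hypothetical, and the fragmented form is deliberately left out.

```python
def length_determinant(n):
    """Encode an unconstrained length below 16K as one or two octets."""
    if n < 0:
        raise ValueError("length cannot be negative")
    if n < 128:
        return bytes([n])                          # 0LLLLLLL
    if n < 16384:
        return bytes([0x80 | (n >> 8), n & 0xFF])  # 10LLLLLL LLLLLLLL
    raise ValueError("16K or more needs the fragmented form")


def length_form(first_octet):
    """Classify a length determinant by its leading bits."""
    if first_octet & 0x80 == 0:
        return "short"
    return "fragmented" if first_octet & 0x40 else "long"


print(length_determinant(3).hex())    # short form
print(length_determinant(200).hex())  # long form
```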
The fragmented form is always made up of multiple fragments ending either with the short or long form. That is, if we call the short form S, the long form L, and the fragmentation indicators (C1, C2, C3, or C4) as C, lengths take the forms S, L, CS, or CL (C can repeat as needed), for example:
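The S, L, CS, and CL patterns can be generated with a small sketch. It assumes each fragment takes the largest available multiple of 16K (up to four), which is my reading of the rules rather than something stated above; length_forms is a hypothetical name.

```python
def length_forms(n):
    """Return the sequence of length forms used for a length of n units."""
    forms = []
    while n >= 16384:
        multiples = min(n // 16384, 4)   # C1..C4: 16K to 64K per fragment
        forms.append("C%d" % multiples)
        n -= multiples * 16384
    # The remainder (possibly zero) ends the encoding in short or long form.
    forms.append("S" if n < 128 else "L")
    return " ".join(forms)


for length in (5, 200, 16384, 70000, 150000):
    print(length, length_forms(length))
```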
BOOLEAN types are encoded in a single bit:
0 FALSE
1 TRUE
If constrained, INTEGERs are encoded in a field of minimum width. If they are not constrained, a length determinant is used:
Age ::= INTEGER (3..10)
a Age ::= 4
height INTEGER ::= 4
-- a      001
-- height 00000001 00000100
In the first example, a is based on a type, Age, which has a range constraint of (3..10). Only eight states are possible, and they can be represented in 3 bits. A length of 3 bits is implied.
In the second example, height is based on a type that has no constraint. That is, it can take any of an infinite set of states. There is no implied length so one must be explicitly encoded. The first 8 bits are the length and the next 8 bits form the value. Consider the following to decode height:
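As an illustration, the sketch below encodes a constrained value as an offset from the lower bound and decodes the unconstrained bit pattern shown for height. The function names are hypothetical, and the decoder assumes a short-form (single-octet) length.

```python
def encode_constrained(value, lower, upper):
    """Encode value as an offset from lower in the minimum number of bits."""
    width = (upper - lower).bit_length()
    return format(value - lower, "0%db" % width) if width else ""


def decode_unconstrained(bits):
    """Decode a length octet followed by a two's-complement value."""
    bits = bits.replace(" ", "")
    length = int(bits[:8], 2)             # number of value octets
    body = bits[8:8 + 8 * length]
    raw = int(body, 2).to_bytes(length, "big")
    return int.from_bytes(raw, "big", signed=True)


print(encode_constrained(4, 3, 10))               # a
print(decode_unconstrained("00000001 00000100"))  # height
```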
An ENUMERATED type specifies a list of states. When we encode it we must know which state we're dealing with. BER encodes the state value numbers given within the {} brackets; PER uses them only to determine the order of the states. In the Color example below, even though the state value numbers are not contiguous, there are only three possible states, and we only need 2 bits to encode three states.
We first sort the states and then assign each state to an index value. You'll see when we sort them that, although red comes first in the list, it is the one with the highest number. We have room for four states within 2 bits, and we assign them as:
Color ::= ENUMERATED {red(100), pink(2), blue(7)}
hot Color ::= red
-- hot 10
--
-- index 0 pink
--       1 blue
--       2 red
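The sort-then-index step can be sketched as follows. The name enum_index_bits is hypothetical, and the sketch ignores extensible enumerations.

```python
def enum_index_bits(enumeration, name):
    """Sort states by their numbers, then encode the index of the chosen one."""
    ordered = sorted(enumeration, key=enumeration.get)
    width = (len(ordered) - 1).bit_length()
    if width == 0:
        return ""  # a single state needs no bits
    return format(ordered.index(name), "0%db" % width)


color = {"red": 100, "pink": 2, "blue": 7}
print(enum_index_bits(color, "red"))
```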
Color ::= BIT STRING {red(100), pink(2), blue(7)}
cold Color ::= {blue}
-- cold 00001000 00000001
A BIT STRING without named bits is relatively straightforward. As in the general case, you have a length (if needed) followed by a value. The units of length are bits, not octets. A length is needed whenever the length is not implied by the ASN.1 syntax.
Encoding a BIT STRING with named bits is somewhat more complicated. Unlike ENUMERATED, where the numbers represent states, in BIT STRING, the numbers represent bits. Bear in mind that whereas ENUMERATED lists the entire set of possible states, BIT STRING does not. Take the cold example above. Although only three bits have names, any bit can be set. You could easily have '11'B even though neither bit is named. You could also have a very long string even though the highest named bit is number 100 (the 101st bit). We cannot therefore use a transformation table to index the named bits. Playing decoder with the cold example: the first octet, 00001000, is the length (8 bits), and in the 8 value bits that follow, 00000001, only bit 7 (blue) is set.
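A rough decoder for the cold encoding is sketched below. It assumes a single-octet length counted in bits and numbers bits from the left starting at 0; decode_bit_string is a hypothetical helper.

```python
def decode_bit_string(encoding, named_bits):
    """Return (length in bits, positions of set bits, names of set bits)."""
    bits = encoding.replace(" ", "")
    length = int(bits[:8], 2)       # the unit of length is bits
    value = bits[8:8 + length]
    positions = [i for i, bit in enumerate(value) if bit == "1"]
    names = {name for name, pos in named_bits.items() if pos in positions}
    return length, positions, names


named = {"red": 100, "pink": 2, "blue": 7}
print(decode_bit_string("00001000 00000001", named))
```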
OCTET STRING types are encoded with an optional length-prefix according to the rules for determining how to encode the length:
NotBounded ::= OCTET STRING
FixedLength ::= OCTET STRING (SIZE(3))
shortO NotBounded ::= '112233'H
fixedO FixedLength ::= '112233'H
-- shortO 03112233
-- fixedO 112233
We have two examples, one unbounded (no length implied) and one of fixed length. The unbounded one, since no length is implied, needs a length determinant. The fixed length one, since the length is always 3, needs no length determinant.
In shortO, 03 is the length determinant.
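The difference between the two cases can be sketched as below. The function name is hypothetical, and only short-form lengths (under 128 octets) are handled.

```python
def encode_octet_string(data, fixed_size=None):
    """Return the encoding as hex; a length octet is added only if needed."""
    if fixed_size is not None:
        if len(data) != fixed_size:
            raise ValueError("value violates the SIZE constraint")
        return data.hex().upper()           # length is implied by the type
    if len(data) >= 128:
        raise ValueError("this sketch only covers the short length form")
    return (bytes([len(data)]) + data).hex().upper()


value = bytes.fromhex("112233")
print(encode_octet_string(value))                # shortO
print(encode_octet_string(value, fixed_size=3))  # fixedO
```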
NULL is never encoded. Since the value can only have one state, there is no point in encoding it. It therefore has neither a length nor a contents field.
In the UNALIGNED variant, character strings are encoded in the fewest number of bits necessary:
A ::= IA5String (FROM("AMEX")^SIZE(3))
B ::= IA5String
a A ::= "AXE" -- 00 11 01
b B ::= "AXE" -- 00000011 1000001 1011000 1000101
This example illustrates how permitted alphabet constraints work in PER. Note how A is constrained by alphabet and also has a fixed length; B, on the other hand is unconstrained. Since A has a fixed length, the length of a value is implied and is not encoded. Since B can have any length, its length must be encoded.
Since the only characters possible are A, E, M, and X, we can create a 2-bit transformation table, namely, 00=A, 01=E, 10=M, 11=X and so "AXE" is encoded as 001101.
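The transformation table can be built mechanically, as in this sketch. The name encode_with_alphabet is hypothetical, and the code assumes a fixed-size string, so no length is emitted.

```python
def encode_with_alphabet(text, alphabet):
    """Map each character to its index in the sorted permitted alphabet."""
    table = sorted(set(alphabet))             # A, E, M, X for "AMEX"
    width = (len(table) - 1).bit_length()     # 2 bits for four characters
    if width == 0:
        return ""  # a one-character alphabet needs no bits
    return " ".join(format(table.index(ch), "0%db" % width) for ch in text)


print(encode_with_alphabet("AXE", "AMEX"))
```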
In the ALIGNED variant, each character is encoded in the smallest number of bits that is a power of 2 and is large enough for the permitted alphabet.
A ::= IA5String (FROM("AMEX")^SIZE(3))
B ::= IA5String
a A ::= "AXE" -- 00 11 01
b B ::= "AXE" -- 00000011 01000001 01011000 01000101
Aligned PER character strings have characters that are always 2**n bits long (2, 4, 8, 16, etc.). Even though IA5String uses a 7-bit character table, we must use 8 bits in aligned PER.
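The per-character width in the two variants can be compared with a short sketch. It takes the alphabet size as input; bits_per_char is a hypothetical name.

```python
def bits_per_char(alphabet_size, aligned):
    """Minimum bits per character; rounded up to a power of 2 when aligned."""
    width = (alphabet_size - 1).bit_length()
    if width == 0 or not aligned:
        return width
    power = 1
    while power < width:
        power *= 2
    return power


print(bits_per_char(4, aligned=True))     # FROM("AMEX")
print(bits_per_char(128, aligned=False))  # unconstrained IA5String
print(bits_per_char(128, aligned=True))   # unconstrained IA5String
```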
The value is encoded the same as OCTET STRING using the same rules as BER.
A ::= TeletexString (FROM("AMEX")^SIZE(3))
B ::= TeletexString
a A ::= "AXE" -- 00000011 01000001 01011000 01000101
b B ::= "AXE" -- 00000011 01000001 01011000 01000101
Some older string types, such as TeletexString and VideotexString, are still supported in order to maintain backward compatibility, but constraints on them are not PER-visible, which is why a and b above have identical encodings. Constraints on the UTF8String type are also not PER-visible.
A preamble starts the SEQUENCE encoding if there are OPTIONAL or DEFAULT components, or if the ASN.1 type contains an extension marker.
A ::= SEQUENCE { a INTEGER, b BOOLEAN, c NULL OPTIONAL}
a A ::= {a 5, b TRUE}
-- 0 00000001 00000101 1
We use a bitmap to specify whether optional elements are present. The bitmap is found in the preamble. If there are two optional elements, there are two bits in the bitmap; if there are 20, the bitmap is 20 bits.
In this example, we have only one optional element and so the bitmap is only one bit long. The encoding shows the preamble as 0, meaning the optional element is not present. Following the bitmap, we have the length determinant, and following the length, we have the INTEGER and BOOLEAN elements.
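The layout of this particular SEQUENCE can be reproduced with a sketch hard-wired to type A. The function name is hypothetical, alignment padding is ignored, and a short-form length is assumed for the INTEGER.

```python
def encode_sequence_a(a, b, c_present=False):
    """Bitmap, then unconstrained INTEGER (length + value), then BOOLEAN."""
    preamble = "1" if c_present else "0"      # one bit per OPTIONAL element
    magnitude = a if a >= 0 else ~a
    octets = max(1, (magnitude.bit_length() + 8) // 8)
    body = bytes([octets]) + a.to_bytes(octets, "big", signed=True)
    integer_bits = " ".join(format(octet, "08b") for octet in body)
    boolean_bit = "1" if b else "0"
    # c is NULL, so it contributes no bits even when present.
    return "%s %s %s" % (preamble, integer_bits, boolean_bit)


print(encode_sequence_a(5, True))
```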
A preamble containing an index always starts the CHOICE encoding to identify which of the components is encoded. CHOICE is sorted by tag before assigning the preamble.
A ::= CHOICE { a INTEGER(4..9), b BOOLEAN, c NULL}
chosen A ::= a:5
-- 01 001
Since no tags are found in PER, we need a way to specify within a CHOICE which possibility is taken. We do this by means of a choice index, whose size depends upon the number of possibilities within the choice. In this example there are three possibilities. One can accommodate three possibilities with 2 bits, so the choice index would be 2 bits long.
00 b [UNIVERSAL 1]
01 a [UNIVERSAL 2]
10 c [UNIVERSAL 5]
11 unused
Note in the encoding the choice index followed by the value chosen.
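The index assignment can be sketched by sorting the alternatives on their universal tag numbers. The dictionaries and the function name are assumptions made for illustration, and only the three types used in the example are covered.

```python
UNIVERSAL_TAG = {"BOOLEAN": 1, "INTEGER": 2, "NULL": 5}
ALTERNATIVES = {"a": "INTEGER", "b": "BOOLEAN", "c": "NULL"}


def choice_index_bits(alternatives, chosen):
    """Sort alternatives by tag and encode the index of the chosen one."""
    ordered = sorted(alternatives, key=lambda n: UNIVERSAL_TAG[alternatives[n]])
    width = (len(ordered) - 1).bit_length()
    if width == 0:
        return ""  # a single alternative needs no index
    return format(ordered.index(chosen), "0%db" % width)


index = choice_index_bits(ALTERNATIVES, "a")
value = format(5 - 4, "03b")   # INTEGER (4..9): offset from 4 in 3 bits
print(index, value)
```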
SEQUENCE OF is encoded with a count preamble that is present unless size is fixed. SET OF is encoded like SEQUENCE OF (in non-canonical PER). The count preamble is basically a length, except that the units are not octets, bits, or characters, but iterations of the elements of a SEQUENCE OF or SET OF.
A ::= SET OF CHOICE { a INTEGER(4..9), b BOOLEAN, c NULL}
chosen A ::= { a:5, b:TRUE, c:NULL}
-- 00000011 01 001 00 1 10
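Putting the count preamble and the choice index together, the encoding above can be rebuilt with this sketch. It is hard-wired to the example type, uses the choice indexes from the earlier table, and assumes fewer than 128 elements; encode_set_of is a hypothetical name.

```python
def encode_set_of(elements):
    """Count preamble followed by each element as choice index + value."""
    if len(elements) >= 128:
        raise ValueError("this sketch only covers the short length form")
    parts = [format(len(elements), "08b")]    # count of iterations
    for name, value in elements:
        if name == "a":
            parts += ["01", format(value - 4, "03b")]  # INTEGER (4..9)
        elif name == "b":
            parts += ["00", "1" if value else "0"]     # BOOLEAN
        elif name == "c":
            parts += ["10"]                            # NULL has no contents
        else:
            raise ValueError("unknown alternative: %s" % name)
    return " ".join(parts)


print(encode_set_of([("a", 5), ("b", True), ("c", None)]))
```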