Reverse Engineered by Kevin Langdon. Additions by Karl von Randow. Integer Reverse Engineered by Martin Schnabel.
Word on the street is that Adobe will be providing the full specification in the fall. I couldn’t wait that long, so here it is. I still have some considerable work before this is easy to understand. Therefore please contact me directly if you have questions or even if you are interested in seeing some source code.
Since then, Adobe has released the specification of the AMF3 protocol.
There are now multiple implementations of AMF3 in various Flash projects. View a list of the AMF3 Implementations here.
In its most common use inside Remoting traffic, AMF3 is sent as part of the original AMF0 envelope and body. The AMF3 data is added as a new data type in AMF0, with the code 0x11.
The following types do not have any inner data:
0x00 → undefined
0x01 → null
0x02 → boolean-false
0x03 → boolean-true
0x04 → integer type code, followed by up to 4 bytes of data.
Integer-data is probably the single most used item in AMF3. To save space it is an integer that can be 1-4 bytes long. The first (most significant) bit of each of the first three bytes determines whether the next byte is included (1) in this integer-data or not (0). The last byte, if present, is read completely (8 bits). The first bits are then removed from the first three bytes and the remaining bits concatenated to form a big-endian integer.
The integer has a maximum of 29 bits (3*7+8) and a value range of -268435456 (int.MIN_VALUE >> 3) to 268435455 (int.MAX_VALUE >> 3).
The integer is negative if it is the full 29 bits long and the first bit is set (1). This uses Two's complement notation and is therefore identical to normal signed integer behaviour. So if you read the integer into a 32 bit integer, all you will need to do is extend the sign. See reference code for Parsing Integers.
Examples:
0011 0101 = 53
1000 0001 0101 0100 = 212
1000 0110 1100 1010 0011 1111 = 107839
1111 1111 1111 1111 1111 1111 1111 1111 = -1
1100 0000 1000 0000 1000 0000 0000 0000 = -268435456
1011 1111 1111 1111 1111 1111 1111 1111 = 268435455
In the definitions of other datatypes below, integer-data refers to an integer stored in this format, but without the 0x04 bytecode.
(This sounds like a BER compressed integer)
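The integer rules above can be sketched in Python as follows. This is only an illustration written from the description on this page, not the reference code mentioned earlier; the names read_u29, read_integer and write_integer are made up for the example.

```python
def read_u29(data, pos=0):
    """Read integer-data; return (unsigned 29-bit value, next position)."""
    value = 0
    for i in range(4):
        byte = data[pos]
        pos += 1
        if i == 3:
            # The fourth byte contributes all 8 of its bits.
            return (value << 8) | byte, pos
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            break  # high bit clear: no further byte belongs to this integer
    return value, pos


def read_integer(data, pos=0):
    """Read integer-data and extend the sign from bit 29."""
    value, pos = read_u29(data, pos)
    if value & 0x10000000:
        value -= 0x20000000
    return value, pos


def write_integer(value):
    """Encode a signed integer in the 29-bit range as 1-4 bytes."""
    if not -0x10000000 <= value <= 0x0FFFFFFF:
        raise ValueError("value does not fit in 29 bits")
    value &= 0x1FFFFFFF  # two's complement within 29 bits
    if value < 0x80:
        return bytes([value])
    if value < 0x4000:
        return bytes([0x80 | (value >> 7), value & 0x7F])
    if value < 0x200000:
        return bytes([0x80 | (value >> 14),
                      0x80 | ((value >> 7) & 0x7F),
                      value & 0x7F])
    return bytes([0x80 | (value >> 22),
                  0x80 | ((value >> 15) & 0x7F),
                  0x80 | ((value >> 8) & 0x7F),
                  value & 0xFF])
```

For instance, read_integer(bytes([0x81, 0x54])) returns (212, 2): the value plus the position of the next unread byte.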
0x05 → Number type-code followed by 8 bytes of data
Format is the same as an AMF0 Number
string = 0x06 string-data
string-data = integer-data [ modified-utf-8 ]
modified-utf-8 = *OCTET
The last bit of the integer-data element in string-data identifies if this string is an inline string(1) or a string reference(0).
String references are references to an already passed string. String indices start at 0 and are in the order that the inline string is encountered (Note that if the string is of zero length it is not stored for later reference, therefore does not increase the index counter). The index is the remaining bits of the integer-data.
For inline strings the remaining bits of the integer-data specify the byte length of the string (this is contrary to regular utf-8 storage where 2 bytes or 4 bytes specify the length of the byte string to follow). Those bytes are parsed using modified utf-8. For more on modified utf-8, see the Wikipedia article on modified UTF-8.
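A minimal Python sketch of reading string-data with a reference table, following the description above. The names read_u29, read_string_data and string_table are invented for this example, and it assumes that Python's ordinary "utf-8" codec is close enough to modified utf-8 for the strings involved, which may not hold for every input.

```python
def read_u29(data, pos):
    """Read integer-data; return (unsigned value, next position)."""
    value = 0
    for i in range(4):
        byte = data[pos]
        pos += 1
        if i == 3:
            return (value << 8) | byte, pos
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            break
    return value, pos


def read_string_data(data, pos, string_table):
    """Return (string, next position); string_table holds earlier inline strings."""
    header, pos = read_u29(data, pos)
    if (header & 1) == 0:
        # Last bit 0: the remaining bits are an index into the table.
        return string_table[header >> 1], pos
    length = header >> 1  # last bit 1: the remaining bits are the byte length
    text = data[pos:pos + length].decode("utf-8")  # assumption: plain UTF-8
    pos += length
    if text:
        # Zero-length strings are not stored and do not advance the index.
        string_table.append(text)
    return text, pos
```

The same table has to be passed to every call while decoding one body, since the indices depend on the order in which inline strings were seen.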
0x07 ⇒ XML type code followed by a string (excluding 0x06 flag, including length integer). This type will apparently only be used for the legacy XMLDocument class.
date = 0x08 integer-data [ number-data ]
The last bit of the integer-data element identifies if this date is an inline date(1) or an object reference(0). Object references are references to an already inlined object. Object references start at 0 and are in the order that the objects are defined. Object references include dates, arrays, and objects.
The number data represents the data as the number of milliseconds since January 1, 1970, 00:00:00 GMT. This is the same value as output from Date.getTime() in ActionScript.
Unlike AMF0 there is no timezone information included.
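As a rough Python sketch of the date rules above: the function reads the integer-data, then either looks up an object reference or reads the 8-byte number. The names read_date and object_table are invented for this example; the number is assumed to be a big-endian IEEE-754 double, as for an AMF0 Number, and the result is returned as a UTC datetime purely for convenience.

```python
import struct
from datetime import datetime, timedelta, timezone


def read_u29(data, pos):
    """Read integer-data; return (unsigned value, next position)."""
    value = 0
    for i in range(4):
        byte = data[pos]
        pos += 1
        if i == 3:
            return (value << 8) | byte, pos
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            break
    return value, pos


def read_date(data, pos, object_table):
    """Read a date that follows the 0x08 type code; return (datetime, next position)."""
    header, pos = read_u29(data, pos)
    if (header & 1) == 0:
        # Object reference: the remaining bits index an already seen object.
        return object_table[header >> 1], pos
    (millis,) = struct.unpack_from(">d", data, pos)  # milliseconds since the epoch
    pos += 8
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    value = epoch + timedelta(milliseconds=millis)
    object_table.append(value)  # dates share the object reference table
    return value, pos
```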
array = 0x09 integer-data ( [ 1OCTET *amf3-data ] | [OCTET *amf3-data 1] | [ OCTET *amf-data ] )
The last bit of the integer-data element identifies if this array is an inline array(1) or an object reference(0). Object references are references to an already inlined object. Object references start at 0 and are in the order that the objects are defined. Object references include dates, arrays, and objects.
For array references the remaining bits of the integer-data are the index of the referenced array.
For inline arrays the remaining bits of the integer-data declare the number of integer-indexed elements in the array.
Because in ActionScript arrays may also contain non-integer keys, key value pairs follow next. The key value pairs stop when an empty string is encountered for key name.
For “normal” arrays (not containing string keys), the array data is then preceded with a single octet (1), which is the empty string. The array elements then proceed as amf3-data.
For associative arrays, data is in name/value pairs, similar to how objects are encoded (type 2). If the integer-data was zero, then the name/value pairs are terminated by a single null string (that is, simply a 1). If the integer-data was 1, then the name/value pairs are likewise terminated by a single null string, but that null string will also have a final value. This second case occurs when the associative array has defined a key that is either 0 or '' (empty string). If both 0 and '' are set in the array, then the first name/value pair sent will have a null string for the key and the final name/value pair sent will also have a null string for a key. (I’m not sure if there is any other way to get a null string key.) The zero value always seems to be the one sent first in the case of the two null strings.
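The basic array layout (count of integer-indexed elements, then name/value pairs up to an empty key, then the indexed elements) might be read along the following lines in Python. This is a deliberately small sketch: all function names are invented, read_value handles only a few type codes (undefined, null, booleans, integer, string, array), and the special case of a null-string key with a trailing value described above is not covered.

```python
def read_u29(data, pos):
    """Read integer-data; return (unsigned value, next position)."""
    value = 0
    for i in range(4):
        byte = data[pos]
        pos += 1
        if i == 3:
            return (value << 8) | byte, pos
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            break
    return value, pos


def read_string_data(data, pos, strings):
    header, pos = read_u29(data, pos)
    if (header & 1) == 0:
        return strings[header >> 1], pos
    length = header >> 1
    text = data[pos:pos + length].decode("utf-8")  # assumption: plain UTF-8
    if text:
        strings.append(text)
    return text, pos + length


def read_value(data, pos, strings, objects):
    """Read one amf3-data value (subset of type codes only)."""
    marker = data[pos]
    pos += 1
    if marker in (0x00, 0x01):  # undefined and null both become None here
        return None, pos
    if marker == 0x02:
        return False, pos
    if marker == 0x03:
        return True, pos
    if marker == 0x04:
        value, pos = read_u29(data, pos)
        if value & 0x10000000:
            value -= 0x20000000
        return value, pos
    if marker == 0x06:
        return read_string_data(data, pos, strings)
    if marker == 0x09:
        return read_array(data, pos, strings, objects)
    raise NotImplementedError("type code 0x%02X is outside this sketch" % marker)


def read_array(data, pos, strings, objects):
    header, pos = read_u29(data, pos)
    if (header & 1) == 0:
        return objects[header >> 1], pos  # object reference
    indexed_count = header >> 1
    result = {"named": {}, "indexed": []}
    objects.append(result)
    while True:
        key, pos = read_string_data(data, pos, strings)
        if key == "":
            break  # an empty key ends the name/value pairs
        value, pos = read_value(data, pos, strings, objects)
        result["named"][key] = value
    for _ in range(indexed_count):
        value, pos = read_value(data, pos, strings, objects)
        result["indexed"].append(value)
    return result, pos
```

Keeping named and indexed entries in separate containers is just one possible representation; an implementation could equally merge them into a single map.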
object = 0x0A integer-data [ class-def ] [ *amf3-data ]
class-def = string-data [ *string-data ]
The last bit of the integer-data element identifies if this object is an inline object(1) or an object reference(0). Object references are references to an already inlined object. Object references start at 0 and are in the order that the objects are defined. Object references include dates, arrays, and objects.
With inline objects, the second-to-last bit of the integer-data designates whether the object uses a reference (0) to a previously passed class-def or the class-def is inline (1).
With class-def references the remaining bits of the integer-data element are the index to the class-def as it was passed. A class-def is simply a referenced object which itself has an inline class definition.
With inline class-defs the class name is read next as string-data.
Finally comes the encoded object data. The 3rd-to-last bit represents whether the object is Externalizable, and the 4th-to-last bit represents whether the object is dynamic or not. Below, “the remaining integer-data” refers to integer-data >> 4.
00 = Non-dynamic object. The remaining integer-data represents the number of class members that exist. The property names are read as string-data. The values are then read as amf3-data.
01 = Externalizable object. What follows is the value of the “inner” object, including type code. This value appears for objects that implement IExternalizable, such as ArrayCollection and ObjectProxy.
10 = Dynamic object. Like the 00 objects, the remaining integer-data represents the number of “hard” class members, with the same type of encoding as outlined above. This is followed by property-value pairs that represent the dynamic properties of the object. The property names and values are encoded as string-data followed by amf3-data until there is an empty string property name. If there is a class-def reference there are no property names and the number of values is equal to the number of properties in the class-def.
11 = Objects cannot be both dynamic and externalizable, so this is never encountered
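The bit layout of the object's integer-data described above can be summarised with a small Python helper. This is only a sketch of the flag tests; the function name describe_object_header and the dictionary keys are invented for the example, and the header is assumed to have been read already as an unsigned integer-data value.

```python
def describe_object_header(header):
    """Interpret the integer-data that follows the 0x0A type code."""
    if (header & 0x01) == 0:
        # Last bit 0: reference to an already inlined object.
        return {"kind": "object-reference", "index": header >> 1}
    if (header & 0x02) == 0:
        # Second-to-last bit 0: reference to an earlier class-def.
        return {"kind": "class-def-reference", "index": header >> 2}
    return {
        "kind": "inline-class-def",
        "externalizable": bool(header & 0x04),  # 3rd-to-last bit
        "dynamic": bool(header & 0x08),         # 4th-to-last bit
        "member_count": header >> 4,            # "the remaining integer-data"
    }
```

For example, a header of 0x0B (binary 1011) comes out as an inline class-def that is dynamic, not externalizable, and has zero fixed members.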
xmlstring = 0x0B string-data
(Note: I updated the sections on both of the XML types as they seem to be inaccurate according to a lot of tests I did while writing the Java AMF3 implementation for the Cinnamon Remoting Framework.)
This appears to be used for XML data being sent back from ColdFusion where the XML is just sent as a big string, rather than as a DOM, which is presumably what the xml type is for. This information is inaccurate. The other XML type (type code 0x07) seems to be used for the legacy flash.xml.XMLDocument class only, while this type seems to be used for the new top-level E4X XML class.
The xmlstring is read in exactly the same way as a string. However, xmlstrings do not get stored as a reference, and therefore cannot themselves reference another string.
XML-literals written to a socket using flash.net.Socket#writeObject() will result in this datatype. This also seems to be inaccurate in that it looks like it does not matter how you send an E4X XML object, it will always result in this datatype.
It will reformat the line breaks, indentation and attribute quotes with the same behavior as «xmlliteral»#toString(). Empty root nodes (like <root/>) will result in a zero-length xmlstring.
0x0c ⇒ ByteArray flag, followed by string data
The string-data excludes the 0x06 flag but includes the length integer.
The length integer denotes the number of bytes to read from the stream, whether or not the byte array was gzipped by Flash / Flex.
If the byte array was not gzipped, the string-data can be decoded as AMF.
If the byte array was gzipped, the string-data after the length is a compressed string that should be inflated before treating it as AMF. After it’s been inflated, you then decode that chunk as a sequence of bytes; the length to decode is the length of the inflated stream.
See the reference on parsing byte arrays.
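A hedged Python sketch of the byte array handling described above. It assumes the integer-data has its last bit set (an inline value, as with string-data), since this page does not describe references for byte arrays. It also guesses whether the payload is compressed simply by attempting to inflate it, which is a heuristic of this example rather than something the format guarantees. The name read_byte_array is invented.

```python
import zlib


def read_u29(data, pos):
    """Read integer-data; return (unsigned value, next position)."""
    value = 0
    for i in range(4):
        byte = data[pos]
        pos += 1
        if i == 3:
            return (value << 8) | byte, pos
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            break
    return value, pos


def read_byte_array(data, pos):
    """Read a byte array that follows the 0x0C type code; return (bytes, next position)."""
    header, pos = read_u29(data, pos)
    if (header & 1) == 0:
        raise NotImplementedError("references are outside this sketch")
    length = header >> 1
    raw = data[pos:pos + length]
    pos += length
    try:
        # wbits=47 lets zlib accept either a zlib or a gzip header.
        return zlib.decompress(raw, 47), pos
    except zlib.error:
        return raw, pos  # not compressed: hand back the bytes unchanged
```

A stricter decoder would rely on application knowledge of whether compression was used instead of guessing, because arbitrary binary data could in principle look like a valid compressed stream.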
References are more prominent in AMF3 than in AMF0, and one should have two or three arrays to keep track of them: one for strings, one for objects and possibly one for class definitions (some combine the second and third array). References are per-body, so the reference arrays should be reset every time a new body is encountered (this can happen if calls are batched or if a /onDebugEvents body is sent).
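One way to keep the reference arrays and reset them per body is sketched below in Python. The class ReferenceTables and the function decode_bodies are invented for this example, and decode_body stands for whatever routine actually decodes a single body using the tables.

```python
class ReferenceTables:
    """Holds the three reference arrays used while decoding one body."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.strings = []
        self.objects = []     # dates, arrays and objects share this table
        self.class_defs = []


def decode_bodies(raw_bodies, decode_body):
    """Decode each body with freshly reset tables; decode_body is caller-supplied."""
    tables = ReferenceTables()
    results = []
    for raw_body in raw_bodies:
        tables.reset()  # references never carry over from one body to the next
        results.append(decode_body(raw_body, tables))
    return results
```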
If you use NetConnection.call directly from Flash Player 9, then you will see that the only difference in the data sent is in the possible appearance of the 0x11 flag and AMF3 data. However, when using RemoteObject, the body is completely different, and therefore must be dealt with separately.
One can detect if RemoteObject has been used after the body has been decoded by looking at the target URI of the body. If it is the string “null”, then it probably is a RemoteObject call. In that case the target is useless for determining which service to call, and instead the body should be used. The body should be a typed object.
The two body types commonly encountered are CommandMessage and RemotingMessage.
A CommandMessage with an operation value of 5 is a ping request. The body of the response object should be of type flex.messaging.messages.AcknowledgeMessage, with an empty inner body. Here is an example of a function that generates an acknowledge message:
function generateAcknowledgeMessage($messageId = null, $clientId = null)
{
    // generateRandomId() is assumed to be defined elsewhere and to return a unique ID.
    $result = new stdClass();
    $result->_explicitType = "flex.messaging.messages.AcknowledgeMessage";
    $result->messageId = generateRandomId();
    // Carry the client's ID across if one was sent; otherwise generate one.
    $result->clientId = $clientId != null ? $clientId : generateRandomId();
    $result->destination = null;
    $result->body = null;
    $result->timeToLive = 0;
    $result->timestamp = (int) (time() . '00');
    $result->headers = new stdClass();
    // The correlationId must match the messageId of the original message.
    $result->correlationId = $messageId;
    return $result;
}
Note that the original command message should have included a messageId, and that the correlationId of the returned message should be the same. If a clientId was sent, it should be carried across the response.
A RemotingMessage contains three important keys: operation, source and body. Source is the name of the service to be called, including package name; operation is the name of the method to be called; and body holds the arguments to call the method with. It should be answered with an AcknowledgeMessage such as the one above, with body set to the result of the method call. It could also be answered with an ErrorMessage, such as the one below:
$results = new stdClass();
$results->_explicitType = "flex.messaging.messages.ErrorMessage";
$results->faultCode = $exception->code;
$results->faultDetail = $exception->details . ' on line ' . $exception->line;
$results->faultString = $exception->description;