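1. Introduction
The previous article covered Hessian's serialization mechanism. So how does it deserialize? Let's take a look.
2. Regular deserialization
Deserialization is fairly simple. As shown at the end of the previous article, Hessian assigns a meaning to each possible value of the first byte (the tag), so decoding is just a matter of dispatching on that value. For example, a first byte of 'N' means null; a value in x00 - x1f is a string of fewer than 32 characters (length 0 to 31); a value in x80 - xbf is an integer between -x10 and x2f (-16 to 47); and so on. With that said, here is the code.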
Hessian2Input
public Object readObject() throws IOException {
    int tag = _offset < _length ? (_buffer[_offset++] & 0xff) : read();
    switch (tag) {
        case 'N':
            return null;
        case 'T':
            return Boolean.valueOf(true);
        case 'F':
            return Boolean.valueOf(false);
        // direct integer
        case 0x80:
        // ... (intermediate cases omitted)
        case 0xbf:
            return Integer.valueOf(tag - BC_INT_ZERO);
        /* byte int */
        case 0xc0:
        // ... (intermediate cases omitted)
        case 0xcf:
            return Integer.valueOf(((tag - BC_INT_BYTE_ZERO) << 8) + read());
        /* short int */
        case 0xd0:
        case 0xd1:
        case 0xd2:
        case 0xd3:
        case 0xd4:
        case 0xd5:
        case 0xd6:
        case 0xd7:
            return Integer.valueOf(((tag - BC_INT_SHORT_ZERO) << 16) + 256
                    * read() + read());
        case 'I':
            return Integer.valueOf(parseInt());
        // direct long
        case 0xd8:
        // ... (intermediate cases omitted)
        case 0xef:
            return Long.valueOf(tag - BC_LONG_ZERO);
        /* byte long */
        case 0xf0:
        // ... (intermediate cases omitted)
        case 0xff:
            return Long.valueOf(((tag - BC_LONG_BYTE_ZERO) << 8) + read());
        /* short long */
        case 0x38:
        case 0x39:
        case 0x3a:
        case 0x3b:
        case 0x3c:
        case 0x3d:
        case 0x3e:
        case 0x3f:
            return Long.valueOf(((tag - BC_LONG_SHORT_ZERO) << 16) + 256
                    * read() + read());
        case BC_LONG_INT:
            return Long.valueOf(parseInt());
        case 'L':
            return Long.valueOf(parseLong());
        case BC_DOUBLE_ZERO:
            return Double.valueOf(0);
        case BC_DOUBLE_ONE:
            return Double.valueOf(1);
        case BC_DOUBLE_BYTE:
            return Double.valueOf((byte) read());
        case BC_DOUBLE_SHORT:
            return Double.valueOf((short) (256 * read() + read()));
        case BC_DOUBLE_MILL: {
            int mills = parseInt();
            return Double.valueOf(0.001 * mills);
        }
        case 'D':
            return Double.valueOf(parseDouble());
        case BC_DATE:
            return new Date(parseLong());
        case BC_DATE_MINUTE:
            return new Date(parseInt() * 60000L);
        case BC_STRING_CHUNK:
        case 'S': {
            _isLastChunk = tag == 'S';
            _chunkLength = (read() << 8) + read();
            int data;
            _sbuf.setLength(0);
            while ((data = parseChar()) >= 0)
                _sbuf.append((char) data);
            return _sbuf.toString();
        }
        case 0x00:
        // ... (intermediate cases omitted)
        case 0x1f: {
            _isLastChunk = true;
            _chunkLength = tag - 0x00;
            int data;
            _sbuf.setLength(0);
            while ((data = parseChar()) >= 0)
                _sbuf.append((char) data);
            return _sbuf.toString();
        }
        case 0x30:
        case 0x31:
        case 0x32:
        case 0x33: {
            _isLastChunk = true;
            _chunkLength = (tag - 0x30) * 256 + read();
            _sbuf.setLength(0);
            int ch;
            while ((ch = parseChar()) >= 0)
                _sbuf.append((char) ch);
            return _sbuf.toString();
        }
        case BC_BINARY_CHUNK:
        case 'B': {
            _isLastChunk = tag == 'B';
            _chunkLength = (read() << 8) + read();
            int data;
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            while ((data = parseByte()) >= 0)
                bos.write(data);
            return bos.toByteArray();
        }
        case 0x20:
        // ... (intermediate cases omitted)
        case 0x2f: {
            _isLastChunk = true;
            int len = tag - 0x20;
            _chunkLength = 0;
            byte[] data = new byte[len];
            for (int i = 0; i < len; i++)
                data[i] = (byte) read();
            return data;
        }
        case 0x34:
        case 0x35:
        case 0x36:
        case 0x37: {
            _isLastChunk = true;
            int len = (tag - 0x34) * 256 + read();
            _chunkLength = 0;
            byte[] buffer = new byte[len];
            for (int i = 0; i < len; i++) {
                buffer[i] = (byte) read();
            }
            return buffer;
        }
        case BC_LIST_VARIABLE: {
            // variable length list
            String type = readType();
            return findSerializerFactory().readList(this, -1, type);
        }
        case BC_LIST_VARIABLE_UNTYPED: {
            return findSerializerFactory().readList(this, -1, null);
        }
        case BC_LIST_FIXED: {
            // fixed length lists
            String type = readType();
            int length = readInt();
            Deserializer reader;
            reader = findSerializerFactory().getListDeserializer(type, null);
            return reader.readLengthList(this, length);
        }
        case BC_LIST_FIXED_UNTYPED: {
            // fixed length lists
            int length = readInt();
            Deserializer reader;
            reader = findSerializerFactory().getListDeserializer(null, null);
            return reader.readLengthList(this, length);
        }
        // compact fixed list
        case 0x70:
        case 0x71:
        case 0x72:
        case 0x73:
        case 0x74:
        case 0x75:
        case 0x76:
        case 0x77: {
            // fixed length lists
            String type = readType();
            int length = tag - 0x70;
            Deserializer reader;
            reader = findSerializerFactory().getListDeserializer(type, null);
            return reader.readLengthList(this, length);
        }
        // compact fixed untyped list
        case 0x78:
        case 0x79:
        case 0x7a:
        case 0x7b:
        case 0x7c:
        case 0x7d:
        case 0x7e:
        case 0x7f: {
            // fixed length lists
            int length = tag - 0x78;
            Deserializer reader;
            reader = findSerializerFactory().getListDeserializer(null, null);
            return reader.readLengthList(this, length);
        }
        case 'H': {
            String type = GenericTypeUtil.getCurrentGenericType() == null
                    ? null
                    : GenericTypeUtil.getCurrentGenericType().getName(); // added by lG 2010-5-9
            return findSerializerFactory().readMap(this, type);
            // return findSerializerFactory().readMap(this,null);
        }
        case 'M': {
            String type = readType();
            return findSerializerFactory().readMap(this, type);
        }
        case 'C': {
            readObjectDefinition(null);
            return readObject();
        }
        case 0x60:
        // ... (intermediate cases omitted)
        case 0x6f: {
            int ref = tag - 0x60;
            if (_classDefs.size() <= ref)
                throw error("No classes defined at reference '"
                        + Integer.toHexString(tag) + "'");
            ObjectDefinition def = _classDefs.get(ref);
            return readObjectInstance(null, def);
        }
        case 'O': {
            int ref = readInt();
            if (_classDefs.size() <= ref)
                throw error("Illegal object reference #" + ref);
            ObjectDefinition def = _classDefs.get(ref);
            return readObjectInstance(null, def);
        }
        case BC_REF: {
            int ref = readInt();
            return _refs.get(ref);
        }
        default:
            if (tag < 0)
                throw new EOFException("readObject: unexpected end of file");
            else
                throw error("readObject: unknown code " + codeName(tag));
    }
}
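To make the integer branches above concrete, here is a minimal standalone sketch that decodes the four int encodings from a raw byte array. The class name CompactIntDemo and the method decodeInt are made up for illustration; the three offsets are assumed to match the BC_INT_* constants used in readObject (x90, xc8 and xd4, as given in the Hessian 2.0 specification), and the 'I' branch is assumed to read a 4-byte big-endian value the way parseInt does.

```java
/** Illustrative sketch only; not part of the Hessian library. */
class CompactIntDemo {
    // Assumed values of the BC_INT_* constants referenced in readObject.
    static final int BC_INT_ZERO = 0x90;
    static final int BC_INT_BYTE_ZERO = 0xc8;
    static final int BC_INT_SHORT_ZERO = 0xd4;

    /** Decodes a single Hessian 2 int that starts at bytes[0]. */
    static int decodeInt(byte[] bytes) {
        int tag = bytes[0] & 0xff;
        if (tag >= 0x80 && tag <= 0xbf) {
            // one-byte form: the value is stored in the tag itself
            return tag - BC_INT_ZERO;
        } else if (tag >= 0xc0 && tag <= 0xcf) {
            // two-byte form: high bits in the tag, low 8 bits in the next byte
            return ((tag - BC_INT_BYTE_ZERO) << 8) + (bytes[1] & 0xff);
        } else if (tag >= 0xd0 && tag <= 0xd7) {
            // three-byte form: high bits in the tag, low 16 bits in the next two bytes
            return ((tag - BC_INT_SHORT_ZERO) << 16)
                    + 256 * (bytes[1] & 0xff) + (bytes[2] & 0xff);
        } else if (tag == 'I') {
            // full form: 'I' followed by a 32-bit big-endian value
            return ((bytes[1] & 0xff) << 24) + ((bytes[2] & 0xff) << 16)
                    + ((bytes[3] & 0xff) << 8) + (bytes[4] & 0xff);
        }
        throw new IllegalArgumentException("not an int tag: 0x" + Integer.toHexString(tag));
    }

    public static void main(String[] args) {
        System.out.println(decodeInt(new byte[] {(byte) 0x80}));             // -16
        System.out.println(decodeInt(new byte[] {(byte) 0x90}));             // 0
        System.out.println(decodeInt(new byte[] {(byte) 0xbf}));             // 47
        System.out.println(decodeInt(new byte[] {(byte) 0xc8, 0x30}));       // 48
        System.out.println(decodeInt(new byte[] {(byte) 0xd4, 0x08, 0x00})); // 2048
        System.out.println(decodeInt(new byte[] {'I', 0, 0, 1, 0x2c}));      // 300
    }
}
```

The long branches follow the same pattern with their own offsets, so the smaller the value, the fewer bytes it takes on the wire.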
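Deserializing references
Recall from the previous article that when the serializer encounters an object it has already written, it writes x51 followed by that object's reference index instead of writing the object again.
So how does deserialization find the object a reference points to?
First, look at how references are handled during serialization: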
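public boolean addRef(Object object)
    throws IOException
{
    // Size of the reference map, i.e. the value a new object would be mapped to
    int newRef = _refs.size();
    // Put object into the reference map: if it is absent, it is added and newRef is returned;
    // otherwise the value already mapped to that same reference is returned
    int ref = _refs.put(object, newRef, false);
    // A different value means the object was already in the map
    if (ref != newRef) {
        // Write the reference marker
        writeRef(ref);
        return true;
    }
    else {
        return false;
    }
}
In other words, each newly written object is mapped to a value one greater than the previous one; an object that has already been seen gets no new entry and is written as a reference instead.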
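The index assignment can be tried out in isolation with a small sketch. Everything here is a stand-in: RefWriterDemo is a hypothetical class, java.util.IdentityHashMap replaces Hessian's own identity map behind _refs, and a list of strings replaces the output stream, with "Q" + index standing for the reference marker.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not part of the Hessian library. */
class RefWriterDemo {
    // Identity-based map: two objects that are equal() but distinct get separate indexes.
    private final Map<Object, Integer> refs = new IdentityHashMap<>();
    // Stand-in for the byte stream; records only the reference markers.
    final List<String> out = new ArrayList<>();

    /** Returns true if obj was already registered and a back-reference was written. */
    boolean addRef(Object obj) {
        int newRef = refs.size();
        Integer existing = refs.putIfAbsent(obj, newRef);
        if (existing != null) {
            out.add("Q" + existing); // reference marker followed by the index
            return true;
        }
        return false;
    }

    /** Index assigned to obj, or -1 if it has not been registered. */
    int indexOf(Object obj) {
        Integer ref = refs.get(obj);
        return ref == null ? -1 : ref;
    }

    public static void main(String[] args) {
        RefWriterDemo w = new RefWriterDemo();
        Object a = new Object(), b = new Object(), c = new Object();
        System.out.println(w.addRef(a)); // false: a gets index 0
        System.out.println(w.addRef(b)); // false: b gets index 1
        System.out.println(w.addRef(a)); // true: "Q0" is written
        System.out.println(w.addRef(c)); // false: c gets index 2
        System.out.println(w.out);       // [Q0]
    }
}
```

Note that the repeated object does not consume an index, which is what keeps the writer's numbering in step with the reader's list.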
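Now let's see how deserialization handles this:
public Object readRef() throws IOException {
    return _refs.get(parseInt());
}
A single line: it looks up the referenced object in the current reference list _refs. So how is _refs populated? Let's continue in UnsafeDeserializer: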
UnsafeDeserializer
public Object readObject(AbstractHessianInput in,
                         Object obj,
                         FieldDeserializer []fields)
    throws IOException
{
    try {
        // Register the newly instantiated object in the reference list first
        int ref = in.addRef(obj);
        // Deserialize the object's fields
        for (FieldDeserializer reader : fields) {
            reader.deserialize(in, obj);
        }
        // Call the readResolve method
        Object resolve = resolve(in, obj);
        // Finally, store the resolved object back into the reference list
        if (obj != resolve)
            in.setRef(ref, resolve);
        return resolve;
    } catch (IOException e) {
        throw e;
    } catch (Exception e) {
        throw new IOExceptionWrapper(obj.getClass().getName() + ":" + e, e);
    }
}
// Hessian2Input: add a reference
public int addRef(Object ref) {
    _refs.add(ref);
    return _refs.size() - 1;
}
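To see how this plays out, suppose we have the following objects:
class A {
    B b;
    C c;
    A(B b, C c) { this.b = b; this.c = c; }
}
class B {
    A a;
    void setA(A a) { this.a = a; }
}
class C {
}
C c = new C();
B b = new B();
A a = new A(b, c);
b.setA(a);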
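When a is serialized, the map first records a:0, then b:1. Next, 'Q' + 0 (the reference tag followed by the reference index) is written to the stream for b's field that points back to a, and then c:2 is recorded.
During deserialization, a is first added to the list, then b. While b's fields are read, the reference to a is fetched from the list, and b is then set on a. Next c is added and set on a. Finally, if the class defines a readResolve method, it is called and the entry in the reference list is replaced with its result.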
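The read order just described can be modelled with a plain list. This is a hand-written sketch rather than Hessian code: RefReaderDemo and its readA/readB/readC methods are hypothetical and hard-code the layout of the example stream instead of parsing bytes, and readResolve is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not part of the Hessian library. */
class RefReaderDemo {
    static class A { B b; C c; }
    static class B { A a; }
    static class C { }

    // Mirrors the role of _refs on the input side: index -> object.
    final List<Object> refs = new ArrayList<>();

    int addRef(Object o) {
        refs.add(o);
        return refs.size() - 1;
    }

    A readA() {
        A a = new A();
        addRef(a);     // a -> 0, registered before any field is read
        a.b = readB(); // b -> 1
        a.c = readC(); // c -> 2
        return a;
    }

    B readB() {
        B b = new B();
        addRef(b);
        // The stream holds 'Q' + 0 at this point, so the field is filled from the list.
        b.a = (A) refs.get(0);
        return b;
    }

    C readC() {
        C c = new C();
        addRef(c);
        return c;
    }

    public static void main(String[] args) {
        A a = new RefReaderDemo().readA();
        System.out.println(a.b.a == a); // true: the cycle is restored
    }
}
```

Registering the object before reading its fields is what lets the back-reference inside b resolve while a is still being filled in.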
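Have you noticed a potential bug here? When readResolve returns a different object and some outer object d also refers to a, b was handed the original instance before readResolve ran, whereas d receives the replacement, so the two references can end up pointing at different objects.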
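The inconsistency can be reproduced in a simplified model of the reference list. This sketch is an assumption-laden illustration, not Hessian code: ReadResolveDemo is hypothetical, the readResolve shown here deliberately returns a fresh instance, and the read methods hard-code a stream in which d is the outer object (d:0, a:1, b:2). Whether a real application hits this depends on the class actually defining such a readResolve.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not part of the Hessian library. */
class ReadResolveDemo {
    static class A {
        B b;
        // Hypothetical readResolve that swaps in a replacement instance.
        Object readResolve() {
            A replacement = new A();
            replacement.b = this.b;
            return replacement;
        }
    }
    static class B { A a; }
    static class D { A a; }

    final List<Object> refs = new ArrayList<>();

    D readD() {
        D d = new D();
        refs.add(d);   // d -> 0
        d.a = readA(); // d receives whatever readA returns, i.e. the resolved object
        return d;
    }

    A readA() {
        A a = new A();
        refs.add(a);
        int ref = refs.size() - 1; // a -> 1
        a.b = readB(ref);          // fields are read before readResolve runs
        A resolved = (A) a.readResolve();
        if (resolved != a) {
            refs.set(ref, resolved); // later lookups of index 1 see the replacement
        }
        return resolved;
    }

    B readB(int refToA) {
        B b = new B();
        refs.add(b);                // b -> 2
        b.a = (A) refs.get(refToA); // still the pre-readResolve instance
        return b;
    }

    public static void main(String[] args) {
        D d = new ReadResolveDemo().readD();
        System.out.println(d.a == d.a.b.a); // false: b kept the original a
    }
}
```

In this model any reference resolved while a's fields are still being read keeps the original instance, while everything resolved afterwards sees the replacement.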
3. Summary
Overall, Hessian's serialization mechanism is quite clever: the first 8 bits (one byte) encode what follows, and the rest is parsed case by case.