Externalization vs serialization

Serialization uses certain default behaviors to store and later recreate the object. You may specify in what order or how to handle references and complex data structures, but eventually it comes down to using the default behavior for each primitive data field.
Externalization is used in the rare cases that you really want to store and rebuild your object in a completely different way and without using the default serialization mechanisms for data fields. For example, imagine that you had your own unique encoding and compression scheme.
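
As a rough illustration of such a custom encoding, here is a minimal sketch in which a date is stored as a single packed int rather than three separate fields. The class PackedDate and its bit layout are hypothetical and chosen only for this example.

```java
import java.io.*;

// Hypothetical class: stores a calendar date as one packed int instead of three fields.
class PackedDate implements Externalizable {
    int year, month, day;

    public PackedDate() {} // public no-arg constructor, needed when the object is read back

    PackedDate(int year, int month, int day) {
        this.year = year;
        this.month = month;
        this.day = day;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // day fits in 5 bits, month in 4 bits, the remaining bits hold the year
        out.writeInt((year << 9) | (month << 5) | day);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        int packed = in.readInt();
        day = packed & 0x1F;
        month = (packed >> 5) & 0x0F;
        year = packed >> 9;
    }
}

class PackedDateDemo {
    // Serializes and deserializes the date in memory.
    static PackedDate roundTrip(PackedDate d) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(d);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (PackedDate) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PackedDate copy = roundTrip(new PackedDate(2011, 6, 15));
        System.out.println(copy.year + "-" + copy.month + "-" + copy.day); // 2011-6-15
    }
}
```

Only the class itself knows how to interpret the packed value, which is exactly the trade-off externalization makes.
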
The transient keyword narrows the gap between the two, however: even with plain serialization you control which fields are serialized and which are not.
There is one major difference between serialization and externalization: when an Externalizable object is deserialized, its public no-arg constructor is called automatically, and only after that is the readExternal() method called.
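
The effect of transient can be seen in a small sketch like the following. The class Account and its fields are made up for illustration.

```java
import java.io.*;

// Hypothetical class: default serialization skips the transient field.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    String user;
    transient String password; // not written to the stream

    Account(String user, String password) {
        this.user = user;
        this.password = password;
    }
}

class TransientDemo {
    // Serializes and deserializes an object in memory.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = (Account) roundTrip(new Account("brad", "secret"));
        System.out.println(copy.user);     // brad
        System.out.println(copy.password); // null
    }
}
```
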
Final words on serialization and externalization:
In earlier versions of Java, reflection was very slow, and so serializing large object graphs (e.g. in client-server RMI applications) was a bit of a performance problem. To handle this situation, the java.io.Externalizable interface was provided, which is like java.io.Serializable but with custom-written mechanisms to perform the marshalling and unmarshalling functions (you need to implement the readExternal and writeExternal methods on your class). This gives you the means to get around the reflection performance bottleneck.

In recent versions of Java (1.3 onwards, certainly) the performance of reflection is vastly better than it used to be, and so this is much less of a problem.

Also, the built-in Java serialization mechanism isn’t the only one: you can get third-party replacements such as JBoss Serialization, which is considerably quicker and is a drop-in replacement for the default.

A big downside of Externalizable is that you have to maintain this logic yourself: if you add, remove or change a field in your class, you have to change your writeExternal/readExternal methods to account for it, as mentioned in the post on the limitations of serialization.

In summary, Externalizable is a relic of the Java 1.1 days, and there is rarely any need for it any more. Use it only when you really need it.

References: http://download.oracle.com/javase/tutorial/javabeans/persistence/index.html

Difference in sizes when you serialize using Serializable interface and serialize using Externalizable interface

Let’s take a simple case: an object of type SimpleClass with just a few fields (firstName, lastName, weight and location) containing the data {“Brad”, “Pitt”, 180.5, {49.345, 67.567}}. The data itself is only about 24 bytes, but when you serialize this object by implementing the Serializable interface, it turns into approximately 220 bytes. As it turns out, the basic serialization mechanism stores all kinds of information in the file so that it can deserialize without any other assistance. Look at the format of the serialized object below and you will understand why it comes to 220 bytes.

Length: 220
Magic: ACED
Version: 5
OBJECT
CLASSDESC
Class Name: “SimpleClass”
Class UID: -D56EDC726B866EBL
Class Desc Flags: SERIALIZABLE;
Field Count: 4
Field type: object
Field name: “firstName”
Class name: “Ljava/lang/String;”
Field type: object
Field name: “lastName”
Class name: “Ljava/lang/String;”
Field type: float
Field name: “weight”
Field type: object
Field name: “location”
Class name: “Ljava/awt/Point;”
Annotation: ENDBLOCKDATA
Superclass description: NULL
STRING: “Brad”
STRING: “Pitt”
float: 180.5
OBJECT
CLASSDESC
Class Name: “java.awt.Point”
Class UID: -654B758DCB8137DAL
Class Desc Flags: SERIALIZABLE;
Field Count: 2
Field type: integer
Field name: “x”
Field type: integer
Field name: “y”
Annotation: ENDBLOCKDATA
Superclass description: NULL
integer: 49.345
integer: 67.567

Now if you serialize the same object by implementing the Externalizable interface, the size is reduced drastically, and so is the amount of information saved in the persistent store. Here is the result of serializing the same class, modified to be externalizable. Notice that the actual data is not parseable externally any more: only your class knows the meaning of the data!

Length: 54
Magic: ACED
Version: 5
OBJECT
CLASSDESC
Class Name: “SimpleClass”
Class UID: 5CB3777417A3AB5BL
Class Desc Flags: EXTERNALIZABLE;
Field Count: 0
Annotation
ENDBLOCKDATA
Superclass description
NULL
EXTERNALIZABLE:
[70 00 04 4D 61 72 6B 00 05 44 61 76 69 73 43 3C
80 00 00 00 00 01 00 00 00 01]
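
To measure the difference yourself, a sketch along the following lines can be used. The classes PersonS and PersonE are hypothetical stand-ins for the two variants of SimpleClass (the location field is left out to keep it short), so the byte counts it prints will not match the dumps above exactly.

```java
import java.io.*;

// Hypothetical Serializable variant: field names and types go into the stream.
class PersonS implements Serializable {
    private static final long serialVersionUID = 1L;
    String firstName = "Brad";
    String lastName = "Pitt";
    float weight = 180.5f;
}

// Hypothetical Externalizable variant: only the raw values are written.
class PersonE implements Externalizable {
    String firstName = "Brad";
    String lastName = "Pitt";
    float weight = 180.5f;

    public PersonE() {}

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(firstName);
        out.writeUTF(lastName);
        out.writeFloat(weight);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        firstName = in.readUTF();
        lastName = in.readUTF();
        weight = in.readFloat();
    }
}

class SizeDemo {
    // Returns the number of bytes ObjectOutputStream produces for the object.
    static int serializedSize(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Serializable:   " + serializedSize(new PersonS()) + " bytes");
        System.out.println("Externalizable: " + serializedSize(new PersonE()) + " bytes");
    }
}
```
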

What should be done when super class implements the Externalizable interface?

Well, in this case the superclass has its own readExternal and writeExternal methods and persists its own fields in them. The subclass (Car in the example) calls them from its own readExternal and writeExternal methods.

import java.io.*;

/**
 * The superclass implements Externalizable
 */
class Automobile implements Externalizable {

    /*
     * Instead of making these members private and adding setter
     * and getter methods, I am just giving them default access.
     * You can make them private members and add setters and getters.
     */
    String regNo;
    String mileage;

    /*
     * A public no-arg constructor
     */
    public Automobile() {}

    Automobile(String rn, String m) {
        regNo = rn;
        mileage = m;
    }

    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeObject(regNo);
        out.writeObject(mileage);
    }

    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        regNo = (String) in.readObject();
        mileage = (String) in.readObject();
    }

}

// Car extends Automobile and therefore inherits Externalizable from it
public class Car extends Automobile {

    String name;
    int year;

    /*
     * Mandatory public no-arg constructor
     */
    public Car() { super(); }

    Car(String n, int y) {
        name = n;
        year = y;
    }

    /**
     * Mandatory writeExternal method.
     */
    public void writeExternal(ObjectOutput out) throws IOException {
        // First call the writeExternal of the superclass to write
        // all the superclass data fields
        super.writeExternal(out);

        // Now the subclass fields
        out.writeObject(name);
        out.writeInt(year);
    }

    /**
     * Mandatory readExternal method.
     */
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        // First call the superclass readExternal method
        super.readExternal(in);

        // Now the subclass fields, in the same order they were written
        name = (String) in.readObject();
        year = in.readInt();
    }

    /**
     * Prints out the fields. Used for testing!
     */
    public String toString() {
        return ("Reg No: " + regNo + "\n" + "Mileage: " + mileage + "\n" +
                "Name: " + name + "\n" + "Year: " + year);
    }
}

In this example, since the Automobile class stores and restores its fields in its own writeExternal and readExternal methods, you don’t need to save or restore the superclass fields in the subclass. If you look closely at the writeExternal and readExternal methods of the Car class, however, you will find that you still need to call the corresponding super.xxxx() methods first. This confirms the statement that an externalizable object must also coordinate with its supertype to save and restore its state.

What will happen when an externalizable class extends a non externalizable super class?

In this case, you also need to persist the superclass fields in the subclass that implements the Externalizable interface. Look at this example.
import java.io.*;

/**
 * The superclass does not implement Externalizable
 */
class Automobile {

    /*
     * Instead of making these members private and adding setter
     * and getter methods, I am just giving them default access.
     * You can make them private members and add setters and getters.
     */
    String regNo;
    String mileage;

    /*
     * A public no-arg constructor
     */
    public Automobile() {}

    Automobile(String rn, String m) {
        regNo = rn;
        mileage = m;
    }
}

public class Car extends Automobile implements Externalizable {

    String name;
    int year;

    /*
     * Mandatory public no-arg constructor
     */
    public Car() { super(); }

    Car(String n, int y) {
        name = n;
        year = y;
    }

    /**
     * Mandatory writeExternal method.
     */
    public void writeExternal(ObjectOutput out) throws IOException {
        /*
         * Since the superclass does not implement the Externalizable
         * interface, we explicitly do the saving.
         */
        out.writeObject(regNo);
        out.writeObject(mileage);

        // Now the subclass fields
        out.writeObject(name);
        out.writeInt(year);
    }

    /**
     * Mandatory readExternal method.
     */
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        /*
         * Since the superclass does not implement the Externalizable
         * interface, we explicitly do the restoring.
         */
        regNo = (String) in.readObject();
        mileage = (String) in.readObject();

        // Now the subclass fields, in the same order they were written
        name = (String) in.readObject();
        year = in.readInt();
    }

    /**
     * Prints out the fields. Used for testing!
     */
    public String toString() {
        return ("Reg No: " + regNo + "\n" + "Mileage: " + mileage + "\n" +
                "Name: " + name + "\n" + "Year: " + year);
    }
}

Here the Automobile class does not implement the Externalizable interface. So, to persist the fields of the Automobile class, the writeExternal and readExternal methods of the Car class are modified to save and restore the superclass fields first and then the subclass fields.

Limitation of externalization

Externalization efficiency comes at a price. The default serialization mechanism adapts to application changes because field metadata is automatically extracted from the class definitions. Observe the formats above: when the object is serialized through the Serializable interface, the name and type of every field are written to the persistent store, whereas with the Externalizable interface only the class name and UID are written and the field descriptions are left out.

Externalization on the other hand isn’t very flexible and requires you to rewrite your marshalling and demarshalling code whenever you change your class definitions.

As you know, a public no-arg constructor is called when deserializing objects that implement the Externalizable interface. Hence, the Externalizable interface can’t usefully be implemented by (non-static) inner classes in Java: every constructor of an inner class accepts the instance of the enclosing class as a prepended parameter, so an inner class can’t have a true no-arg constructor. Inner classes can achieve object serialization only by implementing the Serializable interface.
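
The inner-class restriction can be observed with a small experiment such as the one below. Outer, Inner and Nested are hypothetical names; the sketch contrasts a non-static inner class with a static nested class, which does have a real no-arg constructor.

```java
import java.io.*;

class Outer {
    // Non-static inner class: its constructor implicitly takes an Outer instance.
    class Inner implements Externalizable {
        int value = 42;
        public Inner() {}
        @Override
        public void writeExternal(ObjectOutput out) throws IOException { out.writeInt(value); }
        @Override
        public void readExternal(ObjectInput in) throws IOException { value = in.readInt(); }
    }

    // Static nested class: has a genuine public no-arg constructor.
    static class Nested implements Externalizable {
        int value = 42;
        public Nested() {}
        @Override
        public void writeExternal(ObjectOutput out) throws IOException { out.writeInt(value); }
        @Override
        public void readExternal(ObjectInput in) throws IOException { value = in.readInt(); }
    }
}

class InnerClassDemo {
    // Returns "ok" if the object survives a round trip, else the exception name.
    static String tryRoundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
            }
            return "ok";
        } catch (IOException | ClassNotFoundException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryRoundTrip(new Outer().new Inner())); // InvalidClassException
        System.out.println(tryRoundTrip(new Outer.Nested()));      // ok
    }
}
```

Note that writing the inner-class object succeeds; the failure only shows up when the stream is read back.
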
If you subclass your externalizable class, the subclass has to invoke the superclass’s implementation of writeExternal and readExternal. This is extra work every time you subclass an externalizable class. Observe the examples above, where the superclass writeExternal method is explicitly called in the subclass writeExternal method.

The methods of the Externalizable interface are public. So any code, including a malicious program, can call readExternal on an existing object, which overwrites the state the object had before.
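
A minimal sketch of that risk, assuming a hypothetical Balance class: any caller that can reach the object can feed it a stream of its own making.

```java
import java.io.*;

// Hypothetical class whose state can be replaced through the public readExternal.
class Balance implements Externalizable {
    long amount;

    public Balance() {}

    Balance(long amount) { this.amount = amount; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeLong(amount);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        amount = in.readLong();
    }
}

class OverwriteDemo {
    // Calls readExternal directly on a live object with a hand-made stream.
    static long overwrite(Balance target, long value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeLong(value); // the only content of the forged stream
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                target.readExternal(in);
            }
            return target.amount;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Balance b = new Balance(100);
        System.out.println(overwrite(b, 0)); // 0
    }
}
```
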

Once your class is tagged with either Serializable or Externalizable, you can’t change any evolved version of your class to the other format. You alone are responsible for maintaining compatibility across versions. That means that if you want the flexibility to add fields in the future, you’d better have your own mechanism so that you can skip over additional information possibly added by those future versions.
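
One common way to build such a mechanism is to write a version number first and branch on it when reading. The sketch below assumes a hypothetical Settings class whose timeout field was added in version 2. It only shows how newer code reads older data; skipping data written by future versions would need something more, for example a length prefix, which is not shown here.

```java
import java.io.*;

// Hypothetical class with a version tag at the start of its external form.
class Settings implements Externalizable {
    static final int CURRENT_VERSION = 2;
    static final int DEFAULT_TIMEOUT = 30;

    String name = "";
    int timeout = DEFAULT_TIMEOUT; // assumed to have been added in version 2

    public Settings() {}

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(CURRENT_VERSION); // the version tag always comes first
        out.writeUTF(name);
        out.writeInt(timeout);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        int version = in.readInt();
        name = in.readUTF();
        // Version 1 streams carry no timeout, so fall back to the default.
        timeout = (version >= 2) ? in.readInt() : DEFAULT_TIMEOUT;
    }
}

class VersionDemo {
    // Simulates reading data written by the hypothetical version 1 of the class.
    static Settings readVersionOne(String name) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeInt(1);
                out.writeUTF(name);
            }
            Settings s = new Settings();
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                s.readExternal(in);
            }
            return s;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Settings old = readVersionOne("legacy");
        System.out.println(old.name + " " + old.timeout); // legacy 30
    }
}
```
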

Externalization: The Externalizable interface

There might be times when you have special requirements for the serialization of an object. For example, you may have some security-sensitive parts of the object, like passwords, which you do not want to keep and transfer somewhere. Or, it may be worthless to save a particular object referenced from the main object because its value will become worthless after restoring.

You can control the process of serialization by implementing the Externalizable interface instead of Serializable. This interface extends the original Serializable interface and adds writeExternal() and readExternal(), so, unlike Serializable, it is not a marker interface. These two methods are called automatically during your object’s serialization and deserialization, allowing you to control the whole process.

There is one major difference between serialization and externalization: when an Externalizable object is deserialized, its public no-arg constructor is called automatically, and only after that is the readExternal() method called.

How does serialization happen?
The JVM first checks for the Externalizable interface; if the object supports it, the object is serialized by calling its writeExternal method. If the object does not support Externalizable but implements Serializable, it is saved using the default mechanism of ObjectOutputStream. When an Externalizable object is reconstructed, an instance is first created using the public no-arg constructor, and then the readExternal method is called. Serializable objects that do not support Externalizable are restored by the default mechanism of ObjectInputStream.
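
The difference in constructor handling can be checked with a sketch like this one. PlainS and PlainE are hypothetical classes that simply count how often their constructors run.

```java
import java.io.*;

// Hypothetical Serializable class: its constructor is not run on deserialization.
class PlainS implements Serializable {
    private static final long serialVersionUID = 1L;
    static int constructorCalls = 0;
    PlainS() { constructorCalls++; }
}

// Hypothetical Externalizable class: the public no-arg constructor runs again on read.
class PlainE implements Externalizable {
    static int constructorCalls = 0;
    static int readExternalCalls = 0;
    public PlainE() { constructorCalls++; }
    @Override
    public void writeExternal(ObjectOutput out) {}
    @Override
    public void readExternal(ObjectInput in) { readExternalCalls++; }
}

class ConstructorDemo {
    // Serializes and deserializes an object in memory.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        roundTrip(new PlainS());
        roundTrip(new PlainE());
        System.out.println(PlainS.constructorCalls);  // 1: only the explicit new
        System.out.println(PlainE.constructorCalls);  // 2: new, plus one on deserialization
        System.out.println(PlainE.readExternalCalls); // 1
    }
}
```
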

The following listing shows how you can use externalization.

import java.io.*;

class Data implements Externalizable {
    int i;
    String s;

    // Public no-arg constructor, called during deserialization
    public Data() {
        System.out.println("Data default constructor");
    }

    public Data(String x, int a) {
        System.out.println("Second constructor");
        s = x; i = a;
    }

    public String toString() {
        return s + i;
    }

    public void writeExternal(ObjectOutput out)
            throws IOException {
        out.writeObject(s);
        out.writeInt(i);
    }

    public void readExternal(ObjectInput in)
            throws IOException, ClassNotFoundException {
        // Read the fields in the same order they were written
        s = (String) in.readObject();
        i = in.readInt();
    }

    public static void main(String[] args)
            throws IOException, ClassNotFoundException {
        Data d = new Data("String value", 1514);
        System.out.println(d);
        ObjectOutputStream o = new ObjectOutputStream(
                new FileOutputStream("data.out"));
        o.writeObject(d);
        o.close();

        // Now deserialize
        ObjectInputStream in = new ObjectInputStream(
                new FileInputStream("data.out"));
        d = (Data) in.readObject();
        in.close();
        System.out.println(d);
    }
}

If you derive a class from a class implementing the Externalizable interface, the subclass must call the superclass’s writeExternal() and readExternal() methods from its own when it is serialized or deserialized, in order to correctly save and restore the object.