Understanding Hazelcast Serialization and Deserialization: Why It Matters

Overview

Hazelcast Serialization and Deserialization are key processes that convert objects to and from binary formats so they can be transmitted across nodes in a distributed environment, cached, or stored. Hazelcast provides multiple serialization mechanisms tailored for performance and flexibility.


Pre-Requisite

Before learning about the Hazelcast serialization and deserialization process, you should understand the basics of Java and Hazelcast, as well as the concepts below.

Serialization in Hazelcast

Hazelcast Serialization is the process of converting an object into a byte stream, which can be stored or transferred across network nodes in the Hazelcast cluster. In a distributed system like Hazelcast, objects need to be serialized when they are sent between cluster members, replicated to backups, or stored in distributed data structures like IMap, ReplicatedMap, and Queue.

Example Flow of Serialization:

  • The object is created in one member of the Hazelcast cluster.
  • Before transmitting the object to another node or caching it, Hazelcast serializes the object into a byte stream.
  • The serialized data is sent to another node or stored.

Deserialization in Hazelcast

Hazelcast Deserialization is the reverse process that converts a serialized byte stream back into an object. In Hazelcast, whenever an object is retrieved from a distributed data structure, transmitted from a different node, or received from backup, it needs to be deserialized.

Example Flow of Deserialization:

  • The serialized byte stream is received from another node or read from a distributed data structure.
  • Hazelcast deserializes the byte stream back into the original object.
  • The object is then ready for use in the member’s JVM.
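
The two flows above can be sketched with JDK classes alone. The example below is a minimal illustration rather than Hazelcast code: it uses standard Java serialization to turn an object into a byte array and back. The names RoundTripSketch and Greeting are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class RoundTripSketch {

    // Hypothetical value class used only for this illustration.
    static class Greeting implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;

        Greeting(String text) {
            this.text = text;
        }
    }

    // Sending side: object -> byte stream.
    static byte[] serialize(Serializable object) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Receiving side: byte stream -> object.
    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = serialize(new Greeting("hello"));
        Greeting copy = (Greeting) deserialize(wire);
        System.out.println(copy.text + " (" + wire.length + " bytes on the wire)");
    }
}
```

The deserialized object is a new instance with the same state, which is why a value read back from another member is a copy and not the original reference.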

Serialization Mechanisms in Hazelcast

Hazelcast offers several serialization mechanisms, each designed for different use cases and performance requirements.

1. Java Serialization (Serializable):

  • The simplest approach: objects implement java.io.Serializable.
  • Hazelcast uses the standard Java serialization mechanism.
  • Downside: Slow and inefficient for large or frequently used objects due to additional metadata and reflection.

Example of Java Serialization

import java.io.Serializable;

public class Employee implements Serializable {
    private static final long serialVersionUID = 1L;

    private String name;
    private String email;
    private String phoneNo;

    public Employee(String name, String email, String phoneNo) {
        this.name = name;
        this.email = email;
        this.phoneNo = phoneNo;
    }
}
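
To make the "additional metadata" point concrete, the sketch below compares the size of a Java-serialized object with a hand-written encoding of the same three fields. It runs on the JDK alone. Contact, javaBytes, and manualBytes are hypothetical names for this example, and the manual format is only a stand-in for the more compact mechanisms described next. Exact byte counts depend on the class and field values, so none are quoted here.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializationOverheadSketch {

    // Hypothetical class with the same three fields as the Employee example.
    static class Contact implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final String email;
        final String phoneNo;

        Contact(String name, String email, String phoneNo) {
            this.name = name;
            this.email = email;
            this.phoneNo = phoneNo;
        }
    }

    // Standard Java serialization: the stream also carries a class descriptor.
    static byte[] javaBytes(Contact contact) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(contact);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Hand-written encoding: only the field values, in a fixed order.
    static byte[] manualBytes(Contact contact) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(contact.name);
            out.writeUTF(contact.email);
            out.writeUTF(contact.phoneNo);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        Contact contact = new Contact("Ann", "ann@example.com", "555-0100");
        System.out.println("Java serialization: " + javaBytes(contact).length + " bytes");
        System.out.println("Manual encoding:    " + manualBytes(contact).length + " bytes");
    }
}
```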

2. IdentifiedDataSerializable:

  • A faster alternative to Java serialization, reducing overhead.
  • Requires implementing the IdentifiedDataSerializable interface.
  • This method avoids the reflection used in Java serialization, making it more efficient.
  • You implement the writeData and readData methods manually, allowing full control over how data is serialized/deserialized.
  • Best for: High-performance use cases where control over serialization is needed.

Example of IdentifiedDataSerializable

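import com.hazelcast.nio.ObjectDataInput;
import com.hazelcast.nio.ObjectDataOutput;
import com.hazelcast.nio.serialization.IdentifiedDataSerializable;
import java.io.IOException;

public class EmployeeIdentifiedDataSerializable implements IdentifiedDataSerializable {
    private String name;
    private String email;
    private String phoneNo;

    // A public no-argument constructor lets a DataSerializableFactory create
    // an empty instance before readData() fills it in.
    public EmployeeIdentifiedDataSerializable() {
    }

    public EmployeeIdentifiedDataSerializable(String name, String email, String phoneNo) {
        this.name = name;
        this.email = email;
        this.phoneNo = phoneNo;
    }

    @Override
    public String toString() {
        return "Employee{" +
                "name='" + name + '\'' +
                ", email='" + email + '\'' +
                ", phoneNo='" + phoneNo + '\'' +
                '}';
    }

    // Must match the ID under which the DataSerializableFactory is registered.
    @Override
    public int getFactoryId() {
        return 1;
    }

    // Identifies this class within its factory.
    @Override
    public int getClassId() {
        return 1;
    }

    // Write every field, in a fixed order.
    @Override
    public void writeData(ObjectDataOutput objectDataOutput) throws IOException {
        System.out.println("Serializing....");
        objectDataOutput.writeUTF(name);
        objectDataOutput.writeUTF(email);
        objectDataOutput.writeUTF(phoneNo);
    }

    // Read the fields back in exactly the order they were written.
    @Override
    public void readData(ObjectDataInput objectDataInput) throws IOException {
        System.out.println("Deserializing....");
        this.name = objectDataInput.readUTF();
        this.email = objectDataInput.readUTF();
        this.phoneNo = objectDataInput.readUTF();
    }
}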
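
The role of getFactoryId and getClassId can be illustrated without Hazelcast on the classpath. The sketch below is a simplified model under stated assumptions, not Hazelcast's actual wire format or API: IdentifiedSketch, Identified, Point, and the FACTORIES map are all hypothetical. It shows the general idea that two integers on the wire let the reader pick a constructor from a lookup table instead of resolving a class reflectively.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

class IdentifiedSketch {

    // Hypothetical stand-in for an "identified" serializable type.
    interface Identified {
        int factoryId();
        int classId();
        void writeData(DataOutput out) throws IOException;
        void readData(DataInput in) throws IOException;
    }

    static final int FACTORY_ID = 1;
    static final int POINT_CLASS_ID = 1;

    static class Point implements Identified {
        int x;
        int y;

        Point() { }                       // empty instance for the factory
        Point(int x, int y) { this.x = x; this.y = y; }

        public int factoryId() { return FACTORY_ID; }
        public int classId() { return POINT_CLASS_ID; }
        public void writeData(DataOutput out) throws IOException { out.writeInt(x); out.writeInt(y); }
        public void readData(DataInput in) throws IOException { x = in.readInt(); y = in.readInt(); }
    }

    // factoryId -> (classId -> new empty instance); no reflection involved.
    static final Map<Integer, IntFunction<Identified>> FACTORIES = new HashMap<>();
    static {
        FACTORIES.put(FACTORY_ID, classId -> {
            if (classId == POINT_CLASS_ID) {
                return new Point();
            }
            throw new IllegalArgumentException("Unknown class ID " + classId);
        });
    }

    // Writes the two IDs first, then the object's own fields.
    static byte[] write(Identified object) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(object.factoryId());
            out.writeInt(object.classId());
            object.writeData(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the IDs, asks the matching factory for an instance, then fills it.
    static Identified read(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int factoryId = in.readInt();
            IntFunction<Identified> factory = FACTORIES.get(factoryId);
            if (factory == null) {
                throw new IllegalArgumentException("Unknown factory ID " + factoryId);
            }
            Identified object = factory.apply(in.readInt());
            object.readData(in);
            return object;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Point copy = (Point) read(write(new Point(3, 4)));
        System.out.println(copy.x + "," + copy.y);
    }
}
```

In Hazelcast itself the lookup table corresponds to a DataSerializableFactory registered in the serialization configuration under the same factory ID that the class returns.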

3. Portable Serialization:

  • Portable serialization supports versioning and accessing object fields without deserializing the entire object.
  • It’s well-suited for applications where class structures may evolve (e.g., adding or modifying fields) while ensuring backward compatibility.
  • Requires implementing the Portable interface and PortableFactory for object creation.
  • Best for: Versioned data, partial deserialization, and flexible schema updates.

Example of Portable Serialization

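import com.hazelcast.nio.serialization.Portable;
import com.hazelcast.nio.serialization.PortableReader;
import com.hazelcast.nio.serialization.PortableWriter;
import java.io.IOException;

public class EmployeePortable implements Portable {
    private int id;
    private String name;

    // Each field is written under a name, which is what allows field-level access.
    @Override
    public void writePortable(PortableWriter writer) throws IOException {
        writer.writeInt("id", id);
        writer.writeUTF("name", name);
    }

    // Fields are read back by the same names used in writePortable().
    @Override
    public void readPortable(PortableReader reader) throws IOException {
        id = reader.readInt("id");
        name = reader.readUTF("name");
    }

    // Must match the ID under which the PortableFactory is registered.
    @Override
    public int getFactoryId() {
        return 1;
    }

    // Identifies this class within its factory.
    @Override
    public int getClassId() {
        return 2;
    }
}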
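
The backward-compatibility idea can be sketched in plain Java. The example below is an illustration under stated assumptions, not Portable's actual format or API: VersionedReadSketch, EmployeeV2, and the "department" field are hypothetical. A version number is written ahead of the fields so that a newer reader can still read older data and fall back to a default for the field that did not exist yet.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class VersionedReadSketch {

    // Hypothetical second version of an employee: "department" was added in version 2.
    static class EmployeeV2 {
        int id;
        String name;
        String department;
    }

    // Version 1 data has no department field.
    static byte[] writeV1(int id, String name) {
        return write(1, id, name, null);
    }

    static byte[] writeV2(int id, String name, String department) {
        return write(2, id, name, department);
    }

    private static byte[] write(int version, int id, String name, String department) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(version);
            out.writeInt(id);
            out.writeUTF(name);
            if (version >= 2) {
                out.writeUTF(department);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // A version 2 reader that accepts both versions of the data.
    static EmployeeV2 read(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int version = in.readInt();
            EmployeeV2 employee = new EmployeeV2();
            employee.id = in.readInt();
            employee.name = in.readUTF();
            // Older data lacks the field, so use a default instead of failing.
            employee.department = version >= 2 ? in.readUTF() : "unknown";
            return employee;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(read(writeV1(7, "Ada")).department);
        System.out.println(read(writeV2(7, "Ada", "R&D")).department);
    }
}
```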

4. Custom Serialization (StreamSerializer):

  • If none of the default serialization mechanisms are suitable, you can write a custom serializer by implementing the StreamSerializer<T> interface.
  • You can fully control how objects are serialized and deserialized, which can be highly optimized for specific use cases.
  • Best for: Specific objects that require custom serialization logic or optimizations.

Example of Custom Serialization

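import com.hazelcast.nio.ObjectDataInput;
import com.hazelcast.nio.ObjectDataOutput;
import com.hazelcast.nio.serialization.StreamSerializer;
import java.io.IOException;

// Assumes an Employee class with an (empId, name, email, phoneNo) constructor
// and getEmpId()/getName() getters.
public class EmployeeCustomSerializer implements StreamSerializer<Employee> {
    @Override
    public void write(ObjectDataOutput out, Employee employee) throws IOException {
        out.writeInt(employee.getEmpId());
        out.writeUTF(employee.getName());
    }

    @Override
    public Employee read(ObjectDataInput in) throws IOException {
        int id = in.readInt();
        String name = in.readUTF();
        // Only empId and name are serialized, so email and phoneNo come back as null.
        return new Employee(id, name, null, null);
    }

    // Must be positive and unique among the registered serializers.
    @Override
    public int getTypeId() {
        return 3;
    }

    @Override
    public void destroy() {
        // Optional cleanup
    }
}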

5. ByteArraySerializer:

  • Similar to StreamSerializer, but designed to work directly with byte arrays.
  • Useful when you need to store objects as byte arrays or communicate with systems that only handle byte arrays.
  • Best for: Scenarios requiring interaction with raw byte data.
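
As a rough sketch, the conversion logic such a serializer would contain can be written with java.nio.ByteBuffer alone. Hazelcast's ByteArraySerializer<T> declares a write method that returns a byte[] and a read method that takes one. The helper methods toBytes and fromBytes below are hypothetical stand-ins for those two method bodies, and Badge is a made-up type for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ByteArraySketch {

    // Hypothetical type to be stored as a raw byte array.
    static class Badge {
        final int id;
        final String holder;

        Badge(int id, String holder) {
            this.id = id;
            this.holder = holder;
        }
    }

    // Layout: 4-byte id, 4-byte length of the name, then the UTF-8 name bytes.
    static byte[] toBytes(Badge badge) {
        byte[] holder = badge.holder.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES + Integer.BYTES + holder.length);
        buffer.putInt(badge.id);
        buffer.putInt(holder.length);
        buffer.put(holder);
        return buffer.array();
    }

    // Reads the fields back in the same order they were written.
    static Badge fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        int id = buffer.getInt();
        byte[] holder = new byte[buffer.getInt()];
        buffer.get(holder);
        return new Badge(id, new String(holder, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Badge copy = fromBytes(toBytes(new Badge(42, "Ada")));
        System.out.println(copy.id + " " + copy.holder);
    }
}
```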

Registration of Serializers in Hazelcast

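After choosing a serialization mechanism, you need to register it in Hazelcast's configuration. A custom serializer such as a StreamSerializer is registered through a SerializerConfig, while IdentifiedDataSerializable and Portable classes are instead created by a DataSerializableFactory or PortableFactory registered under the matching factory ID.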

Example of registering a custom serializer:

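import com.hazelcast.config.Config;
import com.hazelcast.config.SerializerConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap; // com.hazelcast.core.IMap in Hazelcast 3.x

public class CustomSerializerDemo {
    public static void main(String[] args) {
        // Tell Hazelcast to use EmployeeCustomSerializer for Employee objects.
        Config config = new Config();
        SerializerConfig serializerConfig = new SerializerConfig()
                .setTypeClass(Employee.class)
                .setImplementation(new EmployeeCustomSerializer());
        config.getSerializationConfig().addSerializerConfig(serializerConfig);

        HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance(config);

        IMap<Employee, String> employeeOwners = hazelcastInstance.getMap("empDataMap");
        Employee employee = new Employee(123, "JavaTechARC 3i", "javatecharc3i@gmail.com", "9830283");

        System.out.println("Serializing the key and value, then adding them to the map");
        employeeOwners.put(employee, "Demo Test");

        System.out.println("Serializing the key for lookup, then deserializing the value");
        System.out.println(employeeOwners.get(employee));

        hazelcastInstance.shutdown();
    }
}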

Choosing the Right Serialization Method in Hazelcast

  • IdentifiedDataSerializable: Best for performance, since it avoids the reflection overhead of Java serialization.
  • Portable Serialization: Suited to evolving data models and versioned data.
  • Custom (StreamSerializer): Useful when you need full control over how objects are serialized.
  • Java Serialization: Easy to implement, but slower than the other mechanisms.

Conclusion

By understanding and selecting the right Hazelcast serialization and deserialization mechanism, you can significantly improve performance and scalability in distributed Hazelcast applications. The Hazelcast Serialization and Deserialization demo code is available on GitHub.

Happy Learning! 😊😊

 
