1. Introduction
In today’s time, efficient querying and management of complex data structures is a necessity for data-driven applications. Hazelcast, a powerful in-memory data grid, is known for handling large-scale distributed data and provides robust querying features. One unique feature of Hazelcast JSON SQL JOIN Query is HazelcastJsonValue, which natively supports JSON data.
SQL (Structured Query Language) is a popular language that has been used for decades to query and manipulate data. Hazelcast takes this a step further by supporting SQL queries on distributed data structures such as maps, which allows developers to reap the benefits of distributed systems while still using familiar syntax.
SQL join is once One very powerful feature of SQL is join operations, which are used to combine data from multiple sources, based on related keys. This feature is very useful in scenarios where we need to link users’ profiles with their transaction records, associate employees with their departments, or aggregate data from different systems.
In Hazelcast JSON SQL JOIN Query article, we will see how SQL join queries can be implemented with JSON data in Hazelcast. We will learn how to store JSON data in Hazelcast maps, how to execute SQL join queries, and how to retrieve meaningful results. With a practical example, you will understand how simple and efficient Hazelcast makes handling distributed JSON data and harnessing the power of SQL.

2. Understanding Hazelcast’s JSON Support
In Hazelcast JSON SQL JOIN Query, to handle the JSON data efficiently Hazelcast provides HazelcastJsonValue
. The main purpose of HazelcastJsonValue
is to store, query and process the JSON data without any transformation on the JSON data structure. If your application process and works on JSON data, HazelcastJsonValue
class is very useful.
HazelcastJsonValue
is a special data type that stores and manages JSON strings as-is. It does not treat JSON objects like plain text, but allows you to query and manipulate the attributes inside them.
HazelcastJsonValue
Features
- Direct JSON Handling: JSON strings can be stored directly without any conversion. The format and structure of the data remains completely intact.
- Field-wise Querying: Specific fields of JSON can be queried through predicates or SQL queries.
3. What are Hazelcast SQL Joins?
In Hazelcast JSON SQL JOIN Query, using Hazelcast’s SQL engine, you can join two or more maps, just like you do in relational databases. Joins allow you to efficiently fetch and process related data.
Example:
Imagine you have two maps:
studentMap
map, which contains employee details (like ID, name, and department ID).departmentMap
map, which contains department details (like ID and department name).
You can write a join query and fetch studentMap
and their departmentMap
names simultaneously.
Types of Joins Supported in Hazelcast:
- INNER JOIN: This join fetches only those rows where there are matching records in both maps.
- LEFT OUTER JOIN: This join fetches all rows from the left map (like
studentMap
) irrespective of whether there is a matching record in the right map (likedepartmentMap
) or not.
4. Setting Up Hazelcast
Pre-requisite:
- Java 8 or higher
- Good understanding about Hazelcast Maps
- Basics understanding of Jet engine to execute SQL queries
Hazelcast dependencies:
Add the following dependency to your pom.xml
, hazelcast-sql
dependency used for Query operations.
if you are using Maven:
<dependency>
<groupId>com.hazelcast</groupId>
<artifactId>hazelcast</artifactId>
<version>5.5.0</version> <!-- Use the latest stable version -->
</dependency>
<dependency>
<groupId>com.hazelcast</groupId>
<artifactId>hazelcast-sql</artifactId>
<version>5.5.0</version>
<scope>runtime</scope>
</dependency>
In your development of Hazelcast SQL Query Example, If you’re using Gradle, include:
implementation 'com.hazelcast:hazelcast:5.5.0'
implementation 'com.hazelcast:hazelcast-sql:5.5.0'
Also explore the below articles:
- Exploring Hazelcast SQL Expressions: A Complete Guide for Developers
- Exploring Hazelcast SQL Data Types
- Hazelcast SQL Queries: A Complete Example for Developers
- Hazelcast Paging Predicate with JSON Data: A Complete Guide
- Understanding of Hazelcast QueueStore
- Understanding of Hazelcast MapLoader
- Understanding of Hazelcast MapStore
- How to Start Hazelcast Members and Client?
5. Creating and Populating Maps
Hazelcast JSON SQL JOIN Query, create sample JSON data map studentMap, departmentMap and courseMap, which we have used for sample SQL demo code.
studentMap
: KeyInteger
and accept value asHazelcastJsonValue
class typedeparmentMap
: KeyInteger
and accept value asHazelcastJsonValue
class type, wheredepartmentId
used as a foreign key intostudentMap
courseMap
: KeyInteger
and accept value asHazelcastJsonValue
class type, whereList<Course>
used as key into studentMap
Create a Hazelcast’s instance and enable Jet engine config, Jet engine will help us to execute the SQL queries
// Initialize Hazelcast instance
Config config = new Config();
config.getJetConfig().setEnabled(true); // Enable the Jet engine
HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance(config);
Create sample maps studentMap
, deparmentMap
, and courseMap
used in the demo.
// Define maps
IMap<Integer, HazelcastJsonValue> studentMap = hazelcastInstance.getMap("studentMap");
IMap<Integer, HazelcastJsonValue> departmentMap = hazelcastInstance.getMap("departmentMap");
IMap<Integer, HazelcastJsonValue> courseMap = hazelcastInstance.getMap("courseMap");
Populate the JSON data into the maps.
Course course1 = new Course(501, "Physics");
Course course2 = new Course(502, "Chemistry");
Course course3 = new Course(501, "Maths");
courseMap.put(501, new HazelcastJsonValue(gson.toJson(course1)));
courseMap.put(502, new HazelcastJsonValue(gson.toJson(course2)));
courseMap.put(503, new HazelcastJsonValue(gson.toJson(course3)));
Student student1 = new Student(101, "Sachin", 1001, List.of(course1, course3));
Student student2 = new Student(102, "Rahul", 1002, List.of(course2));
Student student3 = new Student(103, "Rachel", 1001, List.of(course3));
Student student4 = new Student(104, "Villa", 1001, List.of(course1));
Student student5 = new Student(105, "Nev am", 1002, List.of(course2));
studentMap.put(101, new HazelcastJsonValue(gson.toJson(student1)));
studentMap.put(103, new HazelcastJsonValue(gson.toJson(student3)));
studentMap.put(102, new HazelcastJsonValue(gson.toJson(student2)));
studentMap.put(105, new HazelcastJsonValue(gson.toJson(student5)));
studentMap.put(104, new HazelcastJsonValue(gson.toJson(student4)));
Department department1 = new Department(1001, "Physics Department");
Department department2 = new Department(1002, "Chemistry Department");
departmentMap.put(1001, new HazelcastJsonValue(gson.toJson(department1)));
departmentMap.put(1002, new HazelcastJsonValue(gson.toJson(department2)));
Sample Student
, Department
, and Course
POJO.
public class Student {
private int studentId;
private String studentName;
private int departmentId;
private List<Course> courses;
//create setter and getter or use lombok lib
}
public class Course {
private int courseId;
private String courseName;
//create setter and getter or use lombok lib
}
public class Department {
private int id;
private String departmentName;
//create setter and getter or use lombok lib
}
6. SQL Data Type Mapping
Before starting Hazelcast JSON SQL JOIN query execution, create data type mappings for sudentMap
, departmentMap
, and courseMap
. SqlService allows to apply the mapping and execute the queries.
SqlService sqlService = hazelcastInstance.getSql();
//Then we define the schema in Hazelcast
String studentMapping = "CREATE MAPPING studentMap " +
"TYPE IMap " +
"OPTIONS (" +
"'keyFormat' = 'java', " +
"'keyJavaClass' = 'java.lang.Integer', " +
"'valueFormat' = 'java', " +
"'valueJavaClass' = 'com.hazelcast.core.HazelcastJsonValue'" +
")";
sqlService.execute(studentMapping);
String departmentMapping = "CREATE MAPPING departmentMap " +
"TYPE IMap " +
"OPTIONS (" +
"'keyFormat' = 'java', " +
"'keyJavaClass' = 'java.lang.Integer', " +
"'valueFormat' = 'java', " +
"'valueJavaClass' = 'com.hazelcast.core.HazelcastJsonValue'" +
")";
sqlService.execute(departmentMapping);
String courseMapping = "CREATE MAPPING courseMap " +
"TYPE IMap " +
"OPTIONS (" +
"'keyFormat' = 'java', " +
"'keyJavaClass' = 'java.lang.Integer', " +
"'valueFormat' = 'java', " +
"'valueJavaClass' = 'com.hazelcast.core.HazelcastJsonValue'" +
")";
sqlService.execute(courseMapping);
7. Performing an SQL Join Query on JSON
SQL join query is similar to a relational database query, in HazelcastJsonValue
query the data with JSON path by using of JSON_QUERY
. You can use the $. attribute name to locate the specific JSON attribute into JSON data.
//create query to select specific JSON attribute from json map
String sqlQuery = "SELECT JSON_QUERY(a1.this, '$.studentId') AS studentId, " +
"JSON_QUERY(a1.this, '$.studentName') AS studentName, " +
"JSON_QUERY(a2.this, '$.id') AS departmentId, " +
"JSON_QUERY(a2.this, '$.departmentName') as departmentName " +
"FROM studentMap a1 JOIN departmentMap a2 " +
"ON JSON_QUERY(a1.this, '$.departmentId') = JSON_QUERY(a2.this, '$.id')";
sqlService.execute(sqlQuery) responsible for query execution, which returns the SqlResult
as response.
SqlResult result = sqlService.execute(sqlQuery);
for (SqlRow row : result) {
//Sample result printing for demo
System.out.print(" "+row.getObject("studentId"));
System.out.print(" "+row.getObject("studentName"));
System.out.print(" "+row.getObject("departmentId"));
System.out.print(" "+row.getObject("departmentName"));
System.out.println();
}
SqlResult
as the result set which holds the data after query executionSqlRow
represents the collections of rows or the result set.row.getObject(int)
, get the result value by specific column orderrow.getObject("column name")
, get the value by specific column name
8. Best Practices and Limitations
Best Practices
- Use Indexes for Performance: Hazelcast SQL queries can be slow on large datasets if indexes are not used. Put indexes on fields that are frequently used in joins or filters. Example:
map.addIndex(IndexType.SORTED, "departmentId");
- Optimize the query: By fetching only required fields to reduce network and processing overhead. Avoid using SELECT * unless necessary.
- Monitor Query Performance: Optimize slow-running queries by using Hazelcast Management Center or query execution metrics.
- Balance the distributed environment: Cluster should be properly balanced so that data remains evenly distributed and query execution is fast.
- Keep Consistent Schema Design: Hazelcast is schema-less, but your data models must be logically aligned for joins to work properly. Use consistent key names for related fields.
- Use Ad-Hoc Queries Carefully: Ad-hoc SQL Use queries only for exploratory analysis. Predefine frequent queries for better performance and stability.
Limitations
- Only
IMap
is supported: Hazelcast SQL joins are currently available only for IMap data structures. Compatible with Multi-map or replicated map no. - High Memory Usage: During Joins execution, data is loaded into memory. If datasets are very large, then high memory usage or out-of-memory errors can occur.
- Distributed Execution Latency: Distributed processing is a strength, but if the cluster configuration is not good or the data is unevenly distributed, latency can occur.
- Complex Joins Can Be Slow: Complex queries with multiple joins can run slower than traditional databases especially with large datasets.
- Lack of Advanced Join Types: Hazelcast does not yet support FULL OUTER JOIN or CROSS JOIN types.
- Debugging Distributed Queries: Debugging Distributed SQL queries can be challenging, especially in large clusters. Logs and metrics It has to be analyzed carefully.
9. Conclusion
Hazelcast’s JSON SQL JOIN feature is very useful for those scenarios where relational-style queries are required but the data is stored in a distributed environment. By implementing the examples in this blog, you can practically understand the JSON SQL capabilities of Hazelcast.
If you have data in JSON format and want to execute relational-style queries, then Hazelcast’s JSON SQL query feature can be very helpful.
The Java sample program used on the demo are available over the GitHub.
Happy learning 😊