
Overview
Hazelcast, a leading in-memory data grid, offers seamless integration with SQL queries, making it a powerful tool for developers who need both scalability and performance. Hazelcast SQL Query Example provides a structured way to query data stored within Hazelcast clusters, combining the flexibility of SQL with the speed of in-memory computing. In this comprehensive guide, we explore how to use Hazelcast SQL queries effectively, complete with examples, to help developers master this functionality.
What Is Hazelcast SQL?
Hazelcast SQL is an SQL-based query engine integrated into Hazelcast that allows querying distributed data structures such as maps, lists, and caches. While Hazelcast was traditionally associated with key-value pair queries, Hazelcast SQL opens the door to more complex query scenarios. It is fully compliant with ANSI SQL standards, making it familiar to those experienced in SQL-based databases.
Key Features of Hazelcast SQL
- Distributed SQL Processing: Hazelcast SQL queries run across distributed clusters, providing scalability and resilience.
- Low Latency: Queries operate at in-memory speeds, providing extremely low-latency responses, even for complex joins and aggregations.
- Supports Standard SQL: This includes common SQL functions, joins, aggregations, and sorting.
- Scalability: Hazelcast scales horizontally, and Hazelcast SQL queries are designed to work seamlessly with data that is spread across a cluster.
Pre-requisite
Before start exploring Hazelcast SQL Queries Example, you should have basic understanding about :
- Hazelcast download and installed
- Any IDE
- How to Connect a Database with Hazelcast Using Hikari Connection Pooling?
- Understanding of Hazelcast QueueStore
- Understanding of Hazelcast MapLoader
- Understanding of Hazelcast MapStore
- How to Start Hazelcast Members and Client?
Setting Up Hazelcast SQL for Querying
In Hazelcast SQL Query Example, Before we jump into the code, let’s walk through the setup process to run SQL queries on a Hazelcast SQL Query Example cluster.
1. Installing Hazelcast
In below Hazelcast SQL Query Example, To use Hazelcast SQL, you need to install Hazelcast in your Java project. Add the following dependency to your pom.xml
, hazelcast-sql
dependency used for Query operations.
if you are using Maven:
<dependency>
<groupId>com.hazelcast</groupId>
<artifactId>hazelcast</artifactId>
<version>5.5.0</version> <!-- Use the latest stable version -->
</dependency>
<dependency>
<groupId>com.hazelcast</groupId>
<artifactId>hazelcast-sql</artifactId>
<version>5.5.0</version>
<scope>runtime</scope>
</dependency>
In your development of Hazelcast SQL Query Example, If you’re using Gradle, include:
implementation 'com.hazelcast:hazelcast:5.5.0'
implementation 'com.hazelcast:hazelcast-sql:5.5.0'
To enable the Jet engine on the members, do one of the following:
- Change member config using Java API:
config.getJetConfig().setEnabled(true)
- Change XML/YAML configuration property: Set
hazelcast.jet.enabled
totrue
- Add system property:
-Dhz.jet.enabled=true
(for Hazelcast embedded, works only when loading config viaConfig.load
)- Add environment variable:
HZ_JET_ENABLED=true
(recommended when running container image. For Hazelcast embedded, works only when loading config via Config.load)
2. Configuring Hazelcast Cluster
In Hazelcast SQL Query Example, a Hazelcast cluster is essential for executing SQL queries. Configure Hazelcast programmatically or use an XML configuration file and enable the Jet Config config.getJetConfig().setEnabled(true)
. Below is a simple programmatic configuration:
Config config = new Config();
config.getJetConfig().setEnabled(true);
HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance(config);
3. Creating a Distributed Map
In Hazelcast, data is typically stored in distributed maps. For SQL queries, you need to define a map that Hazelcast can understand as a queryable entity.
IMap<Integer, Players> players = hazelcastInstance.getMap("players");
Here, we’re creating a distributed map of players, where Integer
is the key, and Players
is the value.
Sample POJO Mapping
package com.javatecharc.demo.model;
import lombok.AllArgsConstructor;
import lombok.Getter;
@AllArgsConstructor
@Getter
public class Players {
private int userId;
private String name;
private String department;
private int salary;
}
package com.javatecharc.demo.model;
import lombok.Getter;
import lombok.Setter;
@Getter
@Setter
public class Department {
private int id;
private String name;
}
4. Defining the Schema for SQL Queries
To query distributed data with SQL, we need to define a schema. This can be done using Hazelcast’s mapping feature. Below is an example of how to map a Hazelcast map to a SQL table:
//Put Player data
players.put(1, new Players(1, "Sachin", "Opening Batsman", 75000));
players.put(2, new Players(2, "Rahul", "Middle Order", 90000));
players.put(3, new Players(3, "Rohit", "Opening Batsman", 80000));
players.put(4, new Players(4, "Dhoni", "Finisher", 85000));
players.put(5, new Players(5, "Yuvraj", "Middle Order", 70000));
Then we define the schema and create mapping for the given map in Hazelcast:
SqlService sqlService = hazelcastInstance.getSql();
//Then we define the schema in Hazelcast
String mapping = "CREATE MAPPING players " +
"TYPE IMap " +
"OPTIONS (" +
"'keyFormat' = 'java', " +
"'keyJavaClass' = 'java.lang.Integer', " +
"'valueFormat' = 'java', " +
"'valueJavaClass' = 'com.javatecharc.demo.model.Players'" +
")";
sqlService.execute(mapping);
5. Executing SQL Queries
Once the schema is set up, you can perform queries on the distributed map using standard SQL. Hazelcast SQL supports a wide range of SQL operations, including SELECT, INSERT, UPDATE, DELETE, JOIN, and more.
Example: Simple SQL Query
//Executing SQL Queries
SqlResult result = sqlService.execute("SELECT * FROM players WHERE salary > 80000");
for (SqlRow row : result) {
System.out.println(row);
}
This query fetches all players with a salary greater than 80,000. The result set is printed to the console.
Example: Aggregation Query
//Average Salary
SqlResult averageSalary = sqlService.execute("SELECT AVG(salary) FROM players");
for (SqlRow row : averageSalary) {
System.out.println("Average salary: " + row.getObject(0));
}
In this case, the query computes the average salary of all players.
Example: Joining Data
Hazelcast SQL supports JOIN operations across distributed maps. Here’s an example of joining two maps, the below is the sample join query, pojo mappings you create as needed.
String joinQuery = "SELECT p.name, d.name FROM players p " +
"JOIN departments d ON p.departmentId = d.id";
SqlResult result = sqlService.execute(joinQuery);
for (SqlRow row : result) {
System.out.println(row.getObject(0) + " works in " + row.getObject(1));
}
6. Inserting and Updating Data via SQL
Hazelcast SQL also allows data manipulation through standard SQL commands like INSERT
and UPDATE
.
Inserting Data
sqlService.execute("INSERT INTO players VALUES (6, 'Ganguly', 'Captain', 950000)");
This command inserts a new players into the players
map.
Updating Data
sqlService.execute("UPDATE players SET salary = 85000 WHERE name = 'Dhoni'");
This command updates the salary of the players named Dhoni.
Performance Considerations for Hazelcast SQL
When running Hazelcast SQL queries, it’s essential to consider performance, especially in large, distributed environments. Here are some tips:
- Indexing: Use indexing to improve the performance of SQL queries. Hazelcast supports adding indexes to maps.
For example:
players.addIndex(IndexType.SORTED, "salary");
- Query Optimization: Be mindful of query structure. Avoid complex joins if they can be simplified, and ensure you are querying only the data needed.
- Cluster Scaling: Hazelcast SQL scales horizontally with your cluster. As you add more members, the query performance can improve due to parallelism. Ensure your cluster is properly sized for the volume of data being queried.
Conclusion
Hazelcast SQL is a powerful addition to the Hazelcast ecosystem, allowing developers to leverage SQL querying in a distributed, in-memory environment. From simple SELECT queries to complex JOIN operations, Hazelcast SQL provides the performance and scalability that modern applications demand.
By setting up a distributed map, defining schemas, and executing SQL queries efficiently, developers can fully utilize Hazelcast’s capabilities to manage large-scale data operations with ease.
The code sample available over the github.
Happy Learning! 😊