Categories
Java SQL

Create a PostgreSQL index on a column and test performance with Java JDBC

In this tutorial we will see how to create an index on a SQL table column with JDBC, then we will fill a table with 10⁷ rows (using PostgreSQL fast COPY function), and finally we will compare the select performance with vs without the index.

1) The SQL tables and the index

We will create a table client without an index, and a table product with an index. We write the SQL code in the file tableWithIdx.sql:

CREATE TABLE client (ID UUID PRIMARY KEY, name VARCHAR(50) NOT NULL);

CREATE TABLE product (ID UUID PRIMARY KEY, name VARCHAR(50) NOT NULL);

CREATE INDEX idx_product_name ON product(name);

2) Start PostgreSQL, create the tables and index

To run the PostgreSQL database, we use Java Testcontainers. Find a simple full tutorial on this framework here. The code is:

String username = "user";
String password = "password";
PostgreSQLContainer<?> postgreSQLContainer = new PostgreSQLContainer<>("postgres:11.1").withUsername(username).withPassword(password);
postgreSQLContainer.start();
Class.forName("org.postgresql.Driver");
Connection c = DriverManager.getConnection(postgreSQLContainer.getJdbcUrl(), username, password);

To create the tables and index, use the filetableWithIdx.sql from step 1) and run:

try (Statement stmt = c.createStatement()) {                       
    stmt.executeUpdate(resource("tableWithIdx.sql"));
}

where resource function code is:

static String resource(String path) throws IOException {
    InputStream is = IndexBlog.class.getClassLoader().getResourceAsStream(path);
    return new String(is.readAllBytes(), StandardCharsets.UTF_8);
}

3) Fill tables with 10⁷ rows of dummy data, using PostgreSQL COPY function

We want to fill the tables with a lot of data, to compare the performance of a search on table customer that doesn’t have an index, vs the search on table client that has an index.

To do this, we will create csv files with 10⁷ rows, where each line represents column values:

Path clientFile = Files.createTempFile("client", ".txt");
Path productFile = Files.createTempFile("product", ".txt");

createDataFiles(clientFile, productFile);

wherecreateDataFiles function is:

static int NUM = (int) Math.pow(10, 7);
  
static void createDataFiles(Path clientFile, Path productFile) throws IOException {
    String sep = System.lineSeparator();

    try (FileWriter fw = new FileWriter(clientFile.toFile())) {
        for (int i = 0; i < NUM; i++) {
            String line = UUID.randomUUID() + ",c" + i + sep;
            fw.write(line);
        }
    }

    try (FileWriter fw = new FileWriter(productFile.toFile())) {
        for (int i = 0; i < NUM; i++) {
            String line = UUID.randomUUID() + ",p" + i + sep;
            fw.write(line);
        }
    }
}

The last ten lines of clientFile look like this:

d5fa0468-6ea0-496c-a15d-b10c16546de1,c9999990
30f69f08-7ecd-42be-8c6b-fde44af6948c,c9999991
fbd77f33-c913-436d-89fb-388eb9141448,c9999992
80caefb5-fdfc-42a1-ad51-d2d27746fdb9,c9999993
c8f5efba-818c-4e16-9950-4ec8dc27ef11,c9999994
2b3814dd-5f4d-4414-8a8e-12805c75eaca,c9999995
9dce041b-16bb-4ecb-8664-b1ffa88437fd,c9999996
95d728b7-2e5d-41bc-ab6b-05e746914275,c9999997
ac36b92e-454a-4346-beaf-5d4223d0eb3d,c9999998
434ce154-3e43-428a-a06d-3f3a36283d1f,c9999999

To load the data into PostgreSQL, use the following code:

CopyManager cm = new CopyManager((BaseConnection) c);
copyData(cm, "client", clientFile);
copyData(cm, "product", productFile);

where function copyData is:

static void copyData(CopyManager cm, String table, Path data) throws IOException, SQLException {
    cm.copyIn("COPY " + table + " FROM STDIN USING DELIMITERS ',' WITH NULL AS '\\null';", Files.newBufferedReader(data));
}

4) Search on name column

We will run the following selects:

int idxToFind = NUM / 2;
String findClient = "select * from client where name='c" + idxToFind + "'";
String findProduct = "select * from product where name='p" + idxToFind + "'";

To run them using JDBC, do:

Client cli = query(Client.class, findClient, c).get(0);
Product p = query(Product.class, findProduct, c).get(0);

where the code for query, Client and Product is:

static <T> List<T> query(Class<T> clazz, String sql, Connection c) throws SQLException {
    ResultSetHandler<List<T>> resultSetHandler = new BeanListHandler<>(clazz, new BasicRowProcessor(new GenerousBeanProcessor()));
    return new QueryRunner().query(c, sql, resultSetHandler);
}

@NoArgsConstructor
@AllArgsConstructor
@Data
public static class Client {
    UUID id;
    String name;
}

@NoArgsConstructor
@AllArgsConstructor
@Data
public static class Product {
    UUID id;
    String name;
}

(@NoArgsConstructor@AllArgsConstructor and @Data are annotations from the Lombok project. They add constructors, getters and setters to the Java POJOs, so that we don’t have to write all this boilerplate code. But if you prefer, you can write the getters and setters by hand.)

5) Insert and Search performances

If we run the insertion / search code from above, for the two tables client and product (client has an index, product doesn’t), we get the following performances (in milliseconds):

Insert clients Elapsed: 67696
Insert products Elapsed: 130987
Search client Elapsed: 598
Search product Elapsed: 2

So we notice that insertion on a table with index is longer, but search is way faster, this is the expected behavior.

6) Full working Java code

Leave a Reply

Your email address will not be published. Required fields are marked *