id generation in database tables


By default GORM uses the native id generation strategy of the database, whether that be an auto-increment column or a sequence.

http://gorm.grails.org/latest/hibernate/manual/

Given a configuration with one database server this is good. However when adding replicas this could become problematic, since this is often auto-generated by the database server. For a fuller discussion se the links at the end of the post.

This blog post follows the progress of testing id generation with uuid and snowflake algorithms.

The hibernate userguide chapter Domain Model -> Identifiers -> Simple identifiers states that only these types should be used as identifier attribute types:

  • any Java primitive type
  • any primitive wrapper type
  • java.lang.String
  • java.util.Date (TemporalType#DATE)
  • java.sql.Date
  • java.math.BigDecimal
  • java.math.BigInteger

I expect issues with uuid since it is not in the list.

Lets see what happens with a default setup and this Book domain:

class Book {
    static constraints = {
    }

    String name
    String author
}

h2-console reveals this SQL statement for creating the table:

CREATE MEMORY TABLE "PUBLIC"."BOOK"(
    "ID" BIGINT DEFAULT (NEXT VALUE FOR "PUBLIC"."SYSTEM_SEQUENCE_EE497A0E_29ED_4D8B_89C4_43BE6C6C8282") NOT NULL NULL_TO_DEFAULT SEQUENCE "PUBLIC"."SYSTEM_SEQUENCE_EE497A0E_29ED_4D8B_89C4_43BE6C6C8282",
    "VERSION" BIGINT NOT NULL,
    "NAME" VARCHAR(255) NOT NULL,
    "AUTHOR" VARCHAR(255) NOT NULL
)

The ID type is BIGINT == INT8 == java.lang.Long, i.e. 64 bit integer. The value is generated by a Sequence – SYSTEM_SEQUENCE_EE497A0E_29ED_4D8B_89C4_43BE6C6C8282.

Switching to uuid

I have used uuid’s stored as string in a project a while back. Here I will look into uuids stored as 16-byte integers, i.e. UUID.

In order to switch to uuid application.groovy must contain these lines:

grails.gorm.default.mapping = {
    id generator: 'uuid'
}
grails.gorm.default.constraints = {
    id sqlType: 'uuid'
}

Which won’t run before we edit the Domain and specify that id is of type String. After specifying id as String gorm is able to convert the uuid.

package no.prpr

class Book {

    static constraints = {
    }

    String id
    String name
    String author

}

Let’s check if the uuid rewrite works

To test this I will create a new domain – Author and rewrite Book. Then add controllers, services and views. Here’s the updated Book and new Author.

class Book {

    static constraints = {
    }

    String id
    String name
    Author author

    String toString() {
        name
    }
    
}
class Author {

    static constraints = {
    }

    static hasMany = [books: Book]

    String id
    String name

    String toString() {
        name
    }
}

Now the Controllers have coded id to be Long in a few instances. This has to be changed to String. Here’s AuthorController.show:

    def show(String id) {
        respond authorService.get(id)
    }

Will it work on Postgres?

The tables are correctly created, but the following error message is given:

Caused by: org.postgresql.util.PSQLException: ERROR: column "id" is of type uuid but expression is of type character varying
  Hint: You will need to rewrite or cast the expression.

There seems to be a difference in the h2 and postgres drivers. Anyway I could not make it work with the time I had set aside for this testing.

Let’s try twitter snowflake

The twitter snowflake algorithm creates a 64-bit integer, which is the same as long / int8 / bigint in size. Therefore we should be able to get away with replacing the generator.

I started with a new grails project and configured support for postgres.

Added Book and Author domains, and run “grails generate-all” for both domains.

I found a snowflake algorithm here: https://github.com/callicoder/java-snowflake

Created src/main/groovy/no.prpr.MySnowflakeGenerator.

package no.prpr

import com.callicoder.snowflake.Snowflake
import groovy.util.logging.Slf4j
import org.hibernate.HibernateException
import org.hibernate.MappingException
import org.hibernate.engine.spi.SharedSessionContractImplementor
import org.hibernate.id.Configurable
import org.hibernate.id.IdentifierGenerator
import org.hibernate.service.ServiceRegistry
import org.hibernate.type.Type

@Slf4j
class MySnowflakeGenerator implements IdentifierGenerator, Configurable {

    Snowflake snowflake

    @Override
    void configure(Type type, Properties params, ServiceRegistry serviceRegistry) throws MappingException {
        if (snowflake) {
            return
        }

        snowflake = new Snowflake(getNodeId())
    }

    @Override
    Serializable generate(SharedSessionContractImplementor session, Object object) throws HibernateException {
        snowflake.nextId()
    }

    long getNodeId() {
        InetAddress ip
        String hostName
        try {
            ip = InetAddress.getLocalHost()
            hostName = ip.getHostName()
        } catch(UnknownHostException e) {
            log.error(e.message)
        }
        hash(hostName ?: 'unknown', 10)
    }
    long hash(String stringToHash, int maxBits) {
        BigInteger hashCode = stringToHash.hashCode()
        log.debug("stringToHash: ${stringToHash}, hashCode: ${hashCode.longValue()}")
        ((maxBits)..63).each { bit ->
            hashCode = hashCode.clearBit(bit)
        }
        log.debug("cleared bits hashCode: ${hashCode.longValue()} ${Long.toBinaryString(hashCode.longValue())}")
        hashCode.longValue()
    }
}

You might want to rewrite the configure method. I wanted a generator without manual configuration using the server name. I wrote this test as proof of concept:

package no.prpr

import spock.lang.Specification

class MySnowflakeGeneratorIntegrationSpec extends Specification {

    MySnowflakeGenerator mySnowflakeGenerator

    def setup() {
        mySnowflakeGenerator = new MySnowflakeGenerator()
    }

    void 'test hashing 10 bit distribution'() {
        when:
        def generatedIds = [];
        (1..10).each { number ->
            String servername = "someapp${number}"
            Long hashCode = mySnowflakeGenerator.hash(servername, 10)
            if (!generatedIds.contains(hashCode)) {
                generatedIds.add(hashCode)
            }
            servername = "anotherapp${number}"
            hashCode = mySnowflakeGenerator.hash(servername, 10)
            if (!generatedIds.contains(hashCode)) {
                generatedIds.add(hashCode)
            }
        }
        then:
        generatedIds.size() == 20
    }

}

I see now that I could just have used the default which creates a nodeId based on the Network hardware address.

Replace id generator for all domains in application.groovy:

grails.gorm.default.mapping = {
    id generator: 'no.prpr.MySnowflakeGenerator'
}

Changing to the snowflake generator works with no other changes to either domain, controller or services.

Further resources

Other’s have written about this topic, specifically the article linked below inspired me to do this now.