
How to create your own ‘dynamic’ bean definitions in Spring

Recently, I joined a software project with a Spring/Hibernate-based software stack, which is shipped in a SaaS-like manner, but the customers' databases need to be separated from each other. Sounds easy? OK, let's see what we have in detail.

1. Baseline study

Requirement: There is a software product that can be sold to different customers, but the provider wants to keep the sensitive data of each customer in a separate data source. Every customer/login has access to exactly one data source.

In a, let's say, 'simple' Spring setup, one would simply add all databases to the configuration type of one's choice. Spring currently offers four different ways of container configuration:

  • XML-based (old-fashioned)
  • Annotation-based (almost old-fashioned)
  • Code-based (pretty awesome)
  • Property-file-based (exotic; haven't heard of it yet, right? See the small example right after this list – we will come back to it later)
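
To give you a taste of the exotic fourth option right away: the javadoc of Spring's PropertiesBeanDefinitionReader illustrates the expected format roughly like this (the key prefix is the bean name; special attributes such as the class go in parentheses):

employee.(class)=MyClass       // bean is of class MyClass
employee.(abstract)=true       // this bean can't be instantiated directly
employee.group=Insurance       // real property
employee.usesDialUp=false      // real property (potentially overridden)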

If you are interested in the details of the different possibilities, I refer you to the Spring Reference Manual.

Customers get a pile of logins which are associated with a specific customer key, which should delegate to the correct database. There is always only one relevant database per customer.

Quick as a flash, the experienced Spring developer will point to the 'AbstractRoutingDataSource', which stores a map of data sources internally. Based on your implementation of determineCurrentLookupKey() – typically a ThreadLocal value is accessed – the correct data source is chosen from the internal map when a connection is opened. The Spring Security experts will quickly show you how to attach the customer key to the current session, where your AbstractRoutingDataSource can easily access it. It looks even more powerful in combination with JNDI, because JNDI provides a lookup mechanism to search for resources by name.
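
The routing class itself is not spelled out in this post, so here is a minimal sketch of what such an implementation could look like, assuming the customer key is held in a ThreadLocal; the static helper methods are my own illustration, not Spring API:

import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;

public class CustomerRoutingDataSource extends AbstractRoutingDataSource {

  // assumption for illustration: the login mechanism stores the key here per thread
  private static final ThreadLocal<String> CURRENT_CUSTOMER_KEY = new ThreadLocal<String>();

  public static void setCurrentCustomerKey(String customerKey) {
    CURRENT_CUSTOMER_KEY.set(customerKey);
  }

  public static void clearCurrentCustomerKey() {
    CURRENT_CUSTOMER_KEY.remove();
  }

  @Override
  protected Object determineCurrentLookupKey() {
    // the returned key selects the matching entry from the internal data source map
    return CURRENT_CUSTOMER_KEY.get();
  }
}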

All that sounds nice on paper. But this approach has two drawbacks for us: the JNDI lookup functionality does not fit our setup, and it constrains Hibernate performance tuning (the second-level cache will come up again in the conclusion).

For now, let's forget about these problems. We want to use the AbstractRoutingDataSource, just without the JNDI lookup functionality, and we won't care about Hibernate performance ;).

2. The first solution in action

Initially, when the application was rolled out for the first time, only one customer was present. Apart from that, a demo customer was created to prove the multiple-customer feature. No one really complained about the fact that a developer had to insert the data source configuration for every new customer. Everything looked good until the day came when a second customer was contracted. First of all, we had to set up a test environment with a separate data source, then the real one. Temporarily we thought about having a fifth 'customer' for running various integration tests. The team finally recognized that the task of setting up the data source beans should somehow be pushed to OPS, because they already have to define the correct JNDI resource in the servlet context.

3. Where only few seem to have gone before

New, developer-driven requirement: Setting up a new customer is an OPS task; hence, introducing a new data source should not affect the development team, because there is nothing to develop!

A first brainstorming session brought up only one obvious idea: we would need to partly expose our Spring XML config. "With great power comes great responsibility", and to be honest, no one really wants to share the power of configuring a Spring container.

After a short research session, we found many questions dealing with the same sort of topic: "How to dynamically create Spring beans based on some sort of provided configuration, without rebuilding the WAR file?"

Our approach with the 'AbstractRoutingDataSource' was referenced several times, which confirmed our initial direction. But some of the results especially caught my attention.

My idea was to have our very own Spring bean definition approach which does not affect the WAR file! I immediately started to search for bean definition alternatives and stumbled upon Spring's JdbcBeanDefinitionReader and the above-mentioned PropertiesBeanDefinitionReader. Both classes helped me understand how Spring internally collects all the different bean definition details before initializing the beans. (A good resource for understanding how a Spring ApplicationContext gets initialized is this SpringONE presentation.)

I got temporarily discouraged when I saw that I can't simply add a BeanDefinitionReader implementation to an ApplicationContext somewhere. It felt wrong to implement my own ApplicationContext, so I had one last chance: the fourth link provides an approach for programmatically creating bean definitions with a BeanFactoryPostProcessor (BFPP) implementation. When you first read the javadoc, it sounds a bit like the wrong way; even the interface name itself does not really suggest creating bean definitions. The second thing is that BFPP implementations assume that all bean definitions are already in place, so the order of execution would become essential. But we are lucky, "there is some Spring for it", of course ;). In Spring 3.0.1, an interface extension to BFPP was introduced: the BeanDefinitionRegistryPostProcessor (BDRPP) (I love these names :)).

[Comic: "The object-oriented programmer world" – Manu Cornet, Bonkers World Blog]

4. The solution

I decided to go for it and confined myself to a prototyping weekend.

This is the list of acceptance criteria I defined:

  • Based on the following single line of configuration, the application should initialize itself according to the rules below:
    customerKeys=customer_x, customer_y, demo_customer, ui_test
  • The customer keys are identical to the set of possible keys provided by the login mechanism. A user can only have one customerKey assigned.
  • A single data source is created per customerKey and will be managed by the AbstractRoutingDataSource.

There were many pitfalls and dead ends along the way that weekend. That's why I prefer to show you the final solution and explain why I did it that way.

4.1) How to get the ‘customerKey’ property injected in a BeanDefinitionRegistryPostProcessor?

Well, not the common way. Because BeanDefinitionRegistryPostProcessors are instantiated at a very early stage of the initialization process, you can't use functionality like @Autowired or @Value. The technical reason can be looked up in the SpringONE presentation mentioned above. The only things you can do are:

  • Access a system property / ServletContext param *yuck*
  • Add the Spring Environment to your BeanDefinitionRegistryPostProcessor

A short explanation of why the Spring Environment is helpful here: since 3.1, Spring separates the configuration from the ApplicationContext, which means that things like defined config properties and activated Spring profiles got their own interfaces. So it is good practice to initialize your Environment before a single bean is instantiated or even defined. There are many conventions which set up your environment automatically, but the true power comes along if you do it on your own. That's a topic for another post.
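
For the impatient, here is a first impression – a minimal sketch, assuming a made-up properties file name, of how an ApplicationContextInitializer can register its own property source before any bean definition is processed:

import java.io.IOException;

import org.springframework.context.ApplicationContextInitializer;
import org.springframework.context.ConfigurableApplicationContext;
import org.springframework.core.env.ConfigurableEnvironment;
import org.springframework.core.io.support.ResourcePropertySource;

public class EnvironmentSetupInitializer implements ApplicationContextInitializer<ConfigurableApplicationContext> {

  @Override
  public void initialize(ConfigurableApplicationContext applicationContext) {
    ConfigurableEnvironment environment = applicationContext.getEnvironment();
    try {
      // 'customer.properties' is an assumption for illustration; this is where
      // a line like 'customerKeys=customer_x, customer_y, ...' could live
      environment.getPropertySources().addFirst(new ResourcePropertySource("classpath:customer.properties"));
    } catch (IOException e) {
      throw new IllegalStateException("Could not load customer.properties", e);
    }
  }
}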

4.2) Core implementation

This is what our BeanDefinitionRegistryPostProcessor looks like:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.commons.lang.StringUtils;
import org.springframework.beans.BeansException;
import org.springframework.beans.factory.config.ConfigurableListableBeanFactory;
import org.springframework.beans.factory.support.BeanDefinitionBuilder;
import org.springframework.beans.factory.support.BeanDefinitionRegistry;
import org.springframework.beans.factory.support.BeanDefinitionRegistryPostProcessor;
import org.springframework.core.env.Environment;
import org.springframework.jndi.JndiObjectFactoryBean;

public class DataSourcesBeanFactoryPostProcessor implements BeanDefinitionRegistryPostProcessor {

  private final List<String> customerKeys;

  public DataSourcesBeanFactoryPostProcessor(Environment springEnvironment) {
    // the parsed keys must actually be assigned to the final field
    this.customerKeys = parseCustomerKeys(springEnvironment.getProperty("customerKeys"));
  }

  @Override
  public void postProcessBeanDefinitionRegistry(BeanDefinitionRegistry registry) throws BeansException {
    for (String customerKey : customerKeys) {
      // register one JndiObjectFactoryBean definition per customer,
      // following the naming convention dataSource_<customerKey>
      String dataSourceName = "dataSource_" + customerKey;
      BeanDefinitionBuilder definitionBuilder =
          BeanDefinitionBuilder.genericBeanDefinition(JndiObjectFactoryBean.class);
      definitionBuilder.addPropertyValue("jndiName", "jdbc/" + dataSourceName);
      registry.registerBeanDefinition(dataSourceName, definitionBuilder.getBeanDefinition());
    }
  }

  @Override
  public void postProcessBeanFactory(ConfigurableListableBeanFactory beanFactory) throws BeansException {
    // we only add new bean definitions; we do not post-process the existing ones
  }

  private static List<String> parseCustomerKeys(String rawCustomerKeys) {
    if (StringUtils.isEmpty(rawCustomerKeys)) {
      throw new IllegalArgumentException("Property 'customerKeys' is undefined.");
    }
    List<String> keys = new ArrayList<String>();
    for (String key : StringUtils.split(rawCustomerKeys, ",")) {
      keys.add(key.trim()); // tolerate whitespace after the commas
    }
    return Collections.unmodifiableList(keys);
  }
}

4.3) Get it running!

Spring again offers several possibilities for actually adding a custom BeanFactoryPostProcessor / BeanDefinitionRegistryPostProcessor to the set of processors that get executed:

  • @Component would generally be possible for beans implementing BeanFactoryPostProcessor, but we need the Spring Environment inside to access the customerKeys, and at that early stage no autowiring takes place.
  • Create an XML bean definition for it. (Old school; we don't want to maintain XML files any longer ;).)
  • JavaConfig is not possible, because the annotation scanners behind it are BeanDefinitionRegistryPostProcessors themselves: see SPR-7868.
  • Add the BDRPP programmatically to the ApplicationContext during initialization:
import org.springframework.context.ApplicationContextInitializer;
import org.springframework.context.ConfigurableApplicationContext;
import org.springframework.core.env.ConfigurableEnvironment;

public class MyApplicationContextInitializer implements ApplicationContextInitializer<ConfigurableApplicationContext> {

  @Override
  public void initialize(ConfigurableApplicationContext applicationContext) {
    // the Environment is already available at this point, so we can hand it over
    ConfigurableEnvironment springEnvironment = applicationContext.getEnvironment();
    applicationContext.addBeanFactoryPostProcessor(new DataSourcesBeanFactoryPostProcessor(springEnvironment));
  }
}

In the likely case that you are developing a web application, please don't forget to define the initializer in your web.xml:

<context-param>
  <param-name>contextInitializerClasses</param-name>
  <param-value>your.package.MyApplicationContextInitializer</param-value>
</context-param>

If you are lucky and OPS offers you a Servlet 3.0+ environment, the programmatic approach works just as well – a sketch follows below.
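
For completeness, a sketch of that Servlet 3.0+ variant, assuming Spring's WebApplicationInitializer is on the classpath; the class name is made up:

import javax.servlet.ServletContext;
import javax.servlet.ServletException;

import org.springframework.web.WebApplicationInitializer;

public class MyWebAppInitializer implements WebApplicationInitializer {

  @Override
  public void onStartup(ServletContext servletContext) throws ServletException {
    // the programmatic equivalent of the <context-param> entry above
    servletContext.setInitParameter("contextInitializerClasses",
        "your.package.MyApplicationContextInitializer");
    // ... register ContextLoaderListener / DispatcherServlet as usual
  }
}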

Looks useful? Thanks :).

4.4) Plugging it together

The last thing missing is the initialization of the AbstractRoutingDataSource, and there it gets a bit bumpy. What we have now is a set of factory beans which do a JNDI lookup to add those data sources to our context. Later we want to bundle all those beans into the routing data source, but then we won't be able to know which data source belongs to which customer. Although the customerKey is part of the bean name and the JNDI path by convention, neither is reachable when injecting by the DataSource interface. Hmmm. Well, we can rely on our bean name creation strategy once more, I think. Based on that, this is our JavaConfig for creating the AbstractRoutingDataSource:

import java.util.HashMap;
import java.util.List;
import java.util.Map;

import javax.sql.DataSource;

import org.springframework.beans.BeansException;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;

@Configuration
public class RepositoryConfig {

  private List<String> customerKeys;
  @Autowired private ApplicationContext applicationContext;

  @Value("${customerKeys}")
  public void setCustomerKeys(String rawCustomerKeys) {
    // parseCustomerKeys() extracted to a Util class; note that ${...} resolution
    // requires a PropertySourcesPlaceholderConfigurer bean to be present
    this.customerKeys = Util.parseCustomerKeys(rawCustomerKeys);
  }

  @Bean
  public AbstractRoutingDataSource routingDataSource() {
    AbstractRoutingDataSource routingDataSource = new CustomerRoutingDataSource();
    Map<Object, Object> customerIndexMap = createCustomerMap();
    routingDataSource.setTargetDataSources(customerIndexMap);
    return routingDataSource;
  }

  private Map<Object, Object> createCustomerMap() {
    // setTargetDataSources() expects a Map<Object, Object>
    Map<Object, Object> result = new HashMap<Object, Object>();

    for (String customerKey : customerKeys) {
      // could also be extracted to the Util class to centralize the naming contract
      String beanName = "dataSource_" + customerKey;
      DataSource dataSource = lookupBean(beanName, DataSource.class);
      result.put(customerKey, dataSource);
    }

    return result;
  }

  private <T> T lookupBean(String beanName, Class<T> clazz) {
    try {
      // getBean() never returns null; it throws if the bean is missing or mistyped
      return applicationContext.getBean(beanName, clazz);
    } catch (BeansException e) {
      throw new MyStartupException("Mandatory Spring bean '" + beanName + "' missing! Aborting");
    }
  }
}

The presented approach can generally be used to dynamically define a set of beans of the same type that should take part in the whole Spring bean life cycle.
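
To close the loop at runtime: somewhere, the customer key of the logged-in user has to be published to the routing data source before a repository call happens. A hedged sketch, reusing the hypothetical static helpers from the CustomerRoutingDataSource sketch above and a made-up session attribute name:

import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

public class CustomerKeyFilter implements Filter {

  @Override
  public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
      throws IOException, ServletException {
    try {
      // assumption: the login mechanism stored the key as a session attribute;
      // with Spring Security it would rather be derived from the principal
      String customerKey = (String) ((HttpServletRequest) request).getSession().getAttribute("customerKey");
      CustomerRoutingDataSource.setCurrentCustomerKey(customerKey);
      chain.doFilter(request, response);
    } finally {
      // avoid leaking the key to the next request served by this pooled thread
      CustomerRoutingDataSource.clearCurrentCustomerKey();
    }
  }

  @Override
  public void init(FilterConfig filterConfig) throws ServletException {
  }

  @Override
  public void destroy() {
  }
}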

5. Conclusion, drawbacks and what's next

Having reached 1600 words, I will take a break. But I won't leave without summarizing the open topics so far:

  • Why do we want to use Flyway for that?
  • The current solution lacks the possibility to use Hibernate 2nd level caching. Will there be a solution?
  • How can I efficiently use the Spring Environment for doing some config magic?
  • Is there a possibility to inject or create whole maps of beans which would provide access to the beanName and probably be directly injected into the AbstractRoutingDataSource?

I hope there will be time to discuss it :)

So long! Have fun!
