In this article I will explain how to extend a simple proxy pattern to fulfill a typical developer task: replacing one system with another while retaining control during the go-live phase.

At some point, every developer faces the task of migrating a working system to a new one, usually with the demand that the new system fulfill all the requirements of the old one. In this blog post, I would like to describe a way of handling this situation by introducing a new component based on a well-known design pattern.

Assume you've been given a project to migrate a currently working solution to a completely new technology stack while keeping the contract for the entry point. Let's say the problem solved by the old solution was the aggregation of a large number of data points into logical objects that were later queried via a REST endpoint. We know that the REST endpoints won't change and that the new solution should produce better aggregation results in terms of business logic.

When the new project is completed and ready to deploy for customers, we are faced with the question of how to handle the change. We have an old solution running in production and a new one launched in the background that we assume performs better. We know that we cannot write tests for every case defined in the system, and we don't know how good or bad the legacy system actually is.

Changing the legacy system

How can we approach solving this kind of problem?  

We can set aside a maintenance window and replace one system with the other; however, we then can't see whether the new system performs as expected (as we cannot compare its results with the old one's), and if something goes wrong with the new one, we need to be prepared for a fast rollback to restore the previous solution.

Legacy system and new system

Another option could be to introduce a load balancer such as NGINX in front of those services. This lets us control the traffic by, for example, routing only 10% of it to the new system and observing how it behaves. It also removes the need for a fast rollback if the new system crashes, as we can easily move all traffic back to the old solution. However, we still can't compare both systems on live traffic and full data, because the new system only sees 10% of the requests while the old one handles 90%.

Old system and new system traffic

As described above, none of these solutions truly fulfill our requirements.  

To achieve the expected result, we decided to build our own proxy component that behaves like a normal proxy but also includes some logic that helps when comparing two similar systems.

Proxy Patterns

Before we go deeper into the details, let me briefly describe what the proxy pattern is. As the Gang of Four stated, the job of a proxy is to "provide a surrogate or placeholder for another object to control access to it."1 The proxy also exposes the same interface as the object behind it, so it is invisible to external callers and requires no changes in client code.

UML diagram of Proxy Pattern

We can define a proxy with a simple UML diagram. 
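In code, the essence of the pattern is simply a class that implements the same interface as the real subject and delegates to it. The names below are purely illustrative:

interface DataService {
    fun fetch(id: String): String
}

// The proxy looks exactly like the real service to its callers and can add
// access control, logging, comparison, or any other logic around the delegation.
class DataServiceProxy(private val target: DataService) : DataService {
    override fun fetch(id: String): String {
        return target.fetch(id)
    }
}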

Now that we know what a proxy is, we can define a list of requirements for it to solve the problems we had during the migration period: 

  • Allow a request to be passed to both systems simultaneously
  • Allow control over which system's response is returned to the client
  • Allow comparison of the two systems' results, to see the differences between them
  • Be light and fast, so that the client won't notice a difference in request time
  • Provide a rich set of metrics that allows operations to monitor its behavior

Solution 

To handle the first two bullet points, we decided to use the strategy pattern to manage the flow of data. To implement this pattern, we need to define an interface which will then be implemented by concrete classes. The interface contains only the signature of a single function:

(oldSystemResponse: T, newSystemResponse: T) -> T
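A minimal Kotlin sketch of such an interface could look as follows. I am assuming the Reactor Mono type used by the query functions later in the article and the QueryStrategy name that the factory further below refers to; operating on the whole Mono rather than on the plain responses lets each strategy decide which system actually gets called.

import reactor.core.publisher.Mono

// Sketch of the strategy contract: decide which system(s) to call
// and whose response is handed back to the client.
interface QueryStrategy {
    fun <T> query(oldSystem: Mono<T>, newSystem: Mono<T>): Mono<T>
}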

Now that we know the definition of our strategy, we can decide which cases we want to support. As we have two systems working in parallel, the strategies implementing this interface can be:

  1. oldSystemOnly – which means we are querying only the legacy system and returning its result. 
  2. oldSystemSilent – which means we pass a request to both the legacy system and the new one, but we always return the result from the new system. 
  3. newSystemSilent – which means we pass the request to both the legacy system and the new one, but we always return the result from the old system. 
  4. newSystemOnly – where we are querying only the new system and returning its results. 

Implementing the first and last points is trivial, as each can be just a simple lambda expression:

(oldSystem, newSystem) -> oldSystem (the oldSystemOnly strategy)

or 

(oldSystem, newSystem) -> newSystem (the newSystemOnly strategy)

Because these strategies can be implemented using reactive programming libraries with cold streams, only the request to the selected system will actually be executed.
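Written against the QueryStrategy sketch from above (my assumption), these two pass-through strategies could look like this; since the Mono arguments are cold, the publisher that is never returned is also never subscribed to, so no request is sent to the other system.

import reactor.core.publisher.Mono

// Queries only the legacy system; newSystem is never subscribed to, so no request is sent to it.
object OldSystemOnly : QueryStrategy {
    override fun <T> query(oldSystem: Mono<T>, newSystem: Mono<T>): Mono<T> = oldSystem
}

// Queries only the new system; oldSystem is never subscribed to.
object NewSystemOnly : QueryStrategy {
    override fun <T> query(oldSystem: Mono<T>, newSystem: Mono<T>): Mono<T> = newSystem
}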

Implementing the second and third points can also be very simple but, as we have further requirements, we can fulfill another one of them here as well: "Allow comparison of the two systems' results, to see the differences between them".

We will focus on newSystemSilent, because oldSystemSilent is analogous.

By a silent strategy we mean that we execute exactly the same query on both systems but return only one of the results. In the newSystemSilent case this lets us test both systems with real traffic and real-life queries, yet in a safe manner, as we keep responding with the old solution.

A sample implementation of this strategy could look like this: 

import reactor.core.publisher.Mono

fun <T> query(oldSystem: Mono<T>, newSystem: Mono<T>): Mono<T> {
    return oldSystem.zipWith(newSystem)
        .flatMap {
            // TODO: Logic for managing the two results goes here.
            oldSystem
        }
}

By executing the zipWith function on oldSystem, we subscribe to the results of both systems, so that inside the flatMap we can add some additional business logic. Note that we always return oldSystem, so no matter what result is received from the new solution, we are perfectly safe here: no wrong responses are transferred to the client.

Returning to the flatMap function, it is the perfect place for fulfilling the third requirement, namely result comparison. As the comparison is completely independent of the querying, we can decide here how we want to handle it: move it to a separate thread so it does not block the client request (the comparison may be time-consuming), or run it synchronously inside flatMap. Both options have drawbacks, but we don't need to focus on them here. What matters is that we can make our proxy a bit smart by adding a simple (or complex, depending on the model) comparison module that provides metrics on how our systems behave. It should also tell us whether the returned results are equal or whether our new solution greatly increases the positive result set.
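As a sketch of the first option, the comparison could be pushed onto a worker thread with Reactor's Schedulers so that it adds no latency to the client response; the Comparison.compare helper is the one defined just below, and the rest follows the newSystemSilent flow described above.

import reactor.core.publisher.Mono
import reactor.core.scheduler.Schedulers

fun <T> query(oldSystem: Mono<T>, newSystem: Mono<T>): Mono<T> {
    return oldSystem.zipWith(newSystem)
        .flatMap { results ->
            // Fire-and-forget: run the (possibly slow) comparison on a worker thread
            // so it does not delay the client request.
            Mono.fromRunnable<Unit> { Comparison.compare(results) }
                .subscribeOn(Schedulers.boundedElastic())
                .subscribe()
            // Still answer with the old system's response.
            oldSystem
        }
}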

How the comparison function is written depends entirely on your model but, from the perspective of the query function, it can have a very simple signature:

import reactor.core.publisher.Mono
import reactor.util.function.Tuple2

class Comparison {
    companion object {
        fun <T> compare(results: Tuple2<T, T>) {
            TODO("Compare....")
        }
    }
}

fun <T> query(oldSystem: Mono<T>, newSystem: Mono<T>): Mono<T> {
    return oldSystem.zipWith(newSystem)
        .flatMap {
            Comparison.compare(it)
            oldSystem
        }
}

With this approach, having a smart proxy gives us a lot of control over the flow and lets us introduce any additional logic. For example, we can easily write recovery strategies based on business rules, enrich requests and responses with additional parameters or headers, and so on.
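For illustration, a recovery-style strategy could be sketched like this against the assumed QueryStrategy interface; the name and the exact fallback rules are hypothetical, not something the article prescribes.

import reactor.core.publisher.Mono

// Hypothetical recovery strategy: answer from the new system, but fall back to the
// legacy one if the new system errors out or completes without a result.
object NewSystemWithFallback : QueryStrategy {
    override fun <T> query(oldSystem: Mono<T>, newSystem: Mono<T>): Mono<T> =
        newSystem
            .onErrorResume { oldSystem }
            .switchIfEmpty(oldSystem)
}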

Flow control  

Now that we know how to make our proxy smart, the next task is to make it more user friendly. As this component will likely be used by testers, developers, and PMs, it would be nice to be able to steer the flow dynamically. If one of these stakeholders would like to get results from a particular system, or would like to force the comparison, our proxy needs to support some kind of feature switch that is easy for end users. To map such a switch to the right strategy inside the proxy, we can provide a factory that is smart enough to handle our needs.

By default, the strategy should come from some configuration system (a properties file, environment variables, etc.). In most of those setups, however, changing a property value requires a restart of the proxy, and this is something we want to avoid when testing the app.

How can we be more flexible? As we are exposing a REST API at the front, we can take advantage of it and base our factory on it, by extending the request headers with a new one that tells us which strategy to use for this request. We could call this header, for example, "X-Proxy-Toggle", and give it a value that maps to a strategy: new-only, new-silent, old-only, or old-silent.

If you are using the Spring Framework, it may look like this:

 
@RequestMapping(method = [RequestMethod.GET])
fun getJson(@RequestHeader(name = "X-Proxy-Toggle", required = false) toggle: String?): Mono<ResponseEntity<ByteArray>> {
    val strategy = queryStrategyFactory.create(toggle)
    TODO("Use the strategy...")
}
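The TODO above could be filled in roughly as follows. ProxyHandler, oldSystemClient, and newSystemClient are hypothetical names (e.g. thin wrappers around WebClient calls); only the factory and the strategy interface come from the article.

import org.springframework.http.ResponseEntity
import reactor.core.publisher.Mono

// Hypothetical wiring of the controller body: the factory picks the strategy and the
// strategy decides which of the two cold Monos is actually subscribed to.
class ProxyHandler(
    private val queryStrategyFactory: QueryStrategyFactory,
    private val oldSystemClient: () -> Mono<ResponseEntity<ByteArray>>,
    private val newSystemClient: () -> Mono<ResponseEntity<ByteArray>>
) {
    fun handle(toggle: String?): Mono<ResponseEntity<ByteArray>> {
        val strategy = queryStrategyFactory.create(toggle)
        return strategy.query(oldSystemClient(), newSystemClient())
    }
}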

And the factory:

 
@Service
@EnableConfigurationProperties(ApplicationProperties::class)
class QueryStrategyFactory(@Autowired private val appConfig: ApplicationProperties) {

    // Populated with the four strategies under their toggle names:
    // "old-only", "old-silent", "new-silent", "new-only".
    private val mapping = HashMap<String, QueryStrategy>()

    private fun lookup(constant: String?): QueryStrategy? =
        constant?.let { mapping[it.lowercase()] }

    // Falls back to the strategy configured in the application properties.
    private fun create(): QueryStrategy = create(appConfig.queryStrategy)

    fun create(strategyName: String?): QueryStrategy =
        lookup(strategyName) ?: create()
}
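The ApplicationProperties class is not shown in the article; a minimal sketch, assuming a single property (e.g. proxy.query-strategy=new-silent) that names the default strategy, could be:

import org.springframework.boot.context.properties.ConfigurationProperties

// Hypothetical properties holder backing the factory's default strategy.
@ConfigurationProperties(prefix = "proxy")
class ApplicationProperties {
    // Bound from proxy.query-strategy; the value must be one of the mapped toggle names.
    var queryStrategy: String = "new-silent"
}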

Now we can easily steer our proxy with each request. This additional flexibility is a big help during tests and system debugging.  

Putting it all together, we get a nice, light, and smart proxy app with great flexibility. Thanks to the design decisions we made, such as splitting the logic into several classes and applying additional patterns (factory, strategy), the proxy can now easily be extended with more functionality. Building a monitoring and metrics system on top of it is also very easy, as the logical components are separated and do not interfere with each other. I would recommend measuring the following things (a sketch of how to record them follows the list):

  • old system response time 
  • new system response time 
  • total proxy response time 
  • average/max comparison time 
  • number of unequal comparisons 
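One way to record these, sketched with Micrometer; the metric names and the ProxyMetrics helper are my own choices, not from the article.

import io.micrometer.core.instrument.MeterRegistry
import reactor.core.publisher.Mono
import java.time.Duration

// Hypothetical metrics helper; wrap each system call with timed(...) and call
// countMismatch() from the comparison module whenever the results differ.
class ProxyMetrics(private val registry: MeterRegistry) {

    // Records how long a single call (old system, new system, or total proxy) took.
    fun <T> timed(name: String, call: Mono<T>): Mono<T> =
        call.elapsed().map {
            // it.t1 = elapsed milliseconds, it.t2 = the actual response
            registry.timer("proxy.$name.response-time").record(Duration.ofMillis(it.t1))
            it.t2
        }

    // Counts responses where the two systems disagreed.
    fun countMismatch() = registry.counter("proxy.comparison.mismatch").increment()
}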

Summary  

We've seen how to extend a simple proxy pattern to fulfill a typical developer task: replacing one system with another while retaining control during the go-live phase. In my projects, these proxy apps stay live for some time after going to production. This is very useful when the complexity of the system you are replacing may cause trouble, and when it is hard to decide beyond doubt that the new system is working better. By making these comparisons and looking into monitoring systems like Grafana, you gain the confidence to take the next steps with your software.

I’d also like to share my lessons learned about using this pattern.  

First of all, it is important to see what's going on inside your proxy. These components mostly just forward dozens of requests, so you are blind without proper logging and metrics. You need a good access log with data such as contacted endpoints, query parameters, headers, and response codes, both from the systems behind the proxy and from the proxy itself.

Secondly, when enriching the client's request with your own headers, it's good practice to add headers like Via, to indicate who acted as a proxy, and X-Forwarded-* to preserve the identity of the original caller. In the response, add some information that helps identify which system the proxy actually contacted.
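With Spring WebFlux, such headers could be attached to the outgoing calls with a WebClient filter along these lines; the header values and the callerHost parameter are illustrative.

import org.springframework.web.reactive.function.client.ClientRequest
import org.springframework.web.reactive.function.client.ExchangeFilterFunction
import reactor.core.publisher.Mono

// Hypothetical filter that tags every request the proxy forwards downstream.
fun proxyHeadersFilter(callerHost: String): ExchangeFilterFunction =
    ExchangeFilterFunction.ofRequestProcessor { request ->
        Mono.just(
            ClientRequest.from(request)
                .header("Via", "1.1 migration-proxy")   // who acted as the proxy
                .header("X-Forwarded-For", callerHost)  // the original caller
                .build()
        )
    }

Such a filter would then be registered via WebClient.builder().filter(...) on the client used to call both systems.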

Thirdly, try to block as little as possible. A proxy is a component that needs to be fast. To achieve that, use one of the reactive libraries, such as Project Reactor or RxJava, to handle asynchronous operations and scale with your load.

Lastly, remember that the proxy in a migration project is not a permanent thing. It may be heavily used at the beginning, but it is designed to be removed after a certain period of time. Bear this in mind and try not to build workarounds for your system into it. It's always better to fix your new system than to introduce strange hacks in the proxy itself because, when the time comes to remove the proxy, the bugs in the system beneath will remain.