Circuit breaker design pattern
Circuit breaker is a design pattern used in software development. It detects failures and encapsulates the logic that prevents a failure from constantly recurring during maintenance, temporary external system failure, or unexpected system difficulties. The circuit breaker pattern prevents cascading failures, particularly in distributed systems.[1]
According to Marc Brooker, circuit breakers can misinterpret a partial failure as total system failure and inadvertently bring down the entire system. In particular, sharded systems and cell-based architectures are vulnerable to this issue. A workaround is that the server indicates to the client which specific part is overloaded and the client uses a corresponding mini circuit breaker. However, this workaround can be complex and expensive.[2][3]
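The per-part workaround can be sketched as one small failure counter per shard. This is a minimal illustration in Python, not Brooker's design: the class name `ShardBreakers`, the shard identifiers, and the threshold are assumptions, and timeout-based recovery is left out for brevity.

```python
from collections import defaultdict


class ShardBreakers:
    """Hypothetical per-shard breakers: one bad shard does not block the others."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.failures = defaultdict(int)  # shard_id -> consecutive overload reports

    def allow(self, shard_id):
        # Requests to a shard are allowed until it reaches the threshold.
        return self.failures[shard_id] < self.failure_threshold

    def record_overloaded(self, shard_id):
        # Called when the server's response names shard_id as overloaded.
        self.failures[shard_id] += 1

    def record_success(self, shard_id):
        # A successful call resets the counter for that shard only.
        self.failures[shard_id] = 0
```

The client has to track state for every shard it talks to, which is part of why this approach can become complex and expensive.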
The circuit breaker pattern should be used together with other patterns such as retry, fallback, and timeout, which makes the system more fault tolerant.[4]
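One way these patterns might fit together is sketched below in Python. The function name `call_with_fallback` and its parameters are hypothetical. The sketch assumes that `operation` enforces its own timeout by raising an exception, and that `breaker_is_open` reports the state of a circuit breaker maintained elsewhere.

```python
def call_with_fallback(breaker_is_open, operation, fallback, retries=2):
    """Skip the call when the breaker is open; otherwise retry, then fall back."""
    if breaker_is_open():
        # Fail fast: do not touch the failing service at all.
        return fallback()
    for _ in range(retries + 1):  # one initial attempt plus `retries` retries
        try:
            return operation()
        except Exception:
            continue  # includes timeouts raised by the operation itself
    return fallback()
```

A fuller version would also report each failure to the breaker so that repeated failures eventually open the circuit.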
Common uses
Assume that an application connects to a database 100 times per second and the database fails. The application designer does not want the same error to recur constantly. They also want to handle the error quickly and gracefully, without waiting for a TCP connection timeout.
Generally, a circuit breaker can be used to check the availability of an external service, such as a database server or a web service used by the application.
A circuit breaker detects failures and prevents the application from attempting an action that is doomed to fail until it is safe to retry.
Implementation
Implementations of the circuit breaker design pattern need to retain the state of the connection over a series of requests. They must move the logic that detects failures out of the actual requests. Therefore, the state machine within the circuit breaker needs to operate, in some sense, concurrently with the requests passing through it. One way to achieve this is asynchronously.
In a multi-node (clustered) server, the state of the upstream service will need to be reflected across all the nodes in the cluster. Therefore, implementations may need to use a persistent storage layer, e.g. a network cache such as Memcached or Redis, or local cache (disk or memory based) to record the availability of what is, to the application, an external service.
The circuit breaker records the state of the external service at a given interval.
Before the application uses the external service, it queries the storage layer to retrieve the current state.
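The record-then-query flow can be sketched as follows in Python. Everything here is illustrative: `InMemoryStatusStore` is a stand-in for a shared cache such as Memcached or Redis, and the function names, the `status:` key prefix, and the "up"/"down" values are assumptions.

```python
class InMemoryStatusStore:
    """Stand-in for a shared cache; a real cluster would use a network cache."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


def record_status(store, service_name, probe):
    """Run at a set interval: probe the service and record 'up' or 'down'."""
    try:
        probe()
        status = "up"
    except Exception:
        status = "down"
    store.set(f"status:{service_name}", status)
    return status


def call_if_available(store, service_name, operation):
    """Query the store first; skip the call when the service is marked down."""
    if store.get(f"status:{service_name}", "up") == "down":
        raise RuntimeError(f"{service_name} is marked unavailable")
    return operation()
```

In this sketch a service with no recorded status is treated as available, which is a design choice rather than a requirement of the pattern.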
Performance implication
Although the benefits generally outweigh the costs, implementing a circuit breaker increases the storage space, application complexity, and computational cost of the executing application. This is because it adds code to the execution path to check the state of the circuit. This can be seen in the PHP example below, where checking APCu for the database status costs a few extra cycles. Running the circuit breaker code itself also consumes resources on the system where it runs, leaving less processing capacity for the application's actual work.
The size of this overhead depends on the storage layer used and on the resources generally available. The largest factors are the type of cache: disk-based versus memory-based, and local versus network.
Different states of circuit breaker
- Closed
- Open
- Half-open
Closed state
When everything is normal, the circuit breaker remains closed and all requests pass through to the service. If the number of failures exceeds the threshold, the circuit breaker trips and goes into the open state.
Open state
In this state, the circuit breaker returns an error immediately without invoking the service. The circuit breaker moves into the half-open state after a timeout period elapses. Usually, the timeout is specified in an accompanying monitoring system.
Half-open state
In this state, the circuit breaker allows a limited number of requests to pass through to the service and invoke the operation. If the requests are successful, the circuit breaker goes to the closed state. However, if the requests continue to fail, it goes back to the open state.
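The three states can be sketched as a small state machine. The following Python is a minimal illustration under several assumptions rather than a reference implementation: the names `CircuitBreaker` and `CircuitOpenError` are hypothetical, the breaker trips when consecutive failures reach the threshold, the half-open state admits a single trial request, and the code is not thread-safe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failures before tripping
        self.reset_timeout = reset_timeout          # seconds to stay open
        self.clock = clock                          # injectable for testing
        self.state = "closed"
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: return an error immediately without invoking the service.
                raise CircuitOpenError("circuit is open; failing fast")
            # Timeout elapsed: let one trial request through.
            self.state = "half-open"
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self.failures += 1
        # A failed trial request, or too many failures, opens the circuit.
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()

    def _on_success(self):
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.state = "closed"
```

A caller would wrap each request, for example `breaker.call(fetch_user, user_id)` where `fetch_user` is a placeholder, and handle `CircuitOpenError` by returning a fallback response.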
Example implementation
PHP
The following is a sample implementation in PHP. The proof of concept stores the status of a MySQL server in a shared memory cache (APCu, the APC User Cache).
Check
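The following script could be run at a set interval through crontab. Because APCu memory is not shared between the PHP command line and the web server, the cron job would need to request the script over HTTP for the application to see the recorded status.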
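// Disable mysqli exceptions so that a failed connection sets connect_error instead.
mysqli_report(MYSQLI_REPORT_OFF);
$mysqli = @new mysqli("localhost", "username", "password");
if ($mysqli->connect_error) {
    // apcu_store() overwrites an existing key; apcu_add() would leave a stale status in place.
    apcu_store("dbStatus", "down");
} else {
    apcu_store("dbStatus", "up");
    $mysqli->close();
}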
Usage in an application
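// Fail fast if the periodic check has marked the database as down.
if (apcu_fetch("dbStatus") === "down") {
    echo "The database server is currently not available. Please try again in a minute.";
    exit;
}
$mysqli = new mysqli("localhost", "username", "password", "database");
// "table" is a reserved word in MySQL, so the placeholder name is quoted with backticks.
$result = $mysqli->query("SELECT * FROM `table`");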
References
1. Machine Learning in Microservices: Productionizing Microservices Architecture for Machine Learning Solutions. Packt Publishing. 2023. ISBN 9781804612149.
2. Understanding Distributed Systems. ISBN 9781838430214.
3. "Will circuit breakers solve my problems?".
4. Kubernetes Native Microservices with Quarkus and MicroProfile. Manning. 2022. ISBN 9781638357155.
External links
- Example of PHP implementation with diagrams
- Example of Retry Pattern with Polly using C#
- Example of C# implementation from Anders Lybeckers using Polly
- Polly NuGet package
- Example of C# implementation from Alexandr Nikitin
- Implementation in Python
- Stability patterns applied in a RESTful architecture
- Martin Fowler Bliki