- Mastering NServiceBus and Persistence
- Rich Helton
- 2021-08-05 18:06:38
Introduction to SOA
Service Oriented Architecture (SOA) is a very important architectural concept (http://en.wikipedia.org/wiki/Service-oriented_architecture). To understand what it brings to the table, we start with the four tenets of services, also known as the Principles of Service Oriented Design (for more details refer to http://msdn.microsoft.com/en-us/library/bb972954.aspx). They are autonomous, boundaries, share schema and contract (not class), and compatibility.
- Autonomous: Services are autonomous; this means that each individual service takes care of its own self-contained life cycle independent of other services, and changing a particular service will not have any side effects on other services.
- Boundaries: Boundaries to services are explicit. There are distinct entry and exit points for messaging; it is well defined where these points are in the service.
- Share schema and contract: Services share schema and contract, but not their classes. This means that the internals of the services are not exposed. Again, the messaging interface is defined, but the internals of what is going on are not exposed across the platform. This adds a layer of abstraction to services that define a business requirement, say an order service, without having to go into every detail of the business.
- Compatibility: A service's compatibility is based on its policy. The policy defines the nonfunctional requirements of what the service must conform to while it is being produced. For example, what is the level of encryption, maintenance, and effort required? For instance, in an order service, what data needs to be saved to the disk, what data needs to be encrypted, and what is the level of fault tolerance of the service?
A simple example comes from ordering websites that need to send payments to third-party servers for processing. Consider a pizza-ordering site: a number of issues may occur at the time of credit card processing, including insufficient funds as well as network and connectivity issues. If SOA or an ESB is not used, the customer may be asked not to refresh the page so that the payment request reaches the third-party processing server, and the customer may even receive a network error. When an error is received, the customer is asked to retry.
There are many major ordering websites that function in this way today. For a customer, one concern is the integrity of how such a website handles orders, since it requires customer validation and intervention to process payments. Even ensuring that a page does not refresh relies on the customer, which makes the site less appealing than those that do not require customer intervention for issues the customer does not need to be made aware of.
Instead, the responsibility to ensure the funds are processed should be on the system rather than on the customer. Of course, in order for a website to take on the responsibility of firing off the message to an SOA, there has to be an SOA in place to take on the responsibility of processing the message for the payment.
While developing an SOA or ServiceBus system, many software architects consider starting from scratch. However, they soon realize that there are many unstated requirements that are expected to be incorporated. These requirements assume a specific behavior without anyone explicitly calling it out. A good design takes these non-business functional requirements into account.
Some examples of these requirements include second-level retries for when a credit card isn't processed the first time, storing the messages along the way, keeping track of the state of the services, integrating into other company systems, handling network errors, encrypting the credit card number, and enforcing the access control level that different users and systems may need.
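The retry requirement can be illustrated with a minimal sketch. It is written in plain Python as a language-neutral illustration and is not NServiceBus code; the names `process_with_retries`, `TransientError`, `charge_card`, and `error_queue` are hypothetical, and the delay defaults to zero only to keep the example fast.

```python
import time

class TransientError(Exception):
    """Raised for failures worth retrying, such as a network timeout."""

def process_with_retries(message, handler, error_queue, max_attempts=3, delay_seconds=0.0):
    """Try the handler up to max_attempts times; park the message if all fail."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(message)
        except TransientError as exc:
            last_error = exc
            if attempt < max_attempts:
                # Wait a little longer before each new attempt.
                time.sleep(delay_seconds * attempt)
    # Every attempt failed: keep the message and the reason for later inspection.
    error_queue.append(
        {"message": message, "error": str(last_error), "attempts": max_attempts}
    )
    return None

# Hypothetical payment handler that fails twice before succeeding.
calls = {"count": 0}

def charge_card(message):
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("payment gateway unreachable")
    return {"order_id": message["order_id"], "status": "paid"}

error_queue = []
result = process_with_retries({"order_id": 42, "amount": 18.50}, charge_card, error_queue)
```

The point of the sketch is that the message is never silently dropped: it either succeeds or ends up in an error queue together with the reason it failed.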
These requirements become complex quickly, as the following diagram implies. It may take years to resolve some of the issues but most of the time, the business allocates months rather than years to address them. In order to resolve these non-business functional requirements and to address the associated issues that may arise, it is best to study solutions that other architects have provided for similar situations.
For instance, use a ServiceBus product such as NServiceBus as a guide to performance-enhanced products with built-in message reliability and integrity.
Continuing with the order system for a pizza establishment, the website would process the order and hand off the message to ServiceBus to process the payment. Then, the system takes the ownership of the payment message instead of relying on the customer.
The messages need to accommodate the partner's systems. However, the bus handles data and queues internally and saves the state, messages, and objects if something goes wrong. This is important since payments affect the bottom line, and the company has a business need to keep track of its payments.
The hand-off of messaging allows a customer to continue to the next action or website page. The payment response is later processed as the system takes on the responsibility for the payment.
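As a rough sketch of this hand-off, the website can enqueue the payment and return immediately, while a separate step processes the queue later. This is plain Python with an in-memory queue standing in for a durable one; `place_order`, `process_next_payment`, and `order_status` are hypothetical names, not part of any product.

```python
from collections import deque

payment_queue = deque()   # stand-in for a durable message queue
order_status = {}         # stand-in for persisted order state

def place_order(order_id, amount):
    """Website side: record the order, enqueue the payment, return at once."""
    order_status[order_id] = "payment pending"
    payment_queue.append({"order_id": order_id, "amount": amount})
    return f"Order {order_id} received"

def process_next_payment(charge):
    """Bus side: take one message off the queue and process it."""
    if not payment_queue:
        return False
    message = payment_queue.popleft()
    order_status[message["order_id"]] = "paid" if charge(message) else "payment failed"
    return True

reply = place_order(42, 18.50)           # the customer sees this immediately
status_before = order_status[42]
process_next_payment(lambda m: True)     # happens later, out of the customer's sight
status_after = order_status[42]
```

The customer's request finishes as soon as the message is queued; the outcome of the payment is recorded later by the system rather than awaited in the browser.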
The messages are sent between services as autonomous tasks, and the messages need to be made durable, scalable, reliable, secure, transactional, and capable of being distributed among different systems. This backbone, the pieces as a whole, is by definition an Enterprise Service Bus (ESB). An ESB is simply a common bus across the enterprise with the preceding characteristics (durable, scalable, reliable, secure, transactional, and distributable).
A saga is a mechanism that evolved in ESBs to save the state of messages. A saga also keeps track of the originating message's endpoints so that it can respond to the originator with changes to the message.
Just as an accountant must keep track of receivable payments and orders in a company, so must a company's systems—record keeping is of paramount concern. Once a user creates an account, they become a customer; as a customer, they assume that the company protects their information, unless told otherwise.
Throughout history, many companies that are no longer in existence neither protected users' data nor adequately kept track of payments and orders. Security and sales are an overall concern in the industry. A company's main goal is to make more money than it spends, which includes keeping track of the company's data. Losing sales and data can be expensive. Reporting where data is and its current state (be it a sale or a customer's data) is important. Of course, it is better to have a system that never has an issue. If a system does have an issue (such as losing data or funds), however, it is best to know the magnitude of the issue and as much information about it as possible. Therefore, when building payment engines, it is not uncommon to require daily reports of dollar totals, the number of successes or failures, reasons for failures, root causes of failures, and more.
In order to provide such reports, there needs to be an end-to-end tracking of messages. A message is nothing more than a piece of data that travels through a system as the system completes a transaction.
A transaction is a completed unit of work, such as completing a payment. A message can be saved after a transaction is completed in order to keep a record and be able to provide feedback on what happened through the workflow.
A workflow is the end-to-end processing of transactions as the message moves through the system to complete its life cycle. During a message's life cycle, some data may be mutated. An example is payment in part or additional fees. The system uses the message's metadata to determine how the message moves through the workflow.
Metadata is information about the message itself, such as a message ID or header information. Header information is used to keep information that may show, for instance, the originating system and destination.
A saga uses a message ID to save and look up the state of the message at a given point. It also records the originator of the message so that it can respond to the originator with the status of the message.
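The relationship between a message, its metadata, and the saved state can be sketched as follows. This is an illustrative Python model under stated assumptions, not the NServiceBus saga API: `Message`, `SagaStore`, and the header names `originator` and `destination` are hypothetical, and a dictionary stands in for the database.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class Message:
    body: dict
    # Metadata: a unique ID plus headers describing where the message came from.
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    headers: dict = field(default_factory=dict)

class SagaStore:
    """In-memory stand-in for the database a saga would persist to."""

    def __init__(self):
        self._states = {}

    def save(self, message, status):
        # Keyed by message ID so the state can be found when the message returns.
        self._states[message.message_id] = {
            "originator": message.headers.get("originator"),
            "status": status,
        }

    def lookup(self, message_id):
        return self._states.get(message_id)

store = SagaStore()
payment = Message(
    body={"amount": 18.50},
    headers={"originator": "website", "destination": "payments"},
)
store.save(payment, "received")
store.save(payment, "charged")
state = store.lookup(payment.message_id)
```

Looking up `payment.message_id` returns the latest saved status together with the originator, which is the information needed to report back to whoever sent the message.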
All of the previous work is performed in order to do reporting. Rather than requiring a solution built from the ground up, NServiceBus is designed explicitly to simplify and reduce the amount of work within a system. NServiceBus uses queuing, such as MSMQ, to pass messages to other services, and this includes error queues and audit queues.
For example, a simple report may send a daily message stating how many messages were sent to the error queue. Since messages can be created in XML, they could carry an error field that is easily parsed for error details. However, in no way does this replace logging.
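A minimal sketch of such a report, assuming the failed messages are XML documents that carry an `Error` element. The element names and the sample messages are hypothetical, and a Python list stands in for the error queue.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical XML messages as they might sit in an error queue.
error_queue = [
    "<Payment><OrderId>1</OrderId><Error>Insufficient funds</Error></Payment>",
    "<Payment><OrderId>2</OrderId><Error>Network timeout</Error></Payment>",
    "<Payment><OrderId>3</OrderId><Error>Insufficient funds</Error></Payment>",
]

def summarize_errors(raw_messages):
    """Count failed messages by the text of their Error element."""
    reasons = Counter()
    for raw in raw_messages:
        root = ET.fromstring(raw)
        reasons[root.findtext("Error", default="unknown")] += 1
    return {"total": len(raw_messages), "by_reason": dict(reasons)}

report = summarize_errors(error_queue)
```

The resulting dictionary holds the total number of failed messages and a count per failure reason, which is the kind of summary a daily operational message could contain.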
Products such as ServicePulse and other reporting mechanisms are used to assist in giving reports of the company's messages and data. This simple example could be expanded to send messages that contain payments above a threshold ($100 for instance) to one queue and under the threshold to a different queue. A report could be made daily based on timestamps. Since sagas are saved in databases before a message is completed, another report could be generated to report on all the payments over $100 that are not processed.
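The threshold routing and the follow-up report can be sketched like this. It is an illustration only: the queue names, the record fields, and the choice to send a payment of exactly $100 to the lower queue are assumptions, and lists stand in for real queues and saga storage.

```python
from decimal import Decimal

THRESHOLD = Decimal("100")  # the $100 figure from the example in the text

def route_payment(payment, queues):
    """Append the payment to the queue that matches its amount."""
    # Assumption: an amount exactly at the threshold goes to the lower queue.
    name = "large_payments" if payment["amount"] > THRESHOLD else "small_payments"
    queues[name].append(payment)
    return name

def unprocessed_large_payments(saga_records):
    """List order IDs over the threshold whose saga has not completed."""
    return [
        record["order_id"]
        for record in saga_records
        if record["amount"] > THRESHOLD and record["status"] != "processed"
    ]

queues = {"large_payments": [], "small_payments": []}
payments = [
    {"order_id": 1, "amount": Decimal("250.00")},
    {"order_id": 2, "amount": Decimal("18.50")},
    {"order_id": 3, "amount": Decimal("100.00")},
]
routed = [route_payment(p, queues) for p in payments]
```

`Decimal` is used instead of floating point so that comparisons against the currency threshold are exact.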
There are many ways to provide reports of messages, and because sagas and queues are used, the reports can drill down to very detailed information. It is obvious that there is extensive work to be done to create and implement a solution from scratch.
The need for metadata
During the course of building enterprise systems, there are functional and nonfunctional requirements. Functional requirements describe the business rules, and nonfunctional requirements are system characteristics with non-business rules. A simple nonfunctional requirement for a system is, for instance, that any SSN must be encrypted both at rest and in transit. Nonfunctional requirements go beyond security requirements; they include notifications, alerts, monitoring, logging, and other software qualities.
Nonfunctional requirements include many of the components that make up software quality: maintainability, security, code quality, reliability, integrity, and so on. Software quality is the ideal state for software to achieve; nonfunctional requirements form the specifics of how to achieve certain pieces.
The problem is that, while business requirements may be clearly spelled out, nonfunctional requirements may not be defined clearly or negotiated enough ahead of time. Therefore, tweaks are required along the way during the application life cycle, whether in development or maintenance. Metadata and precreated frameworks are the key players in this tweaking.
Consider an administration application that business analysts (BAs) and operational teams use to check the current state of an enterprise application. The application takes orders for aircraft maps and equipment, and customer service representatives (CSRs) have an interface for working with the customers and changing their data at will. Operations use an administration application to monitor the end-to-end throughput from a browser to a database and receive notifications if the levels are not achieved.
In the previous example, notifications and monitoring are nonfunctional requirements. BAs may use the administration application to handle special customer cases and monitor the number of orders, customers, and other reports. Generating the reports and the monitoring data draws on the business data and, in turn, produces metadata. This metadata is used to check the business data.
The following is a common 3-tier diagram for an application that gathers sales information:
The application has a frontend, a logic tier (middle tier), and a data tier. So far, this is a very common design for an application. The frontend is done in HTML or ASP.NET to control the presentation layer in a browser. The logic tier contains the workflow and messaging to handle business logic. Finally, the data tier is the storage to hold the information in a persisted repository—usually a database, mainframe, file I/O, or third-party server among other options.
When you look at this basic application, you'll realize that many endpoints are missing. These endpoints are used to monitor the application, to log the application, and perform other operational and administration tasks previously mentioned. Therefore, this model is incomplete since it does not address nonfunctional requirements.
Many software projects seem to need continuous enhancements because the developer keeps on adding components for security, operational reports, and other application characteristics that were not mentioned in the list of business requirements, even though they are components required to ensure the integrity of the application itself.
The need for persistence patterns
To paraphrase what's written in http://en.wikipedia.org/wiki/Service_oriented_architecture, the idea behind Service-oriented Architecture (SOA) is to decouple the end-to-end application functionality into discrete services.
So far, we have discussed sagas and some metadata of applications. There are other types of data that are saved to the data store, including business objects that contain the information used for business rules. Business rules run the business engines and are used to execute business logic.
In the ESB world, the bus transports (moves) objects that could be considered business objects; these business objects move through sagas. Other objects are the pieces of NSB that are used for notifications, timeouts, gateways for message distribution, second-level retries (SLRs), and even the endpoints to which the messages are sent.
The preceding objects make up much of the application metadata. Many of these are the configurations of the services that make up the distribution of the messages and the behavior of the transactions. The metadata that NSB keeps track of during a publish-subscribe message pattern is the subscription information that NSB requires to keep track of the publish-subscribe endpoints. Subscribers need the subscription information to keep track of the message types and queue endpoints in order to subscribe to the publishers. NSB uses the database to keep track of these types of endpoints.
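To make the subscription metadata concrete, here is a small sketch of a store that maps message types to subscriber queue endpoints and a publish step that consults it. This is not NServiceBus's subscription storage; `SubscriptionStore`, `publish`, and the endpoint names are hypothetical, and everything lives in memory rather than in a database.

```python
from collections import defaultdict

class SubscriptionStore:
    """In-memory stand-in for a persisted subscription table."""

    def __init__(self):
        # Maps a message type name to the set of subscribed queue endpoints.
        self._subscribers = defaultdict(set)

    def subscribe(self, message_type, endpoint):
        self._subscribers[message_type].add(endpoint)

    def unsubscribe(self, message_type, endpoint):
        self._subscribers[message_type].discard(endpoint)

    def endpoints_for(self, message_type):
        return sorted(self._subscribers.get(message_type, ()))

def publish(store, message_type, body, queues):
    """Deliver a copy of the message to every subscribed endpoint's queue."""
    for endpoint in store.endpoints_for(message_type):
        queues.setdefault(endpoint, []).append({"type": message_type, "body": body})

store = SubscriptionStore()
store.subscribe("OrderPlaced", "billing@server1")
store.subscribe("OrderPlaced", "shipping@server2")
queues = {}
publish(store, "OrderPlaced", {"order_id": 42}, queues)
```

Because the publisher only consults the store, subscribers can be added or removed without changing the publishing code, which is why this information has to be persisted somewhere durable.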
A small table of what is available can be seen at http://docs.particular.net/nservicebus/persistence-in-nservicebus.
The persistence configurations are just some of the typical ESB service configurations in NSB. There are many more configurations as NSB is meant to do so much more as a complete automation framework for the middleware. We will be discussing the various features and their associated configurations on the bus called IBus throughout this book.
Through this table, we know that the timeout for sagas, the saga object itself, the subscription information for publish-subscribe, the second-level retries, the fault management, notification, the gateway, and the distributor can be supported in MSMQ. Some of these pieces can be stored in the local memory of the host application, where they do not survive once the application stops running. Pieces can be saved in the RavenDB database, which is a NoSQL document-oriented database. Pieces can also be saved using the NHibernate database connector, which is an object-relational mapper (ORM) for various relational databases, such as SQL Server, MySQL, and Oracle. Some of the items have been referred to as metadata, which is data that describes the messages, as opposed to the messages themselves that will be part of the ESB workflow. The workflow itself makes up the business logic, while the messages themselves could be considered as business objects.
The benefit of NServiceBus is that it will handle the persisting of the object's messages and various pieces for the developer, as long as the developer has configured NSB correctly.
For instance, when using NHibernate, NSB will perform the mapping of the messages to the relational database, and the developer does not have to configure the NHibernate-mapping properties to map the objects to the relational database. This saves the developer a lot of time and effort. The messages themselves can also be persisted through various means by using the transport settings in the IBus configuration. The available message queues include MSMQ, Azure queues, SQL Server queues, ActiveMQ, and RabbitMQ.
Fallacies of distributed computing
Entire books are written on troubleshooting networks and servers. Many issues come up in operations and maintenance that were never conceived as potential issues: routers that work intermittently because a power cord is not plugged in all the way, patches that leave servers in a hung state, DNS errors from a domain controller, and so on. There is no guarantee that the networks or servers are secure, that they remain unchanged, or that all the routes remain reliable for the application that was built. Handing these uptime issues to someone else is what makes cloud computing so attractive. In many enterprise applications, such as those discussed here, uptime is critical, and it is normal to have to code, notify, and monitor for failure at every step between services and clients. There are many assumptions that we can make, including that failure is someone else's concern; in the end, however, it becomes part of the application's responsibility to describe how it is working.
Because the network may not be reliable, and because staff and servers change over time, the need for persistent enterprise objects, such as bus technology and persistent messaging, has evolved. The need for instrumentation to track the messages and objects has grown as well. Not knowing where payments and orders are in a system can be bad for any organization that needs to track them. In the end, the data that runs through applications is owned by the organization; if it is hacked, if financial data is lost, or if employees are not paid, the organization is responsible, regardless of whether the data lives in the cloud or the cause is a bad network or any other condition.
Because of this need for reporting on the systems, there is a need for metadata, which is just another form of the company's persisted data, distinct from business data such as a customer's address. Metadata is a form of reporting data, such as the current state of a message or whether there was an error with a message reaching its endpoint. It is a snapshot of the organization's application operations. Sometimes these snapshots are very important; in many cases where money and personally identifying information are involved, they are used to provide information, even to courts, on what happened when money goes missing.
We will start this journey through the design of systems with a common SOA design pattern called saga, which will assist us in providing the pieces discussed thus far.
The need for sagas
A saga is a design pattern whose name was coined in a 1987 paper by Hector Garcia-Molina and Kenneth Salem, http://www.amundsen.com/downloads/sagas.pdf. To quote a piece:
"A long-lived transaction (LLT) is a saga if it can be written as a sequence of transactions that can be interleaved with other transactions."
In Arnon Rotem-Gal-Oz's book on SOA Patterns, page 137 says:
"Sagas are a way for services to reach distributed consensus without relying on distributed transactions."
Many references note that sagas may be built differently, depending on the need.
The saga pattern is supported by NServiceBus, which handles the persisting of pieces of messages as part of an ESB. During a workflow of messages, a message is sent to a saga; the saga persists the needed data and responds to the original client with messages. A saga itself is a data object with an ID, getters, and setters. As messages are passed back and forth between services, the saga acts as an intermediary that saves valuable data. The data it saves are parts of the messages.
The messages of a service bus are persisted by nature and can be replayed when there is an issue with the delivery of the message to the endpoint; however, the saga keeps track of the originator and can store other data to be associated with the original message. This updated data, which is defined by the developer, may be the state of the message, the session information related to the message, or any other data needed by the application. The saga correlates the messages it receives, synchronizes the activity using the corresponding ID, and deals with other features such as timeouts and lookups.
The saga evolves in the ServiceBus architecture as a pattern; it is discussed in greater detail in the next chapters.
Many common frameworks such as Microsoft MVC and EF are designed for business requirements only, with additional frameworks to assist in nonfunctional requirements; this point is stressed throughout this book. Also, we emphasize the concept of ServiceBus.
ServiceBus is a messaging workflow; it stores messages along the way. It is a workflow since it incorporates both business and nonfunctional requirements. ServiceBus has transactional persistence so that it can perform second-level retries if there is an error in the server or the network. The saga pattern extends that concept by giving feedback to services along the way, including the originator, and by timing out messages. It also provides the feedback that business analysts and CSRs normally require to perform day-to-day operations. This information is used to correct issues that are of interest to the business. Remember that the saga pattern is a framework that is easily extensible, and so it is not a stretch to use it for more than just retries.
A real-life saga
NServiceBus simplifies the implementation of the concepts in the previous section; the following real-life scenario illustrates them with multiple services that communicate with each other.
Recall the pizza-ordering example we discussed earlier, where the message "Please do not refresh the page and wait for the order to complete" is displayed when a user places an order. We discussed the concern that the user may have doubts about whether the order is completed, and the implication that a browser refresh could cause order issues. Evidently, an ASP or JSP web page is blocking while some web service goes out to charge the customer's card and returns the result. To avoid this behavior, a better solution is needed. One such solution is a workflow for passing messages around so that the system fires off a transaction to process the payment, allowing user interaction to continue; eventually, the system receives an update once the payment is processed.
There are a few possible solutions for the preceding example, and all of them have one thing in common: combining a workflow with a middle layer simplifies the solution.
One possible solution is to have several services that are responsible for different actions. We need to save data entered by a user to a database; this can be accomplished via some backend services. These services handle all the transactions needed. A service, say Service1, can pick up the data and pass it into an MSMQ queue for processing. This makes it possible to tell which messages are currently being processed. Another service, say Service2, can be responsible for the interaction with a payment engine.
Continuing with the pizza-ordering example, Service1 is responsible for getting the data entered by the customer and Service2 is responsible for processing the credit card payment. If there are errors with the payment engine, Service2 and the ServiceBus have the logic to retry. However, Service1 remains unaware that there are errors with the payment. Service2 is atomic and does not provide notifications and feedback to the user. The payment service may place the error in an error queue, but some information, such as why the payment was not processed, will remain missing.
Using the saga pattern provides many of the features that are currently missing in the solution presented thus far. The saga is the end-to-end message workflow that can be used to save the state in an intermediate process. This can be accomplished by saving an intermediate saga data object. This object is typically persisted to a database and looked up when the same message is passed back through. Sagas can get complicated, but because the ServiceBus handles most of the work and very little code is required, they can be simple to use.
As hinted previously, a saga can be created as an intermediary between the services to keep the client, in our example Service1, informed about the progress of the message.
The saga can update other endpoints with the message status and change the message if it needs updating as it moves through the workflow. The important piece of a saga is the one-to-one lookup between the message itself and the data related to it. This allows the workflow to follow a message's progress and know where it is at a given moment across multiple services. We could define a timer to fail the message if it continually errors out, since we don't want messages to live forever.
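The timer idea can be sketched with a saga object that carries a deadline. This is an illustrative Python model, not the NServiceBus timeout API; `OrderSaga`, its status strings, and the 30-minute window are assumptions made for the example.

```python
from datetime import datetime, timedelta

class OrderSaga:
    """Tracks one message and fails it if it is still unfinished at the deadline."""

    def __init__(self, order_id, started_at, timeout):
        self.order_id = order_id
        self.deadline = started_at + timeout
        self.status = "pending"
        self.errors = 0

    def record_error(self):
        self.errors += 1

    def complete(self):
        self.status = "completed"

    def check_timeout(self, now):
        # Only an unfinished saga can time out.
        if self.status == "pending" and now >= self.deadline:
            self.status = "timed_out"
        return self.status

start = datetime(2021, 1, 1, 12, 0)
saga = OrderSaga(order_id=42, started_at=start, timeout=timedelta(minutes=30))
saga.record_error()
early = saga.check_timeout(start + timedelta(minutes=5))
late = saga.check_timeout(start + timedelta(minutes=31))
```

Passing the current time in as a parameter, rather than reading the clock inside the saga, keeps the timeout rule easy to test.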
Returning to the pizza-ordering example, instead of asking the user to wait and not refresh the page, we can create a page where the user can check the status as the order progresses through the ServiceBus workflow. Notice that this allows many nonfunctional requirements to be addressed.
Nonfunctional requirements (such as monitoring, logging, manual retries, timeouts, checking encryption, and inspecting the message itself) can be addressed by monitoring the services and messages.
To recap, we can address the payment engine errors by adding logic to the saga to notify the user, operations, and the organization of specific errors. For instance, we could add logic to the saga to send an e-mail to the user saying that the order was denied due to insufficient funds. In addition, we could add another error-checking option into the workflow for network failure and other unexpected events. When such events happen, a notification can be sent to operations stating that the payment engine server is not available at this time. Notice that the user does not need to be notified of these errors. Therefore, the saga becomes the focal point for checking the status of the message.
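The notification logic described in this recap might look like the following sketch. It is plain Python under stated assumptions: the error kinds, the `PaymentSaga` class, and the `outbox` list are hypothetical, and nothing is actually e-mailed; the saga only records who should be told what.

```python
def notifications_for(order_id, error_kind):
    """Decide who should hear about a payment error."""
    if error_kind == "insufficient_funds":
        # A business outcome: the customer needs to act.
        return [("customer", f"Order {order_id} was denied due to insufficient funds.")]
    if error_kind in ("network_failure", "engine_unavailable"):
        # An infrastructure problem: operations should know, the customer need not.
        return [("operations", f"Payment engine unavailable while processing order {order_id}.")]
    # Anything unexpected also goes to operations for investigation.
    return [("operations", f"Unexpected payment error '{error_kind}' for order {order_id}.")]

class PaymentSaga:
    """Keeps the order status and the notifications waiting to be sent."""

    def __init__(self, order_id):
        self.order_id = order_id
        self.status = "pending"
        self.outbox = []

    def handle_payment_error(self, error_kind):
        # A denied card is final; other errors leave the order open for a retry.
        self.status = "denied" if error_kind == "insufficient_funds" else "retrying"
        self.outbox.extend(notifications_for(self.order_id, error_kind))

saga = PaymentSaga(order_id=42)
saga.handle_payment_error("network_failure")
saga.handle_payment_error("insufficient_funds")
```

After these two calls, the outbox holds one message for operations and one for the customer, and the saga's status is the single place to check how the order stands.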