Articles
- "Data Replication on Steroids," as seen in the July/August 2011 Connect publication The Connection
Concurrent replication has many efficiency advantages over serial replication. These advantages are described in this paper. The highly efficient technique of concurrent replication is commercially available only on NonStop systems because of the unique structure of and access to the TMF Audit Trail coupled with replication engine extensions that allow the replication engine to replay the target transaction mix concurrently with source transaction processing. It is the specialized nature of the NonStop Audit Trail that allows data replication to perform as if on steroids.
- Eliminating Planned Downtime with Zero Downtime Migrations (ZDM) (Part 1)
ZDM for Active/Backup Configurations (Part 2)
How active/backup systems can take advantage of zero downtime migrations by using a fast and reliable failover in order to eliminate planned downtime for upgrades and migrations, and how active/active systems, comprised of two or more nodes cooperating in a common application, achieve continuous availability and eliminate unplanned downtime with zero downtime migrations.
- "Achieving Century Uptimes" as seen in The Connection, published by Connect, by Dr. Bill Highleyman, Paul J. Holenstein and Dr. Bruce Holenstein
- Part 25: Is the Active/Active Topic Getting Stale? (11/10)
As the active/active message spreads and interest begins to build in the commodity server community, active/active technology should be in a position to explode.
- Part 24: Is It Worth the Effort to Move to Active/Active? (09/10)
Determining whether moving to an active/active architecture is worth it, that is, whether the savings to your company in current downtime costs will provide an attractive return on investment.
- Part 23: Fast Failover with Active/Active Systems (2 of 2) (07/10)
How to rapidly recover from a node failure in an active/active network using server redirection, in which the nodes themselves monitor faults and control network reconfiguration.
- Part 22: Fast Failover with Active/Active Systems (1 of 2) (05/10)
How to rapidly recover from a node failure in an active/active network using client redirection and network redirection; failover using these techniques can be automatic and very rapid.
- Part 21: Active/Active NonStop Blades (03/10)
A review of the increased computing density of NonStop Blade Systems, and how active/active systems can be implemented to provide even higher availability, greater capacity, and lower cost of ownership.
- Part 20: Is Your Application Active-Active Ready? (01/10)
A review of the problems that can result when active/active systems achieve their scalability and continuous availability by distributing application and database copies across an application network.
- Part 19: Reviewing Three Years of The Availability Corner (11/09)
A three-year-anniversary review of the past 18 articles published continuously in The Connection column, "The Availability Corner: Achieving Century Uptimes."
- Part 18: Recovering from Synchronous Replication Failures (9/09)
A discussion of the procedures that allow transaction processing to continue in the face of a target database failure and for reinstating the target database as a participant in transactions upon its recovery.
- Part 17: HP Unveils Its Synchronous Replication API for TMF (7/09)
How HP’s Synchronous Replication Gateway (SRG) API allows TMF to safely support gateways to foreign systems through volatile-resource managers, which allows replication engines to be integrated with TMF so that updates to remote databases can be synchronously replicated.
- Part 16: Zero-Downtime Migrations for Active/Backup Configurations (5/09)
How active/backup systems can take advantage of zero downtime migrations by using a fast and reliable failover in order to eliminate planned downtime for upgrades and migrations.
- Part 15: Zero-Downtime Migrations: Eliminating Planned Downtime (3/09)
How active/active systems, comprised of two or more nodes cooperating in a common application, achieve continuous availability and eliminate unplanned downtime with zero downtime migrations.
- Part 14: The Evolution of Real-Time Business Intelligence (1/09)
How real-time business intelligence systems provide the information necessary to strategically improve an enterprise’s processes as well as to take tactical advantage of events as they occur.
- Part 13: Synchronous Replication: Pros, Cons, and Myths (11/08)
A comparison of the two primary methods of data replication, asynchronous and synchronous, and a discussion of the advantages, disadvantages, and contemporary misunderstandings regarding synchronous replication.
- Part 12: Rules of Availability III (09/08)
A discussion of the importance of recovery time and the rules from the book Breaking the Availability Barrier III, choosing specific rules as “best practices” to achieve continuous availability with redundant systems and a focus on active/active systems. (Rules 41-64)
- Part 11: Rules of Availability II (07/08)
A discussion of the importance of recovery time and the rules from the book Breaking the Availability Barrier III, choosing specific rules as “best practices” to achieve continuous availability with redundant systems and a focus on active/active systems. (Rules 18-40)
- Part 10: Rules of Availability I (05/08)
A discussion of the importance of recovery time and the rules from the book Breaking the Availability Barrier III, choosing specific rules as “best practices” to achieve continuous availability with redundant systems and a focus on active/active systems. (Rules 2-17)
- Part 9: Where is My Database of Record? (03/08)
How through expansions, mergers, and acquisitions, companies tend to wind up with many databases, causing problems with data unavailability and data loss, and how active/active systems can largely solve these twin problems.
- Part 8: Let's Make Availability a Part of Performance Benchmarking (01/08)
How adding an availability test to performance benchmarks strengthens the benchmark data, which currently gives only the transactions-per-minute (tpm) capacity of the system and the cost per tpm, by adding the expected restore time.
- Part 7: What is the Availability Barrier, Anyway? (11/07)
Why recovery time is the availability barrier in commercial data processing, and how recovery time can be reduced so that the barrier can be pushed back as far as possible.
- Part 6: Active/Active versus Clusters (09/07)
A comparison between clusters, a mature five 9s technology with thousands of installations, and active/active systems, a six 9s and beyond technology that is relatively new.
- Part 5: Modular Redundancy - To Need or Not To Need (07/07)
A review of how Neoview is a massively parallel database appliance derived from NonStop technology; however, its hardware architecture is different and its SQL engine, though derived from SQL/MX, has been significantly enhanced to support BI-specific features, such as very large queries.
- Part 4: Resolving Data Collisions (05/07)
Why it is important to minimize data collisions by using a replication engine with a short replication latency time and to minimize the requirement to manually resolve data collisions by using the appropriate set of collision resolution algorithms.
- Part 3: Avoiding Data Collisions (03/07)
How to structure an active/active system that uses asynchronous replication to avoid data collisions.
- Part 2: What Will Active/Active Cost Me? (01/07)
The complex comparison of the costs of a monolithic system versus an active/active system, which incurs the cost of redundancy and network management, while reducing the cost of downtime and the related insurance costs.
- Part 1: Survivable Systems for Enterprise Computing (11/06)
A description of active/active architectures, their advantages, and the issues associated with them while approaching 100% uptime.
- "The Availability Corner" as seen in The Connection, published by ITUG
- Testing Your System Recovery Plan (09/06)
How the use of active/active technology solves the recovery plan testing syndrome; if recovery from a node failure can be accomplished almost transparently to the users, then node failures can be simulated at will to test the recovery plan.
- Is IBM’s Parallel Sysplex a NonStop Competitor? (06/06)
A comparison of HP’s NonStop and IBM’s Parallel Sysplex, the primary industry offerings that tolerate single failures and thereby achieve very high levels of availability.
- Grid Computing (03/06)
How grid computing seeks to standardize the use of disparate systems in a computing utility that would manage all of the resources available to it as a set of services, in order to improve reliability, availability, business agility, and security and to deliver a better return on IT investment.
- The Net Present Value of Active/Active Systems (01/06)
A review of the sum of the present values of expenditure for a multi-node active/active system today and its return on investment, called the net present value (NPV).
- TCO for Active/Active Systems (11/05)
How understanding and optimizing the total cost of ownership (TCO) of an active/active system is much more involved than it is for standard monolithic systems.
- Fault Tolerance vs. High Availability (09/05)
A comparison of the factors that differentiate fault-tolerant systems from high-availability systems, with the length of recovery time as the secret to high availability.
- The Great Tape Backup Paradigm Shift (07/05)
A reconsideration of the way in which systems are backed up, since recovery time is what matters when making an application available after an outage.
- The Language of Availability (05/05)
A glossary of several terms defining reliability and the systems that are used in these articles.
- What Reliability Do We Really Need? (01/05)
A review of how adding expensive redundant hardware when striving for higher reliability benefits only two of three areas: high reliability, high performance, and low cost.
- Let's Measure System Reliability in Centuries (11/04)
This series of Availability Column articles focuses on how to achieve a high level of availability at little additional cost.
