Wednesday, February 24, 2010

Eventual Consistency vs Atomic Consistency

I was having a very interesting and neurally stimulating conversation with Brian Holmes who runs SmashedApples, a cloud computing framework for Flex, about Eventual Consistency and trusting that data is correct.

In my original post on swarm computing, I was discussing the use of shards over a distributed system. One thing that I hadn't considered was the possibility that you could write data to more than one source. I still held to the principle that when it comes to data, a system requires a single source of truth. After pondering this thought for a while, I have firmly come to the conclusion that you do not necessarily require a single source of truth.

What is Eventual Consistency?
Eventual consistency is the premise that, given enough time, all copies of a piece of data will propagate and converge on the same value. I always had a problem with this. Most situations assume atomic consistency, that is, that once a write is made, any further reads will return the updated information.

This assumption of atomic consistency is a hangover from decades of programmers working with relational databases on single systems. In distributed, collaborative systems there is always a degree of lag; it comes down to how that lag is handled by the system.

The concept of Read Validation over Write Validation
We have all encountered and used write validation. By write validation, I mean that you cannot save the state of a system without passing all integrity checks.

Let's say that we have a payment form for a utility company, with a field that lets you enter how much you want to pay on your bill. Under write validation, you can't put -$100 into the field and save the form. Under read validation, you can save any and all values to the server, but you cannot submit the form while -$100 is in the field.
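
To make the distinction concrete, here is a minimal Python sketch of a hypothetical payment form: under read validation any value can be saved, and the integrity check only fires at submission. The class and method names are my own illustration, not from any particular framework.

```python
class PaymentForm:
    """Hypothetical payment form illustrating read vs write validation."""

    def __init__(self):
        self.amount = None

    def save(self, amount):
        # Read validation: any value may be persisted, even an invalid one.
        self.amount = amount

    def save_with_write_validation(self, amount):
        # Write validation: the state cannot be persisted unless it passes
        # every integrity check.
        if amount is None or amount <= 0:
            raise ValueError("Amount must be positive to save")
        self.amount = amount

    def submit(self):
        # Under read validation the integrity check moves to the point of
        # consumption: submission fails, but the saved draft is untouched.
        if self.amount is None or self.amount <= 0:
            raise ValueError("Amount must be positive to submit")
        return f"Submitted payment of ${self.amount}"


form = PaymentForm()
form.save(-100)          # allowed: the draft is stored as-is
# form.submit()          # would raise: -100 fails the read validation
form.save(100)
print(form.submit())     # Submitted payment of $100
```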

Example 1 - A Holiday in Elbonia:
Let's say a fellow decides to go on holiday. Let's call him John Smith. John loves adventure, danger and generally experiencing life at the edge of civilization. In fact, much to the dismay of his twin sister Jane, John would much rather be doing anthropological fieldwork at a National Geographic level than working as a computer programmer.


This is why John has decided to depart for the shores of far-away Elbonia.

To prevent his dear sister from suffering a heart attack from worry, John has promised Jane that he will provide her with regular updates on where he is and what he is doing. Hence, before checking in, John updates his Facebook profile to say that he is at the airport.


Under the atomic consistency model, John would push the update to the server, the server would update, and anyone who read his Facebook page would immediately see that he is at the airport.

Under the Eventual Consistency model in normal operation, both John and Jane would see the same thing within a matter of seconds, as the cloud nodes holding the data are updated when the call is made.

But what would happen if the City Node was unable to talk to the central Facebook cloud?

John would see that his change has been accepted on his device, but his sister Jane would still see his previous status until the city cloud node that she reads from has updated.

But what happens if multiple updates occur?

Let's assume that John has been busy playing the latest game he downloaded to his portable device and has almost run his battery flat. It is a long flight to the Balkans, especially if you fly with Elbonian airlines. All the way along, John keeps updating his status, but his device cannot talk to the Internet because it must stay in flight mode.

When John arrives in Armpit, the capital of Elbonia, he updates his status to "Arrived At Customs", at which time all of his previous updates are sent.

Since the Internet is not so reliable in a fourth-world country such as Elbonia, as luck would have it, the link to the outside world is currently down. After John leaves customs five hours later, he receives a frantic call from Jane, who says he hasn't updated his Facebook status. John calmly explains that the Internet is down in Elbonia, but he's heading for the hotel now. Just then, John's phone runs out of power.

After Jane's conversation with John ends, Jane logs in to John's account and updates his status to "Going to the hotel".

When John returns to the hotel, he logs in to update his status. He still sees "Arrived At Customs", so he updates his status to "At Hotel".

An hour later the Internet in Elbonia is back up again, and the servers update. So what should everyone see?

An assumption must be made that the latest data is the most relevant. Each status change carries a timestamp. Jane's timestamp would be earlier than John's, so the servers can apply the changes to the record in the correct order. The latest status change would then be that John is "At Hotel".
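
One way to picture this resolution is a simple last-write-wins merge keyed on the timestamp attached to each status change. This is only a sketch with made-up update records, not how Facebook actually resolves updates:

```python
from dataclasses import dataclass

@dataclass
class StatusUpdate:
    status: str
    timestamp: float  # stamped on the device where the update was made

def merge_last_write_wins(updates):
    """Order the queued updates by timestamp and return the winning status."""
    ordered = sorted(updates, key=lambda u: u.timestamp)
    return ordered[-1].status

# Jane's update was made before John's, so John's wins once the link is back up.
queued = [
    StatusUpdate("Arrived At Customs", timestamp=1_000),
    StatusUpdate("Going to the hotel", timestamp=2_000),   # Jane, on John's behalf
    StatusUpdate("At Hotel", timestamp=3_000),             # John, at the hotel
]
print(merge_last_write_wins(queued))  # At Hotel
```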

Example 2 - Withdrawing cash in Elbonia:

After settling in to his hotel room, John decides that he wants to go out and buy some food, but he forgot to change money back at the airport. He needs some Elbonian Dollars ($EB) or, as they are colloquially known, Grubniks.




Let's assume that there are two banks, the Bank of Elbonia (BoE), and the First National Bank of Elbonia (FNBE). Say that John has a bank account with WestBank back in civilization, with $101 in it.



Let's say John tries to withdraw $100 from the BoE, and $100 from FNBE immediately afterwards. Given the case of atomic consistency, John can successfully withdraw $100 from BoE, but not from FNBE, as he has insufficient funds. This is correct behaviour.


If we are using Eventual Consistency, under normal circumstances the lag from one bank to the next will be far less than the time it takes for John to complete his transaction, let alone the time it takes him to get to another ATM and withdraw funds. Hence, John can successfully withdraw $100 from BoE, but not from FNBE.

Now let's say that Elbonia is having Internet issues again, and the connections between BoE, FNBE and WestBank are down. This gives us a few scenarios.

Under the atomic consistency model, as BoE and FNBE cannot talk to WestBank, the transaction will fail. This is obviously not a good situation for the client, as they do not have legitimate access to their funds. This is because we are relying upon Write Validation.

Under the Eventual Consistency model, John will be able to withdraw $100 from BoE, which will record the $100 delta change against his account. Since the bank has not yet communicated the delta change, our intrepid explorer can successfully withdraw funds from both the Bank of Elbonia and the First National Bank of Elbonia and still wind up with a $101 balance. Right?

Well, yes and no. It's just a question of when. Let me first clarify that if John were to return to the BoE or the FNBE and try to withdraw money, he would be greeted with a $1 balance.

Eventually, the Internet (even in Elbonia) will come back up. This is when WestBank receives the two $100 withdrawal deltas, leaving our punter with a balance of -$99 in his account.

The advantage of using Read Validation in this circumstance is that John will be able to legitimately withdraw $100 from his bank account. The disadvantage is that John will be able to illegitimately withdraw a further $100 from his bank account, which will need to be resolved by business procedures and workflows later.
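
To make the delta idea concrete, here is a small sketch (the classes and names are mine, not any bank's system) in which each bank validates withdrawals against its locally visible balance, records them as deltas, and the home bank applies the deltas once the link returns:

```python
class BankReplica:
    """A bank that only holds a cached balance plus locally recorded deltas."""

    def __init__(self, name, cached_balance):
        self.name = name
        self.cached_balance = cached_balance
        self.pending_deltas = []

    def withdraw(self, amount):
        # Read validation against the locally visible balance only.
        if amount > self.cached_balance + sum(self.pending_deltas):
            raise ValueError(f"{self.name}: insufficient funds")
        self.pending_deltas.append(-amount)
        return amount


class HomeBank:
    def __init__(self, balance):
        self.balance = balance

    def reconcile(self, *replicas):
        # Eventually the deltas arrive and are applied, even if the result
        # is a negative balance that business workflows must resolve later.
        for replica in replicas:
            for delta in replica.pending_deltas:
                self.balance += delta
            replica.pending_deltas = []
            replica.cached_balance = self.balance


west_bank = HomeBank(balance=101)
boe = BankReplica("BoE", cached_balance=101)
fnbe = BankReplica("FNBE", cached_balance=101)

boe.withdraw(100)    # succeeds against BoE's cached view
fnbe.withdraw(100)   # also succeeds, since FNBE hasn't seen BoE's delta
west_bank.reconcile(boe, fnbe)
print(west_bank.balance)  # -99
```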

But I bet a few of you out there just like John might be thinking, hey, that's great! Why don't I just pull out a million bucks in cash, and then move to Paraguay? It comes down to the business rules of an ATM to prevent you from being able to do that. Remember that there is a maximum amount that can be withdrawn from an ATM at a given time, say $1000. Furthermore, there are only a set number of banks operating in a country.

Following this assumption, not many people will be withdrawing copious amounts of money under these circumstances, as the banks' terms of service generally mean that they can communicate with each other within a day, even in the worst of disasters.

Couple this with the high level of scrutiny placed on holders of bank accounts by the Anti-Money Laundering and Counter-Terrorism Financing (AML/CTF) legislation that has been passed in many countries, and you have a high level of assurance that the person exists and is legitimate. In any case, a shady character would not be able to glean enough from the system to make it worth their while to commit fraud.

This level of risk can be acceptable for some banks. But what if the bank that John transacts with does not wish to allow withdrawals if the communication channels drop out? This can technically still be achieved.

Some objects of very high importance may still require a single source of truth to authorize transactions. Given this situation, you may only have partial functionality should a link go down. As in, John could check his balance from an Elbonian ATM, but could not actually withdraw any money.

There is another situation as well. Some bank accounts may allow a certain amount of money to be paid out while others may not. This would merely be a piece of data associated with John's account saying that he can withdraw up to $800 per day, or $1000, or however much risk John and his bank are willing to accept.
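
That risk setting can be modelled as nothing more than a field on the account that the ATM consults while the link is down. A minimal sketch, with an invented offline_daily_limit field:

```python
def can_withdraw_offline(amount, withdrawn_today, offline_daily_limit):
    """Check a withdrawal against the account's agreed offline risk limit."""
    return withdrawn_today + amount <= offline_daily_limit

# John's account allows up to $800 per day while the banks cannot talk.
print(can_withdraw_offline(100, withdrawn_today=0, offline_daily_limit=800))    # True
print(can_withdraw_offline(100, withdrawn_today=750, offline_daily_limit=800))  # False
```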

In the end, in this example the Read Validation method provides more flexibility and customer service for the consumer, whilst still allowing for resolution of the problem in this area.

Example 3 - The disjointed timesheet
As always, on his first day of holidays, John gets a phone call from his boss, Bob. After the usual discussion of the explosions, earthquakes, fires, floods and tidal waves caused by a small show-stopper defect in John's code that got shaken out in testing, John has to provide a fix.

Of course this is not that big a problem for John, who implements the fix and updates SVN in under 8 hours, but now he has to update his timesheet. The timesheet application is cloud-enabled, so he fills in his hours, blissfully unaware that Elbonia has lost its connection to the outside world again.

John happens to be assigned to a task called "Release Defects", against which he books 8 hours. This update is pushed to the server.

In the meantime, the whole discussion about John in the office means that Dave, the resource manager, remembers that he had promised John that he would fill out his holiday timesheets for him. So, he opens John's timesheet and places 8 hours of work against his vacation task.

The timesheet has a business rule on it that says that no more than 12 hours of time can be assigned on a particular day. So how would the system resolve this situation?

Updates can occur on mistaken axioms. Dave assumed that John had not placed any time on his timesheet; John assumed that Dave had not placed any time on it either. Both relied upon a false belief: Dave thought that John had done no work, and John didn't know that Dave had updated his time.

This is why Read Validation must allow data that is invalid against the business rules to exist. Although there is currently 16 hours' worth of work on John's timesheet, users should still be able to enter more time. However, when John goes to submit his timesheet for approval, the submission should not be allowed, as it does not satisfy the 12-hour business rule.
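
A sketch of that behaviour: entries are always accepted, and the 12-hour rule is only checked at submission time. The Timesheet class here is purely illustrative:

```python
class Timesheet:
    MAX_HOURS_PER_DAY = 12

    def __init__(self):
        self.entries = []  # (task, hours) pairs for a single day

    def add_entry(self, task, hours):
        # Read validation: the entry is always stored, even if it pushes the
        # day past the limit, because other replicas may not have been seen.
        self.entries.append((task, hours))

    def submit(self):
        total = sum(hours for _, hours in self.entries)
        if total > self.MAX_HOURS_PER_DAY:
            raise ValueError(f"Cannot submit: {total} hours exceeds the 12-hour rule")
        return "Submitted"


sheet = Timesheet()
sheet.add_entry("Release Defects", 8)   # John, offline in Elbonia
sheet.add_entry("Vacation", 8)          # Dave, back at the office
# sheet.submit()                        # would raise: 16 hours on one day
```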

So what would happen if Dave subsequently submitted John's timesheet?

There is a business rule that states that once a timesheet has been submitted for approval, it cannot be changed unless it has been unsubmitted.

So let's say that John added his time after Dave submitted his timesheet. In this case, it is clear-cut: John should be informed that his timesheet had already been submitted before he added his time. It would then be up to John to fix the situation.

If John added his changes before Dave submitted John's timesheet on his behalf, then the system should allow John's changes, roll back the submission, and inform Dave that there is a discrepancy of time.

But we need a new piece of technology here. We need to know when the update was made and the timestamp of the data the user had seen at the time. Each update needs to carry both pieces of information, so that users can see the reasoning behind a decision and resolve the conflict. Those familiar with the Concurrent Versions System (CVS), Subversion (SVN) et al. will recognise that this kind of temporal dilation of information has been dealt with for quite a long time already, and it is an established principle.
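
In other words, every update carries two timestamps: when it was made, and the version of the record its author was looking at. A conflict can then be detected whenever an update was based on a stale view. The check below is my own illustration, not how CVS or SVN implement it:

```python
from dataclasses import dataclass

@dataclass
class Update:
    made_at: float      # when the user made the change
    based_on: float     # version of the record the user had seen at that time
    payload: str

def detect_conflict(current_version, update):
    """An update conflicts if it was based on a stale view of the record."""
    return update.based_on < current_version

record_version = 200.0  # Dave submitted the timesheet at t=200
johns_update = Update(made_at=300.0, based_on=100.0, payload="Release Defects: 8h")

if detect_conflict(record_version, johns_update):
    # John saw the record at t=100, but it changed at t=200: flag for resolution.
    print("Conflict: the timesheet changed after this update's view of it")
```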

In Conclusion
As my friend Brian Holmes always says, cloud computing is as though every function is an island that can be calculated and have its results shipped out to their destination.

This is just like us humans. We are all islands that have limited vision of the world that we live in. It is not about being perfect in our universe, but rather about being fault tolerant.

Although Atomic Consistency is the ultimate in data integrity, in reality, we never truly have Atomic Consistency in life. We follow business rules and have elastic error validation to tolerate the False Axiom problem. This allows us to have Eventual Consistency, and when a data collision occurs, we can collaborate to resolve the issue correctly.

The Eventual Consistency model comes down to the level of trust that you place into the action, and the level of damage that can occur should an irreversible collision happen, such as withdrawing funds from an ATM whilst you have a negative balance.

This is a business decision, which can be made on a use case basis, so you can still require a single source of truth for a select number of transactions where the business does not wish to accept the risk. This is exactly like human transactions.

If you go to a store, and their credit card facilities are down, and you don't have any cash, chances are, you are not walking out of there with the goods that you want. This is exactly the same situation with distributed transactions.

As for John, he has successfully conquered Elbonia and gotten paid for the day of his holiday that he had to work.

Meanwhile, Jane is back to worrying whether John is eating a healthy lunch and flossing regularly. No-one said that it's easy being John.

I on the other hand, would like to hear what your thoughts are on this topic.

Tuesday, February 16, 2010

A manifesto on swarm and utility based cloud computing

Introduction

So why is swarm and utility based cloud computing so interesting? Because businesses wish to reduce cost and increase the scalability of their systems. The only way to achieve this is by removing the overhead of managing systems and of having them sit idle.


In my mind, the largest problem with distributed computing is a mindset shift. Right now, we are focusing mainly on pushing data to the cloud, with data size and transfer left as an afterthought. My goal with a swarm-based framework is to focus on moving the process, but not the data. The way I believe this can be achieved is to structure the storage strategy to optimise and limit the amount of data packet transfer, followed by the moving of process. Execution of process is not a consideration at this point in time, as a client CPU is rarely a limiting factor.


The goal of this blog post series is to hold an open and frank discussion about how to best achieve the goals outlined here, and to avoid any further pitfalls that may arise.



There are several major considerations that I have thought through that need to be addressed in a successful cloud or swarm framework.


Data Shards
The framework requires the data to be divided into meaningful segments of knowledge, or "shards". Basically, a shard is a predefined set of data that has an explicit relationship dictated by the programmer. For instance, in a banking home loan system, a person's contact details combined with their home loan information would be considered to belong to a particular shard of data.

A shard of data can belong to another shard. For example, the particular home loan data for a person can belong to a set of home loan data that belongs to a mortgage broker, as well as a bank. This becomes particularly useful for null dressing as well as Garbage collection, as you can explicitly pull information from the cloud, as well as delete objects that are referenced by shard. I will expand on this in a later post.
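
As a rough sketch of what a shard might look like as a data structure, with shards able to belong to other shards (the classes are illustrative, not part of any existing framework):

```python
class Shard:
    """A predefined set of data with programmer-defined relationships."""

    def __init__(self, name, data=None):
        self.name = name
        self.data = data or {}
        self.children = []   # shards can belong to other shards

    def add_child(self, shard):
        self.children.append(shard)


# A person's contact details and home loan form one shard...
home_loan = Shard("john_home_loan", {"contact": "John Smith", "principal": 350_000})

# ...which belongs to both a broker's shard and a bank's shard.
broker_book = Shard("broker_portfolio")
bank_book = Shard("bank_portfolio")
broker_book.add_child(home_loan)
bank_book.add_child(home_loan)
```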

Using Data Shards to reduce the amount of movement of information
My thoughts on how a shard applies to limiting the movement of data are as follows:

Moving data has two dimensions of expense: the amount of data, and how expensive it is to move it. The former is self-explanatory, but the latter becomes interesting with Flex 4's capability to handle peer-to-peer connections between clients. Take an office situation. Generally, people who access similar types of data tend to cluster together, both geographically and from a network perspective. For instance, in a bank, insurance brokers work on the same network, sales work in the same building, and mortgage brokers do the same.

It makes no sense that these guys should all be contacting the server to pull down the same data. This is where the concept of data gravity comes in. When a client accesses a piece of data, a counter is incremented recording the total number of times that particular shard has been accessed. An implicit total gets tallied down the children from the root source shard.

This counter is used to rank the amount of carpet wear on a shard, making it more important, akin to increasing its mass. The client will cache the data according to its level of importance, and discard and stop listening to changes occurring on the least important objects. Updating the data is another problem, but I think we will leave that discussion for later as well.

I have thought about naming this algorithm Data Gravity, as the importance of the object is akin to a planet increasing its mass, and clustering objects together in space.
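
A minimal sketch of the counting side of data gravity, under my own assumptions about the structure: each access increments the shard's counter and implicitly adds to its ancestors' totals, and the client keeps only the shards with the most "mass":

```python
class GravityShard:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.access_count = 0   # direct accesses to this shard
        self.mass = 0           # direct accesses plus those of all descendants

    def record_access(self):
        self.access_count += 1
        node = self
        while node is not None:       # tally the implicit total up to the root
            node.mass += 1
            node = node.parent


def evict_least_important(shards, keep):
    """Keep only the `keep` heaviest shards in the client cache."""
    ranked = sorted(shards, key=lambda s: s.mass, reverse=True)
    return ranked[:keep]


root = GravityShard("bank")
loans = GravityShard("home_loans", parent=root)
insurance = GravityShard("insurance", parent=root)
for _ in range(5):
    loans.record_access()
insurance.record_access()
cache = evict_least_important([loans, insurance], keep=1)
print([s.name for s in cache])  # ['home_loans']
```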

This will make further sense when I expand on my ideas with the incorporation of Hyper Planes, Neural Networks and Convex hull structures with respect to Data Shard clustering in a future blog post and introduce the concept of a Gravity Well.

Private Clouds and Data Security

A large sales barrier to Cloud computing is data security and integrity. Private clouds seem to be gathering momentum. However, the ideal should be a mix of the two.

Non-sensitive data can be pushed to third-party cloud vendors, while sensitive information should be kept on the company's own infrastructure.

What you pay for should be used as close to maximum capacity as possible, while there is always the option to load-shed processes such as serving public web pages, public document storage and so forth to an external provider should your resources be overloaded.

The idea of adding a security clearance level to a data shard allows the private cloud framework to manage the distribution of data shards by servers and members. For instance, a server inside the organisation may be instructed to store top secret level shards of information down to public level access information, while say the Amazon or Google cloud may be set to store only public level access information.

A more complex security policy might add access to particular types of data shards. For instance, if you are the creator of a data shard, you would get access, even if it is top secret. Furthermore, you might provide other people direct access to the file on this level, as well as transitive access, by belonging to a team or group.
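
A sketch of that placement policy, with invented clearance levels and node names: each shard carries a clearance level, and each storage node advertises the highest level it is trusted to hold:

```python
# Clearance levels, lowest to highest.
PUBLIC, INTERNAL, SECRET, TOP_SECRET = range(4)

class StorageNode:
    def __init__(self, name, max_clearance):
        self.name = name
        self.max_clearance = max_clearance

    def can_store(self, shard_clearance):
        return shard_clearance <= self.max_clearance


def placement_targets(shard_clearance, nodes):
    """Return the nodes allowed to hold a shard of the given clearance."""
    return [n.name for n in nodes if n.can_store(shard_clearance)]


nodes = [
    StorageNode("internal-server", max_clearance=TOP_SECRET),
    StorageNode("amazon-cloud", max_clearance=PUBLIC),
]
print(placement_targets(TOP_SECRET, nodes))  # ['internal-server']
print(placement_targets(PUBLIC, nodes))      # ['internal-server', 'amazon-cloud']
```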

Data Recovery Management

DRM is a huge area that costs organisations a very pretty penny. It is also an area where, if a pretty penny is not spent and a risk materialises into an issue, it becomes a significant unexpected expense, or worse.



Cloud has brought forward a solution that is more like fool's gold than anything else.



The problem is that small amounts of data can definitely be pushed to the cloud for backup purposes; however, for larger volumes of data, such as movie files or images, cloud storage is simply not an option.



Security is also a major issue here, as it is very foolish for an organisation to push something like their complete client list or their accounts to the cloud. Hence, these files tend not to be backed up at all, or are stored under a more traditional DRM strategy such as on a file server or copied to CD.



The problem here is that an organisation that has moved its DRM reliance to the cloud will find its traditional DRM strategy atrophying. In my days working as a systems administrator, I came across many sites where my predecessor diligently made backups to CDs or removable storage drives, but never tested whether they worked.



In such a circumstance, it becomes a dangerous placebo.



A swarm-based DRM could be very interesting, in that the data movement and update solution can be applied to storing files in an encrypted fashion on a peer-to-peer basis in a private swarm cloud. This way, if a client is knocked out by disaster, or worse, a server is knocked out by disaster, a recovery process can automatically reconstruct the data from a swarm of clients.
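
As a very rough sketch of the idea (nothing here reflects an existing implementation, and encryption is left out for brevity): a file could be split into chunks, each chunk replicated to several peers, and reconstruction only needs one surviving copy of each chunk:

```python
import random

def distribute(chunks, peers, copies=3):
    """Place each chunk on `copies` randomly chosen peers."""
    placement = {peer: set() for peer in peers}
    for index, _ in enumerate(chunks):
        for peer in random.sample(peers, copies):
            placement[peer].add(index)
    return placement

def reconstruct(chunks, placement, surviving_peers):
    """Rebuild the file from whatever peers survived the disaster."""
    available = set()
    for peer in surviving_peers:
        available |= placement.get(peer, set())
    if available >= set(range(len(chunks))):
        return b"".join(chunks)
    raise RuntimeError("Not enough surviving peers to reconstruct the file")

chunks = [b"chunk-0", b"chunk-1", b"chunk-2"]
peers = ["pc-1", "pc-2", "pc-3", "pc-4", "pc-5"]
placement = distribute(chunks, peers)
# Two machines are knocked out; with 3 copies per chunk, the remaining
# three peers still hold at least one copy of every chunk.
print(reconstruct(chunks, placement, surviving_peers=peers[:3]))
```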



I will discuss this further in a later post.



Unidentified Issues or Benefits

At this point I would like to open the floor to any and all who read this post to provide thoughts and feedback on the ideas I have outlined above.



I feel that there are many untapped possibilities that have not been thought of yet in this space, and I would love to explore them with the community.