I have been reading about single-decree Paxos (primarily looking at Paxos Made Simple) yet am confused about if a consensus among Acceptors is guaranteed to not change after being reached.
According to James Aspnes's notes,
So now we suppose that some value v is eventually accepted by a majority T with number n. Then we can show by induction on proposal number that all proposals issued with higher numbers have the same value (even if they were issued earlier).
However, I am confused since I believe I have a counterexample, shown below. Feel free to jump to step 12 since that is where simple steps can illustrate the problem. I have included steps 1-12 in case it is not possible to reach the state in step 12.
Consider the following behavior. Borrowing notation from Contradiction in Lamport's Paxos made simple paper. That is, X(n:v, m), means Acceptor X has largest accepted proposal n:v, where n is proposal number and v is value, and m is the largest numbered prepare response to which Acceptor X has responded.
Say we have 3 Acceptors A, B, C. Let Px be a proposer, or even multiple proposers, who keeps sending proposals since they don't find out about any consensus being reached.
- P1 broadcasts prepare 1
- AB respond with promise, state is A(:, 1), B(:, 1)
- P1 receives promises from AB, confirms majority and broadcasts accept 1:'foo'
- Only A receives this accept, state is A(1:'foo', 1), B(:, 1), C(:,)
- P2 broadcasts prepare 2
- BC respond with promise, state is A(1:'foo', 1), B(:, 2), C(:,2)
- P2 receives promises from BC, confirms majority and broadcasts accept 2:'bar'
- Only B receives this accept, state is A(1:'foo', 1), B(2:'bar', 2), C(:,2)
- P3 broadcasts prepare 3
- AC response with promise, state is A(1:'foo', 3), B(2:'bar', 2), C(:,3)
- P3 receives promises from AC, confirms majority, notes that 1:'foo' is the largest numbered accepted value, and broadcasts accept 3:'foo'
- Only C receives this accept, state is A(1:'foo', 3), B(2:'bar', 2), C(3:'foo', 3) -- A consensus has been reached! 'foo' is the value decided upon --
- P4 doesn't know about this, broadcasts prepare 4
- AB response with promise, state is A(1:'foo', 4), B(2:'bar', 4), C(3:'foo', 3)
- P4 receives promises from AB, confirms majority, notes that 2:'bar' is the largest numbered accepted value, and broadcasts accept 4:'bar'
- A receives this broadcast, state is A(4:'bar', 4), B(4:'bar', 4), C(3:'foo', 3). -- A consensus has been reached! 'bar' is the value decided upon --
To be clear, steps 4, 8, 12, don't necessarily mean that the other nodes "failed", but I think it can just the proposer in question is taking a really long time to deliver the messages. Thus this shouldn't be a case where more than N acceptors crash in out of 2N 1.
The top-voted answer in Contradiction in Lamport's Paxos made simple paper suggests that Proposers only sent accept messages to Acceptors who promised them and an acceptor accepting a value means updating the maxBal. Both of these are satisfied in the above example, yet it shows how consensus can flip flop between two different value. Am I missing something here?
CodePudding user response:
I am going to copy&paste a statement straight from Paxos article in wikipedia: "consensus is achieved when a majority of Acceptors accept the same identifier number (rather than the same value)".
So in your step 12 "state is A(1:'foo', 3), B(2:'bar', 2), C(3:'foo', 3) -- A consensus has been reached! 'foo' is the value decided upon" - there is no consensus yet; even if there is a majority in the value.
p.s. thanks for the questions, this is a nice highlight on the consensus definition in Paxos!