Add a cosigning protocol to ensure finalizations are unique (#433)

* Add a function to deterministically decide which Serai blocks should be co-signed

Has a 5 minute latency between co-signs, also used as the maximal latency
before a co-sign is started.

* Get all active tributaries we're in at a specific block

* Add and route CosignSubstrateBlock, a new provided TX

* Split queued cosigns per network

* Rename BatchSignId to SubstrateSignId

* Add SubstrateSignableId, a meta-type for either Batch or Block, and modularize around it

* Handle the CosignSubstrateBlock provided TX

* Revert substrate_signer.rs to develop (and patch to still work)

Due to SubstrateSigner moving when the prior multisig closes, yet cosigning
occurring with the most recent key, a single SubstrateSigner can be reused.
We could manage multiple SubstrateSigners, yet considering the much lower
specifications for cosigning, I'd rather treat it distinctly.

* Route cosigning through the processor

* Add note to rename SubstrateSigner post-PR

I don't want to do so now in order to preserve the diff's clarity.

* Implement cosign evaluation into the coordinator

* Get tests to compile

* Bug fixes, mark blocks without cosigners available as cosigned

* Correct the ID Batch preprocesses are saved under, add log statements

* Create a dedicated function to handle cosigns

* Correct the flow around Batch verification/queueing

Verifying `Batch`s could stall when a `Batch` was signed before its
predecessors/before the block it's contained in was cosigned (the latter being
inevitable as we can't sign a block containing a signed batch before signing
the batch).

Now, Batch verification happens on a distinct async task in order to not block
the handling of processor messages. This task is the sole caller of verify in
order to ensure last_verified_batch isn't unexpectedly mutated.

When the processor message handler needs to access it, or needs to queue a
Batch, it associates the DB TXN with a lock preventing the other task from
doing so.

This lock, as currently implemented, is a poor and inefficient design. It
should be modified to the pattern used for cosign management. Additionally, a
new primitive of a DB-backed channel may be immensely valuable.

Fixes a standing potential deadlock and a deadlock introduced with the
cosigning protocol.

* Working full-stack tests

After the last commit, this only required extending a timeout.

* Replace "co-sign" with "cosign" to make finding text easier

* Update the coordinator tests to support cosigning

* Inline prior_batch calculation to prevent panic on rotation

Noticed when doing a final review of the branch.
This commit is contained in:
Luke Parker
2023-11-15 16:57:21 -05:00
committed by GitHub
parent 79e4cce2f6
commit 96f1d26f7a
29 changed files with 1900 additions and 348 deletions

View File

@@ -156,20 +156,37 @@ pub mod sign {
pub mod coordinator {
use super::*;
pub fn cosign_block_msg(block: [u8; 32]) -> Vec<u8> {
const DST: &[u8] = b"Cosign";
let mut res = vec![u8::try_from(DST.len()).unwrap()];
res.extend(DST);
res.extend(block);
res
}
#[derive(
Clone, Copy, PartialEq, Eq, Hash, Debug, Zeroize, Encode, Decode, Serialize, Deserialize,
)]
pub enum SubstrateSignableId {
CosigningSubstrateBlock([u8; 32]),
Batch([u8; 5]),
}
#[derive(Clone, PartialEq, Eq, Hash, Debug, Zeroize, Encode, Decode, Serialize, Deserialize)]
pub struct BatchSignId {
pub struct SubstrateSignId {
pub key: [u8; 32],
pub id: [u8; 5],
pub id: SubstrateSignableId,
pub attempt: u32,
}
#[derive(Clone, PartialEq, Eq, Debug, Serialize, Deserialize)]
pub enum CoordinatorMessage {
CosignSubstrateBlock { id: SubstrateSignId },
// Uses Vec<u8> instead of [u8; 64] since serde Deserialize isn't implemented for [u8; 64]
BatchPreprocesses { id: BatchSignId, preprocesses: HashMap<Participant, Vec<u8>> },
BatchShares { id: BatchSignId, shares: HashMap<Participant, [u8; 32]> },
SubstratePreprocesses { id: SubstrateSignId, preprocesses: HashMap<Participant, Vec<u8>> },
SubstrateShares { id: SubstrateSignId, shares: HashMap<Participant, [u8; 32]> },
// Re-attempt a batch signing protocol.
BatchReattempt { id: BatchSignId },
BatchReattempt { id: SubstrateSignId },
}
impl CoordinatorMessage {
@@ -179,16 +196,18 @@ pub mod coordinator {
// This synchrony obtained lets us ignore the synchrony requirement offered here
pub fn required_block(&self) -> Option<BlockHash> {
match self {
CoordinatorMessage::BatchPreprocesses { .. } => None,
CoordinatorMessage::BatchShares { .. } => None,
CoordinatorMessage::CosignSubstrateBlock { .. } => None,
CoordinatorMessage::SubstratePreprocesses { .. } => None,
CoordinatorMessage::SubstrateShares { .. } => None,
CoordinatorMessage::BatchReattempt { .. } => None,
}
}
pub fn key(&self) -> &[u8] {
match self {
CoordinatorMessage::BatchPreprocesses { id, .. } => &id.key,
CoordinatorMessage::BatchShares { id, .. } => &id.key,
CoordinatorMessage::CosignSubstrateBlock { id } => &id.key,
CoordinatorMessage::SubstratePreprocesses { id, .. } => &id.key,
CoordinatorMessage::SubstrateShares { id, .. } => &id.key,
CoordinatorMessage::BatchReattempt { id } => &id.key,
}
}
@@ -203,9 +222,11 @@ pub mod coordinator {
#[derive(Clone, PartialEq, Eq, Debug, Zeroize, Serialize, Deserialize)]
pub enum ProcessorMessage {
SubstrateBlockAck { network: NetworkId, block: u64, plans: Vec<PlanMeta> },
InvalidParticipant { id: BatchSignId, participant: Participant },
BatchPreprocess { id: BatchSignId, block: BlockHash, preprocesses: Vec<Vec<u8>> },
BatchShare { id: BatchSignId, shares: Vec<[u8; 32]> },
InvalidParticipant { id: SubstrateSignId, participant: Participant },
CosignPreprocess { id: SubstrateSignId, preprocesses: Vec<Vec<u8>> },
BatchPreprocess { id: SubstrateSignId, block: BlockHash, preprocesses: Vec<Vec<u8>> },
SubstrateShare { id: SubstrateSignId, shares: Vec<[u8; 32]> },
CosignedBlock { block: [u8; 32], signature: Vec<u8> },
}
}
@@ -350,10 +371,12 @@ impl CoordinatorMessage {
}
CoordinatorMessage::Coordinator(msg) => {
let (sub, id) = match msg {
// Unique since this embeds the batch ID (hash of it, including its network) and attempt
coordinator::CoordinatorMessage::BatchPreprocesses { id, .. } => (0, id.encode()),
coordinator::CoordinatorMessage::BatchShares { id, .. } => (1, id.encode()),
coordinator::CoordinatorMessage::BatchReattempt { id, .. } => (2, id.encode()),
// Unique since this is the entire message
coordinator::CoordinatorMessage::CosignSubstrateBlock { id } => (0, id.encode()),
// Unique since this embeds the batch ID (including its network) and attempt
coordinator::CoordinatorMessage::SubstratePreprocesses { id, .. } => (1, id.encode()),
coordinator::CoordinatorMessage::SubstrateShares { id, .. } => (2, id.encode()),
coordinator::CoordinatorMessage::BatchReattempt { id, .. } => (3, id.encode()),
};
let mut res = vec![COORDINATOR_UID, TYPE_COORDINATOR_UID, sub];
@@ -420,10 +443,12 @@ impl ProcessorMessage {
coordinator::ProcessorMessage::SubstrateBlockAck { network, block, .. } => {
(0, (network, block).encode())
}
// Unique since BatchSignId
// Unique since SubstrateSignId
coordinator::ProcessorMessage::InvalidParticipant { id, .. } => (1, id.encode()),
coordinator::ProcessorMessage::BatchPreprocess { id, .. } => (2, id.encode()),
coordinator::ProcessorMessage::BatchShare { id, .. } => (3, id.encode()),
coordinator::ProcessorMessage::CosignPreprocess { id, .. } => (2, id.encode()),
coordinator::ProcessorMessage::BatchPreprocess { id, .. } => (3, id.encode()),
coordinator::ProcessorMessage::SubstrateShare { id, .. } => (4, id.encode()),
coordinator::ProcessorMessage::CosignedBlock { block, .. } => (5, block.encode()),
};
let mut res = vec![PROCESSSOR_UID, TYPE_COORDINATOR_UID, sub];