Add a cosigning protocol to ensure finalizations are unique (#433)

* Add a function to deterministically decide which Serai blocks should be co-signed

Uses a 5-minute latency between co-signs, which is also the maximum latency
before a co-sign is started.
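
A minimal sketch of how such a decision can be made deterministic, assuming it
depends only on block timestamps so every validator reaches the same answer.
The names `COSIGN_INTERVAL_SECS`, `should_cosign`, and `select_cosigned_blocks`
are hypothetical and are not taken from this commit:

```rust
/// Hypothetical constant: the 5-minute latency, in seconds.
const COSIGN_INTERVAL_SECS: u64 = 5 * 60;

/// Decide whether a block should be cosigned using only on-chain data:
/// the block's timestamp and the timestamp of the last cosigned block.
fn should_cosign(last_cosigned_time: u64, block_time: u64) -> bool {
  block_time.saturating_sub(last_cosigned_time) >= COSIGN_INTERVAL_SECS
}

/// Walk a chain of block timestamps and return the indexes of the blocks
/// which would be selected for cosigning.
fn select_cosigned_blocks(genesis_time: u64, block_times: &[u64]) -> Vec<usize> {
  let mut last = genesis_time;
  let mut selected = vec![];
  for (i, time) in block_times.iter().enumerate() {
    if should_cosign(last, *time) {
      selected.push(i);
      // The interval restarts from the block which was just selected
      last = *time;
    }
  }
  selected
}
```

Because the function has no inputs besides chain data, two nodes which have
seen the same blocks select the same blocks without communicating.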

* Get all active tributaries we're in at a specific block

* Add and route CosignSubstrateBlock, a new provided TX

* Split queued cosigns per network

* Rename BatchSignId to SubstrateSignId

* Add SubstrateSignableId, a meta-type for either Batch or Block, and modularize around it
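
A rough sketch of the shape of this meta-type. The variant
`CosigningSubstrateBlock` and the fields `key`, `id`, and `attempt` appear in
the test below; the `Batch` payload, the field types, and the `route` function
are assumptions made for illustration:

```rust
/// One ID type covering both things the Substrate signer may sign.
#[derive(Clone, PartialEq, Eq, Debug)]
enum SubstrateSignableId {
  CosigningSubstrateBlock([u8; 32]),
  // Hypothetical payload: the real Batch identifier type isn't shown here
  Batch(u32),
}

/// Field types are inferred from how the test constructs this struct.
#[derive(Clone, PartialEq, Eq, Debug)]
struct SubstrateSignId {
  key: [u8; 32],
  id: SubstrateSignableId,
  attempt: u32,
}

/// Hypothetical router: picks the handler from the ID's variant, so the
/// surrounding message plumbing can be shared by both protocols.
fn route(id: &SubstrateSignId) -> &'static str {
  match id.id {
    SubstrateSignableId::CosigningSubstrateBlock(_) => "cosigner",
    SubstrateSignableId::Batch(_) => "batch signer",
  }
}
```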

* Handle the CosignSubstrateBlock provided TX

* Revert substrate_signer.rs to develop (and patch to still work)

Due to SubstrateSigner moving when the prior multisig closes, yet cosigning
occurring with the most recent key, a single SubstrateSigner can be reused.
We could manage multiple SubstrateSigners, yet considering the much lower
specifications for cosigning, I'd rather treat it distinctly.

* Route cosigning through the processor

* Add note to rename SubstrateSigner post-PR

I don't want to do so now in order to preserve the diff's clarity.

* Implement cosign evaluation in the coordinator

* Get tests to compile

* Bug fixes, mark blocks without cosigners available as cosigned

* Correct the ID Batch preprocesses are saved under, add log statements

* Create a dedicated function to handle cosigns

* Correct the flow around Batch verification/queueing

Verifying `Batch`s could stall when a `Batch` was signed before its
predecessors/before the block it's contained in was cosigned (the latter being
inevitable as we can't sign a block containing a signed batch before signing
the batch).

Now, Batch verification happens on a distinct async task in order to not block
the handling of processor messages. This task is the sole caller of verify in
order to ensure last_verified_batch isn't unexpectedly mutated.

When the processor message handler needs to access it, or needs to queue a
Batch, it associates the DB TXN with a lock preventing the other task from
doing so.
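
The flow described above can be sketched as follows. This is not the
coordinator's code: it uses a `std::sync::Mutex` in place of a lock tied to a
DB TXN, and `BatchState`, `queue_batch`, and `verify_ready` are hypothetical
names:

```rust
use std::{collections::BTreeMap, sync::Mutex};

/// Hypothetical shared state: queued batches and the last verified ID.
#[derive(Default)]
struct BatchState {
  queued: BTreeMap<u32, Vec<u8>>,
  last_verified: Option<u32>,
}

/// Called by the message handler. It only queues and never verifies, so a
/// Batch arriving before its predecessors can't stall message handling.
fn queue_batch(state: &Mutex<BatchState>, id: u32, batch: Vec<u8>) {
  state.lock().unwrap().queued.insert(id, batch);
}

/// Called solely by the verification task. It verifies batches strictly in
/// order and stops at the first gap, returning how many were verified.
fn verify_ready(state: &Mutex<BatchState>) -> usize {
  let mut state = state.lock().unwrap();
  let mut verified = 0;
  loop {
    let next = state.last_verified.map_or(0, |id| id + 1);
    if state.queued.remove(&next).is_none() {
      break;
    }
    // Only this function mutates last_verified
    state.last_verified = Some(next);
    verified += 1;
  }
  verified
}
```

Keeping `last_verified` behind a single writer is what lets the handler queue
out-of-order batches without waiting on verification.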

This lock, as currently implemented, is a poor and inefficient design. It
should be modified to the pattern used for cosign management. Additionally, a
new primitive of a DB-backed channel may be immensely valuable.
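
One possible shape for such a DB-backed channel, sketched over an in-memory
map standing in for the database. The key layout and the names `Kv`, `send`,
and `try_recv` are all hypothetical:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a key-value database.
type Kv = HashMap<Vec<u8>, Vec<u8>>;

/// Read a little-endian u64 counter, defaulting to 0 if it was never written.
fn read_index(db: &Kv, key: &[u8]) -> u64 {
  db.get(key).map_or(0, |bytes| {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    u64::from_le_bytes(buf)
  })
}

/// Key under which the message with the given index is stored.
fn msg_key(index: u64) -> Vec<u8> {
  let mut key = b"msg_".to_vec();
  key.extend_from_slice(&index.to_le_bytes());
  key
}

/// Append a message and advance the send counter.
fn send(db: &mut Kv, msg: &[u8]) {
  let index = read_index(db, b"next_send");
  db.insert(msg_key(index), msg.to_vec());
  db.insert(b"next_send".to_vec(), (index + 1).to_le_bytes().to_vec());
}

/// Pop the oldest unread message, if any, and advance the receive counter.
fn try_recv(db: &mut Kv) -> Option<Vec<u8>> {
  let index = read_index(db, b"next_recv");
  let msg = db.remove(&msg_key(index))?;
  db.insert(b"next_recv".to_vec(), (index + 1).to_le_bytes().to_vec());
  Some(msg)
}
```

Since both counters and all messages live in the store, a real database would
let the sender and receiver resume where they left off after a restart, as
long as each call's writes are committed atomically.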

Fixes a standing potential deadlock and a deadlock introduced with the
cosigning protocol.

* Working full-stack tests

After the last commit, this only required extending a timeout.

* Replace "co-sign" with "cosign" to make finding text easier

* Update the coordinator tests to support cosigning

* Inline prior_batch calculation to prevent panic on rotation

Noticed when doing a final review of the branch.
Luke Parker
2023-11-15 16:57:21 -05:00
committed by GitHub
parent 79e4cce2f6
commit 96f1d26f7a
29 changed files with 1900 additions and 348 deletions

@@ -0,0 +1,126 @@
use std::collections::HashMap;

use rand_core::{RngCore, OsRng};

use ciphersuite::group::GroupEncoding;
use frost::{
  curve::Ristretto,
  Participant,
  dkg::tests::{key_gen, clone_without},
};

use sp_application_crypto::{RuntimePublic, sr25519::Public};

use serai_db::{DbTxn, Db, MemDb};

use serai_client::primitives::*;

use messages::coordinator::*;
use crate::cosigner::Cosigner;

#[tokio::test]
async fn test_cosigner() {
  // Generate a fresh set of threshold keys for the test
  let keys = key_gen::<_, Ristretto>(&mut OsRng);

  let participant_one = Participant::new(1).unwrap();

  let block = [0xaa; 32];

  let actual_id = SubstrateSignId {
    key: keys.values().next().unwrap().group_key().to_bytes(),
    id: SubstrateSignableId::CosigningSubstrateBlock(block),
    attempt: (OsRng.next_u64() >> 32).try_into().unwrap(),
  };

  // Randomly select t distinct participants to form the signing set
  let mut signing_set = vec![];
  while signing_set.len() < usize::from(keys.values().next().unwrap().params().t()) {
    let candidate = Participant::new(
      u16::try_from((OsRng.next_u64() % u64::try_from(keys.len()).unwrap()) + 1).unwrap(),
    )
    .unwrap();
    if signing_set.contains(&candidate) {
      continue;
    }
    signing_set.push(candidate);
  }

  // Create a Cosigner, backed by its own DB, for every participant
  let mut signers = HashMap::new();
  let mut dbs = HashMap::new();
  let mut preprocesses = HashMap::new();
  for i in 1 ..= keys.len() {
    let i = Participant::new(u16::try_from(i).unwrap()).unwrap();
    let keys = keys.get(&i).unwrap().clone();

    let mut db = MemDb::new();
    let mut txn = db.txn();
    let (signer, preprocess) =
      Cosigner::new(&mut txn, vec![keys], block, actual_id.attempt).unwrap();

    match preprocess {
      // All participants should emit a preprocess
      ProcessorMessage::CosignPreprocess { id, preprocesses: mut these_preprocesses } => {
        assert_eq!(id, actual_id);
        assert_eq!(these_preprocesses.len(), 1);
        // Only keep the preprocesses of those in the signing set
        if signing_set.contains(&i) {
          preprocesses.insert(i, these_preprocesses.swap_remove(0));
        }
      }
      _ => panic!("didn't get preprocess back"),
    }
    txn.commit();

    signers.insert(i, signer);
    dbs.insert(i, db);
  }

  // Give each signer the others' preprocesses and collect their shares
  let mut shares = HashMap::new();
  for i in &signing_set {
    let mut txn = dbs.get_mut(i).unwrap().txn();
    match signers
      .get_mut(i)
      .unwrap()
      .handle(
        &mut txn,
        CoordinatorMessage::SubstratePreprocesses {
          id: actual_id.clone(),
          preprocesses: clone_without(&preprocesses, i),
        },
      )
      .await
      .unwrap()
    {
      ProcessorMessage::SubstrateShare { id, shares: mut these_shares } => {
        assert_eq!(id, actual_id);
        assert_eq!(these_shares.len(), 1);
        shares.insert(*i, these_shares.swap_remove(0));
      }
      _ => panic!("didn't get share back"),
    }
    txn.commit();
  }

  // Give each signer the others' shares and check the resulting cosign
  for i in &signing_set {
    let mut txn = dbs.get_mut(i).unwrap().txn();
    match signers
      .get_mut(i)
      .unwrap()
      .handle(
        &mut txn,
        CoordinatorMessage::SubstrateShares {
          id: actual_id.clone(),
          shares: clone_without(&shares, i),
        },
      )
      .await
      .unwrap()
    {
      ProcessorMessage::CosignedBlock { block: signed_block, signature } => {
        assert_eq!(signed_block, block);
        // The signature must verify against the group key
        assert!(Public::from_raw(keys[&participant_one].group_key().to_bytes())
          .verify(&cosign_block_msg(block), &Signature(signature.try_into().unwrap())));
      }
      _ => panic!("didn't get cosigned block back"),
    }
    txn.commit();
  }
}
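
The test passes each signer every preprocess and share except its own by
calling `clone_without`. A minimal sketch of what that helper is assumed to
do, written here as a stand-alone function rather than copied from `frost`:

```rust
use std::{collections::HashMap, hash::Hash};

/// Clone a map, omitting a single key (here, the signer's own entry).
fn clone_without<K: Clone + Eq + Hash, V: Clone>(
  map: &HashMap<K, V>,
  without: &K,
) -> HashMap<K, V> {
  let mut res = map.clone();
  res.remove(without);
  res
}
```

The original map is left untouched, which is why the test can reuse
`preprocesses` and `shares` across every iteration of its loops.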