Add a cosigning protocol to ensure finalizations are unique (#433)

* Add a function to deterministically decide which Serai blocks should be co-signed

Has a 5-minute latency between co-signs, which is also used as the maximum
latency before a co-sign is started.
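
As a rough illustration of the 5-minute rule, here is a minimal sketch. It
assumes the decision is made purely from block timestamps (in seconds) and that
a block is co-signed once at least five minutes have passed since the last
co-signed block. `COSIGN_INTERVAL_SECS` and `should_cosign` are hypothetical
names, not the function added by this commit, which may weigh other factors.

```rust
/// Assumed interval between co-signs, in seconds (5 minutes, per the text above).
const COSIGN_INTERVAL_SECS: u64 = 5 * 60;

/// Hypothetical helper: decide whether a block should be co-signed, given the
/// timestamp of the last co-signed block and the timestamp of this block.
///
/// The result only depends on its two inputs, so every validator evaluating
/// the same chain reaches the same decision.
fn should_cosign(last_cosigned_time: u64, block_time: u64) -> bool {
  // saturating_sub avoids an underflow if timestamps are ever out of order
  block_time.saturating_sub(last_cosigned_time) >= COSIGN_INTERVAL_SECS
}
```

With this rule, a block 299 seconds after the last co-sign is skipped, while a
block 300 or more seconds after it is selected.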

* Get all active tributaries we're in at a specific block

* Add and route CosignSubstrateBlock, a new provided TX

* Split queued cosigns per network

* Rename BatchSignId to SubstrateSignId

* Add SubstrateSignableId, a meta-type for either Batch or Block, and modularize around it

* Handle the CosignSubstrateBlock provided TX

* Revert substrate_signer.rs to develop (and patch to still work)

Due to SubstrateSigner moving when the prior multisig closes, yet cosigning
occurring with the most recent key, a single SubstrateSigner can be reused.
We could manage multiple SubstrateSigners, yet considering the much lower
specifications for cosigning, I'd rather treat it distinctly.

* Route cosigning through the processor

* Add note to rename SubstrateSigner post-PR

I don't want to do so now in order to preserve the diff's clarity.

* Implement cosign evaluation into the coordinator

* Get tests to compile

* Bug fixes, mark blocks without cosigners available as cosigned

* Correct the ID Batch preprocesses are saved under, add log statements

* Create a dedicated function to handle cosigns

* Correct the flow around Batch verification/queueing

Verifying `Batch`s could stall when a `Batch` was signed before its
predecessors/before the block it's contained in was cosigned (the latter being
inevitable as we can't sign a block containing a signed batch before signing
the batch).

Now, Batch verification happens on a distinct async task in order to not block
the handling of processor messages. This task is the sole caller of verify in
order to ensure last_verified_batch isn't unexpectedly mutated.

When the processor message handler needs to access it, or needs to queue a
Batch, it associates the DB TXN with a lock preventing the other task from
doing so.

This lock, as currently implemented, is a poor and inefficient design. It
should be modified to the pattern used for cosign management. Additionally, a
new primitive of a DB-backed channel may be immensely valuable.

Fixes a standing potential deadlock and a deadlock introduced with the
cosigning protocol.
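
The "sole caller of verify" idea can be sketched as follows. This is not the
code from this commit: it uses an OS thread and an in-memory channel instead of
an async task and a lock tied to the DB TXN, and `Batch` and `spawn_verifier`
are hypothetical stand-ins. It only shows how giving one task ownership of
`last_verified_batch` lets batches arrive out of order without stalling the
sender.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a signed Batch, identified by a sequential ID.
struct Batch {
  id: u32,
}

/// Spawn the verification task. It owns `last_verified_batch`, so no other
/// code can mutate it. Returns a sender used to queue batches and a handle
/// which yields the last verified ID once every sender has been dropped.
fn spawn_verifier() -> (Sender<Batch>, JoinHandle<Option<u32>>) {
  let (send, recv) = channel::<Batch>();
  let handle = thread::spawn(move || {
    let mut last_verified_batch: Option<u32> = None;
    // Batches which arrived before their predecessors
    let mut pending = BTreeMap::new();
    for batch in recv {
      pending.insert(batch.id, batch);
      // Verify in order, as far as the queued batches allow
      loop {
        let next = last_verified_batch.map_or(0, |id| id + 1);
        let Some(_batch) = pending.remove(&next) else { break };
        // Actual verification of `_batch` would happen here
        last_verified_batch = Some(next);
      }
    }
    last_verified_batch
  });
  (send, handle)
}
```

The message handler only ever calls `send`, which does not block, so a batch
signed before its predecessor waits in `pending` rather than holding up other
processor messages.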

* Working full-stack tests

After the last commit, this only required extending a timeout.

* Replace "co-sign" with "cosign" to make finding text easier

* Update the coordinator tests to support cosigning

* Inline prior_batch calculation to prevent panic on rotation

Noticed when doing a final review of the branch.
This commit is contained in:
Luke Parker
2023-11-15 16:57:21 -05:00
committed by GitHub
parent 79e4cce2f6
commit 96f1d26f7a
29 changed files with 1900 additions and 348 deletions

@@ -1,4 +1,7 @@
use core::ops::{Deref, Range};
use core::{
ops::{Deref, Range},
fmt::Debug,
};
use std::io::{self, Read, Write};
use zeroize::Zeroizing;
@@ -15,6 +18,7 @@ use schnorr::SchnorrSignature;
use frost::Participant;
use scale::{Encode, Decode};
use processor_messages::coordinator::SubstrateSignableId;
use serai_client::{
primitives::{NetworkId, PublicKey},
@@ -167,8 +171,8 @@ impl TributarySpec {
}
#[derive(Clone, PartialEq, Eq, Debug)]
pub struct SignData<const N: usize> {
pub plan: [u8; N],
pub struct SignData<Id: Clone + PartialEq + Eq + Debug + Encode + Decode> {
pub plan: Id,
pub attempt: u32,
pub data: Vec<Vec<u8>>,
@@ -176,10 +180,10 @@ pub struct SignData<const N: usize> {
pub signed: Signed,
}
impl<const N: usize> ReadWrite for SignData<N> {
impl<Id: Clone + PartialEq + Eq + Debug + Encode + Decode> ReadWrite for SignData<Id> {
fn read<R: io::Read>(reader: &mut R) -> io::Result<Self> {
let mut plan = [0; N];
reader.read_exact(&mut plan)?;
let plan = Id::decode(&mut scale::IoReader(&mut *reader))
.map_err(|_| io::Error::new(io::ErrorKind::Other, "invalid plan in SignData"))?;
let mut attempt = [0; 4];
reader.read_exact(&mut attempt)?;
@@ -208,7 +212,7 @@ impl<const N: usize> ReadWrite for SignData<N> {
}
fn write<W: io::Write>(&self, writer: &mut W) -> io::Result<()> {
writer.write_all(&self.plan)?;
writer.write_all(&self.plan.encode())?;
writer.write_all(&self.attempt.to_le_bytes())?;
writer.write_all(&[u8::try_from(self.data.len()).unwrap()])?;
@@ -253,6 +257,9 @@ pub enum Transaction {
},
DkgConfirmed(u32, [u8; 32], Signed),
// Co-sign a Substrate block.
CosignSubstrateBlock([u8; 32]),
// When we have synchrony on a batch, we can allow signing it
// TODO (never?): This is less efficient compared to an ExternalBlock provided transaction,
// which would be binding over the block hash and automatically achieve synchrony on all
@@ -263,11 +270,11 @@ pub enum Transaction {
// IDs
SubstrateBlock(u64),
BatchPreprocess(SignData<5>),
BatchShare(SignData<5>),
SubstratePreprocess(SignData<SubstrateSignableId>),
SubstrateShare(SignData<SubstrateSignableId>),
SignPreprocess(SignData<32>),
SignShare(SignData<32>),
SignPreprocess(SignData<[u8; 32]>),
SignShare(SignData<[u8; 32]>),
// This is defined as an Unsigned transaction in order to de-duplicate SignCompleted amongst
// reporters (who should all report the same thing)
// We do still track the signer in order to prevent a single signer from publishing arbitrarily
@@ -415,6 +422,12 @@ impl ReadWrite for Transaction {
}
5 => {
let mut block = [0; 32];
reader.read_exact(&mut block)?;
Ok(Transaction::CosignSubstrateBlock(block))
}
6 => {
let mut block = [0; 32];
reader.read_exact(&mut block)?;
let mut batch = [0; 5];
@@ -422,19 +435,19 @@ impl ReadWrite for Transaction {
Ok(Transaction::Batch(block, batch))
}
6 => {
7 => {
let mut block = [0; 8];
reader.read_exact(&mut block)?;
Ok(Transaction::SubstrateBlock(u64::from_le_bytes(block)))
}
7 => SignData::read(reader).map(Transaction::BatchPreprocess),
8 => SignData::read(reader).map(Transaction::BatchShare),
8 => SignData::read(reader).map(Transaction::SubstratePreprocess),
9 => SignData::read(reader).map(Transaction::SubstrateShare),
9 => SignData::read(reader).map(Transaction::SignPreprocess),
10 => SignData::read(reader).map(Transaction::SignShare),
10 => SignData::read(reader).map(Transaction::SignPreprocess),
11 => SignData::read(reader).map(Transaction::SignShare),
11 => {
12 => {
let mut plan = [0; 32];
reader.read_exact(&mut plan)?;
@@ -534,36 +547,41 @@ impl ReadWrite for Transaction {
signed.write(writer)
}
Transaction::Batch(block, batch) => {
Transaction::CosignSubstrateBlock(block) => {
writer.write_all(&[5])?;
writer.write_all(block)
}
Transaction::Batch(block, batch) => {
writer.write_all(&[6])?;
writer.write_all(block)?;
writer.write_all(batch)
}
Transaction::SubstrateBlock(block) => {
writer.write_all(&[6])?;
writer.write_all(&[7])?;
writer.write_all(&block.to_le_bytes())
}
Transaction::BatchPreprocess(data) => {
writer.write_all(&[7])?;
Transaction::SubstratePreprocess(data) => {
writer.write_all(&[8])?;
data.write(writer)
}
Transaction::BatchShare(data) => {
writer.write_all(&[8])?;
Transaction::SubstrateShare(data) => {
writer.write_all(&[9])?;
data.write(writer)
}
Transaction::SignPreprocess(data) => {
writer.write_all(&[9])?;
data.write(writer)
}
Transaction::SignShare(data) => {
writer.write_all(&[10])?;
data.write(writer)
}
Transaction::SignCompleted { plan, tx_hash, first_signer, signature } => {
Transaction::SignShare(data) => {
writer.write_all(&[11])?;
data.write(writer)
}
Transaction::SignCompleted { plan, tx_hash, first_signer, signature } => {
writer.write_all(&[12])?;
writer.write_all(plan)?;
writer
.write_all(&[u8::try_from(tx_hash.len()).expect("tx hash length exceed 255 bytes")])?;
@@ -585,11 +603,13 @@ impl TransactionTrait for Transaction {
Transaction::InvalidDkgShare { signed, .. } => TransactionKind::Signed(signed),
Transaction::DkgConfirmed(_, _, signed) => TransactionKind::Signed(signed),
Transaction::CosignSubstrateBlock(_) => TransactionKind::Provided("cosign"),
Transaction::Batch(_, _) => TransactionKind::Provided("batch"),
Transaction::SubstrateBlock(_) => TransactionKind::Provided("serai"),
Transaction::BatchPreprocess(data) => TransactionKind::Signed(&data.signed),
Transaction::BatchShare(data) => TransactionKind::Signed(&data.signed),
Transaction::SubstratePreprocess(data) => TransactionKind::Signed(&data.signed),
Transaction::SubstrateShare(data) => TransactionKind::Signed(&data.signed),
Transaction::SignPreprocess(data) => TransactionKind::Signed(&data.signed),
Transaction::SignShare(data) => TransactionKind::Signed(&data.signed),
@@ -607,7 +627,7 @@ impl TransactionTrait for Transaction {
}
fn verify(&self) -> Result<(), TransactionError> {
if let Transaction::BatchShare(data) = self {
if let Transaction::SubstrateShare(data) = self {
for data in &data.data {
if data.len() != 32 {
Err(TransactionError::InvalidContent)?;
@@ -655,11 +675,13 @@ impl Transaction {
Transaction::InvalidDkgShare { ref mut signed, .. } => signed,
Transaction::DkgConfirmed(_, _, ref mut signed) => signed,
Transaction::CosignSubstrateBlock(_) => panic!("signing CosignSubstrateBlock"),
Transaction::Batch(_, _) => panic!("signing Batch"),
Transaction::SubstrateBlock(_) => panic!("signing SubstrateBlock"),
Transaction::BatchPreprocess(ref mut data) => &mut data.signed,
Transaction::BatchShare(ref mut data) => &mut data.signed,
Transaction::SubstratePreprocess(ref mut data) => &mut data.signed,
Transaction::SubstrateShare(ref mut data) => &mut data.signed,
Transaction::SignPreprocess(ref mut data) => &mut data.signed,
Transaction::SignShare(ref mut data) => &mut data.signed,