serai/substrate/tendermint/machine/src/lib.rs
Luke Parker 8f4d6f79f3 Initial Tendermint implementation (#145)
* Machine without timeouts

* Time code

* Move substrate/consensus/tendermint to substrate/tendermint

* Delete the old paper doc

* Refactor out external parts to generics

Also creates a dedicated file for the message log.

* Refactor <V, B> to type V, type B

* Successfully compiling

* Calculate timeouts

* Fix test

* Finish timeouts

* Misc cleanup

* Define a signature scheme trait

* Implement serialization via parity's scale codec

Ideally, this would be generic. Unfortunately, the generic API serde 
doesn't natively support borsh, nor SCALE, and while there is a serde 
SCALE crate, it's old. While it may be complete, it's not worth working 
with.

While we could still grab bincode, and a variety of other formats, it 
wasn't worth it to go custom and for Serai, we'll be using SCALE almost 
everywhere anyways.

* Implement usage of the signature scheme

* Make the infinite test non-infinite

* Provide a dedicated signature in Precommit of just the block hash

Greatly simplifies verifying when syncing.

* Dedicated Commit object

Restores sig aggregation API.

* Tidy README

* Document tendermint

* Sign the ID directly instead of its SCALE encoding

For a hash, which is fixed-size, these should be the same yet this helps 
move past the dependency on SCALE. It also, for any type where the two 
values are different, smooths integration.

* Litany of bug fixes

Also attempts to make the code more readable while updating/correcting 
documentation.

* Remove async recursion

Greatly increases safety as well by ensuring only one message is 
processed at once.

* Correct timing issues

1) Commit didn't include the round, leaving the clock in question.

2) Machines started with a local time, instead of a proper start time.

3) Machines immediately started the next block instead of waiting for 
the block time.

* Replace MultiSignature with sr25519::Signature

* Minor SignatureScheme API changes

* Map TM SignatureScheme to Substrate's sr25519

* Initial work on an import queue

* Properly use check_block

* Rename import to import_queue

* Implement tendermint_machine::Block for Substrate Blocks

Unfortunately, this immediately makes Tendermint machine incapable of 
deployment as a crate since it uses a git reference. In the future, a 
Cargo.toml patch section for serai/substrate should be investigated. 
This is being done regardless as it's the quickest way forward and this 
is for Serai.

* Dummy Weights

* Move documentation to the top of the file

* Move logic into TendermintImport itself

Multiple traits exist to verify/handle blocks. I'm unsure exactly when 
each will be called in the pipeline, so the easiest solution is to have 
every step run every check.

That would be extremely computationally expensive if we ran EVERY check, 
yet we rely on Substrate for execution (and according checks), which are 
limited to just the actual import function.

Since we're calling this code from many places, it makes sense for it to 
be consolidated under TendermintImport.

* BlockImport, JustificationImport, Verifier, and import_queue function

* Update consensus/lib.rs from PoW to Tendermint

Not possible to be used as the previous consensus could. It will not
produce blocks nor does it currently even instantiate a machine. This is
just the next step.

* Update Cargo.tomls for substrate packages

* Tendermint SelectChain

This is incompatible with Substrate's expectations, yet should be valid 
for ours

* Move the node over to the new SelectChain

* Minor tweaks

* Update SelectChain documentation

* Remove substrate/node lib.rs

This shouldn't be used as a library AFAIK. While runtime should be, and 
arguably should even be published, I have yet to see node in the same 
way. Helps tighten API boundaries.

* Remove unused macro_use

* Replace panicking todos with stubs and // TODO

Enables progress.

* Reduce chain_spec and use more accurate naming

* Implement block proposal logic

* Modularize to get_proposal

* Trigger block importing

Doesn't wait for the response yet, which it needs to.

* Get the result of block importing

* Split import_queue into a series of files

* Provide a way to create the machine

The BasicQueue returned obscures the TendermintImport struct. 
Accordingly, a Future scoped with access is returned upwards, which when 
awaited will create the machine. This makes creating the machine 
optional while maintaining scope boundaries.

Is sufficient to create a 1-node net which produces and finalizes 
blocks.

* Don't import justifications multiple times

Also don't broadcast blocks which were solely proposed.

* Correct justification import pipeline

Removes JustificationImport as it should never be used.

* Announce blocks

By claiming File, they're not sent over the P2P network before they 
have a justification, as desired. Unfortunately, they never were. This 
works around that.

* Add an assert to verify proposed children aren't best

* Consolidate C and I generics into a TendermintClient trait alias

* Expand sanity checks

Substrate doesn't expect nor officially support children with less work 
than their parents. It's a trick used here. Accordingly, ensure the 
trick's validity.

* When resetting, use the end time of the round which was committed to

The machine reset to the end time of the current round. For a delayed 
network connection, a machine may move ahead in rounds and only later 
realize a prior round succeeded. Despite acknowledging that round's 
success, it would maintain its delay when moving to the next block, 
bricking it.

Done by tracking the end time for each round as they occur.

* Move Commit from including the round to including the round's end_time

The round was usable to build the current clock in an accumulated 
fashion, relative to the previous round. The end time is the absolute 
metric of it, which can be used to calculate the round number (with all 
previous end times).

Substrate now builds off the best block, not genesis, using the end time 
included in the justification to start its machine in a synchronized 
state.

Knowing the end time of a round, or the round in which a block was 
committed, is necessary for nodes to sync up with Tendermint. 
Encoding it in the commit ensures it's long lasting and makes it readily 
available, without the load of an entire transaction.
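
A minimal sketch of recovering the round number from an absolute end time, in the spirit of what `reset_by_commit` does with `populate_end_time`. The `round_duration` formula, the plain `u64` second timestamps, and both function names are assumptions for illustration, not the machine's actual timing code.

```rust
/// Hypothetical round length in seconds. Assumes later rounds last longer, so
/// each end time can only be reached by accumulating all previous rounds.
fn round_duration(round: u32) -> u64 {
  6 + (3 * u64::from(round))
}

/// Walk forward from the block's start time, accumulating round end times,
/// until the commit's end time is reached or passed.
/// Returns None if `end_time` isn't the end time of any round.
fn round_for_end_time(start: u64, end_time: u64) -> Option<u32> {
  let mut round = 0u32;
  let mut end = start + round_duration(round);
  while end < end_time {
    round += 1;
    end += round_duration(round);
  }
  (end == end_time).then_some(round)
}
```

With a start time of 100, the rounds in this sketch end at 106, 115, and 127, so `round_for_end_time(100, 115)` yields `Some(1)` while an end time of 110 matches no round.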

* Add a TODO on Tendermint

* Misc bug fixes

* More misc bug fixes

* Clean up lock acquisition

* Merge weights and signing scheme into validators, documenting needed changes

* Add pallet sessions to runtime, create pallet-tendermint

* Update node to use pallet sessions

* Update support URL

* Partial work on correcting pallet calls

* Redo Tendermint folder structure

* TendermintApi, compilation fixes

* Fix the stub round robin

At some point, the modulus was removed causing it to exceed the 
validators list and stop proposing.
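
A hedged sketch of the kind of stub described here. The function name, the `u16` validator IDs, and the `block + round` indexing are assumptions; only the use of a modulus is taken from the note above.

```rust
/// Hypothetical round-robin stub: choose the proposer for a (block, round) pair.
fn round_robin_proposer(validators: &[u16], block: u64, round: u32) -> u16 {
  assert!(!validators.is_empty(), "validator set must not be empty");
  // Without the modulus, the index grows past the end of the list as the
  // block number advances, and no validator is ever selected again
  let index = block.wrapping_add(u64::from(round)) % (validators.len() as u64);
  validators[index as usize]
}
```

With three validators, block 3 round 1 maps to index `(3 + 1) % 3 = 1`, so the rotation keeps cycling through the list however high the block number gets.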

* Use the validators list from the session pallet

* Basic Gossip Validator

* Correct Substrate Tendermint start block

The Tendermint machine uses the passed in number as the block's being 
worked on number. Substrate passed in the already finalized block's 
number.

Also updates misc comments.

* Clean generics in Tendermint with a monolith with associated types

* Remove the Future triggering the machine for an async fn

Enables passing data in, such as the network.

* Move TendermintMachine from start_num, time to last_num, time

Provides an explicitly clear API that's easier to program around.

Also adds additional time code to handle an edge case.

* Connect the Tendermint machine to a GossipEngine

* Connect broadcast

* Remove machine from TendermintImport

It's not used there at all.

* Merge Verifier into block_import.rs

These two files were largely the same, just hooking into sync structs 
with almost identical imports. As this project shapes up, removing dead 
weight is appreciated.

* Create a dedicated file for being a Tendermint authority

* Deleted comment code related to PoW

* Move serai_runtime specific code from tendermint/client to node

Renames serai-consensus to sc_tendermint

* Consolidate file structure in sc_tendermint

* Replace best_* with finalized_*

We test their equivalency yet still better to use finalized_* in 
general.

* Consolidate references to sr25519 in sc_tendermint

* Add documentation to public structs/functions in sc_tendermint

* Add another missing comment

* Make sign asynchronous

Some relation to https://github.com/serai-dex/serai/issues/95.

* Move sc_tendermint to async sign

* Implement proper checking of inherents

* Take in a Keystore and validator ID

* Remove unnecessary PhantomDatas

* Update node to latest sc_tendermint

* Configure node for a multi-node testnet

* Fix handling of the GossipEngine

* Use a rounded genesis to obtain sufficient synchrony within the Docker env

* Correct Serai d-f names in Docker

* Remove an attempt at caching I don't believe would ever hit

* Add an already in chain check to block import

While the inner should do this for us, we call verify_order on our end 
*before* inner to ensure sequential import. Accordingly, we need to 
provide our own check.

Removes errors of "non-sequential import" when trying to re-import an 
existing block.

* Update the consensus documentation

It was incredibly out of date.

* Add a _ to the validator arg in slash

* Make the dev profile a local testnet profile

Restores a dev profile which only has one validator, locally running.

* Reduce Arcs in TendermintMachine, split Signer from SignatureScheme

* Update sc_tendermint per previous commit

* Restore cache

* Remove error case which shouldn't be an error

* Stop returning errors on already existing blocks entirely

* Correct Dave, Eve, and Ferdie to not run as validators

* Rename dev to devnet

--dev still works thanks to the |. Achieves a personal preference of 
mine with some historical meaning.

* Add message expiry to the Tendermint gossip

* Localize the LibP2P protocol to the blockchain

Follows convention by doing so. Theoretically enables running multiple 
blockchains over a single LibP2P connection.

* Add a version to sp-runtime in tendermint-machine

* Add missing trait

* Bump Substrate dependency

Fixes #147.

* Implement Schnorr half-aggregation from https://eprint.iacr.org/2021/350.pdf

Relevant to https://github.com/serai-dex/serai/issues/99.

* cargo update (tendermint)

* Move from polling loops to a pure IO model for sc_tendermint's gossip

* Correct protocol name handling

* Use futures mpsc instead of tokio

* Timeout futures

* Move from a yielding loop to select in tendermint-machine

* Update Substrate to the new TendermintHandle

* Use futures pin instead of tokio

* Only recheck blocks with non-fatal inherent transaction errors

* Update to the latest substrate

* Separate the block processing time from the latency

* Add notes to the runtime

* Don't spam slash

Also adds a slash condition of failing to propose.

* Support running TendermintMachine when not a validator

This supports validators who leave the current set, without crashing 
their nodes, along with nodes trying to become validators (who will now 
seamlessly transition in).

* Properly define and pass around the block size

* Correct the Duration timing

The proposer will build it, send it, then process it (on the first 
round). Accordingly, it's / 3, not / 2, as / 2 only accounted for the 
latter events.

* Correct time-adjustment code on round skip

* Have the machine respond to advances made by an external sync loop

* Clean up time code in tendermint-machine

* BlockData and RoundData structs

* Rename Round to RoundNumber

* Move BlockData to a new file

* Move Round to an Option due to the pseudo-uninitialized state we create

Before the addition of RoundData, we always created the round, and on 
.round(0), simply created it again. With RoundData, and the changes to 
the time code, we used round 0, time 0, the latter being incorrect yet 
not an issue due to lack of misuse.

Now, if we do misuse it, it'll panic.

* Clear the Queue instead of draining and filtering

There shouldn't ever be a message which passes the filter under the 
current design.

* BlockData::new

* Move more code into block.rs

Introduces type-aliases to obtain Data/Message/SignedMessage solely from 
a Network object.

Fixes a bug regarding stepping when you're not an active validator.

* Have verify_precommit_signature return if it verified the signature

Also fixes a bug where invalid precommit signatures were left standing 
and therefore contributing to commits.

* Remove the precommit signature hash

It cached signatures per-block. Precommit signatures are bound to each 
round. This would lead to forming invalid commits when a commit should 
be formed. Under debug, the machine would catch that and panic. On 
release, it'd have everyone who wasn't a validator fail to continue 
syncing.
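
To illustrate why a per-block cache is unsound: the signed payload covers the round's end time as well as the block ID, as `commit_msg` in this file does. The sketch below restates that layout on its own. `same_signed_payload` is a hypothetical helper added only for the comparison.

```rust
/// Same byte layout as `commit_msg` in this file: the round's end time as
/// little-endian bytes, followed by the block ID.
fn commit_msg(end_time: u64, id: &[u8]) -> Vec<u8> {
  let mut msg = Vec::with_capacity(8 + id.len());
  msg.extend_from_slice(&end_time.to_le_bytes());
  msg.extend_from_slice(id);
  msg
}

/// Hypothetical helper: whether a precommit signature for one round's end time
/// would cover the same bytes as one for another round's end time.
fn same_signed_payload(id: &[u8], end_time_a: u64, end_time_b: u64) -> bool {
  commit_msg(end_time_a, id) == commit_msg(end_time_b, id)
}
```

The same block ID with different end times gives different payloads, so a signature cached for one round can't stand in for another round's precommit.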

* Slight doc changes

Also flattens the message handling function by replacing an if 
containing all following code in the function with an early return for 
the else case.

* Always produce notifications for finalized blocks via origin overrides

* Correct weird formatting

* Update to the latest tendermint-machine

* Manually step the Tendermint machine when we synced a block over the network

* Ignore finality notifications for old blocks

* Remove a TODO resolved in 8c51bc011d

* Add a TODO comment to slash

Enables searching for the case-sensitive phrase and finding it.

* cargo fmt

* Use a tmp DB for Serai in Docker

* Remove panic on slash

As we move towards protonet, this can happen (if a node goes offline), 
yet it happening brings down the entire net right now.

* Add log::error on slash

* created shared volume between containers

* Complete the sh scripts

* Pass in the genesis time to Substrate

* Correct block announcements

They were announced, yet not marked best.

* Correct populate_end_time

It was used as inclusive yet didn't work inclusively.

* Correct gossip channel jumping when a block is synced via Substrate

* Use a looser check in import_future

This triggered so it needs to be accordingly relaxed.

* Correct race conditions between add_block and step

Also corrects a <= to <.

* Update cargo deny

* rename genesis-service to genesis

* Update Cargo.lock

* Correct runtime Cargo.toml whitespace

* Correct typo

* Document recheck

* Misc lints

* Fix prev commit

* Resolve low-hanging review comments

* Mark genesis/entry-dev.sh as executable

* Prevent a commit from including the same signature multiple times

Yanks tendermint-machine 0.1.0 accordingly.
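
One way such a check could look, as a hypothetical sketch. The actual fix isn't shown in this message, and both the function name and the `u16` validator IDs are assumptions.

```rust
use std::collections::HashSet;

/// Hypothetical check: a commit's validator list must not name the same
/// validator twice, since each entry contributes a signature to the aggregate.
fn has_unique_validators(validators: &[u16]) -> bool {
  let mut seen = HashSet::new();
  // `insert` returns false on a repeat, so `all` stops at the first duplicate
  validators.iter().all(|validator| seen.insert(*validator))
}
```

A verifier could reject any commit for which this returns false before counting weight, so a repeated signature can't be counted towards the threshold more than once.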

* Update to latest nightly clippy

* Improve documentation

* Use clearer variable names

* Add log statements

* Pair more log statements

* Clean TendermintAuthority::authority as possible

Merges it into new. It has way too many arguments, yet there's no clear path at
consolidation there, unfortunately.

Additionally provides better scoping within itself.

* Fix #158

Doesn't use lock_import_and_run for reasons commented (lack of async).

* Rename guard to lock

* Have the devnet use the current time as the genesis

Possible since it's only a single node, not requiring synchronization.

* Fix gossiping

I really don't know what side effect this avoids and I can't say I care at this
point.

* Misc lints

Co-authored-by: vrx00 <vrx00@proton.me>
Co-authored-by: TheArchitect108 <TheArchitect108@protonmail.com>
2022-12-03 18:38:02 -05:00

639 lines
22 KiB
Rust

use core::fmt::Debug;
use std::{
sync::Arc,
time::{SystemTime, Instant, Duration},
collections::VecDeque,
};
use log::debug;
use parity_scale_codec::{Encode, Decode};
use futures::{
FutureExt, StreamExt,
future::{self, Fuse},
channel::mpsc,
};
use tokio::time::sleep;
mod time;
use time::{sys_time, CanonicalInstant};
mod round;
mod block;
use block::BlockData;
pub(crate) mod message_log;
/// Traits and types of the external network being integrated with to provide consensus over.
pub mod ext;
use ext::*;
pub(crate) fn commit_msg(end_time: u64, id: &[u8]) -> Vec<u8> {
[&end_time.to_le_bytes(), id].concat().to_vec()
}
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug, Encode, Decode)]
enum Step {
Propose,
Prevote,
Precommit,
}
#[derive(Clone, Debug, Encode, Decode)]
enum Data<B: Block, S: Signature> {
Proposal(Option<RoundNumber>, B),
Prevote(Option<B::Id>),
Precommit(Option<(B::Id, S)>),
}
impl<B: Block, S: Signature> PartialEq for Data<B, S> {
fn eq(&self, other: &Data<B, S>) -> bool {
match (self, other) {
(Data::Proposal(valid_round, block), Data::Proposal(valid_round2, block2)) => {
(valid_round == valid_round2) && (block == block2)
}
(Data::Prevote(id), Data::Prevote(id2)) => id == id2,
(Data::Precommit(None), Data::Precommit(None)) => true,
(Data::Precommit(Some((id, _))), Data::Precommit(Some((id2, _)))) => id == id2,
_ => false,
}
}
}
impl<B: Block, S: Signature> Data<B, S> {
fn step(&self) -> Step {
match self {
Data::Proposal(..) => Step::Propose,
Data::Prevote(..) => Step::Prevote,
Data::Precommit(..) => Step::Precommit,
}
}
}
#[derive(Clone, PartialEq, Debug, Encode, Decode)]
struct Message<V: ValidatorId, B: Block, S: Signature> {
sender: V,
block: BlockNumber,
round: RoundNumber,
data: Data<B, S>,
}
/// A signed Tendermint consensus message to be broadcast to the other validators.
#[derive(Clone, PartialEq, Debug, Encode, Decode)]
pub struct SignedMessage<V: ValidatorId, B: Block, S: Signature> {
msg: Message<V, B, S>,
sig: S,
}
impl<V: ValidatorId, B: Block, S: Signature> SignedMessage<V, B, S> {
/// Number of the block this message is attempting to add to the chain.
pub fn block(&self) -> BlockNumber {
self.msg.block
}
#[must_use]
pub fn verify_signature<Scheme: SignatureScheme<ValidatorId = V, Signature = S>>(
&self,
signer: &Scheme,
) -> bool {
signer.verify(self.msg.sender, &self.msg.encode(), &self.sig)
}
}
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum TendermintError<V: ValidatorId> {
Malicious(V),
Temporal,
}
// Type aliases to abstract over generic hell
pub(crate) type DataFor<N> =
Data<<N as Network>::Block, <<N as Network>::SignatureScheme as SignatureScheme>::Signature>;
pub(crate) type MessageFor<N> = Message<
<N as Network>::ValidatorId,
<N as Network>::Block,
<<N as Network>::SignatureScheme as SignatureScheme>::Signature,
>;
/// Type alias to the SignedMessage type for a given Network
pub type SignedMessageFor<N> = SignedMessage<
<N as Network>::ValidatorId,
<N as Network>::Block,
<<N as Network>::SignatureScheme as SignatureScheme>::Signature,
>;
/// A machine executing the Tendermint protocol.
pub struct TendermintMachine<N: Network> {
network: N,
signer: <N::SignatureScheme as SignatureScheme>::Signer,
validators: N::SignatureScheme,
weights: Arc<N::Weights>,
queue: VecDeque<MessageFor<N>>,
msg_recv: mpsc::UnboundedReceiver<SignedMessageFor<N>>,
step_recv: mpsc::UnboundedReceiver<(BlockNumber, Commit<N::SignatureScheme>, N::Block)>,
block: BlockData<N>,
}
pub type StepSender<N> = mpsc::UnboundedSender<(
BlockNumber,
Commit<<N as Network>::SignatureScheme>,
<N as Network>::Block,
)>;
pub type MessageSender<N> = mpsc::UnboundedSender<SignedMessageFor<N>>;
/// A Tendermint machine and its channel to receive messages from the gossip layer over.
pub struct TendermintHandle<N: Network> {
/// Channel to trigger the machine to move to the next block.
/// Takes in the previous block's commit, along with the new proposal.
pub step: StepSender<N>,
/// Channel to send messages received from the P2P layer.
pub messages: MessageSender<N>,
/// Tendermint machine to be run on an asynchronous task.
pub machine: TendermintMachine<N>,
}
impl<N: Network + 'static> TendermintMachine<N> {
// Broadcast the given piece of data
// Tendermint messages always specify their block/round, yet Tendermint only ever broadcasts for
// the current block/round. Accordingly, instead of manually fetching those at every call-site,
// this function can simply pass the data to the block which can contextualize it
fn broadcast(&mut self, data: DataFor<N>) {
if let Some(msg) = self.block.message(data) {
// Push it on to the queue. This is done so we only handle one message at a time, and so we
// can handle our own message before broadcasting it. That way, we fail before
// becoming malicious
self.queue.push_back(msg);
}
}
// Start a new round. Returns true if we were the proposer
fn round(&mut self, round: RoundNumber, time: Option<CanonicalInstant>) -> bool {
if let Some(data) =
self.block.new_round(round, self.weights.proposer(self.block.number, round), time)
{
self.broadcast(data);
true
} else {
false
}
}
// 53-54
async fn reset(&mut self, end_round: RoundNumber, proposal: N::Block) {
// Ensure we have the end time data for the last round
self.block.populate_end_time(end_round);
// Sleep until this round ends
let round_end = self.block.end_time[&end_round];
sleep(round_end.instant().saturating_duration_since(Instant::now())).await;
// Clear our outbound message queue
self.queue = VecDeque::new();
// Create the new block
self.block = BlockData::new(
self.weights.clone(),
BlockNumber(self.block.number.0 + 1),
self.signer.validator_id().await,
proposal,
);
// Start the first round
self.round(RoundNumber(0), Some(round_end));
}
async fn reset_by_commit(&mut self, commit: Commit<N::SignatureScheme>, proposal: N::Block) {
let mut round = self.block.round().number;
// If this commit is for a round we don't have, jump up to it
while self.block.end_time[&round].canonical() < commit.end_time {
round.0 += 1;
self.block.populate_end_time(round);
}
// If this commit is for a prior round, find it
while self.block.end_time[&round].canonical() > commit.end_time {
if round.0 == 0 {
panic!("commit isn't for this machine's next block");
}
round.0 -= 1;
}
debug_assert_eq!(self.block.end_time[&round].canonical(), commit.end_time);
self.reset(round, proposal).await;
}
async fn slash(&mut self, validator: N::ValidatorId) {
if !self.block.slashes.contains(&validator) {
debug!(target: "tendermint", "Slashing validator {:?}", validator);
self.block.slashes.insert(validator);
self.network.slash(validator).await;
}
}
/// Create a new Tendermint machine, from the specified point, with the specified block as the
/// one to propose next. This will return a channel to send messages from the gossip layer and
/// the machine itself. The machine should have `run` called from an asynchronous task.
#[allow(clippy::new_ret_no_self)]
pub async fn new(
network: N,
last_block: BlockNumber,
last_time: u64,
proposal: N::Block,
) -> TendermintHandle<N> {
let (msg_send, msg_recv) = mpsc::unbounded();
let (step_send, step_recv) = mpsc::unbounded();
TendermintHandle {
step: step_send,
messages: msg_send,
machine: {
let sys_time = sys_time(last_time);
// If the last block hasn't ended yet, sleep until it has
sleep(sys_time.duration_since(SystemTime::now()).unwrap_or(Duration::ZERO)).await;
let signer = network.signer();
let validators = network.signature_scheme();
let weights = Arc::new(network.weights());
let validator_id = signer.validator_id().await;
// 01-10
let mut machine = TendermintMachine {
network,
signer,
validators,
weights: weights.clone(),
queue: VecDeque::new(),
msg_recv,
step_recv,
block: BlockData::new(weights, BlockNumber(last_block.0 + 1), validator_id, proposal),
};
// The end time of the last block is the start time for this one
// The Commit explicitly contains the end time, so loading the last commit will provide
// this. The only exception is for the genesis block, which doesn't have a commit
// Using the genesis time in place will cause this block to be created immediately
// after it, without the standard amount of separation (so their times will be
// equivalent or minimally offset)
// For callers wishing to avoid this, they should pass (0, GENESIS + N::block_time())
machine.round(RoundNumber(0), Some(CanonicalInstant::new(last_time)));
machine
},
}
}
pub async fn run(mut self) {
loop {
// Also create a future for if the queue has a message
// Does not pop_front as if another message has higher priority, its future will be handled
// instead in this loop, and the popped value would be dropped with the next iteration
// While no other message has a higher priority right now, this is a safer practice
let mut queue_future =
if self.queue.is_empty() { Fuse::terminated() } else { future::ready(()).fuse() };
if let Some((broadcast, msg)) = futures::select_biased! {
// Handle a new block occurring externally (an external sync loop)
// Has the highest priority as it makes all other futures here irrelevant
msg = self.step_recv.next() => {
if let Some((block_number, commit, proposal)) = msg {
// Commit is for a block we've already moved past
if block_number != self.block.number {
continue;
}
self.reset_by_commit(commit, proposal).await;
None
} else {
break;
}
},
// Handle our messages
_ = queue_future => {
Some((true, self.queue.pop_front().unwrap()))
},
// Handle any timeouts
step = self.block.round().timeout_future().fuse() => {
// Remove the timeout so it doesn't persist, always being the selected future due to bias
// While this does enable the timeout to be entered again, the timeout setting code will
// never attempt to add a timeout after its timeout has expired
self.block.round_mut().timeouts.remove(&step);
// Only run if it's still the step in question
if self.block.round().step == step {
match step {
Step::Propose => {
// Slash the validator for not proposing when they should've
debug!(target: "tendermint", "Validator didn't propose when they should have");
self.slash(
self.weights.proposer(self.block.number, self.block.round().number)
).await;
self.broadcast(Data::Prevote(None));
},
Step::Prevote => self.broadcast(Data::Precommit(None)),
Step::Precommit => {
self.round(RoundNumber(self.block.round().number.0 + 1), None);
continue;
}
}
}
None
},
// Handle any received messages
msg = self.msg_recv.next() => {
if let Some(msg) = msg {
if !msg.verify_signature(&self.validators) {
continue;
}
Some((false, msg.msg))
} else {
break;
}
}
} {
let res = self.message(msg.clone()).await;
if res.is_err() && broadcast {
panic!("honest node had invalid behavior");
}
match res {
Ok(None) => (),
Ok(Some(block)) => {
let mut validators = vec![];
let mut sigs = vec![];
// Get all precommits for this round
for (validator, msgs) in &self.block.log.log[&msg.round] {
if let Some(Data::Precommit(Some((id, sig)))) = msgs.get(&Step::Precommit) {
// If this precommit was for this block, include it
if id == &block.id() {
validators.push(*validator);
sigs.push(sig.clone());
}
}
}
let commit = Commit {
end_time: self.block.end_time[&msg.round].canonical(),
validators,
signature: N::SignatureScheme::aggregate(&sigs),
};
debug_assert!(self.network.verify_commit(block.id(), &commit));
let proposal = self.network.add_block(block, commit).await;
self.reset(msg.round, proposal).await;
}
Err(TendermintError::Malicious(validator)) => self.slash(validator).await,
Err(TendermintError::Temporal) => (),
}
if broadcast {
let sig = self.signer.sign(&msg.encode()).await;
self.network.broadcast(SignedMessage { msg, sig }).await;
}
}
}
}
// Returns Ok(true) if this was a Precommit which had its signature validated
// Returns Ok(false) if it wasn't a Precommit or the signature wasn't validated yet
// Returns Err if the signature was invalid
fn verify_precommit_signature(
&self,
sender: N::ValidatorId,
round: RoundNumber,
data: &DataFor<N>,
) -> Result<bool, TendermintError<N::ValidatorId>> {
if let Data::Precommit(Some((id, sig))) = data {
// Also verify the end_time of the commit
// Only perform this verification if we already have the end_time
// Else, there's a DoS where we receive a precommit for some round infinitely in the future
// which forces us to calculate every end time
if let Some(end_time) = self.block.end_time.get(&round) {
if !self.validators.verify(sender, &commit_msg(end_time.canonical(), id.as_ref()), sig) {
debug!(target: "tendermint", "Validator produced an invalid commit signature");
Err(TendermintError::Malicious(sender))?;
}
return Ok(true);
}
}
Ok(false)
}
async fn message(
&mut self,
msg: MessageFor<N>,
) -> Result<Option<N::Block>, TendermintError<N::ValidatorId>> {
if msg.block != self.block.number {
Err(TendermintError::Temporal)?;
}
// If this is a precommit, verify its signature
self.verify_precommit_signature(msg.sender, msg.round, &msg.data)?;
// Only let the proposer propose
if matches!(msg.data, Data::Proposal(..)) &&
(msg.sender != self.weights.proposer(msg.block, msg.round))
{
debug!(target: "tendermint", "Validator who wasn't the proposer proposed");
Err(TendermintError::Malicious(msg.sender))?;
};
if !self.block.log.log(msg.clone())? {
return Ok(None);
}
// All functions, except for the finalizer and the jump, are locked to the current round
// Run the finalizer to see if it applies
// 49-52
if matches!(msg.data, Data::Proposal(..)) || matches!(msg.data, Data::Precommit(_)) {
let proposer = self.weights.proposer(self.block.number, msg.round);
// Get the proposal
if let Some(Data::Proposal(_, block)) = self.block.log.get(msg.round, proposer, Step::Propose)
{
// Check if it has gotten a sufficient amount of precommits
// Use a junk signature since message equality disregards the signature
if self.block.log.has_consensus(
msg.round,
Data::Precommit(Some((block.id(), self.signer.sign(&[]).await))),
) {
return Ok(Some(block.clone()));
}
}
}
// Else, check if we need to jump ahead
#[allow(clippy::comparison_chain)]
if msg.round.0 < self.block.round().number.0 {
// Prior round, disregard if not finalizing
return Ok(None);
} else if msg.round.0 > self.block.round().number.0 {
// 55-56
// Jump, enabling processing by the below code
if self.block.log.round_participation(msg.round) > self.weights.fault_thresold() {
// If this round already has precommit messages, verify their signatures
let round_msgs = self.block.log.log[&msg.round].clone();
for (validator, msgs) in &round_msgs {
if let Some(data) = msgs.get(&Step::Precommit) {
if let Ok(res) = self.verify_precommit_signature(*validator, msg.round, data) {
// Ensure this actually verified the signature instead of believing it shouldn't yet
debug_assert!(res);
} else {
// Remove the message so it isn't counted towards forming a commit/included in one
// This won't remove the fact they precommitted for this block hash in the MessageLog
// TODO: Don't even log these in the first place until we jump, preventing needing
// to do this in the first place
self
.block
.log
.log
.get_mut(&msg.round)
.unwrap()
.get_mut(validator)
.unwrap()
.remove(&Step::Precommit);
self.slash(*validator).await;
}
}
}
// If we're the proposer, return now so we re-run processing with our proposal
// If we continue now, it'd just be wasted ops
if self.round(msg.round, None) {
return Ok(None);
}
} else {
// Future round which we aren't ready to jump to, so return for now
return Ok(None);
}
}
// The paper executes these checks when the step is prevote. Making sure this message warrants
// rerunning these checks is a sane optimization since message instances is a full iteration
// of the round map
if (self.block.round().step == Step::Prevote) && matches!(msg.data, Data::Prevote(_)) {
let (participation, weight) =
self.block.log.message_instances(self.block.round().number, Data::Prevote(None));
// 34-35
if participation >= self.weights.threshold() {
self.block.round_mut().set_timeout(Step::Prevote);
}
// 44-46
if weight >= self.weights.threshold() {
self.broadcast(Data::Precommit(None));
return Ok(None);
}
}
// 47-48
if matches!(msg.data, Data::Precommit(_)) &&
self.block.log.has_participation(self.block.round().number, Step::Precommit)
{
self.block.round_mut().set_timeout(Step::Precommit);
}
// All further operations require actually having the proposal in question
let proposer = self.weights.proposer(self.block.number, self.block.round().number);
let (vr, block) = if let Some(Data::Proposal(vr, block)) =
self.block.log.get(self.block.round().number, proposer, Step::Propose)
{
(vr, block)
} else {
return Ok(None);
};
// 22-33
if self.block.round().step == Step::Propose {
// Delay error handling (triggering a slash) until after we vote.
let (valid, err) = match self.network.validate(block).await {
Ok(_) => (true, Ok(None)),
Err(BlockError::Temporal) => (false, Ok(None)),
Err(BlockError::Fatal) => (false, {
debug!(target: "tendermint", "Validator proposed a fatally invalid block");
Err(TendermintError::Malicious(proposer))
}),
};
// Create a raw vote which only requires block validity as a basis for the actual vote.
let raw_vote = Some(block.id()).filter(|_| valid);
// If locked is none, it has a round of -1 according to the protocol. That satisfies
// 23 and 29. If it's some, both are satisfied if they're for the same ID. If it's some
// with different IDs, the function on 22 rejects yet the function on 28 has one other
// condition
let locked = self.block.locked.as_ref().map(|(_, id)| id == &block.id()).unwrap_or(true);
let mut vote = raw_vote.filter(|_| locked);
if let Some(vr) = vr {
// Malformed message
if vr.0 >= self.block.round().number.0 {
debug!(target: "tendermint", "Validator claimed a round from the future was valid");
Err(TendermintError::Malicious(msg.sender))?;
}
if self.block.log.has_consensus(*vr, Data::Prevote(Some(block.id()))) {
// Allow differing locked values if the proposal has a newer valid round
// This is the other condition described above
if let Some((locked_round, _)) = self.block.locked.as_ref() {
vote = vote.or_else(|| raw_vote.filter(|_| locked_round.0 <= vr.0));
}
self.broadcast(Data::Prevote(vote));
return err;
}
} else {
self.broadcast(Data::Prevote(vote));
return err;
}
return Ok(None);
}
if self
.block
.valid
.as_ref()
.map(|(round, _)| round != &self.block.round().number)
.unwrap_or(true)
{
// 36-43
// The run once condition is implemented above. Since valid will always be set by this, it
// not being set, or only being set historically, means this has yet to be run
if self.block.log.has_consensus(self.block.round().number, Data::Prevote(Some(block.id()))) {
match self.network.validate(block).await {
Ok(_) => (),
Err(BlockError::Temporal) => (),
Err(BlockError::Fatal) => {
debug!(target: "tendermint", "Validator proposed a fatally invalid block");
Err(TendermintError::Malicious(proposer))?
}
};
self.block.valid = Some((self.block.round().number, block.clone()));
if self.block.round().step == Step::Prevote {
self.block.locked = Some((self.block.round().number, block.id()));
self.broadcast(Data::Precommit(Some((
block.id(),
self
.signer
.sign(&commit_msg(
self.block.end_time[&self.block.round().number].canonical(),
block.id().as_ref(),
))
.await,
))));
}
}
}
Ok(None)
}
}