Commit Graph

63 Commits

Author SHA1 Message Date
Luke Parker
5e0e91c85d Add tasks to publish data onto Serai 2025-01-14 01:58:26 -05:00
Luke Parker
b5a6b0693e Add a proper error type to ContinuallyRan
This isn't necessary. Because we just log the error, we never match off of it,
we don't need any structure beyond String (or now Debug, which still gives us
a way to print the error). This is for the ergonomics of not having to
constantly write `.map_err(|e| format!("{e:?}"))`.
2025-01-12 18:29:08 -05:00
Luke Parker
0ce9aad9b2 Add flow to add transactions onto Tributaries 2025-01-12 07:32:45 -05:00
Luke Parker
e7de5125a2 Have processor-messages use CosignIntent/SignedCosign, not the historic cosign format
Has yet to update the processor accordingly.
2025-01-12 05:52:33 -05:00
Luke Parker
74106b025f Publish SlashReport onto the Tributary 2025-01-11 06:51:55 -05:00
Luke Parker
3c664ff05f Re-arrange coordinator/
coordinator/tributary was tributary-chain. This crate has been renamed
tributary-sdk and moved to coordinator/tributary-sdk.

coordinator/src/tributary was our instantion of a Tributary, the Transaction
type and scan task. This has been moved to coordinator/tributary.

The main reason for this was due to coordinator/main.rs becoming untidy. There
is now a collection of clean, independent APIs present in the codebase.
coordinator/main.rs is to compose them. Sometimes, these compositions are a bit
silly (reading from a channel just to forward the message to a distinct
channel). That's more than fine as the code is still readable and the value
from the cleanliness of the APIs composed far exceeds the nits from having
these odd compositions.

This breaks down a bit as we now define a global database, and have some APIs
interact with multiple other APIs.

coordinator/src/tributary was a self-contained, clean API. The recently added
task present in coordinator/tributary/mod.rs, which bound it to the rest of the
Coordinator, wasn't.

Now, coordinator/src is solely the API compositions, and all self-contained
APIs are their own crates.
2025-01-11 04:14:21 -05:00
Luke Parker
b2bd5d3a44 Remove Debug bound on tributary::P2p 2025-01-08 17:40:32 -05:00
Luke Parker
376a66b000 Remove async-trait from tendermint-machine, tributary-chain 2025-01-08 16:41:11 -05:00
Luke Parker
7e2b31e5da Clean the transaction definitions in the coordinator
Moves to borsh for serialization. No longer includes nonces anywhere in the TX.
2024-12-31 12:14:32 -05:00
Luke Parker
e4e4245ee3 One Round DKG (#589)
* Upstream GBP, divisor, circuit abstraction, and EC gadgets from FCMP++

* Initial eVRF implementation

Not quite done yet. It needs to communicate the resulting points and proofs to
extract them from the Pedersen Commitments in order to return those, and then
be tested.

* Add the openings of the PCs to the eVRF as necessary

* Add implementation of secq256k1

* Make DKG Encryption a bit more flexible

No longer requires the use of an EncryptionKeyMessage, and allows pre-defined
keys for encryption.

* Make NUM_BITS an argument for the field macro

* Have the eVRF take a Zeroizing private key

* Initial eVRF-based DKG

* Add embedwards25519 curve

* Inline the eVRF into the DKG library

Due to how we're handling share encryption, we'd either need two circuits or to
dedicate this circuit to the DKG. The latter makes sense at this time.

* Add documentation to the eVRF-based DKG

* Add paragraph claiming robustness

* Update to the new eVRF proof

* Finish routing the eVRF functionality

Still needs errors and serialization, along with a few other TODOs.

* Add initial eVRF DKG test

* Improve eVRF DKG

Updates how we calculcate verification shares, improves performance when
extracting multiple sets of keys, and adds more to the test for it.

* Start using a proper error for the eVRF DKG

* Resolve various TODOs

Supports recovering multiple key shares from the eVRF DKG.

Inlines two loops to save 2**16 iterations.

Adds support for creating a constant time representation of scalars < NUM_BITS.

* Ban zero ECDH keys, document non-zero requirements

* Implement eVRF traits, all the way up to the DKG, for secp256k1/ed25519

* Add Ristretto eVRF trait impls

* Support participating multiple times in the eVRF DKG

* Only participate once per key, not once per key share

* Rewrite processor key-gen around the eVRF DKG

Still a WIP.

* Finish routing the new key gen in the processor

Doesn't touch the tests, coordinator, nor Substrate yet.
`cargo +nightly fmt && cargo +nightly-2024-07-01 clippy --all-features -p serai-processor`
does pass.

* Deduplicate and better document in processor key_gen

* Update serai-processor tests to the new key gen

* Correct amount of yx coefficients, get processor key gen test to pass

* Add embedded elliptic curve keys to Substrate

* Update processor key gen tests to the eVRF DKG

* Have set_keys take signature_participants, not removed_participants

Now no one is removed from the DKG. Only `t` people publish the key however.

Uses a BitVec for an efficient encoding of the participants.

* Update the coordinator binary for the new DKG

This does not yet update any tests.

* Add sensible Debug to key_gen::[Processor, Coordinator]Message

* Have the DKG explicitly declare how to interpolate its shares

Removes the hack for MuSig where we multiply keys by the inverse of their
lagrange interpolation factor.

* Replace Interpolation::None with Interpolation::Constant

Allows the MuSig DKG to keep the secret share as the original private key,
enabling deriving FROST nonces consistently regardless of the MuSig context.

* Get coordinator tests to pass

* Update spec to the new DKG

* Get clippy to pass across the repo

* cargo machete

* Add an extra sleep to ensure expected ordering of `Participation`s

* Update orchestration

* Remove bad panic in coordinator

It expected ConfirmationShare to be n-of-n, not t-of-n.

* Improve documentation on  functions

* Update TX size limit

We now no longer have to support the ridiculous case of having 49 DKG
participations within a 101-of-150 DKG. It does remain quite high due to
needing to _sign_ so many times. It'd may be optimal for parties with multiple
key shares to independently send their preprocesses/shares (despite the
overhead that'll cause with signatures and the transaction structure).

* Correct error in the Processor spec document

* Update a few comments in the validator-sets pallet

* Send/Recv Participation one at a time

Sending all, then attempting to receive all in an expected order, wasn't working
even with notable delays between sending messages. This points to the mempool
not working as expected...

* Correct ThresholdKeys serialization in modular-frost test

* Updating existing TX size limit test for the new DKG parameters

* Increase time allowed for the DKG on the GH CI

* Correct construction of signature_participants in serai-client tests

Fault identified by akil.

* Further contextualize DkgConfirmer by ValidatorSet

Caught by a safety check we wouldn't reuse preprocesses across messages. That
raises the question of we were prior reusing preprocesses (reusing keys)?
Except that'd have caused a variety of signing failures (suggesting we had some
staggered timing avoiding it in practice but yes, this was possible in theory).

* Add necessary calls to set_embedded_elliptic_curve_key in coordinator set rotation tests

* Correct shimmed setting of a secq256k1 key

* cargo fmt

* Don't use `[0; 32]` for the embedded keys in the coordinator rotation test

The key_gen function expects the random values already decided.

* Big-endian secq256k1 scalars

Also restores the prior, safer, Encryption::register function.
2024-09-19 21:43:26 -04:00
Luke Parker
e772b8a5f7 #560 take two, now that #560 has been reverted (#561)
* Clear upons upon round, not block

* Cache the proposal for a round

* Rebase onto develop, which reverted this PR, and re-apply this PR

* Set participation upon participation instead of constantly recalculating

* Cache message instances

* Add missing txn commit

Identified by @akildemir.

* Correct clippy lint identified upon rebase

* Fix tendermint chain sync (#581)

* fix p2p Reqres protocol

* stabilize tributary chain sync

* fix pr comments

---------

Co-authored-by: akildemir <34187742+akildemir@users.noreply.github.com>
2024-07-16 19:42:15 -04:00
Luke Parker
bc1dec7991 Move TRANSACTION_MESSAGE to 1 2024-04-28 04:04:53 -04:00
Luke Parker
933b17aa91 Revert coordinator/tributary to fd4f247917
\#560 is causing notable CI failures, with its logs including slashes at 10x
the prior rate.
2024-04-21 10:16:12 -04:00
Luke Parker
523d2ac911 Rewrite tendermint's message handling loop to much more clearly match the paper (#560)
* Rewrite tendermint's message handling loop to much more clearly match the paper

No longer checks relevant branches upon messages, yet all branches upon any
state change. This is slower, yet easier to review and likely without one or
two rare edge cases.

When reviewing, please see page 5 of https://arxiv.org/pdf/1807.04938.pdf.
Lines from the specified algorithm can be found in the code by searching for
"// L".

* Sane rebroadcasting of consensus messages

Instead of broadcasting the last n messages on the Tributary side of things, we
now have the machine rebroadcast the message tape for the current block.

* Only rebroadcast messages which didn't error in some way

* Only rebroadcast our own messages for tendermint
2024-04-21 05:30:31 -04:00
Luke Parker
bcc88c3e86 Don't broadcast added blocks
Online validators should inherently have them. Offline validators will receive
from the sync protocol.

This does somewhat eliminate the class of nodes who would follow the blockchain
(without validating it), yet that's fine for the performance benefit.
2024-04-18 01:48:11 -04:00
Luke Parker
af9b1ad5f9 Initial pruning of backlogged consensus messages 2024-03-22 23:18:53 -04:00
Luke Parker
2f07d04d88 Extend timeout for rebroadcast of consensus messages in coordinator 2024-03-22 16:06:31 -04:00
Luke Parker
454bebaa77 Have the TendermintMachine domain-separate by genesis
Enbables support for multiple machines over the same DB.
2024-03-08 01:22:02 -05:00
Luke Parker
e266bc2e32 Stop validators from equivocating on reboot
Part of https://github.com/serai-dex/serai/issues/345.

The lack of full DB persistence does mean enough nodes rebooting at the same
time may cause a halt. This will prevent slashes.
2024-03-07 22:56:35 -05:00
akildemir
ad0ecc5185 complete various todos in tributary (#520)
* complete various todos

* fix pr comments

* Document bounds on unique hashes in TransactionKind

---------

Co-authored-by: Luke Parker <lukeparker5132@gmail.com>
2024-02-05 03:50:55 -05:00
Luke Parker
065d314e2a Further expand clippy workspace lints
Achieves a notable amount of reduced async and clones.
2023-12-17 00:04:49 -05:00
Luke Parker
6caf45ea1d Downscope usage of futures 2023-12-10 19:32:52 -05:00
Luke Parker
1ca66b846a Use multiple nonces in the Tributary 2023-12-03 00:04:58 -05:00
Luke Parker
797604ad73 Replace usage of io::Error::new(io::ErrorKind::Other, with io::Error::other
Newly possible with Rust 1.74.
2023-11-19 18:31:37 -05:00
akildemir
97fedf65d0 add reasons to slash evidence (#414)
* add reasons to slash evidence

* fix CI failing

* Remove unnecessary clones

.encode() takes &self

* InvalidVr to InvalidValidRound

* Unrelated to this PR: Clarify reasoning/potentials behind dropping evidence

* Clarify prevotes in SlashEvidence test

* Replace use of read_to_end

* Restore decode_signed_message

---------

Co-authored-by: Luke Parker <lukeparker5132@gmail.com>
2023-11-05 00:04:41 -04:00
Luke Parker
e05b77d830 Support multiple key shares per validator (#416)
* Update the coordinator to give key shares based on weight, not based on existence

Participants are now identified by their starting index. While this compiles,
the following is unimplemented:

1) A conversion for DKG `i` values. It assumes the threshold `i` values used
will be identical for the MuSig signature used to confirm the DKG.
2) Expansion from compressed values to full values before forwarding to the
processor.

* Add a fn to the DkgConfirmer to convert `i` values as needed

Also removes TODOs regarding Serai ensuring validator key uniqueness +
validity. The current infra achieves both.

* Have the Tributary DB track participation by shares, not by count

* Prevent a node from obtaining 34% of the maximum amount of key shares

This is actually mainly intended to set a bound on message sizes in the
coordinator. Message sizes are amplified by the amount of key shares held, so
setting an upper bound on said amount lets it determine constants. While that
upper bound could be 150, that'd be unreasonable and increase the potential for
DoS attacks.

* Correct the mechanism to detect if sufficient accumulation has occured

It used to check if the latest accumulation hit the required threshold. Now,
accumulations may jump past the required threshold. The required mechanism is
to check the threshold wasn't prior met and is now met.

* Finish updating the coordinator to handle a multiple key share per validator environment

* Adjust stategy re: preventing noce reuse in DKG Confirmer

* Add TODOs regarding dropped transactions, add possible TODO fix

* Update tests/coordinator

This doesn't add new multi-key-share tests, it solely updates the existing
single key-share tests to compile and run, with the necessary fixes to the
coordinator.

* Update processor key_gen to handle generating multiple key shares at once

* Update SubstrateSigner

* Update signer, clippy

* Update processor tests

* Update processor docker tests
2023-11-04 19:26:13 -04:00
Luke Parker
19e90b28b0 Have Tributary's add_transaction return a proper error
Modifies main.rs to properly handle the returned error.
2023-10-14 21:50:11 -04:00
Luke Parker
62e1d63f47 Abort the P2P meta task when dropped
This should cause full cleanup of all Tributary async tasks, since the machine
already cleans itself up on drop.
2023-10-14 20:08:51 -04:00
akildemir
d5c6ed1a03 Improve provided handling (#381)
* fix typos

* remove tributary sleeping

* handle not locally provided txs

* use topic number instead of waiting list

* Clean-up, fixes

1) Uses a single TXN in provided
2) Doesn't continue on non-local provided inside verify_block, skipping further
   execution of checks
3) Upon local provision of already on-chain TX, compares

---------

Co-authored-by: Luke Parker <lukeparker5132@gmail.com>
2023-10-13 19:45:47 -04:00
Luke Parker
b0fcdd3367 Regularly rebroadcast consensus messages to ensure presence even if dropped from the P2P layer
Attempts to fix #342, #382.
2023-10-12 22:14:42 -04:00
Luke Parker
2508633de9 Add a next block notification system to Tributary
Also adds a loop missing from the prior commit.
2023-09-25 23:20:51 -04:00
Luke Parker
0440e60645 Move heartbeat_tributaries from Tributary to TributaryReader
Reduces contention.
2023-09-25 17:15:38 -04:00
Luke Parker
cd4c3a6c88 Correct publication of Completed Tributary TXs 2023-09-02 00:50:54 -04:00
Luke Parker
2db53d5434 Use &self for handle_message and sync_block in Tributary
They used &mut self to prevent execution at the same time. This uses a lock
over the channel to achieve the same security, without requiring a lock over
the entire tributary.

This fixes post-provided Provided transactions. sync_block waited for the TX to
be provided, yet it never would as sync_block held a mutable reference over the
entire Tributary, preventing any other read/write operations of any scope.

A timeout increased (bc2f23f72b) due to this bug
not being identified has been decreased back, thankfully.

Also shims in basic support for Completed, which was the WIP before this bug
was identified.
2023-08-27 05:07:11 -04:00
akildemir
39ce819876 Slash malevolent validators (#294)
* add slash tx

* ignore unsigned tx replays

* verify that provided evidence is valid

* fix clippy + fmt

* move application tx handling to another module

* partially handle the tendermint txs

* fix pr comments

* support unsigned app txs

* add slash target to the votes

* enforce provided, unsigned, signed tx ordering within a block

* bug fixes

* add unit test for tendermint txs

* bug fixes

* update tests for tendermint txs

* add tx ordering test

* tidy up tx ordering test

* cargo +nightly fmt

* Misc fixes from rebasing

* Finish resolving clippy

* Remove sha3 from tendermint-machine

* Resolve a DoS in SlashEvidence's read

Also moves Evidence from Vec<Message> to (Message, Option<Message>). That
should meet all requirements while being a bit safer.

* Make lazy_static a dev-depend for tributary

* Various small tweaks

One use of sort was inefficient, sorting unsigned || signed when unsigned was
already properly sorted. Given how the unsigned TXs were given a nonce of 0, an
unstable sort may swap places with an unsigned TX and a signed TX with a nonce
of 0 (leading to a faulty block).

The extra protection added here sorts signed, then concats.

* Fix Tributary tests I broke, start review on tendermint/tx.rs

* Finish reviewing everything outside tests and empty_signature

* Remove empty_signature

empty_signature led to corrupted local state histories. Unfortunately, the API
is only sane with a signature.

We now use the actual signature, which risks creating a signature over a
malicious message if we have ever have an invariant producing malicious
messages. Prior, we only signed the message after the local machine confirmed
it was okay per the local view of consensus.

This is tolerated/preferred over a corrupt state history since production of
such messages is already an invariant. TODOs are added to make handling of this
theoretical invariant further robust.

* Remove async_sequential for tokio::test

There was no competition for resources forcing them to be run sequentially.

* Modify block order test to be statistically significant without multiple runs

* Clean tests

---------

Co-authored-by: Luke Parker <lukeparker5132@gmail.com>
2023-08-21 00:28:23 -04:00
Luke Parker
f6f945e747 Add a LibP2P instantiation to coordinator
It's largely unoptimized, and not yet exclusive to validators, yet has basic
sanity (using message content for ID instead of sender + index).

Fixes bugs as found. Notably, we used a time in milliseconds where the
Tributary expected  seconds.

Also has Tributary::new jump to the presumed round number. This reduces slashes
when starting new chains (whose times will be before the current time) and was
the only way I was able to observe successful confirmations given current
surrounding infrastructure.
2023-08-08 15:12:47 -04:00
Luke Parker
3c38a0ec11 cargo +nightly fmt 2023-08-01 00:47:36 -04:00
Luke Parker
88f0e89350 Ensure Tributary commits are minimal 2023-05-09 23:45:05 -04:00
Luke Parker
78d5372fb7 Initial code to handle messages from processors 2023-04-25 03:14:42 -04:00
Luke Parker
e74b4ab94f Add a TributaryReader which doesn't require a borrow to operate
Reduces lock contention.

Additionally changes block_key to include the genesis. While not technically
needed, the lack of genesis introduced a side effect where any Tributary on the
the database could return the block of any other Tributary. While that wasn't a
security issue, returning it suggested it was on-chain when it wasn't. This may
have been usable to create issues.
2023-04-24 07:02:00 -04:00
Luke Parker
cc491ee1e1 Don't return from sync_block until the Tendermint machine returns if it's valid or not
We had a race condition where'd we be informed of blocks 1 .. 3, and
immediately add 1 .. 3. Because we immediately tried to add 2 after 1, it'd
fail since the tip was still the genesis, yet 2 needs the tip to be 1.

Adding a channel, while ugly, was the simplest way to accomplish this.

Also has any added block be broadcasted. Else there's a race condition where a
node which syncs up to the most recent block does so, yet fails to add the next
block when it's committed to.
2023-04-24 02:46:13 -04:00
Luke Parker
14388e746c Implement Tributary syncing
Also adds a forwards-lookup to the Tributary blockchain.
2023-04-24 00:53:18 -04:00
Luke Parker
215155f84b Remove reliance on a blockchain read lock from block/commit 2023-04-23 23:51:10 -04:00
Luke Parker
c476f9b640 Break coordinator main into multiple functions
Also moves from std::sync::RwLock to tokio::sync::RwLock to prevent wasting
cycles on spinning.
2023-04-23 23:15:15 -04:00
Luke Parker
aa0ec4ac41 cargo fmt 2023-04-23 18:56:48 -04:00
Luke Parker
05b1fc5f05 Send a heartbeat message when a Tributary falls behind 2023-04-23 18:55:43 -04:00
Luke Parker
ad5522d854 Start handling P2P messages
This defines the tart of a very complex series of locks I'm really unhappy
with. At the same time, there's not immediately a better solution. This also
should work without issue.
2023-04-23 17:01:30 -04:00
Luke Parker
af84b7f707 Add a test for Tributary
Further fleshes out the Tributary testing code.
2023-04-22 22:28:20 -04:00
Luke Parker
8c74576cf0 Add a test to the coordinator for running a Tributary
Impls a LocalP2p for testing.

Moves rebroadcasting into Tendermint, since it's what knows if a message is
fully valid + original.

Removes TributarySpec::validators() HashMap, as its non-determinism caused
different instances to have different round robin schedules. It was already
prior moved to a Vec for this issue, so I'm unsure why this remnant existed.

Also renames the GH no-std workflow from the prior commit.
2023-04-22 10:49:52 -04:00
Luke Parker
8041a0d845 Initial Tributary handling 2023-04-20 05:05:17 -04:00