In this assignment you will continue programming in Go and will
learn some basic inter-thread synchronization primitives. You will
build a multi-threaded "bitcoin miner": a system that, based on an
input slice of bytes, figures out by trial and error what suffix to
that slice of bytes will hash the slice to a string representation
that ends in some number of "0" characters. Due to the brute-force
nature of the approach, it is an ideal target for divide-and-conquer
methodology and is a great way to practice concurrency in Go lang.
Overview
This assignment's objective is introduce you to some threading
primitives available in Go. Most of these are built-in directly into
the language and are easy to use. You will learn:
Proof-of-work
Proof-of-work is an important concept in blockchain-based systems,
which are distributed systems for recording events, such as
financial transactions. Replication of information in blockchains
introduces the issue of trust: what if some malicious actor claims
to be a replica, or somehow compromises a previously valid node, and
attempts to falsify information?
One approach that helps with this problem is a cryptographic "might
makes right" principle based on computing power. To generate a block
in a blockchain system based on proof-of-work, nodes must perform a
proof-of-work task: a task that is computationally difficult to
perform, but the results of which are efficient to verify. This
makes trust a function of computing power: unless a malicious actor
controls a large fraction of computing power in the blockchain
network, it will be computationally out-matched by the rest of the
network.
The variation of proof-of-work used in this assignment is based on
reversing the MD5 hashing function.
While MD5 is no
longer cryptographically trustworthy, and therefore not suitable
for security-critical usage, the algorithm is widely
available and simple to use. Most importantly for this assignment,
it remains computationally hard to naively brute-force its reversal,
while, like all reasonable hash functions, it is trivial to check if
a result is correct by just running it in the intended direction.
The requirement of searching for a suffix of an existing byte slice
allows for incorporating the notion of progression into
proof-of-work: the prefix could be set to a hash of the blockchain's
state-so-far, forcing the work to be computationally relevant to the
blockchain's current state. Likewise, the number of "0" characters
to find allows a configurable difficulty setting: hashes ending with
just "0" will be more common than those ending with "00", and so
forth, all the way to finding a hash that is all zeroes, which
is very, very hard (try it!).
Parallelizing proof-of-work
Often, the division of a problem in a divide-and-conquer approach is
self-evident from the problem structure. But, since it is not entirely
obvious when dealing with byte sequences, and coming up with such a
scheme is not the objective of the assignment, we will provide the
method you should use:
- The suffix to be calculated may have unbounded length, so
somehow partitioning out segments of the suffix-space will not
work.
- Instead, consider the space of possible suffixes as a
prefix-trie: as long as the number of threads is a power of two
(or you are prepared to deal with a lot of edge cases), we can
consider all possible suffixes as paths.
- This set of paths can be partitioned by giving different
workers different path fragments on which to build, ensuring each
worker will explore separate parts of the search space.
Further assuming that the number of threads is a power of two, we
can work through an example. Consider the case where we have 2 bits
for the number of threads, giving us 2^2 = 4 threads. Each thread
can then be given a different part of the key space:
- Thread 0 can be given all suffixes starting with binary 00
- Thread 1 can be given all suffixes starting with binary 01
- Thread 2 can be given all suffixes starting with binary 10
- Thread 3 can be given all suffixes starting with binary 11
Thus, we have enumerated all possible choices for the first 2 bits
of the first byte of our possible prefixes. Now, each thread can
explore the remaining 6 bits of that byte, along with any additional
bytes in the suffix. The combined search operation will eventually
cover all possible suffixes.
For even more simplicity, you can assume that we will not use
thread counts larger than 2^8 = 256. So you do not have to worry about
the thread count requiring more than one byte.
Implementation notes
The provided starter code will contain one public function that you should implement:
func Mine(tracer *tracing.Tracer, nonce []uint8, numTrailingZeroes, threadBits uint) (secret []uint8)
The behaviour should be as follows:
nonce is a byte slice, which, when suffixed by the
return value secret , should hash via MD5 to a byte
slice whose string representation (via the format specifier "%x")
ends in
numTrailingZeroes '0' characters
threadBits indirectly specifies the number of
worker goroutines to use when searching
for secret : 1 << threadBits , or 2^threadBits , threads
should be used. This should lead to an approximate 1/(1 <<
threadBits) decrease in execution time for different values
of threadBits , keeping all other parameters the same.
tracer is used for tracing your solution's control flow.
This can help you to check your solution and we will use for grading.
See the Testing section below for a description of what tracing is required in this assignment.
See https://github.com/DistributedClocks/tracing
for more precise API documentation of the tracing library, which is published as a Go module
on github.
- Once a goroutine has found a satisfying suffix, all other
goroutines should stop working as soon as possible, without
completing their search. That is: you need a way to tell your
worker goroutines to stop. All workers should signal
cancellation or completion soon after a satisfying result has been
found.
- Shorter suffixes should be searched before longer ones; it is
expected that you perform a simple linear search, only considering
an additional byte once all options are exhausted for a given suffix
length.
- Use "crypto/md5"
for hashing purposes.
We provide you with 6 files:
(1) main.go ,
(2) hash_miner/hash_miner.go ,
(3) go.mod ,
(4) go.sum ,
(5) tracing_config.json ,
(6) tracing_server_config.json .
The file main.go contains code that invokes the hash_miner.Mine function in an appropriate context,
including proper setup of the tracing facility.
You can use this file for testing ad-hoc scenarios (including, for instance, running two searches in sequence), or as basis
for you own automated tests in the style of assignment 1.
The go.mod file contains dependency tracking metadata, including a dependency on
https://github.com/DistributedClocks/tracing, the tracing library.
The go.sum file contains auto-generated metadata (checksums for module dependencies).
You should not need to touch or understand this file.
Your solution should be added to hash_miner/hash_miner.go , which is the only file you should
meaningfully change.
You may also modify main.go and add additional files (e.g., tests) for your convenience, but you should not rely
on any changes there: we will discard these files during grading.
The configuration files tracing_config.json and
tracing_server_config.json provide
configuration for a basic instance of the tracer and tracing server facilities, respectively.
As above, you may change them for your own convenience (e.g., more testing), but we will discard these during grading,
replacing them with our own versions.
Testing
This assignment introduces the tracing library, which will be used
to test whether your code is behaving correctly.
The provided file main.go contains an example of how to run tracing
on a single process, and the provided stub hash_miner/hash_miner.go
contains examples of how to report actions being performed.
Your solution should precisely follow the tracing semantics described in the next section.
No automated testing is provided for this assignment, though you are welcome to
build your own, as has been demonstrated in assignment 1.
All graded tests will be performed via calls to hash_miner.Mine ,
orchestrated in some unspecified way.
Timing will be measured, to ensure that when more threads are used, your code
can find a solutions to the same problem more efficiently.
With such an embarrassingly parallel problem, speed-up is expected to be roughly
proportional to thread count.
You can test this yourself by choosing a large number of zeroes and timing your
code (perhaps using the Linux time command) for different values
of threadBits .
Tracing semantics
In your solution, tracer.RecordAction should be called on one of the
collection of structs defined in hash_miner/hash_miner.go to
report a given action.
Each action you need to report, alongside its meaning, is listed here:
MiningBegin{} signifies the start of mining. This
should appear exactly once, before any other actions.
MiningComplete{Secret} signifies the end of mining. This should appear exactly once
after all other actions, and should contain one of the discovered secrets.
If hash_miner.Mine is called more than once, this should appear strictly before any actions
relating to any subsequent calls.
WorkerStart{ThreadByte} indicates that a given worker with ThreadByte has started
searching for the secret. Every worker must report this exactly once as part of its execution, before any other
actions that worker might record.
WorkerSuccess{ThreadByte,Secret} indicates that the worker with ThreadByte
has found a secret. This worker should terminate, recording no actions past this one.
WorkerCancelled{ThreadByte} indicates that the worker with ThreadByte received
a cancellation. This worker should terminate, recording no actions past this one.
Note that ThreadByte refers to the bit prefix a worker has been assigned to explore.
If we are considering two-bit prefixes, and a worker's ThreadByte is 3 , then that worker
should be exploring only secrets that begin with binary 11 .
To precisely define "begins with", here are some examples, in Go slice syntax, of secrets that might be found
by a worker with threadByte = 3 (in binary: 11), given an arbitrary nonce for which the secret is valid:
[]uint8{ 192, 168, 1, 1 } , because 192 is 11000000 in binary, which left-to-right starts with 11
[]uint8{ 242, 0, 1, 9, 6 } , because 242 is 11110010 in binary, which also starts with 11
[]uint8{ 193 } , because 193 is 11000001 in binary, which again starts with 11
Note that there are several (allowed) race conditions. It is
possible for more than one worker to report
a WorkerSuccess action. In this case,
the MiningComplete{Secret} action should report the
first secret that was found (corresponding to the first
recorded WorkerSuccess action).
Also note that cancellation actions should be logged strictly after
a secret has been discovered (after at least
one WorkerSuccess action).
To illustrate all these things together, take for example the following correct tracing output, using
numTrailingZeroes=7 , threadBits=4 :
[miner] MiningBegin
[miner] WorkerStart ThreadByte=1
[miner] WorkerStart ThreadByte=3
[miner] WorkerStart ThreadByte=15
[miner] WorkerStart ThreadByte=2
[miner] WorkerStart ThreadByte=10
[miner] WorkerStart ThreadByte=5
[miner] WorkerStart ThreadByte=4
[miner] WorkerStart ThreadByte=6
[miner] WorkerStart ThreadByte=0
[miner] WorkerStart ThreadByte=9
[miner] WorkerStart ThreadByte=13
[miner] WorkerStart ThreadByte=14
[miner] WorkerStart ThreadByte=11
[miner] WorkerStart ThreadByte=12
[miner] WorkerStart ThreadByte=7
[miner] WorkerStart ThreadByte=8
[miner] WorkerSuccess ThreadByte=12, Secret=[194 170 210 13]
[miner] WorkerCancelled ThreadByte=9
[miner] WorkerCancelled ThreadByte=11
[miner] WorkerCancelled ThreadByte=4
[miner] WorkerCancelled ThreadByte=2
[miner] WorkerCancelled ThreadByte=3
[miner] WorkerCancelled ThreadByte=14
[miner] WorkerCancelled ThreadByte=13
[miner] WorkerCancelled ThreadByte=10
[miner] WorkerCancelled ThreadByte=7
[miner] WorkerCancelled ThreadByte=0
[miner] WorkerCancelled ThreadByte=6
[miner] WorkerCancelled ThreadByte=1
[miner] WorkerCancelled ThreadByte=15
[miner] WorkerCancelled ThreadByte=8
[miner] WorkerCancelled ThreadByte=5
[miner] MiningComplete Secret=[194 170 210 13]
Note that date/time info, alongside some benign logging noise, have been omitted relative to what you might
see when actually running your solution.
These are the important parts you should pay attention to.
Assumptions you can make
-
No malicious or otherwise invalid inputs will be given to
hash_miner.Mine .
Values will be given within specified ranges, according to stated invariants.
Assumptions you cannot make
-
Anything beyond the basic semantics of goroutines and channels; sufficient testing will likely expose
any such assumptions as violating some of the tracing conditions.
Implementation requirements
- The client code must be runnable on CS ugrad machines and be
compatible with Go version 1.15.6.
- Your code must be compatible with the given API, and run
successfully against the provided test cases.
- Your solution can only
use standard library Go
packages, and the tracing library.
- Your solution code must be Gofmted
using gofmt.
Starter code
You can download all the starter files here: starter.zip.
Note that you cannot change the API, or the tracing structs. Our marking scripts will
rely on these to automatically grade your solution.
Handin instructions
Keep the provided file layout at the top-level in your solution. You
may include additional files, but we do not expect this will be
necessary. Do not reference any additional libraries in your solution.
As with A1, use your personal repository for this assignment under the
CPSC416-2020W-T2
org. Your repository will look like cpsc416_a1_USERNAME
Rough grading scheme
Your
code must not change the API above. Your code must work on ugrad
servers. If any of these are violated, your mark for this
assignment is 0. This is true regardless of how many characters had to
be changed to make your solution compile, and regardless of how well
your solution works with a different API or on a different
machine.
The high-level A2 mark breakdown looks like this:
- 5% : MiningBegin action is reported correctly
- 5% : MiningComplete action is reported correctly
- 5% : WorkerStart action is reported correctly
- 5% : Output secret is associated with the right goroutine via ThreadByte prefix
- 10% : WorkerSuccess action is reported correctly
- 10% : WorkerCancelled action is reported correctly
- 40% : The solution runs in parallel across the right number of goroutines
- 20% : The output secret is correct
Advice
- Hint: look
at bytes.Buffer,
and fmt.Fprintf
(it is not immediately obvious, but
bytes.Buffer
satisfies io.Writer ) when dealing with the string
representation of a
hash.
- Compile and run your code on the ugrad servers.
Make sure to follow the
course collaboration policy and refer
to the submission instructions
that detail how to submit your solution.
|