Saturday, February 24, 2018

Let's get going!


You might be asking if this is just one more of the many blog posts about go that can be found all over the internet.  I don't want to duplicate what other people have written, so I'll mostly be writing about the sha3/keccak crypto functions in go.

Despite a brief experiment with go almost two years ago, I had not done any serious coding in go.  That all changed when early this year I decided to write an ethereum miner from scratch.  After maintaining and improving https://github.com/nerdralph/ethminer-nr, I decided I would like to try something other than C++.  My first attempt was with D, and while it fixes some of the things I dislike about C++, 3rd-party library support is minimal.  After working with it for about a week, I decided to move on.  After some prototyping with python/cython, I settled on go.

After eight years of development, go is quite mature.  As I'll explain later in this blog post, my concerns about code performance were proven to be unwarranted.  Although it is quite mature, I've found it's still new enough that there is room for improvements to be made in go libraries.

Since I'm writing an ethereum miner, I need code that can perform keccak hashing.  Keccak is the same as the official sha-3 standard with a different pad (aka domain separation) byte.  The crypto/sha3 package internally supports the ability to use arbitrary domain separation bytes, but the functionality is not exported.  Therefore I forked the repository and added functions for keccak-256 and keccak-512.  A common operation in crypto is XOR, and the sha3 package includes an optimized XOR implementation.  This function is not exported either, so I added a fast XOR function as well.
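To make the pad difference concrete, here is a minimal sketch of the final-block padding step.  It is not the code from my fork or from crypto/sha3; padBlock and the constant names are made up for this illustration.  The two domain separation bytes are 0x01 for original Keccak and 0x06 for standardized SHA-3, and 136 is the sponge rate in bytes for the 256-bit variants.

```go
package main

import "fmt"

// Domain separation bytes appended to the message before the final
// permutation. These values are the only difference between original
// Keccak and standardized SHA-3.
const (
	dsKeccak byte = 0x01
	dsSHA3   byte = 0x06
)

// rate256 is the sponge rate in bytes for the 256-bit variants:
// a 1600-bit state minus 2*256 bits of capacity, divided by 8.
const rate256 = 136

// padBlock is a hypothetical helper for illustration. It pads the last
// partial block of a message (len(tail) must be < rate) with the given
// domain separation byte and sets the top bit of the block's last byte.
func padBlock(tail []byte, rate int, ds byte) []byte {
	block := make([]byte, rate)
	copy(block, tail)
	block[len(tail)] = ds
	block[rate-1] |= 0x80
	return block
}

func main() {
	msg := []byte("abc")
	k := padBlock(msg, rate256, dsKeccak)
	s := padBlock(msg, rate256, dsSHA3)
	fmt.Printf("keccak: % x ... %02x\n", k[:4], k[rate256-1])
	fmt.Printf("sha3:   % x ... %02x\n", s[:4], s[rate256-1])
}
```

Everything else in the sponge construction is identical, which is why a single unexported parameter is all that separates the two hash families.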

Ethereum's proof-of-work uses a DAG of about 2GB that is generated from a 32MB cache.  The cache and the DAG change and grow slightly every 30,000 blocks (about 5 days).  Using my modified sha3 library and based on the description from the ethereum wiki, I wrote a test program that connects to a mining pool, gets the current seed hash, and generates the DAG cache.  The final hex string printed out is the last 32 bytes of the cache.  I created an internal debug build of ethminer-nr that also outputs the last 32 bytes of the cache in order to verify that my code works correctly.
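The way the cache and DAG grow per epoch can be sketched as follows.  This is my reading of the size calculation in the Ethash description on the ethereum wiki, not an excerpt from my test program: the constants are the ones I recall from that description, and the function names and the example block number are just illustrative.

```go
package main

import "fmt"

// Constants as I recall them from the Ethash description.
const (
	epochLength        = 30000   // blocks per epoch
	hashBytes          = 64      // size of one cache item
	mixBytes           = 128     // size of one DAG mix
	cacheBytesInit     = 1 << 24 // cache size at epoch 0 before adjustment
	cacheBytesGrowth   = 1 << 17 // cache growth per epoch
	datasetBytesInit   = 1 << 30 // DAG size at epoch 0 before adjustment
	datasetBytesGrowth = 1 << 23 // DAG growth per epoch
)

// isPrime uses trial division, which is fast enough for these sizes.
func isPrime(n uint64) bool {
	if n < 2 {
		return false
	}
	for i := uint64(2); i*i <= n; i++ {
		if n%i == 0 {
			return false
		}
	}
	return true
}

// cacheSize returns the cache size in bytes for the epoch containing
// block. The size is reduced until the item count is prime.
func cacheSize(block uint64) uint64 {
	sz := cacheBytesInit + cacheBytesGrowth*(block/epochLength) - hashBytes
	for !isPrime(sz / hashBytes) {
		sz -= 2 * hashBytes
	}
	return sz
}

// datasetSize returns the DAG size in bytes for the epoch containing block.
func datasetSize(block uint64) uint64 {
	sz := datasetBytesInit + datasetBytesGrowth*(block/epochLength) - mixBytes
	for !isPrime(sz / mixBytes) {
		sz -= 2 * mixBytes
	}
	return sz
}

func main() {
	block := uint64(5150000) // arbitrary example block number
	fmt.Printf("epoch %d: cache %.1f MB, DAG %.1f MB\n",
		block/epochLength,
		float64(cacheSize(block))/(1<<20),
		float64(datasetSize(block))/(1<<20))
}
```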

When it comes to performance, I had read some old benchmarks that show gcc-go generating much faster code than the stock go compiler (gc).  Things have obviously changed, as the stock go compiler was much faster in my tests.  My ETH cache generation test program takes about 3 seconds to run when using the standard go compiler versus 8 seconds with gcc-go using -O3 -march=native.  This is on an Intel G1840 comparing go version go1.9.2 linux/amd64 with go1.6.1 gccgo.  The versions chosen were the latest pre-packaged versions for Ubuntu 16 (golang-1.9 and gccgo-6).  At least for compute-heavy crypto functions, I don't see any point in using gcc-go.


Sunday, February 4, 2018

Ethereum mining pool comparisons


Since I started mining ethereum, the focus of my optimizations has been on mining software and hardware tuning.  While overclocking and software mining tweaks are the major factors in maximizing earnings, choosing the best mining pool can make a measurable difference as well.

I tested the top three pools with North American servers: Ethermine, Mining Pool Hub, and Nanopool.  I tested mining on each pool, and wrote a small program to monitor pools.  Nanopool came out on the bottom, with Ethermine and Mining Pool Hub both performing well.

I think the biggest difference between pool earnings has to do with latency.  For someone in North America, using a pool in Asia with a network round-trip latency of 200-300ms will result in lower earnings than a North American pool with a network latency of 30-50ms.  The reason is higher latency causes a higher stale share rate.  If it takes 150ms for a share submission to reach the pool, with Ethereum's average block time of 15 seconds, the latency will add 1% to your stale share rate.  How badly that affects your earnings depends on how the pool rewards stale shares, something that is unfortunately not clearly documented on any of the three pools.
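The arithmetic behind that 1% figure is just latency divided by block time.  A minimal sketch, assuming blocks arrive at a steady average interval (a first-order approximation, and addedStaleRate is a name I made up for this example):

```go
package main

import (
	"fmt"
	"time"
)

// addedStaleRate estimates the fraction of shares that go stale purely
// because of submission latency. It assumes a share is stale when a new
// block appears while the share is still in flight, so the fraction is
// the share's time in flight over the average block time.
func addedStaleRate(latency, blockTime time.Duration) float64 {
	return float64(latency) / float64(blockTime)
}

func main() {
	blockTime := 15 * time.Second
	latencies := []time.Duration{
		30 * time.Millisecond,
		150 * time.Millisecond,
		300 * time.Millisecond,
	}
	for _, l := range latencies {
		fmt.Printf("%v latency adds about %.1f%% stale shares\n",
			l, 100*addedStaleRate(l, blockTime))
	}
}
```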

When I first started mining I would do simple latency tests using ping.  Following Ethermine's recent migration of their servers to AWS, they no longer respond to ping.  What really matters is not ping response time, but how quickly the pool forwards new jobs and processes submitted shares.  What further complicates an evaluation of different pools is that they often have multiple servers for one host name.  For example, here are the IP addresses for us-east1.ethereum.miningpoolhub.com from dig:
us-east1.ethereum.miningpoolhub.com. 32 IN A   192.81.129.199
us-east1.ethereum.miningpoolhub.com. 32 IN A   45.56.112.78
us-east1.ethereum.miningpoolhub.com. 32 IN A   45.33.104.156
us-east1.ethereum.miningpoolhub.com. 32 IN A   45.56.113.50

Even though 45.56.113.50 has a ping time about 40ms lower than 192.81.129.199, the 192.81.129.199 server usually sent new jobs faster than 45.56.113.50.  The difference between the first and last server to send a job was usually 200-300ms.  With nanopool, the difference was much more significant, with the slowest server often sending a new job 2 seconds (2000ms) after the fastest.  Recent updates posted on nanopool's site suggest their servers have been overloaded, such as changing their static difficulty from 5 billion to 10 billion.  Even with miners submitting shares at half the rate, it seems they are still having issues with server loads.

Less than a week ago, us1.ethermine.org resolved to a few different IPs, and now it resolves to a single AWS IP: 18.219.59.155.  I suspect there are at least two different servers using load balancing to respond to requests for the single IP.  By making multiple simultaneous stratum requests and timing the new jobs received, I was able to measure variations of more than 100ms between some jobs.  That seems to confirm my conclusion that there are likely multiple servers with slight variations in their performance.

In order to determine if the timing performance of the pools was actually having an impact on pool earnings, I looked at stats for blocks and uncles mined from etherscan.io.  Those stats show that although Nanopool produces about half as many blocks as Ethermine, it produces more uncles.  Since uncles receive a reward of at most 2.625 ETH vs 3 ETH for a regular block, miners should receive higher payouts on Ethermine than on Nanopool.  Based solely on uncle rate, payouts on Ethermine should be slightly higher than MPH.  Eun, the operator of MPH, has been accessible and responsive to questions and suggestions about the pool, while the Ethermine pool operator is not accessible.  As an example of that accessibility, three days ago I emailed MPH about 100% rejects from one of their pool servers.  Thirty-five minutes later I received a response asking me to verify that the issue was resolved after they rebooted the server.
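The 2.625 ETH figure mentioned above is the best case for an uncle: the reward shrinks by one eighth of the block reward for every block between the uncle and the block that includes it.  A short sketch of that rule as I understand it (uncleReward is a name chosen for this example, and float64 is used for readability rather than exact wei amounts):

```go
package main

import "fmt"

// uncleReward returns the reward for an uncle included depth blocks
// after its own height, given the full block reward. An uncle can be
// included at depths 1 through 6; outside that range there is no reward.
func uncleReward(blockReward float64, depth int) float64 {
	if depth < 1 || depth > 6 {
		return 0
	}
	return blockReward * float64(8-depth) / 8
}

func main() {
	const blockReward = 3.0 // ETH per regular block
	for depth := 1; depth <= 6; depth++ {
		fmt.Printf("depth %d: %.3f ETH\n", depth, uncleReward(blockReward, depth))
	}
}
```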

In conclusion, either Ethermine or MPH would make reasonable choices for someone mining in North America.  This pool comparison has also opened my eyes to optimization opportunities in mining software in how pools are chosen.  Until now mining software has done little more than switch pools when a connection is lost or no new jobs are received for a long period of time.  My intention is to have my mining software dynamically switch to mining jobs from the most responsive server instead of requiring an outright failure.