For those new to the blog, I am the lead developer of Sia, a blockchain based cloud storage platform. About a year ago, I and some members of the Sia team started Obelisk, a cryptocurrency ASIC manufacturing company. Our first ASICs are going to ship in about 8 weeks, and our journey with Obelisk has given us a lot of insight into the world of cryptocurrency mining.
One of the reasons we started Obelisk was because we felt that coin devs, in general, had a very poor view into the mining world and that the best way to understand it would be to get our hands dirty and bring a miner to market.
Since starting Obelisk, we’ve learned a lot about the mining space, as relevant to GPUs, to ASICs, to FPGAs, to ASIC resistance, mining farms, electricity, and to a whole host of other subjects that coin developers should be more aware of. We aren’t able to share everything that we know, but we’ve pulled together information on a set of key topics that I think will be helpful to cryptocurrency designers and other members of the cryptocurrency community.
We’ve been pessimistic on ASIC resistance for a long time, and our journey into the hardware world solidly confirmed our position. Hardware is extremely flexible. General purpose computational devices like CPUs, GPUs, and even DRAM all make substantial compromises to their true potential in order to be useful for general computation. For basic hardware development, most algorithms can see substantial optimization just by taking away all of that generality and focusing on one specific thing.
The vast majority of ASIC-resistant algorithms were designed by software engineers making assumptions about the limitations of custom hardware. These assumptions tend to be incorrect.
Equihash is perhaps the easiest target, as a lot of people were quite confident in the equihash algorithm, and we’ve been saying for close to a year that we know how to make very effective equihash ASICs.
The key is to make sorting memory. A lot of algorithm designers don’t seem to realize that in an ASIC, you can merge the computational and storage pieces of a chip. When a GPU does equihash computations, it has to go all the way out to off-chip memory, bring data to the computational cores, manipulate the data, and then send the altered data all the way back out to the off-chip memory.
For equihash, the manipulations that you need to make to the data are simple enough that you can just merge the memory and computation together, meaning that you can do most of your manipulating in-place, substantially reducing the amount of energy used to move data back and forth, and also substantially decreasing the amount of time between adjustments to the data. This greatly increases efficiency and speed.
Needless to say, we weren’t the least bit surprised when Bitmain released powerful ASICs for equihash. The Bitmain ASICs are actually substantially less performant (5x to 10x) than our own internal study suggested they would be. There could be many reasons for this, but overall we think that it’s pretty reasonable to assume that more powerful equihash ASICs will be released in the coming months.
We also had loose designs for ethash (Ethereum’s algorithm). Admittedly, ethash was not as easily amenable to ASICs as equihash, but as we’ve seen from products on the market today, you can still do well enough to obsolete GPUs. Ethash is by far the most ASIC resistant algorithm we’ve looked at, most of the others have shortcuts that are even more significant than the shortcuts you can take with equihash.
At the end of the day, you will always be able to create custom hardware that can outperform general purpose hardware. I can’t stress enough that everyone I’ve talked to in favor of ASIC resistance has consistently and substantially underestimated the flexibility that hardware engineers have to design around specific problems, even under a constrained budget. For any algorithm, there will always be a path that custom hardware engineers can take to beat out general purpose hardware. It’s a fundamental limitation of general purpose hardware.
A lot of people believe that computing is broken up into 3 categories: CPU, GPU, and ASIC. While those are the categories that are generally visible to the public, in the chip world there’s really only one type of chip: an ASIC. Internally, Nvidia, Intel, and other companies refer to their products as ASICs. The categories as known to the public are really a statement about how flexible the ASIC is.
I would like to use a 1 to 10 scale to measure flexibility. At one side, a ‘1’, we’ll put an Intel CPU. And on the other side, a ‘10’, we’ll put a bitcoin ASIC. Designers have the ability to create chips that fall anywhere on this scale. As you move from a ‘1’ to a ‘10’, you lose substantial flexibility, but gain substantial performance. You also decrease the amount of design and development effort required as you sacrifice flexibility. On this scale, a GPU is a ‘2’.
Generally speaking, we don’t see products developed that fall anywhere between a GPU and a fully inflexible ASIC because typically by the time you’ve given up enough flexibility to move away from a GPU, you’ve only got a very specific application in mind and you are willing to sacrifice every last bit of flexibility to maximize performance. It’s also a lot less costly to design fully inflexible ASICs, which is another reason you don’t see too many things in the middle.
Two examples of products between a GPU and an ASIC would be the Baikal miners and the Google TPU. These are chips which can cover a flexible range of applications at performances which are substantially better than a GPU. The Baikal case specifically is interesting, because it’s good enough to obsolete GPUs for a large number of coins, all using the same basic chip. These chips appear to be flexible enough to follow hardforks as well.
The strategy of hardforking ASICs off of a network is going to lose potency the more it happens, because chip designers do have the ability to make chips that are flexible, anywhere from slightly flexible to highly flexible, with each piece of flexibility costing only a bit of performance. The Monero devs have committed to keeping the same general structure for the PoW algorithm, and because of that commitment, we believe that you could make a Monero miner capable of surviving hard forks with less than a 5x hit to performance.
Equihash is an algorithm that has three parameters. Zcash mining happens with one particular choice for these parameters, and any naive hardfork from Zcash to drop ASICs would likely involve changing one or more of these parameters. We were able to come up with a basic architecture for equiahsh ASICs that would be able to successfully follow a hardfork that chose any set of parameters. Meaning, a basic hardfork tweaking the algorithm parameters would not be enough to disrupt our chip, a more fundamental change would be needed. Despite this flexibility, we believe our ASIC would be able to see massive speedups and efficiency gains over GPUs. We never found funding for the equihash ASICs, and as a result, our designs ended up on the shelf.
The ultimate conclusion here once again wraps back to the capabilities of ASICs. I think there are a lot of people out there who do not realize that flexible ASICs are possible, and expected that routinely doing small hardforks to disrupt any ASICs on the network would be sufficient. It may be sufficient sometimes, but just as algorithms can attempt to be ASIC resistant, ASICs can attempt to be hardfork resistant, especially when the changes are more minor.
A few months ago, it was publicly exposed that ASICs had been developed in secret to mine Monero. My sources say that they had been mining on these secret ASICs since early 2017, and got almost a full year of secret mining in before discovery. The ROI on those secret ASICs was massive and gave the group more than enough money to try again with other ASIC resistant coins.
It’s estimated that Monero’s secret ASICs made up more than 50% of the hashrate for almost a full year before discovery, and during that time, nobody noticed. During that time, a huge fraction of the Monero issuance was centralizing into the hands of a small group, and a 51% attack could have been executed at any time.
Monero’s hardfork appears to have been successful in shaking the ASICs. I don’t believe that the ASIC designers attempted to build flexibility into their ASICs, but now that Monero has announced a twice-annual PoW change, we may see another round of secret ASICs with more flexibility. The block reward for Monero is high enough that even if you think you have only a 30% chance of your ASIC surviving the PoW hardfork, it’s more than worthwhile to pursue a hardfork resistant ASIC.
My strong guess is that Monero is going to have another round of secret ASICs built, and that these ASICs will be more conservative and flexible, attempting to follow the hard forks that Monero puts out every 6 months.
We’ve heard rumors of plenty of other secret ASICs. People who own secret ASICs tend not to talk about them very much, but as of March 2018, we had heard of secret ASIC rumors specifically for both Equihash and Ethash, and then for many other smaller coins that don’t have any ASICs on them yet. We believe a full 3 different groups were actively mining on Zcash with different ASICs prior to the Bitmain Z9 announcement.
We know of mining farms that are willing to pay millions of dollars for exclusive access to designs for specific cryptocurrencies. Even low ranking cryptocurrencies have the potential to make millions in profits for someone with exclusive access to secret ASICs. As a result, an informal underground industry has been set up around secret mining. The heavy amount of secrecy involved means it’s disconnected and mostly operates off of rumor and previous relationships. But it’s nonetheless a very lucrative industry, and even when things happen like the Vertcoin hardfork, the setback to secret miners is dwarfed by the returns of the successes.
At this point, I think it’s safe to assume that every Proof-of-Work coin with a block reward of more than $20 million in the past year has at least one group of secret ASICs currently mining on it, or will have secret ASICs mining on it within a few months. The easiest way to detect this is GPU returns, however as ASICs continue to infiltrate every coin on the market, that will cease to be a reliable metric as there will be no GPU-only coin to use as a baseline, at least not one that is large enough to sustain all of the massive GPU farms that are out there.
The ASIC game has become such an advanced game because there is so much money on the table. Even small coins can be worth tens of millions of dollars, which is more than enough to justify a high-risk production run.
Manufacturers that sell ASICs to the public, like Bitmain, tend to be less exposed than consumers to things like ASIC hardforks. Using Sia as an example, we estimate it cost Bitmain less than $10 million to bring the A3 to market. Within 8 minutes of announcing the A3, Bitmain already had more than $20 million in sales for the hardware they spent $10 million designing and manufacturing. Before any of the miners had made any returns for customers, Bitmain had recovered their full initial investment and more.
In this case, a hardfork doesn’t hurt Bitmain. Bitmain made a profit off of Sia, and there’s nothing the developers can do about that. And it seems that was the case for the Monero miners that Bitmain announced as well. Bitmain didn’t even get to announce the miners until after Monero announced their hardfork, and still, it seems that they sold enough obsolete hardware to customers to make back their costs and turn a hearty profit.
The mining game is weighted heavily in favor of the manufacturers. They get to control the hardware production, the supply, and they know more about the state of the industry than anyone else. The profitability of a miner largely depends on variables that the manufacturer controls without disclosure to anyone else.
In the case of Halong’s Decred miner, we saw them “sell out” of an unknown batch size of $10,000 miners. After that, it was observed that more than 50% of the mining rewards were collecting into a single address that was known to be associated with Halong, meaning that they did keep the majority of the hashrate and profits to themselves. Our investigation into the mining equipment strongly suggests to us that the total manufacturing cost of the equipment is less than $1,000, meaning that anyone who paid $10,000 for it was paying a massive profit premium to the manufacturer, giving them the ability to make 9 more units for themselves. Beyond this, the buyer has no idea how many were sold nor where the difficulty would be when the units shipped. The manufacturer does know whether or not the buyer is going to be able to make a return, but the buyer does not. The buyer is trusting the manufacturer entirely.
If a cryptocurrency like Sia has a monthly block reward of $10 million, and a batch of miners is expected to have a shelf life of $120 million, the most you would expect a company could make off of building miners is $120 million. But, manufacturers actually have a way to make substantially more than that.
In the case of Bitmain’s A3, a small batch of miners was sold to the public with very fast shipping time, less than 10 days. Shortly afterward, YouTube videos started circulating of people who had bought the miners and were legitimately making $800 per day off of their miner. This created a lot of mania around the A3, setting Bitmain up for a very successful batch 2.
While we don’t know exactly how many A3 units got sold, we suspect that the profit margins they made on their batch 2 sales are greater than the potential block reward from mining using the A3 units. That is to say, Bitmain sold over a hundred million dollars in mining rigs knowing that the block reward was not large enough for their customers to make back that money, even assuming free electricity. And this isn’t the first time, they pulled something similar with the Dash miners. We call it flooding, and it’s another example of the dangerous asymmetry that exists between manufacturers and customers.
At the end of the day, cryptocurrency miner manufacturers are selling money printing machines. A well-funded profit-maximizing entity is only going to sell a money printing machine for more money than they expect they could get it to print themselves. The buyer needs to understand why the manufacturer is selling the units instead of keeping them for themselves.
There are a few reasons it would make sense for a manufacturer to sell a money printing machine rather than keep it. The first is capital — manufacturing is an expensive process with a lot of lead times. If the manufacturer doesn’t have enough money to build their own units, then it makes sense to sell the units instead, and use the money from sales for production. It boils down to the manufacturer selling future revenue to get revenue today, which is a very common transaction in the financial world.
Another reason the manufacturer may sell money printing machines instead of running them is the electricity costs of running them. If the manufacturer can only get a certain deal on electricity, then there may be someone else with cheaper electricity or better datacenters who would be willing to buy the units at a price that’s higher than what the manufacturer values them at. Most manufacturers, however, have access to good electricity deals, and unless you have some deal with free electricity or are otherwise running a cutting edge professional operation, you are not likely to do better than the manufacturer.
And finally, the manufacturer may have some other reason that they want money quickly instead of making a longer-term investment into the hardware themselves. This is likely not the case in cryptocurrency mining though, because the shelf life of miners tends to be under two years, and to a business that’s not a long time at all to wait for healthy returns.
In the traditional chip development world, it takes about 2 years to go from launching a development effort to getting a chip out out the door. In the case of the Sia and Decred miners we built, it looks like we’re going to be at about 13 months total from project launch to product delivery. If we had to do the same thing again, I think we could do it in about 9 months.
A huge portion of the time spent is on full-custom routing for the chip. There’s a much faster development process called place-and-route which trims about 3 months off of the chip development time but produces chips that are 2x-5x slower than what a full-custom team can produce. We think that if we used a place-and-route design methodology, we could get our product delivery timeline close to 6 months.
We believe it took Bitmain about 5 months to create the A3 miner, and we believe it took Halong about 9 months to create the B52 miner. We suspect both of these were completed using place-and-route methodologies, especially given the relatively poor performance of each.
Those are timelines for creating a chip from scratch. If the goal is to chase a hardfork, the timelines are a lot shorter. If you know in advance that you are going to need to redesign your chip, there are a lot of shortcuts you can take to reduce the overall time required to get to market. Changing a design to meet a tweak is going to take much less time than starting from scratch, a good team with a well-planned base architecture can probably complete designs in about 2 weeks. From there, with some help from a hot-lot, you can get a new set of chips in about 40 days. These then need to be packaged, which is going to take around a week, and then sent to the manufacturer for assembly. Finally, you have to get the units to a datacenter and start mining.
If you had all the wafers, parts, and everything reserved ahead of time, we believe that you could upgrade a chip to adapt to a hardfork and have miners mining on the new hashing algorithm in about 70 days, at least in theory. In practice, Bitmain would probably require 3–4 months to adapt an existing chip to a hard fork, and if they hadn’t reserved wafers in advance they’d be looking at 4–5 months. Any company that is not Bitmain can probably add another 30–60 days to these numbers.
Some people already understand the situation with economies of scale well. The more money you spend, the more effective each dollar is. This effect is maintained throughout every level of scale that I’ve been able to peer into, including scales where you are going from billions of dollars to tens of billions of dollars.
The most simple way that this manifests is in volume orders. If you order one hundred thousand heatsinks, you can get one price. If you order one million heatsinks, you get a better price. As you continue to scale up, the price falls off. This effect is in place for almost all parts in the hardware industry, and it happens because manufacturers get to the point where they can buy and dedicate equipment to your order, and then keep that equipment at 100% efficiency. As you scale up, you gain greater customization and specialization in addition to cost savings, meaning your products become more effective as well as cheaper.
At some point, it makes sense just to buy out all the capacity at the manufacturer. A huge component of the manufacturing price is paying for equipment. Equipment that is idle 50% of the time is going to have 2x the effective per-part price than equipment that is in use 100% of the time. As you increase lead times and order volumes, you can start getting fully dedicated equipment running nonstop for you, which again pulls the price down substantially.
In a similar sort of turn, someone has to manufacture that equipment. If you scale up to the point where you are continuously ordering specific equipment, the manufacturer can start to dedicate pipelines to you and keep their own equipment operating 100% of the time, and so the equipment you use for manufacturing is getting cheaper now on top of being in use all the time.
And that’s just the beginning. At every step, each provider, manufacturer, etc. is going to be taking margins, typically somewhere around 30% depending on how commodity your orders are. If you have enough money, you can start to engage in vertical integration, cutting out the margins of your manufacturers by either buying them out or creating your own margin-free entity.
Hardware goes through a lot of steps. There’s the acquisition of raw materials like iron and oil, the refinement of these materials, and then they get manufactured into base parts that can be sold for more general products. Those base parts often have lead times that are 6+ months, which means that suppliers typically keep a huge number of them in stock, so that they can provide customers with parts on faster timelines. Every step typically introduces both a middleman and inefficiency, especially because each step is targeted towards general use parts instead of a specific product. If you have a specific product that has enough volume/scale to justify dedicated supply chain elements, you will shave costs, you will shave lead times, you will improve product quality and performance, and you will get ahead of what anyone without that type of scale can achieve.
To present a very rough number, it seems to me that every time you 10x the amount of money you are spending, you can save about 30% per part. That is to say, if you are spending $100 million on mining units, you might get units for $500 each. And if you are spending $1 billion on mining units, you can squeeze that price down to $350 per miner just by having more money to throw around. And then if you jump to $10 billion, your per-miner price might drop to $245. Your mining machines aren’t just getting cheaper though, they are also becoming more customized and higher performance. You don’t just build a dollar moat with scale, you also build a quality moat.
When we started Obelisk, we had numerous separate sources reach out and warn us that Bitmain plays dirty and that if we try to manufacture in China, we will be stopped.
With that in mind, we brought up the issue to everyone we worked with and proceeded cautiously with an American manufacturer that owned a facility in China. This was attractive because the prices were close to half of what we would have paid to manufacture in America, and manufacturing was going to be one of our largest expenses by far.
We did everything we could to keep the entity disconnected from Obelisk, and we hid the name of the manufacturer from our website or any public data, and we were very careful with who we gave the name of our manufacturer privately. We had a separate entity put in parts orders where we could.
After any reasonable timeframe to reach out to another manufacturer, after-hours on a Friday night, our manufacturer reached out to us and said with little warning or reasonable explanation that they would be unable to manufacture for us. Just as we had been warned, our attempt to manufacture in China had fallen flat on its face. This setback is estimated to have cost us somewhere north of $2 million.
We have absolutely no evidence that Bitmain was involved in any way. We’ve had other companies reach out to us and confirm that they’ve experienced similar things, but they too had no concrete evidence that Bitmain was involved in any way. I honestly was not sure whether to include this section in the blog post, because unlike most of the other things I’ve been saying, we really have nothing more than a bunch of warnings that ended up being correct to inform us.
But it’s well established in the industry that Bitmain plays dirty, and it’s been suggested to us from all sides that they have been and will continue making moves within our supply chain to ensure we can’t succeed, and that they do the same with all of their competitors.
Mining farms are perhaps the one area where manufacturers and economies of scale are not dominant. Good electricity deals tend to come in smaller packages, tend to be distributed around the world, and tend to be difficult to find and each involve unique circumstances. As such, it’s been difficult for larger companies to create a system for scooping up low-cost electricity worldwide. Instead, the cheapest electricity and data centers in the world tend to be held by smaller parties that don’t individually own all that much electricity or hashrate.
From what I’ve been able to dig up, the average professional mining farm is paying somewhere between 4 and 6 cents for electricity, and then another 3 to 6 cents for management and maintenance. A total cost of $50 per kilowatt per month is probably somewhere close to the median for large scale mining farms. As techniques improve and the industry grows, we expect this number to fall closer to $35 per kilowatt per month (including maintenance, land, taxes, etc.) throughout 2019 and 2020. We don’t believe that anyone paying more than $80 per kilowatt per month will be able to remain competitive unless the price of cryptocurrency continues to rise rapidly over the next year.
The top 20% of miners all seem to be below $35 per kilowatt per month already, from what we’ve been able to glean, and the top 5% seem to be below $20 per kilowatt per month. By my estimation, if the price of Bitcoin were to fall substantially, these mining operations would be able to stay in business and everyone paying $50 or more would be forced to shut down their facilities.
It’s really hard to know where Bitmain is at but based on everything we’ve been seeing we estimate that Bitmain is somewhere around the $30 per kilowatt per month mark. That is, they are doing better than the median mining operation, but by no means are they in the elite tier.
Most mining startups seem incredibly focused around the chip itself. From what we’ve seen, the chip is really less than half the story. So, the chip is important (apologies for the title), but if all you’ve got is the best chip in the world, you aren’t going to be a competitive manufacturer.
As a miner, the goal at the end of the day is to do as many hashes as possible for as little money as possible. A faster chip means that you need to spend less money on chips to get hashrate. And a more energy efficient chip means you need to spend less money on electricity to get hashrate. But you aren’t just spending money on chips and electricity. You spend money on PCB, on controllers, on ports like ethernet ports, on power supplies and power management, on fans, on enclosures, on shelves in your datacenter, etc.
At the end of the day, the chip is only a portion of the equation for mining successfully. If you aren’t thinking about the whole picture, you are going to end up with a chip that will lose you money. This is actually one of the things that killed Butterfly Labs (among many) — they designed a high-performance chip that produced hundreds of watts of heat. By comparison, Bitmain chips are typically about six watts each. Where Bitmain is able to throw on a forest of fat heatsinks, Butterfly Labs had to struggle with expensive, cutting edge, unreliable cooling systems, and that ultimately meant their powerhouse chip was late to market and too expensive to operate.
People tend to under-estimate Bitmain. Yes, they have the most money, and yes, they dominate because of their economies of scale. But they also dominate because they’ve got the fastest-to-market time of any company. They dominate because they’ve got the best chip developers in cryptocurrency. They dominate because they’ve innovated in dozens of places to squeeze costs and inefficiencies out of corners that most people aren’t aware exists. They hire the best people and pay them well. And they work hard to make sure that at every iteration, they are the ones on top.
There’s not a whole lot more to say here. I feel that a lot of people under-estimate Bitmain or assume that because they play dirty they wouldn’t be able to keep up without playing dirty. But that’s not true. They play dirty because it’s yet another place they can optimize their business, and because they know they can get away with it. Everything else they does is highly optimized as well. If we want to understand mining, we need to appreciate that the entity that controls most of mining today is an impressive, highly skilled, well-refined entity.
The biggest takeaway from all of this is that mining is for big players. The more money you spend, the more of an advantage you have, and there’s not an easy way to change that equation. At least with traditional Nakamoto style consensus, a large entity that produces and controls most of the hashrate seems to be more or less the outcome, and at the very best you get into a situation where there are 2 or 3 major players that are all on similar footing. But I don’t think at any point in the next few decades will we see a situation where many manufacturing companies are all producing relatively competitive miners. Manufacturing just inherently leads to centralization, and it happens across many different vectors.
Though that’s discouraging news, it’s not the end of the world for Bitcoin or other Proof of Work-based cryptocurrencies. Decentralization of hashrate is a good-to-have, but there are a large number of other incentives and mechanisms at play that keep monopoly manufacturers in line. A great example of this is the Bitcoin / Segwit2x situation. More than 80% of the hashrate was openly in support of activating Segwit2x, and yet the motion as a whole failed.
There are plenty of other tools available to cryptocurrency developers and communities as well to deal with a hostile hashrate base, including hardforks and community, splits. The hashrate owners know this, and as a result, they are careful not to do anything that would cause a revolt or threaten their healthy profit streams. And now that we know to expect a largely centralized hashrate, we can continue as developers and inventors to work on structures and schemes which are secure even when the hashrate is all pooled into a small number of places.