For completeness, I have no affiliation or connection with Apress -- please consider this a heads-up.
http://meyerweb.com/eric/comment/chech.html
[edit: clarified context]
Considering that this second post got much more traction than the first, I don't see anything wrong.
https://www.cs.utexas.edu/users/EWD/ewd02xx/EWD215.PDF
In context, it was a piece arguing against using GOTO to the exclusion of all other control structures (e.g., 'for' and 'while' loops).
http://mainisusuallyafunction.blogspot.com/2012/11/attacking...
Several Intel chipset generations require certain register writes on shutdown (disable busmaster) or they won't _actually_ shut down. Operating systems aren't aware of that. (https://github.com/coreboot/coreboot/blob/master/src/southbr...)
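For a concrete sense of what "disable busmaster" means at the register level, here's a hedged sketch using Linux's sysfs view of PCI config space. The device address is hypothetical, and actually clearing this bit on a live southbridge can hang a running system; coreboot does the equivalent at a much lower level in its shutdown path.

    # Illustrative only: clear the PCI Bus Master Enable bit, the kind of
    # register write a shutdown path may need. Device 0000:00:1f.0 is a
    # hypothetical address. Requires root; poking the config space of a
    # live device can hang the machine.
    import struct

    DEV = "/sys/bus/pci/devices/0000:00:1f.0/config"
    CMD_OFFSET = 0x04      # PCI command register (16-bit)
    BUS_MASTER = 1 << 2    # Bus Master Enable bit

    with open(DEV, "r+b") as f:
        f.seek(CMD_OFFSET)
        (cmd,) = struct.unpack("<H", f.read(2))
        f.seek(CMD_OFFSET)
        f.write(struct.pack("<H", cmd & ~BUS_MASTER))  # busmastering disabled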
UEFI Secure Boot requires "authenticated variables", which can be updated by the OS (after checking authentication, using a signature scheme). UEFI code resides somewhere in memory, so the OS (or ring0 code) could opt to bypass the verification and simply rewrite those variables. The recommended (but not required) solution is to move variable update to SMM. (https://firmware.intel.com/sites/default/files/resources/A_T...)
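To make "authenticated variables" concrete, here's a hedged sketch of how OS-side code sees one through Linux's efivarfs. The "db" name and GUID are the standard image-security-database ones, but treat the path and the whole snippet as illustrative:

    # Minimal sketch: inspect a Secure Boot authenticated variable from the
    # OS via Linux efivarfs. The first 4 bytes of an efivarfs file are the
    # variable's attribute flags; the rest is the variable data.
    import struct

    VAR = "/sys/firmware/efi/efivars/db-d719b2cb-3d3a-4596-a3bc-dad00e67656f"
    EFI_VARIABLE_TIME_BASED_AUTHENTICATED_WRITE_ACCESS = 0x00000020

    with open(VAR, "rb") as f:
        attrs = struct.unpack("<I", f.read(4))[0]  # attribute flags
        payload = f.read()                         # signature database data

    print("attrs: %#x" % attrs)
    print("authenticated:",
          bool(attrs & EFI_VARIABLE_TIME_BASED_AUTHENTICATED_WRITE_ACCESS))
    print("payload bytes:", len(payload))

Writes to such a variable go through the firmware's SetVariable() service, which is exactly the verification path the move-it-to-SMM recommendation is trying to protect.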
Several hardware features are actually implemented in SMM. I've seen SMM-based handling of certain special keys (e.g., a "disable WiFi" button) where ACPI grabs the event, then traps into SMM using a magic IO port.
This already suggests the owner of the CPU isn't who they are protecting, but it gets worse (even before we consider the risk from AMT). Starting an SGX enclave seems to require[3] a "launch key" that is only known by Intel, allowing Intel to control what software is allowed to be protected by SGX.
[1] https://software.intel.com/en-us/blogs/2013/09/26/protecting...
[2] Before the term "DRM" was coined, the same crap used to be called "trusted computing" (back when Microsoft was pushing Palladium/NGSCB)
[3] https://jbeekman.nl/blog/2015/10/intel-has-full-control-over...
https://hn.algolia.com/?query=%22the%20new%20goto%22&sort=by...
http://blog.invisiblethings.org/papers/2015/x86_harmful.pdf
http://blog.invisiblethings.org/2015/10/27/x86_harmful.html
Not a great approach; one ought to just pick the better of the two, which in this case is the HTML, because it gives more background, loads faster, and links to the PDF.
General remark: I doubt that we'll make the dupe detector sophisticated enough to catch a case like this, but I do think we'll add software support for users to identify dupes when they see them. That's what happens informally already (as you all did in this thread, and by flagging the other post) so the shortest path to better deduplication for HN seems to be: write software to make community contribution easy. Also I kind of like the idea of giving a karma point to the first user who correctly links a given pair of posts.
But it gets worse: every processor from the PPro (1995) through Sandy Bridge has a gaping security hole, reported (conveniently only AFTER Intel patched it two generations ago) by a guy working for Battelle Memorial Institute, a known CIA front and black-budget sink.
https://www.blackhat.com/docs/us-15/materials/us-15-Domas-Th...
surprisingly good writeup: http://www.theregister.co.uk/2015/08/11/memory_hole_roots_in...
list of CIA fronts: http://www.jar2.com/2/Intel/CIA/CIA%20Fronts.htm (Battelle is on it)
I critiqued QubesOS in the past for re-inventing the wheel, and on a highly insecure platform at that. Her recent write-up supports my critique more than ever. Regardless, they're at least doing something with provable benefit and high usability on a platform with proven benefit, both of which can be further secured or extended by others. An exception to the rule in mainstream INFOSEC, where the sense of security is almost entirely false because no effort is made to address the TCB.
The only project in this space leveraging best practices in TCB or architecture is GenodeOS. They're doing what I suggested QubesOS do a while back: build on all the proven, low-TCB techniques in academia. My main critique of them is that they're too flexible and need to focus on a single stack long enough to get it working solidly, like the Qubes team did. They keep building on and integrating the better stuff out of the L4 family of security-engineering research, though.
http://www.gaisler.com/index.php/products/ipcores/soclibrary
Also SPARC, but with plenty of GPL. Has a quad-core, too, with all of them designed to be easily modified and re-synthesized. :)
http://eev.ee/blog/2012/04/09/php-a-fractal-of-bad-design/
It's amazing how much has been done in what was essentially a pile of hacks on top of a pre-processor making it pretend to be an application language. That pros avoid it, and that its wiser users almost always leave it eventually, further corroborates the author's point that it's fundamentally bad. If anything, it's one option among better ones (Python, Ruby) for non-programmers to get started in web apps. There's little reason to use it at this point, with all the 3rd-party components and communities around better languages.
(From the excellent https://lkml.org/lkml/2011/5/25/228)
Most motherboard vendors also throw stuff onto enterprise motherboards for doing things remotely. They can have issues: https://www.youtube.com/watch?v=GZeUntdObCA
http://www.adapteva.com/andreas-blog/semiconductor-economics...
As far as cost, it depends on how you do it. There are three ways:
1. FPGA-proven design done by volunteers that's ported to a Structured ASIC by eASIC or Triad Semiconductor.
2. Standard Cell ASIC that's done privately.
3. Standard Cell ASIC that's done in academia whose OSS deliverables can be used privately.
Option 1 will be the cheapest and easiest. Examples of these are here:
http://www.easic.com/products/90-nm-easic-nextreme/
http://www.triadsemi.com/vca-technology/
These are a lot like FPGAs, although Triad adds analog. The idea is there's a bunch of pre-made logic blocks that your hardware maps to. Unlike FPGAs, the routing is done with a custom layer of metal that only includes (or powers) the necessary blocks. That lets the chip run faster, draw less power, and cost less. "Cheaper" is important given that FPGA vendors recover costs with high unit prices.
The S-ASIC vendors will typically have premade I.P. for common use cases (e.g., Ethernet), and other vendors' stuff can target it. Excluding your design cost and I.P. costs, the S-ASIC conversion itself will be a fraction of a full ASIC's development costs. I don't know eASIC's pricing, but I know they do maskless prototyping for around $50,000 for 50 units. They'll likely charge a six-digit fee upfront with a cut of sales, too, at an agreed volume. Last I heard, Triad is picky about who they work with but costs around $400,000.
Option 2 is the easier version of the real deal: an actual ASIC. This basically uses EDA tools to create, synthesize, integrate, and verify an ASIC's components before fabbing them for real testing. The tools can be $1+ million a seat. Mask and EDA costs are the real killer; silicon itself is cheap, with packaging probably around $10-30 a chip at a minimum of maybe 40 chips or so.

Common strategies are to use smart people with cheaper tools (e.g., Tanner, or Magma back in the day), use older nodes whose masks are cheaper (350nm/180nm), license I.P. from third parties (still expensive), or build the solution piecemeal while licensing the pieces to recover costs. Multi-project wafers (MPWs) also keep costs down: they split a mask and fab run among a number of parties, where each gets some of the real estate and an equivalent portion of the cost. 350nm or 180nm are best for special-purpose devices such as accelerators, management chips, and I/O guards that don't need 1GHz. Third-party licensing might be a no-go for OSS unless it's dual-licensed or open-source proprietary. Reuse is something they all do.

All in all, on a good node (90nm or lower), a usable SOC is going to cost millions no matter how you look at it. That said, the incremental cost can be in the hundreds of thousands if you re-use past I.P. (esp. I/O) and do MPWs.
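To make the MPW cost-splitting concrete, here's a back-of-the-envelope sketch. Every number in it is made up for illustration; real mask and wafer prices vary widely by node and fab:

    # Hypothetical MPW math: the mask set dominates fixed cost, and an MPW
    # splits it across projects in proportion to die area on the shared reticle.
    mask_set_cost = 1_000_000   # full mask set at some node ($, made up)
    wafer_run_cost = 200_000    # fab run ($, made up)
    total_area_mm2 = 400        # reticle area shared by all participants
    my_area_mm2 = 25            # your design's slice

    share = my_area_mm2 / total_area_mm2
    my_cost = share * (mask_set_cost + wafer_run_cost)
    print(f"{share:.1%} of reticle -> ${my_cost:,.0f}"
          f" vs ${mask_set_cost + wafer_run_cost:,} solo")
    # 6.2% of reticle -> $75,000 vs $1,200,000 solo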
Company doing MPW with cool old node + 90nm memory trick on top:
http://www.tekmos.com/products/asics/process-technologies
Option 3 is academic development. The reason this is a good idea is that universities get huge discounts on EDA tools, get significant discounts on MPWs at places like the MOSIS fabrication service, and may have students smart enough to use the tools while being much cheaper than pros. They might work hand-in-hand with proprietary companies to split the work between them, or at least let pros assist the amateurs. I've often pushed for our universities to make a bunch of free, OSS components for cutting-edge nodes, ranging from cell libraries to I/O blocks to whole SOCs. There's little of that, but occasional success stories. Here are two standard-cell ASICs from academia: a 90nm microcontroller and (my favorite) the 45nm Rocket RISC-V processor, which was open-sourced.
http://repository.tudelft.nl/assets/uuid:8a569a87-a972-480c-...
http://www.eecs.berkeley.edu/~yunsup/papers/riscv-esscirc201...
Note: Those papers will show you the ASIC standard-cell process flow and tools that can be involved. The result was awesome with Rocket.
So, enough academics doing that for all the critical parts of SOCs could dramatically reduce costs. My proposal was to do each I/O (where possible) on 180nm, 90nm, 45nm, and 28nm, the idea being that people moving their own work down a process node could just use drop-in replacements. The I/O and supplementary stuff would be almost free, so that lets developers focus on their real functionality.
My other proposal was a free, OSS FPGA architecture with a S-ASIC and ASIC conversion process at each of the major nodes. Plenty of pre-made I.P. as above with anyone able to contribute to it. Combined with QFlow OSS flow or proprietary EDA, that would dramatically reduce OSS hardware cost while letting us better see inside.
Archipelago Open-Source FPGA http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-43...
Note: Needs some improvements but EXCITING SHIT to finally have one!
Qflow Open-source Synthesis Flow http://opencircuitdesign.com/qflow/
Synflow open-source HDL and synthesis http://cx-lang.org/
Note: I haven't evaluated or vetted Synflow yet. However, theirs is the only I.P. I've ever seen for under $1,000. If they're decent quality, then there must be something to their method and tools, eh?
So, there's your main models. Both the commercial and academic ones might benefit from government grants (esp. DARPA/NSF) or private donations from companies/individuals that care about privacy or just cheaper HW development. Even Facebook or Google might help if you're producing something they can use in their datacenters.
For purely commercial work, the easiest route is to get a fabless company in Asia to do it, so you're getting cheaper labor and not paying the full cost of tools. This is true regardless of who or where: tools paid for in one project can be reused on the next for free, as you pay by the year. Also, licensing intermediate I.P. or selling premium devices can help recover cost. That leads me to believe open-source proprietary, maybe dual-licensed, is best for OSS HW.
So, hope there's enough information in there for you.
already existed - but apparently not (except for targeting FPGAs, as you mention)?
1. Monitor hardware itself for bad behavior.
2. Monitor and restrict I/O to catch any leaks or evidence of attacks.
3. Use triple, diverse redundancy with voter algorithms for a given HW chip and function.
4. Use a bunch of different ones while obfuscating what you're using.
5. Use a trusted process to make the FPGA, ASIC, or both.
I've mainly used No's 2-4, with No 5 being the endgame. I have a method for No 5 but can't publish it. Suffice it to say that almost all strategies involve obfuscation and shell games, where publishing gives enemies an edge. Kerckhoffs's principle is wrong against nation-states: an obfuscated and diversified combination of proven methods is the best security strategy. Now, ASIC development is so difficult and cutting-edge that knowing the processes themselves aren't being subverted is likely impossible.
So, my [unimplemented] strategy focuses on the process, people, and key steps. I can at least give an outline as the core requirements are worth peer review and others' own innovations. We'd all benefit.
1. You must protect your end of the ASIC development.
1-1. Trusted people who won't screw you and with auditing that lets each potentially catch others' schemes.
1-2. Trusted computers that haven't been compromised in software or physically.
1-3. Endpoint protection and energy gapping of those systems to protect I.P. inside with something like data diodes used to release files for fabs.
1-4. Way to ensure EDA tools haven't been subverted in general or at least for you specifically.
2. CRITICAL and feasible. Protect the hand-off of your design details to the mask-making company.
3. Protect the process for making the masks.
3-1. Ensure, as in (1), security of their computers, tools, and processes.
3-2. Their interfaces should be done in such a way that they always do similar things for similar types of chips with the same interfaces. Anything done differently signals caution or alarm.
3-3. The physical handling of the mask should be how they always do it and/or automated where possible. Same principle as 3-2.
3-4. Mask production company's ownership and location should be in a country with low corruption that can't compel secret backdoors.
4. Protect the transfer of the mask to the fab.
5. Protect the fab process, at least one set of production units, the same way as (3). Same security principles.
6. Protect the hand-off to the packaging companies.
7. Protect the packaging process. Same security principles as (3).
8. Protect the shipment to your customers.
9. Some of the above apply to PCB design, integration, testing, and shipment.
So, there you have it. It's a bit easier than some people think in some ways. You don't really need to own a fab. However, you do have to understand how mask making and fabbing are done, be able to observe that, have some control over how tooling/software are done, and so on. Plenty of parties and money are involved in this. It will add cost to any project doing it, which means few will (competitiveness).
I mainly see it as something funded by governments or private parties for increased assurance of sales to government and security-critical sectors; it will almost have to be subsidized by one or the other. My hardware guru cleverly suggested that a bunch of smaller governments (e.g., G-88) might do it as a differentiator and for their own use, pooling their resources.
It's a large undertaking regardless. As far as specifics, I have a model for that, and I know one other high-assurance engineer with one. Most people just do clever obfuscation tricks in their designs to detect modifications or brick the system upon their use, with optional R.E. of samples. I don't know those tricks, and it's too cat-and-mouse for me. I'm focused on fixing it at the source.
EDIT: I also did another essay tonight on cost of hardware engineering and ways to get it down for OSS hardware. In case you're interested:
[ed: I'm thinking of things like LEON etc - but as mentioned, and as I understand it, for the ASIC case, maybe not the whole eval board is open. And it's not really in the same ballpark as the dual/quad multi-GHz CPUs we've come to expect from low-end hardware:
http://www.gaisler.com/index.php/products/boards/gr-cpci-leo... ]
Example of custom design flow http://viplab.cs.nctu.edu.tw/course/VLSI_SOC2009_Fall/VLSI_L...
Note: Load this up right next to the simple, 90nm MCU PDF I gave you and compare the two. I think you'll easily see the difference in complexity. One you'll be able to mostly follow just by googling terms, understanding a lot of what they're doing. You're not going to understand the specifics of the full-custom flow at all. There's simply too much domain knowledge built into it, combining years of analog and digital design knowledge. Top CPUs hit their benchmarks using full-custom for pipelines, caches, etc.
Example of verification that goes into making those monstrosities work:
http://fvclasspsu2009q1.pbworks.com/f/Yang-GSTEIntroPSU2009....
So, yeah, getting to that level of performance would be really hard work. The good news is that modern processors, esp. x86, carry lots of baggage that drains performance we don't need. Simpler cores in large numbers, with accelerators, can be much easier to design and perform much better. Like so:
http://www.cavium.com/OCTEON-III_CN7XXX.html
Now, that's 28nm for sure. The point remains, though: Cavium had nowhere near Intel's financial resources, yet their processors smoke Intel's and were built in a shorter amount of time. Adapteva's 64-core Epiphany accelerator was likewise created with a few million dollars by pros and careful choice of tooling. So, better architecture can make up for the speed you give up by not going full-custom.
On the opposite end, my link was at least clear on the attributes of a good language. These were specifically mentioned: predictable, consistent, concise, reliable, debuggable. The author gave specific examples showing PHP lacks these traits. An analysis of Python or Ruby shows them embodying these traits much more, while also possessing the supposed advantages PHP fans cite to me, including easy learning, many libraries, a huge community, tools, etc. So, the evidence indicates PHP is a poorly designed language (or not designed at all) while some competitors are well-designed languages with most of the same benefits.
Other authors say about the same on both the philosophy and the specific details, showing why PHP is a pain to work with if you want robust software and the skills a good developer should have.
https://www.quora.com/Why-is-PHP-hated-by-so-many-developers
https://blog.codinghorror.com/php-sucks-but-it-doesnt-matter...
Truth be told, though, the burden of proof is on you PHP supporters to show why PHP is a good language and people should use it. I claim it was a mere pre-processor that had all kinds of programming-language features bolted onto it over time to let it handle certain situations. That's not design at all. Python and Ruby are designed languages with consistency, core functionality for many situations, extensions/libraries, and optionally the ability to pre-process web pages. There's a world of difference in both language attributes and the quality of what people produce with them. So, not only have you presented no evidence of PHP's alleged good design, I've presented evidence against it and evidence that two competitors have better designs.
Feel free to back up your claims with some actual data rather than dismissing whatever data anyone else brings up. If you want to dismiss the guy's ranting, feel free; you can even edit all that crap out to leave just the data and arguments. Same for the other links. The resulting text still supports our claims against PHP. So, the status quo among professionals should be "PHP is garbage": it leads to buggy, hard-to-maintain, insecure, slow software. That will remain until PHP's community proves otherwise and demonstrates their counter-claim in practice with apps possessing the opposite of those negative traits.
That was covered here: " I have a method for No 5 but can't publish it. Suffice it to say that almost all strategies involve obfuscation and shellgames where publishing it gives enemies an edge."
There are black-box processes trusted and checked in my best scheme, though, with security ranging from probabilistic to strong-with-some-risks. Mainstream research [1] has a few components of mine; they're getting closer. DARPA is funding research right now into trying to solve the problem without trust in masks or fabs. We're not there yet. Further, the circuits are too small to see with a microscope, the equipment is too expensive, things like optical proximity correction algorithms are too secret, properties of fabs vary too much, and there's too little demand to bring the cost down so that just anyone can do it openly. Plus, even the tooling itself is black boxes within black boxes, out of sheer necessity due to its esoteric nature, constant innovation, competition, and patents on key tech.
Note: Seeing chip teardowns at 500nm-1um did make me come up with one method. I noted they could take pictures of circuits with a microscope. So, I figured circuit creators could create, distribute, and sign a reference image for what that should look like. The user could decap and photo some subset of their chips. They could use some kind of software to compare the two. If enough did this, a chip modification would be unlikely except as a denial-of-service attack. Alas, you stop being able to use visual methods around 250nm and it only gets harder [2] from there.
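If anyone wants to experiment with that idea, here's a minimal sketch of just the compare step, assuming the reference image and your decap photo are already registered, at the same scale, and identically sized (the genuinely hard part). Pillow and NumPy are assumed; the file names and threshold are hypothetical:

    # Sketch of comparing a decap photo against a signed reference die image.
    # Assumes both images are pre-aligned and have identical dimensions.
    import numpy as np
    from PIL import Image

    def die_photo_diff(reference_path, sample_path, threshold=0.05):
        ref = np.asarray(Image.open(reference_path).convert("L"), dtype=float)
        smp = np.asarray(Image.open(sample_path).convert("L"), dtype=float)
        # Normalize so different microscopes/exposures compare fairly.
        ref = (ref - ref.mean()) / (ref.std() + 1e-9)
        smp = (smp - smp.mean()) / (smp.std() + 1e-9)
        changed = (np.abs(ref - smp) > 3.0).mean()  # pixels off by >3 sigma
        return changed < threshold, changed

    ok, frac = die_photo_diff("reference_die.png", "my_decap.png")
    print("matches reference" if ok
          else f"suspicious: {frac:.1%} of pixels differ")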
Very relevant is this statement by a hardware guru that inspired my methods, which embrace and secure black boxes instead of going for white boxes:
"To understand what is possible with a modern fab you'll need to understand cutting edge Lithography, Advanced directional etching , Organic Chemistry and Physics that's not even nearly mature enough to be printed in any text book. These skills are all combined to repeatedly create structures at 1/10th the wavelength of the light being used. Go back just 10 or 15 years and you'll find any number of real experts (with appropriate Phd qualifications) that were willing to publicly tell you just how impossible the task of creating 20nm structures was, yet here we are!
Not sure why you believe that owning the fab will suddenly give you these extremely rear technical skills. If you dont have the skills, and I mean really have the skills (really be someone that knows the subject and is capable of leading edge innovation) then you must accept everything that your technologists tell you, even when they're intentionally lying. I cant see why this is any better then simply trusting someone else to properly run their fab and not intentionally subvert the chip creation process.
In the end it all comes down to human and organizational trust, "
Very well said. It's still an argument for securing the machines they use, or the transportation of designs/masks/chips. The critical processes, though, will boil down to you believing someone who claims expertise and claims to have your interests at heart. I'm not sure I've ever seen someone fully understand an electron microscope down to every wire, and I assure you the stuff in any fabrication process, from masks to packaged ICs, is much more complex. Hence my framework for looking at it.
"how to physically secure the shipment to the fab company, etc. I'm asking how Consumer, who gets one IC, can verify that the IC matches exactly with the published VHDL code and contains no backdoors."
Now, for your other question, you'd have to arrange that with the fabs or mask makers. It would probably cost extra; I'm not sure, as I don't use the trusted foundry model [yet]. My interim solution is a combination of tricks that don't strictly require that but are mostly obfuscation. You'd need guards you can trust who can do good OPSEC, and it can never leave your sight at customs. You still have to trust the mask maker, fab, and packager. That's the big unknown, though, ain't it? The good news is that most of them have a profit incentive to crank out product fast at the lowest cost while minimizing any risks that hurt business. If they aren't attacking you or cooperating with attackers, it's probably for that reason.
"how to physically secure the shipment to the fab company, etc. I'm asking how Consumer, who gets one IC, can verify that the IC matches exactly with the published VHDL code and contains no backdoors."
That's semi-true. Re-read my model: the same one can protect the consumer with minor tweaks, because it maps to the whole lifecycle of ASIC design and production. One thing people can do is periodically have a company like ChipWorks tear the chip down to compare it to the published functionality. For patents and security, people will do that already if it's a successful product. So, like the Orange Book taught me long ago, I'm actually securing the overall process plus what I can of its deliverables. So long as the process stays in check, it naturally avoids all kinds of subversions and flaws. High-assurance design and evaluation by independent parties with skill do the rest.
"The bit about multiple independent implementations with voting (NASA-style) sounds extremely expensive and inefficient, but also very interesting for high-security systems. Are you aware of any projects implementing it for a general-purpose computer, specifically to prevent hardware backdooring (as opposed to for reliability)?"
It's not extremely expensive: many embedded systems do it. It just takes extra hardware, an interconnect, and maybe one chip (COTS or custom) for the voting logic. These can all be embedded. Those of us doing it for security all did it custom on a per-project basis: no reference implementation that I know of. There are plenty of reference implementations of the basic scheme under phrases like triple modular redundancy, lockstep, voting-based protocols, and recovery-oriented computing. Look those up.
You can do the voting or error detection as real-time I/O steps, transactions, whatever. You can use whole systems, embedded boards, microcontrollers, FPGAs, and so on. The smaller and cheaper stuff has less functionality, with lower odds of subversion or weaknesses. It helps to use ISAs and interfaces with a ton of suppliers for the diversity and obfuscation part. If you're targeted, don't order with your name, address, or general location. A few examples of fault-tolerant architectures follow, with a sketch of the voting logic after the links; you're just modifying them to do security checks and preserve invariants instead of mere safety checks, although safety tricks often help given the overlap.
App-layer, real-time embedded http://www.montenegros.de/sergio/public/SIES08v5.pdf
Onboard an ASIC in VHDL http://www.ijaet.org/media/Design-and-analysis-of-fault-tole...
FPGA scheme http://crc.stanford.edu/crc_papers/yuthesis.pdf
A survey of "intrusion-tolerant architectures" which give insight http://jcse.kiise.org/files/V7N4-04.pdf
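For the flavor of the voting logic itself, here's a minimal 2-of-3 majority sketch. Real TMR systems vote in hardware, per transaction, with synchronized units; this only shows the decision rule over results from three diverse, independently sourced chips:

    # Minimal 2-of-3 majority voter over results from three diverse units.
    from collections import Counter

    def vote(results):
        """Return the majority of 3 results, plus dissenting unit indexes."""
        winner, count = Counter(results).most_common(1)[0]
        if count < 2:
            raise RuntimeError("no quorum: all three units disagree")
        return winner, [i for i, r in enumerate(results) if r != winner]

    # e.g., the same computation run on three differently sourced boards:
    value, dissenters = vote([0xDEADBEEF, 0xDEADBEEF, 0xFEEDFACE])
    print(hex(value), "dissenting units:", dissenters)  # flag unit 2 for audit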
"To clarify, as wording is important in these kinds of discussions: When something is described as 'trusted', that's a negative to me, as a 'trusted' component by definition can break the security of the system."
Oops. I resist grammar nazis but appreciate people catching wording that really affects understanding. That example is a mistake I intentionally try to avoid in most writing. I meant "trustworthy" and "trusted" combined. You can't avoid trusted people or processes in these things. The real goal should be to minimize the amount of trust necessary while increasing assurance in what you trust. Same as for system design.
"Me knowing that the UPS shipment containing the mask had an armed guard does not make me more likely to want to trust the chip."
Sorry to tell you that it's not going to get better for you outside of making the sacrifices of above-style schemes, which are only probabilistic and have significant unknowns in the probabilities. Tool makers, fabs, and packagers must be semi-trusted in all schemes I can think of: your design must be turned into circuitry at some point. The best mix is putting detection, voting, or something critical on an older node or custom wiring, which you can vet by eye if necessary. You can still do a lot with 350nm. Many high-assurance engineers use older hardware with hand-designed software between modern systems due to subversion risk. I have a survey [3] of that stuff, too. :)
Note: My hardware guru did have a suggestion I keep reconsidering. He said the most advanced nodes are so difficult [4] to use that they barely function at all. Plus, mods of an unknown design at the mask or wiring level are unlikely to work except in the most simplistic cases. I mean, they spend millions verifying circuits they understand, so arbitrary modifications to black boxes should be difficult. His advice, though expensive, was to use the most cutting-edge node in existence while protecting the transfer of the design and the chips themselves, the idea being that subversion of the ASIC itself would fail or not even be attempted due to difficulty. I like it more the more I think about it.
[1] https://www.cs.virginia.edu/~evans/talks/dssg.pptx
[2] https://www.iacr.org/archive/ches2009/57470361/57470361.pdf
[3] https://www.schneier.com/blog/archives/2013/09/surreptitious...
[4] http://electronicdesign.com/digital-ics/understanding-28-nm-...