Theo de Raadt [interview] described an active effort by OpenBSD developers to work around "serious bugs in Intel's Core 2 cpu". He went on to explain, "these processors are buggy as hell, and some of these bugs don't just cause development/debugging problems, but will *ASSUREDLY* be exploitable from userland code. As is typical, BIOS vendors will be very late providing workarounds / fixes for these processors bugs. Some bugs are unfixable and cannot be worked around. Intel only provides detailed fixes to BIOS vendors and large operating system groups. Open Source operating systems are largely left in the cold." He provided a link to the full errata (in PDF format) as well as a graphical overview, summarizing:
"Note that some errata like AI65, AI79, AI43, AI39, AI90, AI99 scare the hell out of us. Some of these are things that cannot be fixed in running code, and some are things that every operating system will do until about mid-2008, because that is how the MMU has always been managed on all generations of Intel/AMD/whoeverelse hardware. Now Intel is telling people to manage the MMU's TLB flushes in a new and different way. Yet even if we do so, some of the errata listed are unaffected by doing so.
As I said before, hiding in this list are 20-30 bugs that cannot be worked around by operating systems, and will be potentially exploitable. I would bet a lot of money that at least 2-3 of them are."
From: Theo de Raadt [email blocked]
To: misc
Subject: Intel Core 2
Date: Wed, 27 Jun 2007 11:08:16 -0600

Various developers are busy implementing workarounds for serious bugs in Intel's Core 2 cpu.

These processors are buggy as hell, and some of these bugs don't just cause development/debugging problems, but will *ASSUREDLY* be exploitable from userland code.

As is typical, BIOS vendors will be very late providing workarounds / fixes for these processors bugs. Some bugs are unfixable and cannot be worked around. Intel only provides detailed fixes to BIOS vendors and large operating system groups. Open Source operating systems are largely left in the cold.

Full (current) errata from Intel:

http://download.intel.com/design/processor/specupdt/31327914.pdf

- We bet there are many more errata not yet announced -- every month this file gets larger.
- Intel understates the impact of these errata very significantly. Almost all operating systems will run into these bugs.
- Basically the MMU simply does not operate as specified/implemented in previous generations of x86 hardware. It is not just buggy, but Intel has gone further and defined "new ways to handle page tables" (see page 58).
- Some of these bugs are along the lines of "buffer overflow"; where a write-protect or non-execute bit for a page table entry is ignored. Others are floating point instruction non-coherencies, or memory corruptions -- outside of the range of permitted writing for the process -- running common instruction sequences.
- All of this is just unbelievable to many of us.

An easier summary document for some people to read:

http://www.geek.com/images/geeknews/2006Jan/core_duo_errata__2006_01_21__full.gif

Note that some errata like AI65, AI79, AI43, AI39, AI90, AI99 scare the hell out of us. Some of these are things that cannot be fixed in running code, and some are things that every operating system will do until about mid-2008, because that is how the MMU has always been managed on all generations of Intel/AMD/whoeverelse hardware. Now Intel is telling people to manage the MMU's TLB flushes in a new and different way. Yet even if we do so, some of the errata listed are unaffected by doing so.

As I said before, hiding in this list are 20-30 bugs that cannot be worked around by operating systems, and will be potentially exploitable. I would bet a lot of money that at least 2-3 of them are.

For instance, AI90 is exploitable on some operating systems (but not OpenBSD running default binaries).

At this time, I cannot recommend purchase of any machines based on the Intel Core 2 until these issues are dealt with (which I suspect will take more than a year). Intel must become more transparent.

(While here, I would like to say that AMD is becoming less helpful day by day towards open source operating systems too, perhaps because their serious errata lists are growing rapidly too).
Just return the CPUs
At least in Europe, this counts as a defect under warranty law, which requires the seller to repair or replace the product and otherwise allows the sale contract to be annulled.
Users should pursue this as a way to get Intel to release docs to developers.
return every cpu ever made?
return every cpu ever made? :)
Verdict: overblown
Thread at RWT
Comments from Linus and Andi Kleen! It must be true!
Intel Only?
Is it only Intel CPUs that have these problems and not AMD CPUs? Do similar problems exist on AMD CPUs?
Intel Only?
Do problems like this exist on any AMD CPUs?
No.
No, AMD CPUs have their own set of bugs. See the following link for Athlon 64 / Opteron bugs:
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.pdf
Errata
All CPUs have errata, but not all have bugs like these, which are serious security holes that cannot be worked around in software.
I side w/ Linus on this one
I read through the errata, Theo's comments and Linus' comments. I'll have to side with Linus on this one.
I also agree with Linus' assessment of embedded CPUs vs. commodity CPUs. Your typical Intel or AMD x86 device will be much cleaner in the end (especially relative to their complexity!) than J. Random Embedded CPU, because neither AMD nor Intel has much control over the software stack running on their CPUs, and there are just way, way, way too many units out there.
There might be a ton of DSPs and ARMs out there, but the number of people writing code for them is far smaller, by one to four orders of magnitude depending on the device. I work for a company that makes embedded processors, and can vouch for the fact that we will work around CPU / chip bugs with compiler tweaks or with detailed instructions to programmers, as opposed to spending $BIGNUM on a spin. (I'm not divulging any secrets here. It's all right there in our errata. Some of those words *I* wrote!)
If the bug is big enough, we'll respin. But, it's got to be a pretty big bug. Otherwise, there goes the profitability. And that's a calculation every vendor makes. Your volumes and your revenue go a long way in determining that threshold.
Now, we do tend to fix bugs in the core IP even if we don't respin chips specifically due to those bugs. Later spins can then pick up the bug fixes "for free." That's another common practice.
And the point is...
That you give out detailed information on the bug.
Intel does not. AMD I believe is getting worse. It
is in *everyone's* best interest to have this info.
Ok, *maybe*, but that's a big maybe, not intel's best
interest. At least short term... long term I believe
it also is.
Hmmm.
I'm not sure how much more detail is actually useful. You need more detail to reproduce the bug, but not necessarily more detail to avoid the bug. And if the bug is only triggerable from kernel space, the additional detail isn't necessary to determine whether hostile user space could trigger it.
In my experience with bugs like this, sometimes it's very hard to explain, even among people intimately familiar with the device's architecture, the sequence of events that exposes a bug. Some of these sound like they could take several pages to explain, should you want enough detail to be able to reproduce the bug and know all the detailed ways the bug can manifest.
Let's take for example AI56:
Problem: Updating a page table entry by changing R/W, U/S or P bits without TLB shootdown (as defined by the 4 step procedure in "Propagation of Page Table and Page Directory Entry Changes to Multiple Processors" in volume 3A of the IA-32 Intel Architecture Software Developer's Manual), in conjunction with a complex sequence of internal processor micro-architectural events, may lead to unexpected processor behavior.
Implication: This erratum may lead to livelock, shutdown or other unexpected processor behavior. Intel has not observed this erratum with any commercially available system.
Ok, what details are missing that are relevant to a developer trying to avoid this bug? None, really. They tell you exactly where to go to get the proper procedure for shooting down the TLB so that the bug never occurs.
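To make "follow the procedure" concrete, here is a toy sketch of the four steps as I remember them from the manual: begin barrier, let one processor change the entry, have every processor invalidate, end barrier. It is not kernel code. The `Cpu` class, the dict standing in for a TLB, and the names `touches_ai56_bits` and `shootdown_update` are all made up for illustration; only the P, R/W and U/S bit positions are architectural.

```python
# Toy model, not kernel code: each "CPU" caches page-table entries in a
# dict that stands in for its TLB. All names here are hypothetical.

# Architectural x86 page-table-entry flag bits.
PTE_P = 1 << 0    # Present
PTE_RW = 1 << 1   # Read/Write
PTE_US = 1 << 2   # User/Supervisor
AI56_BITS = PTE_P | PTE_RW | PTE_US


def touches_ai56_bits(old_pte, new_pte):
    """Return True if the change flips a bit named in the AI56 text."""
    return bool((old_pte ^ new_pte) & AI56_BITS)


class Cpu:
    def __init__(self, page_table):
        self.page_table = page_table  # shared by all CPUs
        self.tlb = {}                 # private cache of translations
        self.halted = False

    def translate(self, vpage):
        # Use the cached entry if present, otherwise walk the table.
        if vpage not in self.tlb:
            self.tlb[vpage] = self.page_table[vpage]
        return self.tlb[vpage]

    def invalidate(self, vpage):
        self.tlb.pop(vpage, None)


def shootdown_update(cpus, active, vpage, new_pte):
    """Change one entry using the four-step procedure (simplified)."""
    others = [c for c in cpus if c is not active]
    for c in others:                    # 1. begin barrier: stop the others
        c.halted = True
    active.page_table[vpage] = new_pte  # 2. the active CPU changes the entry
    for c in cpus:                      # 3. every CPU drops its stale copy
        c.invalidate(vpage)
    for c in others:                    # 4. end barrier: resume
        c.halted = False
```

Without step 3, the second CPU in this model keeps returning the old entry from its private dict after the table changes. That stale window is the situation the procedure exists to close; what the real hardware does inside that window is the part the erratum leaves vague.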
But, it's not very detailed about what the bug is. There's not nearly enough detail here on how to reproduce the bug quickly (though there is enough that you could probably goof around in this area and eventually trigger it). There's also no indication of what goes wrong or the full range of misbehavior to expect. (Sure, "livelock" and "shutdown" are mentioned, but what's "other unexpected processor behavior"? 2 + 2 returning something other than 4? Thermal meltdown?)
My point is, does that matter? The severity of the bug is clear: It can crash the system. The scope of the bug is clear: This could happen if the OS doesn't follow the procedure we laid out. The required course of action is clear: Follow the directions. What's left out probably requires a deep explanation of the microarchitectural details as a starting point, likely followed by an equally complex description of the sequence(s) of events that get the machine into an odd state.
The missing part might be interesting to an architect or a designer, but not terribly so to an OS writer, unless they like playing armchair architect. I can see some of these could inspire conference papers at design verification oriented conferences. But in the errata? Do we really need all that? I suspect if that were there, most people would be saying "Just get to the point, ok?"
Oh, and I forgot to mention
There are plenty of bugs we don't disclose. That's true of just about any vendor. Most bugs simply don't matter all that much.
Sometimes it's a simple performance bug. For instance, maybe a certain sequence of instructions should run in 10 cycles if the processor was in spec, but it happens to take 11 due to some bug. That's pretty benign, but it's still a bug. Is it worth an erratum? Probably not.
Intel Core 2 Duo the worst, Intel Core Duo the best.
IMHO, the Intel Core 2 Duo is overflowing with bugs because they complicated the Core Duo design by adding the new, complex x86-64 instruction set, so flaws crept in within a short time.
So, the Intel Core Duo (no x86-64) is currently the best processor after the Intel Pentium-M, and the Intel Core 2 Duo (has x86-64, which half of users never use) the worst.
Flaws? Wait a moment ...
Windows Vista is still using 8086 16-bit code?
Yes, 640 k is ok for everything!!!
It still uses the A20 gate to address more than 1 MB!!!
They are making a fool of me.
Undeadly: Matt Dillon (*BSD) on this matter
http://undeadly.org/cgi?action=article&sid=20070630105416
i hate subjects
http://it.slashdot.org/article.pl?sid=08/07/14/1852203