Google Neural Network Can Predict Health Status From Your Retina

This site may earn affiliate commissions from the links on this page. Terms of use.

Machine learning can be used to recognize faces, drive cars, and even spot exoplanets, but now Google is teaching its computers to do something even more unexpected. Researchers at Google have developed a way to predict a person’s blood pressure, age, and smoking status from an image of their retina, according to Scientific American. This data may even be enough to determine when someone is at high risk of a heart attack.

Google’s research used a convolutional neural network, the same biologically inspired system used to identify objects in photos. However, these networks can do plenty of other things if you just train them on different data sets. Convolutional neural networks slide small learned filters across the entire image, loosely mirroring how the visual system responds to local patterns, and combine those responses into a picture of the whole. These networks are getting very good at recognizing the content of an image.
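
To make the idea of a convolutional filter concrete, here is a minimal sketch in plain Python. It is not Google's model: the tiny image, the `edge_kernel` values, and the `convolve2d` helper are all illustrative assumptions, and a real network learns many such filters from data rather than having one written by hand.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products.

    Strictly speaking this is cross-correlation, which is what most deep
    learning libraries compute under the name "convolution".
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 3x4 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-written filter that responds where brightness jumps from left to right.
edge_kernel = [[-1, 1]]

feature_map = convolve2d(image, edge_kernel)
print(feature_map)
```

The filter produces a strong response only in the column where the dark region meets the bright one. Stacking many learned filters, layer upon layer, is what lets a network move from edges to shapes to whole objects.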

Google’s research arm had the idea to apply neural network design to biological problems, but it didn’t start with the retina. In a past study, Google created a tool called DeepVariant that could scan a DNA sequence to find small mutations that would be missed by other methods. Outside of Google, researchers from the Allen Institute for Cell Science in Seattle are using convolutional neural networks to automatically identify cellular organelles in 3D images from microscopes. The components are colored by the computer, which eliminates the need to stain cells.

Deep neural networks have at least one hidden layer, and often hundreds. That makes them expensive to emulate on traditional hardware.

To develop its retina-scanning neural network, Google needed a lot of data. It used retinal images from 284,335 patients to set up the network. Later, it validated the network’s deep learning abilities using two different data sets of 12,026 and 999 patients. This was an important step as it showed that Google’s model could accurately predict health metrics. Just from retinal images, the model can determine age within about 3 years, gender (97 percent accuracy), smoking status (71 percent accuracy), blood pressure (within 11.23mmHg), and how likely it is that someone will have an “adverse cardiac event.” The model was able to predict that last one with 70 percent accuracy. It’s not a sure thing, but that’s pretty accurate when you consider it’s just looking at blood vessels in the eye.

The study is still just in preprint right now and has not been peer reviewed. Other researchers will need to go over the models and validate the results before we’ll know the impact, but it could be a boon to medicine. Even if it’s not 100 percent accurate, a retina scan is a simple, noninvasive procedure that could provide more data to doctors.

Published at Fri, 05 Jan 2018 15:45:49 +0000

Early Data Shows Linux Update to Fix Intel Security Flaw Hits Performance Hard

Intel CPUs are now known to contain a serious flaw that can compromise system security. It can’t be fixed by microcode or UEFI update, and the solution (a significant set of patches applied to Windows, macOS, and Linux systems) is expected to carry a significant performance penalty in at least some benchmarks. This story is still evolving, but Phoronix has put some benchmarks together, as have other sources. Linux, unlike macOS or Windows, has already been publicly patched (Windows patches are available via Windows Insider).

Treat all early data as preliminary, take with a grain of salt, etc, etc. Phoronix’s tests, which deliberately mix some different system configurations and models with faster and slower SSDs, show sharply reduced synthetic throughput results when the new kernel page table isolation patch is applied. A synthetic compiler benchmark also showed reduced throughput, as below (purple = pre-patch, green = post-patch):


Graph and data by Phoronix.

More worrisome are the database tests, which definitely show a decline. Early data again suggest anywhere from a 7-20 percent hit may be normal; isolated results showing larger declines seem to be confined to synthetic tests, at least so far.


Graph and data by Phoronix.

That’s a 14 percent performance hit on Coffee Lake, and a nearly 20 percent performance whack on Broadwell-E. Redis performance (not pictured) was down about 7 percent on both systems. Other early benchmarks mostly show that the impact on user space applications (most consumer apps) is minimal. There may be a very small performance hit on the order of 2-5 percent in some games, but this is not an absolute.

Intel’s Comments

Intel has released a statement on the issue. It reads, in part:

Intel and other technology companies have been made aware of new security research describing software analysis methods that, when used for malicious purposes, have the potential to improperly gather sensitive data from computing devices that are operating as designed. Intel believes these exploits do not have the potential to corrupt, modify or delete data.

Recent reports that these exploits are caused by a “bug” or a “flaw” and are unique to Intel products are incorrect. Based on the analysis to date, many types of computing devices — with many different vendors’ processors and operating systems — are susceptible to these exploits.

Intel is committed to product and customer security and is working closely with many other technology companies, including AMD, ARM Holdings and several operating system vendors, to develop an industry-wide approach to resolve this issue promptly and constructively.

This is true — ARM also appears to be affected — but AMD, as of this writing, is not. Benchmarking a patched OS on an AMD system will produce a performance hit if the page table isolation capability is enabled in Linux, but AMD maintains it does not need this fix in the first place.

We reject Intel’s argument that “recent reports that these exploits are caused by a ‘bug’ or a ‘flaw’… are incorrect.” It may be true that securing chips from this kind of attack wasn’t a concern before, but the fact that Apple, Microsoft, and Google are all believed to be working on patches for a variety of products indicates they believe this flaw represents a serious security risk. It may not be unique to Intel, but it’s absolutely a problem. And you can bet AMD will be quite interested to see which applications and scenarios take a perf hit with the fix in place. Epyc, AMD’s nascent server lineup, might pick up a few customer wins off this problem if the issue is widespread.

Published at Wed, 03 Jan 2018 22:02:17 +0000

Massive Intel CPU Bug Leaves Kernel Vulnerable, Slows Performance: Report

Intel’s CPU security took some whacks a few months ago, with well-publicized problems with the Intel Management Engine. If rumors are to be believed, 2018 could kick off an even worse year for the company. There’s growing speculation that there’s a major bug in Intel CPUs that requires a wholesale change in how Linux, Windows, and macOS map page tables, with the apparent goal of preventing Intel x86 CPUs from disclosing the layout of the kernel address space to an attacker. A similar patch is in the works for ARM systems as well; AMD CPUs are (as of this writing) not affected by this issue.

Here’s what we know so far: An initial article at LWN.net lays out a new set of patches for the Linux kernel that began in late October and have continued through the present day. These efforts focus on implementing kernel page-table isolation, or KPTI, which splits page tables (currently shared between kernel space and user space) into two sets of data, one for each side. Microsoft is apparently prepping its own fix and is expected to launch it in the not-too-distant future.
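
The page-table split can be pictured with a toy model. This is a conceptual sketch only, not kernel code: the page names and helper functions are made up for illustration, and a real implementation still keeps a small kernel entry stub mapped in user mode, which is omitted here.

```python
# Hypothetical page names; sets stand in for page-table mappings.
KERNEL_PAGES = {"kernel_text", "kernel_data"}
USER_PAGES = {"app_code", "app_heap"}


def shared_tables():
    """Before isolation: one set of mappings is active in both modes."""
    table = KERNEL_PAGES | USER_PAGES
    return {"user_mode": table, "kernel_mode": table}


def isolated_tables():
    """With isolation: user mode runs on a set with the kernel pages removed."""
    return {
        "user_mode": set(USER_PAGES),
        "kernel_mode": KERNEL_PAGES | USER_PAGES,
    }


def kernel_pages_visible_from_user(tables):
    """List the kernel pages that are still mapped while user code runs."""
    return sorted(tables["user_mode"] & KERNEL_PAGES)


print(kernel_pages_visible_from_user(shared_tables()))    # kernel pages mapped
print(kernel_pages_visible_from_user(isolated_tables()))  # nothing left to probe
```

In this picture, the cost of the fix comes from having to swap between the two sets every time a program enters or leaves the kernel, which is why workloads that make many system calls are the ones expected to slow down.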

We don’t know how attackers exploit the hardware bug in Intel and apparently ARM CPUs yet. All we know is that it’s apparently possible to discern the contents of protected kernel memory by leveraging this exploit. There may be some conceptual similarities to Rowhammer, the DDR memory attack technique that we’ve discussed before, in how this attack is carried out. Rowhammer can be used to change the data stored in certain memory locations by “hammering” adjacent rows of DRAM until the electrical charge in the target cells flips.


Image by PythonSweetness

The blog Python Sweetness has published a fairly good discussion of what we know and don’t know about this security issue, though the author of the post also links to an erroneous report suggesting that AMD CPUs take a 50 percent performance hit when the software fix is enabled (AMD CPUs, as of this writing, are not expected to need patching). The solution to the problem is to enable a capability known as page table isolation (PTI), but this apparently causes significant performance degradation in some Intel CPUs running some workloads. PostgreSQL tests suggest slowdowns of 7 percent to 23 percent, depending on which Intel CPU you test.

Recent Intel CPUs may not be affected by this issue to the same extent as older chips, but I haven’t been able to confirm that personally. There are references to using the “nopcid” kernel boot option to disable PCID (process-context identifiers), a feature Intel built into its Core microarchitecture that can mitigate the performance hit from separating the kernel and user memory space, but no clear demarcation on when those mitigating features were themselves introduced. The related INVPCID instruction was added alongside AVX2 support when Haswell was new, which would seem to imply that Intel CPUs pre-Haswell might face larger penalties than chips post-Haswell.

Right now, the list of what we don’t know is longer than what we do. There are implications for cloud vendors and developers across the entire spectrum where ARM and x86 are deployed, but until we know more about the security flaw and at-risk systems, we’d counsel against any quick conclusions. Hat-tip to Hot Hardware, where we first saw the story.

Published at Wed, 03 Jan 2018 12:29:21 +0000

New Details on Intel’s Goldmont Plus, the CPU Architecture Inside Gemini Lake

Intel’s PAO (Process-Architecture-Optimization) process hasn’t been as easy to follow as the old tick-tock system. Where tick-tock produced a steady cadence of die shrinks and new architectures, PAO is a bit more flexible. Intel’s new Goldmont Plus architecture is the basis for its recent Gemini Lake platform launch. Gemini Lake devices are now marketed under the “Pentium Silver” brand, and Intel is finally sharing a bit more information on how the new chips improve on Apollo Lake and its significant uplift compared to Silvermont.


The new chip is supposed to improve performance by up to 15% compared to previous Apollo Lake processors. Intel has now released optimization guides for the new chip and they detail what changes have been made from Apollo Lake (Goldmont) to Gemini Lake (Goldmont Plus).

The new cores have a wider back-end pipeline and can retire up to four instructions per cycle, up from three. Fetch and decode are still limited to three instructions per cycle, however. Branch prediction performance has been improved, and there’s a much larger shared L2 pre-decode cache (64KB, up from 16KB). There’s also now a dedicated port for the JEU (Jump Execution Unit) that didn’t previously exist. AES instruction latency and throughput have both improved, and the cores gain larger load/store buffers, improved store-to-load forwarding latency, and an increased L2 cache (from 512KB per core to 1MB per core). The SoC CPU cluster is now organized into quads instead of pairs with more L2 allocated on a per-core basis. Goldmont-derived SoCs used 2-4 cores, typically with 2MB of L2 cache total. The new Goldmont Plus SoCs have four cores and 4MB of L2 cache per core cluster. If Intel follows its previous trend and releases a dual-core variant, it’ll probably use a 4MB L2 cluster as well.


The performance uplift from these new processors isn’t likely to be huge, and the performance gap between Intel’s entry-level and ultra-low-power “big core” chips is going to remain significant. At the same time, however, we’ve seen notable evolution since Bay Trail debuted back in 2013. Goldmont was a huge jump over the Silvermont architecture and if Goldmont Plus delivers as well, it should make low-cost, entry-level ultralight systems that much more attractive.

Goldmont Plus is a quick way for Intel to boost performance of its own entry level hardware, though it’s still not clear how much we’ll see ARM and Intel slugging it out in this segment. Microsoft is putting a push behind devices with Snapdragon 835 processors and x86 emulation, but we don’t know if Intel and ARM hardware will compete in the same TDP brackets or not.

Published at Thu, 28 Dec 2017 19:21:14 +0000

New PS4 Exploit Opens the Door to Jailbreaking

Sony is about to have a new headache to ring in the new year. A team of developers made good on a promise to drop a new exploit for the PlayStation 4, and it’s a doozy. Specter and Team Fail0verflow have revealed a flaw in kernel v4.05 for the PS4, which allows for the running of arbitrary code. This opens up the PS4 to homebrew software as well as easier game piracy.

Game consoles are some of the most notoriously locked-down devices in our homes thanks to a combination of custom hardware and heavily modified software. Companies take a dim view of attempts to hack their game consoles, even going so far as to launch legal action against those who would seek to experiment with “jailbreaks” for a console. In fact, Sony took famed developer George Hotz to court over his PS3 jailbreak in 2011. That case ended with Hotz promising not to hack Sony hardware anymore, as well as plenty of bad press for Sony.

Early in the PS4’s life cycle, Team Fail0verflow managed to get Linux up and running on the hardware, but the latest development is potentially more powerful. Specter and Team Fail0verflow teased the “namedobj” PS4 exploit several weeks ago, and now it’s available on GitHub. Perhaps as a way to deflect Sony’s legal team, the developers have not included the necessary tools to run homebrew software or jailbreak the device. However, as a kernel exploit, it allows modders to run any arbitrary code on the machine by listening for a payload via port 9020.

Even without the jailbreaking mechanisms in this release, it’s only a matter of time before someone develops one that can be executed with the help of namedobj. It’s not only modders and pirates who will be digging into the open source code. Sony too will be taking a close look at namedobj in order to patch the system. You can’t really blame Sony — in addition to being a jailbreak, namedobj is a huge security hole. Many of the tools enthusiasts rely upon to modify their devices also compromise security. A kernel exploit that runs arbitrary code could be used to hack someone’s console without their consent and steal data.

If you’re not interested in jailbreaking your console, Sony will probably patch the hole sooner rather than later. If you do want to jailbreak, you’re going to need to find a way to block future system updates.

Published at Thu, 28 Dec 2017 14:00:00 +0000

Seagate’s New Multi-Actuator Could Double Hard Drive Speeds

The humble hard drive doesn’t get much respect these days. Once SSDs started hitting the consumer market, it quickly became clear there was no way hard drive performance would compete in the long term. Hard drive capacities have continued to grow, thanks to new recording technologies and the use of helium inside the drives themselves, but performance improvements have been minimal. Seagate plans to change that in the near future, with new technology that could double hard drive performance.

There are several ways to improve hard drive performance. Fifteen years ago, Western Digital launched various 7200 RPM drives with a larger cache (8MB compared with 2MB on an 80GB drive). All HDDs today use memory caches, even if the ratio of cache size to disk size has fallen sharply.

The next way to increase HDD performance is to increase the rate at which the drive spins. A drive spinning at 15,000 RPM will obviously outperform a drive spinning at 7200 RPM, assuming identical firmware and workloads. But spinning a drive faster comes with its own set of problems: Such drives are obviously louder, it’s harder to build components that can withstand the higher spin rate, and 10K or 15K drives draw more power than their 7200 RPM cousins.

The third way, and the method Seagate has chosen, is to design a drive with multiple actuators. The idea of using multiple actuators isn’t new, and Seagate has experimented with it in the past, but had deemed the approach ineffective due to design challenges, higher drive weights, and additional material costs.

Standard hard drives have one actuator arm, as shown below:


Image by Christaan Colen

Seagate’s largest drive has 8 platters and 16 heads, but the heads are all mounted to a single actuator structure. Data tracks on the platters are too small to allow all of the heads to align simultaneously, sharply limiting read/write throughput, particularly in random workloads. Seagate’s new design doubles the number of actuators and halves the number of heads per actuator. Instead of one actuator with 16 heads, there are now two actuators with eight heads each, each capable of operating independently from the other. The drive can run two different read or write operations at once, provided each is handled by a dedicated actuator. It can also perform two commands in parallel and write from one head while reading from the other.

Seagate writes:

In its first generation, Seagate’s Multi Actuator technology will equip hard drives with dual actuators (two actuators). With two actuators operating on a single pivot point, each actuator will control half of the drive’s arms. Half the drive’s recording heads will operate together as a unit, while the other half will operate independently as a separate unit. This enables a hard drive to double its performance while maintaining the same capacity as that of a single actuator drive.
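
A toy model helps show where the doubling comes from and when it does not apply. This is a rough sketch under simplifying assumptions, not Seagate's firmware logic: every request is assumed to take one time unit, heads are numbered 0 to 15, and the `total_service_time` helper is hypothetical.

```python
def total_service_time(request_heads, heads_per_actuator, time_per_request=1):
    """Time to finish all requests when each actuator drains its own queue.

    Actuators work in parallel, so the drive is done when the busiest
    actuator is done.
    """
    busy = {}
    for head in request_heads:
        actuator = head // heads_per_actuator  # which actuator owns this head
        busy[actuator] = busy.get(actuator, 0) + time_per_request
    return max(busy.values(), default=0)


# Eight requests spread evenly across heads 0-15.
balanced = [0, 9, 3, 12, 5, 15, 1, 8]
single = total_service_time(balanced, heads_per_actuator=16)  # one actuator
dual = total_service_time(balanced, heads_per_actuator=8)     # two actuators

# Four requests that all land on heads owned by the same actuator.
skewed = [0, 1, 2, 3]
dual_skewed = total_service_time(skewed, heads_per_actuator=8)

print(single, dual, dual_skewed)
```

With an even spread, the dual-actuator drive finishes in half the time. When every request hits heads on the same actuator, the second actuator sits idle and there is no gain, which is why the parallelism only helps when each operation is handled by a dedicated actuator.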

Why Bother Boosting Performance?

It might seem silly to worry about boosting hard drive performance when the gap between HDDs and SSDs is so large, but Seagate isn’t claiming this new technology will put SSDs and HDDs at parity. The reason overall drive performance needs to increase is that rising drive capacities are useless if you can’t write data to them in a reasonable amount of time.

Consider this: Even if you could maintain a constant 250MB/s write speed to a 20TB hard drive, it would take you roughly 22 hours to fill the drive under perfect conditions. Since conditions aren’t perfect and write speeds slow down as you reach the inner tracks of the platter, it would actually take longer. Enterprise organizations that would otherwise be interested in the high capacity drives Seagate brings to market might understandably balk at deploying RAID arrays that take 1-2 days to recover from a drive failure.
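
The fill-time arithmetic is easy to check. The sketch below assumes a perfectly constant transfer rate, and the doubled rate for a dual-actuator drive is a best-case assumption rather than a measured figure. With decimal units the result is a little over 22 hours; reading 20TB as binary terabytes pushes it past 24.

```python
def fill_time_hours(capacity_bytes, rate_bytes_per_s):
    """Hours needed to write `capacity_bytes` at a constant rate."""
    return capacity_bytes / rate_bytes_per_s / 3600.0


TB = 10**12   # decimal terabyte, as drive makers count it
TIB = 2**40   # binary terabyte
MB = 10**6

single_hours = fill_time_hours(20 * TB, 250 * MB)
binary_hours = fill_time_hours(20 * TIB, 250 * MB)
dual_hours = fill_time_hours(20 * TB, 2 * 250 * MB)  # assumes ideal doubling

print(f"single actuator: {single_hours:.1f} h")
print(f"single actuator, binary TB: {binary_hours:.1f} h")
print(f"dual actuator (ideal): {dual_hours:.1f} h")
```

Even in the ideal case, a dual-actuator drive of this size still needs around 11 hours for a full write, so the technology narrows the rebuild-time problem rather than eliminating it.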

According to Seagate, the drives will offer multiple access streams that users can tap for various types of workloads or tasks. Long-term, Seagate wants to expand up to four simultaneous actuators, potentially doubling transfer rates again.

Published at Thu, 21 Dec 2017 14:27:57 +0000

Major New Windows Insider Build Introduces Timeline, Sets

It’s only been a few months since the Fall Creators Update rolled out, but a new Windows Insider build (17063) with a number of new features just hit the proverbial streets.

First up is Timeline. This new feature makes Task View (Win-Tab) more effective by grouping previous applications or programs you were working with and offering them to you as a project, rather than simply displaying all open applications. Here’s the old Task View:


The Task View view.

And here’s the new Task View with Timeline enabled.


The new Timeline view can be organized by day or even by the hour if you need that level of granularity. If you know you were reading a story about cats, but can’t find the appropriate application, you can even search for the webpage you had open.


Timeline is grouped into Activities, which Microsoft defines as “the combination of a specific app and a specific piece of content you were working on at a specific time. Each activity links right back to a webpage, document, article, playlist, or task, saving you time when you want to resume that activity later.”

You can control which user accounts are tracked via Timeline and disable all activity collection from the Activity History menu.


It’s good to see Microsoft paying more attention to user privacy concerns from the start, rather than retroactively patching in capabilities it mystically forgot to implement.

Introducing Sets

The ability to switch between windows in Windows hasn’t really changed in decades. Task View (Win-Tab) may beef up search and activity capability, but switching back and forth between the various applications you’re using for a project can be a pain in and of itself. To combat this, Microsoft is introducing a new feature, currently called Sets. Here’s Microsoft again:

The concept behind Sets is to make sure that everything related to your task: relevant webpages, research documents, necessary files and applications, is connected and available to you in one click. Office (starting with Mail & Calendar and OneNote), Windows, and Edge become more integrated to create a seamless experience.

This is an early test feature, so Microsoft didn’t have much more to say about it. It’s not clear if this is a Windows Store-only feature or not; Microsoft only mentions Office, Windows, and Edge as functioning in this fashion.

Other new features include various Cortana integrations, more applications that support Microsoft’s new design language called Fluent, Edge improvements (Ogg Vorbis and Theora are now supported), various gesture improvements, new text scaling options, display setting changes, and a number of UI updates. It looks like Redstone 4 will push a fair number of productivity changes, and some of them look fairly useful. We don’t have a launch date yet, but a late March or early April date seems likely.

Published at Wed, 20 Dec 2017 21:28:56 +0000

Samsung 49-inch Monitor First to be DisplayHDR-Certified

Samsung’s new 49-inch monitor is a sight to behold for several reasons. It’s the approximate size and dimensions of a subway car, with a 49-inch diagonal and a 32:9 aspect ratio. It’s enough to make the ultrawide displays of a few years ago (a paltry 21:9) feel a bit small.

The $999 Samsung CHG90 (full model number: LC49HG90DMNXZA) supports AMD’s FreeSync 2, HDR, and the new VESA DisplayHDR standard. That last one is a major step forward; the number in a VESA DisplayHDR tier refers to the peak brightness (in nits) of the panel. The CHG90 qualifies for DisplayHDR 600; Samsung clarifies that this means the panel is meant to display HDR content “in bright indoor lighting conditions.”

One Monitor To (Almost) Rule Them All

Over the past few years, we’ve seen high-end panel specifications split somewhat depending on what kind of features you wanted. High-end, professional-grade displays with hardware-calibrated color and IPS panels with 4K support are available, but not necessarily great for gaming. Great gaming panels that support FreeSync and/or G-Sync are available, but don’t always have ideal image quality. 4K TVs with HDR support have been available for a while, but they’ve lacked FreeSync/G-Sync support. OLED televisions and a few monitors are available, with very fast refresh times, but they don’t always support HDR or Adaptive Sync (the generic VESA name for the capabilities baked into what AMD and Nvidia call FreeSync and G-Sync, respectively).

Some of these variations, like the split between panel color quality and low ‘ghosting’ in games have been common for years, but the multi-way split between OLED, LCDs, HDR, 4K, and ultrawide (21:9) and what I’m jokingly calling “subway car” (an actual NYC Metro subway car is 51 feet long and 8.6 feet wide, as opposed to 32:9, but it’s close enough for fun) is newer. We’re starting to see displays hitting market that combine these various capabilities without asking users to pick whether they want one or the other, and while the CHG90 doesn’t offer a superset of 4K resolution, it ticks most of the other boxes.


The CHG90 isn’t exactly cheap, but it packs quite a bit of performance into its four-figures-with-tax price tag. Just be aware the emphasis on width does come at the expense of some overall resolution. It’s become common for 3440×1440 displays to ship in 27-inch and 34-inch form factors, but this 49-inch panel has a resolution of just 3840×1080. Any content you watch on it in 4K will have broad black bars on both sides; modern movies and TV aren’t offered in a 32:9 aspect ratio.
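
How broad those bars are follows from the panel and content dimensions. The sketch below assumes the display scales content to fit while preserving its aspect ratio; the `pillarbox` helper is illustrative and not part of any Samsung software.

```python
def pillarbox(panel_w, panel_h, content_w, content_h):
    """Fit content inside the panel without distortion.

    Returns (scaled_w, scaled_h, bar_left_right, bar_top_bottom) in pixels,
    where each bar value is the width of ONE bar.
    """
    scale = min(panel_w / content_w, panel_h / content_h)
    scaled_w = round(content_w * scale)
    scaled_h = round(content_h * scale)
    return (
        scaled_w,
        scaled_h,
        (panel_w - scaled_w) // 2,
        (panel_h - scaled_h) // 2,
    )


# 4K (3840x2160) video on the 3840x1080 panel.
result = pillarbox(3840, 1080, 3840, 2160)
print(result)
```

A 16:9 video ends up as a 1920×1080 picture in the middle of the screen with a 960-pixel bar on each side, so half of the panel's width goes unused.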

Published at Wed, 20 Dec 2017 12:30:56 +0000

AMD’s Next-Generation Navi GPU Could Ship by Late 2018

AMD’s Navi has been of interest to AMD fans since it first popped up on roadmaps, with hints of a next-generation memory subsystem and a “scalability” option that might be similar to the modular GPU designs that Nvidia is supposedly considering for its own products. First, the hints. As Hot Hardware reports, recently discovered notes from a Linux driver update back in July read:

[WARNING]: Should use --pci when using create_asic_from_script()
new_chip.gfx10.mmSUPER_SECRET => 0x12345670

GFX10 is a Navi reference, and there are plenty of other hints to ongoing Navi work at AMD, from a job opening for a senior ASIC design and layout engineer (Shanghai, China) to various statements from AMD that it’s working on 7nm ramps already (the remarks date to May of this year). A Navi tape-out now or in the next few months would clear the way for a professional product introduction late next summer or fall, with consumer cards arriving a few months later.


A hypothetical modular GPU

As for what the GPU will be, that’s anyone’s guess. AMD and Nvidia have both made noise about scalability and building distributed GPUs, but such designs come with a lot of potential issues that need to be addressed. GPU internal bandwidth is much, much higher than what we’ve seen in multi-core CPU interconnects (think along the lines of 300GB/s, as opposed to 30GB/s). Building an interconnect that could keep all the subdivided GPU components suitably fed would be a very careful balancing act. I’m not suggesting that NV, AMD, or both won’t build it, but it may not be an easy road to delivery.

Frankly, I’m not sure now is the right time for AMD to be clever as far as GPU design is concerned. HBM and HBM2 may have delivered some benefits to AMD’s overall power consumption profile, but the company has had to push all of its GPU designs extremely hard to match Nvidia’s performance going back as far as Hawaii in 2013. Granted, Vega doesn’t hit the 95C temperatures that Hawaii did, but AMD didn’t really deliver the performance or power consumption that people were hoping for in 2017, either. If Navi cleans up cruft in the Vega design and delivers a large performance uplift thanks to further design refinements, so much the better, even if it isn’t a brand-new architecture or major design shift compared with Vega. Either way, repeated rumors are pointing to 2H 2018 for a refresh from Team Red.

Published at Tue, 19 Dec 2017 16:15:57 +0000

Intel’s new Stratix 10 MX FPGA Taps HBM2 For Massive Memory Bandwidth

Intel announced its new Stratix 10 MX FPGA today, marking the first time an FPGA has been available with HBM2 memory onboard. The Stratix 10 MX has up to 10x more memory bandwidth than competing solutions that rely on DDR4 (512GB/s of aggregate bandwidth in two HBM2 stacks). Like the now-confirmed AMD / Intel team-up to build Vega graphics into certain Intel CPUs, Intel is using its Embedded Multi-Die Interconnect Bridge (EMIB) to connect various components of the system.

All Stratix 10 MX FPGAs use HBM2, but they offer varying amounts of memory on-package, from 3.25GB (MX 1100) to 16GB (MX 1650, MX 2100). Available SRAM also varies (45-90Mbit), as do the number of logic elements, I/O pins, and PCIe 3.0 x16 IP blocks. The point of this particular FPGA family, unsurprisingly, is to offer far more memory bandwidth than you’d typically see on an FPGA, with a lower physical footprint and less energy consumption.

According to Intel, this kind of shift is critical to deploying FPGAs in certain spaces. Implementing large memory pools on an FPGA with DDR4 is limited by the number of I/O pins and memory channels you can plausibly fit on a card. HBM2 short-circuits this problem by packing a huge amount of bandwidth into a much smaller form factor. Those of you who have followed the memory standard’s evolution may recall that AMD justified adopting it for the Fury X family because it reduced memory subsystem power consumption dramatically (and energy efficiency tests later bore out that the Fury Nano was the smallest, most-efficient GPU AMD shipped for quite a long time).
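
The pin-count argument can be made concrete with a back-of-the-envelope calculation. The figures below are assumptions for illustration: a 1024-bit HBM2 stack running at 2Gbps per pin, and a 64-bit DDR4-2400 channel as the comparison point. Intel's own comparison may use a different DDR4 configuration.

```python
import math


def peak_bandwidth_gbs(bus_width_bits, gbps_per_pin):
    """Peak bandwidth in GB/s for a memory interface of the given width."""
    return bus_width_bits * gbps_per_pin / 8


# Assumed interface parameters (see the note above).
hbm2_stack = peak_bandwidth_gbs(1024, 2.0)   # one HBM2 stack
hbm2_total = 2 * hbm2_stack                  # two stacks on the package
ddr4_channel = peak_bandwidth_gbs(64, 2.4)   # one DDR4-2400 channel

# How many DDR4 channels would be needed to match the two HBM2 stacks.
channels_needed = math.ceil(hbm2_total / ddr4_channel)

print(hbm2_stack, hbm2_total, round(ddr4_channel, 1), channels_needed)
```

Under these assumptions, two stacks reach the 512GB/s aggregate figure, while matching that with DDR4 would take dozens of channels, each needing its own set of I/O pins and board traces. That is the scaling problem on-package memory avoids.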


The more memory bandwidth you need, the more lopsided and in-favor of HBM2 the comparison becomes. At 400GB/s of bandwidth, Intel projects that it can reduce platform size by 24x, with power consumption savings of 50 percent at 128GB/s of memory bandwidth and even more at higher capabilities.

According to Intel, adding the HBM2 buffer to FPGA designs is critical for enabling FPGAs to continue scaling into HPC and other data center designs. To date, HBM2 has been locked up almost exclusively in very high end products. Only AMD’s Vega has tried to bring HBM2 to mainstream graphics cards, and the high price on those GPUs strains the definition of ‘mainstream.’ We may eventually see the memory technology come to lower-end, cheaper cards, or it may be that HBM2 will ultimately be supplanted by GDDR6.

Published at Mon, 18 Dec 2017 21:02:10 +0000
