milkod2001 - Wednesday, July 19, 2017 - link
What would be the price difference between the 8GB of HBM2 AMD is using and the 8GB of GDDR5X NV is using?
extide - Wednesday, July 19, 2017 - link
I would expect the memory itself is actually a pretty similar price -- the catch with HBM, though, is the interposer and the requirement that you have a good GPU die, good HBM stacks, and then a good final assembly of interposer, GPU, and memory -- many more places for things to go wrong, plus the cost of the interposer itself.
ddriver - Wednesday, July 19, 2017 - link
The difference is that nvidia is not using it on "mainstream" grade products, but instead on products with immensely higher margins.
I suspect the reason for the heavy vega delay is someone **cough nvidia** poaching the initial supply and raising the price, forcing amd to wait for better availability and lower prices, since they want to put it in a product that will not enjoy such high margins and, from the looks of it, won't shatter performance records either.
I suspect this production ramp-up is the reason why amd is finally planning a hard release; the FE doesn't really count.
Santoval - Wednesday, July 19, 2017 - link
So you suspect Nvidia bought many more HBM2 chips than they required? How many Tesla GP100 and GV100 boards is Nvidia expecting to sell compared to AMD's consumer and semi-pro Vega cards? Due to additional demand from self-driving car developers, perhaps the ratio is not 1 Tesla (GP/GV combined) for every 10 AMD cards, but there is no way it is going to be more than 1 per 7. That would leave them with a significant stock of unused HBM2 chips.
Unless you implied that Nvidia deliberately bought more HBM2 chips in order to hinder AMD's Vega release. Who knows, that might just be the case. But it would be *very* expensive.
ddriver - Thursday, July 20, 2017 - link
No, nvidia simply offered a better price, since they can afford it, with the added benefit of screwing with amd by depriving them of what they need to launch their long-delayed high-end offering.
I expect that nvidia has sold far more hbm2 products than amd has. And whoever told you that nvidia hbm2 products only go into self-driving cars really took you for a ride.
Nvidia didn't buy more than they needed; they simply poached the initial supply, leaving amd with some scraps for testing and the limited FE release.
Strunf - Friday, July 21, 2017 - link
Yeah right, as if NVIDIA needed to do such a thing... If NVIDIA could even do such a thing, then it just means HBM2 wasn't ready for mass market, and hence it's AMD's own fault for building a product on a technology that isn't readily available.
Besides, the fact that the VEGA FE release is a big mess means AMD has bigger problems than HBM2 availability to sort out...
Strunf - Thursday, July 20, 2017 - link
What initial supply? Producers of HBM2 have been delaying it and reducing its capabilities... the HBM2 problems have been known since even before Vega was announced.
descendency - Wednesday, July 19, 2017 - link
Size of the board.
nathanddrews - Wednesday, July 19, 2017 - link
HBM hasn't been all that impressive to date. It's technically very impressive, but in practice it seems disappointing due to the hardware it's paired with, especially when contrasted with faster GPUs using GDDR5/X.
Are we ever going to see a multi-die GPU unified by HBM? Navi? Volta 2.0?
Manch - Wednesday, July 19, 2017 - link
Dude just say you hate AMD but love NVidia instead of the thinly veiled "analysis" LOL. Nothing wrong with being a fan, but come on.
nathanddrews - Wednesday, July 19, 2017 - link
I'm not sure I could be more objective, actually. GDDR5X easily hits 480GB/sec bandwidth, which matches Vega FE. GDDR6 GPUs are expected to hit around 768GB/sec. HBM is awesome on paper, but so far we've only seen it used twice in consumer/prosumer cards and twice in deepwhatever cards. It seems pretty obvious that GPU processing power is not yet advanced enough to use that much bandwidth effectively. That's why I asked about multi-die GPUs, which could theoretically offer the performance of SLI/CF but be seen as a single GPU.
1. Fury (512GB/sec): the GDDR5/X-equipped 980Ti, 1070, 1080, 1080Ti, and TitanXP/p beat it in games.
2. Vega FE (484GB/sec): the GDDR5X-equipped 1080, 1080Ti, and TitanXP/p beat it in games.
3. Tesla P100 (732GB/sec) and Instinct MI25 (484GB/sec): both irrelevant for games.
I'm sure HBM will be worthwhile eventually, but it's not anything special at the moment.
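For a rough sanity check of those figures: bandwidth is just bus width times per-pin data rate. The bus widths and per-pin rates below are the commonly cited numbers for these cards (not taken from this thread), so treat this as a back-of-the-envelope sketch.

```python
# Memory bandwidth in GB/s = bus width (bits) * per-pin rate (Gbps) / 8.
# Bus widths and per-pin rates are the commonly cited figures for each card.
def bandwidth_gb_s(bus_bits: int, gbps_per_pin: float) -> float:
    return bus_bits * gbps_per_pin / 8

print(bandwidth_gb_s(352, 11.0))   # GTX 1080 Ti, GDDR5X @ 11 Gbps         -> 484 GB/s
print(bandwidth_gb_s(2048, 1.89))  # Vega FE, 2 HBM2 stacks @ 1.89 Gbps    -> ~484 GB/s
print(bandwidth_gb_s(4096, 1.43))  # Tesla P100, 4 HBM2 stacks @ ~1.4 Gbps -> ~732 GB/s
print(bandwidth_gb_s(384, 16.0))   # 384-bit GDDR6 @ 16 Gbps (projected)   -> 768 GB/s
```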
WinterCharm - Wednesday, July 19, 2017 - link
AMD's next-generation GPU, Navi, is supposed to be scalable. When you think about the Infinity Fabric and how Navi will work, it begins to make sense why AMD is going with HBM in Vega. Navi will be built on top of a more efficient version of Vega with multiple small dies and multiple stacks of HBM.
Nvidia has already maxed out what TSMC can do in terms of die size with the Volta V100... the modular GPU, much like Ryzen was the modular CPU, is the next logical progression.
nathanddrews - Wednesday, July 19, 2017 - link
Bingo. I'm excited to see where we go with scalable GPUs. Assuming they don't encounter any significant latency problems like we see with operations spanning CCX/NUMA boundaries, it should be super effective at maintaining generational leaps in performance for several more years.
extide - Wednesday, July 19, 2017 - link
It's funny -- 3dfx was doing both multi-die GPUs and multi-card setups, with NO issues in any game, decades ago. Of course they ran Glide and ancient versions of DirectX, which were much simpler -- but still -- the fact remains. Voodoo 1 was a two-chip GPU, and Voodoo 2 was a three-chip GPU, and of course 3dfx invented SLI (back then the acronym stood for scan-line interleave instead of scalable link interface). Too bad they never really innovated beyond the Voodoo 2 -- the Voodoo 3 was basically a Voodoo 2 on a single chip with a minor clock speed bump, the Voodoo 4/5 ran on the VSA-100 chips, which were basically a Voodoo 3 with again a bit of a clock bump, and the Voodoo 5 ran two of them. I miss my Voodoo 5 -- I wish I still had it -- I traded it to someone for a GeForce 3, which, while faster, definitely doesn't have the same sort of nostalgia.
nathanddrews - Wednesday, July 19, 2017 - link
LOL @ "NO issues in any game decades ago". I'm sorry, but Voodoo SLI only worked with a few games and was not free of issues.Also, it was not a "two chip" GPU in the sense we're talking about here. 3dfx had separate "chips" for each function of the overall GPU (TMU, frame buffer, etc.), not multiple wholly complete GPU dies on the same card.
James S - Wednesday, July 19, 2017 - link
They were multi-die, but the 2nd and 3rd dies were TMUs (texture mapping units) only, so it's a bit apples-to-oranges compared to what Navi will be. I wish they had stuck around; it would be lovely having three major GPU vendors.
CaedenV - Wednesday, July 19, 2017 - link
Um... history looks beautiful in hindsight? My memories of 3DFX GPUs are not quite as rosy. Constant driver and game compatibility issues, largely due to their multi-chip nature. Still, awesome cards... but without issue? Hardly. Things are far, FAR better today.
James S - Wednesday, July 19, 2017 - link
Agree. The move to HBM is forward-thinking: they really don't need the bandwidth for Vega, but since AMD is a small company and is going to build Navi from Vega, it makes total sense.
sonicmerlin - Tuesday, July 25, 2017 - link
Pretty sure AMD is going with HBM because they're desperate to save some power with their power-hungry GPUs.
Manch - Wednesday, July 19, 2017 - link
HBM, HBM1 included, has been very impressive. Even if you look at older cards like the Fury with its 4GB limit, the bandwidth allowed for gains that kept AMD from falling so far behind. The reduction in power alone allowed AMD to overvolt and drive the GPU faster. Granted, it wasn't enough to beat NVidia's cards, but it was impressive nonetheless. The takeaway from HBM is the same or more bandwidth for a lot less power. And that's already been shown: within the limited power envelope of a GPU, the ability to redirect those power savings to clock speed makes the difference between being way behind and reaching 90-95%.
And don't compare VEGA FE yet. It's not a gamer card. Once the Vega consumer GPUs drop, if the FEs are similarly performant, then by all means rip them up.
DanNeely - Wednesday, July 19, 2017 - link
The one number I'd like to see is power consumption (of both the RAM chips and the GPU's memory controllers), because the stated reason for HBM over newer generations of GDDR a few years ago was that power consumed by the IO buses would get out of hand and eat the majority of the TDP budget in the latter case. Since then we've had GDDR5X and GDDR6, which suggests that the GDDR camp has managed to do better than was feared at the time.
Manch - Wednesday, July 19, 2017 - link
From Anand's HBM1 deep dive, it was a good bit of power saving. I would like to know the numbers as well, mainly because I'd like to throttle back a Fury card to see how much of a regression it would be.
DanNeely - Wednesday, July 19, 2017 - link
I'm 99% sure I saw the deep dive a few years ago. I know what was claimed for power savings then without having to reread the old propaganda. I'm interested in what it's actually delivering now, and how the GDDR developers have apparently avoided the power apocalypse that was supposed to ruin them beyond GDDR5.
ImSpartacus - Wednesday, July 19, 2017 - link
Yeah, it's a little disappointing, but it's not entirely clear "who" is at fault.
Recall that Fiji's HBM was overvolted to 1.3V in order to meet performance requirements. That's going to nullify some power savings.
Similarly, HBM2 doesn't appear to want to clock at 2.0 Gbps (GP100's went only to 1.4 Gbps, GV100's to 1.75 Gbps, and the low-volume Vega FE's to a weird 1.89 Gbps).
So some of the shortcomings of HBM, itself, appear to nullify its advantages over equivalent GDDR.
But yeah, it hasn't been paired with the best consumer GPUs on the market, so that hurts as well.
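To put that clock shortfall in perspective, here's a quick per-stack calculation, assuming the standard 1024-bit HBM2 bus per stack; the per-pin rates are the ones quoted above, and 2.0 Gbps is the original spec target, so this is only a rough sketch.

```python
# Per-stack HBM2 bandwidth in GB/s = 1024 bits * per-pin rate (Gbps) / 8.
STACK_BUS_BITS = 1024

for name, gbps in [("HBM2 spec target", 2.0), ("GP100", 1.4),
                   ("GV100", 1.75), ("Vega FE", 1.89)]:
    print(f"{name}: {STACK_BUS_BITS * gbps / 8:.0f} GB/s per stack")
# Spec target is 256 GB/s per stack; GP100 lands at ~179, GV100 ~224, Vega FE ~242.
```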
Yojimbo - Wednesday, July 19, 2017 - link
The memory manufacturers are at fault. They are the ones who promised, and they are the ones who haven't delivered. The fact that NVIDIA put only 16 GB of VRAM on the V100 says a lot. The V100 is a card that is priced at perhaps $13,000 and contains a GPU with a die greater than 800 mm^2. NVIDIA didn't choose 16 GB over 32 GB because of cost concerns. Apparently the memory manufacturers can't run 8-Hi HBM2 stacks at high enough clocks to give the promised bandwidth, so NVIDIA had to choose between more capacity and higher bandwidth. Having that extra 16 GB of memory capacity is actually rather important for HPC and AI uses, just not as important as memory bandwidth.
I don't know what advantage 8-Hi HBM2 has over GDDR5X/6 at this point, other than power usage and PCB area. Those things might be more relevant for networking products that use high-bandwidth RAM than for GPUs.
Santoval - Wednesday, July 19, 2017 - link
Navi's design has most likely been locked, and it should currently be at the prototyping stage. Nvidia recently entertained the possibility of developing an MCM (multi-chip module) GPU, in a similar way to AMD's Threadripper and Epyc, because they say monolithic dies cannot get any bigger (than Volta). It will basically be like a very tightly packed, ultra-low-latency SLI, and they said the drivers will present the distinct dies as a single one to games and GPU compute apps (and to the OS, of course).
It is unknown if they are planning this for the successor of Volta or for the generation after that. 7 or 10 nm (for Volta's successor) should allow them to add transistors and shader cores without enlarging the die further, so my guess is that Volta's successor will still be a monolithic die. AMD's potential plans for MCM GPUs are also unknown at this point.
extide - Wednesday, July 19, 2017 - link
According to GPU-Z screenshots that I have seen, (at least some) Vega FE cards run MICRON HBM2, which is definitely surprising as I didn't even know they were in the HBM2 game at all -- I thought they were betting solely on GDDR5X/6. Of course GPU-Z could be wrong, but it is interesting nonetheless.
HighTech4US - Wednesday, July 19, 2017 - link
I believe GPU-Z does not actually read the memory vendor from the memory itself; it's just a copied-and-pasted text string in the program.
Ryan Smith - Wednesday, July 19, 2017 - link
Thankfully, GPU-Z gets it so wrong in this case that it's clear it is wrong to begin with. Micron doesn't make HBM (they're focused on HMC).
Santoval - Wednesday, July 19, 2017 - link
So I guess consumer Vega will use two 4 GB 4-Hi stack KGSDs and semi-pro Vega will use two 8 GB 8-Hi stack KGSDs? These 8-Hi stack chips must still be too expensive... I wonder how much more an 8 GB 8-Hi chip might cost than two 4 GB 4-Hi ones. I am kind of amazed, though, that even the Tesla GV100 will use 4-Hi stack chips, since this is a part that can certainly swallow the extra cost. Unless Nvidia is worried about the availability of 8-Hi stack chips, and/or the design was completed and locked with twice the number of HBM2 chips (compared to a design with 8 GB chips).
Yojimbo - Wednesday, July 19, 2017 - link
Each stack gives 1024 bits of bus width and, I believe, requires its own memory controller on the GPU die. So having 4 stacks is more expensive and results in higher bandwidth than having 2. In order to get the bandwidth they want, NVIDIA needs to use 4 stacks, i.e., even if they used 8-Hi stacks they'd need 4 of them.
Using 8-Hi stacks instead of 4-Hi stacks would give them more capacity. Why they don't put 32 GB on the cards I really don't know. It's either too expensive or not technically feasible, but I can't understand why they'd be so concerned about the extra expense for a card that is already $13,000. I would think that at least for the Summit and Sierra supercomputers they would make a 32 GB variant if it were technically feasible.
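To put rough numbers on the capacity/bandwidth trade-off, here's a sketch assuming the standard HBM2 figures of a 1024-bit bus per stack and 1 GB per die in a stack; the specific configurations below are illustrative, not confirmed product specs.

```python
# Rough HBM2 capacity and bandwidth for a given stack configuration.
# Assumes 1024 bits of bus and 1 GB per die per stack (standard HBM2 figures).
def hbm2_config(stacks: int, dies_per_stack: int, gbps_per_pin: float):
    capacity_gb = stacks * dies_per_stack              # 1 GB per die
    bandwidth_gb_s = stacks * 1024 * gbps_per_pin / 8  # bus bits * rate / 8
    return capacity_gb, bandwidth_gb_s

print(hbm2_config(2, 4, 1.89))  # 2 stacks, 4-Hi (Vega FE-like) ->  8 GB, ~484 GB/s
print(hbm2_config(4, 4, 1.75))  # 4 stacks, 4-Hi (GV100-like)   -> 16 GB, ~896 GB/s
print(hbm2_config(4, 8, 1.75))  # 4 stacks, 8-Hi (hypothetical) -> 32 GB at the same
                                # bandwidth -- if 8-Hi stacks could hold that clock
```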
WatcherCK - Wednesday, July 19, 2017 - link
Not sure if anyone caught Gamers Nexus and their undervolting of the Vega FE; some interesting results from what I remember. Around a 100mV undervolt (for stability across all test cases) gave a cooler-running, less-throttling card (stock voltage is 1200mV?).
http://www.gamersnexus.net/guides/2990-vega-fronti...
I'm holding off on a bigger GPU upgrade (from a 470) until RX Vega appears in retail and its gaming prowess and, more importantly, its hashing ability are benchmarked... A 580 is impossible to find, and when you do find one it's priced into the 1080 range, based on availability in New Zealand :/ Early mining benchmarks make it look like Vega would need a lot of optimization to be a winner...
https://www.reddit.com/r/EtherMining/comments/6k52...
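As a rough first-order estimate of what a 100mV undervolt buys (dynamic power scales roughly with voltage squared at a fixed clock, ignoring leakage, so this is only a ballpark sketch using the stock and undervolted figures quoted above):

```python
# First-order estimate: dynamic power ~ V^2 at a fixed clock (leakage ignored).
v_stock, v_undervolted = 1.200, 1.100  # volts, per the Gamers Nexus numbers above
savings = 1 - (v_undervolted / v_stock) ** 2
print(f"~{savings:.0%} less dynamic power at the same clock")  # roughly 16%
```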
Yojimbo - Thursday, July 20, 2017 - link
I'm guessing AMD's board partners will solve the thermal problems the air-cooled Vega FE seems to be having.
But undervolting is liable to introduce stability issues, just like overclocking. Its effectiveness will depend on the individual card you get and the individual games you play.