30 Comments
dishayu - Wednesday, November 21, 2012
It always seems like a long wait between podcasts to me. It's definitely the high point of my day when one comes out, though. Looks like a good one from the outline. Listening to it now.
noblemo - Wednesday, November 21, 2012
Enjoyed listening to all the topics in this edition of the podcast. Sounds like thermal throttling on the Nexus 4 isn't a deal breaker, especially if you buy into the greater philosophy of the device. Have a great holiday.
IanCutress - Wednesday, November 21, 2012
Since this podcast was recorded, I essentially spent all weekend playing around with C++ AMP. It definitely gets around my 'don't want to program on AMD cards' issue :) I'm currently trying to make a benchmark with it involving grid solvers, but it all seems too CPU limited - raising the CPU speed gives a bigger performance gain on the AMP code than overclocking the GPU does. Kind of odd :/
Ian
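P.S. For anyone wondering what AMP code actually looks like, below is a minimal sketch of a Jacobi-style grid relaxation kernel - a toy stand-in for the kind of grid solver I mean, not my actual benchmark code:

#include <amp.h>
#include <vector>
using namespace concurrency;

// One Jacobi-style relaxation sweep over an n x n grid, run on the GPU.
// 'in' and 'out' are row-major n*n grids; boundary cells are left alone.
void jacobi_sweep(const std::vector<float>& in, std::vector<float>& out, int n)
{
    array_view<const float, 2> src(n, n, in);
    array_view<float, 2> dst(n, n, out);

    parallel_for_each(dst.extent, [=](index<2> idx) restrict(amp)
    {
        int y = idx[0], x = idx[1];
        if (y > 0 && y < n - 1 && x > 0 && x < n - 1)
        {
            dst[idx] = 0.25f * (src(y - 1, x) + src(y + 1, x) +
                                src(y, x - 1) + src(y, x + 1));
        }
    });
    dst.synchronize(); // copy the results back to host memory
}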
Ryan Smith - Wednesday, November 21, 2012
If I remember the MS presentations correctly, GPU entry and exit is rather expensive. If you try to maximize the amount of time you stay on the GPU, you might see better results.
IanCutress - Wednesday, November 21, 2012
I rearranged my code to do just that. Maybe I'm just not spawning enough threads.
maximumGPU - Wednesday, November 21, 2012
I have been testing AMP for a couple of months now, and I must say I prefer it to CUDA. It definitely feels more like C++ than C, and it makes converting existing C++ work easier. Not "slap code on the Phi and be done" easy, but I feel it's a step up from CUDA. From the few benchmarks I ran, it's a tad slower (which could be application dependent), but I'd happily trade that for the nicer language.
Would be nice to hear your thoughts about it.
smike - Wednesday, November 21, 2012
Hi, I love listening to your podcasts and learn a lot every time. Unfortunately, they're very long and I don't always have time to listen to the whole thing. I'd like to suggest that you add the ability to increase the playback rate of the audio for people like me.
I've been manually increasing it via my browser's JavaScript console, since you're using an HTML5 <audio> element (thank god it's not Flash!), but I'm sure more people would find it useful if it were built into the UI. I've found that at 1.5x the speech is still very understandable.
For those of you who would like to do this yourselves, run the following in your javascript console on this page:
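// set the page's first HTML5 <audio> element to play at 1.5x speed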
document.getElementsByTagName('audio')[0].playbackRate = 1.5;
Peanutsrevenge - Wednesday, November 21, 2012
OMG Anand, -100 geek points for you, now go sit in the corner and think about your newbness.
HOW can you not have heard of CCleaner? It's a geek's favorite application (Glary Utilities is similar, but I'm still testing it).
Please don't now hit me with a bunch of software you use all the time, my brain will explode :D
Anand Lal Shimpi - Wednesday, November 21, 2012
Ha! In my defense, I tend to just build more systems instead of dealing with existing ones. It gives me an excuse to use new hardware in real-world scenarios for extended testing, and it often results in me finding issues I wouldn't have otherwise.
Take care,
Anand
Visual - Thursday, November 22, 2012
Now, I can't speak with certainty about this "CCleaner" in particular before I explore it a bit more (and I don't think I want to bother), but I've always regarded the whole class of system and registry cleaners, optimizers and fixers as absolutely idiotic and pointless.
At best, they have a couple of good tweaks, but I'd still prefer to apply those explicitly, after being well informed about how and why they work, instead of just letting a blackbox magical tool do them in bulk for me.
More likely than not, they also have a good number of "tweaks" that are somewhat contextual and probably not universally good things that everyone should apply without considering the specifics of the system, its usage and so on.
Then you also have a heap of similar programs that are a complete no-op, just counting on getting sales through scare tactics and user ignorance.
And finally, there are even malicious programs advertised in the same category as this.
No one with a clue about computers should ever need to install this crap. Much like we did OK for years without an antivirus program (though today's free options mean that's no longer a reasonable choice).
Visual - Thursday, November 22, 2012
OMG, I just had a read of its "features" page, and it is full of so much fail that it makes my head hurt. I retract my "can't speak with certainty" statement.
Boogaloo - Thursday, November 22, 2012
>using a blackbox magical tool
>programs that are a complete no-op
>full of so much fail
Just stop posting.
IanCutress - Friday, November 23, 2012
Normally you would be right - these 'registry cleaners' are often nothing but a steaming pile of garbage. But CCleaner is the diamond in the rough, period. It's free, and there's no nagging to buy a paid version.
Peanutsrevenge - Wednesday, November 21, 2012
I recommend the above as well when looking at freeing up space - a great Windows application for seeing what's taking up your disk, a GUI version of Linux's du.
sideshow23bob - Wednesday, November 21, 2012
Looking forward to listening to this after work. Quickly becoming my favorite podcast. Thanks for recording in between the busyness of October's pre-Xmas product releases.
SpitUK - Wednesday, November 21, 2012
Great as always guys :)
vkn - Wednesday, November 21, 2012
Thanks a lot for these podcasts. Very informative.
Also hoping you guys have some time to discuss the transceiver stuff you have been hinting at for some time now.
crypticsaga - Wednesday, November 21, 2012
This is definitely my all-time favorite podcast. Between the astute insights and the hilarious soapbox dialogue, I always find myself anxiously awaiting the next episode.
Brian, for what it's worth, the attention to detail that you and the other editors demonstrate in your product reviews has made AnandTech my de facto tech news and review site. These days I find that I don't even bother with reviews from other publications, who all seem to echo the same bandwagon impressions. Even if you regularly have to be the one who makes the unpopular observation, some of us truly appreciate the extra effort.
Kevin G - Wednesday, November 21, 2012
When was this recorded? Nothing about taking a razor to the Wii U's CPU? I have a feeling that this was recorded before that little 'incident'.
Two recent tidbits of news were missing from this week's podcast. The first is that MIPS has been bought out - I'm really surprised that didn't come up in the discussion of the TI OMAP. The other would have been the launch of the Itanium 9500. The outlook for that architecture is arguably darker than what surrounds AMD at this point, though Itanium's death won't have that great an impact on Intel as a whole. These two items probably had time to sneak into this podcast.
AMD's situation is as dire as you've indicated. It really boils down to how their expenses compare to their revenue. It was good to hear some discussion along the lines of 'if you were in charge of AMD, what would you do?' in the podcast. That type of analysis is what I've found lacking in previous episodes, so I'm glad to hear it, even if it was all doom and gloom.
Love the bit at the end with BRIAN SMASH. Will need video of him throwing a phone against a wall.
hammer256 - Wednesday, November 21, 2012
OK, this is definitely my favorite episode so far. Very nice segment on the HPC stuff. What's your take on Nvidia's GK104/GK110 strategy, i.e., a small die with limited compute capability for general consumers and a big die for the professional/HPC space? Is this a trend that will continue for the next generations, and will AMD also join in?
Also, I wonder how easy it really is to port OpenMP code to utilize the Phi. A common issue with programming for PCIe accelerator cards, be they Nvidia, AMD, or Intel, is the bottleneck of the PCIe bus itself. That means the programmer has to be aware of the separate memory spaces of the accelerator and the host, and has to arrange memory transfers efficiently to avoid that bottleneck. I know in my CUDA simulation, that is a very large part of the code. To me, that is definitely a barrier to entry for someone used to programming only for the CPU. So I wonder how Intel is going to deal with it - maybe with compiler directives, as in OpenMP, to denote which memory blocks should reside on the card? But it seems that this problem alone is enough to make porting existing OpenMP code to efficiently utilize the Phi a less than trivial process.
Of course, this is just speculation, since I don't have a Phi to try OpenMP with. Maybe in supercomputers this is a non-issue, since the bandwidth bottleneck there is probably internode communication. What are your thoughts?
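(For what it's worth, the directive approach I'm imagining would look something like the sketch below, based on what I've read of Intel's offload pragmas - purely hypothetical on my part, since I have no Phi to verify it on.)

// Hypothetical sketch of Intel's "#pragma offload" extensions for Xeon Phi,
// as I understand them from the compiler docs. The in()/out() clauses tell
// the compiler which buffers to copy over the PCIe bus and in which direction.
void scale(const float* a, float* b, int n, float k)
{
    #pragma offload target(mic) in(a : length(n)) out(b : length(n))
    {
        // This block runs on the Phi; ordinary OpenMP then spreads the
        // loop across the card's ~60 cores / ~240 hardware threads.
        #pragma omp parallel for
        for (int i = 0; i < n; ++i)
            b[i] = k * a[i];
    }
}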
IanCutress - Friday, November 23, 2012
A lot of my CUDA work didn't rely on PCIe speed at all. One copy of the memory buffer from host to device, then a few thousand to a billion loops, each with a few billion threads spawned, then copy back. The total time spent transferring over the PCIe bus was under 0.1% of a long simulation.
I can see where Host->Device->Host copy times could be problematic, but it is all algorithm dependent. If you want to do a matrix conversion once on a bit of data, then yes, the transfer will be a limiting factor. I try to keep my PCIe transfers to a minimum with my matrix solvers - keep the data on the card and only pull back the data you actually need. If it's a science-based simulation, you don't need the results of every time step - take every 10th or 100th loop around.
If PCIe is the bottleneck, then perhaps CUDA/GPGPU isn't the best way to approach your code. Or buy a machine where you can bump up the PCIe bus a bit and still maintain data coherency (if you need ECC).
With Xeon Phi, I imagine it'll be an API call to probe the Xeon Phi devices present, then a separate pragma for calls to the Xeon Phi devices. Hopefully (fingers crossed) it will automatically split work over multiple Phi cards present in the system and do the cross-talking automatically. That won't be the best solution for everyone, but my Brownian motion simulations will love it. I wonder if they will include SLI/CrossFire-type bridges between cards to minimize the PCIe crosstalk.
Ian
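P.S. To put that 'copy once, loop on the device, copy back once' pattern into code, a minimal C++ AMP sketch - the update kernel is a toy stand-in for a real solver, and the same structure applies in CUDA:

#include <amp.h>
#include <vector>
using namespace concurrency;

// One host->device copy, many kernel launches with no PCIe traffic in
// between, then one device->host copy at the end.
void run_simulation(std::vector<float>& state, int steps)
{
    extent<1> ext(static_cast<int>(state.size()));
    array<float, 1> dev(ext, state.begin());   // host -> device, once

    for (int step = 0; step < steps; ++step)
    {
        parallel_for_each(ext, [&dev](index<1> idx) restrict(amp)
        {
            // toy per-element update; a real solver kernel goes here
            dev[idx] = 0.5f * (dev[idx] + dev[idx] * dev[idx]);
        });
    }

    copy(dev, state.begin());                  // device -> host, once
}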
hammer256 - Saturday, November 24, 2012
Yeah, it took me a lot of effort to minimize transfers over the PCIe bus, which I needed at each time step for synchronization across my neural network simulation.
It's not a bottleneck anymore, but for a moment there I was wondering why I didn't just move my last bit of host code to the GPU and not deal with PCIe transfers at all. Then I remembered that I was programming to handle multiple cards, and it was easier to keep that bit of code on the host and handle the synchronization from the multiple GPUs to the host over PCIe.
With multiple cards, it seems that full optimization requires treating the code as running on a multi-node system. That is probably a non-issue for people in the supercomputer space, since they have to deal with multiple nodes anyway. But for scientists who want a supercomputer in a desktop, to run OpenMP code with little modification, it will be a barrier if they want multiple cards. So like you said, hopefully Intel can provide software and hardware solutions to make that easier to handle.
Should be exciting times.
tipoo - Sunday, November 25, 2012
It's kind of funny how AMD and Nvidia switched boats there: the 500 series often has better compute performance than the 600, but the 600 is much more efficient for what most consumers will use it for, namely games. And AMD had a game-oriented architecture with the 6000 series, then added in all the compute stuff with the 7000 series, just as Nvidia relegated all that to the more professionally oriented cards.
Is it a good strategy? I guess so - smaller dies mean lower cost and power consumption, and most users won't miss all the compute stuff. But it may limit some next-generation games that lean on GPGPU calculations; who knows how that will pan out. For current tech, I think it's a good tradeoff, just a bummer for enthusiasts and scientists who may have wanted to run GPGPU calculations on a card that doesn't cost thousands.
IanCutress - Tuesday, November 27, 2012
The only reasons the top cards cost thousands are ECC, often better double-precision rates, and support if things go wrong. You essentially pay more for the hardware and an extra layer of testing before you get the card. If you don't need ECC as an enthusiast, then don't bother - but for commercial results, ECC tends to be the barrier between several years of work and several years on the streets.
The NVIDIA 600 series does have its place - it's ideal when spawning a lot of lightweight threads. The AMD change was more to do with architecture: VLIW4/5 was good, but only great in a few niche examples that took care of ILP. GCN is a more general way of tackling everything GPGPU related, which is why those VLIW4/5-tuned codes do not work as well but everything else tends to work better.
Ian
hammer256 - Wednesday, November 21, 2012
An AnandTech podcast just won't be the same without the Brian Rant.
maximumGPU - Thursday, November 22, 2012
One of my favourite episodes too. The insight on the HPC market was super informative. Looking forward to you guys getting your hands on a Phi!
The thoughts on what went wrong with AMD were very insightful too. I've never owned one of their products, but I know how bad a world without AMD would be for all of us.
Dman23 - Thursday, November 22, 2012
Good podcast! I really feel that these guys know their stuff! Obviously, Anand wouldn't have hired them if they didn't have technical backgrounds in compute... just saying, they're smart. ;)
BTW, with speculation of AMD being bought out or sold off piece by piece, what about Apple buying them up? I think Apple would be MUCH more likely to buy them than Samsung, since they are both American companies with headquarters right next to each other in Silicon Valley (so integrating the two companies would be much easier), and Apple obviously has an interest in acquiring semiconductor companies to leverage for its own products.
Ideally, though, I would rather have ARM somehow buy them up and integrate its energy-efficient designs with some of AMD's high-powered GPU prowess. That could also create a company with the scale and technical talent to match up with Intel. I don't see it happening, though, because ARM probably doesn't have the pile of cash to pull that off, except maybe with an outside investor group providing the financial capital.
Pfffman - Friday, November 23, 2012
Great episode. Always interesting to hear more about different industries and how they interact with each other. Brian's rant is justified; hopefully developers will fix it in future patches. My Padfone fortunately doesn't do that.
Will there be more coverage of the Padfone 2? I was waiting to see AnandTech's take on the first one before I got it, then I just got impatient and bought it.
Several podcasts ago, I think Anand mentioned that he was getting a Thunderbolt-to-PCIe-slot device from OWC and was going to test it out by putting a GPU in it and seeing what would happen. Any updates on that?
I understand that it is already a lot of work to get the podcast together, but I think people would greatly appreciate a link dump of the topics/companies/articles you cover, like Rooster Teeth does with theirs. For example, Brian mentioned a company that would do screen calibrations for Google if Google just approached them; I wanted to look up that company to learn more about it, and this sort of thing comes up regularly when you're just talking about tech.
Thanks again and keep them up :)
iAnders - Friday, December 28, 2012
Just listened to episode 10 - very high-quality discussions. Loved the insight into AMD's bleak situation. Just wanted to say thanks!
Regards
Anders