So why asset compression? Because at the end of the day, games are getting too big for their own good. Even with texture compression, games can be massive. Compressing them not only saves valuable SSD space, but with DirectStorage, it will speed up load times as well because the assets can be sent to the GPU in compressed form, allowing those transfers to complete more quickly.
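The load-time argument above is simple arithmetic, and a rough sketch may make it concrete. All figures below (asset size, compression ratio, link rate) are hypothetical and chosen only to keep the numbers round; the time the GPU spends decompressing is ignored.

```python
def transfer_seconds(size_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time to move a payload over a link at a sustained rate."""
    return size_bytes / bandwidth_bytes_per_s


GIB = 1024 ** 3

# All three figures are hypothetical, chosen only to make the arithmetic easy.
asset_bytes = 8 * GIB      # uncompressed asset size
compression_ratio = 2.0    # uncompressed size / compressed size
link_rate = 4 * GIB        # sustained bytes per second over the link

uncompressed_s = transfer_seconds(asset_bytes, link_rate)
compressed_s = transfer_seconds(asset_bytes / compression_ratio, link_rate)

print(f"uncompressed: {uncompressed_s:.2f} s, compressed: {compressed_s:.2f} s")
```

With a fixed link rate, transfer time scales with the number of bytes sent, so any lossless ratio achieved on disk carries straight through to the transfer.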
Textures with lossy compression are already going to be stored at entropy. Lossless compression will only help models and instructions. Models do compress well and need to be loaded into VRAM, so there are savings to be had.
"Textures with lossy compression are already going to be stored at entropy."
That would generally be true for a more advanced compression format such as JPEG. But that's not the case for fixed ratio texture compression. These formats are very simple so that they can be readily used by texture units - particularly, that they allow random access - and the results cached with respect to page size boundaries. There's really no effort to remove redundancy or otherwise do entropy coding; it's closer to clever tricks to degrade an image and interpolate back a reasonable approximation.
For reference, after grabbing a 32KB colormap that's been DXTC1 compressed and placed in a DDS container, that ZIPs (DEFLATE) down to 12KB.
If you really want to go down the rabbit hole, look up BCPack, which is the lossless compression algorithm that the Xbox Series X uses to store its textures, as well as the Oodle suite of tools, which has been in use for several years now.
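The point about fixed-ratio block formats leaving redundancy behind can be tried with a small sketch. The encoder below is a simplified, BC1-like stand-in and not the real DXT1/BC1 algorithm: it picks endpoints and indices by brightness only, and the gradient image is synthetic. It keeps the one property that matters here, which is that every 4x4 block becomes exactly 8 bytes regardless of content.

```python
import struct
import zlib


def to_rgb565(r, g, b):
    """Pack 8-bit RGB into a 16-bit RGB565 value."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def encode_block(pixels):
    """Encode 16 (r, g, b) pixels into 8 bytes: two endpoints plus 2-bit indices.

    Simplified on purpose: endpoints are the brightest and darkest pixels by
    channel sum, and each index is chosen by brightness, not color distance.
    """
    lum = [r + g + b for r, g, b in pixels]
    lo, hi = min(lum), max(lum)
    c0 = to_rgb565(*pixels[lum.index(hi)])
    c1 = to_rgb565(*pixels[lum.index(lo)])
    span = max(hi - lo, 1)
    indices = 0
    for i, value in enumerate(lum):
        level = (value - lo) * 3 // span  # always in 0..3
        indices |= level << (2 * i)
    return struct.pack("<HHI", c0, c1, indices)


def encode_image(width, height, pixel_at):
    """Encode an image block by block; each block is independent of the others."""
    out = bytearray()
    for by in range(0, height, 4):
        for bx in range(0, width, 4):
            block = [pixel_at(bx + x, by + y) for y in range(4) for x in range(4)]
            out += encode_block(block)
    return bytes(out)


def gradient(x, y):
    """Synthetic smooth image, a hypothetical stand-in for a colormap."""
    return (x % 256, y % 256, (x + y) // 2 % 256)


WIDTH = HEIGHT = 256
encoded = encode_image(WIDTH, HEIGHT, gradient)
deflated = zlib.compress(encoded, 9)

print("raw RGB bytes:      ", WIDTH * HEIGHT * 3)
print("block-encoded bytes:", len(encoded))
print("after DEFLATE bytes:", len(deflated))
```

Because each block is encoded in isolation, neighboring blocks often end up with near-identical endpoints and index patterns, and a general-purpose lossless pass such as DEFLATE can still remove that repetition. How much it removes depends heavily on the image; a smooth synthetic gradient is a favorable case.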
Will CXL benefit game load times similar to direct storage? What other areas/types of programs are bottlenecked by the PCIe interface... with the CPU being the middleman? Are there other aspects of the system that can be accelerated by cutting out the CPU and letting devices interact directly over the PCIe bus?
44 Comments
mczak - Friday, October 14, 2022 - link
PCIe peer-to-peer transfers seem like such a good fit for DirectStorage. But I guess that's for the next version.
twotwotwo - Friday, October 14, 2022 - link
Was thinking the same thing. For their Xbox version there's no copy because of integrated memory, but on PCs it is a thing. I wonder if typical consumer PC hardware actually offloads these transfers efficiently, so that they never have to touch main memory or the CPU and can use most of the two devices' bandwidth.
linuxgeex - Sunday, October 16, 2022 - link
The problem is that on PC, peer transfers across the PCIe bus are not reliable, simply because they have not been a thing. When operating systems support it and drivers support it and devices support it and they all speak the same dialect of the same protocol with the same timings and all support the same handshakes to agree which way they're going to do peer transfers, then it will work. It seems these stars are unlikely to align any time in the near future, but CXL was designed to solve these problems. So what we may see is tunnelled CXL over PCIe as a way to align the industry.
Barca - Sunday, October 16, 2022 - link
CXL is the aligned star
ZeroVelocity - Friday, October 14, 2022 - link
This would be the best, but it would first of all require new SSDs that support certain enterprise features. I expect vendors would be very reluctant to provide these features at low cost in the open general-purpose PC market.
Billy Tallis - Friday, October 14, 2022 - link
I don't recall any special features that the SSD needs to support for p2p DMA. From the drive's perspective, it's just being told to DMA data to/from a different address, and it's the responsibility of the PCIe root complex or switches to route the packets to the appropriate physical device. Most of the difficulty is in the OS, ensuring that everything is configured correctly for this to work.
niwax - Saturday, October 15, 2022 - link
In many/most cases the data on disk will be encrypted via BitLocker, so you can't just DMA to the GPU. Decrypting on the GPU is an option, but why share a sensitive key into more easily exploitable memory when CPUs are plenty fast?
abufrejoval - Saturday, October 15, 2022 - link
Why would you think that? Why would anyone want to use BitLocker on a gaming PC?
It might have some justification on a laptop and/or corporate device, but otherwise why bother?
It really only creates problems, e.g. when you need to swap out the mainboard or want to move storage to another system.
Reflex - Wednesday, October 19, 2022 - link
Most of us don't dedicate a PC just for gaming.
JKflipflop98 - Saturday, October 15, 2022 - link
Uh, no. No one uses BitLocker unless the system admin forces them to.
Threska - Sunday, October 16, 2022 - link
Keep their porn from prying eyes.
GreenReaper - Friday, December 2, 2022 - link
Or they are the system admin and want to keep away prying eyes.
deil - Monday, October 17, 2022 - link
This version is at least usable, so it's a big step forward. Don't ask Microshift to nail it, or they will nail the coffin of this tech.
Theolendras - Monday, October 17, 2022 - link
This would be awesome. Theoretically, CPU involvement in textures should be minimized as much as possible; not much comes to mind that requires it and couldn't be accomplished more efficiently by the GPU.
Threska - Friday, October 14, 2022 - link
Kind of funny how people want to cut the CPU out of the picture while AMD SAM puts it back in.
https://www.digitaltrends.com/computing/what-is-am...
atragorn - Friday, October 14, 2022 - link
Think of it like this. If you want to move the water from a five-gallon bucket but you can only move a single cup at a time, it's going to take a while. If you could use a larger container it would take less time. It depends on the size, but potentially much, much faster. Like drinking a glass of water a teaspoon at a time versus just chugging the glass down. 😂
That's what SAM is meant for.
AMD is assuming lots of CPU power since obviously they sell them... Good for them 😉
Microsoft has an alternative method.
Unlike most people here, most consumers have a weak CPU compared to what we likely have.
Console systems need to maximize their utilization since typically they have a weaker CPU and quite limited resources compared to, say, an enthusiast computer. So if they can move assets with less CPU power, that is good. So it makes perfect sense for them to do this.
Different needs require different solutions. It just so happens to also make sense with these fast NVMe drives being sold now. An HDD would not benefit much, if at all, from this. The bottleneck is the HDD itself.
The new Baldur's Gate game, for example, loads much faster on an NVMe drive and this will probably make it load even faster still. Good stuff 😃
One thing about the article, game devs could simply send a utility to decompress and recompress assets instead of making us download the same assets over again. Not that hard to do that 🙂
Billy Tallis - Friday, October 14, 2022 - link
AMD's SAM is just another name for Resizable BAR, which does not make the CPU any more involved in anything. It's just a matter of dropping a 32-bit compatibility hack so that driver software doesn't have to jump through hoops to access all of a GPU's memory. It only matters when the CPU already needed to be interacting with the GPU's memory.
BinaryTB - Friday, October 14, 2022 - link
Will implementing DirectStorage in PC games require less work if the game is also on Xbox consoles? Meaning, GDeflate and the APIs, are they already "working" in Xbox consoles and don't require porting that portion to PC games (which I assume they do right now when making a PC version of the game).
Silver5urfer - Friday, October 14, 2022 - link
The thing with this DX12 DirectStorage API is, first, on paper it sounds like something great. But in reality it does not matter much, due to multiple reasons which I will explain.
First is storage versus performance: the argument of loading assets faster is only beneficial to the garbage consoles, which were limited by HDDs and finally moved to NVMe. So they already have some kind of hardware API in them on both PS5 and XBSX.
On PC, if we compare titles' loading speed without any of the so-called storage APIs, the loading speed is literally the same. So I sincerely do not expect major things out of this. DX11 vs DX12 was heralded with similar BS, and today the Vulkan-powered RDR2 and DOOM Eternal have significantly better performance than DX12. Same for so many titles where running that API didn't provide anything new.
Also, if we talk about game fidelity, there's not a single game today which does not use the horrible TAA technology and pushes rasterization to the max. The reason I bring it up here is that Crysis 3 launched in 2013 and it's been 10 years. There are barely any games which provide such an extravagant experience in tessellation without the TAA blurfest, because there's no developer today who will make any game exclusive to PC. Even though the new consoles are out, they are barely fast; DMC 5 on PC on a 1080 Ti runs at 180 FPS at Full HD. So if anyone developed for PC the fidelity would blow everything out of the water, but that era is gone because nowadays games have politics and the lowest effort ever. Just look at Arkham Knight and compare that to the modern Gotham Knights; it's insane how much worse the latest game looks, and that's very important because not only have you got a visual downgrade BUT also pisspoor optimization, as the min spec is so laughable for the new Gotham Knights. So I do not expect literally anything from this so-called DirectX Storage from either AAA developers or M$.
Heck, Red Dead Redemption 2 has fantastic photorealism, and on top of that RAGE scripting with the leading Euphoria engine for physics; still, it is hampered by TAA, which massively reduces fidelity. Do not take Cybertrash as an example, because it's garbage and not fit to be called a game, regardless of Nvidia shoving infinite PR BS on it with the idiotic Psycho mode. There are barely any new high-effort AAA titles.
TAA is mentioned here because nobody is pushing for high fidelity anymore, since console is the primary market and target. Witcher 3's massive downgrade was because of? Consoles. So there's nothing but dust in expecting any sort of high-fidelity changes in the AAA gaming space.
Unreal Engine 5 is being heralded as something next-gen, but in reality their MetaHuman was total garbage. RE Engine from Capcom beats it totally in facial animations. The RDR2 and GTA V fidelity showcase is ultra massive; look at the GTA V NaturalVision Evolved mod, which stresses out an RTX 3090. That engine is the ultimate in technology scale. Also, so many of the UE4 ports suffer from horrible stutters. I think Epig recently fixed some of it; gotta see how UE5 games will hold up.
Second, Hardware Unboxed did a review on the impact of NVMe SSDs: in the majority of games there's basically no improvement vs a SATA SSD, and forget PCIe 4.0 vs 3.0 SSDs. So having a few seconds shaved off won't provide any benefit whatsoever, since the CPU is not going to magically improve FPS here nor add any benefit to the equation.
Overrated, glorified tech that MS wants just because they can lock PC users into the Windows 11 disaster, which hampers the Win32 shell and downgrades the entire Explorer, PLUS VBS CPU problems and nightmarish instability with their OS RTM release. Win10 1909 or later, which means LTSC 2019 is outdated and only 21H2 is supported. No thanks, I can stick to the old WDDM stable LTSC since the new LTSC needs more time to fix its bugs.
Silver5urfer - Friday, October 14, 2022 - link
Forgot to mention Metro Exodus: 4A Engine is also on par, with a superb FPS experience, but it is also marred by TAA. id Tech 6 also used no TAA, which is why DOOM 2016 looked solid, and now the TAA-based id Tech 7 DOOM Eternal loses a lot of fidelity to blur.
Still, both of those are the only other noteworthy engines and games that can sit atop. However, Exodus is the only title worth it for RT in the current industry, because of its RTGI. Adding reflections with a horrible performance hit is worthless when you can get SSR done in a much better way; just more stickers and advertising nonsense here in AAA nowadays.
All in all, concluding that developer talent provides more fidelity and real innovation than these sticker PR features, which do not do anything tangible in reality, esp. for PC.
Dizoja86 - Friday, October 14, 2022 - link
I love reading through the comments here of people with interesting thoughts and questions about technology and then suddenly getting dropped into Silver5urfer's ridiculous, scorching hot takes.
Makaveli - Friday, October 14, 2022 - link
lol like someone throwing a bucket of cold water on you.
DigitalFreak - Friday, October 14, 2022 - link
Hot takes? More like shit takes.
Reflex - Wednesday, October 19, 2022 - link
What blows me away is just how many damn years he's been at this. I mean it's gotta be at least 15, maybe 20? And somehow every lengthy screed finds room for basically every unrelated issue he can imagine and ties it all together to rant against something he deems inferior like "consoles!" or whatever.
There are very few people in this world who go so many years without at least growing up a little.
SirDragonClaw - Monday, October 17, 2022 - link
Wow Dunning Kruger really did a number on you 🤣
I have never seen such an amazing example of the Dunning Kruger effect.
andrebrait - Friday, October 14, 2022 - link
Most games won't see improvements on different types of SSDs because they're working their magic to make things work in a way that makes that not visible to the user.
There's a ton of what-the-fuckery going on behind the scenes to work around the fact they can't rely on storage being fast and whatnot. Some of these workarounds are even implemented in-game with cutscenes and small sections where the user has to wait for a given event to happen (think elevator rides and whatnot, for example).
You're seeing things backwards here. Being able to take advantage of those things takes time and you still need to support the old stuff too, so for a while the extent to which you can tap into those new APIs is limited.
Yet, they need to start existing at some point. Without them there, we can't take advantage of them and without us being able to do that, we'll never start designing games with that in mind.
It's a chicken and the egg thing, though slightly different.
Leather Jacket - Saturday, October 15, 2022 - link
The Dunning-Kruger effect in full display
lightningz71 - Friday, October 14, 2022 - link
How is MS addressing asset encryption that has been introduced by anti-piracy technologies for game assets with respect to decompressing these same assets in the GPU? It sounds like any game that uses those technologies will still have to process every bit of data in the processor to decrypt them before decompressing them, which means that relocating that data to the GPU will incur additional overhead that could be avoided by just decompressing on the processor.
Threska - Friday, October 14, 2022 - link
Sounds like they'll have the GPU do it.
brucethemoose - Friday, October 14, 2022 - link
CPUs are pretty fast at decryption, so it should still be better than decrypting and *then* decompressing the assets.
Also, game devs probably don't need to encrypt non-code assets unless they are ridiculously paranoid. Not sure how it works now.
Small Bison - Sunday, October 16, 2022 - link
These are all assets the GPU needs (textures, geometry, etc) so time spent sending it to the GPU isn’t overhead. Given that, it’s probably still worth decompressing on the GPU, even if the CPU would be a little faster at it, just to save time transferring everything over the PCIe bus. (I’m assuming in your hypothetical that there’s some synergy between decrypting and decompressing that makes the latter faster when done simultaneously with the former)
0ldman79 - Friday, October 14, 2022 - link
Still seems like adding it to older games would show gains even if they weren't compressed with it in mind; where it worked properly, the gains should far offset anywhere they didn't.
It's not like they can't write the software to decode virtually any container.
wr3zzz - Saturday, October 15, 2022 - link
When sites say DirectStorage 1.0 is already out do they mean out to developers or already in Windows?
With no DirectStorage games in sight I was hoping DirectStorage can at least improve I/O of small files in the Windows file system. Moving/copying/deleting thousands of files such as decompressed video PNGs, even with NVMe, is still as painful as using an HDD.
Ryan Smith - Saturday, October 15, 2022 - link
"When sites say DirectStorage 1.0 is already out do they mean out to developers or already in Windows?"
Both. That said, you'd have to check the API docs to see if there's even a provision for deleting things. It's primarily designed for loading game assets.
Aside from that, Windows 11 does have some storage stack optimizations that better tune things for NVMe drives. And, of course, deleting files via the CLI is a good deal faster than Windows Explorer.
GeoffreyA - Sunday, October 16, 2022 - link
I think the problem is that, inevitably, the updating of thousands of MFT records takes up time. Perhaps the updates are written out to disk one at a time, in a synchronous fashion with the file being deleted. Checking that a file's handles are closed could be adding to it as well. And, as Ryan pointed out, the GUI itself eats up a big share. Of course, I have got no idea how it actually works, but am speculating.
GeoffreyA - Sunday, October 16, 2022 - link
Incidentally, I deleted a folder now---an FFmpeg compilation of over 200,000 files---and it seems to corroborate this. While deleting, Task Manager showed that it wrote consistently to the SSD, at < 10 MB/sec, along with 30% CPU usage.
James5mith - Sunday, October 16, 2022 - link
Sounds like you need an Optane drive. Too bad it was decided they weren't worth the money.
Zingam - Saturday, October 15, 2022 - link
Who thinks we don't need no CPU no more!
JKJK - Saturday, October 15, 2022 - link
We've heard about this on PC for 3 years now ... and still nothing.
I'll believe it when I see it. However, I really hope it kicks off asap. Because this is much needed.
LuxZg - Sunday, October 16, 2022 - link
I am confused here... I thought most DX games used the DXT variations for compression of textures, and that GPUs have already had that in hardware for the past 10 years or so. So why GDeflate?!? OK, that's 'just' textures, but that's also 90% of bandwidth, isn't it? Seems like so much work when they could've made a system that allowed most existing games to use DirectStorage and GPU decompression for 80-90% of assets just by leveraging the OS, DX APIs and drivers.
Ryan Smith - Monday, October 17, 2022 - link
Asset compression is a layer of lossless compression that can be applied to most game assets; not just textures, but also meshes and other forms of geometry. GDeflate is just a variation on Deflate, so it can be used on any data type with easily-identified redundancies.
This does overlap with texture compression in a bit of a confusing way. Lossy texture compression does significantly reduce the size of a texture, but it's not guaranteed to make the texture as small as possible. Because of the need for random access (so that the texture units can operate on the data), texture compression operates on very small groups of pixels (typically 16 at a time), so there's additional redundancy that can be removed with lossless compression. This is in stark contrast to things like JPEG (which most people are far more familiar with), where the algorithm is far more advanced and doesn't generate the same kind of redundancies.
So why asset compression? Because at the end of the day, games are getting too big for their own good. Even with texture compression, games can be massive. Compressing them not only saves valuable SSD space, but with DirectStorage, it will speed up load times as well because the assets can be sent to the GPU in compressed form, allowing those transfers to complete more quickly.
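To make the random-access point concrete, here is a minimal Python sketch of how a texel's data can be located in a fixed-ratio format. It uses the publicly documented BC1/DXT1 figures (4x4 texels per block, 8 bytes per block). The function names and the simple linear block layout are assumptions for illustration; real GPUs typically use tiled or swizzled layouts.

```python
BLOCK_DIM = 4    # BC1/DXT1 blocks cover 4x4 texels (16 pixels)
BLOCK_BYTES = 8  # every BC1 block occupies exactly 8 bytes


def bc1_block_offset(x: int, y: int, width: int) -> int:
    """Byte offset of the block holding texel (x, y).

    Assumes a plain row-major block layout (hypothetical; hardware
    layouts are usually tiled), which is enough to show the idea.
    """
    blocks_per_row = (width + BLOCK_DIM - 1) // BLOCK_DIM
    block_index = (y // BLOCK_DIM) * blocks_per_row + (x // BLOCK_DIM)
    return block_index * BLOCK_BYTES


def bc1_size(width: int, height: int) -> int:
    """Total size in bytes of one BC1 mip level."""
    blocks_x = (width + BLOCK_DIM - 1) // BLOCK_DIM
    blocks_y = (height + BLOCK_DIM - 1) // BLOCK_DIM
    return blocks_x * blocks_y * BLOCK_BYTES


if __name__ == "__main__":
    print(bc1_block_offset(5, 0, 256))  # second block in the first row
    print(bc1_size(256, 256))           # size of a 256x256 BC1 level
```

Because every block is the same size, the offset is pure arithmetic and no earlier data has to be decoded first. That is also why each block has to be compressed in isolation, which leaves redundancy between blocks untouched.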
willis936 - Monday, October 24, 2022 - link
Textures with lossy compression are already going to be stored at entropy. Lossless compression will only help models and instructions. Models do compress well and need to be loaded into VRAM, so there are savings to be had.
Ryan Smith - Tuesday, October 25, 2022 - link
"Textures with lossy compression are already going to be stored at entropy."
That would generally be true for a more advanced compression format such as JPEG. But that's not the case for fixed ratio texture compression. These formats are very simple so that they can be readily used by texture units - particularly, that they allow random access - and the results cached with respect to page size boundaries. There's really no effort to remove redundancy or otherwise do entropy coding; it's closer to clever tricks to degrade an image and interpolate back a reasonable approximation.
For reference, I grabbed a 32KB colormap that's been DXT1 compressed and placed in a DDS container, and it ZIPs (DEFLATE) down to 12KB.
If you really want to go down the rabbit hole, look up BCPack, which is the lossless compression algorithm that the Xbox Series X uses to store its textures. The same goes for the Oodle suite of tools, which has been in use for several years now.
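The general effect can be reproduced with a rough Python sketch. Everything here is an assumption for illustration: the encoder is a deliberately naive BC1 (DXT1) encoder (bounding-box endpoints, nearest palette entry), the texture is a synthetic repeating gradient rather than a real colormap, and `zlib` stands in for ZIP's DEFLATE. The ratio you get depends entirely on the texture content.

```python
import struct
import zlib


def pack565(r, g, b):
    """Quantize 8-bit RGB to a 16-bit RGB565 endpoint."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def unpack565(c):
    """Expand an RGB565 endpoint back to 8-bit RGB."""
    r, g, b = (c >> 11) & 31, (c >> 5) & 63, c & 31
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))


def encode_block(texels):
    """Naive BC1 encode: 16 row-major (r, g, b) texels -> 8 bytes."""
    hi = tuple(max(t[i] for t in texels) for i in range(3))
    lo = tuple(min(t[i] for t in texels) for i in range(3))
    c0, c1 = pack565(*hi), pack565(*lo)  # c0 >= c1 by construction
    p0, p1 = unpack565(c0), unpack565(c1)
    palette = [
        p0,
        p1,
        tuple((2 * a + b) // 3 for a, b in zip(p0, p1)),
        tuple((a + 2 * b) // 3 for a, b in zip(p0, p1)),
    ]
    bits = 0
    for i, t in enumerate(texels):
        # 2-bit index of the closest palette colour (ties pick the lowest).
        best = min(range(4),
                   key=lambda k: sum((t[j] - palette[k][j]) ** 2 for j in range(3)))
        bits |= best << (2 * i)
    return struct.pack("<HHI", c0, c1, bits)


def synthetic_texel(x, y):
    """Hypothetical test pattern: a gradient that repeats every 64 texels."""
    return ((x * 4) % 256, (y * 4) % 256, 128)


def encode_texture(width, height):
    out = bytearray()
    for by in range(0, height, 4):
        for bx in range(0, width, 4):
            block = [synthetic_texel(bx + px, by + py)
                     for py in range(4) for px in range(4)]
            out += encode_block(block)
    return bytes(out)


if __name__ == "__main__":
    bc1 = encode_texture(256, 256)
    deflated = zlib.compress(bc1, 9)
    print(f"BC1: {len(bc1)} bytes, after DEFLATE: {len(deflated)} bytes")
```

The BC1 output size is fixed by the texture dimensions alone, while the DEFLATE pass shrinks it further by exploiting repeated endpoints and index patterns across blocks, which the block format never looks at.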
Exotica - Monday, October 17, 2022 - link
Will CXL benefit game load times the way DirectStorage does? What other areas/types of programs are bottlenecked by the PCIe interface... with the CPU being the middleman? Are there other aspects of the system that can be accelerated by cutting out the CPU and letting devices interact directly over the PCIe bus?