Topic: DirectX 12

  1. #181 Jorge-Vieira
    Will DirectX 12 and Cloud Gaming Be Possible for VR Games?


    There have undoubtedly been many attempts to introduce cloud gaming in the past. The concept is great, since it could offload a lot of computing work from a user’s machine, but the failed attempts so far have kept it from becoming practical. So how about adding DirectX 12 and VR to the mix?
    Elijah Freeman, Executive Producer at Crytek, told GamingBolt about some interesting ideas for combining all three technologies, and from what I understand, it might just work. Of course, VR requires low latency and a high frame rate to run the latest games, so optimization should be on developers’ minds at all times for any title using the technology.
    DirectX 12 has already demonstrated its strength in optimized draw calls and higher frame rates, so the API should be a must for developers who want plenty of ‘fancy’ effects in their titles while still hitting good frame rates. This is why Crytek believes that combining the cloud, DirectX 12 and VR might just do the trick.
    You can’t actually play a fully fledged VR game streamed straight from the cloud, but Crytek believes that using the cloud to run complex simulation code that doesn’t need to execute on the user’s PC will greatly increase performance and free up local resources. A similar concept was presented when Crackdown 3 showed off large numbers of explosions being simulated without a performance drop. You can view the title’s trailer below.




    So what do you think? Is DirectX 12 and Cloud Gaming the future for complex VR games? Let us know!
    Thank you GamingBolt for providing us with this information
    Source:
    http://www.eteknix.com/will-directx-...ible-vr-games/

  2. #182 Jorge-Vieira
    Crytek: DX12 & Cloud Computing Could Help With VR


    Crytek showcased two different VR demos called Back to Dinosaur Island (Part I at GDC 2015 and Part II at E3 2015, respectively), in order to demonstrate the capabilities of CryEngine when employed in the VR environment.
    That’s not everything, as Crytek is also working on an actual VR game based on the tech demo. Its title is Robinson: The Journey, and here’s the official description:
    It will offer players an unparalleled sense of presence in a game world as they assume the role of a young boy who has crash-landed on a mysterious planet. With freedom to explore their surroundings in 360 degrees of detail, players will become pioneers by interacting with the rich ecosystem around them and unearthing incredible secrets at every turn.
    Given this sudden interest in VR technology, it’s no wonder that GamingBolt contacted Crytek about the potential benefits of technologies like DirectX 12 and cloud computing in this particular environment.
    Here’s what Executive Producer Elijah Freeman had to say on the subject:


    From an engine technology perspective, VR is mostly about being able to run consistently at high frame-rates, which results in some major challenges for rendering optimizations. Everything which helps to make drawcalls more efficient is highly useful, so DX12 with its reduced API overhead and better parallelization capabilities can definitely play a part in helping us create more complex VR worlds than before.
    Minimizing latency is essential for VR, so we will likely not see VR games being fully rendered in the cloud any time soon. However, some complex simulation code for which lower frequency updates are acceptable could well run in the cloud and free up resources on the player’s machine, helping to achieve higher frame-rates.
    Indeed, VR games require very high frame-rates on average in order to be really convincing, and both DirectX 12 and cloud computing could help in that regard.
    We’re now about half a year away from the commercial release of VR devices like the Oculus Rift and Sony’s Morpheus for PS4 (both scheduled for Q1 2016); the HTC Vive could be available even sooner than that, as it’s currently targeting a launch before the end of 2015.
    While we wait, let’s take another look at that Back to Dinosaur Island VR demo by Crytek.






  3. #183 Jorge-Vieira
    Oxide Games Dev Replies On Ashes of the Singularity Controversy

    By now, you’re probably aware of the controversy surrounding Ashes of the Singularity’s Alpha DirectX 12 benchmark.
    While AMD cards got clear performance benefits from running in DX12 mode versus DX11, the same did not turn out to be true for NVIDIA cards, which registered mixed results. This sparked a controversy between Oxide Games (the developer of the game) and NVIDIA itself, which publicly deemed the benchmark an inaccurate indicator of DX12 performance.

    Today, one of the developers at Oxide Games decided to share his view on the matter in a couple of posts on Overclock.net’s board:
    Wow, there are lots of posts here, so I’ll only respond to the last one. The interest in this subject is higher than we thought. The primary evolution of the benchmark is for our own internal testing, so it’s pretty important that it be representative of the gameplay. To keep things clean, I’m not going to make very many comments on the concept of bias and fairness, as it can completely go down a rat hole.
    Certainly I could see how one might see that we are working closer with one hardware vendor than the other, but the numbers don’t really bear that out. Since we’ve started, I think we’ve had about 3 site visits from NVidia, 3 from AMD, and 2 from Intel (and 0 from Microsoft, but they never come visit anyone ;( ). Nvidia was actually a far more active collaborator over the summer than AMD was. If you judged from email traffic and code check-ins, you’d draw the conclusion we were working closer with Nvidia rather than AMD. As you’ve pointed out, there does exist a marketing agreement between Stardock (our publisher) and AMD for Ashes. But this is typical of almost every major PC game I’ve ever worked on (Civ 5 had a marketing agreement with NVidia, for example). Without getting into the specifics, I believe the primary goal of AMD is to promote D3D12 titles, as they have also lined up a few other D3D12 games.
    If you use this metric, however, given Nvidia’s promotions with Unreal (and integration with GameWorks) you’d have to say that every Unreal game is biased, not to mention virtually every game that’s commonly used as a benchmark, since most of them have a promotion agreement with someone. Certainly, one might argue that Unreal being an engine with many titles should give it particular weight, and I wouldn’t disagree. However, Ashes is not the only game being developed with Nitrous. It is also being used in several additional titles right now, the only announced one being the Star Control reboot. (Which I am super excited about! But that’s a completely different topic.)
    Personally, I think one could just as easily make the claim that we were biased toward Nvidia, as the only ‘vendor’-specific code is for Nvidia, where we had to shut down async compute. By vendor specific, I mean a case where we look at the Vendor ID and make changes to our rendering path. Curiously, their driver reported this feature as functional, but attempting to use it was an unmitigated disaster in terms of performance and conformance, so we shut it down on their hardware. As far as I know, Maxwell doesn’t really have Async Compute, so I don’t know why their driver was trying to expose it. The only other thing that is different between them is that Nvidia falls into Tier 2 class binding hardware instead of Tier 3 like AMD, which requires a little bit more CPU overhead in D3D12, but I don’t think it ended up being very significant. This isn’t a vendor-specific path, as it’s responding to capabilities the driver reports.
    From our perspective, one of the surprising things about the results is just how good Nvidia’s DX11 perf is. But that’s a very recent development, with huge CPU perf improvements over the last month. Still, DX12 CPU overhead is far, far better on Nvidia, and we haven’t even tuned it as much as DX11. The other surprise is the min frame times, with the 290X beating out the 980 Ti (as reported on Ars Technica). Unlike DX11, minimum frame times are mostly an application-controlled feature, so I was expecting them to be close to identical. This would appear to be GPU-side variance, rather than software variance. We’ll have to dig into this one.
    I suspect that one thing that is helping AMD on GPU performance is that D3D12 exposes Async Compute, which D3D11 did not. Ashes uses a modest amount of it, which gave us a noticeable perf improvement. It was mostly opportunistic, where we just took a few compute tasks we were already doing and made them asynchronous; Ashes really isn’t a poster child for advanced GCN features.




    Our use of Async Compute, however, pales in comparison to some of the things the console guys are starting to do. Most of those haven’t made their way to the PC yet, but I’ve heard of developers getting 30% more GPU performance by using Async Compute. Too early to tell, of course, but it could end up being pretty disruptive in a year or so as these GCN-built and optimized engines start coming to the PC. I don’t think Unreal titles will show this very much, though, so likely we’ll have to wait and see. Has anyone profiled Ark yet?
    In the end, I think everyone has to give AMD a lot of credit for not objecting to our collaborative effort with Nvidia even though the game had a marketing deal with AMD. They never once complained about it, and it certainly would have been within their rights to do so. (Complain, anyway; we would have still done it.)

    P.S. There is no war of words between us and Nvidia. Nvidia made some incorrect statements, and at this point they will not dispute our position if you ask their PR. That is, they are not disputing anything in our blog. I believe the initial confusion was because Nvidia PR was putting pressure on us to disable certain settings in the benchmark; when we refused, I think they took it a little too personally.
    AFAIK, Maxwell doesn’t support Async Compute, at least not natively. We disabled it at the request of Nvidia, as it was much slower to try to use it than not to.
    Whether or not Async Compute is better is subjective, but it definitely does buy some performance on AMD’s hardware. Whether it is the right architectural decision for Maxwell, or is even relevant to its scheduler, is hard to say.
    Essentially, his explanation falls pretty much in line with the prevalent theory that AMD cards have been built to take advantage of DirectX 12’s features, mainly thanks to Async Compute, which should become more prevalent in the next couple of years.
    Moreover, NVIDIA has relied a lot on driver-specific optimization, but this is less relevant to DX12 than it was to DX11. On the other hand, this theory needs to be corroborated by more hard tests before it can be accepted as fact; unfortunately, ARK: Survival Evolved’s DX12 patch has been delayed, otherwise we would already have more information.
    Still, NVIDIA users need not be concerned for the time being. First and foremost, not all games will be as reliant on draw calls as Ashes of the Singularity (or any RTS); moreover, even the Oxide Games dev admits that games developed with Unreal Engine 4 probably won’t make significant use of Async Compute, and the list of upcoming games made with UE4 grows each day.
    Hopefully, we’ll get a chance to test Microsoft’s DirectX 12 soon in many games from different genres, which should help us draw a clearer picture.



  4. #184 Jorge-Vieira
    DX12 Async Shaders A Big Advantage For AMD Over Nvidia Explains Oxide Games Dev

    Oxide Games have pinpointed asynchronous shaders as one of the main reasons AMD hardware showed significant gains over Nvidia in DX12, specifically in the recently launched DX12 benchmark for the developer’s real-time strategy title Ashes of the Singularity, which is set for release next year. The benchmark, which Oxide Games is adamant is accurately representative of the game’s performance, has been available to download for free since earlier this month. We have already run this test on a variety of graphics cards from both Nvidia and AMD and published our results in an article earlier this month.

    What we, and other publications, found was that AMD GPUs consistently showed significantly greater performance gains than their Nvidia counterparts, and in many instances the AMD cards matched or outperformed more expensive Nvidia offerings. On the Nvidia side the results were inconsistent, to say the least; in some instances we, and other publications, registered a performance loss with Nvidia hardware running the DX12 version of the benchmark compared to DX11.
    DX12 Async Shaders A Big Advantage For AMD Over Nvidia According To Oxide Games Dev


    Since then, an Oxide Games dev has shared a lot of information on why that is, first in a couple of posts on overclock.net which we covered yesterday, and then with more details in an additional comment today.
    In regards to the purpose of Async Compute, there are really two main reasons for it:
    1) It allows jobs to be cycled into the GPU during dormant phases. It can vaguely be thought of as the GPU equivalent of hyper-threading. Like hyper-threading, how important this is really depends on the workload and GPU architecture. In this case, it is used for performance. I can’t divulge too many details, but GCN can cycle in work from an ACE incredibly efficiently. Maxwell’s scheduler has no analog, just as a non-hyper-threaded CPU has no analog to a hyper-threaded one.
    2) It allows jobs to be cycled in completely out of band with the rendering loop. This is potentially the more interesting case, since it can allow gameplay to offload work onto the GPU as the latency of the work is greatly reduced. I’m not sure of the background of Async Compute, but it’s quite possible that it is intended for use on a console as a sort of replacement for the Cell processors on a PS3. In a console environment, you really can use them in a very similar way. This could mean that jobs could even span frames, which is useful for longer, optional computational tasks.
    It didn’t look to me like there was a hardware defect on Maxwell, just some unfortunate, complex interaction with the software scheduling trying to emulate it, which appeared to incur some heavy CPU costs. Since we were trying to use it for #1, not #2, it made little sense to bother. I don’t believe there is any specific requirement that Async Compute be supported for D3D12, but perhaps I misread the spec.
    Previous comments:
    I suspect that one thing that is helping AMD on GPU performance is that D3D12 exposes Async Compute, which D3D11 did not. Ashes uses a modest amount of it, which gave us a noticeable perf improvement. It was mostly opportunistic, where we just took a few compute tasks we were already doing and made them asynchronous; Ashes really isn’t a poster child for advanced GCN features.
    Our use of Async Compute, however, pales in comparison to some of the things the console guys are starting to do. Most of those haven’t made their way to the PC yet, but I’ve heard of developers getting 30% more GPU performance by using Async Compute. Too early to tell, of course, but it could end up being pretty disruptive in a year or so as these GCN-built and optimized engines start coming to the PC. I don’t think Unreal titles will show this very much, though, so likely we’ll have to wait and see. Has anyone profiled Ark yet?
    In the end, I think everyone has to give AMD a lot of credit for not objecting to our collaborative effort with Nvidia even though the game had a marketing deal with AMD. They never once complained about it, and it certainly would have been within their rights to do so. (Complain, anyway; we would have still done it.)

    P.S. There is no war of words between us and Nvidia. Nvidia made some incorrect statements, and at this point they will not dispute our position if you ask their PR. That is, they are not disputing anything in our blog. I believe the initial confusion was because Nvidia PR was putting pressure on us to disable certain settings in the benchmark; when we refused, I think they took it a little too personally.
    AFAIK, Maxwell doesn’t support Async Compute, at least not natively. We disabled it at the request of Nvidia, as it was much slower to try to use it than not to.
    Whether or not Async Compute is better is subjective, but it definitely does buy some performance on AMD’s hardware. Whether it is the right architectural decision for Maxwell, or is even relevant to its scheduler, is hard to say.
    According to Oxide Games, what has seemingly helped propel AMD hardware in the DX12 version of the game benchmark was the company’s Asynchronous Compute feature found in the GCN architecture.
    Asynchronous Shaders, otherwise known as Asynchronous Shading, is one of the more exciting hardware features that DirectX 12, Vulkan and Mantle before them exposed. This feature allows tasks to be submitted and processed by the shader units inside GPUs (what Nvidia calls CUDA cores and AMD dubs stream processors) simultaneously and asynchronously, in a multi-threaded fashion.

    Asynchronous Shaders/Compute : What It Is And Why It Matters

    One would’ve thought that with multiple thousands of shader units inside modern GPUs that proper multi-threading support would have already existed in DX11. In fact one would argue that comprehensive multi-threading is crucial to maximize performance and minimize latency. But the truth is that DX11 only supports basic multi-threading methods that can’t fully take advantage of the thousands of shader units inside modern GPUs. This meant that GPUs could never reach their full potential, until now.

    Multithreaded graphics in DX11 does not allow for multiple tasks to be scheduled simultaneously without adding considerable complexity to the design. This meant that a great number of GPU resources would spend their time idling with no task to process because the command stream simply can’t keep up. This in turn meant that GPUs could never be fully utilized, leaving a deep well of untapped performance and potential that programmers could not reach.

    Other complementary technologies attempted to improve the situation by enabling prioritization of important tasks over others. Graphics pre-emption allowed tasks to be prioritized, but just like multi-threaded graphics in DX11 it did not solve the fundamental problem, as it could not enable multiple tasks to be handled and submitted simultaneously and independently of one another. A crude analogy would be that graphics pre-emption merely adds a traffic light to the road rather than an additional lane.

    Out of this problem a solution was born, one that’s very effective and readily available to programmers with DX12, Vulkan and Mantle. It’s called Asynchronous Shaders, and just as we’ve explained above, it enables a genuine multi-threaded approach to graphics. It allows tasks to be processed simultaneously and independently of one another, so that each of the thousands of shader units inside a modern GPU can be put to as much use as possible to improve performance.

    However, to enable this feature the GPU must be built from the ground up to support it. In AMD’s Graphics Core Next based GPUs it is enabled through the Asynchronous Compute Engines (ACEs) integrated into each GPU. These are structures built directly into the GPU itself, and they serve as the multi-lane highway by which tasks are delivered to the stream processors.






    Each ACE is capable of handling eight queues, and every GCN-based GPU has a minimum of two ACEs. More modern chips such as the R9 285 and R9 290/290X have eight ACEs. ACEs debuted with AMD’s first GCN-based GPU, code-named Tahiti, in late 2011. They were originally added to GPUs mainly to handle compute tasks, because they could not be leveraged by the graphics APIs of the time. Today, however, ACEs take on a more important role in graphics processing in addition to compute.

    Asynchronous Shaders Can Provide A 46% Performance Uplift on AMD Hardware With DX12

    To showcase the performance advantage this feature can bring to the table, AMD demoed it via a LiquidVR sample five months ago. The demo ran at 245 FPS with asynchronous shaders off and post-processing disabled. After post-processing was enabled, performance dropped to 158 FPS. Finally, when asynchronous shaders and post-processing were both enabled, the average frame rate went up to 230 FPS, approximately a 46% performance uplift. While this is likely a best-case improvement, it isn’t too far off the 30% performance boost that Oxide Games mentioned other devs achieving with this feature on consoles.

    This isn’t all just a theoretical exercise either, there are a number of games which have already been released with Asynchronous Shaders implemented. These games include Battlefield 4, Infamous Second Son and The Tomorrow Children on the PS4 and Thief when running under Mantle on the PC. Ashes Of The singularity will obviously be joining that list soon as well. AMD always likes to point out that the consoles and the PC share the same GCN graphics architecture. So whatever is achieved on one platform the company can be taken to the other.

    Naturally, the mentioned demo only showcases the potential performance improvement that can be attained with asynchronous shaders and low-level APIs such as Mantle, Vulkan and DX12. With a well-designed implementation and proper optimization, we may see DX12 games approach that performance uplift figure from async shaders alone, which is a very exciting prospect.







  5. #185 Jorge-Vieira
    AMD: NVIDIA’s Maxwell Is Utterly Incapable Of Performing Async Compute


    The surprising results of Ashes of the Singularity’s DirectX 12 Alpha benchmark are still creating discussion and comments from all sides.
    Yesterday, we reported a couple of posts from an Oxide Games developer who mentioned that AMD’s GCN hardware is better suited to Async Compute than NVIDIA’s, which will prove beneficial under DirectX 12.

    Today, AMD’s Robert Hallock (Technical Marketing Lead) added a comment about this very topic on Reddit.
    Oxide effectively summarized my thoughts on the matter. NVIDIA claims “full support” for DX12, but conveniently ignores that Maxwell is utterly incapable of performing asynchronous compute without heavy reliance on slow context switching.
    GCN has supported async shading since its inception, and it did so because we hoped and expected that gaming would lean into these workloads heavily. Mantle, Vulkan and DX12 all do. The consoles do (with gusto). PC games are chock full of compute-driven effects.
    If memory serves, GCN has higher FLOPS/mm2 than any other architecture, and GCN is once again showing its prowess when utilized with common-sense workloads that are appropriate for the design of the architecture.

    Essentially, in order to perform Async Compute, Maxwell might have to rely significantly on context switching, which is the process of storing and restoring a thread’s state so that its execution can be resumed later; this is what allows a single CPU to run multiple threads, thus enabling multitasking. The problem is that context switching is usually fairly expensive computationally, which is why it’s not yet clear whether current-generation NVIDIA cards can perform async shading in a way that is actually useful for performance.
    We at WCCFTech tried to contact NVIDIA about this comment, and Senior PR Manager Brian Burke only had this to say:
    We’re glad to see DirectX 12 titles showing up. There are many titles with DirectX 12 coming before the end of the year and we are excited to see them
    This statement can be paired with the previous one from two weeks ago, when Burke said that NVIDIA had the utmost confidence in their architecture’s and drivers’ ability to perform under DX12. Clearly, they do not expect the same results in all upcoming titles.
    We were looking forward to test ARK: Survival Evolved under the new API last Friday, but Studio Wildcard delayed the DX12 patch. At any rate, stay tuned and we’ll be sure to report any development on this topic.



  6. #186 Jorge-Vieira
    Nvidia Wanted Oxide Dev’s DX12 Benchmark to Disable Certain DX12 Features? (content updated)

    An interesting read was posted by an Oxide spokesperson: basically, their DX12 benchmark for the Ashes of the Singularity beta runs a notch better on certain AMD graphics cards. It seems Nvidia does not support DX12 asynchronous compute/shaders and actually asked the dev to disable the feature (content updated).
    So the breaking story of the day is obviously the news that Nvidia tried to have certain features of Oxide’s DX12 benchmark, Ashes of the Singularity, disabled, and the story continues. The developer has already posted a reply explaining how and what. Currently it seems that Nvidia’s Maxwell architecture (the 900 series cards) does not really support asynchronous compute in DX12 at a proper hardware level. Meanwhile, AMD is obviously jumping on this as being HUGE, and they quickly prepared a PDF slide presentation with their take on the importance of it all. Normally I’d add the slides to the news item, but this is a 41-page slide deck, hence I made it available as a separate download.
    In short, here’s the thing: everybody expected Nvidia’s Maxwell architecture to have full DX12 support; as it now turns out, that is not the case. AMD offers support for DX12 asynchronous compute shaders on their Fury and Hawaii/Grenada/Tonga (GCN 1.2) architectures. The rather startling news is that Nvidia’s Maxwell architecture, and yes, that would be the entire 900 range, does not support it. I can think of numerous scenarios where asynchronous shaders would help.
    Asynchronous shaders (async compute)

    Normally, with say DirectX 11, multi-threaded graphics is handled in a single queue, which is then scheduled in a set order; that is synchronous. With DX12 / Vulkan / Mantle, tasks in different queues can now be scheduled independently in a prioritized order, which is asynchronous. That brings several advantages, the biggest being lower latency, and thus faster rendered frames and response times, while utilizing the GPU much better.



    While this is thorny stuff to explain in a quick one-page post, AMD tries to do so with a set of slides built around the analogy of a road with traffic lights. In the past, instructions were always handled in the order they arrived: they arrive and get queued up. With DX12 and the GCN GPU architecture, however, instructions can be handled and prioritized separately, meaning more important tasks and data sets can be prioritized.



    With asynchronous shaders, three different queues are now available: the graphics queue (rendering), a compute queue (physics, lighting, post-processing effects) and a copy queue (data transfers). Tasks from any of these queues can be scheduled independently. All graphics cards based on the GCN architecture can handle multiple command streams and data flows simultaneously, managed by the compute engines called ACEs. Each queue can submit instructions without needing to wait on other tasks, which keeps the GPU fully busy as the workflow is prioritized and work is always available.
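    As an illustration of the three queue types just described, here is a minimal D3D12 sketch (assuming a valid `ID3D12Device`); each queue can then be fed its own command lists independently of the others.

```cpp
// Illustration of the three D3D12 queue types. Each queue accepts command
// lists of its own type and can be fed independently of the others.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

struct Queues {
    ComPtr<ID3D12CommandQueue> graphics; // rendering
    ComPtr<ID3D12CommandQueue> compute;  // physics, lighting, post-processing
    ComPtr<ID3D12CommandQueue> copy;     // data transfers
};

Queues CreateQueues(ID3D12Device* device)
{
    Queues q;
    D3D12_COMMAND_QUEUE_DESC desc = {};

    desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;   // graphics (can also do compute/copy)
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&q.graphics));

    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // compute-only queue
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&q.compute));

    desc.Type = D3D12_COMMAND_LIST_TYPE_COPY;     // copy/DMA queue
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&q.copy));

    return q;
}
```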




    Asynchronous shaders delivered a performance increase of up to 46% in a demo built with AMD’s LiquidVR SDK. How big a difference this will make for your ‘normal’ gaming experience remains to be seen, but every little bump in performance is of course welcome.
    So what happened?

    NVIDIA wanted the asynchronous compute shaders feature disabled by the dev (Oxide) for their hardware, as it ran worse, even though the driver exposed it as available.
    Async shader support is a uniform API-level feature of DX12 that allows software to better distribute task-intensive work. The NVIDIA driver reports that "Maxwell" GPUs support the feature; however, when Oxide Games made a benchmark to showcase the power of DX12, they ran into anomalies. When they enabled async shader support, the result on Maxwell-based products was, in their own words, an "unmitigated disaster". After back-and-forth communication with NVIDIA trying to fix things, Oxide learned that the Maxwell architecture doesn't really support async shaders at a Tier 1 level; the NVIDIA driver might report it, but it apparently isn't working at the hardware level. At that stage NVIDIA started pressuring Oxide to remove the parts of its code that use the feature altogether. Oxide claims:
    Oxide: Nvidia GPUs do not support DX12 Asynchronous Compute/Shaders.
    The interest in this subject is higher than we thought. The primary evolution of the benchmark is for our own internal testing, so it's pretty important that it be representative of the gameplay. To keep things clean, I'm not going to make very many comments on the concept of bias and fairness, as it can completely go down a rat hole.

    Certainly I could see how one might see that we are working closer with one hardware vendor than the other, but the numbers don't really bear that out. Since we've started, I think we've had about 3 site visits from NVidia, 3 from AMD, and 2 from Intel (and 0 from Microsoft, but they never come visit anyone ;( ). Nvidia was actually a far more active collaborator over the summer than AMD was. If you judged from email traffic and code check-ins, you'd draw the conclusion we were working closer with Nvidia rather than AMD. As you've pointed out, there does exist a marketing agreement between Stardock (our publisher) and AMD for Ashes. But this is typical of almost every major PC game I've ever worked on (Civ 5 had a marketing agreement with NVidia, for example). Without getting into the specifics, I believe the primary goal of AMD is to promote D3D12 titles, as they have also lined up a few other D3D12 games.

    If you use this metric, however, given Nvidia's promotions with Unreal (and integration with GameWorks) you'd have to say that every Unreal game is biased, not to mention virtually every game that's commonly used as a benchmark, since most of them have a promotion agreement with someone. Certainly, one might argue that Unreal being an engine with many titles should give it particular weight, and I wouldn't disagree. However, Ashes is not the only game being developed with Nitrous. It is also being used in several additional titles right now, the only announced one being the Star Control reboot. (Which I am super excited about! But that's a completely different topic.)

    Personally, I think one could just as easily make the claim that we were biased toward Nvidia, as the only 'vendor'-specific code is for Nvidia, where we had to shut down async compute. By vendor specific, I mean a case where we look at the Vendor ID and make changes to our rendering path. Curiously, their driver reported this feature as functional, but attempting to use it was an unmitigated disaster in terms of performance and conformance, so we shut it down on their hardware. As far as I know, Maxwell doesn't really have Async Compute, so I don't know why their driver was trying to expose it. The only other thing that is different between them is that Nvidia falls into Tier 2 class binding hardware instead of Tier 3 like AMD, which requires a little bit more CPU overhead in D3D12, but I don't think it ended up being very significant. This isn't a vendor-specific path, as it's responding to capabilities the driver reports.

    From our perspective, one of the surprising things about the results is just how good Nvidia's DX11 perf is. But that's a very recent development, with huge CPU perf improvements over the last month. Still, DX12 CPU overhead is far, far better on Nvidia, and we haven't even tuned it as much as DX11. The other surprise is the min frame times, with the 290X beating out the 980 Ti (as reported on Ars Technica). Unlike DX11, minimum frame times are mostly an application-controlled feature, so I was expecting them to be close to identical. This would appear to be GPU-side variance, rather than software variance. We'll have to dig into this one.

    I suspect that one thing that is helping AMD on GPU performance is that D3D12 exposes Async Compute, which D3D11 did not. Ashes uses a modest amount of it, which gave us a noticeable perf improvement. It was mostly opportunistic, where we just took a few compute tasks we were already doing and made them asynchronous; Ashes really isn't a poster child for advanced GCN features.

    Our use of Async Compute, however, pales in comparison to some of the things the console guys are starting to do. Most of those haven't made their way to the PC yet, but I've heard of developers getting 30% more GPU performance by using Async Compute. Too early to tell, of course, but it could end up being pretty disruptive in a year or so as these GCN-built and optimized engines start coming to the PC. I don't think Unreal titles will show this very much, though, so likely we'll have to wait and see. Has anyone profiled Ark yet?

    In the end, I think everyone has to give AMD a lot of credit for not objecting to our collaborative effort with Nvidia even though the game had a marketing deal with AMD. They never once complained about it, and it certainly would have been within their rights to do so. (Complain, anyway; we would have still done it.)

    --
    P.S. There is no war of words between us and Nvidia. Nvidia made some incorrect statements, and at this point they will not dispute our position if you ask their PR. That is, they are not disputing anything in our blog. I believe the initial confusion was because Nvidia PR was putting pressure on us to disable certain settings in the benchmark; when we refused, I think they took it a little too personally.
    AFAIK, Maxwell doesn't support Async Compute, at least not natively. We disabled it at the request of Nvidia, as it was much slower to try to use it than not to. Whether or not Async Compute is better is subjective, but it definitely does buy some performance on AMD's hardware. Whether it is the right architectural decision for Maxwell, or is even relevant to its scheduler, is hard to say.
    Findings like these raise valid questions, such as: what does the NVIDIA driver really report? NVIDIA drivers report support for DirectX 12 feature level 12_1 in Windows, but how real is that? Is it true hardware support or, as Oxide claims, partial support with a trick or two? Meanwhile, this is pure gold for AMD, and they have jumped on it since their GCN 1.2 architecture supports the feature properly. How important it will be in future games remains to be seen, so for now that question is somewhat academic. Regardless, it's good news for AMD, as it really is a pretty big DX12 API feature if you ask me. The updated slide deck from AMD can be downloaded at the link below. Obviously AMD is plastering this DX12 feature all over the web to their advantage, so please take the performance slides with a grain of salt.
    Download AMD slide deck
    Source:
    http://www.guru3d.com/news-story/nvi...-settings.html

  7. #187 Jorge-Vieira
    AMD: There is no such thing as full support for DX12 today

    After yesterday’s turmoil over a DX12 feature Nvidia lacks, AMD’s Robert Hallock shares that the Fury X is also missing a number of DX12 features.
    Hallock replied in a Reddit thread about the DX12 async shader/compute feature missing from NVIDIA’s graphics cards, and then claimed that there is no such thing as “full support” for DX12 on the market today. This was obviously already known, as AMD never claimed full DX12 support on all feature levels for their GCN architecture.
    “I think gamers are learning an important lesson: there’s no such thing as ‘full support’ for DX12 on the market today,” said Hallock, and continued:
    “There have been many attempts to distract people from this truth through campaigns that deliberately conflate feature levels, individual untiered features and the definition of “support.” This has been confusing, and caused so much unnecessary heartache and rumor-mongering.
    Here is the unvarnished truth: Every graphics architecture has unique features, and no one architecture has them all. Some of those unique features are more powerful than others.
    Yes, we’re extremely pleased that people are finally beginning to see the game of chess we’ve been playing with the interrelationship of GCN, Mantle, DX12, Vulkan and LiquidVR.”
    When somebody asked which aspects of DX12 the Fury X is missing, Hallock listed them:
    “Raster Ordered Views and Conservative Raster. Thankfully, the techniques that these enable (like global illumination) can already be done in other ways at high framerates (see: DiRT Showdown).”
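    For readers who want to check this on their own hardware, both capabilities Hallock lists can be queried at runtime rather than inferred from marketing material. A small sketch (assuming an already created `ID3D12Device`; raw enum values are printed for brevity):

```cpp
// Sketch: query Rasterizer Ordered Views, conservative rasterization and
// resource binding capabilities from the D3D12 device.
#include <d3d12.h>
#include <cstdio>

void PrintDx12Options(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS opts = {};
    if (SUCCEEDED(device->CheckFeatureSupport(
            D3D12_FEATURE_D3D12_OPTIONS, &opts, sizeof(opts))))
    {
        // Rasterizer Ordered Views: a simple yes/no capability.
        printf("ROVs supported: %s\n", opts.ROVsSupported ? "yes" : "no");

        // Conservative rasterization is reported as a tier
        // (NOT_SUPPORTED .. TIER_3).
        printf("Conservative rasterization tier: %d\n",
               static_cast<int>(opts.ConservativeRasterizationTier));

        // Resource binding tier (the Tier 2 vs Tier 3 difference the Oxide
        // dev mentioned earlier in this thread).
        printf("Resource binding tier: %d\n",
               static_cast<int>(opts.ResourceBindingTier));
    }
}
```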
    So it is simple: currently no graphics card has full 100% DirectX 12 support, which means some games will favor AMD and others Nvidia.
    I want to add one thing here: yesterday Hallock was all over this, downplaying Nvidia and evangelizing how good AMD’s GPUs are, and now he is taking a step back with these answers on Reddit.
    It’s all marketing mud-slinging between Nvidia and AMD these days. Fun fact: at the AMD GPU tech day for Fury, I myself asked Hallock about the supported DX12 feature levels, and he absolutely refused to give a clear answer at the time, as he knew very well that AMD would not fully support DX12 either.
    Source:
    http://www.guru3d.com/news-story/amd...x12-today.html

  8. #188 Jorge-Vieira
    Meanwhile, Guru3D has posted a new update to the article shared in post #186 of this thread; here are the latest developments in the AMD GCN vs. Maxwell battle over async shaders.

    Update: September 1st
    A user on Beyond3D’s forums made a small DX12 benchmark: some simple code that fills up the graphics and compute queues to judge whether a GPU architecture can execute them asynchronously.
    He generates 128 command queues and 128 command lists to send to the cards, and then executes 1-128 simultaneous command queues sequentially. If running an increasing number of command queues causes a linear increase in time, this indicates the card doesn’t process multiple queues simultaneously (i.e. doesn’t support async shaders).
    He then released an updated version with 2 command queues and 128 command lists, many users submitted their results.
    On the Maxwell architecture, up to 31 simultaneous command lists (the limit of Maxwell in graphics/compute workload) run at nearly the exact same speed - indicating Async Shader capability. Every 32 lists added would cause increasing render times, indicating the scheduler was being overloaded.
    On the GCN architecture, 128 simultaneous command lists ran roughly the same, with very minor increased speeds past 64 command lists (GCN's limit) - indicating Async Shader capability. This shows the strength of AMD's ACE architecture and their scheduler.
    Interestingly enough, the GTX 960 ended up having higher compute capability in this homebrew benchmark than both the R9 390X and the Fury X, but only when it was under 31 simultaneous command lists. The 980 Ti had double the compute performance of either, yet again only below 31 command lists; it performed roughly equal to the Fury X at up to 128 command lists.

    (Chart: render time vs. number of simultaneous command lists; lower is better.)
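    For illustration, a rough C++ sketch of this kind of timing test might look like the following. This is not the Beyond3D author’s code; `lists` is assumed to hold one pre-recorded compute command list per queue, and wall-clock time is used as a crude stand-in for proper GPU timing.

```cpp
// Rough sketch: submit pre-recorded compute command lists to an increasing
// number of compute queues and time how long the GPU takes to drain them.
// Roughly flat times suggest concurrent execution; linearly growing times
// suggest the queues are being serialized.
#include <d3d12.h>
#include <wrl/client.h>
#include <chrono>
#include <vector>
using Microsoft::WRL::ComPtr;

double TimeConcurrentQueues(ID3D12Device* device,
                            const std::vector<ID3D12CommandList*>& lists)
{
    const size_t n = lists.size();
    std::vector<ComPtr<ID3D12CommandQueue>> queues(n);
    std::vector<ComPtr<ID3D12Fence>> fences(n);

    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    for (size_t i = 0; i < n; ++i) {
        device->CreateCommandQueue(&desc, IID_PPV_ARGS(&queues[i]));
        device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fences[i]));
    }

    auto start = std::chrono::steady_clock::now();
    for (size_t i = 0; i < n; ++i) {
        queues[i]->ExecuteCommandLists(1, &lists[i]);
        queues[i]->Signal(fences[i].Get(), 1);           // mark queue i as done
    }
    for (size_t i = 0; i < n; ++i)
        while (fences[i]->GetCompletedValue() < 1) { /* spin until finished */ }
    auto end = std::chrono::steady_clock::now();

    return std::chrono::duration<double, std::milli>(end - start).count();
}
```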
    Furthermore, the new beta of GameWorks VR has real results showing nearly halved render times in SLI, even on the old GTX 680, and 980s are reportedly lag-free now.
    Well that's not proof!

    I'd argue that neither is the first DX12 game, in alpha status, developed by a small studio. However, both are important data points.
    Conclusion / TL;DR

    Maxwell is capable of async compute (and async shaders), and is actually faster when it can stay within its work-order limit (1+31 queues), though it evens out with GCN parts toward 96-128 simultaneous command lists (3-4 work-order loads). Additionally, it shows how differently async shaders can perform on either architecture due to how they’re compiled.
    These preliminary benchmarks are NOT the end-all-be-all of GPU performance in DX12, and are interesting data points in an emerging DX12 landscape.
    Caveat: I’m a third party analyzing other third parties’ analysis. I could be completely wrong in my assessment of others’ assessments.
    Edit - Some additional info

    This program was created by an amateur developer (this is literally his first DX12 program) and there is no consensus in the thread. In fact, a post points out that due to the workload (one large enqueue operation) the GCN benches are actually running "serial" too (which could explain the strange ~40-50 ms overhead on GCN for pure compute). So who knows if v2 of this test is really a good async compute test?
    What it does act as, though, is a fill-rate test of multiple simultaneous kernels being processed by the graphics pipeline. And the 980 Ti has double the effective fill rate of the Fury X with graphics+compute at 1-31 kernel operations.
    Here is an old presentation about CUDA from 2008 that discusses async compute in depth; slide 52 goes more into parallelism: http://www.slideshare.net/angelamm20...rialnondaapr08 And that was ancient Fermi architecture. There are now 32 warps (1+31) in Maxwell. Of particular note is how they mention running multiple kernels simultaneously, which is exactly what this little benchmark tests.
    Take advantage of asynchronous kernel launches by overlapping CPU computations with kernel executions
    Async compute has been a feature of CUDA/Nvidia GPUs since Fermi: https://www.pgroup.com/lit/articles/insider/v2n1a5.htm
    NVIDIA GPUs are programmed as a sequence of kernels. Typically, each kernel completes execution before the next kernel begins, with an implicit barrier synchronization between kernels. Kepler has support for multiple, independent kernels to execute simultaneously, but many kernels are large enough to fill the entire machine. As mentioned, the multiprocessors execute in parallel, asynchronously.
    That's the very definition of async compute.
    Source:
    http://www.guru3d.com/news-story/nvi...-settings.html

  9. #189 Jorge-Vieira
    Hot Topic: Asynchronous Shaders

    To the Max?

    Much of the PC enthusiast internet, including our comments section, has been abuzz with “Asynchronous Shader” discussion. Normally, I would explain what it is and then outline the issues that surround it, but I would like to swap that order this time. Basically, the Ashes of the Singularity benchmark utilizes Asynchronous Shaders in DirectX 12, but they disable it (by Vendor ID) for NVIDIA hardware. They say that this is because, while the driver reports compatibility, “attempting to use it was an unmitigated disaster in terms of performance and conformance”.

    AMD's Robert Hallock claims that NVIDIA GPUs, including Maxwell, cannot support the feature in hardware at all, while all AMD GCN graphics cards do. NVIDIA has yet to respond to our requests for an official statement, although we haven't poked every one of our contacts yet. We will certainly update and/or follow up if we hear from them. For now though, we have no idea whether this is a hardware or software issue. Either way, it seems more than just politics.
    So what is it?
    Simply put, asynchronous shaders allow a graphics driver to cram workloads into portions of the GPU that are idle but not otherwise available. For instance, if a graphics task is hammering the ROPs, the driver would be able to toss an independent physics or post-processing task into the shader units alongside it. Kollock from Oxide Games used the analogy of HyperThreading, which allows two CPU threads to be executed on the same core at the same time, as long as it has the capacity for it.





    Kollock also notes that compute is becoming more important in the graphics pipeline, and it is possible to completely bypass graphics altogether. The fixed-function bits may never go away, but it's possible that at least some engines will completely bypass it -- maybe even their engine, several years down the road.
    I wonder who would pursue something so silly, whether for a product or even just research.
    But, like always, you will not get an infinite amount of performance by reducing your waste. You are always bound by the theoretical limits of your components, and you cannot optimize past that (except for obviously changing the workload itself). The interesting part is: you can measure that. You can absolutely observe how long a GPU is idle, and represent it as a percentage of a time-span (typically a frame).
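    One way to observe this on the GPU side is with D3D12 timestamp queries. A hedged sketch follows, assuming `device`, `queue`, a recording `cmdList` and a small readback buffer (2 x UINT64 on a READBACK heap) already exist, and that the command list has been closed, executed and fully completed before the readback function is called.

```cpp
// Sketch: measure GPU time for a block of work with D3D12 timestamp queries.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

ComPtr<ID3D12QueryHeap> RecordTimestamps(ID3D12Device* device,
                                         ID3D12GraphicsCommandList* cmdList,
                                         ID3D12Resource* readback)
{
    ComPtr<ID3D12QueryHeap> heap;
    D3D12_QUERY_HEAP_DESC desc = {};
    desc.Type = D3D12_QUERY_HEAP_TYPE_TIMESTAMP;
    desc.Count = 2;
    device->CreateQueryHeap(&desc, IID_PPV_ARGS(&heap));

    // Bracket the workload with two timestamps and copy them to the readback buffer.
    cmdList->EndQuery(heap.Get(), D3D12_QUERY_TYPE_TIMESTAMP, 0);
    // ... record the GPU work to be measured here ...
    cmdList->EndQuery(heap.Get(), D3D12_QUERY_TYPE_TIMESTAMP, 1);
    cmdList->ResolveQueryData(heap.Get(), D3D12_QUERY_TYPE_TIMESTAMP,
                              0, 2, readback, 0);
    return heap;
}

double ReadGpuMilliseconds(ID3D12CommandQueue* queue, ID3D12Resource* readback)
{
    UINT64 freq = 0;
    queue->GetTimestampFrequency(&freq);     // timestamp ticks per second

    UINT64* ticks = nullptr;
    D3D12_RANGE range = { 0, 2 * sizeof(UINT64) };
    readback->Map(0, &range, reinterpret_cast<void**>(&ticks));
    double ms = double(ticks[1] - ticks[0]) * 1000.0 / double(freq);
    readback->Unmap(0, nullptr);

    // Compare against the frame time (e.g. 16.7 ms at 60 FPS) to estimate
    // what fraction of the frame the GPU actually spent busy on this work.
    return ms;
}
```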
    And, of course, game developers profile GPUs from time to time...
    According to Kollock, he has heard of some console developers getting up to 30% increases in performance using asynchronous shaders. Again, this is on console hardware, so the amount may increase or decrease on the PC. In an informal chat with a developer at Epic Games (so a massive grain of salt is required), his late-night, ballpark, “totally speculative” guesstimate is that, on the Xbox One, the GPU could theoretically accept a maximum of ~10-25% more work in Unreal Engine 4, depending on the scene. He also said that memory bandwidth gets in the way, which asynchronous shaders would be fighting against. It is something they are interested in and investigating, though.

    This is where I speculate on drivers. When Mantle was announced, I looked at its features and said “wow, this is everything that a high-end game developer wants, and a graphics developer absolutely does not”. From the OpenCL-like multiple GPU model taking much of the QA out of SLI and CrossFire, to the memory and resource binding management, this should make graphics drivers so much easier.
    It might not be free, though. Graphics drivers might still have a bunch of games to play to make sure that work is stuffed through the GPU as tightly packed as possible. We might continue to see “Game Ready” drivers in the coming years, even though much of that burden has been shifted to the game developers. On the other hand, maybe these APIs will level the whole playing field and let all players focus on chip design and efficient ingestion of shader code. As always, painfully always, time will tell.

    Source:
    http://www.pcper.com/reviews/Editori...ronous-Shaders

  10. #190 Jorge-Vieira
    Exclusive: The Nvidia and AMD DirectX 12 Editorial – Complete DX12 Graphic Card List with Specifications, Asynchronous Shaders and Hardware Features Explained


    Foreword: There has been a lot of debate recently about the DirectX 12 capabilities of Nvidia and AMD graphics cards, and allegations have been flying left and right. While most of these are completely unfounded, the subtle technicalities involved are usually misconstrued to a horrible extent, often to better support the popular narrative. In light of this, I thought it was high time I tackled the beast myself and made a one-click resource for everything DirectX 12 and GPU related that is relevant to today’s gamers.
    (Image: Not an official poster. @WCCFTech)
    A thorough look at AMD and Nvidia DirectX 12 support and the AotS controversy

    This editorial will not only cover the basics and frequently asked questions about DirectX 12, it will also attempt to shed neutral light on the recent controversy and, hopefully, steer things away from the redundant debate. This is my attempt to show how things actually stand. It will also contain a complete list of all AMD and Nvidia DirectX 12 capable base models and their respective capabilities, with those capabilities explained beforehand. Of course, it will not be possible (not to mention inadvisable) to go into every explicit detail; however, I will go into more depth than the usual posts aimed at gamers.
    Here is what you will find in this article:

    1. A statement of problem, which is the DirectX 12 hype
    2. A complete overview of the technicalities that are critical to understanding “DirectX 12 Support”
    3. Addressing the ASync Question: Nvidia support and AMD advantage
    4. A complete list of AMD graphic cards that support DirectX12 and the extent of support.

    5. A complete list of Nvidia graphic cards that support DirectX12 and the extent of support.
    6. Our foray into the AotS controversy and an attempt to look at the problem in a new way.


    Eligibility Criteria: Here is our inclusion and exclusion criteria for the DX12 GPU list:

    • All base models from both vendors, cited as supporting DirectX 12, are included.
    • Rebadges in different generations are included.
    • Rebadges within the same generation are not included.
    • AIB variants are not included.
    • OEM variants are not included.

    Disclaimer: Every attempt has been made to ensure the accuracy of the data present in this piece. However, we accept the possibility of a mistake or accidental omission due to human error. If any such hiccup is spotted, please let me know and I will make sure to update accordingly at the earliest.



    Part 1

  11. #191 Jorge-Vieira
    Exclusive: The Nvidia and AMD DirectX 12 Editorial – Complete DX12 Graphic Card List with Specifications, Asynchronous Shaders and Hardware Features Explained

    The Main Problem: DirectX 12 Hype

    “DirectX 12 is a revolutionary API.” This statement, and variations thereof, have been echoing throughout the tech world for the past year or so. WCCF included, everyone has hailed the upcoming API as the harbinger of a new age in gaming technology. And to a certain extent, it is true. There is, however, a very big problem with this mindset. While any tech enthusiast, myself included, will attest that the DirectX 12 API is a very significant update, the general gaming world has quietly been taking away a subtly different meaning from all this.


    With hype at an all-time high, even by hardware standards, it was only a matter of time before the over-inflated bubble burst. There are amazing things DirectX 12 can achieve, but magically adding power that a hardware configuration was not theoretically capable of in the first place is not one of them. Confused? Let me elaborate.
    Differentiating between Untapped Potential and Maximum Potential

    The key difference between the hardware enthusiast and the general masses is knowing what the words ‘untapped potential’ mean. Contrary to what most gamers might believe, there are very few cases in which a graphics card is being utilized completely, at a true 100% capacity, especially at the high end. Usually there are bottlenecks that stop that from happening, and usually these exist in the software layer that acts as a bridge between the GPU and the end-user software.
    The point of DirectX 12 is to move towards an approach that goes by many names: ‘low level’, ‘to the metal’, and so on. This is basically an implementation in which the bloated middle software layer (the driver) provides minimal interference, and the API hands over the ability to control specific parts of the hardware directly. At the same time, more autonomy is given to the GPU. Traditionally, the graphics card is a slave to the processor and can only work as fast as the work the CPU provides. With DX12, much of the load is lifted off the CPU so the GPU can work towards its true potential.




    So lets look at the following scenarios:

    • A very high-powered graphics card paired with a low-powered processor. DX12 API gains will be massive, since not only will the processor be able to issue more work, the GPU will also have more autonomy to work on it faster.
    • A very powerful CPU with a low-powered graphics card. DX12 API gains will be negligible, since the card is already operating at peak capacity and has more than it can chew.

    In the first case, the untapped potential of the graphics card was enormous, and DirectX 12’s low-level API was able to unlock it to a very impressive extent. In the second case, the card was already operating at its maximum capacity and the DX12 API did not make a lot of difference. Because of the consistent over-hyping, everyone now expects DirectX 12 to be nothing short of a miracle worker.
    Differentiating between the DirectX 12 ‘Low Level’ API and DirectX 12 ‘Hardware Features’

    Now granted, the vast majority of cases are going to have one bottleneck or another that DX12 successfully overcomes. This means that most users, whether on AMD or Nvidia, are going to benefit from the DirectX 12 API in general. But this is where another problem starts. People see the "DirectX 12" tag on a GPU and will inevitably expect it to use, and benefit from, every single "feature" that DX12 unlocks.
    As far as I can see, the average gamer fails to differentiate the API from the various hardware features it can access on a particular card. The point of the DirectX 12 API is to provide low-level access and GPU autonomy. This translates into more draw calls (among various other things), more flexibility for developers and broadly universal performance gains of some extent.
    What is not universal is the availability, use and advantage of any given hardware feature. For convenience's sake, let's call these DX12 hardware features. There are many of them: some are available only to select vendors, some are limited to certain hardware generations, and some are redundant for a particular IHV. Before we go any further, a short overview of feature levels is in order.



    Part 2

  12. #192
    Tech Ubër-Dominus Avatar de Jorge-Vieira
    Exclusive: The Nvidia and AMD DirectX 12 Editorial – Complete DX12 Graphic Card List with Specifications, Asynchronous Shaders and Hardware Features Explained

    Direct3D 12 Feature Levels and Capabilities Overview

    Before we get into the nitty-gritty details, let's differentiate between APIs, feature levels and hardware capabilities such as Resource Binding. DirectX 12 is an API, or Application Programming Interface. It is, simply put, code that forms a bridge between the GPU and any end-user software. Everyone is thoroughly excited about the DirectX 12 API because its low-level capabilities are a huge upgrade over its predecessor.
    Low-level access, as most of our readers know, means the ability of the API in question to access parts of the GPU directly. Now we come to feature levels. Feature levels are pre-defined standards of GPU hardware capability and, in the strict sense, have almost nothing to do with the API itself. The DirectX 12 API requires graphics hardware that conforms to Feature Level 11_0 at the very least. Even after a new feature level is defined, many older GPUs and graphics architectures can still qualify for it; for example, a graphics card previously classed as Feature Level 11_1 may very well meet all the requirements to fully support Feature Level 12_0.
    The various DirectX 12 feature levels

    A feature level will, however, usually require a similarly named API to access its features in their entirety. So basically, all GPUs conforming to FL 11_0 through FL 12_1 can run the DirectX 12 API fully. The much-hyped advantage, the reduction of CPU overhead, is something everyone gets (provided you fall in the FL 11_0 to 12_1 band). The thing is, these newer GPUs carry new hardware features that only the DirectX 12 API can finally access, so new standards had to be created: namely FL 12_0 and 12_1.
    Graphics cards supporting the following feature levels can run DirectX 12; the qualifying requirements for each feature level are also given (a query sketch follows the list):

    • FL 11_0: Supports Resource Binding Tier 1 and Tiled Resources Tier 1
    • FL 12_0: Supports Resource Binding Tier 2, Tiled Resources Tier 2 and Typed UAV Tier 1
    • FL 12_1: Conservative Rasterization Tier 1 and Raster Order Views (ROV)
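    As a minimal sketch of how these tiers are discovered at run time – assuming an ID3D12Device has already been created with D3D12CreateDevice, and with the function name being mine rather than anything from the article – an application can simply ask the device which tiers and feature levels it supports:

    #include <d3d12.h>
    #include <cstdio>

    // Print the capability tiers discussed above for an existing device.
    void PrintDx12HardwareTiers(ID3D12Device* device)
    {
        D3D12_FEATURE_DATA_D3D12_OPTIONS opts = {};
        if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS,
                                                  &opts, sizeof(opts))))
        {
            printf("Resource Binding Tier:        %d\n", (int)opts.ResourceBindingTier);
            printf("Tiled Resources Tier:         %d\n", (int)opts.TiledResourcesTier);
            printf("Typed UAV additional formats: %d\n", (int)opts.TypedUAVLoadAdditionalFormats);
            printf("Conservative Raster Tier:     %d\n", (int)opts.ConservativeRasterizationTier);
            printf("Raster Order Views:           %d\n", (int)opts.ROVsSupported);
        }

        // Ask which of the feature levels named above the hardware actually reaches.
        const D3D_FEATURE_LEVEL requested[] = {
            D3D_FEATURE_LEVEL_11_0, D3D_FEATURE_LEVEL_11_1,
            D3D_FEATURE_LEVEL_12_0, D3D_FEATURE_LEVEL_12_1 };
        D3D12_FEATURE_DATA_FEATURE_LEVELS levels = {};
        levels.NumFeatureLevels        = sizeof(requested) / sizeof(requested[0]);
        levels.pFeatureLevelsRequested = requested;
        if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_FEATURE_LEVELS,
                                                  &levels, sizeof(levels))))
            printf("Max feature level: 0x%x\n", (unsigned)levels.MaxSupportedFeatureLevel);
    }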

    Now that we know what the definitions are, here is the complete specification table of all IHVs with released hardware (including the latest Skylake iGPU and GM200):




    IHV Hardware Specification Comparison






    So here is the thing. Maxwell 2.0 (GM200) has the hardware characteristics necessary to earn the 12_1 stamp, so it does. However, AMD's GCN has had Resource Binding Tier 3 for a very long time now, not to mention Typed UAV Format Tier 2 and asynchronous shaders for parallel workloads. Similarly, Intel has supported Raster Order Views since its Haswell iGPUs, and has done so while sitting at Feature Level 11_1; to put that into perspective, Nvidia's architectures only support ROVs from GM200 Maxwell onwards. You can clearly see that no hardware vendor has the undisputed best GPU hardware specification; every IHV has a weakness or a missing capability in some form or another. So who exactly has the best relative specification, all things considered? This is where it gets really tricky – and, for the most part, unanswerable.
    Hardware vendors DirectX 12 hardware specifications compared

    The question we can answer, however, is: which capability (or lack thereof) will actually translate into an improved (or degraded) gaming experience at the end of the day? Here the answer is relatively simple to explain.

    Let's start with AMD's edge. Since I will be tackling async shaders on the next page, let's begin with Resource Binding. Resource binding is, basically, the process of linking resources (such as textures, vertex buffers and index buffers) to the graphics pipeline so that shaders can process them. Tier 3 means AMD's architecture is limited mostly by memory, and while this is a desirable trait, it is something that happens out of sight, without translating into anything a gamer can observe on-screen. Similarly, Typed UAV formats aren't something an end user can observe; there isn't a fully developed ecosystem for them yet, and only when VR becomes mainstream will they affect anything but a very small minority. Asynchronous compute shaders, however, are a performance-enhancing feature, so their benefit is not new visual effects but improved performance.

    Intel has supported Raster Order Views since the Haswell days (fulfilling one half of the 12_1 requirement), and now with Skylake it also boasts full DirectX 12 API support at Feature Level 12_0.

    Finally, we come to Nvidia. Nvidia has something no other IHV currently has: Conservative Rasterization together with Raster Order Views. While the qualifying requirement is only Tier 1, GM200 has Tier 2 Conservative Rasterization support. Here is the thing: conservative rasterization is a technique that samples pixels based on the primitive in question and is much more accurate than conventional rasterization – in other words, it will make a difference to the end user in the form of special graphical effects. Conservative rasterization itself will enable many interesting graphical techniques – hybrid ray-traced shadows, for one.
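    For illustration only, and assuming a device plus a graphics pipeline description already exist, requesting conservative rasterization in D3D12 amounts to flipping a single field in the rasterizer state – provided the device reports at least Tier 1 support (the helper name below is hypothetical):

    #include <d3d12.h>

    // Turn on conservative rasterization for a PSO description, but only when
    // the device actually advertises the capability.
    void EnableConservativeRasterIfSupported(ID3D12Device* device,
                                             D3D12_GRAPHICS_PIPELINE_STATE_DESC& psoDesc)
    {
        D3D12_FEATURE_DATA_D3D12_OPTIONS opts = {};
        const bool supported =
            SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS,
                                                  &opts, sizeof(opts))) &&
            opts.ConservativeRasterizationTier !=
                D3D12_CONSERVATIVE_RASTERIZATION_TIER_NOT_SUPPORTED;

        if (supported)
            psoDesc.RasterizerState.ConservativeRaster =
                D3D12_CONSERVATIVE_RASTERIZATION_MODE_ON;
    }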


    Part 3

  13. #193
    Tech Ubër-Dominus Avatar de Jorge-Vieira
    Exclusive: The Nvidia and AMD DirectX 12 Editorial – Complete DX12 Graphic Card List with Specifications, Asynchronous Shaders and Hardware Features Explained

    The ASync Question: Does Nvidia Support it?


    All right, now that we have gotten that out of the way, let's get to the heart of the current controversy: asynchronous shaders and the Ashes of the Singularity (AotS) benchmarks. To make this simpler, let me list the configurations tested here, along with the DirectX 11 and DirectX 12 average frames per second:

    • Nvidia Test 1: Core i7 5960X + Geforce GTX Titan X
      • DX11: 45.7 fps
      • DX12: 42.7 fps

    • Nvidia Test 2: Core i5 3570K + Geforce GTX 770
      • DX11: 19.7 fps
      • DX12: 55.3 fps

    • AMD Test 1: Athlon X4 860K + Radeon R7 370
      • DX11: 12.4 fps
      • DX12: 15.9 fps

    • AMD Test 2: Xeon CPU E3-1230 + Radeon R9 Fury
      • DX11: 20.5 fps
      • DX12: 41.1 fps

    Explaining the AotS benchmarks with what we know of DirectX 12

    These benchmarks were mostly taken at face value, and the usual frame war erupted over the raw numbers. The problem is, when talking about an API that reduces overhead, we need to look at the context as well. In both AMD tests the numbers rise, but that is because in the first test the processor is obviously a bottleneck, so the configuration had a lot of untapped potential. In the second test the processor was again an arguable bottleneck, since Xeons are clocked fairly low.
    In the second Nvidia test, DX12 also performs as expected when a reasonably powerful GPU is coupled with a decent processor. In all three of these scenarios, the processor bottleneck was removed.
    The actual anomaly is the first test, with an incredibly powerful CPU and an incredibly powerful GPU. Theoretically, this configuration has very little bottleneck – if any. DirectX 12 wouldn't be expected to yield any major performance increase, because the configuration is already very close to its maximum potential. The funny thing is, however, that switching to DX12 actually results in a lower number than before. That is something that shouldn't have happened. To understand just what is going on, we need to look at what was happening behind the scenes.
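    As a quick sanity check of the swings being argued over – computed here from the averages quoted above, not taken from the benchmark article itself – the relative DX11-to-DX12 change per configuration works out as follows:

    #include <cstdio>

    // Relative DX11 -> DX12 change for the four AotS configurations listed above.
    int main()
    {
        struct Result { const char* config; double dx11; double dx12; };
        const Result results[] = {
            { "i7 5960X + GTX Titan X",  45.7, 42.7 },
            { "i5 3570K + GTX 770",      19.7, 55.3 },
            { "Athlon X4 860K + R7 370", 12.4, 15.9 },
            { "Xeon E3-1230 + R9 Fury",  20.5, 41.1 },
        };
        for (const Result& r : results)
            printf("%-26s %+.1f%%\n", r.config, (r.dx12 - r.dx11) / r.dx11 * 100.0);
        // Roughly -6.6%, +180.7%, +28.2% and +100.5%. Only the Titan X
        // configuration regresses, which is the anomaly discussed here.
        return 0;
    }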
    An overview of Synchronous and Asynchronous Shaders in different GPU Architectures

    Now, what exactly are asynchronous shaders? Traditionally, there is a single graphics queue available for scheduling work, and whatever needs to be done is scheduled into it in serial order. The problem with this approach is that it usually results in bottlenecking and in the GPU not working at full capacity. For understanding's sake you can imagine the queue as a thread, and as you might know, a multi-threaded approach to computation is the future. So 'asynchronous shaders' basically means that there is:

    • the standard Graphics Queue, and
    • an additional Compute Queue for computational tasks.

    A Copy Queue is also available, but since it is irrelevant to the current topic, I won't be going into it.
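    In D3D12 terms these queues map directly onto command queue types. A minimal sketch, assuming the device already exists and with names of my own choosing, would create the compute queue right alongside the usual graphics ('direct') queue:

    #include <d3d12.h>

    // Create one graphics (direct) queue and one compute queue on the same device.
    // In a real engine each queue would be fed by independently recorded command lists.
    void CreateQueues(ID3D12Device* device,
                      ID3D12CommandQueue** graphicsQueue,
                      ID3D12CommandQueue** computeQueue)
    {
        D3D12_COMMAND_QUEUE_DESC desc = {};

        desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;   // graphics, compute and copy work
        device->CreateCommandQueue(&desc, __uuidof(ID3D12CommandQueue),
                                   (void**)graphicsQueue);

        desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // compute (and copy) work only
        device->CreateCommandQueue(&desc, __uuidof(ID3D12CommandQueue),
                                   (void**)computeQueue);
    }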
    Now, contrary to popular belief, Nvidia's Maxwell 2.0 does support "asynchronous shaders". Do bear in mind that documentation on this is very limited – most of it comes from engineer comments and the documentation on HyperQ (Nvidia's multiple-queue implementation). The following data shows the queue engines of the various AMD and Nvidia architectures:

    • AMD GCN 1.0 – 1 Graphics Queue + 16 Compute Queues (7900 series, 7800 series, 280 series, 270 series, 370 series)
    • AMD GCN 1.0 – 1 Graphics Queue + 8 Compute Queues (7700 series, 250 series)
    • AMD GCN 1.1 – 1 Graphics Queue + 64 Compute Queues (R9 290, R9 390 series)
    • AMD GCN 1.1 – 1 Graphics Queue + 16 Compute Queues (HD 7790, R7 260 series, R7 360)
    • AMD GCN 1.2 – 1 Graphics Queue + 64 Compute Queues (R9 285/380, R9 Fury series, R9 Nano)
    • Nvidia Kepler – 1 Graphics Queue, Mixed Mode not supported (32 Pure Compute)
    • Nvidia Maxwell 1.0 – 1 Graphics Queue, Mixed Mode not supported (32 Pure Compute)
    • Nvidia Maxwell 2.0 – 1 Graphics Queue + 31 Compute Queues

    There are two ways the extra compute queues can be used: a "Pure Compute" mode, which is expensive because it requires a mode switch, and a "Mixed Mode", which is what asynchronous shaders are all about. On all AMD GPUs with async enabled, the card runs one graphics queue and at least two compute queues. This means that in-game tasks that require compute can be offloaded onto the GPU (if, and only if, it has spare horsepower). This naturally makes the GPU more autonomous in situations where the CPU is the bottleneck or the GPU is not being used to its full potential.
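    To make "mixed mode" concrete, here is a hedged sketch of how a frame could push work to both queues so the GPU can overlap them, using a fence so the graphics queue only waits when it actually consumes the compute results. Queue, command-list and fence creation are assumed to have happened elsewhere; this illustrates the general D3D12 pattern, not how any particular game implements it.

    #include <d3d12.h>

    // Submit pre-recorded command lists to both queues so the GPU can overlap them.
    void SubmitFrame(ID3D12CommandQueue* graphicsQueue,
                     ID3D12CommandQueue* computeQueue,
                     ID3D12GraphicsCommandList* graphicsList,
                     ID3D12GraphicsCommandList* computeList,
                     ID3D12Fence* fence,
                     UINT64& fenceValue)
    {
        // Kick off the asynchronous compute work (physics, particles, lighting, ...).
        ID3D12CommandList* computeLists[] = { computeList };
        computeQueue->ExecuteCommandLists(1, computeLists);
        computeQueue->Signal(fence, ++fenceValue);

        // GPU-side wait: the graphics queue stalls only if the compute results
        // are not finished by the time it needs them.
        graphicsQueue->Wait(fence, fenceValue);

        ID3D12CommandList* graphicsLists[] = { graphicsList };
        graphicsQueue->ExecuteCommandLists(1, graphicsLists);
    }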




    As you can see, Maxwell 2.0 does support mixed mode, with up to 31 compute queues alongside its primary graphics queue. No other Nvidia architecture has this capability without a complete mode switch. This is where things get a bit muddy. Since there is no clear documentation, and since Nvidia has yet to release an official statement on the matter, it is alleged (read: not confirmed in any way) that Nvidia's mixed mode requires a software scheduler, which is why it is expensive to deploy even on Maxwell 2.0.
    Different architectural approaches to achieving the same result: gaming excellence

    There is something else we have to consider too. The chips currently employing the Maxwell 2.0 architecture are GM200, GM204 and GM206, and they were not designed to be compute-intensive. AMD's architecture, on the other hand, has always been exceptional at compute, so using compute queues to supplement the graphics queue will always pay off more on a Radeon. That much is fact.

    Picture credits: Nvidia
    However, the question remains (and is currently unanswered) whether Nvidia cards need async compute to reach their maximum potential at all. There is no evidence to suggest that the Maxwell architecture would benefit from it, and no evidence to suggest it wouldn't. But if we are to trust each vendor to know its own architecture, then Nvidia has, over the past few generations, focused on creating graphics processors that specialize in single-precision and gaming performance.
    Double precision and compute have taken a back seat since the Fermi era. Dynamic Parallelism is one example of such compute technologies present in post-Fermi architectures, but these are usually only ever used in the HPC sector. This is also one of the reasons why gamers should still focus on the maximum potential – the raw frames per second a graphics card achieves – instead of on the performance gain achieved by tapping into the untapped with DX12.



    Part 4

  14. #194
    Tech Ubër-Dominus Avatar de Jorge-Vieira
    Exclusive: The Nvidia and AMD DirectX 12 Editorial – Complete DX12 Graphic Card List with Specifications, Asynchronous Shaders and Hardware Features Explained

    AMD Radeon DirectX 12 Graphic Cards List and Supported Features










    AMD DX12 Graphic Cards List Notes:

    • AMD has a total of 31 base models compatible with DX12 API.




    Part 5

  15. #195
    Tech Ubër-Dominus Avatar de Jorge-Vieira
    Exclusive: The Nvidia and AMD DirectX 12 Editorial – Complete DX12 Graphic Card List with Specifications, Asynchronous Shaders and Hardware Features Explained

    Nvidia Geforce DirectX 12 Graphic Cards List and Supported Features











    Nvidia Graphic Cards List Notes:

    • Nvidia has a total of 50 base models promised DX12 API support, but only 30 base models currently support the DX12 API (Nvidia has yet to make good on its promise of DX12 support for Fermi cards).




    Part 6

 

 