This is a summary of the questions arising from the interactive chat Q&A of an NVIDIA GRID webinar in May 2016 featuring a deep dive with two existing GRID customers. We have checked and enhanced the answers, but readers should be aware that as products evolve the answers may change, and all information should therefore be checked against the current product documentation. We aim to release these FAQs for the benefit of those who attended the webinars and also the wider community. Users with follow-on questions are encouraged to post on the NVIDIA GRID Forums, where GRID support, product, and engineering staff answer questions: http://www.gridforums.nvidia.com.
Q: Where can I get a recording of the webinar?
The webinar recording is available here: http://cc.readytalk.com/play?id=c1tuiq
Q (for Jimmy & Fred): What profiles are these designers mostly using? And secondly, are they doing persistent or some sort of non-persistent VDI?
FRED replied: We use the K260Q (2GB) profile for the majority (80%) of our designer population, and the K240Q (1GB) and K280Q (4GB) profiles for roughly the remaining 20%. We deploy using Citrix Provisioning Server (PVS) and are 100% non-persistent VDI, but we provide a persistent profile to the user with a profile management product called AppSense Environment Manager. This allows us to persist not just the personalization settings of Windows, but also to capture and/or provide folder redirection for data that is not contained within the user's AppData folder substructure. This includes things like Catia license configuration files, Catia tool panel and control configuration files, etc.
JIMMY: We use three different profiles for our three different types of users. People who don't typically use our graphics-intensive design applications on a regular basis get the 512MB profile. The majority of our designers, who primarily use Revit and the Adobe applications, get the 1GB profile, and our renderers, who typically use programs like 3ds Max and Rhino, get the 2GB profile. We also use persistent images, so our end users can maintain their custom settings and plugins for all their applications.
Q: What are most current GRID customers doing?
A: The case studies are very good for getting a feeling of what you can do with GRID and cover applications virtualized (e.g. Autodesk, Siemens, PTC Creo, Adobe Photoshop, Aveva) and lots of other details, see https://virtuallyvisual.wordpress.com/useful-links/remote-graphics-case-studies/ for a list of both NVIDIA and partner case studies.
Q: My customer is using CDESIGN, a fashion design software package. Does it work on virtualized infrastructures?
A: If an application works on Windows 7/10 etc. on a PC/workstation, it will work in a VM by virtue of the hypervisor support statements/commitments for Microsoft OSs. Some software vendors don't support virtualized platforms per se, though, which in practice means an application bug may need to be reproduced on a physical workstation before the vendor will support it. Applications used with XenApp have the additional constraint that they must be RDS compatible; the Citrix Ready organization manages those certifications. These questions are best directed to the hypervisor partnering programs, such as the Citrix Ready or VMware Ready programs.
Q: I need information for a customer project.
A: Members of NPN (NVIDIA Partner Network) should contact their Partner Business Manager (PBM) and explore the material available to them via the GPU Genius Curriculum.
For community members who have questions, our forums are a great resource for pure NVIDIA questions: http://gridforums.nvidia.com is where you'll find a lot of NVIDIA staff and community gurus.
Customers often find it useful to review the case studies containing details of others' real world problems and use of GRID as a solution, see: https://virtuallyvisual.wordpress.com/useful-links/general-citrix-case-studies/
The main GRID webpage also carries a great deal of detailed information under the "resources" tab: http://www.nvidia.com/object/grid-enterprise-resources.html
Those new to virtualization may prefer to use a local NVIDIA partner with experience of GRID technologies: http://www.nvidia.com/object/partner-locator.html
Q: Is GRID a viable solution for my creative team, who are mostly using software like Adobe After Effects, 3ds Max, V-Ray, etc.?
A: NVIDIA's GRID solution is applicable to any situation where applications running in a virtual environment require graphical acceleration to perform well. This applies across the range of applications typically found in M&E (Media and Entertainment) environments, like 3ds Max, Maya, Premiere Pro (PPro), After Effects, Photoshop, Nuke, etc., with one or two provisos.
The most notable proviso is where CUDA or OpenCL support is required by an application. Since CUDA is today only supported by our 8Q profiles (where the whole physical GPU is allocated to the VM), applications that require CUDA will need to use this profile, with the attendant reduction in possible density. Applications that use CUDA in an M&E context include PPro (the Mercury Playback Engine), After Effects (3D text rendering), and many rendering engines like V-Ray RT.
So these applications either need to be used in such a way that the CUDA part of the application is not exercised (for example, not everyone needs 3D text in After Effects), or the CUDA part of the app must be allowed to fall back to CPU. For example, the Mercury Playback Engine can use the CPU, and that would likely be acceptable where heavy effects or multiple video streams were not being processed.
Other considerations for M&E would be that Wacom tablet support is only available in a Citrix environment and is not fully certified. Additionally, the color-space support in VDI protocols is often not sufficient for true-color QA or color grading applications. The remoting protocols and virtualization stacks also handle audio/graphics channel synchronization differently, and users should consult their remoting vendor (e.g. Citrix, VMware, or other) on their technologies to avoid drift over time.
Many M&E applications run under Linux, and this is a popular choice for many M&E customers. It should be noted that full Linux support with vGPU is available with both VMware and Citrix (introduced in XenServer 7.0 just after the webinar was broadcast). RGS, XenDesktop, Horizon, Mechdyne TGX, and NICE DCV all support vGPU-enabled VMs in their protocols too. Which Linux OSs are supported varies across vendors, so those evaluating should investigate each protocol's support.
A good range of M&E requirements can be met by GRID technology, with some of our customers successfully using Maya, 3ds Max, and similar applications, but some aspects of a workflow may still require a dedicated workstation: color grading, complex editing with PPro, or final QA, for example.
One of our rendering and virtualization experts is having a similar discussion with a similar customer on this thread: https://gridforums.nvidia.com/default/topic/801/using-grid-cards-in-a-render-farm/ If you want to ask more questions, that thread could be an appropriate place to do so.
Q: We have proposals from vendors with K1 cards and Cisco C240 servers - what advantage do the new M60 or M6 cards have over the K1 or K2 cards?
A: The M60/M6 and M10 (Maxwell) are designed to replace the K2 and K1 (Kepler) respectively. The K1/K2 have been available for three years and are reaching end of life for availability (NVIDIA's EOL of availability for OEMs is September 2016); they will continue to be maintained, but new features and improvements will be focused on newer products.
The M60/M6 and M10 offer higher scalability and performance, which can reduce the need for additional servers, lowering the cost of a deployment. These newer cards also offer 4K support, more options for multi-monitor/quad-monitor support (via the new B-profiles), and vGPU for Linux platforms.
The M60/M6/M10 are sold with software licensing and enterprise support, providing direct 24/7 support from NVIDIA. The software model of the M6/M60/M10 (GRID 2.0 and up) means that new features, such as H.264 encoder enhancements that protocols can benefit from, and improved monitoring can be made available without the need for replacement hardware.
Q: Aside from having Graphics virtualization capabilities, is it possible to use the NVIDIA GRID GPUs to utilize the hardware for rather more specific applications such as Deep Learning (DL) or GPU based compute (FDTD acceleration) or running databases for big data analytics similar to how traditional CPU-based server virtualization is utilized for the needs of multiple tenants?
A: Yes, it is possible to use the M60 cards for compute uses such as deep learning, but this is not supported under the management of the graphics vGPU GRID software; we do, however, have many customers doing this on bare metal. The M60 cards can be repurposed into "compute" (single-precision) mode, but not currently dynamically, and their management and usage in compute mode is not supported under the NVIDIA GRID software. Virtualization, i.e. the hypervisors, needs to catch up a bit with HPC, DL, and compute needs, where users are running applications not traditionally virtualized in the past; the hypervisors' handling of memory BARs etc. is currently optimized for graphical and VDI workloads. CUDA and OpenCL are at the moment only available on the vGPU profile representing a whole GPU on the M60. Customers interested in distributed compute management options should contact the HPC and DL teams at NVIDIA.
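For readers exploring bare-metal compute on the M60, the mode change mentioned above is performed offline with NVIDIA's gpumodeswitch tool rather than dynamically; a sketch of its documented usage follows (run during a maintenance window, with the GPUs idle, and treat the exact invocation as something to confirm against the gpumodeswitch user guide for your release):

```shell
# List the current mode (graphics or compute) of each supported GPU.
gpumodeswitch --listgpumodes

# Switch all GPUs on the host to compute mode; --auto skips the
# interactive per-GPU confirmation. A reboot is required afterwards.
gpumodeswitch --gpumode compute --auto

# Switch back to graphics mode before returning the card to GRID/vGPU use.
gpumodeswitch --gpumode graphics --auto
```
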
Q: Will Tesla P100 support a multi-tenant virtualization profile?
A: The P100 is a new NVIDIA GPU specified for compute rather than for the virtualization of graphics. It is not a supported option for GRID VDI or application delivery. The GRID product team at NVIDIA continues to work with the HPC and DL (deep learning) product teams towards unified solutions; questions on HPC and DL, though, are best addressed to those product teams.
Q: My question was actually more about having multiple forms of resource utilization over a Tesla P100 cluster-based resource pool, used by diverse applications provided to different tenants. Is this form of functionality currently available, or slated on the roadmap for the near future, with NVIDIA GRID paired with either the current K80/M60 or the Tesla P100 and upcoming hardware? The supposed case is a datacenter used by the researchers of a Computer & Electronics Engineering department of a college, with multiple graduate students who want access to GPU-accelerated computation resources; for effective use we would want effective pooling mechanisms. Could NVIDIA GRID be used for this purpose?
A: It is something we are making steps towards long term, but currently we are limited by the technology and by the hypervisor vendors' (limited) support for compute workloads. So NVIDIA GRID would not be a solution for this use case.
Q: How many engineers is the speaker from Textron (Bell Helicopters) supporting for graphics?
A: The referenced initial deployment is covered in this case study: http://images.nvidia.com/content/grid/case-study/pdf/nvidia-grid-case-study-bell-helicopter-mar-2015.pdf. Textron have spoken before at NVIDIA's GTC events and some videos of their sessions are available online and contain further details, see here: http://on-demand-gtc.gputechconf.com/gtcnew/on-demand-gtc.php?searchByKeyword=textron&searchItems=&sessionTopic=&sessionEvent=&sessionYear=&sessionFormat=&submit=&select=
FRED answered: Today, Textron/Bell Helicopter is supporting 400 concurrent NVIDIA GRID accelerated sessions on our hardware stack. This is a mix of resource sizes, with about 80% of the population using a K260Q (2GB) graphics profile. This same 80% have 32GB of RAM each, 4 vCPUs, and 768GB of write-cache hard drive space for temporary model scratch. Our models are rather large, and we determined this to be the optimal performance build for our designers. I cannot speak directly to how many of our roughly 4000 engineers are using VDI as their primary source of graphics capability, but our goal is to migrate all of them by the end of their current dedicated hardware lifecycle.
Q: Three of our high schools utilize AutoCAD, SolidWorks, etc and with our current K1 cards we're really only able to get 30 users on a card utilizing vGPU at a lower end profile. Does this new model and hardware (M6/M60) allow for more user density per card?
A: NVIDIA has just announced the high-scalability M10 card, which can support up to 64 vGPU-enabled VMs per card (and so 128 per host on some servers). This card is primarily aimed at business and mainstream VDI applications, as was the K1. Normally we would recommend the M6/M60 together with vWS (virtual workstation) for serious CAD usage; however, since you are achieving 30 users on a K1, it sounds like your students are taking courses using very small models and lightweight CAD usage, and it is possible that the M10 would suit your needs, provided sufficient other resources (bandwidth, CPU, etc.) are available to support more users. You should check how near you are to the capacity of the server's CPU.
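The density arithmetic behind those numbers can be sketched as below. The board capacities are assumptions drawn from published specifications of the era (M10: four GPUs with 8GB each; K1: four GPUs with 4GB each), and real limits also depend on which profile combinations a given GRID release supports:

```python
def users_per_board(gpus_per_board: int, fb_per_gpu_gb: float, profile_gb: float) -> int:
    """Maximum vGPU VMs a single board can host at a given profile size.

    Each physical GPU is carved into identical vGPU profiles, so the board
    total is (framebuffer per GPU // profile framebuffer) x GPUs per board.
    """
    return int(fb_per_gpu_gb // profile_gb) * gpus_per_board

# Tesla M10 with a 512MB (0.5GB) business-VDI profile: 16 per GPU x 4 GPUs.
m10_users = users_per_board(gpus_per_board=4, fb_per_gpu_gb=8, profile_gb=0.5)

# GRID K1 with the same 512MB profile: 8 per GPU x 4 GPUs.
k1_users = users_per_board(gpus_per_board=4, fb_per_gpu_gb=4, profile_gb=0.5)

print(m10_users, k1_users)  # 64 32
```

The same arithmetic explains why CUDA-capable 8Q profiles (a whole 8GB GPU per VM) drop density to one VM per GPU.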
Some educational users have had success deploying lighter-weight 3D/CAD applications via RDSH platforms, which can achieve high user densities; this approach is usually recommended for office and business applications rather than 3D/CAD applications, so you should test for yourself. Georgia Tech's case study may be of particular interest, see: NVIDIA GTC2015 Video; Citrix Case Study.
Q: Just so I understand your response, you're recommending we look at utilizing XenApp to deliver the individual applications to those student labs? Does any case study go into the details of how they're set up by chance?
A: In an educational setting, if density is critical and the apps are being used in a lightweight manner (small CAD teaching parts), then yes, an RDSH solution would be worth evaluating. Something like a single server-OS VM with a whole GPU from a K2/M60 attached to it might well serve your use case. This presentation gives details of Georgia Tech's XenApp (RDSH) usage with similar applications: http://on-demand.gputechconf.com/gtc/2015/presentation/S5128-Florian-Becker-Didier-Contis.pdf. Other case studies on XenApp and XenDesktop in education can be found here (listed separately for VMware and Citrix): https://virtuallyvisual.wordpress.com/useful-links/remote-graphics-case-studies/
Q: We run a great deal of scripts querying databases and filtering into Excel - does the GRID product allow for enhanced performance in these areas?
A: It really depends on how those applications are architected to take advantage of GPUs: if an application benefits from a GPU on physical hardware, it will also benefit when virtualized. It's unlikely that such scripts are designed to utilize a GPU for those operations on either physical or virtualized platforms.
Q: Can I email you for more advice?
A: Please contact your NVIDIA partner or NVIDIA sales contact in the first instance. Another option is to ask questions and share information on the forums (https://gridforums.nvidia.com); that way the information becomes public, you get input from multiple experts, and you are covered when we go on holiday, etc.
Q: I'm about to purchase an HP DL380 for VDI of Autodesk Revit. It looks like the GRID K2 can do a maximum of 8GB. Will that be very good for my users?
A: The K2 and M60/M6 cards are usually the best choices for 3D CAD. You should choose a vGPU profile with a framebuffer appropriate for the models you are using and the demands of the operations users perform in the application. Sizing really does depend on what those users are doing and how many users you try to put on a server; have a look at some of the case studies from us and partners, particularly from AEC, many of which give details of Revit deployments: https://virtuallyvisual.wordpress.com/useful-links/remote-graphics-case-studies/. We also have an application guide for Revit that should give you some data points to reference.
Q: Did either of the presenters use Citrix appVolumes?
FRED's answer: We are not using Citrix appVolumes in production today. However, we are evaluating its use to determine if we can support our designer applications with this newer technology offering from Citrix. We currently support over 300 application titles in our designer virtual desktop.
JIMMY: We are not using Citrix, we have a VMware environment.
Q: Are we able to load our own software on the GRID trial platform?
A: The TryGRID platform (http://www.nvidia.com/object/vmware-trygrid.html) allows users to try GRID with a number of common applications including Autodesk AutoCAD, Dassault Systèmes SOLIDWORKS, Esri ArcGIS Pro, Siemens NX. It is not possible due to licensing and security constraints for users to bring their own applications. Users can however upload their own CAD parts and data for evaluation with the applications available.
Some partners are also running extended trials where bring-your-own-apps may be an option, such as this program from IMSCAD: http://www.imscadglobal.com/try-grid-extended.php. These are often paid-for services.
Other options include contacting a local partner who may be able to offer a test environment. Many of Citrix/VMware's partners also offer such a facility. The NVIDIA partner locator can be found here: http://www.nvidia.com/object/partner-locator.html
Q: Have either of the presenters had any problems with latency, how far away are their servers from the users?
A: Latency and bandwidth, together with the users' requirements, are often the limiting factors in these deployments. Typically, for a picky CAD designer, less than 80-100ms of network latency is ideal, but we have customers with up to 400ms, and 200ms is also common. For companies like Textron with huge parts, the time and bandwidth cost of moving those parts around makes virtualization a win.
FRED added: The majority of our active users are remote from the VDI stack in the datacenter; very few of our local users are using VDI at this point. Our designers and manufacturing engineers regularly use the technology at 80-150ms of latency across our global MPLS network. We also have a group of engineers that use the capability at around 180ms of latency across the public internet, through Citrix NetScaler appliance SSL VPN connections. We have successfully tested the capability at our facilities in India and Japan, some 400ms or more of latency away from the VDI stack; this is a near-zero-loss MPLS connection using the traditional HDX 3D Pro based ICA protocol. We have been evaluating Framehawk but have not made all the necessary updates on our production stack to accommodate the new technology. For our designers and manufacturing engineers in the US and EMEA we have had great success with this technology, reducing the amount of data that we must move around in order to open models. VDI has allowed us to keep that data local to the VDI stack in the datacenter, reducing the time to open and save models. This has very little impact on manipulation, but the perceived performance increase in open and save is a massive benefit to our remote teams.
JIMMY: Latency definitely plays a role in the performance of your VDI. As a general rule of thumb, we typically aim for a maximum of about 100ms of latency for optimal performance when using Revit and other modelling programs. With those guidelines, we are typically able to maintain anywhere from about 50-200ms of latency on average for most of our offices. However, our Mumbai office is the furthest from our data center and will sometimes hit over 300ms. While the performance is not ideal for them, it is definitely workable, and still allows us to collaborate in a way that would otherwise not be possible.