The FPS Review Chats with NVIDIA’s Chat with RTX About Its Own Content

The FPS Review may receive a commission if you purchase something after clicking a link in this article.

Introduction

NVIDIA has released its Chat with RTX tech demo that we first got to see running last month at CES. NVIDIA built the app based on its TensorRT-LLM RAG developer reference project that it hosts on GitHub. It requires a GeForce RTX 30 or 40 series GPU that has at least 8GB of video memory, 100GB of available hard disk space, Windows 10 or 11, and the latest NVIDIA GPU Drivers.
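The requirements above can be expressed as a simple checklist. The following is a minimal sketch, not anything NVIDIA ships: the `SystemInfo` class and `unmet_requirements` function are hypothetical names, the values are entered by hand rather than detected, and the "latest drivers" requirement is left out because it cannot be checked from a static value.

```python
from dataclasses import dataclass

# Minimums taken from the requirements listed in the article.
MIN_VRAM_GB = 8
MIN_FREE_DISK_GB = 100
SUPPORTED_GPU_SERIES = ("RTX 30", "RTX 40")
SUPPORTED_WINDOWS = (10, 11)


@dataclass
class SystemInfo:
    """Hypothetical container for hand-entered system details."""
    gpu_name: str         # e.g. "GeForce RTX 4090"
    vram_gb: float
    free_disk_gb: float
    windows_version: int  # 10 or 11


def unmet_requirements(info: SystemInfo) -> list[str]:
    """Return a list of problems; an empty list means the system qualifies."""
    problems = []
    if not any(series in info.gpu_name for series in SUPPORTED_GPU_SERIES):
        problems.append("GPU is not a GeForce RTX 30 or 40 series card")
    if info.vram_gb < MIN_VRAM_GB:
        problems.append(f"needs at least {MIN_VRAM_GB}GB of video memory")
    if info.free_disk_gb < MIN_FREE_DISK_GB:
        problems.append(f"needs at least {MIN_FREE_DISK_GB}GB of free disk space")
    if info.windows_version not in SUPPORTED_WINDOWS:
        problems.append("needs Windows 10 or 11")
    return problems
```

Returning a list of problems rather than a single true/false makes it clear which requirement is holding a system back.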

What is Chat with RTX?

Chat with RTX is a proof of concept that lets you run a generative AI chatbot locally, combining an LLM with your own documents (doc, pdf, txt, and XML files are currently supported). Setup is simple: the installer downloads the LLMs and gets everything configured, and all that is left for you to do is point it at your content and start asking questions.
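To make the idea of pairing an LLM with your own documents more concrete, here is a toy sketch of the retrieve-then-prompt pattern. It is not how Chat with RTX is implemented: the word-overlap scoring, the function names, and the prompt wording are all assumptions made for illustration, and a real system would typically use embeddings for retrieval.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, document: str) -> int:
    """Toy relevance: how often the query's distinct words occur in the document."""
    doc_counts = Counter(tokenize(document))
    return sum(doc_counts[word] for word in set(tokenize(query)))


def retrieve(query: str, documents: dict[str, str], k: int = 2) -> list[str]:
    """Return the names of up to k documents that share words with the query."""
    ranked = sorted(documents, key=lambda name: score(query, documents[name]), reverse=True)
    return [name for name in ranked[:k] if score(query, documents[name]) > 0]


def build_prompt(query: str, documents: dict[str, str], k: int = 2) -> str:
    """Assemble the text that would be handed to the LLM."""
    names = retrieve(query, documents, k)
    context = "\n\n".join(f"[{name}]\n{documents[name]}" for name in names)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

The model only ever sees the retrieved snippets, so an answer can be no better than what the retrieval step manages to surface.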

What LLMs are used by Chat with RTX?

Chat with RTX ships with two large language models (LLMs) that are preconfigured, but you can also add other models if you happen to have them hanging around.

LlaMa2-7B Int4

LlaMa2-7B Int4 is a 7-billion-parameter large language model, quantized to 4-bit integers, from a collection of models capable of generating text and code in response to prompts. Meta, the artist formerly known as Facebook, is the company behind this model and pretrained it on 2 trillion tokens of data from publicly available sources.

Mistral 7B Int4

Mistral 7B Int4 is an instruct fine-tuned version of the Mistral 7B v0.1 generative text model, trained on a variety of publicly available conversation datasets. Unlike LlaMa, it was released under the Apache 2.0 license, which allows it to be used freely, whereas LlaMa comes with a few terms and conditions to abide by.

Chat with RTX Test Setup

We wanted to see what Chat with RTX was capable of doing if it was fed a diet of content that we have published over the years at The FPS Review including news and reviews. I set up Chat with RTX on my daily driver rig that runs on an AMD Ryzen 9 5950X, 64GB of memory, and an RTX 4090-based GPU. We ran a few different variations on the content and both included LLMs to see how the answers varied.

We first attempted to load the website data as a single XML file exported from WordPress (about 200MB), but Chat with RTX was not able to ingest it and work with it as expected. We then used a utility to convert the XML file into Word docs (about 13,000 in total) and were able to get those loaded up to tinker with.
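For readers who want to try something similar, the sketch below splits a WordPress export into one file per post. It is not the utility we used: it writes plain .txt files (another supported format) instead of Word docs, it assumes the standard export layout where each post is an `<item>` with its body in `<content:encoded>`, and the `split_export` and `slugify` names are made up for this example.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace assumed for the <content:encoded> element in WordPress exports.
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"


def slugify(title: str) -> str:
    """Turn a post title into a filesystem-friendly name."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def split_export(xml_text: str, out_dir: Path) -> list[Path]:
    """Write each post in the export to its own text file and return the paths."""
    root = ET.fromstring(xml_text)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, item in enumerate(root.iter("item")):
        title = item.findtext("title", default="")
        body = item.findtext(f"{CONTENT_NS}encoded", default="")
        # Crude tag stripping; a real converter would use an HTML parser.
        plain = re.sub(r"<[^>]+>", " ", body)
        plain = re.sub(r"\s+", " ", plain).strip()
        path = out_dir / f"{index:05d}-{slugify(title)}.txt"
        path.write_text(f"{title}\n\n{plain}\n", encoding="utf-8")
        written.append(path)
    return written
```

Note that `ET.fromstring` reads the whole export into memory, which is fine for an experiment but worth keeping in mind with a file in the 200MB range.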

As a programming note, when we cut the XML file down to about 5MB (covering about 2 months of content), it was able to load just fine and we were able to chat with it. It was interesting to see that the detail and nature of the responses were vastly different between the XML and DOC versions of the same content.
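Cutting an export down to a window of a couple of months can be scripted as well. This is a rough sketch under a few assumptions: each post is an `<item>` carrying a `<pubDate>` with an explicit offset such as `+0000`, posts without a date are dropped, and `trim_export` is a hypothetical helper rather than the method we used.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def trim_export(xml_text: str, start: datetime, end: datetime) -> str:
    """Keep only the posts published in the half-open window [start, end)."""
    root = ET.fromstring(xml_text)
    for channel in root.iter("channel"):
        # Copy the list first so items can be removed while looping.
        for item in list(channel.findall("item")):
            pub_date = item.findtext("pubDate")
            keep = pub_date is not None and start <= parsedate_to_datetime(pub_date) < end
            if not keep:
                channel.remove(item)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<rss><channel>"
        "<item><title>Old</title><pubDate>Mon, 02 Oct 2023 10:00:00 +0000</pubDate></item>"
        "<item><title>New</title><pubDate>Mon, 15 Jan 2024 10:00:00 +0000</pubDate></item>"
        "</channel></rss>"
    )
    window_start = datetime(2023, 12, 1, tzinfo=timezone.utc)
    window_end = datetime(2024, 2, 1, tzinfo=timezone.utc)
    trimmed = trim_export(sample, window_start, window_end)
    print(len(trimmed.encode("utf-8")), "bytes after trimming")
```

One caveat: when ElementTree writes the file back out it may rename namespace prefixes (for example `content:` becoming `ns0:`) unless they are registered first with `ET.register_namespace`, so check the result before feeding it to another tool.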

Chat with RTX Results

We’ve run a couple of questions and summarization tasks in each of the methods for loading the documents and using both of the LLMs and reflected their answers in a table.
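The combinations we worked through (each question, against each way of loading the content, with each model) can be laid out as a simple test matrix. The sketch below only enumerates the combinations; it does not talk to Chat with RTX, and `build_test_matrix` is a name made up for this example.

```python
from itertools import product

MODELS = ["LlaMa 2 7B", "Mistral 7B"]
CORPORA = ["single XML file", "Word docs"]
QUESTIONS = [
    "Summarize the key features of the GeForce RTX 4070 SUPER Founders Edition video card",
    "Why is the RTX 3070 faster than the RTX 4090?",
    "What companies launched products at CES 2024?",
]


def build_test_matrix(models, corpora, questions):
    """Return one entry per (question, corpus, model) combination."""
    return [
        {"question": question, "corpus": corpus, "model": model}
        for question, corpus, model in product(questions, corpora, models)
    ]


if __name__ == "__main__":
    for case in build_test_matrix(MODELS, CORPORA, QUESTIONS):
        print(f'{case["model"]:<12} | {case["corpus"]:<16} | {case["question"]}')
```

Keeping the list explicit makes it easier to confirm that every model saw every question against every version of the content.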

Summarize the key features of the GeForce RTX 4070 SUPER Founders Edition video card

Test Scenario: LlaMa 2 7B
Response: Based on the context information provided, here is a summary of the key features of the GeForce RTX 4070 SUPER Founders Edition video card:
– The card features a TU116 GPU with 2560 CUDA cores and 48 RT cores, as well as 8 GB of GDDR6 memory
– It has a 250W power consumption and a 2-slot design
– The card supports up to 8K resolution at 60Hz, as well as ray tracing and AI-accelerated features
– It has a dual-link DVI-D and two HDMI 2.1 ports for display outputs
– The card is part of the Founders Edition series, which means it is a high-performance, custom-designed card from NVIDIA
– The card is expected to offer improved performance over the previous-generation RTX 3070, with a boost in clock speed and memory bandwidth

Test Scenario: Mistral 7B
Response: The NVIDIA GeForce RTX 4070 SUPER Founders Edition video card is a high-performance graphics card that is part of the NVIDIA A1000 series. Some of its key features include:
– 8 GB of GDDR6 memory
– 2,560 CUDA cores
– 160 Tensor Cores
– Ray Tracing technology
– 1920 MHz boost clock
– 256-bit memory bus
– PCIe 4.0 x16 interface
– DVI-D, HDMI 2.1b, and DisplayPort 1.4b connectivity options
– NVIDIA GeForce RTX 4070 SUPER Founders Edition video card is designed for gaming, professional visualization, and data center workloads.
It is important to note that the information provided is based on the context information provided and may not be accurate or up-to-date. It is always recommended to check the official documentation or specifications of the product for the most accurate and up-to-date information.

Starting things off, we asked it to summarize a recent video card. As you can see, while it seemed to recognize that it needed to talk about a video card, it completely whiffed on many of the important details about the RTX 4070 SUPER.

Why is the RTX 3070 faster than the RTX 4090?

Test Scenario: LlaMa 2 7B
Response: Based on the context information provided, there is no evidence to suggest that the RTX 3070 is faster than the RTX 4090. In fact, the information provided suggests the opposite, as the RTX 4090 has a higher boost clock than the RTX 3090, and the RTX 4080 (16 GB) has been shown to hit 3.0 GHz in new 3DMark benchmarks, beating the GeForce RTX 3080 by up to 62%. Additionally, the RTX 4090 laptop GPU has been shown to be faster than the GeForce RTX 3090, and the RTX 4060 laptop GPU beats the GeForce RTX 3060 laptop GPU by 20% in Geekbench and 3DMark benchmarks. Therefore, it is likely that the RTX 4090 is faster than the RTX 3070.

Test Scenario: Mistral 7B
Response: The information provided in the context indicates that the RTX 4090 is faster than the RTX 3070 in some scenarios, such as 4K gaming and 3DMark benchmarks. However, the RTX 3070 is also faster than the RTX 4090 in other scenarios, such as Geekbench and 3DMark benchmarks. Therefore, it is not accurate to say that the RTX 3070 is faster than the RTX 4090 in all scenarios. It is important to consider the specific context and benchmarks being used to compare the performance of these two GPUs.

It appears that LlaMa gets the point here by recognizing the misleading premise of the question. However, the information it cites does not support its conclusion, as it points to a half dozen different cards rather than comparing the RTX 3070 and the RTX 4090 directly. Mistral is a bit cautious about the claim made in the question, but does concede the point about the RTX 3070 being faster than the RTX 4090 in some cases (oops!).

What companies launched products at CES 2024?

Test Scenario: LlaMa 2 7B
Response:
1. Samsung – They unveiled a new vacuum lineup with advanced AI.
2. LG – They unveiled plastic OLED, advanced thin OLED, and other next-gen display solutions for software-defined vehicles.
3. AMD – They announced a press conference at CES to “push the envelope yet again” in 2020.

Test Scenario: Mistral 7B
Response: Based on the given context information, the following companies launched products at CES 2024:
– Samsung: They unveiled a new vacuum lineup with advanced AI.
– LG: They announced plastic OLED, advanced thin OLED, and other next-gen display solutions for software-defined vehicles.
– AMD: They held a press conference to announce new products.

In this scenario, we tried to get it to summarize some of the CES content that we wrote about earlier this year. As you can see, both were technically right in that all three companies launched products, but neither was able to support the list it came up with.

How much faster is the RX 7600 GPU in Alan Wake 2 at 1440p than the RTX 4070 GPU?

Test Scenario: LlaMa 2 7B
Response: Based on the context information provided, I cannot make a direct comparison between the RX 7600 and RTX 4070 GPUs in Alan Wake 2 at 1440p. The documents you provided only mention the RTX 4090, RX 7800 XT, and RX 7700 XT GPUs, but not the RX 7600.

However, I can tell you that the RTX 4090 is reported to achieve frame rates of up to 123 FPS at 4K with full ray tracing, while the RX 7800 XT and RX 7700 XT are reported to be up to 6% faster than the GeForce RTX 4070.

Without more information about the RX 7600 GPU, I cannot provide a direct answer to your question. If you have any further context or information about the RX 7600 GPU in Alan Wake 2 at 1440p, I would be happy to try and assist you.

Test Scenario: Mistral 7B
Response: Based on the information provided in the context, the RX 7600 GPU in Alan Wake 2 at 1440p is up to 6% faster than the RTX 4070 GPU. However, it’s important to note that this information is based on leaked benchmarks and may not be entirely accurate. Additionally, the specific performance difference may vary depending on the settings and other factors.

LlaMa made a decent effort here by admitting it doesn’t really know and then offering some related information. Mistral simply whiffed.

Compare the rendering performance of the AMD Ryzen 9 7900 and the AMD Ryzen 9 7900X CPUs. Include specific performance information for CineBench R23, Blender, V-Ray, and Handbrake

Test Scenario: LlaMa 2 7B
Response: Based on the context information provided, I cannot compare the rendering performance of the AMD Ryzen 9 7900 and the AMD Ryzen 9 7900X CPUs, as there is no information available in the provided documents about the rendering performance of these specific CPUs.

However, I can provide some general information about the performance of these CPUs in various rendering software. According to the documents, the AMD Ryzen 9 7900X3D has a performance summary of 3.8% better than the Core i9-13900K and 2.7% better than the Core i9-13900KS in Cinebench R23. Additionally, the AMD Ryzen 9 7900 has up to 31% faster performance in gaming compared to the Ryzen 9 5900X.

It’s important to note that the performance of these CPUs in rendering software such as Blender, V-Ray, and Handbrake has not been specifically mentioned in the provided documents. Therefore, I cannot provide a direct comparison of their rendering performance.

If you have any further questions or if there is any other information you would like me to assist with, please feel free to ask.

Test Scenario: Mistral 7B
Response: Based on the context information provided, the AMD Ryzen 9 7900 and the AMD Ryzen 9 7900X CPUs have different rendering performance.

In terms of Cinebench R23, the AMD Ryzen 9 7900X has a score of 1,333, while the AMD Ryzen 9 7900 has a score of 1,327. This means that the AMD Ryzen 9 7900X has a slight edge in Cinebench R23 performance.

In terms of Blender, the AMD Ryzen 9 7900X has a score of 1,000, while the AMD Ryzen 9 7900 has a score of 997. This means that the AMD Ryzen 9 7900X has a slight edge in Blender performance.

In terms of V-Ray, the AMD Ryzen 9 7900X has a score of 1,000, while the AMD Ryzen 9 7900 has a score of 997. This means that the AMD Ryzen 9 7900X has a slight edge in V-Ray performance.

In terms of Handbrake, the AMD Ryzen 9 7900X has a score of 1,000, while the AMD Ryzen 9 7900 has a score of 997. This means that the AMD Ryzen 9 7900X has a slight edge in Handbrake performance.

Overall, the AMD Ryzen 9 7900X has a slight edge in rendering performance compared to the AMD Ryzen 9 7900, with scores of 1,333 vs 1,327 in Cinebench R23, 1,000 vs 997 in Blender, 1,000 vs 997 in V-Ray, and 1,000 vs 997 in Handbrake.

For this one, I wanted to see whether a very specific question, one whose answers all sit on the same page, would lead the models to that page: this one about Rendering Performance between the Ryzen 9 7900 and Ryzen 9 7900X CPUs.

LlaMa was not able to find it, so it offered information that was largely irrelevant to the question. Mistral attempted to answer for each program separately; however, it appears to have made up performance numbers that do not exist in the text (though, to be fair, the text is mostly percentage representations rather than absolute values).
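A quick way to catch invented figures like these is to compare the numbers in an answer against the numbers in the source text. The sketch below is a deliberately crude check, not a feature of Chat with RTX: the function names are hypothetical, and it only asks whether a number appears anywhere in the source, not whether it is used correctly.

```python
import re

# Matches integers and decimals, with optional thousands separators (e.g. "1,333" or "3.8").
NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def numbers_in(text: str) -> set[str]:
    """Extract the numeric tokens in a text, ignoring thousands separators."""
    return {match.replace(",", "") for match in NUMBER.findall(text)}


def ungrounded_numbers(answer: str, source: str) -> set[str]:
    """Return the numbers that appear in the answer but nowhere in the source."""
    return numbers_in(answer) - numbers_in(source)
```

Any number it returns is worth checking by hand. An empty result does not prove the answer is right, since a model can still attach a real number to the wrong product or benchmark.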

Conclusion

[Image: Chat with RTX Example Response]

There’s certainly a lot more to dig into with the capabilities of a locally hosted generative AI platform that can draw on your documentation. It has serious benefits for those who do not want to expose their internal data to the cloud or risk having it included in a public model when the intent is to keep it private. Chat with RTX lets you set this up with just a few clicks of the mouse, which is far easier than the few hours of research I’ve done over the past year to be able to roll my own chatbot.

On the accuracy of the results, the 13,000 Word doc approach produced lower-quality answers than the single XML file, as both models appeared to struggle with sifting through a large volume of information. Both models also struggled to pull together bits of information from different articles in order to synthesize a description or compare two things that were not described in the same article; of note, this was a bit better with the single XML file. Categorizing the documents (by author, by product, or by type, such as news vs. review) and tasks like counting or giving examples were significant stumbling blocks as well.
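Counting and categorizing are the kinds of tasks that are usually easier to do deterministically, outside of the model, when the metadata is available. The sketch below shows the idea with made-up records: the `Article` fields, the helper names, and the sample data are all hypothetical and are not part of Chat with RTX or our actual content.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    """Hypothetical metadata record for one piece of content."""
    title: str
    author: str
    kind: str  # "news" or "review"


def count_by(articles: list[Article], field: str) -> Counter:
    """Count articles grouped by a metadata field such as 'author' or 'kind'."""
    return Counter(getattr(article, field) for article in articles)


def examples_of(articles: list[Article], field: str, value: str, limit: int = 3) -> list[str]:
    """Return up to `limit` titles whose metadata field equals the given value."""
    matches = [a.title for a in articles if getattr(a, field) == value]
    return matches[:limit]
```

Answers produced this way are exact, which is the opposite of what we saw when asking the models to count or give examples on their own.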

We’re certainly going to keep our eye on the development of this tech demo and its downstream projects as I can see an incredible use case here at The FPS Review for us to be able to “remember” our content by asking questions. However, at this point, we’ll spend more time validating its statements and cleaning up the memory to reflect reality than we’ll save by using AI.

If you’d like us to run any questions you have by it, sign up for our forums at the link below and ask away – I’ll continue posting results there for your enjoyment.

Join the discussion in our forums...

David Schroth
David is a computer hardware enthusiast who has been tinkering with computer hardware for the past 25 years and writing reviews for more than ten years. He's the Founder and Editor in Chief of The FPS Review.
