Now that version 2.1 of the SketchUp OpenAI Explorer extension is available, it is easier to test the AI's capabilities with the most recent GPT-4 model. In this post, I compare the gpt-3.5-turbo and gpt-4 models side by side using a few common examples. If you want to give this a try, too, make sure you have access to the gpt-4 model and then simply replace the model name in my SketchUp extension's settings dialog.

The extension page on my other website features a longer list of successful prompts than what I am able to cover here. Feel free to give those a try, too.

Example 1: Draw a Box

For this example, I asked the AI to draw a 2′ (i.e. “two foot”) box. I wanted to test not only whether the box would be drawn correctly, but also whether the foot tick mark would be interpreted as the foot unit. The images below show what I got with that request:

Model: gpt-3.5-turbo
Result: This generated a box that measured 2″ x 2″ x 2″ instead of the specified two feet. It also created the box downward from the ground.
Model: gpt-4
Result: This correctly generated a 2′ x 2′ x 2′ box above ground level.

While both runs created boxes (cubes, actually), the gpt-4 model interpreted the request better than the gpt-3.5-turbo model. As you can see, the gpt-4 model even reversed the created face so that the box would be extruded upwards. It did all of this, however, in significantly more time (16.96 seconds vs. 6.19 seconds).
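For reference, the SketchUp Ruby code for this task looks roughly like the sketch below. This is an illustration of the approach, not the verbatim output of either model; note the face reversal that makes the pushpull extrude upward (faces created on the ground plane in SketchUp typically point downward):

model = Sketchup.active_model
entities = model.active_entities
size = 2.feet # SketchUp's Length arithmetic: 2.feet converts to 24 inches internally

# Draw a square face on the ground plane
face = entities.add_face([0, 0, 0], [size, 0, 0], [size, size, 0], [0, size, 0])

# Reverse the face if it points downward so that pushpull extrudes the box upward
face.reverse! if face.normal.z < 0
face.pushpull(size)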

Example 2: Select Something

In this example, I asked the AI to select all components in the model (and not select anything else: faces, lines, etc.). I intentionally did not use the word “instances” and wanted to see whether the AI would correctly interpret my request as “instances” and not “definitions” (which cannot be selected, of course). This process is usually more reliable if you ask to clear the selection first, which is why I included “select none” in my request.

Model: gpt-3.5-turbo
Result: This correctly selected all component instances
Model: gpt-4
Result: This correctly selected all component instances

While gpt-4 again took longer to execute, its generated code was again a bit cleaner (thanks to its use of the .grep method). The result was the same for both runs, however: all of the component instances were selected.
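As a rough sketch (again, not the models' verbatim output), the .grep-based approach looks something like this:

model = Sketchup.active_model
selection = model.selection
selection.clear # "select none" first, as requested

# .grep filters the entities down to component instances only,
# skipping faces, edges, groups, etc.
instances = model.active_entities.grep(Sketchup::ComponentInstance)
selection.add(instances)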

Example 3: Colorize Objects

This request not only asked the AI to create new SketchUp materials and assign them to entities in the model; it also required it to interpret a theme (“fall” in my case) and derive a set of colors from it.

Model: gpt-3.5-turbo
Result: This correctly colored all objects
Model: gpt-4
Result: This correctly colored all objects

Once again, gpt-4 took much longer to execute, but its code is a bit cleaner because it used the .sample method on the color array (instead of generating a random index). The result was similar for both runs, however.
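A minimal sketch of this approach follows; the palette below is hypothetical, and the actual material names and colors the AI chose may differ:

model = Sketchup.active_model
materials = model.materials

# Hypothetical fall palette (names and RGB values are illustrative)
palette = {
  'Fall Orange' => [204, 85, 0],
  'Fall Red' => [139, 35, 35],
  'Fall Yellow' => [218, 165, 32],
  'Fall Brown' => [101, 67, 33]
}
fall_materials = palette.map do |name, rgb|
  material = materials.add(name)
  material.color = Sketchup::Color.new(*rgb)
  material
end

# Assign a random fall material to each top-level group or component instance
model.active_entities.each do |entity|
  next unless entity.is_a?(Sketchup::Group) || entity.is_a?(Sketchup::ComponentInstance)
  entity.material = fall_materials.sample # .sample picks a random element
end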

Summary

While the gpt-4 model appears to generate slightly better SketchUp Ruby code, the gpt-3.5-turbo model generally creates good, usable code that is quite accurate in many cases. It is also much faster at executing requests, and – quite importantly – much cheaper to use (see the current pricing below).

Model: gpt-3.5-turbo
Pricing: Input: $0.0015 / 1K tokens, Output: $0.002 / 1K tokens

Model: gpt-4
Pricing: Input: $0.03 / 1K tokens, Output: $0.06 / 1K tokens

Source: OpenAI Pricing
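To put these numbers into perspective with a hypothetical request: at 1,000 input tokens and 500 output tokens, gpt-3.5-turbo costs about $0.0025 ($0.0015 + $0.001), while gpt-4 costs about $0.06 ($0.03 + $0.03), roughly 24 times as much.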

I hope this comparison helps you when you work with the SketchUp OpenAI Explorer extension. Did you do any comparisons yourself? Let me know in the comments below.
