I tried Stability AI’s new image-to-3D tool — and it creates digital models in seconds

Stability AI, maker of the Stable Diffusion family of AI image models, has unveiled a new image-to-3D tool called TripoSR that can quickly turn a picture into a 3D object.

There are a growing number of generative 3D models, but what makes TripoSR stand out is the speed at which it can create a new object, and the fact that it can run on your laptop.

I was able to get the model running on my M2 MacBook Air in about 10 minutes using the Pinokio 1-click installer. It took about a minute to generate an object from a simple image.

Using a cloud version of the AI model, other users have been able to get it working inside the Apple Vision Pro, generating a 3D object from a photo and loading it as an interactive object without taking off the headset.

How does TripoSR work?

This workflow is really fun! 🤩 Create any 3D object you can imagine in Apple Vision Pro, FAST! Midjourney (or other image gen) -> TripoSR (modded) - Free USDZ Converter. More info in the thread ⬇️🥽🤯 pic.twitter.com/UsvsFkk3bK (March 6, 2024)

TripoSR is the result of a partnership between Stability AI and Tripo AI, an AI-powered 3D modelling startup from VAST AI Research.

The tool allows you to take any image, remove the background and convert it into a fully rendered 3D object that you can interact with.

The image serves as the basis for the 3D reconstruction. It runs through a pre-trained encoder, which converts it into vectors that capture the global and local features of the image.

Those vectors carry the information required to generate a 3D object. The model doesn't need any additional input, such as camera parameters or position, because TripoSR learned to "guess" this information during training.

This is why it's so fast at generation, although it's also why the back of the generated model sometimes lacks detail.

How well does TripoSR work?

The models are fun and reasonably high resolution, although in my tests the tool struggled with the rear view of a model, often rendering it blank. However, the most impressive development is the speed of generation.

It generates an OBJ file on my Mac in 30 seconds to a minute, and reportedly creates a file from an image in half a second on a machine running an NVIDIA H100 Tensor Core GPU.
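If you haven't opened an OBJ file before, it is a plain-text format: vertex lines start with `v` and face lines start with `f`, with vertex indices counted from 1. The Python sketch below is my own illustration and is not taken from TripoSR. It writes a tiny four-sided mesh and reads it back, and the helper names `write_obj` and `read_obj` are hypothetical. A real TripoSR export will be far larger and may carry extra data that this sketch ignores.

```python
from pathlib import Path


def write_obj(path, vertices, faces):
    """Write a triangle mesh as a Wavefront OBJ text file.

    vertices: list of (x, y, z) tuples
    faces: list of (a, b, c) tuples holding 0-based vertex indices
    """
    lines = ["# minimal OBJ written by an illustrative helper"]
    for x, y, z in vertices:
        lines.append(f"v {x} {y} {z}")
    for a, b, c in faces:
        # OBJ face indices are 1-based, so shift each index by one.
        lines.append(f"f {a + 1} {b + 1} {c + 1}")
    Path(path).write_text("\n".join(lines) + "\n")


def read_obj(path):
    """Read back only the vertex and face lines of an OBJ file."""
    vertices, faces = [], []
    for line in Path(path).read_text().splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if parts[0] == "v":
            vertices.append(tuple(float(p) for p in parts[1:4]))
        elif parts[0] == "f":
            # Entries may look like "3", "3/1" or "3/1/2";
            # the vertex index is always the first field.
            faces.append(tuple(int(p.split("/")[0]) - 1 for p in parts[1:]))
    return vertices, faces


# A tetrahedron: four corner points and four triangular faces.
TETRA_VERTICES = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
TETRA_FACES = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]

if __name__ == "__main__":
    write_obj("tetra.obj", TETRA_VERTICES, TETRA_FACES)
    verts, tris = read_obj("tetra.obj")
    print(len(verts), "vertices,", len(tris), "faces")
```

Because the format is this simple, a generated file can be opened in a text editor for a quick look before importing it into a 3D tool.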

The objects are interactive, and if you select the right starting picture, it does a better job of turning it into a 3D object than some other tools, including those that take a full 3D lidar scan using a phone.

What are the use cases?

This near real-time generation of a single object could lead to genuine virtual world creation on the fly, creating games that change as the user interacts.

If realized inside a virtual environment like the one on the Apple Vision Pro, users could generate new artwork or objects to populate their view, or even take a real-world object and turn it into a virtual one they can interact with while in full VR.

For now, its main use will be in creating virtual art that can be imported into Blender, Unity or Unreal Engine for use in game or virtual scene development.
