Generative AI and 3D

Endless possibilities for capturing and editing reality in 3D, the story of Augmenta, drone helicopters, quantum computing, jobs, events, and more

Apr 07, 2023

Welcome back to Startup Pirate, a newsletter about what matters in tech and startups with a Greek twist. Come aboard and join 4,530 others.

Find me on LinkedIn or Twitter,

-Alex

Generative AI and 3D

AI is revolutionizing how we capture and edit reality in 3D. Today, I’m excited to chat with Theo Panagiotopoulos, a founding team member and Senior Augmented Reality Engineer at Luma AI. In the same way that Midjourney and Stable Diffusion let us play with images using just a prompt and some imagination, Luma aims to do the same for entire scenes and objects: in other words, the rich 3D world we live in. Founded by engineers who left Apple’s AR and Computer Vision team, Luma AI is a small, remote-first team of 3D AI researchers and engineers that has raised over $28m.

Shrinking a Hollywood studio into an iPhone app won't just change media. It will disrupt key pillars of the economy from e-commerce and ads to gaming and social. We’re barely scratching the surface of the creative (and commercial) possibilities this unlocks. You can find some cool 3D scenes created with Luma AI here.

Let’s get to it.

Theo, it’s great to talk to you! I’d love to geek out on the cool technology you’re building at Luma and how you revolutionize 3D with AI. But before we do that, I’d love to learn more about your journey.

TP: Sure. My passion for 3D technologies started at the University of Patras while studying electrical engineering. A professor, Konstantinos Moustakas, introduced me to Augmented and Virtual Reality, and that’s how I fell in love with the industry. I also did a dissertation under his supervision on “Augmented Reality 3D Maps”. Then, I decided to continue my studies at Georgia Tech in Atlanta, USA, to explore Graphics, Computer Vision, and AI further. I was lucky to work with Thad Starner, a Professor and Technical Lead on Google's wearable computing group, and get exposed to the idea that technology can augment our physical and mental capabilities — what people call Human Augmentation. After university, I joined Apple’s AR team and stayed there for 3.5 years.

I mentioned some technical terms, so let me unpack them. Augmented Reality comprises two main fields that work hand in hand. The first is Computer Vision: how a camera understands space. A camera can only see pixels, so Computer Vision helps the computer give those pixels meaning; it can recognize objects, surfaces, camera orientation, etc. The second is Computer Graphics, which lets you overlay a virtual world on top of the real world. These are the two core AR technologies. Augmented Reality is one way to experience 3D. Today, when discussing 3D, we often refer to the entertainment industry (gaming or movies): a virtual world inside a screen. However, AR fuses the virtual and real worlds, helping you experience 3D in your everyday life.

During my time at Apple, I met Amit Jain, a brilliant engineer who left Apple’s Computer Vision department to build a startup, Luma, to revolutionize 3D capture. As soon as Luma started taking off, Amit contacted me, and I decided to join the team as a founding member.

You’ve been building Augmented Reality products for a while, so it would be great if you could give us a high-level, 10,000-foot view of where AR is today. What are the most compelling use cases and technologies at the moment?

TP: AR is still in its infancy with limited real-world adoption, split between entertainment apps like Pokémon GO, social media filters like the ones on TikTok or Snapchat, and utility apps like Apple’s Measure, which lets you gauge the size of real-world objects. The core limitation at the moment is that AR experiences are confined to handheld mobile devices. You need to hold up a phone and point its camera at particular spots. Let’s not forget that a phone’s screen covers only a tiny portion of a human’s field of view, so there’s limited screen real estate to display virtual content.

Apple’s Measure app: Measuring a person’s height in AR

One thing that can propel the usage of AR applications is the next generation of head-mounted displays or mixed-reality headsets. Some companies, like Meta (Meta Quest Pro) and HTC (Vive XR Elite), have already started releasing products in this space. With these headsets, AR can fill a human’s whole field of view with virtual content, and that’s orders of magnitude more than what can fit on the screen of a phone or tablet. For instance, with Quest Pro, you can overlay 3-4 virtual computer screens (screens that exist only in Augmented Reality) next to your laptop. Or you can read a webpage and have it instantly augmented with immersive content.

Meanwhile, progress in Computer Vision algorithms will further improve how a camera “understands” the real world. This can empower developers to create applications with better user experience, enabling endless use cases.

We mentioned Artificial Intelligence, Augmented Reality, and 3D. How are the different dots connected?

TP: Who can create 3D content nowadays? Big gaming and VFX studios that have an army of talented engineers and 3D artists. You usually need millions of dollars to produce content in 3D that feels real. The cost is high because a 3D artist needs to “sculpt” every surface, define the material of every object, how it reflects light, etc. Humans author all 3D content, and then conventional rendering algorithms are responsible for drawing the pixels on screen. It takes a LOT of work, making it inaccessible for 99% of creators. Artificial Intelligence changes that.

With AI, you can create a photorealistic 3D scene in minutes, simply by taking a quick video of the object or space you want to capture. Instead of painstakingly handcrafting the 3D scene bit by bit, you just open your camera, do a quick loop around the object you want to capture and let the AI do the heavy lifting. We call this 3D capture, and it's essentially the process of bringing something from the real world into the virtual world with a few taps.

Once the 3D scene is created, you can do all kinds of things with it, like share it on your website, or put virtual cameras in it and shoot impossible shots you would never be able to do in real life. You can reshoot as if you were a drone or a helicopter, pass through tight spots, use lenses that defy physical constraints, apply infinite stabilization, anything you can imagine. And the best thing is you can do all of that from the comfort of your room, so you don’t have to stress about getting the perfect shot when you are on location. Another thing you can do is view your capture in Augmented Reality, allowing you to immerse yourself in an environment as if you were there.

The breakthrough that made this all possible is a technology initially proposed in 2020 called Neural Radiance Fields, or NeRFs. This opened the gates of innovation. NeRFs use inverse rendering, meaning you feed pixels (photos or videos) into the algorithm, and the AI works out the 3D scene that produced them. AI revolutionizes 3D, as it has done with so many other industries. Everyone can create 3D scenes just with their camera, which was previously unheard of.
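(Editor's aside: for readers curious what sits underneath a NeRF, here is a minimal sketch of its rendering step. A trained NeRF returns a density and a colour for any point in space; to draw one pixel, samples along the camera ray are blended front to back using the volume rendering formula from the original NeRF paper. This is an illustration only, not Luma's implementation: the function name `composite_ray` and the sample values are made up, and the hard-coded densities and colours stand in for what a trained network would output.)

```python
import math

def composite_ray(densities, colors, deltas):
    """Blend samples along one camera ray into a single RGB pixel.

    densities: volume density (sigma) at each sample, ordered front to back
    colors:    (r, g, b) at each sample, each channel in [0, 1]
    deltas:    distance covered by each sample along the ray
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light that has not been blocked yet
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha          # contribution to the pixel
        for ch in range(3):
            pixel[ch] += weight * color[ch]
        transmittance *= 1.0 - alpha            # light left for samples behind
    return tuple(pixel), transmittance

# Hypothetical ray: empty space, then a dense red surface, then blue behind it.
pixel, leftover = composite_ray(
    densities=[0.0, 50.0, 50.0],
    colors=[(0, 1, 0), (1, 0, 0), (0, 0, 1)],
    deltas=[0.5, 0.5, 0.5],
)
print(pixel, leftover)  # the pixel comes out almost pure red
```

The empty sample contributes nothing, and the red surface is dense enough to hide the blue one behind it. Training runs this in reverse, which is the "inverse rendering" part: each rendered pixel is compared with the matching pixel in a real photo, and the network is adjusted until the two agree.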

The next big thing often starts out looking like a toy, and playing with Luma certainly feels that way. There are different ways to create 3D scenes, from video-to-3D to text-to-3D, and to view them in AR.

TP: Exactly. Our goal at Luma is to democratize the creation of 3D. Currently, the primary use cases are e-commerce (displaying retail items in 3D) and visual effects in movies and gaming (the re-shooting example I just mentioned fits quite well here). We enable the creation of 3D scenes from videos, photos, and text. Text-to-3D is still in alpha. You can enter a text prompt like “I want to create a sculpture of Plato”, and it generates a 3D model of a sculpture of Plato. Then you can port this to movies, games, you name it.

The whole Generative AI <> 3D industry and the technology we’re building are in their infancy, and it’s hard to predict fully what use cases will emerge. Thousands of early users experiment with our product, and indeed, you’re right that it does look like a toy in the sense that the exact use cases are carved out as we speak. We want to get feedback from our users, analyze the different use cases, and find out what makes sense for our business. As the technology improves, the amount of 3D content produced will skyrocket. I believe this is the only way to satisfy the demand as mixed-reality headsets enter the stage of mainstream adoption in the years to come.

What do you think is next for the Generative AI and 3D industry?

TP: As Jensen Huang (CEO of NVIDIA) said, “Every single pixel will be generated soon. Not rendered. Generated”. My belief is that Generative AI is on course to completely change 3D; it won’t happen overnight, but soon people will be able to “will” 3D worlds into existence. It will be interesting to see what kind of use cases emerge when access to 3D is democratized in this way. It’s still early days for the industry, and that’s why building in this sector is so fascinating!

Thank you so much for taking the time, Theo. It was great to talk to you!

TP: Thanks for having me, Alex.

Learn more about Luma AI on their website and look at their open roles.

Startup Jobs

Looking for your next career move? Check out job openings from Greek startups hiring in Greece, abroad, and remotely.

Here’s a list of jobs

News

Excited to work with the leading manufacturer of sub 55 lb, full-electric UAVs for the most demanding commercial & government applications. Velos Rotors raised $2m Seed from Marathon to accelerate adoption of its drone helicopters, serving customers across logistics, medical supplies, inspections and more.

SafeSize raised €14m Series A+ from 5G Ventures and other investors to further establish its 3D foot scanner solution. The system is already used by 2,500 stores in 50 countries.

Digital therapeutics startup Mindset Health raised $12m in Series A.

Smart basketball hoop startup huupe raised $11m. Genesis Ventures is among the investors.

Home services platform Douleutaras announced a €5m round from LATSCO Family Office, Apostolos Apostolakis, and other investors.

Proptech startup that helps you find a roommate, Myroomie, raised €120k.

Discussions for foreign direct investments in Greece from Amazon, Intel, and Erickson.

Demokritos National Center for Scientific Research to host deep tech startups from NATO countries at its accelerator within the next year.

New Products

Nudge, Logidot, CHARLIE, Todo App Codebase, AI-powered hiring tool by Workable

Interesting Reads & Podcasts

The story of Augmenta from the early days to the acquisition by CNH Industrial with the founders Dimitris Evangelopoulos and George Varvarelis. (link)

An introduction to quantum computing by Efthimios Kaxiras, Professor at Harvard University. (link)

Why MLOps is mostly data engineering by Kostas Pardalis. (link)

Lilian Balatsou on AI, language, ethics, and more. (link)

Thoughts on elevating product quality by Paul Stamatiou, Head of Design at Rewind AI. (link)

An introduction to Generative AI by Marily Nika, AI Product Lead at Meta. (link)

Building a drone startup with Andreas Raptopoulos, founder & CEO of Matternet. (link)

Where did the moon come from, by Lefteris Statharas. (link)

Events

“32nd Thessaloniki WordPress Meetup” by Thessaloniki WordPress Meetup on Apr 8

“Content Marketing & SEO for SaaS Companies” by Minuttia on Apr 21

“Greeking Out in Singapore” by Endeavor Greece on Apr 25

If you’re new to Startup Pirate, you can subscribe below.

Thanks for reading, and see you in two weeks!

P.S. If you’re enjoying this newsletter, share it with some friends or drop a like by clicking the buttons below ⤵️

Startup Pirate by Alex Alexakis
