Unicorn status for Synthesia start-up

Prof. Dr. Matthias Niessner. Picture: Astrid Eckert, Muenchen

Company of FAU alumnus Prof. Dr. Matthias Nießner valued at one billion US dollars

Face capture and facial reenactment are some of his specialist fields and he has been working on them for several years: Prof. Dr. Matthias Niessner, now Head of the Visual Computing Lab at TU München, is originally from FAU, where he not only completed his degree, but also gained his doctoral degree at the Chair of Computer Science 9 (Visual Computing) in 2013. The start-up, which he co-founded in 2017, has just achieved “unicorn” status. The AI video creation platform Synthesia is now valued at a billion US dollars after its lead investors, which include Accel, Nvidia and Kleiner Perkins, invested an additional 90 million dollars in the latest financing round. Synthesia enables users to produce professional videos in which AI avatars or “talking heads” explain and describe topics, depending on user requirements.

Prof. Niessner, congratulations on your success! Thanks to Synthesia, users can now create videos by themselves in which avatars that resemble humans present the required content. What came first: the idea for these talking heads or the desire to set up a company?

I had looked into deepfake technologies in great detail even before founding Synthesia in 2017. It was in this context that the paper “Face2Face: Real-time Face Capture and Reenactment of RGB Videos”, which gained a lot of attention, was written. After it was published, several colleagues asked me if I would like to found a start-up. I had already studied business as a minor subject at FAU because I always wanted to take a project out of the research “bubble” and launch it as a product on the market. All this paved the way for Synthesia. The founding members who came together all complement each other perfectly. Lourdes Agapito and I focus on the technology and understand how it works. Steffen Tjerrild and Victor Riparbelli are the experts on the business side of the company.

What are the videos generated using Synthesia used for and how does the platform work exactly?

Essentially, our platform replaces traditional production methods for videos that require staff, equipment and editing software. A typical application would be training videos that introduce new employees to their tasks. Synthesia enables users to create videos using adaptable video templates and 140 diverse avatars that replace actors and narrators. They can also choose from 120 languages and dialects so that the content can be adapted to the relevant target audience.

Today, companies such as Amazon, Accenture and Johnson & Johnson work with the Synthesia platform. On your homepage, it says that five years after setting up the company, you have 50,000 customers. How did it all start?

There wasn’t really a market launch as such. Instead, we tested the software with our first customers on a “trial and error” basis. To find out the interests and needs of companies, we started small-scale test projects worth around 20,000 euros each. This enabled us to find out which improvements needed to be made. The feedback was so positive that we were able to employ three new members of staff after only three months.

And what about now? What is the AI software currently capable of?

Two years ago, we had our last big launch, which was our cloud platform. This enables users to access Synthesia from anywhere in the world. You can also easily input the text for the talking heads yourself and there are personalized voices. To create one, a person records their voice reading a text that covers all the phonemes of a language. The software learns from this recording and transfers the sound of the voice to the final voiceover text.

Which changes or improvements are you planning with the new financing you have received?

That’s still a secret, but what I can tell you is that we want to give users even more control. In future, it will be possible to use several avatars in a video and they will be able to talk to each other. We want to improve how we integrate emotions and gestures and make creating avatars easier. Currently, we still have to film a real person and then use that video to simulate an avatar. In future, we will only need an image in order to generate a talking head.

Thank you for your time!

More information about face capture and reenactment

The paper mentioned above, “Face2Face: Real-time Face Capture and Reenactment of RGB Videos”, available at https://doi.org/10.1145/3292039, was written by Prof. Matthias Niessner with FAU alumnus Justus Thies and others.
In an FAU interview in 2016, Thies talks about facial reenactment and the software that he developed especially for this purpose: “Researchers present facial manipulation software”

Where do FAU alumni end up?

Travel photographer, quantum computing expert, hydrogen expert, FAU innovator or theater director – the list of potential professions and jobs of FAU graduates is extensive. You can find out more about former FAU students in our series of fascinating interviews at www.fau.de/alumni.