The speed at which AI seems to be advancing is unbelievable. Before the end of last year, most of us thought that AI wouldn’t be individually useful for a few years yet. Then ChatGPT hit, backed by GPT-3 technology, followed quickly by GPT-4, and suddenly, we are surrounded by artificial intelligence that can improve our productivity.
One of the most interesting is the video synthesis AI model by DeepBrain AI, which can create a working digital twin of a human that can increasingly take on tasks that the human it mirrors would typically have to do.
Let’s chat about the emergence of the useful human digital twin this week. We’ll close with my Product of the Week, Lenovo’s Motorola ThinkPhone, a smartphone that addresses one of the computer industry’s biggest mistakes in its response to Apple’s iPhone.
Video Synthesis AI Models
Initially focused on news anchor talent, DeepBrain AI’s offering is the first human digital twin I’ve looked at that can perform tasks indistinguishably from how the human would have done them.
The digital twin is created by allowing it to learn from a host of news videos to create a database of knowledge on the human presenter’s behavior, quirks, speech, and movements.
The resultant digital twin can be fed a script, and from that script, it will perform as the news presenter would have. The digital twin does not replace the human talent, and generally, the talent is compensated every time the news service uses this avatar, primarily for breaking news or short format teases for upcoming live programming.
Sometimes the news services will note that the audience is watching an AI-generated avatar. Where this technology is already in use, it has been accepted by the news audience and the talent.
The news talent is okay with this because they are compensated when the avatar is used, and using the avatar means they don’t have to drive back to a studio to tape or broadcast the short segments, updates, or announcements that would usually require the trip.
So, the avatar doesn’t hurt their income, and it supplements humans’ work by reducing their load and related aggravation, all of which should improve job satisfaction.
Avatar Use Cases
While most of the initial use cases for this technology are for video news programs where the avatar and newsperson are indistinguishable (example in Asia), other uses include:
- Virtual kiosks in banks where it looks like you are talking to a human but are instead chatting with an AI
- One-way training videos working from a written script
- A virtual concierge at a hotel that can assist with things like restaurant reservations or show tickets
- Interactive videos where, again, it feels like you are talking to a live person
Some of these avatars were never real people; they were computer-generated, highly realistic images.
Cost advantages are significant as it typically costs around $4,000 to create some of this short-form content with a live person, but it only costs around $100 to do the same thing with a computer-generated avatar.
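To put those two figures side by side, here is a minimal arithmetic sketch. The $4,000 and $100 amounts are the rough per-piece costs quoted above; the count of 50 pieces is a hypothetical volume chosen only for illustration, not a number from DeepBrain AI.

```python
# Approximate per-piece costs quoted in the article (short-form content).
LIVE_COST_USD = 4_000
AVATAR_COST_USD = 100


def cost_comparison(live_cost: float, avatar_cost: float, pieces: int = 1) -> dict:
    """Compare total production cost for a given number of short-form pieces."""
    live_total = live_cost * pieces
    avatar_total = avatar_cost * pieces
    savings = live_total - avatar_total
    return {
        "live_total": live_total,
        "avatar_total": avatar_total,
        "savings": savings,
        "savings_pct": 100 * savings / live_total,
        "cost_ratio": live_cost / avatar_cost,
    }


# Hypothetical volume: 50 short segments.
result = cost_comparison(LIVE_COST_USD, AVATAR_COST_USD, pieces=50)
print(result)
```

At those rates, the avatar route costs one-fortieth as much per piece, a saving of 97.5% regardless of volume.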
The avatar doesn’t require rehearsal and will work from the written script without distractions like illness or any of the conflicts or behavioral problems typically associated with live talent.
Given that the avatar works from text, it can be controlled by other AI like ChatGPT or IBM’s Watson, which creates a level of human-like interactive content that could fool many people into thinking they are talking to a live person.
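Because text is the only interface between the two systems, the hand-off can be pictured as a simple two-step chain. The sketch below is purely illustrative: `generate_reply` and `render_avatar_clip` are hypothetical placeholders, not real DeepBrain AI, ChatGPT, or Watson calls, and the canned reply stands in for whatever a text-generating AI would return.

```python
from typing import Callable


# Hypothetical stand-in for a text-generating AI; returns a canned answer.
def generate_reply(user_message: str) -> str:
    return f"Thanks for asking about '{user_message}'. Here is what I can tell you."


# Hypothetical stand-in for an avatar renderer that turns a script into a clip.
def render_avatar_clip(script: str) -> dict:
    return {"script": script, "format": "video", "words": len(script.split())}


def respond(
    user_message: str,
    text_model: Callable[[str], str] = generate_reply,
    renderer: Callable[[str], dict] = render_avatar_clip,
) -> dict:
    """Chain the two steps: user text in, generated script, rendered avatar clip out."""
    script = text_model(user_message)
    return renderer(script)


clip = respond("restaurant reservations")
print(clip["script"])
```

Either stage could be swapped out without touching the other, which is what would let one avatar sit in front of different text-generating systems.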
Another use that has been explored is celebrity advocacy:
Celebrities don’t scale well because if you use them as advocates, the ability for individuals to chat with them is almost non-existent. However, an avatar of a celebrity could interact with fans at scale, in addition to the client’s current or future customers.
As noted, this technology can scan real people and artistically created characters, which could solve some of the problems that companies like McDonald’s (Ronald McDonald), Kentucky Fried Chicken, and Jack in the Box had with using live people, where all would regularly swap actors so that the actor wasn’t so tightly tied to the character that they couldn’t be replaced.
With a digitally created avatar, the firm owns the virtual actor, and the human-sourced problems evaporate once you remove the humans from the mix.
Human and Virtual Digital Twins – What Lies Ahead
The current focus of DeepBrain is on enhancing, not replacing, people for the most part. However, their use of fully synthesized avatars that have no connection to any human is more of a replacement than an enhancement model. While they are focused initially on short-form content, nothing prevents the technology from eventually moving to long-form productions like TV shows and movies.
The hyper-realistic nature of the avatars will improve with additional training and as the technology advances so that, even in long form, the virtual actors will become indistinguishable from real people even though they are amalgams of those people, much like products such as DALL-E build art from amalgams of images.
Given the massive cost advantages of using virtually created content over live action, the potential for technology like this to disrupt the media industry is significant. Looking ahead, it isn’t just the cost of the talent being avoided. The entire cost of the studio where the talent would otherwise perform could also be eliminated.
Since GPT-4 is already doing interesting work with scripts and stories, you can put this on the roadmap to having complete photorealistic movies and TV shows created entirely by AIs dynamically based on user preferences.
In the end, rather than watching the same TV show and movie as everyone else, this technology, combined with generative AI, could create customized videos at scale and potentially put you and your family in as the main actors (with your permission, of course).
Granted, you could then share those videos over social media with those interested in seeing what others create, potentially creating massive amounts of unique content that services would need to analyze and present to a world of potential customers.
Being able to create your twin to do some of your work, which is where DeepBrain AI is currently focused, is game-changing. But when we can take synthesized images and do the same thing, particularly for long-form content, it will massively disrupt all forms of entertainment. The pornography industry, for instance, is already all over this. Reddit content has been doing this for a while, and most users don’t seem to care.
Everything from in-game non-player characters (NPCs) that present like real people to entire virtual sports teams with accurate representations of real, imagined, or even dead players is potentially on the table, all of which suggests a level of disruption we are only beginning to see.
In short, it is already difficult to tell what is real and what isn’t, and that difficulty will only grow. When it comes to entertainment, this may turn out to be a good thing, but when it comes to our ability to see the truth, it may have a much more problematic impact. We aren’t anywhere near ready for that.
Lenovo ThinkPhone by Motorola
One of the biggest mistakes the computer industry made was pivoting fully to the iPhone. The irony was that the industry initially didn’t believe in a consumer-focused smartphone. Then, rather than fighting that trend on its merits, it pivoted to follow the iPhone, which turned Apple from a late follower into a market leader seemingly overnight.
This wasn’t the first time it had happened. A few decades earlier, IBM attempted to pivot to client/server computing and almost abandoned the mainframe, which took the company from clear market leader to almost out of business within a few years.
Today, there is an underserved market regarding business-focused smartphones. I once spoke to the then-CEO of Bank of America, who didn’t want his people using consumer phones. He wanted something secure and business-focused like the BlackBerry once was, but he was frustrated that no one had built such a device.
Well, Lenovo just fixed that with its iconic ThinkPhone.
The Lenovo ThinkPhone by Motorola (Image Credit: Lenovo)
Building off the heavily business-focused brand of the ThinkPad that originated with IBM, the ThinkPhone has similar dimensions to an iPhone but is wrapped with several unique business-focused features.
These features include:
- Instant Connect: Phone and PC seamlessly discover when nearby and connect over Wi-Fi.
- Unified Clipboard: Seamlessly transfer copied text or recent photos, scanned documents, and videos between devices by pasting them into any app on the destination device.
- Unified Notifications: Phone notifications instantly appear on the Windows Action Center. Clicking a notification auto-launches the corresponding phone app on the PC’s screen.
- File Drop: Easily drag and drop files between ThinkPhone and PC.
- App Streaming: Open any Android application directly on a PC.
- Advanced Webcam: Take advantage of the powerful ThinkPhone cameras and AI capabilities, seamlessly using it as your webcam for all your video calls. Why buy a separate webcam when you already have a better camera on your smartphone?
- Instant Hotspot: Connect to the internet through one click directly from the PC to leverage the ThinkPhone’s 5G connectivity. This is huge and potentially mitigates the need and cost for WAN capability in your PC.
Like the ThinkPad, the ThinkPhone is wrapped with security and tested to Military Standard 810H (MIL-STD-810H). It is built from aramid fiber (used in bulletproof vests), aircraft-quality aluminum, and Victus, the most robust solution by Gorilla Glass.
The ThinkPhone is waterproof up to a depth of 1.5 meters for up to 30 minutes. It even has a red button to launch a critical application. I typically pick the camera as that is what I most often need to access quickly. Others might use it to reimplement push-to-talk for police, security, and other uses where instant communication is critical (this is supported in the Microsoft Teams Walkie Talkie app).
Designed to embrace remote management, ThinkPhone can be centrally configured and managed to ensure the device’s security and that it is not used inappropriately, a typical requirement for a business-oriented computing device. ThinkPhone has a separate processor called Moto KeySafe that isolates PINs, passwords, and cryptographic keys, keeping them in a tamper-resistant environment so bad actors can’t access them.
ThinkPhone comes with a unique and very small 68W universal charger that will charge the phone in minutes and is also strong enough to power most business-focused laptops or other USB-C devices — though not gaming machines or workstations.
Finally, the phone sports a high-quality 50 MP camera that should cover most photo needs, whether you are capturing a personal event or you are an insurance investigator or someone else who needs to create a high-quality record.
The ThinkPhone fills the void in business phones that has existed since BlackBerry and Palm exited the market, and it is my Product of the Week.