- This Week in AI - Weekly Newsletter
OpenAI unveils GPT-4o (Her), Facebook's Camerabuds, Google Claps Back at OpenAI, and More!
What an exciting week it has been between OpenAI's announcements and Google I/O! We'll cover it all!
As the tech giants unveil their product roadmaps, the race to dominate the AI landscape is more thrilling than ever 🚀. With the top spot still up for grabs, companies are pulling out all the stops to claim leadership in this dynamic space. Apple has yet to officially enter the race, adding an air of mystery to their AI ambitions 🤔, but we won't have to wait too long to learn what they are up to as WWDC is just a few weeks away.
Just ahead of Google I/O, OpenAI revealed its most advanced model yet, GPT-4o, but Google quickly responded with a demo of what's to come. The competition is fierce, and the innovations are game-changing 💥
Bonus points if you can spot the subtle jab OpenAI took at Google right at the top of the GPT-4o announcement page without scrolling any further! [link]
There is a lot to cover, so let's dive right in!
ThisWeek in AI - OpenAI GPT-4o (HER)
GPT-4o: Let me start by saying, this is the most impressive model I’ve interacted with to date! It’s not just the responses that are remarkable, but the way it responds, offering the most human-like interaction you’ll experience with an LLM. Is it exactly like talking to a human? Of course not, but it’s closer than any other LLM out there.
Watch this incredibly impressive demo showcasing how GPT-4o effortlessly facilitates a fun and engaging game of rock-paper-scissors, complete with humor and ease.
Okay, one more! The humanness of these interactions is seriously impressive.
The 'o' in GPT-4o stands for 'omni': the model accepts text, audio, and visual inputs and can produce the same in its outputs, unlike its predecessor, which handed information between multiple models until the desired output was achieved. This update gives GPT a more human personality and reduces the average response time for audio inputs to just 320 milliseconds (down from 5.4 seconds in GPT-4), similar to how humans respond to each other in conversation.
I watched almost all the demos released yesterday and here are the use cases this new update unlocks.
Travel and Accessibility Guide: Whether it's recognizing landmarks, translating in real time, or reading signs and menus, GPT-4o can do it all!
All-Knowing Tutor/Helper: Whether you're trying to learn math, cooking a new dish, or putting IKEA furniture together, GPT-4o can help you with all of it and more!
AI Working with AI: AIs can now work together to make progress on a specific task, negotiating with each other and sharing information. Or, as Bumble's founder put it, your AI concierge could go on a date with another AI concierge to check for compatibility before you match.
True Work Partner: What if you could share your screen with GPT-4o while working on a presentation, a document, or a coding problem, and let it help you in real time with suggestions and unblocking? Or send GPT-4o to attend a meeting on your behalf and report back with a summary?
Personal Assistant: GPT-4o isn't your ultimate personal assistant yet, but this latest update puts it on that path. It can be your brainstorming partner, style guide, business advisor, and more. While it can't yet perform every task a human assistant can, such as finding the best restaurant and making reservations or building a business plan from rough ideas, we're getting closer to that reality. A future where GPT-4o handles these tasks is exciting, and it feels like only a matter of time before it reaches that level of functionality.
More highlights from the announcement:
Launch of ChatGPT Desktop App and Updated UI:
OpenAI introduced a new desktop version of ChatGPT, complete with a refreshed user interface that promises more intuitive and natural interactions.
Advanced Features and Greater Accessibility:
The new voice mode operates seamlessly within GPT-4o, minimizing latency and making conversations flow more naturally.
Enhanced vision capabilities allow users to upload and interact with images, screenshots, and documents containing text.
Memory features provide continuous context across sessions, making interactions more personalized and helpful.
Real-time browsing and advanced data analysis tools enable users to access current information and analyze complex data sets with ease.
Real-Time Conversational Speech and Emotion Detection:
Live demonstrations highlighted real-time speech capabilities, where users can interrupt and interact without any delay.
The model can detect and respond to emotions, enhancing the empathy and natural feel of conversations.
Expanded Language Support and API Availability:
GPT-4o now offers improved performance in 50 different languages, making it accessible to a broader global audience.
Developers can now access GPT-4o via API, allowing them to create cutting-edge AI applications with double the speed, half the cost, and five times the rate limits compared to GPT-4 Turbo.
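For developers wondering what that API access looks like in practice, here is a minimal sketch using the official `openai` Python client. The model name `gpt-4o` matches the announcement, but treat the exact parameters and prompt here as illustrative assumptions; check OpenAI's API reference for the authoritative details.

```python
# Minimal sketch: calling GPT-4o via the OpenAI Chat Completions API.
# Requires `pip install openai` and an OPENAI_API_KEY environment variable
# for the live call; building the request itself needs neither.
import os

def build_chat_request(user_message: str) -> dict:
    """Assemble the parameters for a GPT-4o chat completion request."""
    return {
        "model": "gpt-4o",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
    }

def ask_gpt4o(user_message: str) -> str:
    """Send the request to the API; needs a valid OPENAI_API_KEY."""
    from openai import OpenAI  # imported lazily so the sketch runs without the package
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = client.chat.completions.create(**build_chat_request(user_message))
    return response.choices[0].message.content

if __name__ == "__main__":
    request = build_chat_request("Summarize this week's AI news in one sentence.")
    print(request["model"])
```

The same endpoint shape works for GPT-4 Turbo; switching models is just a change to the `model` field, which is what makes the "double the speed, half the cost" comparison a drop-in upgrade for existing apps.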
Watch this mind-blowing demo of GPT-4o playing a tutor
Are you excited yet?! I certainly am, not only to get my hands on this latest update but also to see how developers innovate with this new model! Watch the full announcement video below and more impressive demos here.
Oh, and before wrapping up this section, I have to share something amusing. Remember when Google released a video last year demonstrating its vision capabilities, only to later reveal that it had been edited to cut out the latency? Well, OpenAI cheekily called them out on the GPT-4o announcement page 😆
Google didn’t take these announcements lying down. They showcased their own technology in a demo, offering a glimpse of what’s to come at Google I/O. Check out the demo below.
One more day until #GoogleIO! We’re feeling 🤩. See you tomorrow for the latest news about AI, Search and more.
— Google (@Google)
4:22 PM • May 13, 2024
Excited about an AI startup or a product? Let us know at [email protected]
ThisWeek in AI - Art
Randy Travis's team released a song using an AI-cloned version of his voice, which debuted at #45 on the Country Airplay chart. Listen to it via the Spotify link below!
ThisWeek in AI - Interesting Reads
In a world where Artificial Intelligence moves at lightning speed, falling behind isn't an option. Let ThisWeek.AI be your guide. We bring the latest AI breakthroughs, startups, and products directly to your inbox, every week! Stay ahead of the curve, subscribe to ThisWeek.AI for free now!