
Apple AI Study: ReALM is Smaller and Faster than GPT-4 for Contextual Data Analysis

Apple is working on bringing AI to Siri


Apple artificial intelligence research has revealed a model that could make giving Siri commands faster and more efficient by converting any given context into text that is easier for a large language model to parse.

Apple's AI research continues to appear as the company moves closer to the public launch of its AI initiatives at WWDC in June. A number of studies have been published so far, including one describing an image animation tool.

The latest paper, first reported by VentureBeat, details something called ReALM — Reference Resolution As Language Modeling.

When a computer program performs a task based on ambiguous linguistic input, such as a user saying "this" or "that," it is performing reference resolution. It is a hard problem to solve, since computers can't interpret images the way humans can, but Apple may have found a streamlined solution using LLMs.

When conversing with smart assistants like Siri, users may reference any number of pieces of contextual information, such as background tasks, on-screen data, and other non-conversational entities. Traditional parsing methods rely on very large models and reference material such as images, but Apple has streamlined the approach by converting everything to text, as sketched below.
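
The paper describes taking parsed on-screen entities and laying them out as plain text that preserves their order on the page. As a rough illustration (the element types, fields, and tagging scheme here are invented, not Apple's actual format), the idea looks something like this:

```python
# Hypothetical sketch of flattening parsed on-screen entities into plain
# text for an LLM. Field names and the tagging scheme are illustrative,
# not Apple's actual representation.

def screen_to_text(elements):
    """Render UI elements as tagged lines, top-to-bottom, left-to-right."""
    ordered = sorted(elements, key=lambda e: (e["y"], e["x"]))
    return "\n".join(
        f'[{i}] {el["type"]}: {el["text"]}' for i, el in enumerate(ordered)
    )

page = [
    {"type": "title", "text": "Joe's Garage", "x": 0, "y": 0},
    {"type": "label", "text": "Hours: 9 a.m. to 5 p.m.", "x": 0, "y": 1},
    {"type": "phone", "text": "555-010-4477 (business)", "x": 0, "y": 2},
]

print(screen_to_text(page))
# [0] title: Joe's Garage
# [1] label: Hours: 9 a.m. to 5 p.m.
# [2] phone: 555-010-4477 (business)
```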

Apple has found that its smallest ReALM models perform similarly to GPT-4 with far fewer parameters and are therefore better suited for on-device use. Increasing the parameters used in ReALM allowed it to significantly outperform GPT-4.

One reason for this performance gap is GPT-4's reliance on image parsing to understand on-screen information. Most image training data is built on natural imagery, not artificial, code-based web pages filled with text, so recognizing text directly from screenshots is less efficient.

Representing screenshot data as text. Source: Apple Research

Converting an image to text allows ReALM to skip these heavyweight image recognition parameters, making it smaller and more efficient. Apple also avoids issues with hallucination by constraining decoding or applying simple post-processing.
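
One simple way to picture that post-processing step, purely as a sketch since Apple doesn't spell out its implementation, is to require the model to answer with the index of a known on-screen entity and reject anything else:

```python
# Illustrative post-processing guard, assuming the model is prompted to
# reply with the index of an on-screen entity. Out-of-range or free-text
# answers are rejected rather than acted on, which contains hallucinations.

def resolve_index(model_output: str, num_entities: int):
    try:
        idx = int(model_output.strip())
    except ValueError:
        return None  # model produced free text instead of an index
    return idx if 0 <= idx < num_entities else None

assert resolve_index("2", 3) == 2
assert resolve_index("7", 3) is None       # index out of range
assert resolve_index("maybe?", 3) is None  # not an index at all
```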

For example, say you're browsing a website and decide you'd like to call the business. Simply saying "call the business" requires Siri to parse what you mean given the context. It would be able to "see" that the page has a phone number labeled as the business number and call it without any further prompting from the user.
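
Tying the pieces together, one can imagine the utterance and the flattened screen text going into a single prompt that asks the model to pick an entity. The prompt wording below is invented for illustration; a model trained for reference resolution would be expected to answer with the index of the phone entity:

```python
# Hypothetical prompt pairing the user's utterance with the textual
# screen representation from the earlier sketch. The expected answer
# here is "2", which the assistant would map back to the phone number.

prompt = """Screen:
[0] title: Joe's Garage
[1] label: Hours: 9 a.m. to 5 p.m.
[2] phone: 555-010-4477 (business)

User said: "call the business"
Which entity index does the user mean? Answer with a number only."""

print(prompt)
```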

Apple is expected to unveil a comprehensive AI strategy at WWDC 2024. Some rumors suggest the company will rely on smaller on-device models that preserve privacy and security, while licensing other companies' LLMs for the more controversial off-device processing, which is filled with ethical conundrums.

