Let’s catch up on all the latest AI developments.
OpenAI has announced that third-party developers can integrate ChatGPT into apps and services via an API and that doing so will be significantly cheaper than using its existing language models. The company is making Whisper, its AI-powered speech-to-text model, available through an API and making some changes to its developer terms of service.
OpenAI has also announced any data submitted through its API for “service improvements,” will not be included in AI model training unless a customer or organization opts in. In addition, the company is implementing a 30-day data retention policy for API users with options for stricter retention “depending on user needs” and simplifying its terms and data ownership to make it clear that users own the input and output of the models.
Microsoft released a new model, Kosmos-1, that can reportedly analyze images for content, solve visual puzzles, perform visual text recognition, pass visual IQ tests, and understand natural language instructions. The researchers believe multimodal AI—which integrates different modes of input such as text, audio, images, and video—is a crucial step to building artificial general intelligence (AGI) that can perform general tasks at the level of a human.
Brave search has launched a new Summarizer feature using AI.. but not ChatGPT.
In the “bad things can happen” department, researchers determined that AI chatbots are currently easily influenced by text prompts embedded in web pages. A hacker can thus plant a prompt on a web page in 0-point font, and when someone is asking the chatbot a question that causes it to ingest that page, it will unknowingly activate that prompt. The researchers call this attack “indirect prompt injection,” and give the example of compromising the Wikipedia page for Albert Einstein. When a user asks the chatbot about Albert Einstein, it could ingest that page and then fall prey to the hackers’ prompt, bending it to their whims—for example, to convince the user to hand over personal information. This is from Vice.
Another new scam – using voice impersonation generated by AI to scam family members by using the voices of other family members. The Washington Post covering this new ploy.
BleepingComputer reported that Bing Chat has a secret celebrity impersonation mode, too, so there’s another impersonation option.
Here’s another one – data-poisoning attacks. What are those, you ask? Attackers tamper with the training data used to create deep-learning models. This action means it’s possible to affect the decisions that the AI makes in a way that is hard to track. Suppose an attacker could gain control over a web resource indexed by a particular dataset. In that case, they could poison the collected data, making it inaccurate and potentially affecting the whole algorithm negatively. This one currently appears theoretical… although one way to do this is to buy expired domain names and put the new, poisoned data there.
Why do we care?
The more AI stories I track, the more value I see for providers to customers. It is the ability to understand the differences in the models and how they would cause different outcomes or be applied to other use cases. AI is a category, and within it is an ever-increasing quantity of implementations and models. Tracking that space and applying the suitable model to the right problem with the proper prompts will be the value.
Note that we’ve gone from nothing to significance in a very short period. Actual use cases have flourished in that time, far different from crypto and Web 3.0. It’s why I believe there is a “here” here – the actual practical implementations.
Another critical skill will be understanding (and ideally preventing) those bad things from happening. Those, too, are moving very quickly.
Be First to Comment