Blog

DAI#59 – APIs, dead bills, and NVIDIA opens up

Welcome to our weekly roundup of human-crafted AI news.

This week OpenAI handed out API goodies.

California’s AI safety bill got killed.

And NVIDIA surprised us with a powerful open model.

Let’s dig in.

Here come the agents

OpenAI didn’t announce any new models (or Sora) at its Dev Day event, but developers were excited over new API features. The Realtime API will be a game-changer for making smarter applications that speak with users and even act as agents.

The demo was really cool.

There have been rumors about OpenAI going the “for-profit” route and awarding Sam Altman billions of dollars in equity, but Altman dismissed these. Even so, the company is on a drive for more investment, and its backers will expect some return on their cash.

Apple has integrated OpenAI’s models into its devices but dropped out of the latest funding round for OpenAI which is expected to raise approximately $6.5 billion.

We’re not sure why Apple doesn’t want a piece of the OpenAI pie but it might have something to do with new developments with its Apple Intelligence. Or maybe it’s related to Sam Altman’s demand for exclusivity.

Tell me you have no moat without telling me you have no moat. pic.twitter.com/3I18MosvOg

— Pedro Domingos (@pmddomingos) October 3, 2024

Kill bill

Gavin Newsom had to decide between putting a safety rev limiter on AI developers or letting them go full steam ahead. In the end, he decided to veto California’s SB 1047 AI safety bill and offered some interesting reasons why.

Are we really at a point where we face genuine AI risks yet?

Well that escalated quickly https://t.co/xhZCITRJjE pic.twitter.com/aLZn4blS8G

— AI Notkilleveryoneism Memes (@AISafetyMemes) September 30, 2024

Newsom has signed a range of AI bills over the last month related to deepfakes, AI watermarking, child safety, performers’ AI rights, and election misinformation. Last week he signed AB 2013 which will really shake things up for LLM creators.

The bill says that on or before January 1, 2026, developers must publish a high-level summary of the training dataset of any model made from January 1, 2022 onward, if the model is made available in California. Some of these disclosures could air a few dirty secrets.

More EU AI regs

The EU is clearly a lot more concerned about AI safety than the rest of the world. Either that or they just enjoy writing and passing legislation. This week they kickstarted a project to write an AI code of practice to attempt to balance innovation & safety.

When you see who heads up the safety technical group you’ll have a good idea of which way they’ll be leaning.

Liquid foundation models

Transformer models are what gave us ChatGPT but there’s been a lot of debate recently about whether they will be up to delivering the next leap in AI. A company called Liquid AI is shaking things up with its Liquid Foundation Models (LFMs).

These aren’t your typical generative AIs. LFMs are specifically optimized to manage longer-context data, making them ideal for tasks that have to handle sequential data like text, audio, or video.

The LFMs achieve impressive performance with a much smaller model, less memory, and less compute.

NVIDIA opens up

NVIDIA just dropped a game-changer: an open-source AI model that goes head-to-head with big players like OpenAI and Google. Its new NVLM 1.0 lineup, led by the flagship 72B parameter NVLM-D-72B, shines in both vision and language tasks while also leveling up text-only capabilities.

With open weights and NVIDIA’s promise to release the code, it’s getting harder to justify paying for proprietary models for a lot of use cases.

NVLM-D benchmarks. Source: arXiv

Just say know

A new study found that the latest large language models (LLMs) are less likely to admit when they don’t know the answer to a user’s question, preferring to make something up rather than say they don’t know.

The study highlights the need for a fundamental shift in the design and development of general-purpose AI, especially when it’s used in high-stakes areas. Researchers are still trying to understand why AI models are so keen to please us instead of saying: ‘Sorry, I don’t know the answer.’

AI inside

It seems like everyone is slapping an “AI” label on their products to pull in customers. Here are a few AI-powered tools that are actually worth checking out.

Bluedot: Record, transcribe, and summarize your meetings with AI-generated notes without a bot.

Guidde: Guidde magically turns your workflows into step-by-step video guides, complete with AI-generated voiceovers and pro-level visuals, all in just a few clicks.

In other news…

Here are some other clickworthy AI stories we enjoyed this week:

Meta’s AI-powered smart glasses raise concerns about privacy and user data.
AI tools are spilling the beans on awkward company secrets.
MIT makes a breakthrough in robotics to handle real-life chaos.
LinkedIn faces legal complications over its AI integration.
Anthropic hires OpenAI co-founder Durk Kingma.
Y Combinator is being criticized after it backed an AI startup that admits it basically cloned another AI startup.

And that’s a wrap.

If you live in California we’d love to know how you feel about SB 1047 getting vetoed. Is it a missed opportunity for AI safety or a positive step that will see us get AGI soon? With powerful open-source models like NVIDIA’s new bombshell, it’s going to be harder to regulate LLMs anyway.

OpenAI’s Realtime API was the highlight this week. Even if you’re not a developer, the prospect of interacting with smarter customer service bots that talk to you is pretty cool. Unless you work as a customer service agent and you’d like to keep your job, that is.

Let us know what you think, follow us on X, and send us links to cool AI stuff we may have missed.

The post DAI#59 – APIs, dead bills, and NVIDIA opens up appeared first on DailyAI.

OpenAI unveils Realtime API and other features for developers

OpenAI didn’t release any new models at its Dev Day event but new API features will excite developers who want to use their models to build powerful apps.

OpenAI has had a tough few weeks, with its CTO, Mira Murati, and other lead researchers joining the ever-growing list of former employees. The company is under increasing pressure from competitors’ flagship models, including open-source models that offer developers cheaper and highly capable options.

The new features OpenAI unveiled were the Realtime API (in beta), vision fine-tuning, and efficiency-boosting tools like prompt caching and model distillation.

Realtime API

The Realtime API is the most exciting new feature, albeit in beta. It enables developers to build low-latency, speech-to-speech experiences in their apps without using separate models for speech recognition and text-to-speech conversion.

With this API, developers can now create apps that allow for real-time conversations with AI, such as voice assistants or language learning tools, all through a single API call. It’s not quite the seamless experience that GPT-4o’s Advanced Voice Mode offers, but it’s close.

It’s not cheap though, at approximately $0.06 per minute of audio input and $0.24 per minute of audio output.
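For the curious, a client talks to the Realtime API by opening a WebSocket and exchanging JSON events. The sketch below just builds two of those event payloads; the event names and session fields reflect the beta documentation and may change, so treat them as assumptions rather than a definitive reference.

```python
import json

# Beta-era endpoint; the model name in the query string is illustrative.
REALTIME_URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"

def session_update(voice="alloy", instructions="You are a helpful voice assistant."):
    """Configure the session: modalities, voice, and raw audio format."""
    return json.dumps({
        "type": "session.update",
        "session": {
            "modalities": ["text", "audio"],
            "voice": voice,
            "instructions": instructions,
            "input_audio_format": "pcm16",
            "output_audio_format": "pcm16",
        },
    })

def response_create():
    """Ask the model to start generating a spoken response."""
    return json.dumps({"type": "response.create"})

# A real client would open a websocket to REALTIME_URL with the API key in
# the Authorization header, send these events, then stream audio both ways.
```

Everything else (microphone capture, audio playback) happens client-side; the API only sees these events and base64-encoded audio chunks.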

The new Realtime API from OpenAI is incredible…

Watch it order 400 strawberries by actually CALLING the store with twillio. All with voice. pic.twitter.com/J2BBoL9yFv

— Ty (@FieroTy) October 1, 2024

Vision fine-tuning

Vision fine-tuning within the API allows developers to enhance their models’ ability to understand and interact with images. By fine-tuning GPT-4o using images, developers can create applications that excel in tasks like visual search or object detection.

This feature is already being leveraged by companies like Grab, which improved the accuracy of its mapping service by fine-tuning the model to recognize traffic signs from street-level images.

OpenAI also gave an example of how GPT-4o could generate additional content for a website after being fine-tuned to stylistically match the site’s existing content.
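To make the workflow concrete, here is a minimal sketch of one training record in the chat-format JSONL that OpenAI’s fine-tuning endpoint accepts for image tasks. The traffic-sign prompt, URL, and label are made-up placeholders, and the exact field names should be checked against the current fine-tuning docs.

```python
import json

def make_vision_example(image_url, label):
    """Build one vision fine-tuning record (hypothetical traffic-sign task)."""
    return {
        "messages": [
            {"role": "system",
             "content": "You identify traffic signs in street-level images."},
            {"role": "user", "content": [
                {"type": "text", "text": "What traffic sign is shown?"},
                {"type": "image_url", "image_url": {"url": image_url}},
            ]},
            # The assistant turn is the target output the model learns to produce.
            {"role": "assistant", "content": label},
        ]
    }

example = make_vision_example("https://example.com/sign_001.jpg", "Stop sign")
jsonl_line = json.dumps(example)  # one record per line in the uploaded .jsonl
```

A full dataset is just many of these lines, uploaded as a file and referenced when creating the fine-tuning job.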

Prompt caching

To improve cost efficiency, OpenAI introduced prompt caching, a tool that reduces the cost and latency of frequently used API calls. By reusing recently processed inputs, developers can cut input token costs by up to 50% and reduce response times. This feature is especially useful for applications requiring long conversations or repeated context, like chatbots and customer service tools.

Using cached inputs could save up to 50% on input token costs.

Price comparison of cached and uncached input tokens for OpenAI’s API. Source: OpenAI
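Some back-of-envelope math shows how the discount plays out in practice. The sketch below assumes cached input tokens are billed at half price; the per-million-token rate is illustrative, not current pricing.

```python
def input_cost(total_tokens, cached_tokens, price_per_m=2.50, cache_discount=0.5):
    """Cost in dollars for one call's input tokens, with some tokens cached."""
    uncached = total_tokens - cached_tokens
    cached_price = price_per_m * cache_discount
    return (uncached * price_per_m + cached_tokens * cached_price) / 1_000_000

# A chatbot resending a 9,000-token system prompt plus 1,000 new tokens:
full = input_cost(10_000, 0)        # no cache hit: $0.025
cached = input_cost(10_000, 9_000)  # 9,000 tokens served from cache: $0.01375
savings = 1 - cached / full         # 45% cheaper on this call
```

The headline “up to 50%” applies to the cached tokens themselves; the blended saving per call depends on how much of the prompt repeats.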

Model distillation

Model distillation allows developers to fine-tune smaller, more cost-efficient models, using the outputs of larger, more capable models. This is a game-changer because, previously, distillation required multiple disconnected steps and tools, making it a time-consuming and error-prone process.

Before OpenAI’s integrated Model Distillation feature, developers had to manually orchestrate different parts of the process, like generating data from larger models, preparing fine-tuning datasets, and measuring performance with various tools.

Developers can now automatically store output pairs from larger models like GPT-4o and use those pairs to fine-tune smaller models like GPT-4o-mini. The whole process of dataset creation, fine-tuning, and evaluation can be done in a more structured, automated, and efficient way.
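The core data flow is simple enough to sketch: take stored (prompt, output) pairs from the larger model and reshape them into fine-tuning records for the smaller one. In OpenAI’s workflow the pairs come from completions saved with the store option; here they are hard-coded stand-ins so the transformation itself is clear.

```python
import json

# Stand-ins for completions collected from a larger "teacher" model.
stored_pairs = [
    {"prompt": "Summarize: The cat sat on the mat.",
     "output": "A cat sat on a mat."},
    {"prompt": "Summarize: It rained all day in Paris.",
     "output": "Paris had rain all day."},
]

def to_finetune_example(pair):
    """One chat-format record: the teacher's output becomes the target."""
    return {"messages": [
        {"role": "user", "content": pair["prompt"]},
        {"role": "assistant", "content": pair["output"]},
    ]}

# One JSON object per line, ready to upload as a fine-tuning dataset
# for a smaller model such as GPT-4o-mini.
jsonl = "\n".join(json.dumps(to_finetune_example(p)) for p in stored_pairs)
```

Evaluation then closes the loop: score the distilled model against the teacher on held-out prompts before deploying it.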

The streamlined developer process, lower latency, and reduced costs will make OpenAI’s GPT-4o model an attractive prospect for developers looking to deploy powerful apps quickly. It will be interesting to see which applications the multi-modal features make possible.


EU kickstarts AI code of practice to balance innovation & safety

The European Commission has kicked off its project to develop the first-ever General-Purpose AI Code of Practice, and it’s tied closely to the recently passed EU AI Act.

The Code is aimed at setting some clear ground rules for AI models like ChatGPT and Google Gemini, especially when it comes to things like transparency, copyright, and managing the risks these powerful systems pose.

At a recent online plenary, nearly 1,000 experts from academia, industry, and civil society gathered to help shape what this Code will look like.

The process is being led by a group of 13 international experts, including Yoshua Bengio, one of the ‘godfathers’ of AI, who’s taking charge of the group focusing on technical risks. Bengio won the Turing Award, which is effectively the Nobel Prize for computing, so his opinions carry deserved weight.

Bengio’s pessimistic views on the catastrophic risk that powerful AI poses to humanity hint at the direction the team he heads will take.

These working groups will meet regularly to draft the Code with the final version expected by April 2025. Once finalized, the Code will have a big impact on any company looking to deploy its AI products in the EU.

The EU AI Act lays out a strict regulatory framework for AI providers, but the Code of Practice will be the practical guide companies will have to follow. The Code will deal with issues like making AI systems more transparent, ensuring they comply with copyright laws, and setting up measures to manage the risks associated with AI.

The teams drafting the Code will need to balance how AI is developed responsibly and safely, without stifling innovation, something the EU is already being criticized for. The latest AI models and features from Meta, Apple, and OpenAI are not being fully deployed in the EU due to already strict GDPR privacy laws.

The implications are huge. If done right, this Code could set global standards for AI safety and ethics, giving the EU a leadership role in how AI is regulated. But if the Code is too restrictive or unclear, it could slow down AI development in Europe, pushing innovators elsewhere.

While the EU would no doubt welcome global adoption of its Code, this is unlikely as China and the US appear to be more pro-development than risk-averse. The veto of California’s SB 1047 AI safety bill is a good example of the differing approaches to AI regulation.

AGI is unlikely to emerge from the EU tech industry, but the EU is also less likely to be ground zero for any potential AI-powered catastrophe.


California Governor Gavin Newsom vetoes SB 1047 AI safety bill

AI companies in California breathed a collective sigh of relief as Governor Gavin Newsom vetoed the SB 1047 AI safety bill that the State Senate passed earlier this month.

The controversial bill would have mandated additional safety checks for AI models that cross a training compute or cost threshold. These models would have required a “kill switch”, and their makers would have incurred heavy fines if the models were used to cause “critical harm”.

In his letter to the California State Senate, Newsom explained the reasons for his decision to veto the bill.

He noted that one of the reasons that California is home to 32 of the world’s 50 leading AI companies is the state’s “free-spirited cultivation of intellectual freedom.” He didn’t mention the risk of some of these companies leaving California, but he hinted at the impact the bill would have on them.

Newsom said the main reason for vetoing the bill was that it was overly broad and the threshold for regulation didn’t address actual risks.

He said, “By focusing only on the most expensive and large-scale models, SB 1047 establishes a regulatory framework that could give the public a false sense of security about controlling this fast-moving technology. Smaller, specialized models may emerge as equally or even more dangerous than the models targeted by SB 1047 – at the potential expense of curtailing the very innovation that fuels advancement in favor of the public good.”

Newsom said that regulation of AI risks was necessary but that a focus on risky applications rather than the blanket approach of SB 1047 was a better option.

“SB 1047 does not take into account whether an AI system is deployed in high-risk environments, involves critical decision-making or the use of sensitive data. Instead, the bill applies stringent standards to even the most basic functions – so long as a large system deploys it. I do not believe this is the best approach to protecting the public from real threats posed by the technology,” Newsom explained.

While Newsom declined to sign SB 1047, he pointed to other AI regulations he signed this month as evidence that he’s taking the risks associated with AI seriously.

He summed up his commitment to safety and AI advancement by saying, “Given the stakes – protecting against actual threats without unnecessarily thwarting the promise of this technology to advance the public good – we must get this right.”

Senator Scott Wiener was understandably unhappy that Newsom declined to sign the bill he authored.

Wiener said, “This veto is a setback for everyone who believes in oversight of massive corporations that are making critical decisions that affect the safety and welfare of the public and the future of the planet…This veto leaves us with the troubling reality that companies aiming to create an extremely powerful technology face no binding restrictions from U.S. policymakers, particularly given Congress’s continuing paralysis around regulating the tech industry in any meaningful way.”

While Wiener lamented the failure of the bill, Meta’s Yann LeCun and venture capitalist Marc Andreessen publicly thanked Newsom for the veto.

We’ll have to wait to see if Newsom’s decision is an example of forward-thinking leadership or a cause for regret.


How is China doing in the AI race? Tech giants and startups are pushing boundaries

Alibaba Cloud recently released over 100 new open-source models in its Qwen 2.5 family. 

These models range in size from 0.5 to 72 billion parameters and handle tasks from coding to math in 29 different languages. 

The company’s Tongyi model, available through the Model Studio platform, has seen its user base jump from 90,000 to over 300,000 in just a few months.

Alibaba is also pushing the boundaries in multimodal AI. They’ve introduced a text-to-video model that can create various video styles from written descriptions, similar to OpenAI’s Sora, which has yet to be released.

The company’s Qwen 2-VL model can understand and answer questions about videos up to 20 minutes long – a massive accomplishment in processing and interpreting complex visual information. 

To support its frenetic AI R&D activity, Alibaba Cloud has launched a new, more efficient data center design called CUBE DC 5.0. 

It has also introduced Alibaba Cloud Open Lake to help companies manage the vast amounts of data required for AI systems.

While Alibaba is the latest Chinese AI company to make its mark on the international industry, it’s far from the only one.

In fact, China’s AI industry is flourishing, driven by top talent, technological innovation, and a determined strategy to keep pace with the US.

The flourishing Chinese AI ecosystem

China’s fast-developing AI industry is rich and diverse, which raises the question: have US efforts to restrict China’s AI R&D ultimately failed?

Let’s examine some of its established and upcoming tech and AI businesses. 

Baidu

Baidu, one of the country’s largest tech companies, claims its Ernie 4.0 rivals GPT-4 in handling complex questions and logical reasoning.

Baidu’s CEO, Robin Li, boldly states that Ernie 4.0 “is not inferior in any aspect to GPT-4.”

Baidu’s AI ambitions extend beyond software. The company is developing its own AI chip, the Kunlun 3, to be manufactured by TSMC. 

This showcases Baidu’s commitment to AI’s software and hardware aspects, potentially giving it an edge in the face of U.S. chip export restrictions.

ByteDance

The company behind TikTok has made significant inroads in the AI space with Doubao, an AI-powered chatbot that has gained substantial traction in China. 

Impressively, Doubao has surpassed Baidu’s Ernie Bot in downloads and monthly active users on iOS, indicating strong user preference.

ByteDance isn’t stopping at chatbots. The company has released a series of large language models under the umbrella “Doubao” for enterprises, offering a cost-effective alternative to competitors. 

In a move that echoes Baidu’s strategy, ByteDance is reportedly designing two chips with Taiwan Semiconductor Manufacturing Company, aiming for mass production by 2026.

SenseTime

As one of China’s leading AI companies, SenseTime was part of the original “AI dragons,” known for facial and image recognition technology. 

Since then, the company has expanded into a range of AI-driven applications, including autonomous driving, medical imaging, and smart city technology. 

SenseTime now holds some 16% of the Chinese large language model (LLM) market, making it a key player in both AI research and commercial applications. 

Despite facing U.S. export restrictions, SenseTime continues to thrive, pushing its AI capabilities beyond image recognition into areas like generative AI and large-scale language models.

Huawei

Huawei has released its Pangu large language model (LLM) and Ascend AI chips. Launched in July 2023, the Pangu 3.0 model excels in Chinese language tasks.

Additionally, Huawei’s Ascend 9XX series chips outpace Nvidia’s China-specific GPUs in some benchmarks, supporting AI development for a range of companies, including some on this list. 

Baichuan Intelligent

An upcoming key player in China’s AI ecosystem, Baichuan Intelligent, has gained attention for its advancements in large language models (LLMs). 

Founded by Wang Xiaochuan, Baichuan is focused on developing generative AI solutions that excel in Chinese language processing. 

After securing funding from major investors like Tencent and Alibaba, Baichuan is positioning itself for rapid growth. 

Tencent

The internet and gaming giant unveiled its Hunyuan AI model in September 2023. 

Hunyuan boasts strong Chinese language processing abilities, advanced logical reasoning, and reliable task execution capabilities. 

It’s available for enterprises to test and build applications, potentially opening up new avenues for AI integration across various industries.

Moonshot AI

This Beijing-based startup has developed Kimi, a popular chatbot powered by the company’s large language model. 

Moonshot AI has also dipped its toes into the U.S. market with products like Ohai, a role-play chat app, and Noisee, a music video generator. 

However, the company has stated it currently doesn’t plan to develop or release products outside of China, focusing instead on the domestic market.

MiniMax

Shanghai-based MiniMax entered the US market with Talkie, an AI character chatbot similar to Character.ai. 

Their success is notable – the Chinese version of their chatbot had almost 2.2 million total visits worldwide from March to May 2023, according to Similarweb data.

Zhipu AI

Founded in 2019, Zhipu AI offers a range of AI products, including a chatbot and a visual language foundation model. 

As one of the first Chinese AI companies to receive government approval for publicly releasing its models, Zhipu AI has attracted investment from major players like Alibaba, Tencent, and Saudi Arabia’s Prosperity7 Ventures.

Kuaishou

Kuaishou released Kling, the first free-to-the-public text-to-video model.

Kling can create high-quality videos of up to two minutes in length, offering a frame rate of 30 frames per second and a maximum resolution of 1080p. It also supports multiple aspect ratios, making it versatile for different video formats and platforms.

iFlytek

A partially state-owned company, iFlytek launched its Spark Big Model V4.0, claiming superior performance in several international benchmarks. 

Challenges and opportunities for Chinese AI

Despite these advancements, Chinese AI development continues to face challenges. 

China was among the first countries (if not the first) to impose strict AI regulations, with AI models requiring state permission before going public. 

US export controls on advanced chips have forced companies to seek alternative solutions. The controls have created a black market for high-end chips in China and pushed Chinese companies to source chips through the Middle East.

Some firms, like ByteDance and Baidu, are designing their own chips to address this issue. 

This has sparked a wave of innovation and self-reliance in China’s tech sector, with the country aiming to become independent from foreign imports this decade.

China’s progress in AI is certainly attracting international attention. Benchmark tests have shown that some Chinese models are performing exceptionally well, with models like Alibaba’s Qwen impressing AI researchers globally. 

This challenges the notion that US chip restrictions would greatly hamper Chinese AI development.

While Western companies like OpenAI and Google remain at the cutting edge of AI, Chinese alternatives are making their mark on the global stage. 

Rather than the US keeping them ‘one generation behind,’ as has been the tactic for years, Chinese tech companies are going toe-to-toe with the biggest US corporations. 


DAI#58 – AI voices, nuclear meltdowns, and Chinese top models

Welcome to our roundup of this week’s hottest AI news.

This week ChatGPT finally found its voice.

Microsoft looks to a nuclear meltdown site for power.

And Chinese AI seems unstoppable.

Let’s dig in.

It speaks!

OpenAI is finally rolling out ChatGPT’s advanced voice assistant. We’ve been waiting for months to get our hands on the feature that was demo’d back in May. Advanced Voice Mode (AVM) comes with some interesting customization options, but it’s still missing some of the things we saw in the original demo.

If you’re in the UK, the EU, or a few other countries, then you’ll have to wait until the legal issues are settled before you get to talk to ChatGPT.

Some of the things people are doing with AVM are really cool.

Watch ChatGPT travel the world and order food in foreign languages.

Advanced voice is amazing at accents! pic.twitter.com/CAEeEfcyuN

— Clintin Lyle Kruger (@Lyle_AI) September 25, 2024

How open is open?

Meta keeps insisting its models like Llama 3.1 are “open” but not everyone agrees. The Open Source Initiative has released its updated definition of open-source AI and it looks like Meta’s models don’t make the grade.

It will be tricky for Meta and other companies to comply with the OSI’s definition and it has legal ramifications too.

Tracking cities

Sprawling cities in the developing world often grow organically with little urban planning. This makes it increasingly tough for governments to be effective in delivering high-quality healthcare, urban development, environmental conservation, and resource management.

Google’s Open Buildings project now maps urban expansion across the Global South. AI is giving people in developing countries the same tools the Global North has to make better policy decisions.

AI-fueled research

The 2024 Nature Index reveals how AI is transforming every aspect of scientific research. Papers are being published faster than human peer reviewers can keep up.

The flurry of AI research has come with its own set of questions: Where are the big tech companies hiding their research and how did a nonsense paper with giant AI-generated rat testicles get published?

Nuke it

AI uses a lot of power and Microsoft is looking for new sources to plug into as it expands its data centers. The company has decided to resurrect the Three Mile Island nuclear power plant in a deal that will give it exclusive rights to the power.

With doomsayers increasingly highlighting AI safety concerns, is Microsoft tempting fate by restarting a power station on the island that saw the worst nuclear accident in US history? What could possibly go wrong?


East to West

Despite US export bans, China’s AI ambitions have surged ahead. There are some big hitters with top minds and advanced tech in the list of Chinese companies leading the AI pack. There’s no shortage of money and some of their AI solutions aren’t just keeping up with the US, they’re moving ahead.

Alibaba is one of the biggest players in Chinese AI. The company just released over 100 models, including Qwen 2.5 which is now the top open-source model in math and coding. The company’s new Qwen 2-VL vision model and text-to-video capabilities are impressive too.

Countries in the Middle East are looking left and right as they decide who to partner with.

The UAE’s presidential visit to the White House will have to navigate some tricky issues as the country looks to transform itself into an AI powerhouse. ‘We’ve got oil and money, you’ve got chips and AI,’ might be the short version of the talks.

The UAE’s AI ambitions face a crucial test in the White House talks with Chinese influence being the elephant in the room.

AI events

Here’s a list of some exciting AI events happening soon:

The MarTech Summit London 2024: AI marketing in focus
Thailand Cloud & Datacenter Convention 2024: Powering Thailand’s digital future
Data and the Future of Financial Services Summit 2024

In other news…

Here are some other clickworthy AI stories we enjoyed this week:

Chinese and Western AI scientists meet in Italy to talk AI safety.
Lionsgate signs a deal with AI company Runway and hopes AI can eliminate the need for storyboard artists and VFX crews.
AI vs. human engineers: Benchmarking coding skills head-to-head.
Sam Altman paints a picture of the Intelligence Age and how soon we’ll have AGI.
Now you can talk to and share photos with Meta AI.
The FTC announces a crackdown on deceptive AI claims and schemes.
The problem with AI slop flooding the internet is only going to get worse.
Avatar, Terminator and Titanic director James Cameron will join Stability AI’s board of directors.
OpenAI’s news X account got hacked by crypto scammers.
OpenAI exodus continues as CTO and lead researchers jump ship.

How it started vs. How it’s going

openAI pic.twitter.com/pYzz1S61KR

— nishan (@notnishan) September 25, 2024

And that’s a wrap.

Have you got access to ChatGPT’s advanced voice assistant yet? It’s really cool, but I miss Sky. Maybe now that Sam’s colleagues are leaving en masse, he’ll give us Sora as a distraction.

What do you think of Microsoft’s idea to reboot an old nuclear power station? Powering potentially dangerous AI from a nuclear disaster site sounds like a cheesy SciFi script, not a business plan.

Let us know what you think, follow us on X, and send us links to cool AI stuff we may have missed.
