
Open Source Initiative disagrees with Meta on ‘open’ AI

The Open Source Initiative (OSI) has released an updated draft definition of what constitutes open-source AI and says Meta’s models don’t qualify despite the company’s claims.

Mark Zuckerberg has been vocal about Meta’s commitment to what he says is open-source AI. However, while models like Llama 3.1 are less opaque than the proprietary models from OpenAI or Google, discussions in the OSI community suggest Meta is using the term loosely.

At an online public town hall event on Friday, the OSI discussed the criteria it believes a truly open-source AI model should conform to. The OSI refers to these criteria as “4 Freedoms” and says an open-source AI “is an AI system made available under terms and in a way that grant the freedoms to:

Use the system for any purpose and without having to ask for permission.
Study how the system works and inspect its components.
Modify the system for any purpose, including to change its output.
Share the system for others to use with or without modifications, for any purpose.”

To be able to modify an AI model, the OSI’s open AI definition says the weights and source code should be open, and the training data set should be available.

Meta’s license imposes some restrictions on how its models can be used and it has declined to release the training data it used to train its models. If you accept that the OSI is the custodian of what “open-source” means, then the implication is that Meta distorts the truth when it calls its models “open”.

The OSI is a California public benefit corporation that relies on community input to develop open-source standards. Some in that community have accused Mark Zuckerberg of “open washing” Meta’s models and bullying the industry into accepting his version rather than the OSI’s definition.

Shuji Sado, chairman of Open Source Group Japan, said, “It’s possible that Zuckerberg has a different definition of Open Source than we do,” and suggested that the unclear legal landscape around AI training data and copyright could be the reason for this.

Open Source AI Definition – Weekly update September 23 https://t.co/flbb3yGCmx

— Open Source Initiative @osi@opensource.org (@OpenSourceOrg) September 23, 2024

Words matter

This might all sound like an argument over semantics but, depending on the definition the AI industry adopts, there could be serious legal consequences.

Meta has had a tough time navigating EU GDPR laws over its insatiable hunger for users’ social media data. Some people claim that Meta’s loose definition of “open-source AI” is an attempt to skirt new laws like the EU AI Act.

The Act provides a limited exception for general-purpose AI models (GPAIMs) released under open-source licenses. These models are exempt from certain transparency obligations although they still have to provide a summary of the content used to train the model.

On the other hand, the proposed SB 1047 California AI safety bill disincentivizes companies like Meta from aligning their models with the OSI definition. The bill mandates complex safety protocols for “open” models and holds developers liable for harmful modifications and misuse by bad actors.

SB 1047 defines open-source AI tools as “artificial intelligence model[s] that [are] made freely available and that may be freely modified and redistributed.” Does that mean that an AI model that can be fine-tuned by a user is “open” or would the definition only apply if the model ticks all the OSI boxes?

For now, the vagueness gives Meta the marketing benefits of the label and room to maneuver around incoming legislation. At some point, the industry will need to commit to a definition. Will it be defined by a big tech company like Meta or by a community-driven organization like the OSI?

The post Open Source Initiative disagrees with Meta on ‘open’ AI appeared first on DailyAI.

Google’s Open Buildings project maps urban expansion across the Global South

Google Research has released the Open Buildings 2.5D Temporal Dataset, a project that tracks building changes across the Global South from 2016 to 2023.

This exceptionally detailed dataset offers a dynamic, year-by-year view of urbanization. It captures building construction, growth, and transformation across regions where such detailed data has long been scarce or nonexistent.

John Quinn, a software engineer at Google Research, explained why the project is important: “Not knowing where buildings are is a big problem for lots of practical reasons. If you’re creating services or vaccination campaigns or rescuing people after an emergency, this is an issue.”

The tool’s primary breakthrough is its ability to extract high-resolution insights from low-resolution Sentinel-2 satellite imagery. 

By analyzing up to 32 time-shifted images of the same location, the AI can detect structures far smaller than a single pixel, effectively overcoming the limitations of available imagery in many parts of the Global South.
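The idea behind this multi-frame trick can be shown with a toy sketch (an illustration of the general technique, not Google’s actual pipeline): if each low-resolution frame samples the scene at a slightly different known sub-pixel offset, placing the frames onto a finer grid and averaging recovers detail that no single frame contains.

```python
import numpy as np

def multiframe_superres(frames, shifts, factor):
    """Toy multi-frame super-resolution: place each low-res frame onto a
    finer grid at its known sub-pixel shift, then average overlapping
    samples. frames: list of (H, W) arrays; shifts: (dy, dx) offsets in
    high-res pixels; factor: upscaling factor."""
    h, w = frames[0].shape
    hi = np.zeros((h * factor, w * factor))
    count = np.zeros_like(hi)
    for frame, (dy, dx) in zip(frames, shifts):
        ys = np.arange(h) * factor + dy
        xs = np.arange(w) * factor + dx
        hi[np.ix_(ys, xs)] += frame
        count[np.ix_(ys, xs)] += 1
    count[count == 0] = 1  # avoid dividing by zero where no frame landed
    return hi / count

# Simulate: an 8x8 "scene" observed as four 4x4 low-res frames, each
# shifted by a different sub-pixel offset (here, one high-res pixel).
truth = np.arange(64, dtype=float).reshape(8, 8)
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
frames = [truth[dy::2, dx::2] for dy, dx in shifts]
recon = multiframe_superres(frames, shifts, factor=2)
# With perfectly known shifts, the fine-grained scene is fully recovered.
```

In the real system the offsets come from satellite orbit geometry and must be estimated, and a learned model replaces the simple averaging, but the underlying principle is the same.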

Building density in 2023 based on Open Buildings 2.5D Temporal dataset data. Source: Google.

Beyond detection, the system estimates building heights with remarkable precision – within 1.5 meters on average.

This transforms flat satellite images into rich, multi-layered data, providing urban planners and humanitarian organizations with a powerful instrument for understanding population density and resource needs.

AI is transforming our approach to mapping land use globally.

A recent project by Aya Data in Ghana, for example, demonstrates the technology’s potential. The company employed AI to analyze thousands of satellite images of South America, tracking changes in forests, urban areas, and agricultural land over several years.

The AI’s skill in detecting subtle changes, such as early signs of deforestation or new developments, informs real-world agriculture and conservation policies.

Beyond observing long-term shifts in the environment, AI enables researchers to detect and respond to immediate threats like illegal logging, wildlife poaching, and unauthorized changes in land use visible from satellite imagery.

Mapping the Global South

The term “Global South” typically refers to developing nations in Africa, Asia, Latin America, and the Caribbean. 

These regions, home to the majority of the world’s population, often face challenges in delivering high-quality healthcare, urban development, environmental conservation, and resource management.

Abdoulaye Diack, a program manager on the project, stated, “We want people in the global South making policy decisions to have the same tools available as the global North.” 

The dataset is already finding practical applications. In Uganda, the nonprofit Sunbird AI is leveraging this data for rural electrification projects.

WorldPop, based at the University of Southampton, is using it to refine global population estimates – critical information for planning everything from vaccination campaigns to disaster response.

WorldPop’s director, Professor Andrew Tatem, explained why this data is important: “Understanding where people live is vital for making sure that resources are distributed fairly and that no one is left behind in delivering services like healthcare.”

Google admits that the system has limitations. Persistent cloud cover can impede data collection in some of the world’s most humid regions, and very small structures may also elude detection. 

While not without limitations, the Open Buildings 2.5D Temporal Dataset showcases the powerful synergy between data democratization and AI.

It will yield valuable insights for urban planners, policymakers, and humanitarian organizations alike, helping them monitor and optimize the environment as land use continues to change at an unprecedented pace.

The post Google’s Open Buildings project maps urban expansion across the Global South appeared first on DailyAI.

Alibaba’s Qwen 2.5 is top open-source model in math and coding

Alibaba released more than 100 open-source AI models including Qwen 2.5 72B which beats other open-source models in math and coding benchmarks.

Much of the AI industry’s attention in open-source models has been on Meta’s efforts with Llama 3, but Alibaba’s Qwen 2.5 has closed the gap significantly. The freshly released Qwen 2.5 family ranges in size from 0.5 billion to 72 billion parameters, with generalized base models as well as models focused on specific tasks.

Alibaba says these models come with “enhanced knowledge and stronger capabilities in math and coding” with specialized models focused on coding, math, and multiple modalities including language, audio, and vision.

Alibaba Cloud also announced an upgrade to its proprietary flagship model Qwen-Max, which it has not released as open-source. The Qwen 2.5 Max benchmarks look good, but it’s the Qwen 2.5 72B model that has been generating most of the excitement among open-source fans.

Qwen 2.5 72B instruct model math and coding benchmarks. Source: Alibaba Cloud

The benchmarks show Qwen 2.5 72B beating Meta’s much larger flagship Llama 3.1 405B model on several fronts, especially in math and coding. The gap between open-source models and proprietary ones like those from OpenAI and Google is also closing fast.

Early users of Qwen 2.5 72B report the model falling just short of Sonnet 3.5 and even beating OpenAI’s o1 models in coding.

Open source Qwen 2.5 beats o1 models on coding

Qwen 2.5 scores higher than the o1 models on coding on Livebench AI

Qwen is just below Sonnet 3.5, and for an open-source mode, that is awesome!!

o1 is good at some hard coding but terrible at code completion problems and… pic.twitter.com/iazam61eP9

— Bindu Reddy (@bindureddy) September 20, 2024

Alibaba says these new models were all trained on its large-scale dataset of up to 18 trillion tokens. The Qwen 2.5 models have a context window of up to 128K tokens and can generate outputs of up to 8K tokens.

The move to smaller, more capable open-source models will likely have a wider impact on everyday users than more advanced models like o1. The edge and on-device capabilities of these models mean you can get a lot of mileage from a free model running on your laptop.

The smaller Qwen 2.5 model delivers GPT-4 level coding for a fraction of the cost, or even free if you’ve got a decent laptop to run it locally.

We have GPT-4 for coding at home! I looked up @OpenAI GPT-4 0613 results for various benchmarks and compared them with @Alibaba_Qwen 2.5 7B coder.

> 15 months after the release of GPT-0613, we have an open LLM under Apache 2.0, which performs just as well.

> GPT-4 pricing… pic.twitter.com/2szw5kwTe5

— Philipp Schmid (@_philschmid) September 22, 2024

In addition to the LLMs, Alibaba released a significant update to its vision language model with the introduction of Qwen2-VL. Qwen2-VL can comprehend videos lasting over 20 minutes and supports video-based question-answering.

It’s designed for integration into mobile phones, automobiles, and robots to enable automation of operations that require visual understanding.

Alibaba also unveiled a new text-to-video model as part of its Tongyi Wanxiang large model family, which includes its image generator. Tongyi Wanxiang AI Video can produce cinematic quality video content and 3D animation with various artistic styles based on text prompts.

The demos look impressive and the tool is free to use, although you’ll need a Chinese mobile number to sign up for it here. Sora is going to have some serious competition when, or if, OpenAI eventually releases it.

The post Alibaba’s Qwen 2.5 is top open-source model in math and coding appeared first on DailyAI.

The MarTech Summit London 2024: AI marketing in focus

This conference, taking place in London, offers a unique opportunity to explore the latest trends and innovations in marketing technology across the B2B and B2C sectors.

The event, which will take place November 12-13, 2024, at Convene 155 Bishopsgate, London, will feature insightful case studies, panel discussions, fireside chats, keynotes, and Q&As focused on the latest MarTech trends. 

With 400+ attendees, over 85% of them in senior leadership roles, the summit provides unparalleled learning and networking opportunities with CMOs, Heads, Directors, and more.

Attendees will acquire cutting-edge knowledge on MarTech advancements, real-world implementation strategies, and how technology is transforming marketing approaches and workflows across both B2B and B2C sectors.

Why attend The MarTech Summit London 2024?

There are numerous reasons for attending the MarTech Summit London 2024:

Explore five key themes across three stages over two days
Gain insights from 70+ speakers and industry experts
Network with 400+ attendees, including 85% senior management
Participate in discussion roundtables to share and gain diverse perspectives
Evaluate new technologies for your MarTech stack from leading service providers

Who should attend?

The MarTech Summit London 2024 is designed for senior-level executives in functions such as:

Marketing and technology
Customer experience (CX) and engagement
Brand loyalty and retention
Data and consumer insights
E-commerce marketing
Digital strategy
Omni-channel
Innovation
Social media
Content strategy and storytelling
CRM
Product marketing
Automation
Digital transformation and growth

Key speakers at the event

Some of the key speakers in the lineup include, but aren’t limited to:

Sivan Einstein, Industry Head – Omnichannel Retail, Google
Neil Robins, Head of Digital & Media, Kenvue
Cat Daniel, Senior Director, Growth & Engagement, Monzo
Sam Lewis-Williams, Head of Marketing Automation, Financial Times
Namita Mediratta, CMI Director, Beauty, EMETU, Unilever
Richard Jones, Director of eCommerce, Carlsberg
Kumar Amrendra, Head of Digital Marketing, Planning & Data Science, Sky
James Brindley-Raynes, Head of Digital Customer Journey, Maersk
Clive Head, Head of CRM and Loyalty, Santander UK
Marie Tyler, Global Customer Engagement Leader, Honeywell

Featured themes and topics from The MarTech Summit London 2024

The event features five main themes across three stages:

Day 1 – Plenary Room:

MarTech Trends (including Marketing Automation, AI in Marketing, Digital Transformation)

Day 2 – B2C Marketing Stage:

Customer Experience (including Omnichannel Personalisation, E-Commerce, Digital Customer Experience)
Data, Analytics & Insights (including First-Party Data, Customer Data Platform, Data Privacy & Security)

Day 2 – B2B Marketing Stage:

Sales Enablement (including Revenue Enablement, Sales Content Optimisation, Account-Based Marketing)
Demand Generation (including B2B Marketing Metrics, Precision Demand Marketing, Content Marketing)

Key details about the event

Mark your calendars and prepare to be part of this fantastic event:

Date: November 12-13, 2024
Venue: Convene 155 Bishopsgate, Second Floor, London EC2M 3YD
Tickets: Available on the event website here

Visit the official MarTech Summit London 2024 website to secure your spot, view the full agenda, and explore sponsorship opportunities. 

The post The MarTech Summit London 2024: AI marketing in focus appeared first on DailyAI.

The 2024 Nature Index reveals how AI is transforming every aspect of scientific research

The 2024 Nature Index supplement on Artificial Intelligence, released this week, reveals a scientific world in the throes of an AI-driven paradigm shift. 

This annual report, published by the journal Nature, tracks high-quality science by measuring research outputs in 82 natural science journals, selected by an independent panel of researchers.

The latest edition illustrates how AI is not just changing what scientists study, but fundamentally altering how research is conducted, evaluated, and applied globally. 

One of the most striking trends revealed in the Index is the surge in corporate AI research. US companies have more than doubled their output in Nature Index journals since 2019, with their Share (a metric used by the Index to measure research output) increasing from 51.8 to 106.5. 

However, this boom in R&D activity comes with a caveat – it still only accounts for 3.8% of total US AI research output in these publications. In essence, despite a major uplift in corporate AI R&D, we’ve not seen those efforts reflected in public research output. 

This raises questions about where corporate AI research is located. Are companies publishing their most groundbreaking work in other venues, or keeping it under lock and key?

The answer is one of competing names and narratives. OpenAI, Microsoft, Google, Anthropic, and a handful of others are firmly entrenched in the closed-source model, but the open-source AI industry, led by Meta, Mistral, and others, is rapidly gaining ground.

Contributing to this, the funding disparity between private companies and public institutions in AI research is staggering. 

In 2021, according to Stanford University’s AI Index Report, private sector investment in AI worldwide reached approximately $93.5 billion. 

This includes spending by tech giants like Google, Microsoft, and Amazon, as well as AI-focused startups and other corporations across various industries.

In contrast, public funding for AI research is much lower. The US government’s non-defense AI R&D spending in 2021 was about $1.5 billion, while the European Commission allocated around €1 billion (approximately $1.1 billion) for AI research that year.

This gaping void in resource expenditure is giving private companies an advantage in AI development. They can afford more powerful computing resources and larger datasets and attract top talent with higher salaries.

“We’re increasingly looking at a situation where top-notch AI research is done primarily within the research labs of a rather small number of mostly US-based companies,” explained Holger Hoos, an AI researcher at RWTH Aachen University in Germany.

While the US maintains its lead in AI research, countries like China, the UK, and Germany are emerging as major hubs of innovation and collaboration.

However, this growth isn’t uniform across the globe. South Africa stands as the only African nation in the top 40 for AI output, showing how the digital divide is at risk of deepening in the AI era. 

AI in peer review: promise and peril

Peer review ensures academic and methodological rigor and transparency when papers are submitted to journals.

This year, a nonsense paper featuring giant AI-generated rat testicles was published in Frontiers, showing that the peer review process is far from watertight.

Someone used DALL-E to create gobbledygook scientific figures and submitted them to Frontiers Journal. And guess what? The editor published it. LOLhttps://t.co/hjQkRQDkal https://t.co/aV1USo6Vt2 pic.twitter.com/VAkjJkY4dR

— Veera Rajagopal  (@doctorveera) February 15, 2024

Recent experiments have shown that AI can generate research assessment reports that are nearly indistinguishable from those written by human experts. 

Last year, an experiment testing ChatGPT’s peer reviews versus human reviewers on the same paper found that over 50% of the AI’s comments on the Nature papers and more than 77% on the ICLR papers aligned with the points raised by human reviewers.
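Agreement figures like these are typically produced by checking, for each point the AI raises, whether any human reviewer raised a matching point. A crude sketch of that hit-rate calculation (the naive keyword matcher here is an illustrative stand-in for the study’s actual matching method):

```python
def overlap_rate(ai_points, human_points, match):
    """Fraction of AI comments that match at least one human comment."""
    hits = sum(any(match(a, h) for h in human_points) for a in ai_points)
    return hits / len(ai_points)

def naive_match(a, b):
    # Toy criterion: the two comments share at least two words.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) >= 2

ai = ["sample size is too small",
      "missing ablation study",
      "figures lack error bars"]
human = ["the sample size seems too small to support the claim",
         "please add error bars to the figures"]

rate = overlap_rate(ai, human, naive_match)  # 2 of 3 AI comments match
```

The studies use far more careful semantic matching, but the reported percentages are ultimately hit rates of this form.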

Of course, ChatGPT is much quicker than human peer reviewers. “It’s getting harder and harder for researchers to get high-quality feedback from reviewers,” said James Zou of Stanford University, the lead researcher on that experiment.

AI’s relationship with research is raising fundamental questions about scientific evaluation and whether human judgment is intrinsic to the process.  The balance between AI efficiency and irreplaceable human insight is one of several key issues scientists from all backgrounds will need to grapple with in the years ahead.

AI might soon be capable of managing the entire research process from start to finish, potentially sidelining human researchers altogether.

For instance, Sakana’s AI Scientist autonomously generates novel research ideas, designs and conducts experiments, and even writes and reviews scientific papers, hinting at a future where AI could drive scientific discovery with minimal human intervention.

On the methodology side, using machine learning (ML) to process and analyze data comes with risks. Princeton researchers argued that since many ML techniques can’t be easily replicated, this erodes the replicability of experiments – a key principle of high-quality science. 

Ultimately, AI’s rise to prominence in every aspect of research and science is gaining momentum, and the process is likely irreversible.

Last year, Nature surveyed 1,600 researchers and found that 66% believe that AI enables quicker data processing, 58% that it accelerates previously infeasible analysis, and 55% feel that it’s a cost and time-saving solution.

As Simon Baker, lead author of the supplement’s overview, concludes: “AI is changing the way researchers work forever, but human expertise must continue to hold sway.”

The question now is how the global scientific community will adapt to AI’s role in research, ensuring that the AI revolution in science benefits all of humanity without unforeseen risks wreaking havoc along the way.

As with so many aspects of the technology, mastering both benefits and risks is challenging but necessary to secure a safe path forward.

The post The 2024 Nature Index reveals how AI is transforming every aspect of scientific research appeared first on DailyAI.

Microsoft to resurrect the Three Mile Island nuclear power plant in exclusive deal

Microsoft has announced an energy deal to reopen the Three Mile Island nuclear power plant on the Susquehanna River near Harrisburg, Pennsylvania.

Constellation Energy, the plant’s current owner, is now set to bring Unit 1 back online for Microsoft. That will involve investing $1.6 billion to restore the reactor by 2028.

While details remain unknown, Microsoft reportedly offered to buy the plant’s output for 20 consecutive years. 

Three Mile Island is best known as the site of the most serious nuclear accident in US history. In 1979, a partial meltdown occurred in one of its reactors, sparking public fear and distrust in nuclear power. 

The plant’s Unit 2 reactor, which melted down, was permanently closed, but Unit 1 continued operating until it was decommissioned in 2019 due to competition from cheaper natural gas. 

Three Mile Island, the site of the worst nuclear accident in U.S. history in 1979, saw a partial meltdown in one of its reactors. Decades later, the undamaged Unit 1 reactor, decommissioned in 2019, is set for revival in an exclusive deal with Microsoft to power AI data centers by 2028. Source: Wikimedia Commons.

Microsoft says the deal is also driven by its carbon-negative pledge by 2030. Nuclear energy is a zero-carbon power source, though there are ongoing controversies over radioactive waste management.

Constellation Energy’s CEO Joseph Dominguez was positive about the move, stating, “This plant never should have been allowed to shut down. It will produce as much clean energy as all of the renewables [wind and solar] built in Pennsylvania over the last 30 years.”

Constellation Energy stated that “Significant investments” need to be made in the plant, including upgrading and renovating the “turbine, generator, main power transformer, and cooling and control systems.”

The rising power demands of AI

Microsoft’s decision to tap into nuclear power shows once again the staggering energy requirements of AI and supporting data center technology.  

The company has been expanding its data centers worldwide, with many of these facilities dedicated to supporting AI workloads, including the training and deployment of models that require vast amounts of computational power.

Training large AI models can consume thousands of megawatt-hours (MWh) of electricity. 

According to some sources, OpenAI’s GPT-3, for instance, required over 1,200 MWh to train, which could power tens of thousands of homes for a day. 
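The household comparison is simple back-of-the-envelope arithmetic. Assuming an average US home draws roughly 30 kWh per day (an illustrative figure, not one taken from the sources above):

```python
# Rough sanity check on the "tens of thousands of homes" claim.
training_mwh = 1_200           # reported GPT-3 training energy (MWh)
home_kwh_per_day = 30          # assumed average US household daily use (kWh)

homes_for_a_day = training_mwh * 1_000 / home_kwh_per_day
print(f"{homes_for_a_day:,.0f} homes powered for one day")  # 40,000
```

At 30 kWh per household per day, 1,200 MWh works out to around 40,000 home-days, squarely in the “tens of thousands” range.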

Hundreds, if not thousands, of powerful AI models are actively being trained at any one time today. AI models require power not just during training but also for day-to-day operations.

This surge in energy demand from AI is part of a broader trend. The International Energy Agency (IEA) estimates that data centers currently account for 1.3% of global electricity consumption, and this is set to rise significantly, with AI infrastructure driving much of the increase. 

By 2030, data centers could consume up to 8% of the world’s electricity, further straining energy grids already stretched thin by increasing reliance on digital services and electric vehicles.

Coal and nuclear to take up the slack

While the focus on nuclear energy highlights the tech industry’s need for low-carbon alternatives, AI’s demand for power is remarkably breathing new life into coal. 

According to a Bloomberg report from earlier in the year, the rapid expansion of data centers is delaying the shutdown of coal plants across the US, defying the push for cleaner energy sources.

In areas like Kansas City, for example, the construction of data centers and electric vehicle battery factories has forced utility providers to halt plans to retire coal plants. 

Microsoft’s decision to power its AI operations with nuclear energy brings the broader conversation of AI sustainability into sharp focus. 

With the tech industry’s growth outpacing energy supplies, innovative solutions are needed to bridge the gap between demand and production. OpenAI, for example, has actively invested in Helion, a nuclear fusion project set to come online soon.

OpenAI CEO Sam Altman said on X, “If Helion works, it not only is a potential way out of the climate crisis but a path towards a much higher quality of life. Have loved being involved for the last 7 years and excited to be investing more.”

Despite its controversies, nuclear power offers a credible solution to AI’s energy demands, particularly in regions struggling to transition fully to renewable energy.

But the stakes are high. Building and maintaining nuclear plants still requires immense resources, and nuclear waste is challenging to dispose of. Many will see this as trivializing decarbonization and renewable energy strategies. 

It is still early days for the Microsoft-Constellation Energy deal.

Still, exclusive, private deals like this are exceptionally rare, showing how power in the AI industry hinges on power in the literal sense. 

The post Microsoft to resurrect the Three Mile Island nuclear power plant in exclusive deal appeared first on DailyAI.