pwshub.com

Microsoft backs open source PostgreSQL analytics extension

A Microsoft-backed open source project aims to help address PostgreSQL's weaknesses as an analytics database.

The project, which is under the permissive MIT License, plans to bring popular in-process OLAP system DuckDB to PostgreSQL, which has been named the most popular database among developers since 2023.

While the popularity of PostgreSQL has grown steadily over the last 28 years, something credited to its robust and extensible design, developers have struggled to build analytical systems based on the database as effectively as they support transactions.

An open source PostgreSQL extension that embeds DuckDB's analytics engine intends to help address the problem.

A blog post from MotherDuck, which provides back-end services and extensions to DuckDB, said the IP is owned by the independent non-profit DuckDB Foundation, ensuring that it stays open source. It is hosted in the official DuckDB GitHub repository.

"PostgreSQL is great at a lot of things, but if you try to use it for analytics, you hit a wall pretty quickly," the post said. "That is, it is great at creating, finding and locating individual rows, but if you want to understand what is going on in a data set, it can be painfully slow.

"People have tried to add Band-Aids to improve PostgreSQL analytical performance but they haven't been particularly successful, because being good at analytics requires different techniques for running your queries, like being able to operate over batches of rows at once, and avoiding decompressing data until it is absolutely needed. And typically, that takes a purpose-built analytical engine, which takes a ton of effort."

DuckDB is embedded within a host process, so there is no DBMS server software to install, update, or maintain. For example, the DuckDB Python package can run queries directly on data in Python software library Pandas without importing or copying data. Written in C++, DuckDB is free and open source under the MIT License.

MotherDuck said DuckDB's vectorized query engine processes chunks of data at a time, making it valuable for answering analytical questions about what is going on in the data. "DuckDB's popularity has been soaring due to its speed, ease of use, and versatility," the blog claimed.

  • DuckDB promises greater stability with 1.0 release
  • DuckDB shuns VC breadcrumbs so support isn't all it's quacked up to be
  • VCs lay $52.5M golden egg for MotherDuck's serverless analytics platform
  • Ex-BigQuery exec and Motherduck CEO: For some users, the answer is to think small

The new DuckDB PostgreSQL extension, pg_duckdb, is supported by DuckDB Labs, the commercial support company founded by DuckDB's developers, Microsoft, which hosts PostgreSQL services and employs PostgreSQL committers, Hydra, the data and software company that kicked off pg_duckdb, and Neon, which offers a serverless PostgreSQL database platform. MotherDuck is also supporting the project.

Andy Pavlo, associate professor of databaseology at Carnegie Mellon University, said developers had been trying to make OLAP versions of PostgreSQL since the 2000s, "but these were all hard forks of the Postgres code" such as Greenplum, Vertica, Aster Data, and ParAccel.

"This approach was necessary at the time because the PostgreSQL extension or 'hooks' API started in 2006 and was originally meant to [provide] support for debugging functionality," he told us. "Then in the early 2010s, systems like Citus showed how to create a version of PostgreSQL that was based entirely on using extensions. The other big player which showed that you don't need to fork Postgres to support custom workloads is Timescale.

"DuckDB's optimizer is not as battle-hardened as PostgreSQL in terms of the number of years of service. But DuckDB's optimizer design is based on the TUM's HyPer optimizer, which is one of the best in the world. For example, it can unnest arbitrary subqueries that not even Microsoft SQL Server or Oracle can handle."

Pavlo added that the range of companies backing the project was impressive, especially MotherDuck, Neon and Microsoft. "If I was a developer that was already using Postgres + DuckDB in my stack and I wanted to be able to combine the two together, I would choose pg_duckdb over the other DuckDB Postgres extension alternatives," he said. ®

Source: theregister.com

Related stories
1 month ago - Plus: Three-year-old ProxyLogon flaw added to CISA's exploited bugs list Microsoft says it's investigating issues with a patch intended to plug a two-year-old flaw in the GRUB open source boot loader that is crashing some dual-boot...
3 days ago - AI-coded contributions? Most would rather skip the bot's work The majority of open source project maintainers are not being paid for their work, spend three times as much time on security than they did three years ago, and have become...
1 month ago - A glimpse into the rapidly advancing world of Chinese open source After a couple of years in development, Linux Deepin 23 arrives, with some new shiny that throws shade on the leading Western desktop distros.…
1 week ago - Plus: Trump family X accounts hijacked to promote crypto scam; Fog ransomware spreads; Hijacked PyPI packages; and more Infosec in brief After activating its chameleon field and going to ground following press attention earlier this year,...
1 month ago - HELLO, MICROSOFT? YOU THERE? — Microsoft said its update wouldn't install on Linux devices. It did...
Other stories
48 minutes ago - If you haven't decided to upgrade to the iPhone 16 just yet, better keep an eye on the old iPhone's battery.
48 minutes ago - This central Texas city is booming with internet providers offering speedy cable, satellite and fiber options. Explore CNET's top picks for the best broadband providers in Temple.
48 minutes ago - "The World's Borough" has many high-speed internet providers. Here are CNET's top picks for the best internet in Queens.
48 minutes ago - Spectrum is an obvious choice for home internet in Duluth, but it’s not the area's only option. Here are the best internet providers in Duluth.
48 minutes ago - Why You Can Trust CNET Our expert deal-hunting staff showcases the best price drops and discounts from reputable sellers daily. If you make a...