The State of the Open Source Database Industry in 2020: Part Three

A four-part blog series by Matt Yonkovit, Percona Chief Experience Officer. Read Matt’s first blog in this series on “Baseline Data and the Size of the Market” and his second blog in this series on “Migrating from Proprietary Software to Open Source”.

The Most Popular Open Source Databases 2020

Although there are some variations, the most popular open source databases are broadly similar across a number of different surveys and sources.

Percona Survey:
  1. MySQL
  2. PostgreSQL (45%)
  3. Redis (40%)
  4. MariaDB (39%)
  5. Elasticsearch (39%)
  6. MongoDB (38%)
  7. Kafka (15%)

DB-Engines:
  1. MySQL
  2. PostgreSQL
  3. MongoDB
  4. Elasticsearch
  5. Redis
  6. Cassandra
  7. MariaDB

Stack Overflow Survey:
  1. MySQL
  2. PostgreSQL
  3. MongoDB
  4. Redis
  5. MariaDB
  6. Elasticsearch
  7. Cassandra

It is interesting to note that the Percona survey shows very similar usage of PostgreSQL, Redis, MariaDB, Elasticsearch, and MongoDB. They are all within 7 percentage points of each other.
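
As a quick check on that spread, here is a minimal sketch that recomputes it from the Percona survey percentages listed above. The dictionary and the `spread_in_points` helper are names made up for this illustration; MySQL is left out because the list gives no percentage for it.

```python
# Percona survey usage shares as listed above (percent of respondents).
# MySQL is omitted because no percentage is given for it in the list.
usage = {
    "PostgreSQL": 45,
    "Redis": 40,
    "MariaDB": 39,
    "Elasticsearch": 39,
    "MongoDB": 38,
}


def spread_in_points(shares):
    """Return the gap between the highest and lowest share, in percentage points."""
    values = list(shares.values())
    return max(values) - min(values)


spread = spread_in_points(usage)
print(f"Spread: {spread} percentage points")
```

The gap is measured in percentage points (45% down to 38%), not as a relative percentage of any one database's usage.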

Closing the Gap

DB-Engines shows the overall open source market closing the gap with commercial databases:

[Image: open source market]

This matches the data from our open source database survey, which shows increased usage of open source databases:

[Image: open source software adoption]

Open Source Database Popularity

Below is the ranking by overall popularity, according to data from DB-Engines. As of March 2020, MySQL is still over twice as popular as either PostgreSQL or MongoDB.

[Image: open source database popularity]

Stack Overflow runs a large annual Developer Survey (with 90,000 respondents in 2019). We used their survey results to show the year-on-year percentage difference, from 2018 to 2019, in the usage of the open source databases we are focusing on.

[Image: OS database usage]

Quick View of MySQL:

[Image: MySQL database ranking]

Growth and Adoption:

MySQL has seen its growth stabilize over the last few years. Although it isn’t seeing the growth and adoption it had in the past, we still see fairly consistent and steady adoption.

MySQL has by far the largest install base. DB-Engines ranks it more than twice as popular as the next open source database contender.

Cloud providers all say similar things in terms of their install and adoption numbers for MySQL: massive install base and steady growth.

While there is a massive install base, many companies run various flavors of MySQL.

Oracle itself still has a healthy number of MySQL Enterprise deployments (Enterprise has feature differences from Community and is pay-walled).

Percona has Percona Server for MySQL, which includes enhancements and enterprise features while remaining 100% open source.

And, of course, many companies run their own versions.

Most Popular Usage:

3rd Party App Support:
MySQL’s popularity has enabled it to garner a pretty big list of third-party apps that use it for their backend.

Web Apps:
MySQL remains an incredibly popular choice for web applications everywhere.

E-commerce:
Combining third-party apps and web apps leads you to the e-commerce space. Many of our retail customers are happily running and growing their e-commerce footprint with MySQL.

Greenfield Development:
New application development has always been a sweet spot for MySQL.

Opinions and Thoughts on the Future:

It seems that a big effort is being made in the MySQL community to make MySQL more attractive for developers.

This includes adding features that should help adoption and make things more developer-friendly.

More features have been added to increase the general appeal. These are more about incremental improvements than massive innovation leaps. You know what you are going to get, and that is not a bad thing.

Benefits of MySQL:

  • Massive install base
  • Long track record
  • Easy to get started
  • Great coverage by third-party service providers
  • Many alternative providers
  • Good community contributions

Community:

Oracle has provided a steady stream of features and enhancements since acquiring MySQL.

Companies like Percona have pushed the envelope on performance, scalability, and features, bringing additional enhancements.

Facebook, Alibaba, Tencent, and others have contributed a lot of code and enhancements to the community.

New companies in the ecosystem pop up frequently to help push the envelope. For example, PlanetScale is working on operationalizing Vitess as a viable scale-out solution.

Considerations:

  • Not really a great migration target for large legacy applications
  • Confusion still persists between MySQL and MariaDB (they are different)
  • While experts are easier to find than for other open source databases, it’s still not always easy to find them
  • Innovation is steady and incremental, but no sweeping enhancements

Quick View of PostgreSQL:

[Image: PostgreSQL database ranking]

Growth and Adoption:

PostgreSQL has been around for over 23 years, but over the last few years it has seen a massive resurgence in popularity and growth. DB-Engines shows its popularity doubling over the last four years.

Cloud providers have told us that this is their fastest growing database platform. Analysts have told us that this is their #1 most enquired-about database.

Our own data and experience have verified this explosive growth.

Much of the growth in PostgreSQL is coming from migrations and the modernization of legacy business apps. Growth is also fueled by the openness of the project, enabling it to be embedded, modified, and enhanced in third party applications, hardware, and other systems.

Most Popular Usage:

Legacy Database Migration:
The PostgreSQL community has done a great job of incorporating enterprise features and a solid procedural language. This has made PostgreSQL a very popular migration target.

Business Applications:
Because of the feature set and robust nature of PostgreSQL, it is a great place for building new OLTP apps for your business.

GIS Applications:
Geospatial support in PostgreSQL is super strong.

The Edge and Embedded Systems:
The openness of the licensing makes this a great choice to embed in your own projects, offerings, and systems.

Opinions and Thoughts on the Future:

With so much emphasis being placed on PostgreSQL, the future is bright and we anticipate continued growth.

That being said, the open licensing for PostgreSQL is going to lead to more “compatible” products that may lock you in.

Be mindful that not every PostgreSQL offering is going to be a truly open product; some are simply “open source compatible”.

Benefits:

  • PostgreSQL is fully open. There is no single entity behind it, making it a truly open project.
  • Licensing makes this easy to put into place, embed, and modify as needed
  • It has a well fleshed-out and complete stored procedure language
  • There is an easier path than most for classic database developers and DBAs to learn and become productive
  • There are lots of enterprise extensions and options available
  • A large list of options for support, services, and DBaaS

Community:

The PostgreSQL community is strong and passionate.

All major cloud providers, several different third party providers (such as Percona, EnterpriseDB, CrunchyData, 2ndQuadrant, Pivotal), and large enterprises, all contribute back to PostgreSQL.

The number of companies offering support and services is high, with many offering enhancements on the standard PostgreSQL database.

Considerations:

  • Not all providers are as open as others. Check the details before committing to ensure you maintain portability
  • Sometimes getting third party extensions and add-ons to work properly requires a bit of work
  • Support for add-ons tends to be hit or miss depending on the author
  • There are lots of features. PostgreSQL tends to have a reputation in the user space as being a bit more complex and harder to master than others. However, it is easier for those well versed in relational concepts.

Quick View of MongoDB:

[Image: MongoDB database ranking]

Growth and Adoption:

MongoDB is the only vendor on our shortlist that is both publicly traded and focused on open source (Oracle is also public, but its focus is not only on OSS), so we can look at its revenue and customer growth numbers.

In its third quarter fiscal 2020 financial results, MongoDB announced total revenue of $109.4 million, up 52% year-over-year. MongoDB Atlas revenue was 40% of total Q3 revenue, up over 185% year-over-year. They also show strong customer numbers, with over 15,900 customers as of October 31, 2019.
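
To put those percentages in context, the rough sketch below backs out the prior-year figures they imply. It uses only the numbers quoted above; the variable names and the `prior_year` helper are made up for this illustration, and because the published percentages are rounded, the results are approximations rather than reported values.

```python
# Figures quoted above from MongoDB's Q3 fiscal 2020 results.
total_revenue_musd = 109.4   # total revenue, millions of USD
total_growth = 0.52          # up 52% year-over-year
atlas_share = 0.40           # Atlas was 40% of total Q3 revenue
atlas_growth_floor = 1.85    # Atlas revenue was up "over 185%" year-over-year


def prior_year(current, growth):
    """Back out last year's value from this year's value and the growth rate."""
    return current / (1.0 + growth)


prior_total = prior_year(total_revenue_musd, total_growth)
atlas_revenue = total_revenue_musd * atlas_share
# Growth was "over" 185%, so this is an upper bound on last year's Atlas revenue.
prior_atlas_max = prior_year(atlas_revenue, atlas_growth_floor)

print(f"Implied prior-year total revenue: ~${prior_total:.1f}M")
print(f"Implied Atlas revenue this quarter: ~${atlas_revenue:.1f}M")
print(f"Implied prior-year Atlas revenue: at most ~${prior_atlas_max:.1f}M")
```

Under these assumptions, most of the year-over-year increase in total revenue would come from Atlas, which fits the point that revenue growth alone says little about adoption of the core database.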

MongoDB saw a portion of revenue growth coming from expansions, price increases, and product changes. It’s hard to measure the overall adoption of MongoDB from revenue only.

DB-Engines shows growth, but not as fast as the revenue or customer numbers indicate. But, it is growing.

The Stack Overflow survey shows flatter MongoDB adoption numbers.

However, Percona’s web traffic for MongoDB-related topics is increasing by 20%+ year-on-year.

Most Popular Usage:

Web – Especially Mobile
Get Big Fast was the mantra of Web 2.0. Only MongoDB delivered on both “big” and “fast”. NoSQL puts schema control in the hands of the agile web app or microservice developer, reducing their migration iteration time.

Gaming – Especially Mobile
Storing flexible inventories, keeping up with metadata changes, the ability to scale, and easier sharding, make this a popular target.

X-as-a-Service
Rapidly evolving businesses and rapidly changing data requirements are a good fit for MongoDB’s flexible document storage. This makes it a popular backend for many SaaS businesses.

While MongoDB is not the only database targeted by this large community, it is one of them.

Opinions and Thoughts on the Future:

MongoDB is unique in this list as development energy is veering away from the core database server.

MongoDB Inc was originally 10gen, which intended to be an entire platform for online services. It seems to be coming back to that original mission by adding text search, cloud automation, mobile integration, all within the MongoDB Atlas service. However, these are all closed source.

With major features and investments happening first and exclusively in Atlas, is the core MongoDB enterprise just the teaser to move you to their cloud platform?

Benefits:

  • Flexible schema
    NoSQL put schema control in the hands of the agile web app or microservice developer, significantly increasing app release iterations per year.
  • Objects in your language
    MongoDB drivers seamlessly convert DB documents to native objects of JavaScript / Go / Python / Java / etc. You can work with the natural data type for your programming language, not SQL record structures.
  • Natural High Availability
    MongoDB server and client drivers were built around the replica set from the start, enabling easy, downtime-free maintenance, and automatic failover.
  • Horizontal Scaling built-in
    Without additional software, MongoDB can be run as a sharded cluster of up to several hundred shards. This makes it the best Big Data solution amongst general-purpose databases.
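
The flexible schema and native object points can be illustrated with a small sketch. It uses plain Python dicts as stand-ins for documents and does not connect to a database or use a driver; the `players` data and the two helper functions are hypothetical and exist only for this example.

```python
# Plain Python dicts standing in for documents from one hypothetical
# "players" collection. No database or driver is used in this sketch;
# PyMongo, for instance, hands documents back as dicts shaped like these.
players = [
    {"_id": 1, "name": "ada", "inventory": [{"item": "sword", "qty": 1}]},
    # A newer document adds a field without any schema migration.
    {"_id": 2, "name": "lin", "inventory": [], "guild": {"name": "owls", "rank": 3}},
]


def guild_name(player):
    """Read an optional nested field, tolerating documents that lack it."""
    return player.get("guild", {}).get("name")


def total_items(player):
    """Sum item quantities using ordinary list and dict operations."""
    return sum(entry["qty"] for entry in player["inventory"])


for p in players:
    print(p["name"], guild_name(p), total_items(p))
```

The trade-off is that the application code, rather than the database schema, has to handle documents that are missing newer fields.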

Community:

Pre-IPO: There were a lot more code hackers, contributors, and a bit more openness.

Post-IPO: Official MongoDB Certification, no hacking here, please.

The MongoDB community is a lot more centrally controlled than other databases we interact with.

Considerations:

  • SSPL is not a recognized open source license. Not everyone considers MongoDB to be truly open source.
  • There is uncertainty over what is allowed and what is not allowed with the SSPL. This has hampered community and third party adoption.
  • Growth and focus are on Atlas as a platform. This is great if you want to run in the cloud, but what if you don’t want to be locked in to Atlas?
  • MongoDB is a tightly controlled ecosystem, by far the most restrictive of the listed open source projects we support. This means community contributions and third party services and support lag behind other databases.

Quick View of MariaDB:

[Image: MariaDB database ranking]

Growth and Adoption:

DB-Engines shows that MariaDB has experienced significant growth over the past few years. However, it has less than 1/10th of the popularity of MySQL according to their rankings.

MariaDB has a massive community presence helped by its default status on many Linux Distributions. But, that growth has not yet led to significant commercial success.

We have seen a rise in companies with mixed MySQL and MariaDB environments, though this is generally happenstance rather than a strategic decision.

The cloud providers we talked with are seeing slower adoption of MariaDB in the cloud; those with the choice tend to go with MySQL.

While growth for MariaDB is happening globally, we see that adoption happening much faster outside of North America (specifically in Europe and Asia).

Most Popular Usage:

Rapid Development
Many developers like how easy it is to start with MariaDB. They also like the extra features MariaDB has been working on.

Web Apps:
Like its predecessor, MariaDB has a good reputation amongst those running web apps.

Oracle Alternative:
MariaDB has been trying to position itself as an alternative to Oracle and started including additional Oracle-like syntax and features over the last few years.

Many see moving from OracleDB to MySQL Enterprise as just moving to yet another Oracle product. MariaDB has gained some traction here, but still lags behind PostgreSQL in this space.

Opinions and Thoughts on the Future:

MariaDB is really gearing up to compete in the enterprise space, and seems to be particularly targeting industries such as the financial sector.

They have been trying to distance themselves from the reputation of being a derivative of MySQL. Unfortunately, this still leads to confusion for many.

MariaDB is working towards launching a GA of their own DBaaS to compete with the larger cloud vendors, who they have recently soured on and accuse of strip-mining open source.

As MariaDB pushes into the enterprise space, they are removing some of their more experimental features to focus on providing a slimmed-down, more stable, enterprise build.

Note: it is often hard to separate the MariaDB Foundation from MariaDB Corp.

Benefits:

  • MariaDB tends to push the envelope on a few things, offering lots of new interesting experimental features to try that can solve various use cases.
  • Because it has tried to maintain compatibility with MySQL, it is easier to move to or start using for those already in the MySQL space.
  • The MariaDB Foundation is very accommodating and focused on community engagement.
  • Having their column database included allows mixed-use cases. They are really pushing the idea of “Smart Transactions.” It will be interesting to see if this catches on.

Community:

MariaDB is split into the MariaDB Foundation, which is responsible for the base code and is funded through sponsorships and other means, and the MariaDB Corporation. While there is a split, it’s often hard to separate the two entities, as they go hand-in-hand.

MariaDB has a large group of users who have deployed and are happily using MariaDB for a wide range of workloads. That said, the visibility and number of companies contributing are not on the same scale as MySQL or PostgreSQL. There are also sponsored features that are picked up and paid for by enterprises.

As overall adoption grows, the enterprise contributors will also grow. You can look at the Foundation to see significant companies that have invested in MariaDB’s future, such as Booking.com, Microsoft, Alibaba, Tencent, and IBM.

Considerations:

  • MariaDB is no longer really fully MySQL-compatible. For a long time, they pushed this, but it has now diverged so much that it is its own product. Once you migrate over, it is hard to move back.
  • The overlap of core RDBMS with MySQL, however, is still great. So, often it just boils down to personal preference between the two databases.
  • The rapid pace of feature adoption has led to workload anomalies and issues in the past. The decision to pare back features for a more enterprise-focused build seems to acknowledge this.

*Thanks to DB-Engines and Stack Overflow for their additional database information and graphs.

My final blog in this series on The State of the Open Source Database Industry in 2020 is available next week and will discuss the question “When is open source not really open?”


by Matt Yonkovit via Percona Database Performance Blog
