MongoDB performance vs PostgreSQL

8/9/2023

Sharding (splitting a single big dataset across multiple servers) and replication (maintaining multiple copies of a dataset for HA) are completely separate things. If you want just replication, then that's the standard mode of operation of almost every RDBMS that supports a cluster of more than one instance. In a sharded model, you can have redundancy by simply having more than one node responsible for the same record, and modifying your algorithm to produce more than one result. That seems to be more or less what MongoDB does, with its "each shard can be a replica set" approach. So yeah, not quite the unique selling point that MongoDB are presenting it to be.

Basically, MongoDB still doesn't do anything special here, and if you just run a replica set, you still don't get the relational integrity that an RDBMS would provide.

Edit: Also, you usually want none of the above things. The complexity of highly redundant setups tends to cause more problems than it solves. HA replication is also not trivial, and it's something that MongoDB definitely doesn't offer, despite its marketing: for true HA, you need a guarantee that each node in the cluster will always produce non-stale data, which MongoDB doesn't provide.

If you still really want to do sharding for some reason, then there's an implementation of that for PostgreSQL and, from a quick search, it seems for MySQL as well (though I don't bother with MySQL personally; PostgreSQL is better and nicer to work with in almost every way). There's a reason why RDBMSes don't shard by default.

I'm not using it myself; I've only installed MongoDB in Docker for dev at this point.
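The redundancy idea mentioned above (more than one node responsible for the same record, with the lookup algorithm producing more than one result) can be sketched roughly as follows. This is only an illustration under assumptions: the node names, the `replica_nodes` function, and the "hash, then take the next neighbours" placement rule are all hypothetical, and this is not a description of how MongoDB actually places data.

```python
import hashlib


def replica_nodes(key: str, nodes: list[str], copies: int = 2) -> list[str]:
    """Return `copies` distinct nodes that should all hold the record for `key`.

    Hypothetical placement rule: hash the key to pick a starting node,
    then also use the following nodes in the list, wrapping around.
    """
    if not 1 <= copies <= len(nodes):
        raise ValueError("copies must be between 1 and len(nodes)")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


# Hypothetical cluster of four nodes; every record is stored on two of them.
NODES = ["node-0", "node-1", "node-2", "node-3"]
owners = replica_nodes("user:42", NODES, copies=2)
```

Every client that runs the same function over the same node list gets the same owners for a key. Note that nothing in this sketch keeps the copies in sync with each other, which is exactly where the stale-data problem comes from.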

Their marketing seems to indicate that it's much better than MySQL.

Quote: "... but I've heard that they apparently scale better than [...]"

Okay, so what that is really referring to is that MongoDB uses sharding (basically, distributing records across multiple servers and using a deterministic algorithm to determine what server to ask for what record), which makes it "easy" to scale up in the sense that it doesn't require you to architect your data storage around a particular distribution model across servers; it just throws all the records into a big content-addressable bucket. This is not a technique that's unique to MongoDB, and in fact there are quite a few databases that can be sharded (or are sharded by default).

What their marketing copy doesn't mention, however, is that sharding comes with severe tradeoffs: you can't have relational integrity, because a sharded system cannot assume that other servers are available to check the validity of certain references against, and there can be significant overhead associated with lookups when different servers have a different idea of which servers are currently online and "healthy", as well as a lot of opportunities for servers to serve up outdated versions of records.

All the while, this functionality is not necessary for the vast majority of projects (you can scale very far on a single server, enough for 99.9% of use cases), and so sharding is a really bad default as a replication strategy, because you will be trading in data integrity guarantees that you do need for scalability features that you don't need.

Quote: "I haven't tested it myself, but Mongo claims to be easier to configure HA on."

(A great indicator for whether a new-ish technology is just hype or a serious improvement: is the web full of "getting started" posts, or are there also in-depth articles about long-term use? If it's just the former and almost none of the latter, it's probably just hype.)
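To make the "deterministic algorithm to determine what server to ask for what record" idea concrete, here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: the shard names, `shard_for`, and the `ShardedStore` class are hypothetical, the "servers" are plain dictionaries, and this is not MongoDB's actual routing logic.

```python
import hashlib


def shard_for(key: str, shards: list[str]) -> str:
    """Deterministically map a record key to exactly one shard name."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


class ShardedStore:
    """Toy sharded key-value store; each 'server' is just a dict."""

    def __init__(self, shards: list[str]) -> None:
        self.shards = list(shards)
        self.data: dict[str, dict[str, object]] = {name: {} for name in self.shards}

    def put(self, key: str, value: object) -> None:
        # The record lands on whichever shard the hash selects.
        self.data[shard_for(key, self.shards)][key] = value

    def get(self, key: str) -> object:
        # Only the owning shard is asked; the others are never consulted.
        return self.data[shard_for(key, self.shards)].get(key)


store = ShardedStore(["shard-a", "shard-b", "shard-c"])
store.put("order:1001", {"customer": "customer:7"})
order = store.get("order:1001")
```

The `put` call above happily stores an order that references `customer:7`, even though no such customer was ever written. Checking that reference would mean asking whichever shard owns `customer:7`, which may be a different server and may not be reachable, and that is the relational-integrity tradeoff described above.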
Quote: "Mongo is easy to get started with (document db instead of relational) and has an easy replication setup"

Here's the problem, though: "getting started with" something is something you only do once, whereas "keeping it going" is something you will be doing effectively forever. MongoDB makes it "easy to get started", in exchange for making everything after that significantly harder and less reliable, forever. That's great for their ability to market a subpar database product (and in fact, this seems to be quite literally their marketing strategy), but for the end user it means that the rare case is being optimized at the cost of the common case, i.e. a very bad deal.

As for "easy replication setup": it may be easy to get it running in something it claims is a replicated setup, but pretty much every single person I've spoken to who has actually maintained a serious production MongoDB cluster has called it a nightmare to operate and maintain, with constant inexplicable failures. There's a reason there's a billion "this is how easy it is to get started with MongoDB" tutorials around the web, and virtually none that tell you how to maintain a MongoDB cluster in the long run. Most anyone running a serious deployment has migrated away to a serious database by that point.
