Power Up Your PostgreSQL Analytics With Swarm64


0:22:41

So what's pretty good is that we came into it having already built other database extensions, so we were really looking at, okay, what are the lessons we've learned? And we made the conscious choice to stay on an extension level with Postgres. In other words, we would not go in and build our own Postgres. As many of your listeners probably know, a lot of the popular projects and products in the market are actually Postgres derivatives. You have the examples of Amazon Redshift, IBM Netezza, or, for example, Pivotal Greenplum, which all were once upon a time a version of Postgres that was then kind of taken private and forked into a new project. And we decided not to do that. So we started with looking at, okay, where are extension hooks that we can use, where are certain APIs that we can use, and we started kind of expanding from there.
And Postgres is very, very versatile in that space. I mean, it's probably among the most extensible databases there are, including closed source databases. So across both open and closed source databases, Postgres is probably among the ones that are most extensible. What you can do is define certain ways in which your data is accessed, for example a custom scan provider, and you can define ways in which your data is stored. We started with foreign tables, the foreign data storage engines, because there was no native storage engine API yet at the point we started. There now is one in version 12. We are very eager to see how this table storage API will evolve in the future, and we may actually go much more in that direction. But for now, it's really a combination of defining certain table sources, in our case through the foreign table API, combined with certain access paths that we can define, certain query planner hooks, and certain cost functions we can provide to Postgres. So it's really been designed very, very well in terms of extensibility.
And you can just offer yourself to all these different extension hooks, and then your respective functions will be called. You have the ability to tell bog-standard Postgres about all the great things you can do in addition, and this is how we worked. And we realized that it's not the easiest way of working, but it's in a way the most rewarding. On the one side, you're really benefiting from the lower effort and overhead to move between Postgres versions. And secondly, it was actually very easy for us to support other solutions. For example, our product also works for EnterpriseDB. EnterpriseDB's Postgres Advanced Server is actually not open source, and still we were able to compile for Postgres Advanced Server and run on it. So now you can also use our product in a solution like EnterpriseDB. And that would not have been the case if we hadn't gone for the kind of modular, pluggable architecture that Postgres was offering us.
Now that is how we work into the system. Let me just cover a few parts of what we're actually doing. If we take the anatomy of a query, it goes into the system, and we are basically offering Postgres, in addition to all the different data handling mechanisms that it has itself, additional ways to process the query. For example, we offer it to move data around during the query, so-called shuffling, so that the query can stay parallel for longer. That's one of the things we do. We also offer Postgres our own join implementation, specifically optimized for joining very large amounts of data. If you want to join tables that have a few billion rows with tables that have a few million, or even a few billion, rows themselves, that is something that can very quickly bring Postgres to its limits. And what we did is build a special join implementation for that.
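
The shuffling idea described above can be illustrated with a minimal Python sketch. It is not Swarm64's implementation: it only shows the general technique of redistributing the rows of both tables by a hash of the join key, so that each partition can be joined independently (and therefore in parallel). The function names `shuffle`, `local_hash_join`, and `partitioned_join`, the worker count, and the sample tables are all assumptions made for illustration.

```python
from collections import defaultdict


def shuffle(rows, key, n_workers):
    """Redistribute rows into n_workers partitions by hashing the join key."""
    partitions = [[] for _ in range(n_workers)]
    for row in rows:
        partitions[hash(row[key]) % n_workers].append(row)
    return partitions


def local_hash_join(left, right, key):
    """Plain hash join on one partition: build on the right, probe with the left."""
    table = defaultdict(list)
    for r in right:
        table[r[key]].append(r)
    return [{**l, **r} for l in left for r in table.get(l[key], [])]


def partitioned_join(left, right, key, n_workers=4):
    """Shuffle both inputs on the join key, then join each partition on its own.

    In a real system every partition would go to a separate parallel worker;
    here they are processed in a loop to keep the sketch short.
    """
    left_parts = shuffle(left, key, n_workers)
    right_parts = shuffle(right, key, n_workers)
    result = []
    for lp, rp in zip(left_parts, right_parts):
        result.extend(local_hash_join(lp, rp, key))
    return result


# Hypothetical sample data: 10 orders spread over 5 customers.
orders = [{"cust_id": i % 5, "order_id": i} for i in range(10)]
customers = [{"cust_id": i, "name": f"c{i}"} for i in range(5)]
joined = partitioned_join(orders, customers, "cust_id")
```

Because matching keys always hash to the same partition, no worker needs to see another worker's rows after the shuffle, which is what lets the query stay parallel through the join.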
So that's something that is offered to Postgres, and it can pick it if it wants to. We also offer certain query rewriting patterns. If we notice that something is going to be executed very badly, for example because it uses a very linear execution mechanism where you could do it in parallel, then we will offer that rewrite to Postgres, and the Postgres query planner will pick and choose. Once the query is planned and gets executed, we have the matching executor nodes for all these things. We also have the accelerated I/O I was mentioning before.
And when it comes to processing, we can sometimes offload the entire query to the hardware accelerators. These are optional hardware accelerators: you can use FPGAs, or you can use Samsung SmartSSDs. Those FPGAs from Intel or Xilinx, or SmartSSDs from Samsung, will then receive instructions, process data according to the query, and only return the results. So all in all, there is a host of different functions we are offering to Postgres, and the query planner will kind of choose, like from a buffet. And if you have the optional additional hardware acceleration, it will also offload and push down a lot of the query processing directly to the additional hardware, thereby making your system even more efficient.
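
The "buffet" model can be sketched in a few lines of Python. This is a deliberately simplified illustration, not Postgres or Swarm64 code: an extension contributes extra candidate paths with cost estimates, and the planner keeps whichever candidate is cheapest. In PostgreSQL itself this kind of contribution is typically made in C, through planner hooks such as `set_rel_pathlist_hook` and calls to `add_path()`; whether Swarm64 uses exactly those hooks is not stated in the conversation. The `Path` class, the path names, and every cost constant below are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One candidate way of executing a join, with an estimated cost."""
    name: str
    startup_cost: float
    total_cost: float


def builtin_paths(n_rows):
    # Hypothetical cost model: no setup overhead, but expensive per row.
    return [Path("builtin hash join", 0.0, 1.0 * n_rows)]


def extension_paths(n_rows, has_accelerator):
    # Hypothetical cost model: higher setup cost, much cheaper per row.
    paths = [Path("extension parallel join", 50_000.0, 50_000.0 + 0.2 * n_rows)]
    if has_accelerator:
        # Offloading only pays off once the input is large enough.
        paths.append(Path("offloaded join", 200_000.0, 200_000.0 + 0.05 * n_rows))
    return paths


def plan(n_rows, has_accelerator=False):
    """Collect all offered paths and keep the one with the lowest total cost."""
    candidates = builtin_paths(n_rows) + extension_paths(n_rows, has_accelerator)
    return min(candidates, key=lambda p: p.total_cost)


small = plan(1_000)
large = plan(10_000_000)
large_accelerated = plan(10_000_000, has_accelerator=True)
```

With these made-up numbers, the small join stays on the built-in path, the large join moves to the extension's parallel path, and the large join with an accelerator present is offloaded. The point is that the extension only offers alternatives with honest cost estimates; the choice remains with the planner.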