(These are some notes I took while studying the Citus code, so they are probably more detail-oriented than big-picture-oriented.)
Citus overrides the utility hook with multi_ProcessUtility. This function calls ProcessCopyStmt() for COPY statements, which calls CitusCopyFrom(), which calls CopyToExistingShards().
CopyToExistingShards() uses the postgres/src/include/commands/copy.h API to read tuples:
BeginCopyFrom()
NextCopyFrom()
EndCopyFrom()
It uses the CitusCopyDestReceiver API to write tuples. CitusCopyDestReceiver is a specialization of postgres’ DestReceiver, which contains the following methods:
rStartup / rShutdown: per-executor-run initialization and shutdown.
rDestroy: destroy the object itself.
receiveSlot: called for each tuple to be output.
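To make the receiver pattern concrete, here is a minimal self-contained sketch in C. It is not the PostgreSQL or Citus code: ToyTuple, ToyReceiver, CountingReceiver, create_counting_receiver() and copy_tuples() are hypothetical names invented for illustration, and the method signatures are simplified (the real methods take slots, tuple descriptors and so on). It only shows the general shape: a struct of function pointers, a specialization that embeds that struct as its first member, and a loop that calls startup once, receiveSlot per tuple, and shutdown once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tuple; PostgreSQL passes a TupleTableSlot. */
typedef struct ToyTuple
{
    int key;
    const char *value;
} ToyTuple;

/* Simplified receiver interface: a struct of function pointers. */
typedef struct ToyReceiver ToyReceiver;
struct ToyReceiver
{
    void (*rStartup)(ToyReceiver *self);
    bool (*receiveSlot)(const ToyTuple *tuple, ToyReceiver *self);
    void (*rShutdown)(ToyReceiver *self);
    void (*rDestroy)(ToyReceiver *self);
};

/* A specialization embeds the base struct first and adds its own state. */
typedef struct CountingReceiver
{
    ToyReceiver base;
    bool running;
    int received;
} CountingReceiver;

static void counting_startup(ToyReceiver *self)
{
    CountingReceiver *receiver = (CountingReceiver *) self;
    receiver->running = true;
    receiver->received = 0;
}

static bool counting_receive(const ToyTuple *tuple, ToyReceiver *self)
{
    CountingReceiver *receiver = (CountingReceiver *) self;
    (void) tuple;            /* a real receiver would route the tuple somewhere */
    receiver->received++;
    return true;             /* in this sketch, true means "keep sending" */
}

static void counting_shutdown(ToyReceiver *self)
{
    ((CountingReceiver *) self)->running = false;
}

static void counting_destroy(ToyReceiver *self)
{
    free(self);              /* base is the first member, so this frees the whole object */
}

ToyReceiver *create_counting_receiver(void)
{
    CountingReceiver *receiver = calloc(1, sizeof(CountingReceiver));
    assert(receiver != NULL);
    receiver->base.rStartup = counting_startup;
    receiver->base.receiveSlot = counting_receive;
    receiver->base.rShutdown = counting_shutdown;
    receiver->base.rDestroy = counting_destroy;
    return &receiver->base;
}

/* Drive the receiver the way a copy loop might: start, send tuples, shut down. */
int copy_tuples(const ToyTuple *tuples, int tuple_count, ToyReceiver *dest)
{
    int written = 0;
    dest->rStartup(dest);
    for (int i = 0; i < tuple_count; i++)
    {
        if (!dest->receiveSlot(&tuples[i], dest))
            break;
        written++;
    }
    dest->rShutdown(dest);
    return written;
}
```

The caller only ever sees a ToyReceiver pointer, so the reading side does not need to know where the tuples end up. That separation is presumably what makes a receiver-style interface convenient for routing rows to shards.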
... More ...
If you set debug_print_parse, debug_print_rewritten, or debug_print_plan to true, PostgreSQL will log some of its interesting internal data structures during query execution. But these logs are usually too long and difficult to inspect.
I recently switched to using Frog to generate my blog. Last week I wrote a little Frog plugin that lets me embed these trees in a nice tree view in blog posts.
... More ...
You can use TRUNCATE in postgres to delete all of the rows in a table. Its main advantage over DELETE is performance. For example, using DELETE to delete all rows in a table with 1 million rows takes about 2.3 seconds, but truncating the same table takes about 10 ms.
But how does postgres implement TRUNCATE so that it is this fast?
... More ...
I recently came across the "Files are hard" article, and it made me wonder how reliable cstore_fdw’s design and implementation are. cstore_fdw is a columnar store for PostgreSQL that I designed and developed in my previous job at Citus Data.
I am writing this post so that more people can review my design decisions for cstore_fdw, and so that I can get some feedback and improve the design.
... More ...
Recently I started learning Haskell by studying the Intro to Functional Programming course on edX. Since then, I have been trying to model different problems using Haskell.
One of these problems is the Skyline problem, which goes like this:
You are given a set of rectangular buildings in a city, and you should return the skyline view of the city. The input is a sequence of tuples (x_{left}, height, x_{right}), each describing a building. The output is a sequence of pairs (x, height), each meaning that the height of the skyline changes to height at the given x coordinate.
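To pin down the input and output formats, here is a small brute-force sketch in C. It is not the Haskell solution from the post. The names Building, SkylinePoint, height_at() and skyline() are made up for this example, and it assumes integer coordinates, positive heights, and that each building covers the half-open range from x_{left} up to but not including x_{right}. For every distinct building edge it looks up the tallest building covering that x and emits a pair whenever the height changes.

```c
#include <assert.h>
#include <stdlib.h>

/* One input tuple (x_left, height, x_right). */
typedef struct Building
{
    int left;
    int height;
    int right;
} Building;

/* One output pair (x, height). */
typedef struct SkylinePoint
{
    int x;
    int height;
} SkylinePoint;

static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Tallest building covering x, treating each building as [left, right). */
static int height_at(const Building *buildings, size_t n, int x)
{
    int best = 0;
    for (size_t i = 0; i < n; i++)
    {
        if (buildings[i].left <= x && x < buildings[i].right &&
            buildings[i].height > best)
            best = buildings[i].height;
    }
    return best;
}

/*
 * Writes the skyline into out, which must have room for 2 * n points,
 * and returns the number of points written. Runs in O(n^2) time.
 */
size_t skyline(const Building *buildings, size_t n, SkylinePoint *out)
{
    if (n == 0)
        return 0;

    /* The height can only change at a left or right edge. */
    int *xs = malloc(2 * n * sizeof(int));
    assert(xs != NULL);
    for (size_t i = 0; i < n; i++)
    {
        xs[2 * i] = buildings[i].left;
        xs[2 * i + 1] = buildings[i].right;
    }
    qsort(xs, 2 * n, sizeof(int), compare_ints);

    size_t count = 0;
    int previous = 0;        /* ground level before the first building */
    for (size_t i = 0; i < 2 * n; i++)
    {
        if (i > 0 && xs[i] == xs[i - 1])
            continue;        /* skip duplicate coordinates */
        int height = height_at(buildings, n, xs[i]);
        if (height != previous)
        {
            out[count].x = xs[i];
            out[count].height = height;
            count++;
            previous = height;
        }
    }
    free(xs);
    return count;
}
```

For example, the buildings (1, 3, 4) and (2, 5, 6) give the pairs (1, 3), (2, 5), (6, 0): the skyline rises to 3 at x = 1, rises to 5 at x = 2, and drops to the ground at x = 6. A sweep with a priority queue or a divide-and-conquer merge would be faster, but the quadratic version is easier to check by hand.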
... More ...