kdb+ is Memory Hungry, Right?

Blog kdb+ 11 Apr 2024

Jonny Press

We hear it a lot from our clients – kdb+ needs a huge amount of memory to run well, and memory is expensive. A lot of the default choices in kdb+ are geared towards performance, and performance means in-memory operations. In this blog we look at how kdb+ uses memory and what can be done to optimise or reduce the memory footprint.

First things first... where does the reputation come from?

There are several factors that lead to the memory hungry reputation. Firstly, kdb+ can run as both an in-memory and an on-disk database. Storing data in-memory has performance benefits (especially for inserts) but… well… uses memory, obviously. In a traditional kdb+tick style data capture system, the latest data (which is usually for the current day) resides in-memory. People throw a lot of data at kdb+, so the memory requirements can be large.

The second aspect is that it uses a version of buddy memory allocation, writing vectors of data to contiguous memory blocks in power-of-two sizes, so memory is usually over-allocated. As an example, if you were to store a vector of 40,000 8-byte values, kdb+ would store it in a 512 KB (2^19 bytes) memory block, as opposed to the ~313 KB actually required. Storing vectors in contiguous blocks of memory is essential for performance, and also for inserts in real-time capture use cases, allowing vectors to grow and only re-allocating memory when the block size threshold is crossed. Kent wrote a nice blog on this previously.
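The over-allocation is easy to observe with .Q.w (a sketch – the exact figures include a small per-object header and vary by version):

// snapshot the memory currently in use
q)before:.Q.w[]`used
// 40,000 8-byte longs = 320,000 bytes of data
q)v:40000?100
// the delta shows the rounding up to the next power of two, 2^19 = 524,288 bytes
q).Q.w[][`used]-before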

The third is scratch space. We aren’t usually just storing data, we are analysing it. Analysis requires temporary data structures to be created, which also require memory. The default option in kdb+ is to recycle memory internally and not free it back to the operating system, meaning temporary structures can remain resident.

Does kdb+ do anything to try to conserve memory?

kdb+ is memory efficient in a number of areas. It uses copy-on-write semantics and reference counting, which means that if you copy an object in-memory it will only allocate a new memory location when the object is changed. In the context of a table, each column is treated as an independent object. As an example, assume we have a table mytab with columns a, b, c, d, e. If we create a new table, yourtab, as

yourtab:select A:a, C:c, D:d+10 from mytab

then the only new memory allocation will be the space required for the D column. As we modify either mytab or yourtab, the necessary copies will be made.
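This is easy to verify with .Q.w and -16! (a sketch – the sizes are arbitrary, and the column names follow the example above):

q)mytab:([]a:1000000?10;b:1000000?10;c:1000000?10;d:1000000?10;e:1000000?10)
q)u:.Q.w[]`used
q)yourtab:select A:a, C:c, D:d+10 from mytab
// roughly one column's worth (~8 MB): only D is newly allocated
q).Q.w[][`used]-u
// the reference count of column a reflects the sharing with yourtab
q)-16!mytab`a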

kdb+ structures its execution of on-disk queries to avoid reading data into memory until it absolutely has to. This is done by a combination of how the “where” clause part of the query is executed, map-reduce to minimise construction of large intermediate data structures, and the usage of memory-mapped files with lazy loading.
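For example, in a date-partitioned HDB only the partitions named in the date constraint are touched, and the remaining constraints then run against memory-mapped columns (the table and column names here are illustrative):

// date is resolved first to pick partitions; the sym and price filters
// then scan only the mapped vectors within those partitions
q)select from trade where date=2024.04.02, sym=`IBM, price>100.5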

What can we do to reduce the memory footprint?

kdb+ is very flexible, and from our experience of looking at numerous client implementations, there is usually a lot that can be done to reduce memory. In many cases there is little or no trade-off – it’s just a case of improving how things are done. Sometimes, though, there is a performance trade-off: we reduce the memory usage, but swap it for extra CPU cycles or, more usually, IO. A lot of techniques centre around serving data from disk rather than memory, and storage devices can be fast enough that the performance degradation may be acceptable. Serving data from disk also tends to allow greater parallelism than the in-memory approach.

The first place to look is the real-time capture components. The default solution is usually to have a “realtime” database component which stores all the data for the current period (usually the current date) in-memory – this gives excellent insert and query performance. It also makes it easy to persist data to disk at end-of-day, which is the only approach supported by kdb+tick in its standard form. If you know how the data is used then you can ask questions:

  1. Do we need to store all the tables in memory or can some go directly to disk?
  2. If we need all the tables, do we need all the columns – could some go directly to disk and not reside in-memory?
  3. Do we need a full day of data in-memory or could we use a smaller window?

If we separate the task of persisting data to history from intraday analytics, then we can reduce the memory footprint using approach 1 or 2. It is fairly common for there to be tables, or columns within tables, which are rarely used intraday and are captured primarily for T+1 research purposes. Can these be moved? The “writing” database component in TorQ is an example of a process which performs this split. We could also look at chopping specific rows out of tables, but this would be the least usual approach. If we go this route, it’s better to do it upon receipt of the data, i.e. drop the data you don’t need before inserting it into the in-memory table, rather than running a periodic delete. Deletion from an in-memory table can be time-consuming as it requires each column vector to be re-written.
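A minimal sketch of approach 2, assuming a hypothetical trade table arriving with a rarely-used misc column that is only needed for T+1 research:

// drop the unwanted column on receipt, before the in-memory insert;
// a separate "writing" process can still capture the full row to disk
q)upd:{[t;x] insert[t] $[t=`trade; delete misc from x; x]}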

Approach 3 is more involved, where we end up with an “intraday” database alongside the standard “historic” (on-disk and containing everything prior to today) and “realtime” databases (in-memory containing everything from today). The intraday database persists data to disk and serves queries from it. There is usually a performance loss compared to the in-memory queries, but this may be acceptable and compensated by greater parallelism. The Storage Manager component in KX Insights Microservices utilises this approach.

Memory Stats and Garbage Collection

We can examine the memory stats of a process using .Q.w. This shows the memory currently in use, the amount allocated from the operating system, and the peak allocation since start. By default kdb+ does not return memory to the operating system, meaning the allocated figure may sit significantly above the in-use figure. To modify this behaviour, start the process with the -g 1 garbage collection flag. This forces kdb+ to return memory when it can, but incurs a performance cost.
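For example (the keys shown are those returned by recent versions):

// used, heap, peak, wmax, mmap, mphy, syms, symw
q).Q.w[]
// switch garbage collection to immediate mode at runtime
// (equivalent to starting with -g 1)
q)\g 1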

An explicit garbage collection can also be invoked with .Q.gc[], which can return more memory than -g alone because it attempts to coalesce fragmented memory blocks before returning them. Therefore invoking .Q.gc[] whilst running under -g 1 may still free up memory, but it takes time to run. kdb+ 4.1 introduced options around the aggressiveness of .Q.gc.

// start q with the -g 1 flag
// create a large memory object
q)a:10000000?100
// delete it
q)delete a from `.
// gc doesn't return any more memory
// because the object has already been returned:
// if kdb+ can free a full memory block of size > 32 kB then it will
q).Q.gc[]

// compare to creating 10 items of smaller size (one tenth the size)
q){@[`.;x;:;1000000?100]} each `$'10#.Q.a
// delete them
// (apologies for the functional form!!)
q){eval(!;enlist`.;();0b;enlist enlist x)} each `$'10#.Q.a
// need explicit gc to coalesce the memory blocks and free them
q).Q.gc[]

Even when you know the process-level stats, it can be difficult to work out which objects are taking up how much memory, and (due to reference counting) deleting an object might not reduce the memory footprint at all! The reference count of an object can be checked with -16! – memory can only be freed when deleting objects with a ref count of 1. The serialized size of an object can be checked with -22!, but this doesn’t tell us how much memory it occupies – it will generally be an underestimate. This is covered in Kent’s blog and there are some TorQ utilities to help.
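A small illustration (sizes arbitrary):

q)a:1000000?100
// copy-on-write: b shares a's storage, no new allocation
q)b:a
// reference count is now greater than 1
q)-16!a
// removing one name does not free the memory...
q)delete b from `.
// ...it is only freeable once the count drops back to 1
q)-16!a
// serialized size: a lower bound on the in-memory size
q)-22!a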

The Devil is in the (Implementation) Detail

In a number of client systems we have seen significant data duplication – multiple processes loading the same tables into memory. These could be moved to a shared service, or served from disk instead of memory. Both have performance considerations.

How queries and data loaders are written can lead to large transient memory usage spikes. Queries should be designed sympathetically to how the data is structured. An example would be an asof join across multiple dates in a date partitioned database. One approach would be to select all the data for the period into memory and aj the tables together:

aj[`sym`time;
 select from trade where date in 2024.01.01 + til 5;
 select from quote where date in 2024.01.01 + til 5]

Another would be to iterate date-by-date:

raze {aj[`sym`time;
 select from trade where date=x;
 select from quote where date=x]} each 2024.01.01 + til 5

The latter approach will avoid the creation of large temporary structures and allow kdb+ to lean more heavily on memory mapping, significantly reducing the memory footprint and likely improving performance (though it should be noted there is a semantic difference in the two approaches around the date boundaries that needs to be accounted for – it is use case dependent and may or may not matter). A similar aj use case would be sub-selecting on instrument; contrary to the usual approach of reducing datasets as much as possible in the where clause, this:

aj[`sym`time;
 select from trade where date=2024.04.02,sym in `IBM`MSFT;
 update `p#sym from select from quote where date=2024.04.02,sym in `IBM`MSFT]

is less efficient in a lot of cases than this:

aj[`sym`time;
 select from trade where date=2024.04.02,sym in `IBM`MSFT;
 select from quote where date=2024.04.02]

For file loaders, a common pattern is to load all the data from the file into memory, apply some business logic to it, and then write it back out again. However, it is also possible to do this in chunks using .Q.fsn, meaning large temporary structures are avoided. For more complex business or sorting logic, it may be necessary to operate further on the data once it has all been persisted to disk.
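A minimal chunked-loader sketch using .Q.fsn (the file name, schema and chunk size are illustrative, and the CSV is assumed to have no header row):

// parse each chunk of complete lines and insert it, so the whole
// file is never resident in memory at once
q)loadchunk:{`trade insert ("STF";",")0:x}
// process trade.csv in chunks of roughly 50 MB
q).Q.fsn[loadchunk;`:trade.csv;50000000]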

The same applies to data writers. If writing a large amount of data out to a CSV file, for example, it can all be done in one shot (again creating a large temporary structure in memory) or it can usually be changed to a chunk-based approach.
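A chunked CSV writer might look like this (a sketch – the function name and chunk size are hypothetical):

// write table t to file in chunks of n rows, keeping only one
// chunk's worth of text in memory at a time
writecsv:{[file;t;n]
  h:hopen file;
  i:0;
  while[i<count t;
    s:csv 0: (i;n) sublist t;
    // drop the repeated header line after the first chunk
    neg[h] $[i=0;s;1_s];
    i+:n];
  hclose h}
writecsv[`:out.csv; trade; 100000]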

Memory gotchas

Once we’ve done all that we should be in a good state! But there might be more to look at.

kdb+ stores nested data structures less compactly than non-nested types. Sometimes it’s possible to compact them into non-nested data types, with different types of string encoding being a relevant example. Avoiding nested structures where possible usually leads to memory efficiencies and performance improvements.
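For example, a column of short strings (a nested list) carries a per-row object overhead that the equivalent flat symbol vector avoids (a rough comparison, using -22! only as a lower-bound proxy for size):

// nested: one small object per row
q)s:1000000?("IBM";"MSFT";"AAPL")
// flat symbol vector holding the same data
q)y:`$s
// compare the sizes: the nested form is substantially larger
q)(-22!s;-22!y)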

The grouped attribute leads to query performance improvements at the cost of a separate index being built and maintained.
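Applying and removing the attribute is a one-liner (the table and column names are illustrative):

q)trade:([]sym:1000000?`IBM`MSFT`AAPL;price:1000000?100f)
// builds a hash index from each sym to its row positions
q)update `g#sym from `trade
// lookups on sym are now served via the index
q)select from trade where sym=`IBM
// remove the attribute again, freeing the index
q)update `#sym from `trade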

Sym file bloat is worthy of a whole separate article. Sym files are loaded into memory by HDB processes, and HDB processes are usually replicated many times in enterprise scale systems leading to large memory overheads.


kdb+ systems grow over time, and rather than throwing more memory at a system, it is worth considering how memory is currently being used. This type of investigation is a core component of our ARK service. If you want advice on your kdb+ system, come and ask! We are here to help.
