Remote Execution of Functions with Complex Dependencies

Blog kdb+ 13 Jan 2023

Data Intellect

Remote Execution of Functions

Interprocess communication is a very important part of any kdb+ system. We regularly need to run queries on other processes and wait for the result. In a lot of cases, these invocations will execute functions which are already defined on the server process. However, sometimes it can be beneficial to define the function on the client and send the function definition along with the parameters to the server.  For instance on the client we could have

q)h:hopen `::1234
q)func1:{x*y}
q)h(func1;2;3)
6

Client sends function definition and inputs to server, server responds with the result.

In the set up above, the client sends function definition and inputs to server, and the server responds with the result.

It becomes more complicated when there are multiple functions defined on the client which have dependencies. In these cases, we may be able to push definitions to the server, but this is not good practice and also may not be possible if the server is restricted to read only access. How then do we get around these restrictions?

Namespaces 

Namespaces are an organisational tool within the kdb+ workspace. They allow us to put our variables into containers, and hence avoid naming clashes. The data object we use to hold namespaces is the dictionary, and they can be identified by a . at the start of the name. The keys of the dictionary are the symbolic names of the functions defined in the namespace, and the values are the current definitions of those functions. Importantly a namespace dictionary always a specific null first entry: a null symbol as the key and a generic null (::) as the value. For instance the namespace .ns: 

q).ns
  | ::
a | ``foo`func!(::;{x*y};{x+20})
b | ``func!(::;{x+y+z})  
c | {til x}

q).ns.a
     | ::  
foo  | {x*y}
func | {x+20}

Here we see that we can have two functions, .ns.a.func and .ns.b.func, with the same name (func). However, there is no confusion regarding their definitions, as they are in different namespaces. Note that as we see in the above example, namespaces can be nested any number of times. This is achieved using a nested dictionary structure.

Razing Namespaces 

We can now return to the issue mentioned previously involving dependencies in our functions being unrecognised on server processes. This becomes even more of an issue when dealing with functions in different namespaces. For example, if we define some functions as follows in a q script:     

.foo.bar.a:{x+y} 
.foo.bar.b:{20*x} 
  
.foo.c:{[x] 
  t:x*x; 
  .foo.bar.b[x]+.foo.bar.a[x;t] 
  }

And then attempt to load this file in and run a function on the server process, we see any function called inside .foo.c will be unknown to the server. 

q)h:hopen `::1234 
q)h(.foo.c;2) 
'.foo.bar.a 
  [0]  h(c;2) 
       ^ 

There are a number of ways you might try to solve this problem. For instance, if we have a main function which calls a number of subfunctions, why not simply define all the required subfunctions inside the main? This will fix the problem but depending on the exact situation, may involve a lot of repetition of code. Applying this to the above solution for instance: 

.foo.bar.a:{x+y} 
.foo.bar.b:{20*x} 
  

.foo.c:{ 
  .foo.bar.b:{20*x}; 
  .foo.bar.a:{x+y}; 
  t:x*x; 
  .foo.bar.b[x]+.foo.bar.a[x;t] 
  }

Now both the functions inside the .foo.bar namespace have been defined twice and it is easy to imagine that with larger functions the amount of repetition could become significant.  

The solution that we propose to this issue is to ‘flatten’ or ‘raze’ the default namespace dictionary. What this means is that rather than having nested dictionaries for nested namespaces we would have one large dictionary which contains all the functions in all namespaces, with the keys being symbols of each fully qualified function name, starting from the default namespace. Again looking at the previous example we would have a dictionary like

.foo.bar  | ``a`b!(::;{x+y};{20*x}) 
.foo.c    | { 
  t:x*x; 
  .foo.bar.b[x]+.foo.bar.a[x;t] 
  } 
.foo.bar.a| {x+y} 
.foo.bar.b| {20*x} 

Now we can send this full dictionary to the server along with the function we desire to run. This will ensure the server has all the definitions required to perform the call. It is important to note that a side effect of this method is that we will need to adapt how we write subfunction calls in our functions. This can be seen here 

.foo.c:{[x;funcs] 
  t:x*x; 
  // rather than using the standard: .foo.bar.b[x]+.foo.bar.a[x;t] 
  // we must take our functions from the main dictionary  
  funcs[`.foo.bar.b][x] + funcs[`.foo.bar.a][x;t] 
  } 


The Razenamespace Utilities
 

Let us have an in depth look at the body of code which we have developed to facilitate the approach outlined above. 

// Razenamespace Utils 

// These functions allow us to flatten out a namespace to include fully qualified names 
// which we can then send over IPC and reference more easily 

// Given the name and the value of a namespace, create fully qualified names 
// We want to drop the first (null) object from the namespace 
.razenamespace.flatten:{(` sv' x,/:1 _ key y)!1 _ value y}; 

q).razenamespace.flatten[`.foo;value `.foo]
.foo.bar | ``a`b!(::;{x+y};{20*x})
.foo.c | {
t:x*x;
.foo.bar.b[x]+.foo.bar.a[x;t]
}

The flatten function is used to add the name of a given namespace to the keys of that namespace dictionary, therefore making the names fully qualified. We see the result that this function has on our example of the .foo namespace.

// function to check if an object is a namespace dictionary 
.razenamespace.isnamespace:{$[99<>type x;0b;(`~first key x) and (::)~first value x]} 

As the name suggests this isnamespace function returns a boolean value which indicates if its input is a namespace dictionary or not. 

// if we have sub namespaces, flatten them 
.razenamespace.flattensubdicts:{ 
  $[count w:where .razenamespace.isnamespace each value x; 
   x,raze {.razenamespace.flatten[key[x]y;value[x]y]}[x] each w; 
   x] 
} 

flattensubdicts is where most of the work is done. It is passed a namespace dictionary and it identifies any nested dictionaries, extracts these, and appends them to the input dictionary. Each entry in the result also has a fully qualified name

// add the namespace prefix to the name of each of the functions in the dictionary 
// one arg - symbol of the namespace dictionary we want to raze entirely 
.razenamespace.allvars:{ 
  level1:.razenamespace.flatten[x;value x]; 
  // if we have sub namespaces, flatten them 
  // call this iteratively to get them all 
  .razenamespace.flattensubdicts/[level1] 
  } 

q).razenamespace.allvars[`.foo]
.foo.bar | ``a`b!(::;{x+y};{20*x})
.foo.c | {
t:x*x;
.foo.bar.b[x]+.foo.bar.a[x;t]
}
.foo.bar.a | {x+y}
.foo.bar.b | {20*x}

This is the unary function which combines all of the above steps to provide us with the full dictionary of all functions we have sought after. The flattensubdicts function is applied recursively in order to flatten any number of nested dictionaries. We see that it gives us the required dictionary when applied to the .foo namespace.

Conclusion 

Now having rewritten the .foo.c function in order to adjust how we make calls to other functions, as discussed above, finally we see how we can solve the initial problem by sending our function call to the server as follows:

q)h:hopen `::1234 
q)allfuncsdict:.razenamespace.allvars[`.foo]; 
q)h(.foo.c;2;allfuncsdict) 
46 

Hopefully this has demonstrated how this code can be useful, for a variety of kdb+ systems. Most importantly it will provide developers with a scalable, non-repetitive solution to the problem of sending function calls relying on different namespaces over IPC, even in situations where assumptions cannot be made about the functions defined on the remote process. 

 

Share this:

LET'S CHAT ABOUT YOUR PROJECT.

GET IN TOUCH