Storing results
Persistent storage
VIKTOR offers a Storage which can be used to store and retrieve files within an app workspace. The storage is
persistent, meaning that the data remains available with no time limit. This is helpful when (intermediate) results
need to be shared between jobs. For example, a long-running task is performed in job 'A', and its results can be
accessed in job 'B' without rerunning the task.
Supported operations
The storage can be accessed from an app with the Storage methods, as shown below:
import viktor as vkt
# Setting data on a key
vkt.Storage().set('data_key_1', data=vkt.File.from_data('abc'), scope='entity')
vkt.Storage().set('data_key_2', data=vkt.File.from_data('def'), scope='entity')
# Retrieve the data by key
vkt.Storage().get('data_key_1', scope='entity')
# List available data keys (by prefix)
vkt.Storage().list(scope='entity') # lists all files in current entity scope
vkt.Storage().list(prefix='data_key_', scope='entity') # lists 'data_key_1', 'data_key_2', ... etc.
# Delete data by key
vkt.Storage().delete('data_key_1', scope='entity')
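Because the storage holds files rather than Python objects, structured results generally need to be serialized before they are stored. The following is a minimal sketch, assuming the results are JSON-serializable; `encode_result` and `decode_result` are hypothetical helper names and not part of the SDK:

```python
import json


def encode_result(result: dict) -> str:
    """Serialize a result dict to a JSON string suitable as file content."""
    # sort_keys makes the output deterministic, which eases comparison
    return json.dumps(result, sort_keys=True)


def decode_result(content: str) -> dict:
    """Parse file content back into the original result dict."""
    return json.loads(content)


payload = encode_result({"max_stress": 12.5, "converged": True})
restored = decode_result(payload)
```

The encoded string could then be wrapped with `vkt.File.from_data(payload)` before calling `set`. How the content is read back from the file returned by `get` should be checked against the SDK reference.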
Storage scopes
With the scope argument the 'accessibility' of the stored data can be set. The following scopes are available:
- entity: when data needs to be accessed within a specific entity
- workspace: when data needs to be accessed workspace-wide
scope 'entity'
The entity scope means that the data is assigned to a specific entity. A common use-case for this scope is to store results of a long-running task (analysis, file parsing, etc.) and retrieve it without the need to rerun the task.
For example, data is stored in entity A1 on key entity_data. VIKTOR stores the data in a zone in the storage
designated for entity A1:
vkt.Storage().set('entity_data', data=vkt.File.from_data('content set by A1'), scope='entity')
This data can then be retrieved in entity A1 using the get operation:
vkt.Storage().get('entity_data', scope='entity') # File content: 'content set by A1'
If we try to perform this get operation in entity A2, a FileNotFound error is raised because the file does not (yet)
exist in the storage zone designated for entity A2. This also means that we can reuse the same key to set data in
entity A2 without overwriting the data stored in entity A1:
vkt.Storage().set('entity_data', data=vkt.File.from_data('content set by A2'), scope='entity')
It is also possible to store/retrieve data from one entity in another (even with entities of a different type).
This cross-entity referencing can be achieved by passing the relevant Entity object as the entity argument:
entity = ... # retrieve entity A1
vkt.Storage().get('entity_data', scope='entity', entity=entity) # File content: 'content set by A1'
See this guide on how to navigate to the correct entity using the API.
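The isolation between entity zones can be pictured with a small in-memory model. This is only an illustrative sketch and not how VIKTOR implements its storage; `ToyEntityStore` and the `entity_id` argument are hypothetical names, and the built-in `FileNotFoundError` is used here to stand in for the error described above:

```python
class ToyEntityStore:
    """In-memory illustration of entity-scoped keys (not the VIKTOR implementation)."""

    def __init__(self):
        # Data is keyed by (entity_id, key), so each entity has its own zone
        self._zones = {}

    def set(self, key, data, *, entity_id):
        self._zones[(entity_id, key)] = data

    def get(self, key, *, entity_id):
        try:
            return self._zones[(entity_id, key)]
        except KeyError:
            # Stands in for the 'file not found' behaviour of a missing key
            raise FileNotFoundError(key) from None


store = ToyEntityStore()
store.set("entity_data", "content set by A1", entity_id="A1")
store.set("entity_data", "content set by A2", entity_id="A2")  # same key, other zone

from_a1 = store.get("entity_data", entity_id="A1")
from_a2 = store.get("entity_data", entity_id="A2")
```

Passing a different `entity_id` to `get` in this model plays the role of the entity argument: it selects which zone is read, independent of where the call is made.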
scope 'workspace'
The workspace scope means that the data is accessible workspace-wide. All entities of all types point to the same
section in the storage with this scope. This scope can be seen as an extension of the memoize functionality, with the
difference that the stored results are permanent.
In entity A1 the following data is stored on key workspace_data using the workspace scope:
vkt.Storage().set('workspace_data', data=vkt.File.from_data('content set by A1'), scope='workspace')
Storing data in entity A2 on the same key using the workspace scope overwrites the previously stored data:
vkt.Storage().set('workspace_data', data=vkt.File.from_data('content set by A2'), scope='workspace')
When we retrieve this key in entity A1, entity A2, or even an entity of TypeB, the returned file content is the data that was stored last (in this case 'content set by A2').
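Since workspace-scoped data behaves like a permanent cache, a common way to use it is a compute-once pattern: try to read the key, and only run the expensive task if nothing has been stored yet. The sketch below assumes a storage-like object whose `get` raises `FileNotFoundError` for a missing key; `DictStorage` and `load_or_compute` are hypothetical names, and `DictStorage` is an in-memory stand-in so that the example runs without the SDK:

```python
class DictStorage:
    """Hypothetical in-memory stand-in for a workspace-scoped storage."""

    def __init__(self):
        self._data = {}

    def set(self, key, data):
        self._data[key] = data  # a later set on the same key overwrites

    def get(self, key):
        if key not in self._data:
            raise FileNotFoundError(key)
        return self._data[key]


def load_or_compute(storage, key, compute):
    """Return the stored value for key, computing and storing it if absent."""
    try:
        return storage.get(key)
    except FileNotFoundError:
        value = compute()
        storage.set(key, value)
        return value


calls = []


def expensive():
    calls.append(1)  # record each real evaluation
    return "analysis result"


storage = DictStorage()
first = load_or_compute(storage, "workspace_data", expensive)
second = load_or_compute(storage, "workspace_data", expensive)
```

In this sketch the second call returns the stored value without invoking `expensive` again. Because any entity can overwrite a workspace key, key names should be chosen so that unrelated results do not collide.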
Memoize
To store (long-running) results temporarily you can also make use of the memoize function.
A practical use-case of memoize is when a function call has input and output that is relatively small compared to the
time required for its evaluation. Cached results are kept for a maximum of 24 hours.
In the example below, a DataView performs a long-running calculation when calling func. When the user changes input in
the editor and updates the view again, func is only evaluated again if at least one of param_a, param_b, or param_c
has changed between jobs:
import viktor as vkt


@vkt.memoize
def func(*, param_a, param_b, param_c):
    result = ...  # perform lengthy calculation
    return result


class Controller(vkt.Controller):
    @vkt.DataView("Results", duration_guess=30)
    def get_data_view(self, params, **kwargs):
        ...
        result = func(param_a=..., param_b=..., param_c=...)
        ...
        return vkt.DataResult(...)
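The effect of caching on evaluation count can be illustrated without the SDK. The sketch below uses Python's `functools.lru_cache` purely as a stand-in for memoize (an assumption for illustration: it is an in-process cache and does not reproduce the 24-hour retention or the storage used by memoize):

```python
import functools

evaluations = []  # one entry per real evaluation of func


@functools.lru_cache(maxsize=None)  # stand-in for the memoize decorator
def func(*, param_a, param_b, param_c):
    evaluations.append((param_a, param_b, param_c))
    return param_a + param_b + param_c


r1 = func(param_a=1, param_b=2, param_c=3)  # evaluated
r2 = func(param_a=1, param_b=2, param_c=3)  # same input: served from cache
r3 = func(param_a=1, param_b=2, param_c=4)  # param_c changed: evaluated again
```

Three calls lead to two evaluations here, which matches the behaviour described for the view: unchanged parameters reuse the earlier result.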
When using memoize in your development environment, the cache is stored locally. The local storage is limited to
50 function calls for practical reasons. If the limit is exceeded, cached results are cleared on a
first-in-first-out basis. In production the storage is unlimited.
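First-in-first-out clearing means the oldest cached call is dropped first, regardless of how recently it was used. The following sketch shows one way such a bounded cache could work; `fifo_memoize` is a hypothetical decorator written for illustration and is not the SDK's implementation. It assumes keyword arguments with hashable values:

```python
from collections import OrderedDict
import functools


def fifo_memoize(max_calls=50):
    """Cache results per keyword-argument set, evicting the oldest entry first."""

    def decorator(fn):
        cache = OrderedDict()

        @functools.wraps(fn)
        def wrapper(**kwargs):
            key = tuple(sorted(kwargs.items()))  # order-independent cache key
            if key in cache:
                return cache[key]  # a hit does not reorder entries (FIFO, not LRU)
            result = fn(**kwargs)
            cache[key] = result
            if len(cache) > max_calls:
                cache.popitem(last=False)  # drop the first-inserted entry
            return result

        wrapper.cache = cache  # exposed for inspection in this sketch
        return wrapper

    return decorator


@fifo_memoize(max_calls=2)
def square(*, x):
    return x * x


square(x=1)
square(x=2)
square(x=3)  # exceeds the limit, so the entry for x=1 is evicted
cached_keys = list(square.cache)
```

With a limit of 2, only the entries for `x=2` and `x=3` remain after the third call. A subsequent call with `x=1` would be evaluated again.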