Generic Engine Notes (draft 2)
=============================================================================
$Id: generic-engine-notes-draft2.txt,v 1.1 2001/08/21 14:09:46 bart Exp $

1. Intro

1.1 About this document

  This document is in its infancy. If you want to see how the ideas in here
  evolved please see the 3rd draft of 'FreeS/WAN -- KLIPS HARDWARE
  ACCELERATION NOTES' which can be found here:

        http://www.jukie.net/~bart/linux-ipsec/

1.2 Changes since last draft

  In this document I have tried to capture the comments I have received on
  the design and work them into the engine so as to make the use of the
  interface more efficient. In summary these are:

  * Each engine context implements a function max_output_length() which
    computes the amount of space required in the job->data.out_ptr buffer
    based on the input length and the context structure. This allows for
    easier chaining; i.e. the application of multiple algorithms which may
    require a larger destination buffer than the source buffer. In such a
    situation the caller can precompute the maximum buffer needed for all
    calculations.

  * The user of the engine objects is discouraged from calling function
    pointers directly. Instead the calls should be made via proxy inline
    functions like create_context, execute_job, release_job, and
    release_context.

  * Cleanup, rewording, and rewrite of some sections where appropriate.

1.3 Software

  You can obtain the software and further information about this project
  here:

        http://www.jukie.net/~bart/linux-ipsec/genericengine/

  The project is protected by the GNU General Public License; it can be
  obtained here:

        http://www.fsf.org/copyleft/gpl.txt

1.4 The lingo

  First, the terms in this document need to be defined:

  Algorithm: a detailed sequence of actions to perform to accomplish some
             task.

  Engine:    structure embodying an algorithm's capabilities. In this
             system there is an engine registered for each implementation
             of a given algorithm.

  Context:   structure holding context for an algorithm's execution.
             In this system a context is created by an engine and can
             function only as that algorithm.

  Job:       structure which embodies processing of data using a context.
             The job can execute in an asynchronous context and hence a
             pointer to the data must be kept around; especially if the
             execution happens in an accelerator hardware component.

1.5 Example of use

  The Generic Engine fits into any scenario where the usage of the
  algorithm can be easily identified to match the following criteria:

  a. Usage of the algorithm can be easily broken down into short lived
     contexts which preserve some information about the algorithm or allow
     continuous usage of the algorithm where the input is a stream.

  b. Usage of the algorithm context can be broken down into operations
     which are atomic with reference to the context. That is, only one
     operation can alter the context at any one time. These operations are
     referred to as jobs.

  An engine encompasses the algorithm, but the user must create a context
  to utilize the algorithm. The context will preserve any state for the
  duration of the context's life. To process data through the context the
  user creates jobs and passes data through the jobs to the context along
  with a callback function.

  Let us consider an example where the algorithm is the LZW compression
  algorithm. It is not important to consider the specifics of the
  algorithm. It is, however, important to know that the LZW transform
  maintains a context for all compression operations. Furthermore, LZW is a
  stream algorithm; the whole compression session is composed of many
  blocks of data being compressed one after another, serially.

  Here is a simple step by step example of how this algorithm would be
  used:

  1. locate the engine in question:

        /* NOTE: engine must be returned with put_engine */
        engine = get_engine_by_name("comp-lzw-sw");

  2. generate a context:

        struct comp_lzw_context_definition defn; // from the comp-lzw header
        struct generic_context *context;

        defn.operation = COMPRESS;
        create_context( engine, &defn, GFP_ATOMIC, &context );

  3. use context:

        struct generic_job *job;

        create_job( context, GFP_ATOMIC, &job );
        job->data.in_ptr  = (void*)...;           // buffer with data
        job->data.in_len  = (int)...;
        job->data.out_ptr = (void*)...;           // buffer for result
        job->data.out_len = (int)...;
        job->callback     = (int (*)(void*))...;  // async callback(job)
        job->opaque       = (void*)...;           // anything else
        execute_job( job );                       // async processing

  4. job finished, in callback(job)...

        if( job->result ) {     // stores the result of the op
                /* error processing */
        } else {
                /* success */
        }
        release_job( &job );

  5. done with context...

        release_context( &context );

  6. done with engine...

        put_engine( engine );

  Before any operations are executed a context must be generated and
  initialized. In step 1, the 'get_engine_by_name' query is made on the
  engine database; an engine is returned. An 'engine' is a structure that
  describes the algorithm. It is required for creating a context for
  processing.

  When calling create_context or create_job the GFP_* flag specifies the
  urgency level of the memory allocation to be performed (see
  include/linux/mm.h for details).

  A context is created and configured as shown in step 2, by calling
  create_context and passing a definition structure (defn). What data is
  held here is specific to the type of transform that is being applied. In
  our case we use the comp_lzw_context_definition structure which contains
  the operation to perform, COMPRESS, and possibly the compression level.
  In an alternate example, as in the case of an IPIP tunnel, the
  initialization data would contain the end points of the tunnel.

  Once the LZW engine has a context the user can start compressing the data
  stream a block at a time.
  In step 3, the input and output buffers are identified and so is the
  callback; finally the job is executed and the user awaits a callback. In
  step 4 the callback arrives and the job is released; however, for added
  performance the job could be kept for the next operation on that context.

2. Database Details

2.1 Finding an engine

  Before using any of the algorithms the database must be consulted for an
  engine object. This is done via the call to:

        struct generic_engine* get_engine_by_name(const char*);

  This function takes a string identifying the algorithm and returns an
  engine which will be used to create contexts for mangling data.

2.2 Naming of algorithms

  Each algorithm is named according to its type (ex: comp for compress),
  specific algorithm name, and implementation method. An example of an
  algorithm engine name is "comp-lzw-sw" which describes an engine which
  implements the LZW compression algorithm in software.

  It is possible to request any implementation by not specifying the last
  portion of a name; ex: if only one implementation of LZW is present then
  the same engine is returned from get_engine_by_name when "comp-lzw-sw",
  "comp-lzw-*", or "comp-lzw" is passed.

  For each algorithm, such as 'comp-lzw', there will be a header file that
  is included by those that use the engine which implements it. In the case
  of the 'comp-lzw' algorithm this header file will define the
  comp_lzw_context_definition structure as well as the enums, defines, and
  other structures that are used along with the engine.

2.3 Engine registration

  When an engine implementation is loaded into the kernel it will be
  registered by calling:

        int register_generic_engine(struct generic_engine*);

  When the functionality of this engine is no longer required the engine
  can be removed from the database by calling:

        int unregister_generic_engine(struct generic_engine*);

  This document does not go into the details of the implementation of the
  database as it is likely to change.
  Consider it a transparent database of objects (engines) keyed on a
  character string (the engine name).

3. Engine data-type

  The engine structure is defined as follows.

        struct generic_engine {
                struct list_head        eng_elem;
                struct list_head        con_list;
                char                   *name;
                struct generic_eng_ops  ops;
                struct generic_con_ops  con_ops;
                struct generic_job_ops  job_ops;
                rwlock_t                lock;
                atomic_t                del_flag;
                struct module          *owner;
        };

3.1 Engine management fields

        struct list_head eng_elem;

  The first field, eng_elem, must be in the first position of the
  structure; it is used as the link element in the database of similar
  algorithms - ex: list of all compression algorithms. The exact
  granularity of 'similar algorithms' is not defined by this document.

        struct list_head con_list;

  The con_list is used to store the list of all contexts generated by this
  engine. A context is added to this list when the create_context()
  function call is made. It is removed when the use_count of the context is
  decremented to zero.

        char *name;

  The next field, name, describes the engine to the user; see section 2.2
  for details. The name is provided as a const char* by the module that
  defines it and cannot exceed GENERIC_ENGINE_NAME_MAX_LEN characters.

3.2 Engine methods

        struct generic_eng_ops ops;
        struct generic_con_ops con_ops;
        struct generic_job_ops job_ops;

  The ops, con_ops, and job_ops specify operations that can be executed by
  all objects related to an algorithm (i.e. engine, context, job). The
  latter two will be discussed in the next two sections dealing with
  context and job objects. The only function attached to the engine is:

        int (*context_size) ( struct generic_engine * );

  This function returns the amount of memory that needs to be allocated in
  order to store a 'struct generic_context' (to be discussed in the next
  section); this call should not be used outside of the 'create_context'
  function.
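
  The role of context_size can be illustrated with a small userspace
  sketch. Everything in it is an assumption made for illustration: the
  structures are trimmed stand-ins, lzw_sw_context and its dictionary
  field are hypothetical, sketch_alloc_context is not the real
  create_context, and malloc-family calls stand in for kernel allocation
  (the GFP flags are left out). It only shows why the engine, rather than
  the caller, reports how much memory a context needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Trimmed userspace stand-ins for the structures in this document. */
struct generic_engine;

struct generic_eng_ops {
    int (*context_size)(struct generic_engine *);
};

struct generic_context {
    struct generic_engine *eng;   /* engine that owns this context */
    void *defn;                   /* algorithm specific definition */
};

struct generic_engine {
    const char *name;
    struct generic_eng_ops ops;
};

/* Hypothetical implementation-private context. The generic part comes
 * first, so a pointer to the whole object is also a valid pointer to a
 * struct generic_context. */
struct lzw_sw_context {
    struct generic_context generic;
    unsigned short dictionary[4096];   /* made-up private state */
};

int lzw_sw_context_size(struct generic_engine *eng)
{
    (void)eng;
    return (int)sizeof(struct lzw_sw_context);
}

/* Hypothetical allocation step of create_context().
 * Returns 0 on success and -1 if memory could not be obtained. */
int sketch_alloc_context(struct generic_engine *eng, void *defn,
                         struct generic_context **out)
{
    int size = eng->ops.context_size(eng);
    struct generic_context *con;

    /* The engine must ask for at least the generic part. */
    assert(size >= (int)sizeof(struct generic_context));

    con = calloc(1, (size_t)size);
    if (con == NULL)
        return -1;
    con->eng = eng;
    con->defn = defn;
    *out = con;
    return 0;
}
```

  Because only the engine knows the size of its private state, a caller
  that allocated sizeof(struct generic_context) by hand would under-
  allocate; this is one reading of why the call is reserved for
  create_context.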
  Other functions that could later be added to the generic_eng_ops would
  return statistics or handle /proc file system requests, etc.

3.3 Engine synchronization

        rwlock_t lock;

  The struct generic_engine data is protected by a read-write lock. It is
  used in the obvious way: when a read is to be performed on the data in
  the data structure a read-lock is acquired; similarly a write-lock is
  acquired for altering the data in the structure. Note that the items in
  the con_list list are not protected by this lock.

        atomic_t del_flag;

  The del_flag is initialized to 0. When it is set the engine is presumed
  to be ready for deletion. This means that no more contexts will be
  generated. The engine still exists, however, until the module usage drops
  to 0.

        struct module *owner;

  The module which contains the definition of this engine is pointed to by
  the owner variable. Functions like get_engine and put_engine use this
  variable to increment (__MOD_INC_USE_COUNT) and decrement
  (__MOD_DEC_USE_COUNT) the module use count. The module which implements
  the engine cannot be removed (by rmmod) until it is no longer referenced
  by anything (i.e. its use count is ZERO).

  The user should not use the owner field directly. Instead the get_engine
  and put_engine functions are available to increment and decrement the
  usage counter of the engine structure safely. These functions are defined
  as follows:

        struct generic_engine* get_engine_by_name(const char*);
        int get_engine(struct generic_engine *ge);
        void put_engine(struct generic_engine *ge);

  These functions operate as expected. The get_engine function returns 0 if
  the engine can be obtained - its use count can be incremented - otherwise
  1 is returned. Failure occurs because the del_flag on the engine was set;
  recall, this means that the module is being prepared for removal. If the
  engine module is compiled into the kernel, which is not possible at
  present, this function should succeed every time.
  The get_engine_by_name function locates the appropriately named engine,
  increments its use count, and returns the structure pointer.

4. Context data-type

  The context structure is defined as follows:

        struct generic_context {
                struct list_head        con_elem;
                struct list_head        job_list;
                struct generic_engine  *eng;
                struct generic_con_ops  ops;
                void                   *defn;
                rwlock_t                lock;
                atomic_t                use_count;
                atomic_t                del_flag;
        };

4.1 Context management fields

        struct list_head con_elem;

  This field, con_elem, must be in the first position of the structure,
  just as in the case of eng_elem of the engine structure. It serves as the
  link in the engine->con_list, the list of all contexts generated by one
  engine.

        struct list_head job_list;

  The job_list is a list of all incomplete jobs on the context. A job is
  added to the list when the context's process() method is called. It is
  removed when the job completes, fails or the context is deleted
  prematurely.

        struct generic_engine *eng;

  The eng pointer references the engine that 'owns' this context. The
  context needs to know which engine's use count to increment when it
  becomes active, as well as which list to place itself on.

        void * defn;

  The defn (definition) contains extra data which may configure how this
  context will function. In other words, it is any data that needs to be
  passed to a new context to activate it and make it function in the
  desired way.

  The type of structure that is passed here is defined by a separate file
  and provides all fields to configure a specific algorithm type. Thus for
  the example used above an LZW specific structure would be used from the
  comp-lzw header file.

  All implementations of a specific algorithm use the same definition
  structure; that is, comp-lzw-sw and comp-lzw-fastchip would use the same
  definition structure to configure the algorithm implementations.

4.2 Context methods

        struct generic_con_ops ops;

  This field is initialized when the context is created and has the same
  values as the engine->con_ops.
  The currently supported function pointers follow.

        int (*init) ( struct generic_context*, int gfp_flags );

  The actual function is never called by the user. Instead it is used
  internally by create_context to do algorithm specific initialization. Its
  job is to read through the definition structure (supplied to the
  create_context function) and configure any special features before
  processing can start.

        int (*cleanup) ( struct generic_context* );

  The cleanup function is never called by the user. It is called by the
  last put_context call - that is, the one that gives back the last
  reference and reduces context->use_count to zero.

        int (*process) ( struct generic_job *job );

  The process function performs operations on the data defined by the job
  within the context pointed to by job->con. What the process function does
  may differ according to the configuration supplied in the defn parameter
  of create_context.

  Note that a job must have been created, by a call to create_job, before
  calling this function. The job structure is essential as the process
  function can operate in an async way.

  If the process function returns ENGINE_OK or ENGINE_DETACHED then the
  job->callback function will be called upon the completion of the
  operation. Any other return should be interpreted as an error; in this
  case the callback function will not be called. A success returned from
  the process function does not mean that the operation succeeded; the
  operation may still result in a failure -- the callback is responsible
  for checking the job->result field to verify success of the operation.

        int (*max_output_length) ( struct generic_context*,
                                   int input_length );

  It is expected that the engine contexts will be used in chains; that is,
  after one algorithm is applied to a piece of data, that output will be
  processed through another algorithm. This function returns the predicted
  size of the destination buffer required to process data of a given source
  length.
  With this knowledge buffers can be allocated ahead of time to be used for
  processing of the entire chain.

        int (*job_size) ( struct generic_context* );

  The job_size function returns the size of the structure that needs to be
  allocated in order to perform a job (i.e. an operation using the
  context). This should not be called directly; instead it is used by the
  create_job() function call.

4.3 Context synchronization

        rwlock_t lock;

  This variable is used to lock the data in the context structure. As in
  the case of engine->lock, the linked list is not protected by this lock.

        atomic_t use_count;

  The use count keeps track of the number of users of the context. The
  structure has this value initialized to 1. It is incremented whenever a
  new job is executed on this context and decremented when a job is
  finished. The use count is also decremented, as mentioned above, by a
  call to release_context. Once the use count reaches zero the object is
  deleted and removed from the engine's list of contexts.

        atomic_t del_flag;

  Once the del_flag is set no more jobs can be generated on this context;
  attempts to create them fail.

5. Job data-type

  The job structure is defined as follows.

        struct generic_job {
                struct list_head        job_elem;
                struct generic_context *con;
                struct generic_job_ops  ops;
                struct {
                        void *in_ptr, *out_ptr, *extra_ptr;
                        int   in_len,  out_len,  extra_len;
                } data;
                generic_job_cb_fn       callback;
                void                   *opaque;
                int                     result;
                rwlock_t                lock;
                atomic_t                use_count;
                atomic_t                del_flag;
        };

5.1 Job management fields

        struct list_head job_elem;

  This field is used as a list element in the context->job_list; it must be
  the first item in the job structure.

        struct generic_context *con;

  Points to the context on which this job is created.

5.2 Job objectives

        struct {
                void *in_ptr, *out_ptr, *extra_ptr;
                int   in_len,  out_len,  extra_len;
        } data;

  The data structure identifies the data to be processed and where the
  result should be stored. The extra_ptr/extra_len fields are optional and
  are used as needed by the individual algorithms.
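
  Since out_ptr/out_len must describe a buffer large enough for the
  result, a caller running a chain of contexts may want to size one buffer
  up front using max_output_length (section 4.2). The following userspace
  sketch shows one way that could look. It is an assumption-laden
  illustration: the sketch_* types and functions are hypothetical, and the
  two growth rules are made up rather than taken from any real algorithm.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_context;

/* Only the one method needed for this illustration. */
struct sketch_con_ops {
    int (*max_output_length)(struct sketch_context *, int input_length);
};

struct sketch_context {
    struct sketch_con_ops ops;
};

/* Walks a chain of contexts, feeding each stage's worst-case output
 * length into the next stage, and returns the largest length seen. A
 * buffer of that size can hold the data at every point in the chain. */
int sketch_chain_buffer_len(struct sketch_context **chain, size_t n,
                            int in_len)
{
    int len = in_len;
    int largest = in_len;
    size_t i;

    assert(chain != NULL || n == 0);
    for (i = 0; i < n; i++) {
        len = chain[i]->ops.max_output_length(chain[i], len);
        if (len > largest)
            largest = len;
    }
    return largest;
}

/* Made-up rule: output may grow by one eighth plus a 16 byte header. */
int sketch_compress_max(struct sketch_context *c, int n)
{
    (void)c;
    return n + n / 8 + 16;
}

/* Made-up rule: output is rounded up to a multiple of 8 bytes. */
int sketch_pad_max(struct sketch_context *c, int n)
{
    (void)c;
    return (n + 7) / 8 * 8;
}
```

  With these made-up rules a 1000 byte input needs at most 1141 bytes
  after the first stage and 1144 after the second, so the caller would
  allocate 1144 bytes once and set data.out_len accordingly for each job.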
        int result;

  The result field holds the result of the operation. It is set to
  ENGINE_DETACHED when the operation starts. It is set to ENGINE_OK if the
  operation was successful. The callback, defined below, needs to check
  this field to determine if the operation was successful or if the output
  should be discarded.

        void* opaque;

  This can be set to anything the user wants.

        generic_job_cb_fn callback;

  This is the function callback which is called when the job is completed,
  timed out or canceled prematurely. The callback should check the
  job->result for details.

5.3 Job methods

        struct generic_job_ops ops;

  This structure is filled in at the time of the create_job function call
  as a copy of engine->job_ops. Only one function call is currently
  possible.

        void (*cancel_async) ( struct generic_job *);

  This call allows the context, or the user, to cancel a job prematurely.

5.4 Job synchronization

        rwlock_t lock;
        atomic_t use_count;
        atomic_t del_flag;

  These fields are used in the same manner as the corresponding context
  fields. See above.

6. More?

Regards,
Bart Trojanowski

=============================================================================
Ideas mentioned above are a copyright of Bart Trojanowski. If available, an
updated version of this document can be found at:

        http://www.jukie.net/~bart/linux-ipsec/