Blame - jni/libopenssl/crypto/engine/README - jami-client-android

Alexandre Savard  1b09e31  2012-08-07 20:33:29 -0400

Notes: 2001-09-24
-----------------

This "description" (if one chooses to call it that) needed some major updating
so here goes. This update addresses a change being made at the same time to
OpenSSL, and it pretty much completely restructures the underlying mechanics of
the "ENGINE" code. So it serves a double purpose of being an "ENGINE internals
for masochists" document and a rather extensive commit log message. (I'd get
lynched for sticking all this in CHANGES or the commit mails :-).

ENGINE_TABLE underlies this restructuring, as described in the internal header
"eng_int.h", implemented in eng_table.c, and used in each of the "class" files;
tb_rsa.c, tb_dsa.c, etc.

However, "EVP_CIPHER" underlies the motivation and design of ENGINE_TABLE so
I'll mention a bit about that first. EVP_CIPHER (and most of this applies
equally to EVP_MD for digests) is both a "method" and an algorithm/mode
identifier that, in the current API, "lingers". These cipher description +
implementation structures can be defined or obtained directly by applications,
or can be loaded "en masse" into EVP storage so that they can be catalogued and
searched in various ways, ie. two ways of encrypting with the "des_cbc"
algorithm/mode pair are;

(i) directly;
     const EVP_CIPHER *cipher = EVP_des_cbc();
     EVP_EncryptInit(&ctx, cipher, key, iv);
     [ ... use EVP_EncryptUpdate() and EVP_EncryptFinal() ...]

(ii) indirectly;
     OpenSSL_add_all_ciphers();
     cipher = EVP_get_cipherbyname("des_cbc");
     EVP_EncryptInit(&ctx, cipher, key, iv);
     [ ... etc ... ]

The latter is more generally used because it also allows ciphers/digests to be
looked up based on other identifiers which can be useful for automatic cipher
selection, eg. in SSL/TLS, or by user-controllable configuration.

The important point about this is that EVP_CIPHER definitions and structures are
passed around with impunity and there is no safe way, without requiring massive
rewrites of many applications, to assume that EVP_CIPHERs can be reference
counted. Once an EVP_CIPHER is exposed to the caller, neither it nor anything it
comes from can "safely" be destroyed. Unless of course the way of getting to
such ciphers is via entirely distinct API calls that didn't exist before.
However existing API usage cannot be made to understand when an EVP_CIPHER
pointer, that has been passed to the caller, is no longer being used.

The other problem with the existing API w.r.t. hooking EVP_CIPHER support
into ENGINE is storage - the OBJ_NAME-based storage used by EVP to register
ciphers simultaneously registers cipher types and cipher implementations -
they are effectively the same thing, an "EVP_CIPHER" pointer. The problem with
hooking in ENGINEs is that multiple ENGINEs may implement the same ciphers. The
solution is necessarily that ENGINE-provided ciphers simply are not registered,
stored, or exposed to the caller in the same manner as existing ciphers. This is
especially necessary considering the fact ENGINE uses reference counts to allow
for cleanup, modularity, and DSO support - yet EVP_CIPHERs, as exposed to
callers in the current API, support no such controls.

Another sticking point for integrating cipher support into ENGINE is linkage.
Already there is a problem with the way ENGINE supports RSA, DSA, etc whereby
they are available because they're part of a giant ENGINE called "openssl".
Ie. all implementations have to come from an ENGINE, but we get round that by
having a giant ENGINE with all the software support encapsulated. This creates
linker hassles if nothing else - linking a 1-line application that calls 2 basic
RSA functions (eg. "RSA_free(RSA_new());") will result in large quantities of
ENGINE code being linked in and because of that DSA, DH, and RAND also. If we
continue with this approach for EVP_CIPHER support (even if it was possible)
we would lose our ability to link selectively by selectively loading certain
implementations of certain functionality. Touching any part of any kind of
crypto would result in massive static linkage of everything else. So the
solution is to change the way ENGINE feeds existing "classes", ie. how the
hooking to ENGINE works from RSA, DSA, DH, RAND, as well as adding new hooking
for EVP_CIPHER, and EVP_MD.

The way this is now being done is by mostly reverting back to how things used to
work prior to ENGINE :-). Ie. RSA now has a "RSA_METHOD" pointer again - this
was previously replaced by an "ENGINE" pointer and all RSA code that required
the RSA_METHOD would call ENGINE_get_RSA() each time on its ENGINE handle to
temporarily get and use the ENGINE's RSA implementation. Apart from being more
efficient, switching back to each RSA having an RSA_METHOD pointer also allows
us to conceivably operate with no ENGINE. As we'll see, this removes any need
for a fallback ENGINE that encapsulates default implementations - we can simply
have our RSA structure pointing its RSA_METHOD pointer to the software
implementation and have its ENGINE pointer set to NULL.
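
As a rough illustration of the "method pointer plus optional ENGINE pointer"
layout described above, here is a toy model in C. None of these names are
OpenSSL's; TOY_RSA, TOY_RSA_METHOD, TOY_RSA_ENGINE and the toy_* functions are
hypothetical stand-ins, and the "encryption" is just integer arithmetic. It only
shows that a key can carry a software method with a NULL engine and never needs
a fallback engine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for RSA_METHOD: a named table of operations. */
typedef struct toy_rsa_method {
    const char *name;
    int (*pub_enc)(int input);
} TOY_RSA_METHOD;

/* Hypothetical stand-in for ENGINE: may or may not supply an RSA method. */
typedef struct toy_rsa_engine {
    const char *id;
    const TOY_RSA_METHOD *rsa_meth; /* NULL if the engine has no RSA support */
} TOY_RSA_ENGINE;

/* Hypothetical stand-in for RSA: method pointer is always set, the engine
 * pointer is NULL when the built-in software implementation is in use. */
typedef struct toy_rsa {
    const TOY_RSA_METHOD *meth;
    TOY_RSA_ENGINE *engine;
} TOY_RSA;

/* Placeholder "software" and "hardware" operations (not real crypto). */
int toy_soft_pub_enc(int input) { return input + 1; }
int toy_hw_pub_enc(int input) { return input + 100; }

const TOY_RSA_METHOD toy_soft_method = { "software", toy_soft_pub_enc };
const TOY_RSA_METHOD toy_hw_method = { "pretend-hardware", toy_hw_pub_enc };

/* Use the engine's method if one is supplied, else fall back to software
 * and leave the engine pointer NULL. */
void toy_rsa_init(TOY_RSA *rsa, TOY_RSA_ENGINE *engine)
{
    assert(rsa != NULL);
    if (engine != NULL && engine->rsa_meth != NULL) {
        rsa->meth = engine->rsa_meth;
        rsa->engine = engine;
    } else {
        rsa->meth = &toy_soft_method;
        rsa->engine = NULL;
    }
}

/* Every operation goes straight through the cached method pointer; no
 * per-call lookup on the engine is needed. */
int toy_rsa_pub_enc(const TOY_RSA *rsa, int input)
{
    assert(rsa != NULL && rsa->meth != NULL);
    return rsa->meth->pub_enc(input);
}
```

Calling toy_rsa_init(&rsa, NULL) leaves rsa.engine as NULL and routes
toy_rsa_pub_enc() to the software method; passing an engine whose rsa_meth is
set routes it to that engine's method instead.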

A look at the EVP_CIPHER hooking is most explanatory; the RSA, DSA (etc) cases
turn out to be degenerate forms of the same thing. The EVP storage of ciphers,
and the existing EVP API functions that return "software" implementations and
descriptions remain untouched. However, the storage takes more meaning in terms
of "cipher description" and less meaning in terms of "implementation". When an
EVP_CIPHER_CTX is actually initialised with an EVP_CIPHER method and is about to
begin en/decryption, the hooking to ENGINE comes into play. What happens is that
cipher-specific ENGINE code is asked for an ENGINE pointer (a functional
reference) for any ENGINE that is registered to perform the algo/mode that the
provided EVP_CIPHER structure represents. Under normal circumstances, that
ENGINE code will return NULL because no ENGINEs will have had any cipher
implementations registered. As such, a NULL ENGINE pointer is stored in the
EVP_CIPHER_CTX context, and the EVP_CIPHER structure is left hooked into the
context and so is used as the implementation. Pretty much how things work now
except we'd have a redundant ENGINE pointer set to NULL and doing nothing.

Conversely, if an ENGINE has been registered to perform the algorithm/mode
combination represented by the provided EVP_CIPHER, then a functional reference
to that ENGINE will be returned to the EVP_CIPHER_CTX during initialisation.
That functional reference will be stored in the context (and released on
cleanup) - and having that reference provides a safe way to use an EVP_CIPHER
definition that is private to the ENGINE. Ie. the EVP_CIPHER provided by the
application will actually be replaced by an EVP_CIPHER from the registered
ENGINE - it will support the same algorithm/mode as the original but will be a
completely different implementation. Because this EVP_CIPHER isn't stored in the
EVP storage, nor is it returned to applications from traditional API functions,
there is no associated problem with it not having reference counts. And of
course, when one of these "private" cipher implementations is hooked into
EVP_CIPHER_CTX, it is done whilst the EVP_CIPHER_CTX holds a functional
reference to the ENGINE that owns it, thus the use of the ENGINE's EVP_CIPHER is
safe.
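
The context-initialisation behaviour in the two paragraphs above can be sketched
with a toy model. Everything here is hypothetical (TOY_CIPHER,
TOY_CIPHER_ENGINE, TOY_CIPHER_CTX and the toy_cipher_* functions are not
OpenSSL names), and a single global slot stands in for the real registration
table. The point is only the ownership rule: the context swaps in the engine's
private cipher while it holds a functional reference, and drops that reference
on cleanup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cipher description: 'nid' identifies the algorithm/mode,
 * 'impl' just labels which implementation this structure represents. */
typedef struct toy_cipher {
    int nid;
    const char *impl;
} TOY_CIPHER;

/* Hypothetical engine owning a private cipher implementation. */
typedef struct toy_cipher_engine {
    int funct_ref;     /* functional reference count */
    TOY_CIPHER cipher; /* never handed to applications directly */
} TOY_CIPHER_ENGINE;

/* Hypothetical cipher context. */
typedef struct toy_cipher_ctx {
    const TOY_CIPHER *cipher;  /* implementation actually used */
    TOY_CIPHER_ENGINE *engine; /* NULL, or a held functional reference */
} TOY_CIPHER_CTX;

/* One slot standing in for the per-class registration table. */
static TOY_CIPHER_ENGINE *toy_registered = NULL;

void toy_cipher_register(TOY_CIPHER_ENGINE *e) { toy_registered = e; }

/* Returns a functional reference to an engine implementing 'nid', or NULL. */
static TOY_CIPHER_ENGINE *toy_cipher_get_engine(int nid)
{
    if (toy_registered != NULL && toy_registered->cipher.nid == nid) {
        toy_registered->funct_ref++;
        return toy_registered;
    }
    return NULL;
}

void toy_cipher_ctx_init(TOY_CIPHER_CTX *ctx, const TOY_CIPHER *cipher)
{
    assert(ctx != NULL && cipher != NULL);
    ctx->engine = toy_cipher_get_engine(cipher->nid);
    /* With an engine, use its private cipher; otherwise keep the caller's. */
    ctx->cipher = (ctx->engine != NULL) ? &ctx->engine->cipher : cipher;
}

void toy_cipher_ctx_cleanup(TOY_CIPHER_CTX *ctx)
{
    assert(ctx != NULL);
    if (ctx->engine != NULL) {
        ctx->engine->funct_ref--; /* release the functional reference */
        ctx->engine = NULL;
    }
    ctx->cipher = NULL;
}
```

With nothing registered, toy_cipher_ctx_init() leaves ctx.engine as NULL and
keeps the application's cipher. After toy_cipher_register(), the same call
stores the engine, bumps its funct_ref, and points ctx.cipher at the engine's
private structure until toy_cipher_ctx_cleanup() runs.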

The "cipher-specific ENGINE code" I mentioned is implemented in tb_cipher.c but
in essence it is simply an instantiation of "ENGINE_TABLE" code for use by
EVP_CIPHER code. tb_digest.c is virtually identical but, of course, it is for
use by EVP_MD code. Ditto for tb_rsa.c, tb_dsa.c, etc. These instantiations of
ENGINE_TABLE essentially provide linker-separation of the classes so that even
if ENGINEs implement all possible algorithms, an application using only
EVP_CIPHER code will link at most code relating to EVP_CIPHER, tb_cipher.c, core
ENGINE code that is independent of class, and of course the ENGINE
implementation that the application loaded. It will not however link any
class-specific ENGINE code for digests, RSA, etc nor will it bleed over into
other APIs, such as the RSA/DSA/etc library code.

ENGINE_TABLE is a little more complicated than may seem necessary but this is
mostly to avoid a lot of "init()"-thrashing on ENGINEs (that may have to load
DSOs, and other expensive setup that shouldn't be thrashed unnecessarily) and
to duplicate "default" behaviour. Basically an ENGINE_TABLE instantiation, for
example tb_cipher.c, implements a hash-table keyed by integer "nid" values.
These nids provide the uniqueness of an algorithm/mode - and each nid will hash
to a potentially NULL "ENGINE_PILE". An ENGINE_PILE is essentially a list of
pointers to ENGINEs that implement that particular 'nid'. Each "pile" uses some
caching tricks such that requests on that 'nid' will be cached and all future
requests will return immediately (well, at least with minimal operation) unless
a change is made to the pile, eg. perhaps an ENGINE was unloaded. The reason is
that an application could have support for 10 ENGINEs statically linked
in, and the machine in question may not have any of the hardware those 10
ENGINEs support. If each of those ENGINEs has a "des_cbc" implementation, we
want to avoid every EVP_CIPHER_CTX setup from trying (and failing) to initialise
each of those 10 ENGINEs. Instead, the first such request will try to do that
and will either return (and cache) a NULL ENGINE pointer or will return a
functional reference to the first that successfully initialised. In the latter
case it will also cache an extra functional reference to the ENGINE as a
"default" for that 'nid'. The caching is acknowledged by an 'uptodate' variable
that is unset only if un/registration takes place on that pile. Ie. if
implementations of "des_cbc" are added or removed. This behaviour can be
tweaked; the ENGINE_TABLE_FLAG_NOINIT value can be passed to
ENGINE_set_table_flags(), in which case the only ENGINEs that tb_cipher.c will
try to initialise from the "pile" will be those that are already initialised
(ie. it's simply an increment of the functional reference count, and no real
"initialisation" will take place).
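
To make the "pile" caching concrete, here is a simplified sketch of a single
pile. It is not the eng_table.c code: TOY_PILE, TOY_PILE_ENGINE and the
toy_pile_* functions are invented for illustration, the hash table keyed by nid
is left out, the 'available' field fakes whether the hardware is present, and
the NOINIT flag is not modelled. It shows the first lookup paying for the
initialisation attempts and later lookups being served from the cache until a
registration clears 'uptodate'.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PILE 8

/* Hypothetical engine: 'available' fakes whether initialisation can succeed. */
typedef struct toy_pile_engine {
    int available;
    int funct_ref;     /* functional reference count */
    int init_attempts; /* how often initialisation was tried */
} TOY_PILE_ENGINE;

/* Hypothetical pile for one nid. */
typedef struct toy_pile {
    TOY_PILE_ENGINE *engines[TOY_MAX_PILE];
    int count;
    TOY_PILE_ENGINE *funct; /* cached default; the cache owns one reference */
    int uptodate;           /* cleared whenever the pile changes */
} TOY_PILE;

/* Try to initialise; on success the caller owns one functional reference. */
static int toy_engine_init(TOY_PILE_ENGINE *e)
{
    e->init_attempts++;
    if (!e->available)
        return 0;
    e->funct_ref++;
    return 1;
}

int toy_pile_register(TOY_PILE *p, TOY_PILE_ENGINE *e)
{
    assert(p != NULL && e != NULL);
    if (p->count >= TOY_MAX_PILE)
        return 0;
    p->engines[p->count++] = e;
    p->uptodate = 0; /* invalidate the cached answer */
    return 1;
}

/* Returns a functional reference to a usable engine, or NULL. */
TOY_PILE_ENGINE *toy_pile_get(TOY_PILE *p)
{
    int i;

    assert(p != NULL);
    if (!p->uptodate) {
        /* Drop any stale default, then find the first engine that works. */
        if (p->funct != NULL) {
            p->funct->funct_ref--;
            p->funct = NULL;
        }
        for (i = 0; i < p->count; i++) {
            if (toy_engine_init(p->engines[i])) {
                p->funct = p->engines[i]; /* this reference is the cache's */
                break;
            }
        }
        p->uptodate = 1; /* a NULL result is cached too */
    }
    if (p->funct != NULL)
        p->funct->funct_ref++; /* extra reference for the caller */
    return p->funct;
}
```

After registering one unavailable and one available engine, two consecutive
toy_pile_get() calls return the available engine both times, yet each engine's
init_attempts stays at 1: only the first request walks the pile.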

RSA, DSA, DH, and RAND all have their own ENGINE_TABLE code as well, and the
difference is that they all use an implicit 'nid' of 1. Whereas EVP_CIPHERs are
actually qualitatively different depending on 'nid' (the "des_cbc" EVP_CIPHER is
not an interoperable implementation of "aes_256_cbc"), RSA_METHODs are
necessarily interoperable and don't have different flavours, only different
implementations. In other words, the ENGINE_TABLE for RSA will either be empty,
or will have a single ENGINE_PILE hashed to by the 'nid' 1 and that pile
represents ENGINEs that implement the single "type" of RSA there is.

Cleanup - the registration and unregistration may pose questions about how
cleanup works with the ENGINE_PILE doing all this caching nonsense (ie. when the
application or EVP_CIPHER code releases its last reference to an ENGINE, the
ENGINE_PILE code may still have references and thus those ENGINEs will stay
hooked in forever). The way this is handled is via "unregistration". With these
new ENGINE changes, an abstract ENGINE can be loaded and initialised, but that
is an algorithm-agnostic process. Even if initialised, it will not have
registered any of its implementations (to do so would link all class "table"
code despite the fact the application may use only ciphers, for example). This
is deliberately a distinct step. Moreover, registration and unregistration have
nothing to do with whether an ENGINE is functional or not (ie. you can even
register an ENGINE and its implementations without it being operational, you may
not even have the drivers to make it operate). What actually happens with
respect to cleanup is managed inside eng_lib.c with the "engine_cleanup_***"
functions. These functions are internal-only and each part of ENGINE code that
could require cleanup will, upon performing its first allocation, register a
callback with the "engine_cleanup" code. The other part of this that makes it
tick is that the ENGINE_TABLE instantiations (tb_***.c) use NULL as their
initialised state. So if RSA code asks for an ENGINE and no ENGINE has
registered an implementation, the code will simply return NULL and the tb_rsa.c
state will be unchanged. Thus, no cleanup is required unless registration takes
place. ENGINE_cleanup() will simply iterate across a list of registered cleanup
callbacks calling each in turn, and will then internally delete its own storage
(a STACK). When a cleanup callback is next registered (eg. if the cleanup() is
part of a graceful restart and the application wants to cleanup all state then
start again), the internal STACK storage will be freshly allocated. This is much
the same as the situation in the ENGINE_TABLE instantiations ... NULL is the
initialised state, so only modification operations (not queries) will cause that
code to have to register a cleanup.
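
The "NULL is the initialised state" cleanup scheme can be illustrated with a
small self-contained sketch. This is not the eng_lib.c code: the toy_cleanup_*
and toy_table_* names are made up, a realloc'd array stands in for the STACK,
and a single int stands in for a class table. It shows that queries never
allocate, that the first modification registers a cleanup callback, and that
everything returns to NULL after cleanup so a restart simply starts over.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*toy_cleanup_cb)(void);

/* Hypothetical stand-in for the internal STACK of cleanup callbacks.
 * NULL means "nothing registered yet". */
static toy_cleanup_cb *toy_cleanup_stack = NULL;
static size_t toy_cleanup_count = 0;

/* Register a callback, allocating the storage on first use. */
int toy_cleanup_add(toy_cleanup_cb cb)
{
    toy_cleanup_cb *grown;

    assert(cb != NULL);
    grown = realloc(toy_cleanup_stack,
                    (toy_cleanup_count + 1) * sizeof(*toy_cleanup_stack));
    if (grown == NULL)
        return 0;
    toy_cleanup_stack = grown;
    toy_cleanup_stack[toy_cleanup_count++] = cb;
    return 1;
}

size_t toy_cleanup_pending(void) { return toy_cleanup_count; }

/* Call every registered callback, then delete our own storage so that the
 * next registration starts from a fresh allocation. */
void toy_cleanup(void)
{
    size_t i;

    for (i = 0; i < toy_cleanup_count; i++)
        toy_cleanup_stack[i]();
    free(toy_cleanup_stack);
    toy_cleanup_stack = NULL;
    toy_cleanup_count = 0;
}

/* Hypothetical class table (think "tb_xxx.c"): NULL until first modified. */
static int *toy_table = NULL;

static void toy_table_cleanup(void)
{
    free(toy_table);
    toy_table = NULL;
}

/* Query: never allocates, so never needs a cleanup. Returns 0 if empty. */
int toy_table_lookup(void) { return (toy_table != NULL) ? *toy_table : 0; }

/* Modification: the first allocation also registers the cleanup callback. */
int toy_table_register(int value)
{
    if (toy_table == NULL) {
        toy_table = malloc(sizeof(*toy_table));
        if (toy_table == NULL)
            return 0;
        if (!toy_cleanup_add(toy_table_cleanup)) {
            free(toy_table);
            toy_table = NULL;
            return 0;
        }
    }
    *toy_table = value;
    return 1;
}
```

A lookup on the untouched table returns 0 and leaves toy_cleanup_pending() at
0. After toy_table_register(), one callback is pending; toy_cleanup() runs it,
frees the callback storage, and a later toy_table_register() begins again from
the NULL state.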

What else? The bignum callbacks and associated ENGINE functions have been
removed for two obvious reasons; (i) there was no way to generalise them to the
mechanism now used by RSA/DSA/..., because there's no such thing as a BIGNUM
method, and (ii) because of (i), there was no meaningful way for library or
application code to automatically hook and use ENGINE supplied bignum functions
anyway. Also, ENGINE_cpy() has been removed (although an internal-only version
exists) - the idea of providing an ENGINE_cpy() function probably wasn't a good
one and now certainly doesn't make sense in any generalised way. Some of the
RSA, DSA, DH, and RAND functions that were fiddled during the original ENGINE
changes have now, as a consequence, been reverted back. This is because the
hooking of ENGINE is now automatic (and passive, it can internally use a NULL
ENGINE pointer to simply ignore ENGINE from then on).

Hell, that should be enough for now ... comments welcome: geoff@openssl.org