| 1 | +/* |
| 2 | + * Copyright 2015 Howard Chu, Symas Corp. |
| 3 | + * All rights reserved. |
| 4 | + * |
| 5 | + * Redistribution and use in source and binary forms, with or without |
| 6 | + * modification, are permitted only as authorized by the OpenLDAP |
| 7 | + * Public License. |
| 8 | + * |
| 9 | + * A copy of this license is available in the file LICENSE in the |
| 10 | + * top-level directory of the distribution or, alternatively, at |
| 11 | + * <http://www.OpenLDAP.org/license.html>. |
| 12 | + */ |
| 13 | +/** @page starting Getting Started |
| 14 | + |
| 15 | +LMDB is compact, fast, powerful, and robust and implements a simplified |
| 16 | +variant of the BerkeleyDB (BDB) API. (BDB is also very powerful, and verbosely |
| 17 | +documented in its own right.) After reading this page, the main |
| 18 | +\ref mdb documentation should make sense. Thanks to Bert Hubert |
| 19 | +for creating the |
| 20 | +<a href="https://github.com/ahupowerdns/ahutils/blob/master/lmdb-semantics.md"> |
| 21 | +initial version</a> of this writeup. |
| 22 | + |
| 23 | +Everything starts with an environment, created by #mdb_env_create(). |
| 24 | +Once created, this environment must also be opened with #mdb_env_open(). |
| 25 | + |
| 26 | +#mdb_env_open() gets passed a name which is interpreted as a directory |
| 27 | +path. Note that this directory must already exist; it is not created |
| 28 | +for you. Within that directory, a lock file and a storage file will be |
| 29 | +generated. If you don't want to use a directory, you can pass the |
| 30 | +#MDB_NOSUBDIR option, in which case the path you provided is used |
| 31 | +directly as the data file, and another file with a "-lock" suffix |
| 32 | +added will be used for the lock file. |
| 33 | + |
| 34 | +Once the environment is open, a transaction can be created within it |
| 35 | +using #mdb_txn_begin(). Transactions may be read-write or read-only, |
| 36 | +and read-write transactions may be nested. A transaction must only |
| 37 | +be used by one thread at a time. Transactions are always required, |
| 38 | +even for read-only access. The transaction provides a consistent |
| 39 | +view of the data. |
| 40 | + |
| 41 | +Once a transaction has been created, a database can be opened within it |
| 42 | +using #mdb_dbi_open(). If only one database will ever be used in the |
| 43 | +environment, a NULL can be passed as the database name. For named |
| 44 | +databases, the #MDB_CREATE flag must be used to create the database |
| 45 | +if it doesn't already exist. Also, #mdb_env_set_maxdbs() must be |
| 46 | +called after #mdb_env_create() and before #mdb_env_open() to set the |
| 47 | +maximum number of named databases you want to support. |
| 48 | + |
| 49 | +Note: a single transaction can open multiple databases. Generally |
| 50 | +databases should only be opened once, by the first transaction in |
| 51 | +the process. After the first transaction completes, the database |
| 52 | +handles can freely be used by all subsequent transactions. |
| 53 | + |
| 54 | +Within a transaction, #mdb_get() and #mdb_put() can retrieve and store |
| 55 | +single key/value pairs if that is all you need to do (but see \ref Cursors |
| 56 | +below if you want to do more). |
| 57 | + |
| 58 | +A key/value pair is expressed as two #MDB_val structures. This struct |
| 59 | +has two fields, \c mv_size and \c mv_data. The data is a \c void pointer to |
| 60 | +an array of \c mv_size bytes. |
| 61 | + |
| 62 | +Because LMDB is very efficient (and usually zero-copy), the data returned |
| 63 | +in an #MDB_val structure may be memory-mapped straight from disk. In |
| 64 | +other words <b>look but do not touch</b> (or free() for that matter). |
| 65 | +Once a transaction is closed, the values can no longer be used, so |
| 66 | +make a copy if you need to keep them after that. |
| 67 | + |
| 68 | +@section Cursors Cursors |
| 69 | + |
| 70 | +To do more powerful things, we must use a cursor. |
| 71 | + |
| 72 | +Within the transaction, a cursor can be created with #mdb_cursor_open(). |
| 73 | +With this cursor we can store/retrieve/delete (multiple) values using |
| 74 | +#mdb_cursor_get(), #mdb_cursor_put(), and #mdb_cursor_del(). |
| 75 | + |
| 76 | +#mdb_cursor_get() positions itself depending on the cursor operation |
| 77 | +requested, and for some operations, on the supplied key. For example, |
| 78 | +to list all key/value pairs in a database, use operation #MDB_FIRST for |
| 79 | +the first call to #mdb_cursor_get(), and #MDB_NEXT on subsequent calls, |
| 80 | +until the end is hit. |
| 81 | + |
| 82 | +To retrieve all keys starting from a specified key, use #MDB_SET and then |
| 83 | +#MDB_NEXT. For more cursor operations, see the \ref mdb docs. |
| 84 | + |
| 85 | +When using #mdb_cursor_put(), either the function will position the |
| 86 | +cursor for you based on the \b key, or you can use operation |
| 87 | +#MDB_CURRENT to use the current position of the cursor. Note that |
| 88 | +\b key must then match the current position's key. |
| 89 | + |
| 90 | +@subsection summary Summarizing the Opening |
| 91 | + |
| 92 | +So we have a cursor in a transaction which opened a database in an |
| 93 | +environment which is opened from a filesystem after it was |
| 94 | +separately created. |
| 95 | + |
| 96 | +Or, we create an environment, open it from a filesystem, create a |
| 97 | +transaction within it, open a database within that transaction, |
| 98 | +and create a cursor within all of the above. |
| 99 | + |
| 100 | +Got it? |
| 101 | + |
| 102 | +@section thrproc Threads and Processes |
| 103 | + |
| 104 | +LMDB uses POSIX locks on files, and these locks have issues if one |
| 105 | +process opens a file multiple times. Because of this, do not |
| 106 | +#mdb_env_open() a file multiple times from a single process. Instead, |
| 107 | +share the LMDB environment that has opened the file across all threads. |
| 108 | +Otherwise, if a single process opens the same environment multiple times, |
| 109 | +closing it once will remove all the locks held on it, and the other |
| 110 | +instances will be vulnerable to corruption from other processes. |
| 111 | + |
| 112 | +Also note that a transaction is tied to one thread by default using |
| 113 | +Thread Local Storage. If you want to pass read-only transactions across |
| 114 | +threads, you can use the #MDB_NOTLS option on the environment. |
| 115 | + |
| 116 | +@section txns Transactions, Rollbacks, etc. |
| 117 | + |
| 118 | +To actually get anything done, a transaction must be committed using |
| 119 | +#mdb_txn_commit(). Alternatively, all of a transaction's operations |
| 120 | +can be discarded using #mdb_txn_abort(). When a read-only transaction |
| 121 | +ends, its cursors are \b not freed automatically. When a read-write |
| 122 | +transaction ends, all of its cursors are freed and must not be used again. |
| 123 | + |
| 124 | +For read-only transactions, obviously there is nothing to commit to |
| 125 | +storage. The transaction still must eventually be aborted to close |
| 126 | +any database handle(s) opened in it, or committed to keep the |
| 127 | +database handles around for reuse in new transactions. |
| 128 | + |
| 129 | +In addition, as long as a transaction is open, a consistent view of |
| 130 | +the database is kept alive, which requires storage. A read-only |
| 131 | +transaction should therefore be terminated (committed or aborted) |
| 132 | +as soon as this consistent view is no longer needed (but see below |
| 133 | +for an optimization). |
| 134 | + |
| 135 | +There can be multiple simultaneously active read-only transactions |
| 136 | +but only one that can write. Once a single read-write transaction |
| 137 | +is opened, all further attempts to begin one will block until the |
| 138 | +first one is committed or aborted. This has no effect on read-only |
| 139 | +transactions, however, and they may continue to be opened at any time. |
| 140 | + |
| 141 | +@section dupkeys Duplicate Keys |
| 142 | + |
| 143 | +#mdb_get() has no support, and #mdb_put() only limited support, |
| 144 | +for multiple key/value pairs with identical keys. If there are multiple |
| 145 | +values for a key, #mdb_get() will only return the first value. |
| 146 | + |
| 147 | +When multiple values for one key are required, pass the #MDB_DUPSORT |
| 148 | +flag to #mdb_dbi_open(). In an #MDB_DUPSORT database, by default |
| 149 | +#mdb_put() will not replace the value for a key if the key existed |
| 150 | +already. Instead it will add the new value to the key. In addition, |
| 151 | +#mdb_del() will pay attention to the value field too, allowing for |
| 152 | +specific values of a key to be deleted. |
| 153 | + |
| 154 | +Finally, additional cursor operations become available for |
| 155 | +traversing through and retrieving duplicate values. |
| 156 | + |
| 157 | +@section optim Some Optimization |
| 158 | + |
| 159 | +If you frequently begin and abort read-only transactions, as an |
| 160 | +optimization, you can reset and renew a single transaction instead. |
| 161 | + |
| 162 | +#mdb_txn_reset() releases any old copies of data kept around for |
| 163 | +a read-only transaction. To reuse this reset transaction, call |
| 164 | +#mdb_txn_renew() on it. Any cursors in this transaction must also |
| 165 | +be renewed using #mdb_cursor_renew(). |
| 166 | + |
| 167 | +Note that #mdb_txn_reset() is similar to #mdb_txn_abort() and will |
| 168 | +close any databases you opened within the transaction. |
| 169 | + |
| 170 | +To permanently free a transaction, reset or not, use #mdb_txn_abort(). |
| 171 | + |
| 172 | +@section cleanup Cleaning Up |
| 173 | + |
| 174 | +For read-only transactions, any cursors created within them must |
| 175 | +be closed using #mdb_cursor_close(). |
| 176 | + |
| 177 | +It is very rarely necessary to close a database handle, and in |
| 178 | +general they should just be left open. |
| 179 | + |
| 180 | +@section onward The Full API |
| 181 | + |
| 182 | +The full \ref mdb documentation lists further details, like how to: |
| 183 | + |
| 184 | + \li size a database (the default limits are intentionally small) |
| 185 | + \li drop and clean a database |
| 186 | + \li detect and report errors |
| 187 | + \li optimize (bulk) loading speed |
| 188 | + \li (temporarily) reduce robustness to gain even more speed |
| 189 | + \li gather statistics about the database |
| 190 | + \li define custom sort orders |
| 191 | + |
| 192 | +*/ |