worker-promise (via ESM) tests

SQLite Source Repository

Untar the source archive. CD into the "unix/" subfolder of the source tree.

- Run: `mkdir $HOME/local`
- Run: `./configure --prefix=$HOME/local`
- Run: `make install`
+ Run: `mkdir $HOME/local`
+ Run: `./configure --prefix=$HOME/local`
+ SQLite deliverable builds add: `--static CFLAGS=-Os`
+ Run: `make install`

As of 2024-10-25, TCL is no longer required for many @@ -36,12 +37,35 @@ guidance on building for Windows. 4. Download the SQLite source tree and unpack it. CD into the toplevel directory of the source tree. - 5. Run: `./configure --enable-all --with-tclsh=$HOME/local/bin/tclsh9.0` + 5. *(Optional):* Download Antirez's "linenoise" command-line editing + library to provide command-line editing in the CLI. You can find + a suitable copy of the linenoise sources at + or at various other locations + on the internet. If you put the linenoise source tree in + a directory named $HOME/linenoise or a directory "linenoise" which + is a sibling of the SQLite source tree, then the SQLite ./configure + script will automatically find and use that source code to provide + command-line editing in the CLI. If you would rather use the readline + or editline libraries or a precompiled linenoise library, there are + ./configure options to accommodate that choice. The SQLite developers + typically use $HOME/linenoise since linenoise is small, has no + external dependencies, "just works", and the ./configure script + will pick it up and use it automatically. But you do what works + best for you. +

+ You are not required to have any command-line editing support in + order to use SQLite. But command-line editing does make the + interactive experience more enjoyable. + + 6. Run: `./configure --enable-all --with-tclsh=$HOME/local/bin/tclsh9.0` You do not need to use --with-tclsh if the tclsh you want to use is the first one on your PATH or if you are building without TCL. - 6. Run the "`Makefile`" makefile with an appropriate target. + Lots of other options to ./configure are available. + Run `./configure --help` for further guidance. + + 7. Run the "`Makefile`" makefile with an appropriate target. Examples:

`make sqlite3.c` @@ -66,5 +90,7 @@ guidance on building for Windows. of SQLite. - 7. For a debugging build of the CLI, where the ".treetrace" and ".wheretrace" - commands work, add the the --with-debug argument to configure. + 8. For a debugging build of the CLI, use `./configure --dev`. A debugging + build contains lots of extra debugging code, so it is slow and a lot + bigger. You probably do not want to deploy a debugging build. But if + you are working on the code, a debugging build works much better. diff --git a/doc/compile-for-windows.md b/doc/compile-for-windows.md index 0e59c83fed..30536d5fd8 100644 --- a/doc/compile-for-windows.md +++ b/doc/compile-for-windows.md @@ -180,3 +180,39 @@ statically linked so that it does not depend on separate DLL: 6. After your executable is built, you can verify that it does not depend on the TCL DLL by running:
```
dumpbin /dependents sqlite3_analyzer.exe
```
+ +## Linking Against ZLIB + +Some features (such as zip-file support in the CLI) require the ZLIB +compression library. That library is more or less universally available +on unix platforms, but is seldom provided on Windows. You will probably +need to provide it yourself. Here are the steps needed: + + 1. Download the zlib-1.3.1.tar.gz tarball (or a similar version). + Unpack the tarball sources. You can put them wherever you like. + For the purposes of this document, let's assume you put the source + tree in c:\\zlib-64. Note: If you are building for both x64 and + x86, you will need separate builds of ZLIB for each, thus separate + build directories. + + 2. Before building SQLite (as described above) first make these + environment changes. The lead-programmer for SQLite (who writes + these words) has BAT files named "env-x64.bat" and "env-x32.bat" + and "env-arm64.bat" that make these changes, and he runs those + BAT files whenever he starts a new shell. These are the settings + needed: +
+ set USE_ZLIB=1
+ set BUILD_ZLIB=0
+ set ZLIBDIR=c:\\zlib-64 +
+ + 3. Because the settings in step 2 specify "BUILD_ZLIB=0", you will need + to build the library at least once. I recommend: +
+ make clean sqlite3.exe BUILD_ZLIB=1 +
+ + 4. After making the environment changes specified in steps 1 through 3 + above, you then build and test SQLite as you normally would. The + environment variable changes will cause ZLIB to be linked automatically. diff --git a/doc/lemon.html b/doc/lemon.html index 965f305c04..a994b396b7 100644 --- a/doc/lemon.html +++ b/doc/lemon.html @@ -696,6 +696,7 @@
4.4 Special Directives
%right
%realloc
%stack_overflow +
%stack_size_limit
%stack_size
%start_symbol
%syntax_error @@ -1203,20 +1204,33 @@
4.4.25 The %wildcard directive
The wildcard token is only matched if there are no alternatives.
- 4.4.26 The %realloc and %free directives
+ 4.4.26 The %realloc, %free, and %stack_size_limit directives

The %realloc and %free directives define functions -that allocate and free heap memory. The signatures of these functions -should be the same as the realloc() and free() functions from the standard -C library. - -
If both of these functions are defined -then these functions are used to allocate and free -memory for supplemental parser stack space, if the initial -parse stack space is exceeded. The initial parser stack size +that allocate and free heap memory. The signatures and semantics of +these functions are similar to the realloc() and free() functions from +the standard C library, except that these functions take an extra +parameter at the end that is determined by %extra_context. If +%extra_context is not defined, then the extra argument is 0. The +extra parameter provides the capability to do better error reporting +in the event of a memory allocation error, and/or to use an alternative +private application heap. + +
If both of these functions are defined then they are used to +allocate and free memory for supplemental parser stack space, if +the initial parse stack space is exceeded. The initial parser stack size is specified by either %stack_size or the -DYYSTACKDEPTH compile-time flag. +
The %stack_size_limit directive defines a function that returns +the maximum allowed parser stack size. If this directive does not exist, +no size limit is enforced. This function takes a single argument which +is the %extra_context value or "0" if %extra_context is not defined. +The function should return an integer that is the maximum +number of parser stack entries. If more stack space +than this is needed, the %stack_overflow code is invoked. +
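
As a rough illustration of the three hooks described in this section, the C sketch below shows a realloc-like and a free-like function that each take a trailing context argument, plus a function that reports a stack-entry limit. The names `DemoCtx`, `demo_realloc`, `demo_free`, and `demo_stack_limit` are hypothetical, and the parameter types are an assumption. The text above only says the signatures are "similar to" realloc() and free(), so check the generated parser for the prototypes Lemon really expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical application context. It stands in for whatever value
** %extra_context supplies; the parser passes 0 if none is defined. */
typedef struct DemoCtx {
  int nOom;      /* Number of failed allocations observed */
  int mxStack;   /* Maximum parser stack entries, or 0 to use the default */
} DemoCtx;

/* realloc()-like hook with an extra trailing context argument (assumed
** shape). The context lets the application record allocation failures. */
void *demo_realloc(void *pOld, size_t nByte, DemoCtx *pCtx){
  void *pNew = realloc(pOld, nByte);
  if( pNew==0 && nByte>0 && pCtx ) pCtx->nOom++;
  return pNew;
}

/* free()-like hook. The context is unused in this sketch. */
void demo_free(void *p, DemoCtx *pCtx){
  (void)pCtx;
  free(p);
}

/* Returns the maximum number of parser stack entries. The fallback of
** 1000 is an arbitrary value chosen for this example only. */
int demo_stack_limit(DemoCtx *pCtx){
  assert( pCtx==0 || pCtx->mxStack>=0 );
  if( pCtx && pCtx->mxStack>0 ) return pCtx->mxStack;
  return 1000;
}
```

A grammar would presumably name these functions with lines such as `%realloc demo_realloc`, `%free demo_free`, and `%stack_size_limit demo_stack_limit`. Treat that spelling as an assumption to verify against the Lemon documentation.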
5.0 Error Processing
diff --git a/doc/testrunner.md b/doc/testrunner.md index d1696e9d1d..90ef4b71f2 100644 --- a/doc/testrunner.md +++ b/doc/testrunner.md @@ -4,6 +4,12 @@
- 1. Overview +
  - 1.1. Running testrunner.tcl +
  - 1.2. Run using "make" +
  - 1.3. Outputs from testrunner.tcl +
  - 1.4. Built-in help +
- 2. Binary Tests
  - 2.1. Organization of Tcl Tests @@ -26,17 +32,40 @@ The testrunner.tcl program is a Tcl script used to run multiple SQLite tests in parallel, thus reducing testing time on multi-core machines. -It supports the following types of tests: +The testrunner.tcl supports running tests that are based on `testfixture`, +`sqlite3`, and `fuzzcheck`. + + +## 1.1 Running testrunner.tcl + +The testrunner.tcl script is located in the "test" subdirectory of the +SQLite source tree. So if your shell is currently positioned at the top +of the source tree, you would normally run the script using the command: +"test/testrunner.tcl". On Windows, you have to specify the +`tclsh` interpreter command first, like this: +"tclsh test/testrunner.tcl". + +In this document, we will assume that you are on a unix-like OS +(not on Windows) and that your current directory is the root +of the SQLite source tree, and so all invocations of the testrunner.tcl +script will be of the form "test/testrunner.tcl". If you +are in a different directory, then make appropriate adjustments to +the path. On Windows, add the "tclsh" interpreter command +up front. - * Tcl test scripts. + +## 1.2 Run using make - * Fuzzcheck tests, including using an external fuzzcheck database. +The standard Makefiles for SQLite include targets that invoke +testrunner.tcl. So the following commands also run testrunner.tcl: - * Tests run with `make` commands. Examples: - - `make devtest` - - `make releasetest` - - `make sdevtest` - - `make testrunner` + * `make devtest` + * `make releasetest` + * `make sdevtest` + * `make testrunner` + + +## 1.3 Outputs from testrunner.tcl The testrunner.tcl program stores output of all tests and builds run in log file **testrunner.log**, created in the current working directory. @@ -54,17 +83,19 @@ A useful query might be: ``` You can get a summary of errors in a prior run by invoking commands like -these: +those shown below. 
Note that the testrunner.tcl script can be run directly +on unix systems (including Macs) but you will need to add tclsh +to the front on Windows. ``` - tclsh $(TESTDIR)/testrunner.tcl errors - tclsh $(TESTDIR)/testrunner.tcl errors -v + test/testrunner.tcl errors + test/testrunner.tcl errors -v ``` Running the command: ``` - tclsh $(TESTDIR)/testrunner.tcl status + test/testrunner.tcl status ``` in the directory containing the testrunner.db database runs various queries @@ -73,32 +104,43 @@ A good way to keep and eye on test progress is to run either of the two following commands: ``` - watch tclsh $(TESTDIR)/testrunner.tcl status - tclsh $(TESTDIR)/testrunner.tcl status -d 2 + watch test/testrunner.tcl status + test/testrunner.tcl status -d 2 ``` Both of the commands above accomplish about the same thing, but the second one has the advantage of not requiring "watch" to be installed on your system. -Sometimes testrunner.tcl uses the `testfixture` binary that it is run with -to run tests (see "Binary Tests" below). Sometimes it builds testfixture and +Sometimes testrunner.tcl uses the `testfixture` and `sqlite3` binaries that +are in the directory from which testrunner.tcl is run. +(see "Binary Tests" below). Sometimes it builds testfixture and other binaries in specific configurations to test (see "Source Tests"). + +## 1.4 Built-in help + +Run this command: + +``` + test/testrunner.tcl help +``` + +To get a summary of all of the various command-line options available +with testrunner.tcl + + # 2. Binary Tests The commands described in this section all run various combinations of the Tcl -test scripts using the `testfixture` binary used to run the testrunner.tcl -script (i.e. they do not invoke the compiler to build new binaries, or the -`make` command to run tests that are not Tcl scripts). The procedure to run -these tests is therefore: - - 1. Build the "testfixture" (or "testfixture.exe" for windows) binary using - whatever method seems convenient. - - 2. 
Test the binary built in step 1 by running testrunner.tcl with it, - perhaps with various options. +test scripts using whatever `testfixture` binary (and maybe also the `sqlite3` +binary, depending on the test) that is found in the directory from which +testrunner.tcl is launched. So typically, one must first run something +like "`make testfixture sqlite3`" before launching binary tests. In other +words, testrunner.tcl does not automatically build the binaries under test +for binary tests. The testrunner.tcl expects the binaries to be available +already. The following sub-sections describe the various options that can be passed to testrunner.tcl to test binary testfixture builds. @@ -140,22 +182,22 @@ are defined in file *testrunner_data.tcl*. To run the "veryquick" test set, use either of the following: ``` - ./testfixture $TESTDIR/testrunner.tcl - ./testfixture $TESTDIR/testrunner.tcl veryquick + test/testrunner.tcl + test/testrunner.tcl veryquick ``` To run the "full" test suite: ``` - ./testfixture $TESTDIR/testrunner.tcl full + test/testrunner.tcl full ``` To run the subset of the "full" test suite for which the test file name matches a specified pattern (e.g. all tests that start with "fts5"), either of: ``` - ./testfixture $TESTDIR/testrunner.tcl fts5% - ./testfixture $TESTDIR/testrunner.tcl 'fts5*' + test/testrunner.tcl fts5% + test/testrunner.tcl 'fts5*' ``` Strictly speaking, for a test to be run the pattern must match the script @@ -167,7 +209,7 @@ characters specified as part of the pattern are transformed to "\*". 
To run "all" tests (full + permutations): ``` - ./testfixture $TESTDIR/testrunner.tcl all + test/testrunner.tcl all ``` @@ -186,10 +228,15 @@ If there is no permutation, the individual test script may be run with: Or, if the failure occured as part of a permutation: ``` - ./testfixture $TESTDIR/testrunner.tcl $PERMUTATION $PATH_TO_SCRIPT + test/testrunner.tcl $PERMUTATION $PATH_TO_SCRIPT ``` -TODO: An example instead of "$PERMUTATION" and $PATH\_TO\_SCRIPT? +One can also rerun all tests that failed or did not complete +in the previous invocation by typing: + +``` + test/testrunner.tcl retest +``` # 3. Source Code Tests @@ -204,11 +251,9 @@ other tests. The advantages of this are that: * it ensures that tests are always run using binaries created with the same set of compiler options. -The testrunner.tcl commands described in this section may be run using -either a *testfixture* (or testfixture.exe) build, or with any other Tcl -shell that supports SQLite 3.31.1 or newer via "package require sqlite3". - -TODO: ./configure + Makefile.msc build systems. +The testrunner.tcl commands described in this section do not require that +the testfixture and/or sqlite3 binaries be built ahead of time. Those +binaries will be constructed automatically. ## 3.1. 
Commands to Run SQLite Tests @@ -218,7 +263,7 @@ the `make fuzztest` target once for each of two --enable-all builds - one with debugging enabled and one without: ``` - tclsh $TESTDIR/testrunner.tcl mdevtest + test/testrunner.tcl mdevtest ``` In other words, it is equivalent to running: @@ -227,13 +272,13 @@ In other words, it is equivalent to running: $TOP/configure --enable-all --enable-debug make fuzztest make testfixture - ./testfixture $TOP/test/testrunner.tcl veryquick + $TOP/test/testrunner.tcl veryquick # Then, after removing files created by the tests above: $TOP/configure --enable-all OPTS="-O0" make fuzztest make testfixture - ./testfixture $TOP/test/testrunner.tcl veryquick + $TOP/test/testrunner.tcl veryquick ``` The **sdevtest** command is identical to the mdevtest command, except that the @@ -241,7 +286,7 @@ second of the two builds is a sanitizer build. Specifically, this means that OPTS="-fsanitize=address,undefined" is specified instead of OPTS="-O0": ``` - tclsh $TESTDIR/testrunner.tcl sdevtest + test/testrunner.tcl sdevtest ``` The **release** command runs lots of tests under lots of builds. It runs @@ -250,7 +295,7 @@ on Linux, Windows or OSX. Refer to *testrunner\_data.tcl* for the details of the specific tests run. ``` - tclsh $TESTDIR/testrunner.tcl release + test/testrunner.tcl release ``` As with source code tests, one or more patterns @@ -258,7 +303,7 @@ may be appended to any of the above commands (mdevtest, sdevtest or release). Pattern matching is used for both Tcl tests and fuzz tests. 
``` - tclsh $TESTDIR/testrunner.tcl release rtree% + test/testrunner.tcl release rtree% ``` @@ -268,14 +313,14 @@ testrunner.tcl can build a zipvfs-enabled testfixture and use it to run tests from the Zipvfs project with the following command: ``` - tclsh $TESTDIR/testrunner.tcl --zipvfs $PATH_TO_ZIPVFS + test/testrunner.tcl --zipvfs $PATH_TO_ZIPVFS ``` This can be combined with any of "mdevtest", "sdevtest" or "release" to test both SQLite and Zipvfs with a single command: ``` - tclsh $TESTDIR/testrunner.tcl --zipvfs $PATH_TO_ZIPVFS mdevtest + test/testrunner.tcl --zipvfs $PATH_TO_ZIPVFS mdevtest ``` @@ -295,10 +340,10 @@ a dos \*.bat file on windows. For example: ``` # Create a script that recreates build configuration "Device-One" on # Linux or OSX: - tclsh $TESTDIR/testrunner.tcl script Device-One > make.sh + test/testrunner.tcl script Device-One > make.sh # Create a script that recreates build configuration "Have-Not" on Windows: - tclsh $TESTDIR/testrunner.tcl script Have-Not > make.bat + test/testrunner.tcl script Have-Not > make.bat ``` The generated bash or \*.bat file script accepts a single argument - a makefile @@ -321,7 +366,7 @@ Thus, for example, to run a full releasetest including an external dbsqlfuzz database, run a command like one of these: ``` - tclsh test/testrunner.tcl releasetest --fuzzdb ../fuzz/20250415.db + test/testrunner.tcl releasetest --fuzzdb ../fuzz/20250415.db FUZZDB=../fuzz/20250415.db make releasetest nmake /f Makefile.msc FUZZDB=../fuzz/20250415.db releasetest ``` @@ -331,7 +376,7 @@ databases. So if you want to run *only* tests involving the external database, you can use a command something like this: ``` - tclsh test/testrunner.tcl releasetest 20250415 --fuzzdb ../fuzz/20250415.db + test/testrunner.tcl releasetest 20250415 --fuzzdb ../fuzz/20250415.db ``` @@ -345,7 +390,7 @@ required by a test, not to run any actual tests. For example: ``` # Build binaries required by release test. 
- tclsh $TESTDIR/testrunner.tcl --buildonly release" + test/testrunner.tcl --buildonly release ``` The **--dryrun** option prevents testrunner.tcl from building any binaries @@ -354,7 +399,7 @@ would normally execute into the testrunner.log file. Example: ``` # Log the shell commands that make up the mdevtest test. - tclsh $TESTDIR/testrunner.tcl --dryrun mdevtest" + test/testrunner.tcl --dryrun mdevtest ``` The **--explain** option is similar to --dryrun in that it prevents @@ -364,7 +409,7 @@ summary of all the builds and tests that would have been run. ``` # Show what builds and tests would have been run - tclsh $TESTDIR/testrunner.tcl --explain mdevtest + test/testrunner.tcl --explain mdevtest ``` The **--status** option uses VT100 escape sequences to display the test @@ -380,7 +425,7 @@ When running either binary or source code tests, testrunner.tcl reports the number of jobs it intends to use to stdout. e.g. ``` - $ ./testfixture $TESTDIR/testrunner.tcl + $ test/testrunner.tcl splitting work across 16 jobs ... more output ... 
``` @@ -400,5 +445,5 @@ running by exucuting the following command from the directory containing the testrunner.log and testrunner.db files: ``` - $ ./testfixture $TESTDIR/testrunner.tcl njob $NEW_NUMBER_OF_JOBS + $ test/testrunner.tcl njob $NEW_NUMBER_OF_JOBS ``` diff --git a/ext/expert/expert1.test b/ext/expert/expert1.test index 0c3b512af0..aaea03711d 100644 --- a/ext/expert/expert1.test +++ b/ext/expert/expert1.test @@ -90,7 +90,7 @@ foreach {tn setup} { proc do_rec_test {tn sql res} { set res [squish [string trim $res]] set tst [subst -nocommands { - squish [string trim [exec $::CLI test.db ".expert" {$sql;}]] + squish [string trim [exec $::CLI -noinit test.db ".expert" {$sql;}]] }] uplevel [list do_test $tn $tst $res] } diff --git a/ext/fts3/fts3.c b/ext/fts3/fts3.c index f178abafed..368e9b189a 100644 --- a/ext/fts3/fts3.c +++ b/ext/fts3/fts3.c @@ -1816,9 +1816,7 @@ static int fts3CursorSeekStmt(Fts3Cursor *pCsr){ zSql = sqlite3_mprintf("SELECT %s WHERE rowid = ?", p->zReadExprlist); if( !zSql ) return SQLITE_NOMEM; p->bLock++; - rc = sqlite3_prepare_v3( - p->db, zSql,-1,SQLITE_PREPARE_PERSISTENT,&pCsr->pStmt,0 - ); + rc = sqlite3Fts3PrepareStmt(p, zSql, 1, 1, &pCsr->pStmt); p->bLock--; sqlite3_free(zSql); } @@ -3393,9 +3391,7 @@ static int fts3FilterMethod( } if( zSql ){ p->bLock++; - rc = sqlite3_prepare_v3( - p->db,zSql,-1,SQLITE_PREPARE_PERSISTENT,&pCsr->pStmt,0 - ); + rc = sqlite3Fts3PrepareStmt(p, zSql, 1, 1, &pCsr->pStmt); p->bLock--; sqlite3_free(zSql); }else{ @@ -4018,6 +4014,7 @@ static int fts3IntegrityMethod( UNUSED_PARAMETER(isQuick); rc = sqlite3Fts3IntegrityCheck(p, &bOk); + assert( pVtab->zErrMsg==0 || rc!=SQLITE_OK ); assert( rc!=SQLITE_CORRUPT_VTAB ); if( rc==SQLITE_ERROR || (rc&0xFF)==SQLITE_CORRUPT ){ *pzErr = sqlite3_mprintf("unable to validate the inverted index for" diff --git a/ext/fts3/fts3Int.h b/ext/fts3/fts3Int.h index e98b90a753..fea31aae84 100644 --- a/ext/fts3/fts3Int.h +++ b/ext/fts3/fts3Int.h @@ -203,7 +203,16 @@ typedef 
sqlite3_int64 i64; /* 8-byte signed integer */ #define LARGEST_INT64 (0xffffffff|(((i64)0x7fffffff)<<32)) #define SMALLEST_INT64 (((i64)-1) - LARGEST_INT64) -#define deliberate_fall_through +#if !defined(deliberate_fall_through) +# if defined(__has_attribute) +# if __has_attribute(fallthrough) +# define deliberate_fall_through __attribute__((fallthrough)); +# endif +# endif +#endif +#if !defined(deliberate_fall_through) +# define deliberate_fall_through +#endif /* ** Macros needed to provide flexible arrays in a portable way @@ -601,6 +610,15 @@ int sqlite3Fts3Incrmerge(Fts3Table*,int,int); (*(u8*)(p)&0x80) ? sqlite3Fts3GetVarint32(p, piVal) : (*piVal=*(u8*)(p), 1) \ ) +int sqlite3Fts3PrepareStmt( + Fts3Table *p, /* Prepare for this connection */ + const char *zSql, /* SQL to prepare */ + int bPersist, /* True to set SQLITE_PREPARE_PERSISTENT */ + int bAllowVtab, /* True to omit SQLITE_PREPARE_NO_VTAB */ + sqlite3_stmt **pp /* OUT: Prepared statement */ +); + + /* fts3.c */ void sqlite3Fts3ErrMsg(char**,const char*,...); int sqlite3Fts3PutVarint(char *, sqlite3_int64); diff --git a/ext/fts3/fts3_aux.c b/ext/fts3/fts3_aux.c index 439d579366..042fe53946 100644 --- a/ext/fts3/fts3_aux.c +++ b/ext/fts3/fts3_aux.c @@ -325,7 +325,7 @@ static int fts3auxNextMethod(sqlite3_vtab_cursor *pCursor){ pCsr->aStat[1].nDoc++; } eState = 2; - /* fall through */ + /* no break */ deliberate_fall_through case 2: if( v==0 ){ /* 0x00. Next integer will be a docid. */ diff --git a/ext/fts3/fts3_write.c b/ext/fts3/fts3_write.c index 19dff31f00..1b8bca70f2 100644 --- a/ext/fts3/fts3_write.c +++ b/ext/fts3/fts3_write.c @@ -98,9 +98,9 @@ typedef struct SegmentWriter SegmentWriter; ** incrementally. See function fts3PendingListAppend() for details. 
*/ struct PendingList { - int nData; + sqlite3_int64 nData; char *aData; - int nSpace; + sqlite3_int64 nSpace; sqlite3_int64 iLastDocid; sqlite3_int64 iLastCol; sqlite3_int64 iLastPos; @@ -273,6 +273,24 @@ struct SegmentNode { #define SQL_UPDATE_LEVEL_IDX 38 #define SQL_UPDATE_LEVEL 39 +/* +** Wrapper around sqlite3_prepare_v3() to ensure that SQLITE_PREPARE_FROM_DDL +** is always set. +*/ +int sqlite3Fts3PrepareStmt( + Fts3Table *p, /* Prepare for this connection */ + const char *zSql, /* SQL to prepare */ + int bPersist, /* True to set SQLITE_PREPARE_PERSISTENT */ + int bAllowVtab, /* True to omit SQLITE_PREPARE_NO_VTAB */ + sqlite3_stmt **pp /* OUT: Prepared statement */ +){ + int f = SQLITE_PREPARE_FROM_DDL + |((bAllowVtab==0) ? SQLITE_PREPARE_NO_VTAB : 0) + |(bPersist ? SQLITE_PREPARE_PERSISTENT : 0); + + return sqlite3_prepare_v3(p->db, zSql, -1, f, pp, NULL); +} + /* ** This function is used to obtain an SQLite prepared statement handle ** for the statement identified by the second argument. 
If successful, @@ -398,12 +416,12 @@ static int fts3SqlStmt( pStmt = p->aStmt[eStmt]; if( !pStmt ){ - int f = SQLITE_PREPARE_PERSISTENT|SQLITE_PREPARE_NO_VTAB; + int bAllowVtab = 0; char *zSql; if( eStmt==SQL_CONTENT_INSERT ){ zSql = sqlite3_mprintf(azSql[eStmt], p->zDb, p->zName, p->zWriteExprlist); }else if( eStmt==SQL_SELECT_CONTENT_BY_ROWID ){ - f &= ~SQLITE_PREPARE_NO_VTAB; + bAllowVtab = 1; zSql = sqlite3_mprintf(azSql[eStmt], p->zReadExprlist); }else{ zSql = sqlite3_mprintf(azSql[eStmt], p->zDb, p->zName); @@ -411,7 +429,7 @@ static int fts3SqlStmt( if( !zSql ){ rc = SQLITE_NOMEM; }else{ - rc = sqlite3_prepare_v3(p->db, zSql, -1, f, &pStmt, NULL); + rc = sqlite3Fts3PrepareStmt(p, zSql, 1, bAllowVtab, &pStmt); sqlite3_free(zSql); assert( rc==SQLITE_OK || pStmt==0 ); p->aStmt[eStmt] = pStmt; @@ -760,7 +778,9 @@ static int fts3PendingTermsAddOne( pList = (PendingList *)fts3HashFind(pHash, zToken, nToken); if( pList ){ - p->nPendingData -= (pList->nData + nToken + sizeof(Fts3HashElem)); + assert( (i64)pList->nData+(i64)nToken+(i64)sizeof(Fts3HashElem) + <= (i64)p->nPendingData ); + p->nPendingData -= (int)(pList->nData + nToken + sizeof(Fts3HashElem)); } if( fts3PendingListAppend(&pList, p->iPrevDocid, iCol, iPos, &rc) ){ if( pList==fts3HashInsert(pHash, zToken, nToken, pList) ){ @@ -773,7 +793,9 @@ static int fts3PendingTermsAddOne( } } if( rc==SQLITE_OK ){ - p->nPendingData += (pList->nData + nToken + sizeof(Fts3HashElem)); + assert( (i64)p->nPendingData + pList->nData + nToken + + sizeof(Fts3HashElem) <= 0x3fffffff ); + p->nPendingData += (int)(pList->nData + nToken + sizeof(Fts3HashElem)); } return rc; } @@ -3574,7 +3596,7 @@ static int fts3DoRebuild(Fts3Table *p){ if( !zSql ){ rc = SQLITE_NOMEM; }else{ - rc = sqlite3_prepare_v2(p->db, zSql, -1, &pStmt, 0); + rc = sqlite3Fts3PrepareStmt(p, zSql, 0, 1, &pStmt); sqlite3_free(zSql); } @@ -5327,7 +5349,7 @@ int sqlite3Fts3IntegrityCheck(Fts3Table *p, int *pbOk){ if( !zSql ){ rc = SQLITE_NOMEM; }else{ - rc = 
sqlite3_prepare_v2(p->db, zSql, -1, &pStmt, 0); + rc = sqlite3Fts3PrepareStmt(p, zSql, 0, 1, &pStmt); sqlite3_free(zSql); } @@ -5457,7 +5479,7 @@ static int fts3SpecialInsert(Fts3Table *p, sqlite3_value *pVal){ v = atoi(&zVal[9]); if( v>=24 && v<=p->nPgsz-35 ) p->nNodeSize = v; rc = SQLITE_OK; - }else if( nVal>11 && 0==sqlite3_strnicmp(zVal, "maxpending=", 9) ){ + }else if( nVal>11 && 0==sqlite3_strnicmp(zVal, "maxpending=", 11) ){ v = atoi(&zVal[11]); if( v>=64 && v<=FTS3_MAX_PENDING_DATA ) p->nMaxPendingData = v; rc = SQLITE_OK; diff --git a/ext/fts5/fts5Int.h b/ext/fts5/fts5Int.h index a13a65d3c2..d5404535cc 100644 --- a/ext/fts5/fts5Int.h +++ b/ext/fts5/fts5Int.h @@ -81,7 +81,13 @@ typedef sqlite3_uint64 u64; # define FLEXARRAY 1 #endif -#endif +#endif /* SQLITE_AMALGAMATION */ + +/* +** Constants for the largest and smallest possible 32-bit signed integers. +*/ +# define LARGEST_INT32 ((int)(0x7fffffff)) +# define SMALLEST_INT32 ((int)((-1) - LARGEST_INT32)) /* Truncate very long tokens to this many bytes. Hard limit is ** (65536-1-1-4-9)==65521 bytes. The limiting factor is the 16-bit offset diff --git a/ext/fts5/fts5_aux.c b/ext/fts5/fts5_aux.c index 95b33ea318..ee43ca6cca 100644 --- a/ext/fts5/fts5_aux.c +++ b/ext/fts5/fts5_aux.c @@ -455,7 +455,7 @@ static void fts5SnippetFunction( iBestCol = (iCol>=0 ? 
iCol : 0); nPhrase = pApi->xPhraseCount(pFts); - aSeen = sqlite3_malloc(nPhrase); + aSeen = sqlite3_malloc64(nPhrase); if( aSeen==0 ){ rc = SQLITE_NOMEM; } diff --git a/ext/fts5/fts5_buffer.c b/ext/fts5/fts5_buffer.c index afcd83b6ba..d799e34cb4 100644 --- a/ext/fts5/fts5_buffer.c +++ b/ext/fts5/fts5_buffer.c @@ -288,7 +288,7 @@ char *sqlite3Fts5Strndup(int *pRc, const char *pIn, int nIn){ if( nIn<0 ){ nIn = (int)strlen(pIn); } - zRet = (char*)sqlite3_malloc(nIn+1); + zRet = (char*)sqlite3_malloc64((i64)nIn+1); if( zRet ){ memcpy(zRet, pIn, nIn); zRet[nIn] = '\0'; diff --git a/ext/fts5/fts5_config.c b/ext/fts5/fts5_config.c index eea82b046d..cea14b500b 100644 --- a/ext/fts5/fts5_config.c +++ b/ext/fts5/fts5_config.c @@ -576,7 +576,7 @@ int sqlite3Fts5ConfigParse( sqlite3_int64 nByte; int bUnindexed = 0; /* True if there are one or more UNINDEXED */ - *ppOut = pRet = (Fts5Config*)sqlite3_malloc(sizeof(Fts5Config)); + *ppOut = pRet = (Fts5Config*)sqlite3_malloc64(sizeof(Fts5Config)); if( pRet==0 ) return SQLITE_NOMEM; memset(pRet, 0, sizeof(Fts5Config)); pRet->pGlobal = pGlobal; @@ -1123,5 +1123,3 @@ void sqlite3Fts5ConfigErrmsg(Fts5Config *pConfig, const char *zFmt, ...){ va_end(ap); } - - diff --git a/ext/fts5/fts5_expr.c b/ext/fts5/fts5_expr.c index 352df81f4f..8ecaca34fe 100644 --- a/ext/fts5/fts5_expr.c +++ b/ext/fts5/fts5_expr.c @@ -314,7 +314,7 @@ int sqlite3Fts5ExprNew( assert( sParse.rc!=SQLITE_OK || sParse.zErr==0 ); if( sParse.rc==SQLITE_OK ){ - *ppNew = pNew = sqlite3_malloc(sizeof(Fts5Expr)); + *ppNew = pNew = sqlite3_malloc64(sizeof(Fts5Expr)); if( pNew==0 ){ sParse.rc = SQLITE_NOMEM; sqlite3Fts5ParseNodeFree(sParse.pExpr); @@ -466,7 +466,7 @@ int sqlite3Fts5ExprAnd(Fts5Expr **pp1, Fts5Expr *p2){ p2->pRoot = 0; if( sParse.rc==SQLITE_OK ){ - Fts5ExprPhrase **ap = (Fts5ExprPhrase**)sqlite3_realloc( + Fts5ExprPhrase **ap = (Fts5ExprPhrase**)sqlite3_realloc64( p1->apExprPhrase, nPhrase * sizeof(Fts5ExprPhrase*) ); if( ap==0 ){ diff --git 
a/ext/fts5/fts5_hash.c b/ext/fts5/fts5_hash.c index a33dec9a92..ba4a030b7d 100644 --- a/ext/fts5/fts5_hash.c +++ b/ext/fts5/fts5_hash.c @@ -91,7 +91,7 @@ int sqlite3Fts5HashNew(Fts5Config *pConfig, Fts5Hash **ppNew, int *pnByte){ int rc = SQLITE_OK; Fts5Hash *pNew; - *ppNew = pNew = (Fts5Hash*)sqlite3_malloc(sizeof(Fts5Hash)); + *ppNew = pNew = (Fts5Hash*)sqlite3_malloc64(sizeof(Fts5Hash)); if( pNew==0 ){ rc = SQLITE_NOMEM; }else{ diff --git a/ext/fts5/fts5_index.c b/ext/fts5/fts5_index.c index 7e25731ed5..164d613881 100644 --- a/ext/fts5/fts5_index.c +++ b/ext/fts5/fts5_index.c @@ -2093,7 +2093,7 @@ static void fts5SegIterReverseInitPage(Fts5Index *p, Fts5SegIter *pIter){ /* If necessary, grow the pIter->aRowidOffset[] array. */ if( iRowidOffset>=pIter->nRowidOffset ){ - int nNew = pIter->nRowidOffset + 8; + i64 nNew = pIter->nRowidOffset + 8; int *aNew = (int*)sqlite3_realloc64(pIter->aRowidOffset,nNew*sizeof(int)); if( aNew==0 ){ p->rc = SQLITE_NOMEM; @@ -5240,7 +5240,7 @@ static void fts5DoSecureDelete( int iSegid = pSeg->pSeg->iSegid; u8 *aPg = pSeg->pLeaf->p; int nPg = pSeg->pLeaf->nn; - int iPgIdx = pSeg->pLeaf->szLeaf; + int iPgIdx = pSeg->pLeaf->szLeaf; /* Offset of page footer */ u64 iDelta = 0; int iNextOff = 0; @@ -5319,7 +5319,7 @@ static void fts5DoSecureDelete( iSOP += fts5GetVarint32(&aPg[iSOP], nPos); } assert_nc( iSOP==pSeg->iLeafOffset ); - iNextOff = pSeg->iLeafOffset + pSeg->nPos; + iNextOff = iSOP + pSeg->nPos; } } @@ -5399,31 +5399,31 @@ static void fts5DoSecureDelete( ** is another term following it on this page. So the subsequent term ** needs to be moved to replace the term associated with the entry ** being removed. 
*/ - int nPrefix = 0; - int nSuffix = 0; - int nPrefix2 = 0; - int nSuffix2 = 0; + u64 nPrefix = 0; + u64 nSuffix = 0; + u64 nPrefix2 = 0; + u64 nSuffix2 = 0; iDelKeyOff = iNextOff; - iNextOff += fts5GetVarint32(&aPg[iNextOff], nPrefix2); - iNextOff += fts5GetVarint32(&aPg[iNextOff], nSuffix2); + iNextOff += fts5GetVarint(&aPg[iNextOff], &nPrefix2); + iNextOff += fts5GetVarint(&aPg[iNextOff], &nSuffix2); if( iKey!=1 ){ - iKeyOff += fts5GetVarint32(&aPg[iKeyOff], nPrefix); + iKeyOff += fts5GetVarint(&aPg[iKeyOff], &nPrefix); } - iKeyOff += fts5GetVarint32(&aPg[iKeyOff], nSuffix); + iKeyOff += fts5GetVarint(&aPg[iKeyOff], &nSuffix); nPrefix = MIN(nPrefix, nPrefix2); nSuffix = (nPrefix2 + nSuffix2) - nPrefix; - if( (iKeyOff+nSuffix)>iPgIdx || (iNextOff+nSuffix2)>iPgIdx ){ + if( (iKeyOff+nSuffix)>(u64)iPgIdx || (iNextOff+nSuffix2)>(u64)iPgIdx ){ FTS5_CORRUPT_IDX(p); }else{ if( iKey!=1 ){ iOff += sqlite3Fts5PutVarint(&aPg[iOff], nPrefix); } iOff += sqlite3Fts5PutVarint(&aPg[iOff], nSuffix); - if( nPrefix2>pSeg->term.n ){ + if( nPrefix2>(u64)pSeg->term.n ){ FTS5_CORRUPT_IDX(p); }else if( nPrefix2>nPrefix ){ memcpy(&aPg[iOff], &pSeg->term.p[nPrefix], nPrefix2-nPrefix); @@ -5454,7 +5454,7 @@ static void fts5DoSecureDelete( u8 *aTermIdx = &pTerm->p[pTerm->szLeaf]; int nTermIdx = pTerm->nn - pTerm->szLeaf; int iTermIdx = 0; - int iTermOff = 0; + i64 iTermOff = 0; while( 1 ){ u32 iVal = 0; @@ -5465,12 +5465,15 @@ static void fts5DoSecureDelete( } nTermIdx = iTermIdx; - memmove(&pTerm->p[iTermOff], &pTerm->p[pTerm->szLeaf], nTermIdx); - fts5PutU16(&pTerm->p[2], iTermOff); - - fts5DataWrite(p, iId, pTerm->p, iTermOff+nTermIdx); - if( nTermIdx==0 ){ - fts5SecureDeleteIdxEntry(p, iSegid, pSeg->iTermLeafPgno); + if( iTermOff>pTerm->szLeaf ){ + FTS5_CORRUPT_IDX(p); + }else{ + memmove(&pTerm->p[iTermOff], &pTerm->p[pTerm->szLeaf], nTermIdx); + fts5PutU16(&pTerm->p[2], iTermOff); + fts5DataWrite(p, iId, pTerm->p, iTermOff+nTermIdx); + if( nTermIdx==0 ){ + fts5SecureDeleteIdxEntry(p, 
iSegid, pSeg->iTermLeafPgno); + } } } fts5DataRelease(pTerm); @@ -5493,7 +5496,9 @@ static void fts5DoSecureDelete( int iPrevKeyOut = 0; int iKeyIn = 0; - memmove(&aPg[iOff], &aPg[iNextOff], nMove); + if( nMove>0 ){ + memmove(&aPg[iOff], &aPg[iNextOff], nMove); + } iPgIdx -= nShift; nPg = iPgIdx; fts5PutU16(&aPg[2], iPgIdx); @@ -5931,7 +5936,7 @@ int sqlite3Fts5IndexMerge(Fts5Index *p, int nMerge){ fts5StructureRelease(pStruct); pStruct = pNew; nMin = 1; - nMerge = nMerge*-1; + nMerge = (nMerge==SMALLEST_INT32 ? LARGEST_INT32 : (nMerge*-1)); } if( pStruct && pStruct->nLevel ){ if( fts5IndexMerge(p, &pStruct, nMerge, nMin) ){ @@ -6413,16 +6418,16 @@ struct Fts5TokenDataMap { ** aMap[] variables. */ struct Fts5TokenDataIter { - int nMapAlloc; /* Allocated size of aMap[] in entries */ - int nMap; /* Number of valid entries in aMap[] */ + i64 nMapAlloc; /* Allocated size of aMap[] in entries */ + i64 nMap; /* Number of valid entries in aMap[] */ Fts5TokenDataMap *aMap; /* Array of (rowid+pos -> token) mappings */ /* The following are used for prefix-queries only. */ Fts5Buffer terms; /* The following are used for other full-token tokendata queries only. */ - int nIter; - int nIterAlloc; + i64 nIter; + i64 nIterAlloc; Fts5PoslistReader *aPoslistReader; int *aPoslistToIter; Fts5Iter *apIter[FLEXARRAY]; @@ -6478,11 +6483,11 @@ static void fts5TokendataIterAppendMap( ){ if( p->rc==SQLITE_OK ){ if( pT->nMap==pT->nMapAlloc ){ - int nNew = pT->nMapAlloc ? pT->nMapAlloc*2 : 64; - int nAlloc = nNew * sizeof(Fts5TokenDataMap); + i64 nNew = pT->nMapAlloc ? 
pT->nMapAlloc*2 : 64; + i64 nAlloc = nNew * sizeof(Fts5TokenDataMap); Fts5TokenDataMap *aNew; - aNew = (Fts5TokenDataMap*)sqlite3_realloc(pT->aMap, nAlloc); + aNew = (Fts5TokenDataMap*)sqlite3_realloc64(pT->aMap, nAlloc); if( aNew==0 ){ p->rc = SQLITE_NOMEM; return; @@ -6508,7 +6513,7 @@ static void fts5TokendataIterAppendMap( */ static void fts5TokendataIterSortMap(Fts5Index *p, Fts5TokenDataIter *pT){ Fts5TokenDataMap *aTmp = 0; - int nByte = pT->nMap * sizeof(Fts5TokenDataMap); + i64 nByte = pT->nMap * sizeof(Fts5TokenDataMap); aTmp = (Fts5TokenDataMap*)sqlite3Fts5MallocZero(&p->rc, nByte); if( aTmp ){ @@ -7042,9 +7047,10 @@ static Fts5TokenDataIter *fts5AppendTokendataIter( if( p->rc==SQLITE_OK ){ if( pIn==0 || pIn->nIter==pIn->nIterAlloc ){ - int nAlloc = pIn ? pIn->nIterAlloc*2 : 16; - int nByte = SZ_FTS5TOKENDATAITER(nAlloc+1); - Fts5TokenDataIter *pNew = (Fts5TokenDataIter*)sqlite3_realloc(pIn, nByte); + i64 nAlloc = pIn ? pIn->nIterAlloc*2 : 16; + i64 nByte = SZ_FTS5TOKENDATAITER(nAlloc+1); + Fts5TokenDataIter *pNew; + pNew = (Fts5TokenDataIter*)sqlite3_realloc64(pIn, nByte); if( pNew==0 ){ p->rc = SQLITE_NOMEM; @@ -7141,8 +7147,8 @@ static void fts5IterSetOutputsTokendata(Fts5Iter *pIter){ /* Ensure the token-mapping is large enough */ if( eDetail==FTS5_DETAIL_FULL && pT->nMapAlloc<(pT->nMap + nByte) ){ - int nNew = (pT->nMapAlloc + nByte) * 2; - Fts5TokenDataMap *aNew = (Fts5TokenDataMap*)sqlite3_realloc( + i64 nNew = (pT->nMapAlloc + nByte) * 2; + Fts5TokenDataMap *aNew = (Fts5TokenDataMap*)sqlite3_realloc64( pT->aMap, nNew*sizeof(Fts5TokenDataMap) ); if( aNew==0 ){ diff --git a/ext/fts5/fts5_main.c b/ext/fts5/fts5_main.c index f45b9ef906..2e3b5b3af5 100644 --- a/ext/fts5/fts5_main.c +++ b/ext/fts5/fts5_main.c @@ -517,7 +517,7 @@ static void fts5SetEstimatedRows(sqlite3_index_info *pIdxInfo, i64 nRow){ if( sqlite3_libversion_number()>=3008002 ) #endif { - pIdxInfo->estimatedRows = nRow; + pIdxInfo->estimatedRows = MAX(1, nRow); } #endif } @@ -586,19 
+586,30 @@ static int fts5UsePatternMatch( ** a) If a MATCH operator is present, the cost depends on the other ** constraints also present. As follows: ** -** * No other constraints: cost=1000.0 -** * One rowid range constraint: cost=750.0 -** * Both rowid range constraints: cost=500.0 -** * An == rowid constraint: cost=100.0 +** * No other constraints: cost=50000.0 +** * One rowid range constraint: cost=37500.0 +** * Both rowid range constraints: cost=30000.0 +** * An == rowid constraint: cost=25000.0 ** ** b) Otherwise, if there is no MATCH: ** -** * No other constraints: cost=1000000.0 -** * One rowid range constraint: cost=750000.0 -** * Both rowid range constraints: cost=250000.0 -** * An == rowid constraint: cost=10.0 +** * No other constraints: cost=3000000.0 +** * One rowid range constraint: cost=2250000.0 +** * Both rowid range constraints: cost=750000.0 +** * An == rowid constraint: cost=25.0 ** ** Costs are not modified by the ORDER BY clause. +** +** The ratios used in case (a) are based on informal results obtained from +** the tool/fts5cost.tcl script. The "MATCH and ==" combination has the +** cost set quite high because the query may be a prefix query. Unless +** there is a prefix index, prefix queries with rowid constraints are much +** more expensive than non-prefix queries with rowid constraints. +** +** The estimated rows returned is set to the cost/40. For simple queries, +** experimental results show that cost/4 might be about right. But for +** more complex queries that use multiple terms the number of rows might +** be far fewer than this. So we compromise and use cost/40. 
*/ static int fts5BestIndexMethod(sqlite3_vtab *pVTab, sqlite3_index_info *pInfo){ Fts5Table *pTab = (Fts5Table*)pVTab; @@ -631,7 +642,7 @@ static int fts5BestIndexMethod(sqlite3_vtab *pVTab, sqlite3_index_info *pInfo){ return SQLITE_ERROR; } - idxStr = (char*)sqlite3_malloc(pInfo->nConstraint * 8 + 1); + idxStr = (char*)sqlite3_malloc64((i64)pInfo->nConstraint * 8 + 1); if( idxStr==0 ) return SQLITE_NOMEM; pInfo->idxStr = idxStr; pInfo->needToFreeIdxStr = 1; @@ -724,21 +735,35 @@ static int fts5BestIndexMethod(sqlite3_vtab *pVTab, sqlite3_index_info *pInfo){ /* Calculate the estimated cost based on the flags set in idxFlags. */ if( bSeenEq ){ - pInfo->estimatedCost = nSeenMatch ? 1000.0 : 25.0; - fts5SetUniqueFlag(pInfo); + pInfo->estimatedCost = nSeenMatch ? 25000.0 : 25.0; fts5SetEstimatedRows(pInfo, 1); + fts5SetUniqueFlag(pInfo); }else{ - if( bSeenLt && bSeenGt ){ - pInfo->estimatedCost = nSeenMatch ? 5000.0 : 750000.0; - }else if( bSeenLt || bSeenGt ){ - pInfo->estimatedCost = nSeenMatch ? 7500.0 : 2250000.0; + i64 nEstRows; + if( nSeenMatch ){ + if( bSeenLt && bSeenGt ){ + pInfo->estimatedCost = 30000.0; + }else if( bSeenLt || bSeenGt ){ + pInfo->estimatedCost = 37500.0; + }else{ + pInfo->estimatedCost = 50000.0; + } + nEstRows = (i64)(pInfo->estimatedCost / 40.0); + for(i=1; i
<nSeenMatch; i++){ + pInfo->estimatedCost *= 2.5; + nEstRows = nEstRows / 2; + } }else{ - pInfo->estimatedCost = nSeenMatch ? 10000.0 : 3000000.0; - } - for(i=1; i<nSeenMatch; i++){ - pInfo->
    estimatedCost *= 0.4; + if( bSeenLt && bSeenGt ){ + pInfo->estimatedCost = 750000.0; + }else if( bSeenLt || bSeenGt ){ + pInfo->estimatedCost = 2250000.0; + }else{ + pInfo->estimatedCost = 3000000.0; + } + nEstRows = (i64)(pInfo->estimatedCost / 4.0); } - fts5SetEstimatedRows(pInfo, (i64)(pInfo->estimatedCost / 4.0)); + fts5SetEstimatedRows(pInfo, nEstRows); } pInfo->idxNum = idxFlags; @@ -2081,6 +2106,7 @@ static int fts5UpdateMethod( } update_out: + sqlite3Fts5IndexCloseReader(pTab->p.pIndex); pTab->p.pConfig->pzErrmsg = 0; return rc; } @@ -3762,7 +3788,7 @@ static int fts5Init(sqlite3 *db){ int rc; Fts5Global *pGlobal = 0; - pGlobal = (Fts5Global*)sqlite3_malloc(sizeof(Fts5Global)); + pGlobal = (Fts5Global*)sqlite3_malloc64(sizeof(Fts5Global)); if( pGlobal==0 ){ rc = SQLITE_NOMEM; }else{ diff --git a/ext/fts5/fts5_tcl.c b/ext/fts5/fts5_tcl.c index 25cd5c0633..f5d8705ffe 100644 --- a/ext/fts5/fts5_tcl.c +++ b/ext/fts5/fts5_tcl.c @@ -391,7 +391,7 @@ static int SQLITE_TCLAPI xF5tApi( break; } CASE(12, "xSetAuxdata") { - F5tAuxData *pData = (F5tAuxData*)sqlite3_malloc(sizeof(F5tAuxData)); + F5tAuxData *pData = (F5tAuxData*)sqlite3_malloc64(sizeof(F5tAuxData)); if( pData==0 ){ Tcl_AppendResult(interp, "out of memory", (char*)0); return TCL_ERROR; @@ -780,7 +780,7 @@ static int SQLITE_TCLAPI f5tTokenize( } if( nText>0 ){ - pCopy = sqlite3_malloc(nText); + pCopy = sqlite3_malloc64(nText); if( pCopy==0 ){ tokenizer.xDelete(pTok); Tcl_AppendResult(interp, "error in sqlite3_malloc()", (char*)0); @@ -1420,7 +1420,7 @@ static int f5tOrigintextCreate( void *pTokCtx = 0; int rc = SQLITE_OK; - pTok = (OriginTextTokenizer*)sqlite3_malloc(sizeof(OriginTextTokenizer)); + pTok = (OriginTextTokenizer*)sqlite3_malloc64(sizeof(OriginTextTokenizer)); if( pTok==0 ){ rc = SQLITE_NOMEM; }else if( nArg<1 ){ @@ -1480,7 +1480,7 @@ static int xOriginToken( int nReq = nToken + 1 + (iEnd-iStart); if( nReq>p->nBuf ){ sqlite3_free(p->aBuf); - p->aBuf = sqlite3_malloc(nReq*2); + p->aBuf = 
sqlite3_malloc64(nReq*2); if( p->aBuf==0 ) return SQLITE_NOMEM; p->nBuf = nReq*2; } diff --git a/ext/fts5/fts5_test_tok.c b/ext/fts5/fts5_test_tok.c index 994d304dc6..c77c49de74 100644 --- a/ext/fts5/fts5_test_tok.c +++ b/ext/fts5/fts5_test_tok.c @@ -194,7 +194,7 @@ static int fts5tokConnectMethod( } if( rc==SQLITE_OK ){ - pTab = (Fts5tokTable*)sqlite3_malloc(sizeof(Fts5tokTable)); + pTab = (Fts5tokTable*)sqlite3_malloc64(sizeof(Fts5tokTable)); if( pTab==0 ){ rc = SQLITE_NOMEM; }else{ @@ -275,7 +275,7 @@ static int fts5tokBestIndexMethod( static int fts5tokOpenMethod(sqlite3_vtab *pVTab, sqlite3_vtab_cursor **ppCsr){ Fts5tokCursor *pCsr; - pCsr = (Fts5tokCursor *)sqlite3_malloc(sizeof(Fts5tokCursor)); + pCsr = (Fts5tokCursor *)sqlite3_malloc64(sizeof(Fts5tokCursor)); if( pCsr==0 ){ return SQLITE_NOMEM; } @@ -347,7 +347,7 @@ static int fts5tokCb( if( pCsr->nRow ){ pRow->iPos = pRow[-1].iPos + ((tflags & FTS5_TOKEN_COLOCATED) ? 0 : 1); } - pRow->zToken = sqlite3_malloc(nToken+1); + pRow->zToken = sqlite3_malloc64((sqlite3_int64)nToken+1); if( pRow->zToken==0 ) return SQLITE_NOMEM; memcpy(pRow->zToken, pToken, nToken); pRow->zToken[nToken] = 0; @@ -373,8 +373,8 @@ static int fts5tokFilterMethod( fts5tokResetCursor(pCsr); if( idxNum==1 ){ const char *zByte = (const char *)sqlite3_value_text(apVal[0]); - int nByte = sqlite3_value_bytes(apVal[0]); - pCsr->zInput = sqlite3_malloc(nByte+1); + sqlite3_int64 nByte = sqlite3_value_bytes(apVal[0]); + pCsr->zInput = sqlite3_malloc64(nByte+1); if( pCsr->zInput==0 ){ rc = SQLITE_NOMEM; }else{ diff --git a/ext/fts5/fts5_tokenize.c b/ext/fts5/fts5_tokenize.c index b8a1136465..9908102392 100644 --- a/ext/fts5/fts5_tokenize.c +++ b/ext/fts5/fts5_tokenize.c @@ -72,7 +72,7 @@ static int fts5AsciiCreate( if( nArg%2 ){ rc = SQLITE_ERROR; }else{ - p = sqlite3_malloc(sizeof(AsciiTokenizer)); + p = sqlite3_malloc64(sizeof(AsciiTokenizer)); if( p==0 ){ rc = SQLITE_NOMEM; }else{ @@ -367,7 +367,7 @@ static int fts5UnicodeCreate( if( nArg%2 ){ 
rc = SQLITE_ERROR; }else{ - p = (Unicode61Tokenizer*)sqlite3_malloc(sizeof(Unicode61Tokenizer)); + p = (Unicode61Tokenizer*)sqlite3_malloc64(sizeof(Unicode61Tokenizer)); if( p ){ const char *zCat = "L* N* Co"; int i; @@ -590,7 +590,7 @@ static int fts5PorterCreate( zBase = azArg[0]; } - pRet = (PorterTokenizer*)sqlite3_malloc(sizeof(PorterTokenizer)); + pRet = (PorterTokenizer*)sqlite3_malloc64(sizeof(PorterTokenizer)); if( pRet ){ memset(pRet, 0, sizeof(PorterTokenizer)); rc = pApi->xFindTokenizer_v2(pApi, zBase, &pUserdata, &pV2); @@ -1297,7 +1297,7 @@ static int fts5TriCreate( rc = SQLITE_ERROR; }else{ int i; - pNew = (TrigramTokenizer*)sqlite3_malloc(sizeof(*pNew)); + pNew = (TrigramTokenizer*)sqlite3_malloc64(sizeof(*pNew)); if( pNew==0 ){ rc = SQLITE_NOMEM; }else{ diff --git a/ext/fts5/fts5_vocab.c b/ext/fts5/fts5_vocab.c index 3a6a968f7c..295ace6ba9 100644 --- a/ext/fts5/fts5_vocab.c +++ b/ext/fts5/fts5_vocab.c @@ -666,7 +666,7 @@ static int fts5VocabFilterMethod( const char *zCopy = (const char *)sqlite3_value_text(pLe); if( zCopy==0 ) zCopy = ""; pCsr->nLeTerm = sqlite3_value_bytes(pLe); - pCsr->zLeTerm = sqlite3_malloc(pCsr->nLeTerm+1); + pCsr->zLeTerm = sqlite3_malloc64((i64)pCsr->nLeTerm+1); if( pCsr->zLeTerm==0 ){ rc = SQLITE_NOMEM; }else{ diff --git a/ext/fts5/test/fts5corrupt9.test b/ext/fts5/test/fts5corrupt9.test new file mode 100644 index 0000000000..6cf06f8360 --- /dev/null +++ b/ext/fts5/test/fts5corrupt9.test @@ -0,0 +1,129 @@ +# 2026 Jan 15 +# +# The author disclaims copyright to this source code. In place of +# a legal notice, here is a blessing: +# +# May you do good and not evil. +# May you find forgiveness for yourself and forgive others. +# May you share freely, never taking more than you give. +# +#*********************************************************************** +# + +source [file join [file dirname [info script]] fts5_common.tcl] +set testprefix fts5corrupt9 + +# If SQLITE_ENABLE_FTS5 is not defined, omit this file. 
+ifcapable !fts5 { + finish_test + return +} + +sqlite3_fts5_may_be_corrupt 1 + +sqlite3 db test.db + +set nrows 50 +set repeat 500 +set text [string trim [string repeat "aaa " $repeat]] + +do_execsql_test 1.0 { + CREATE VIRTUAL TABLE t USING fts5(content); + INSERT INTO t(t, rank) VALUES('secure-delete', 1); +} +do_test 1.1 { + for {set i 0} {$i < $nrows} {incr i} { + db eval "INSERT INTO t(content) VALUES('$text')" + } + db eval "INSERT INTO t(t) VALUES('optimize')" +} {} + +do_test 1.2 { + db eval { SELECT segid, pgno FROM t_idx } {} + set rowid [expr {($segid << 37) + ($pgno >> 1)}] + db eval { + UPDATE t_data + SET block = X'00000009043061616104ffffffff07' + WHERE rowid=$rowid + } +} {} + +# At one point this would segfault due to OOB write. +# +do_catchsql_test 1.3 { + DELETE FROM t WHERE rowid=3 +} {0 {}} + +#------------------------------------------------------------------------- +reset_db + +set nRow 8000 +set zText aaa + +do_execsql_test 2.0 { + CREATE VIRTUAL TABLE t USING fts5(content, detail=none); + INSERT INTO t(t, rank) VALUES('secure-delete', 1); + BEGIN; +} +do_test 2.1 { + for {set ii 0} {$ii<$nRow} {incr ii} { + execsql { INSERT INTO t(content) VALUES($zText); } + } + execsql { + COMMIT; + INSERT INTO t(t) VALUES('optimize'); + } +} {} + +set hex "00040f7d9f49[string repeat {01} 3958]80" + +do_execsql_test 2.1 " + UPDATE t_data SET block = X'$hex' WHERE rowid=137438953474; +" + +do_execsql_test 2.3 { + DELETE FROM t WHERE rowid=7999 +} + +#------------------------------------------------------------------------- +reset_db + +do_execsql_test 3.0 { + CREATE VIRTUAL TABLE t USING fts5(content, detail=none); + INSERT INTO t(t, rank) VALUES('secure-delete', 1); + INSERT INTO t(content) VALUES('aaa'); + INSERT INTO t(content) VALUES('bbb'); + INSERT INTO t(t) VALUES('optimize'); +} + +do_execsql_test 3.1 { + UPDATE t_data SET block = X'000000100430616161010187ffffff7f0406' WHERE rowid=412316860417; +} + +do_catchsql_test 3.2 { + DELETE FROM t WHERE 
rowid=1; +} {1 {fts5: corruption in table "t"}} + +#------------------------------------------------------------------------- +reset_db + +do_execsql_test 4.0 { + CREATE VIRTUAL TABLE t USING fts5(content); + INSERT INTO t(t, rank) VALUES('secure-delete', 1); + INSERT INTO t(content) VALUES('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); + INSERT INTO t(t) VALUES('optimize'); +} + +do_execsql_test 4.1 { + UPDATE t_data SET block = X'00000fce9f48306161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616
16161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616
16161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616
16161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616
16161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161616161610487ffffff7f' WHERE rowid=137438953473; + UPDATE t_data SET block = X'0004000801040203' WHERE rowid=137438953474; +} + +do_catchsql_test 4.2 { + DELETE FROM t WHERE rowid = 1; +} {1 {fts5: corruption in table "t"}} + +sqlite3_fts5_may_be_corrupt 0 + +finish_test + diff --git a/ext/fts5/test/fts5integrity.test b/ext/fts5/test/fts5integrity.test index 4bf120c446..9b2720faf0 100644 --- a/ext/fts5/test/fts5integrity.test +++ b/ext/fts5/test/fts5integrity.test @@ -379,9 +379,6 @@ do_execsql_test 12.2 { db close sqlite3 db test.db -readonly 1 -explain_i { - PRAGMA integrity_check - } do_execsql_test 12.3 { PRAGMA integrity_check } {ok} diff --git a/ext/fts5/test/fts5interrupt.test b/ext/fts5/test/fts5interrupt.test index 67ef5f7e97..87b232b05a 100644 --- a/ext/fts5/test/fts5interrupt.test +++ b/ext/fts5/test/fts5interrupt.test @@ -64,4 +64,21 @@ foreach {tn sql} { } } +#------------------------------------------------------------------------- +# Verify that https://sqlite.org/forum/forumpost/95413eb410 has been +# fixed. 
+# +reset_db +do_execsql_test 2.0 { + CREATE VIRTUAL TABLE f1 USING fts5(x); + BEGIN TRANSACTION; + INSERT INTO f1(x) VALUES('abc def ghi'); +} +do_test 2.1 { + sqlite3_interrupt db +} {} +do_execsql_test 2.2 { + ROLLBACK +} + finish_test diff --git a/ext/fts5/test/fts5join.test b/ext/fts5/test/fts5join.test index e4d3b69b79..2b9945a6f1 100644 --- a/ext/fts5/test/fts5join.test +++ b/ext/fts5/test/fts5join.test @@ -65,5 +65,14 @@ do_eqp_test 1.4 { `--SCAN vt VIRTUAL TABLE INDEX 0:= } +do_eqp_test 1.5 { + SELECT * FROM vt, t1 + WHERE vt.rowid = t1.rowid AND vt MATCH ? AND b = ? +} { + QUERY PLAN + |--SCAN vt VIRTUAL TABLE INDEX 0:M1 + `--SEARCH t1 USING INTEGER PRIMARY KEY (rowid=?) +} + finish_test diff --git a/ext/fts5/test/fts5merge.test b/ext/fts5/test/fts5merge.test index c57c21ded3..09c18245f3 100644 --- a/ext/fts5/test/fts5merge.test +++ b/ext/fts5/test/fts5merge.test @@ -238,6 +238,22 @@ do_execsql_test 6.3 { INSERT INTO g1(g1) VALUES('integrity-check'); } +#-------------------------------------------------------------------------- +# Check that passing -2147483648 as the parameter to a merge command +# does not cause a signed integer overflow error. +# +reset_db +do_execsql_test 7.0 { + CREATE VIRTUAL TABLE f1 USING fts5(a); +} +do_execsql_test 7.1 { + INSERT INTO f1 VALUES('one two three'); + INSERT INTO f1 VALUES('four five six'); + INSERT INTO f1 VALUES('seven eight nine'); +} +do_execsql_test 7.2 { + INSERT INTO f1(f1, rank) VALUES('merge', -2147483648); +} finish_test diff --git a/ext/fts5/tool/fts5cost.tcl b/ext/fts5/tool/fts5cost.tcl new file mode 100644 index 0000000000..4f53d29eb6 --- /dev/null +++ b/ext/fts5/tool/fts5cost.tcl @@ -0,0 +1,153 @@ +# +# 2026 March 20 +# +# The author disclaims copyright to this source code. In place of +# a legal notice, here is a blessing: +# +# May you do good and not evil. +# May you find forgiveness for yourself and forgive others. +# May you share freely, never taking more than you give. 
+# +#-------------------------------------------------------------------------- +# +# This script extracts the documentation for the API used by fts5 auxiliary +# functions from header file fts5.h. It outputs html text on stdout that +# is included in the documentation on the web. +# + + +sqlite3 db fts5cost.db + +# Create an IPK table with 1,000,000 entries. Short records. +# +set res [list [catch { db eval {SELECT count(*) FROM t1} } msg] $msg] +if {$res!="0 1000000"} { + db eval { + PRAGMA mmap_size = 1000000000; -- 1GB + DROP TABLE IF EXISTS t1; + CREATE TABLE t1(a INTEGER PRIMARY KEY, b TEXT); + WITH s(i) AS ( + SELECT 1 UNION ALL SELECT i+1 FROM s WHERE i<1_000_000 + ) + INSERT INTO t1 SELECT i, hex(randomblob(8)) FROM s; + } +} + +# Create an FTS5 table with 1,000,000 entries. Each row contains a single +# column containing a document of 100 terms chosen pseudo-randomly from +# a vocabulary of 2000. +set res [list [catch { db eval {SELECT count(*) FROM f1} } msg] $msg] +if {$res!="0 1000000"} { + set nVocab 2000 + set nTerm 100 + db eval { + BEGIN; + DROP TABLE IF EXISTS vocab1; + CREATE TABLE vocab1(w); + } + for {set ii 0} {$ii<$nVocab} {incr ii} { + set word [format %06x [expr {int(abs(rand()) * 0xFFFFFF)}]] + db eval { INSERT INTO vocab1 VALUES($word) } + lappend lVocab $word + } + db func doc doc + proc doc {} { + for {set ii 0} {$ii<$::nTerm} {incr ii} { + lappend ret [lindex $::lVocab [expr int(abs(rand())*$::nVocab)]] + } + set ret + } + db eval { + DROP TABLE IF EXISTS f1; + CREATE VIRTUAL TABLE f1 USING fts5(x); + WITH s(i) AS ( + SELECT 1 UNION ALL SELECT i+1 FROM s WHERE i<1_000_000 + ) + INSERT INTO f1(rowid, x) SELECT i, doc() FROM s; + COMMIT; + } +} else { + set lVocab [db eval { SELECT * FROM vocab1 }] + set nVocab [llength $lVocab] +} + +proc rowid_query {n} { + set rowid 654 + for {set ii 0} {$ii<$n} {incr ii} { + db eval { SELECT b FROM t1 WHERE a = $rowid } + set rowid [expr {($rowid + 7717) % 1000000}] + } +} + +proc rowid_query_fts 
{n} { + set rowid 654 + for {set ii 0} {$ii<$n} {incr ii} { + db eval { SELECT * FROM f1 WHERE rowid = $rowid } + set rowid [expr {($rowid + 7717) % 1000000}] + } +} + +proc match_query_fts {n} { + set idx 654 + for {set ii 0} {$ii<$n} {incr ii} { + set match [lrange $::lVocab $idx $idx+1] + db eval { SELECT * FROM f1($match) } + set idx [expr {($idx + 7717) % $::nVocab}] + } +} + +proc prefix_query_fts {n} { + set idx 654 + for {set ii 0} {$ii<$n} {incr ii} { + set match "[lindex $::lVocab $idx]*" + db eval { SELECT * FROM f1($match) } + set idx [expr {($idx + 7717) % $::nVocab}] + } +} + +proc match_rowid_query_fts {n} { + set idx 654 + for {set ii 0} {$ii<$n} {incr ii} { + set match "[lindex $::lVocab $idx]" + db eval { SELECT * FROM f1($match) WHERE rowid=500000 } + set idx [expr {($idx + 7717) % $::nVocab}] + } +} + +proc prefix_rowid_query_fts {n} { + set idx 654 + for {set ii 0} {$ii<$n} {incr ii} { + set match "[lindex $::lVocab $idx]*" + db eval { SELECT * FROM f1($match) WHERE rowid=500000 } + set idx [expr {($idx + 7717) % $::nVocab}] + } +} + + +proc mytime {cmd div} { + set tm [time $cmd] + expr {[lindex $tm 0] / $div} +} + +#set us [mytime { match_rowid_query_fts 1000 } 1000] +#puts "1000 match/rowid queries on fts5 table: ${us} per query" + +set us [mytime { prefix_rowid_query_fts 1000 } 1000] +puts "1000 prefix/rowid queries on fts5 table: ${us} per query" + +set us [mytime { match_query_fts 10 } 10] +puts "10 match queries on fts5 table: ${us} per query" + +set us [mytime { prefix_query_fts 10 } 10] +puts "10 prefix queries on fts5 table: ${us} per query" + +set us [mytime { prefix_rowid_query_fts 1000 } 1000] +puts "1000 prefix/rowid queries on fts5 table: ${us} per query" + +set us [mytime { rowid_query 10000 } 10000] +puts "10000 by-rowid queries on normal table: ${us} per query" + +set us [mytime { rowid_query_fts 10000 } 10000] +puts "10000 by-rowid queries on fts5 table: ${us} per query" + + diff --git a/ext/intck/sqlite3intck.c 
b/ext/intck/sqlite3intck.c index 5f645fae6e..e3fef77637 100644 --- a/ext/intck/sqlite3intck.c +++ b/ext/intck/sqlite3intck.c @@ -319,7 +319,7 @@ static int intckGetToken(const char *z){ char c = z[0]; int iRet = 1; if( c=='\'' || c=='"' || c=='`' ){ - while( 1 ){ + while( z[iRet] ){ if( z[iRet]==c ){ iRet++; if( z[iRet]!=c ) break; diff --git a/ext/jni/README.md b/ext/jni/README.md index 5ad79fce9e..0bdbde91eb 100644 --- a/ext/jni/README.md +++ b/ext/jni/README.md @@ -13,11 +13,14 @@ Technical support is available in the forum: -> **FOREWARNING:** this subproject is very much in development and - subject to any number of changes. Please do not rely on any - information about its API until this disclaimer is removed. The JNI - bindings released with version 3.43 are a "tech preview." Once - finalized, strong backward compatibility guarantees will apply. +> **FOREWARNING:** the JNI subproject is experimental and subject to + any number of changes. This API is "feature-complete", with only a + few difficult-to-reach corners of the C API not represented here, + but it is not a supported deliverable of the project so does not + have the same backward compatibility guarantees which the C APIs + do. That said: the [C-style API](#1to1ish) is especially resistant + to compatibility breakage because it's designed to be as close to + the C API as feasible. Project goals/requirements: @@ -162,11 +165,13 @@ or propagate exceptions and must return error information (if any) via result codes or `null`. The only cases where the C-style APIs may throw is through client-side misuse, e.g. passing in a null where it may cause a `NullPointerException`. The APIs clearly mark function -parameters which should not be null, but does not generally actively -defend itself against such misuse. Some C-style APIs explicitly accept -`null` as a no-op for usability's sake, and some of the JNI APIs -deliberately return an error code, instead of segfaulting, when passed -a `null`. 
+parameters which should not be null, and it internally uses the +`SQLITE_API_ARMOR` mechanism to help protect against such misuse. Some +C-style APIs explicitly accept `null` as a no-op for usability's sake, +and some of the JNI APIs deliberately return an error code, instead of +segfaulting, when passed a `null`. There are no known cases where it +will misuse memory if passed a `null` or out-of-range value from +client code. Client-defined callbacks _must never throw exceptions_ unless _very explicitly documented_ as being throw-safe. Exceptions are generally @@ -194,7 +199,8 @@ Some constructs, when modelled 1-to-1 from C to Java, are unduly clumsy to work with in Java because they try to shoehorn C's way of doing certain things into Java's wildly different ways. The following subsections cover those, starting with a verbose explanation and -demonstration of where such changes are "really necessary"... +demonstration of where such changes are "really necessary" for +usability's sake... ### Custom Collations @@ -286,12 +292,9 @@ binding. The Java API has only one core function-registration function: ```java int sqlite3_create_function(sqlite3 db, String funcName, int nArgs, - int encoding, SQLFunction func); + int flags, SQLFunction func); ``` -> Design question: does the encoding argument serve any purpose in - Java? That's as-yet undetermined. If not, it will be removed. - `SQLFunction` is not used directly, but is instead instantiated via one of its three subclasses: @@ -313,4 +316,4 @@ in-flux nature of this API. Various APIs which accept callbacks, e.g. `sqlite3_trace_v2()` and `sqlite3_update_hook()`, use interfaces similar to those shown above. Despite the changes in signature, the JNI layer makes every effort to -provide the same semantics as the C API documentation suggests. +provide the same semantics as the C API documentation describes. 
diff --git a/ext/jni/src/c/sqlite3-jni.c b/ext/jni/src/c/sqlite3-jni.c index f130eff042..8cdba9bcfd 100644 --- a/ext/jni/src/c/sqlite3-jni.c +++ b/ext/jni/src/c/sqlite3-jni.c @@ -133,11 +133,9 @@ ** Which sqlite3.c we're using needs to be configurable to enable ** building against a custom copy, e.g. the SEE variant. We have to ** include sqlite3.c, as opposed to sqlite3.h, in order to get access -** to some internal details like SQLITE_MAX_... and friends. This -** increases the rebuild time considerably but we need this in order -** to access some internal functionality and keep the to-Java-exported -** values of SQLITE_MAX_... and SQLITE_LIMIT_... in sync with the C -** build. +** to some internal details like SQLITE_MAX_... and friends, and keep +** those consistent with this build. This increases the rebuild time +** considerably, however. */ #ifndef SQLITE_C # define SQLITE_C sqlite3.c @@ -335,7 +333,7 @@ struct S3JniNphOp { const char * const zMember /* Name of member property */; const char * const zTypeSig /* JNI type signature of zMember */; /* - ** klazz is a global ref to the class represented by pRef. + ** klazz is a global ref to the class represented by zName. ** ** According to: ** @@ -999,19 +997,20 @@ static S3JniEnv * S3JniEnv__get(JNIEnv * const env){ ** JNI bindings such as sqlite3_prepare_v2/v3(), and definitely not ** from client code. ** -** Returns err_code. +** Returns err_code _unless_ err_code is 0 and sqlite3_set_errmsg() +** fails with OOM, in which case it may return SQLITE_OOM or fail +** fatally. +** +** This function predates sqlite3_set_errmsg(), which is why it has a +** slightly different interface. Before that function was introduced, +** this code used the SQLite-internal APIs to do this. 
*/ -static int s3jni_db_error(sqlite3* const db, int err_code, - const char * const zMsg){ +static int s3jni_db_error(JNIEnv * env, sqlite3* const db, + int err_code, const char * const zMsg){ if( db!=0 ){ - if( 0==zMsg ){ - sqlite3Error(db, err_code); - }else{ - const int nMsg = sqlite3Strlen30(zMsg); - sqlite3_mutex_enter(sqlite3_db_mutex(db)); - sqlite3ErrorWithMsg(db, err_code, "%.*s", nMsg, zMsg); - sqlite3_mutex_leave(sqlite3_db_mutex(db)); - } + int const rc = sqlite3_set_errmsg(db, err_code, zMsg); + s3jni_oom_fatal(0==rc); + if( rc && !err_code ) err_code=rc; } return err_code; } @@ -1235,11 +1234,11 @@ static int s3jni__db_exception(JNIEnv * const env, sqlite3 * const pDb, char * zMsg; S3JniExceptionClear; zMsg = s3jni_exception_error_msg(env, ex); - s3jni_db_error(pDb, errCode, zMsg ? zMsg : zDfltMsg); + s3jni_db_error(env, pDb, errCode, zMsg ? zMsg : zDfltMsg); sqlite3_free(zMsg); S3JniUnrefLocal(ex); }else if( zDfltMsg ){ - s3jni_db_error(pDb, errCode, zDfltMsg); + s3jni_db_error(env, pDb, errCode, zDfltMsg); } return errCode; } @@ -1956,15 +1955,6 @@ static void S3JniUdf_finalizer(void * s){ S3JniUdf_free(s3jni_env(), (S3JniUdf*)s, 1); } -/* -** Helper for processing args to UDF handlers with signature -** (sqlite3_context*,int,sqlite3_value**). -*/ -typedef struct { - jobject jcx /* sqlite3_context */; - jobjectArray jargv /* sqlite3_value[] */; -} udf_jargs; - /* ** Converts the given (cx, argc, argv) into arguments for the given ** UDF, writing the result (Java wrappers for cx and argv) in the @@ -2006,7 +1996,7 @@ static int udf_args(JNIEnv *env, /* ** Requires that jCx and jArgv are sqlite3_context ** resp. array-of-sqlite3_value values initialized by udf_args(). The -** latter will be 0-and-NULL for UDF types with no arguments. This +** (argc,argv) are (0,NULL) for UDF types with no arguments. This ** function zeroes out the nativePointer member of jCx and each entry ** in jArgv. 
This is a safety-net precaution to avoid undefined ** behavior if a Java-side UDF holds a reference to its context or one @@ -2099,19 +2089,19 @@ static int udf_xFSI(sqlite3_context* const pCx, int argc, sqlite3_value** const argv, S3JniUdf * const s, jmethodID xMethodID, const char * const zFuncType){ S3JniDeclLocal_env; - udf_jargs args = {0,0}; - int rc = udf_args(env, pCx, argc, argv, &args.jcx, &args.jargv); - + jobject jcx = 0 /* sqlite3_context */; + jobjectArray jargv = 0 /* sqlite3_value[] */; + int rc = udf_args(env, pCx, argc, argv, &jcx, &jargv); if( 0 == rc ){ - (*env)->CallVoidMethod(env, s->jObj, xMethodID, args.jcx, args.jargv); + (*env)->CallVoidMethod(env, s->jObj, xMethodID, jcx, jargv); S3JniIfThrew{ rc = udf_report_exception(env, 'F'==zFuncType[1]/*xFunc*/, pCx, s->zFuncName, zFuncType); } - udf_unargs(env, args.jcx, argc, args.jargv); + udf_unargs(env, jcx, argc, jargv); } - S3JniUnrefLocal(args.jcx); - S3JniUnrefLocal(args.jargv); + S3JniUnrefLocal(jcx); + S3JniUnrefLocal(jargv); return rc; } @@ -3300,7 +3290,7 @@ static jobject s3jni_commit_rollback_hook(int isCommit, JNIEnv * const env, S3JniDb_mutex_enter; ps = S3JniDb_from_jlong(jpDb); if( !ps ){ - s3jni_db_error(ps->pDb, SQLITE_MISUSE, 0); + s3jni_db_error(env, ps->pDb, SQLITE_MISUSE, 0); S3JniDb_mutex_leave; return 0; } @@ -3320,13 +3310,14 @@ static jobject s3jni_commit_rollback_hook(int isCommit, JNIEnv * const env, else sqlite3_rollback_hook(ps->pDb, 0, 0); }else{ jclass const klazz = (*env)->GetObjectClass(env, jHook); - jmethodID const xCallback = (*env)->GetMethodID(env, klazz, "call", - isCommit ? "()I" : "()V"); + jmethodID const xCallback = + (*env)->GetMethodID(env, klazz, "call", + isCommit ? 
"()I" : "()V"); S3JniUnrefLocal(klazz); S3JniIfThrew { S3JniExceptionReport; S3JniExceptionClear; - s3jni_db_error(ps->pDb, SQLITE_ERROR, + s3jni_db_error(env, ps->pDb, SQLITE_ERROR, "Cannot not find matching call() method in" "hook object."); }else{ @@ -3606,7 +3597,7 @@ S3JniApi(sqlite3_create_collation() sqlite3_create_collation_v2(), (*env)->GetMethodID(env, klazz, "call", "([B[B)I"); S3JniUnrefLocal(klazz); S3JniIfThrew{ - rc = s3jni_db_error(ps->pDb, SQLITE_ERROR, + rc = s3jni_db_error(env, ps->pDb, SQLITE_ERROR, "Could not get call() method from " "CollationCallback object."); }else{ @@ -3645,15 +3636,15 @@ S3JniApi(sqlite3_create_function() sqlite3_create_function_v2() if( !pDb || !jFuncName ){ return SQLITE_MISUSE; - }else if( !encodingTypeIsValid(eTextRep) ){ - return s3jni_db_error(pDb, SQLITE_FORMAT, + }else if( !encodingTypeIsValid(eTextRep & 0x0f) ){ + return s3jni_db_error(env, pDb, SQLITE_FORMAT, "Invalid function encoding option."); } s = S3JniUdf_alloc(env, jFunctor); if( !s ) return SQLITE_NOMEM; if( UDF_UNKNOWN_TYPE==s->type ){ - rc = s3jni_db_error(pDb, SQLITE_MISUSE, + rc = s3jni_db_error(env, pDb, SQLITE_MISUSE, "Cannot unambiguously determine function type."); S3JniUdf_free(env, s, 1); goto error_cleanup; @@ -4012,7 +4003,7 @@ S3JniApi(sqlite3_jni_db_error(), jint, 1jni_1db_1error)( zStr = jStr ? s3jni_jstring_to_utf8( jStr, 0) : NULL; - rc = s3jni_db_error( ps->pDb, (int)jRc, zStr ); + rc = s3jni_db_error(env, ps->pDb, (int)jRc, zStr ); sqlite3_free(zStr); } return rc; @@ -4321,7 +4312,7 @@ static void s3jni_updatepre_hook_impl(void * pState, sqlite3 *pDb, int opId, jTable = jDbName ? 
s3jni_utf8_to_jstring( zTable, -1) : 0; S3JniIfThrew { S3JniExceptionClear; - s3jni_db_error(ps->pDb, SQLITE_NOMEM, 0); + s3jni_db_error(env, ps->pDb, SQLITE_NOMEM, 0); }else{ assert( hook.jObj ); assert( hook.midCallback ); @@ -4423,7 +4414,7 @@ static jobject s3jni_updatepre_hook(JNIEnv * env, int isPre, jlong jpDb, jobject S3JniUnrefLocal(klazz); S3JniIfThrew { S3JniExceptionClear; - s3jni_db_error(ps->pDb, SQLITE_ERROR, + s3jni_db_error(env, ps->pDb, SQLITE_ERROR, "Cannot not find matching callback on " "(pre)update hook object."); }else{ @@ -4532,7 +4523,7 @@ S3JniApi(sqlite3_progress_handler(),void,1progress_1handler)( S3JniUnrefLocal(klazz); S3JniIfThrew { S3JniExceptionClear; - s3jni_db_error(ps->pDb, SQLITE_ERROR, + s3jni_db_error(env, ps->pDb, SQLITE_ERROR, "Cannot not find matching xCallback() on " "ProgressHandler object."); }else{ @@ -4906,8 +4897,9 @@ S3JniApi(sqlite3_set_authorizer(),jint,1set_1authorizer)( ")I"); S3JniUnrefLocal(klazz); S3JniIfThrew { - rc = s3jni_db_error(ps->pDb, SQLITE_ERROR, - "Error setting up Java parts of authorizer hook."); + rc = s3jni_db_error(env, ps->pDb, SQLITE_ERROR, + "Error setting up Java parts of " + "authorizer hook."); }else{ rc = sqlite3_set_authorizer(ps->pDb, s3jni_xAuth, ps); } @@ -5191,7 +5183,7 @@ S3JniApi(sqlite3_trace_v2(),jint,1trace_1v2)( S3JniUnrefLocal(klazz); S3JniIfThrew { S3JniExceptionClear; - rc = s3jni_db_error(ps->pDb, SQLITE_ERROR, + rc = s3jni_db_error(env, ps->pDb, SQLITE_ERROR, "Cannot not find matching call() on " "TracerCallback object."); }else{ diff --git a/ext/jni/src/c/sqlite3-jni.h b/ext/jni/src/c/sqlite3-jni.h index 81af5cbde1..c326fa8eaf 100644 --- a/ext/jni/src/c/sqlite3-jni.h +++ b/ext/jni/src/c/sqlite3-jni.h @@ -245,8 +245,10 @@ extern "C" { #define org_sqlite_jni_capi_CApi_SQLITE_DBSTATUS_CACHE_USED_SHARED 11L #undef org_sqlite_jni_capi_CApi_SQLITE_DBSTATUS_CACHE_SPILL #define org_sqlite_jni_capi_CApi_SQLITE_DBSTATUS_CACHE_SPILL 12L +#undef 
org_sqlite_jni_capi_CApi_SQLITE_DBSTATUS_TEMPBUF_SPILL +#define org_sqlite_jni_capi_CApi_SQLITE_DBSTATUS_TEMPBUF_SPILL 13L #undef org_sqlite_jni_capi_CApi_SQLITE_DBSTATUS_MAX -#define org_sqlite_jni_capi_CApi_SQLITE_DBSTATUS_MAX 12L +#define org_sqlite_jni_capi_CApi_SQLITE_DBSTATUS_MAX 13L #undef org_sqlite_jni_capi_CApi_SQLITE_UTF8 #define org_sqlite_jni_capi_CApi_SQLITE_UTF8 1L #undef org_sqlite_jni_capi_CApi_SQLITE_UTF16LE diff --git a/ext/jni/src/org/sqlite/jni/capi/CApi.java b/ext/jni/src/org/sqlite/jni/capi/CApi.java index 0b840c3623..1bdc5300d2 100644 --- a/ext/jni/src/org/sqlite/jni/capi/CApi.java +++ b/ext/jni/src/org/sqlite/jni/capi/CApi.java @@ -115,8 +115,9 @@ private static byte[] nulTerminateUtf8(String s){ JNIEnv is not cached, else returns true, but this information is primarily for testing of the JNI bindings and is not information which client-level code can use to make any informed - decisions. Its return type and semantics are not considered - stable and may change at any time. + decisions. The semantics of its return type and value are not + considered stable and may change at any time. i.e. act as if it + returns null. 
*/ public static native boolean sqlite3_java_uncache_thread(); @@ -2585,7 +2586,8 @@ public static int sqlite3_value_type(@NotNull sqlite3_value v){ public static final int SQLITE_DBSTATUS_DEFERRED_FKS = 10; public static final int SQLITE_DBSTATUS_CACHE_USED_SHARED = 11; public static final int SQLITE_DBSTATUS_CACHE_SPILL = 12; - public static final int SQLITE_DBSTATUS_MAX = 12; + public static final int SQLITE_DBSTATUS_TEMPBUF_SPILL = 13; + public static final int SQLITE_DBSTATUS_MAX = 13; // encodings public static final int SQLITE_UTF8 = 1; diff --git a/ext/jni/src/org/sqlite/jni/capi/Tester1.java b/ext/jni/src/org/sqlite/jni/capi/Tester1.java index 9d14c954b8..891bdea541 100644 --- a/ext/jni/src/org/sqlite/jni/capi/Tester1.java +++ b/ext/jni/src/org/sqlite/jni/capi/Tester1.java @@ -815,7 +815,9 @@ public void xDestroy(){ }; // Register and use the function... - int rc = sqlite3_create_function(db, "myfunc", -1, SQLITE_UTF8, func); + int rc = sqlite3_create_function(db, "myfunc", -1, + SQLITE_UTF8 | SQLITE_INNOCUOUS, + func); affirm(0 == rc); affirm(0 == xFuncAccum.value); final sqlite3_stmt stmt = prepare(db, "SELECT myfunc(1,2,3)"); diff --git a/ext/jni/src/org/sqlite/jni/wrapper1/Sqlite.java b/ext/jni/src/org/sqlite/jni/wrapper1/Sqlite.java index d259e0ce62..ba2ffd119d 100644 --- a/ext/jni/src/org/sqlite/jni/wrapper1/Sqlite.java +++ b/ext/jni/src/org/sqlite/jni/wrapper1/Sqlite.java @@ -171,6 +171,7 @@ public final class Sqlite implements AutoCloseable { public static final int DBSTATUS_DEFERRED_FKS = CApi.SQLITE_DBSTATUS_DEFERRED_FKS; public static final int DBSTATUS_CACHE_USED_SHARED = CApi.SQLITE_DBSTATUS_CACHE_USED_SHARED; public static final int DBSTATUS_CACHE_SPILL = CApi.SQLITE_DBSTATUS_CACHE_SPILL; + public static final int DBSTATUS_TEMPBUF_SPILL = CApi.SQLITE_DBSTATUS_TEMPBUF_SPILL; // Limits public static final int LIMIT_LENGTH = CApi.SQLITE_LIMIT_LENGTH; diff --git a/ext/misc/amatch.c b/ext/misc/amatch.c index 587c610b95..21504777f6 100644 --- 
a/ext/misc/amatch.c +++ b/ext/misc/amatch.c @@ -847,7 +847,7 @@ static int amatchConnect( (void)pAux; *ppVtab = 0; - pNew = sqlite3_malloc( sizeof(*pNew) ); + pNew = sqlite3_malloc64( sizeof(*pNew) ); if( pNew==0 ) return SQLITE_NOMEM; rc = SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); @@ -928,7 +928,7 @@ static int amatchConnect( static int amatchOpen(sqlite3_vtab *pVTab, sqlite3_vtab_cursor **ppCursor){ amatch_vtab *p = (amatch_vtab*)pVTab; amatch_cursor *pCur; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); pCur->pVtab = p; diff --git a/ext/misc/base64.c b/ext/misc/base64.c index 2da767bb0d..3334222f71 100644 --- a/ext/misc/base64.c +++ b/ext/misc/base64.c @@ -232,7 +232,7 @@ static void base64(sqlite3_context *context, int na, sqlite3_value *av[]){ sqlite3_result_text(context,"",-1,SQLITE_STATIC); break; } - cBuf = sqlite3_malloc(nc); + cBuf = sqlite3_malloc64(nc); if( !cBuf ) goto memFail; nc = (int)(toBase64(bBuf, nb, cBuf) - cBuf); sqlite3_result_text(context, cBuf, nc, sqlite3_free); @@ -254,7 +254,7 @@ static void base64(sqlite3_context *context, int na, sqlite3_value *av[]){ sqlite3_result_zeroblob(context, 0); break; } - bBuf = sqlite3_malloc(nb); + bBuf = sqlite3_malloc64(nb); if( !bBuf ) goto memFail; nb = (int)(fromBase64(cBuf, nc, bBuf) - bBuf); sqlite3_result_blob(context, bBuf, nb, sqlite3_free); @@ -275,7 +275,7 @@ static void base64(sqlite3_context *context, int na, sqlite3_value *av[]){ #ifdef _WIN32 __declspec(dllexport) #endif -int sqlite3_base_init +int sqlite3_base64_init #else static int sqlite3_base64_init #endif diff --git a/ext/misc/base85.c b/ext/misc/base85.c index 63245e2e4a..a2e6c3ab40 100644 --- a/ext/misc/base85.c +++ b/ext/misc/base85.c @@ -262,7 +262,7 @@ static int allBase85( char *p, int len ){ #ifndef BASE85_STANDALONE -# ifndef OMIT_BASE85_CHECKER +#ifndef OMIT_BASE85_CHECKER /* This function does the work for the SQLite 
is_base85(t) UDF. */ static void is_base85(sqlite3_context *context, int na, sqlite3_value *av[]){ assert(na==1); @@ -282,7 +282,7 @@ static void is_base85(sqlite3_context *context, int na, sqlite3_value *av[]){ return; } } -# endif +#endif /* This function does the work for the SQLite base85(x) UDF. */ static void base85(sqlite3_context *context, int na, sqlite3_value *av[]){ @@ -309,7 +309,7 @@ static void base85(sqlite3_context *context, int na, sqlite3_value *av[]){ sqlite3_result_text(context,"",-1,SQLITE_STATIC); break; } - cBuf = sqlite3_malloc(nc); + cBuf = sqlite3_malloc64(nc); if( !cBuf ) goto memFail; nc = (int)(toBase85(bBuf, nb, cBuf, "\n") - cBuf); sqlite3_result_text(context, cBuf, nc, sqlite3_free); @@ -331,7 +331,7 @@ static void base85(sqlite3_context *context, int na, sqlite3_value *av[]){ sqlite3_result_zeroblob(context, 0); break; } - bBuf = sqlite3_malloc(nb); + bBuf = sqlite3_malloc64(nb); if( !bBuf ) goto memFail; nb = (int)(fromBase85(cBuf, nc, bBuf) - bBuf); sqlite3_result_blob(context, bBuf, nb, sqlite3_free); @@ -352,14 +352,14 @@ static void base85(sqlite3_context *context, int na, sqlite3_value *av[]){ #ifdef _WIN32 __declspec(dllexport) #endif -int sqlite3_base_init +int sqlite3_base85_init #else static int sqlite3_base85_init #endif (sqlite3 *db, char **pzErr, const sqlite3_api_routines *pApi){ SQLITE_EXTENSION_INIT2(pApi); (void)pzErr; -# ifndef OMIT_BASE85_CHECKER +#ifndef OMIT_BASE85_CHECKER { int rc = sqlite3_create_function (db, "is_base85", 1, @@ -367,7 +367,7 @@ static int sqlite3_base85_init 0, is_base85, 0, 0); if( rc!=SQLITE_OK ) return rc; } -# endif +#endif return sqlite3_create_function (db, "base85", 1, SQLITE_DETERMINISTIC|SQLITE_INNOCUOUS|SQLITE_DIRECTONLY|SQLITE_UTF8, @@ -432,9 +432,9 @@ int main(int na, char *av[]){ int nc = strlen(cBuf); size_t nbo = fromBase85( cBuf, nc, bBuf ) - bBuf; if( 1 != fwrite(bBuf, nbo, 1, fb) ) rc = 1; -# ifndef OMIT_BASE85_CHECKER +#ifndef OMIT_BASE85_CHECKER b85Clean &= allBase85( 
cBuf, nc ); -# endif +#endif } break; default: diff --git a/ext/misc/btreeinfo.c b/ext/misc/btreeinfo.c index 9c726f5f17..24645f2268 100644 --- a/ext/misc/btreeinfo.c +++ b/ext/misc/btreeinfo.c @@ -306,6 +306,10 @@ static int binfoCompute(sqlite3 *db, int pgno, BinfoCursor *pCsr){ nEntry *= (nCell+1); if( aData[0]==10 || aData[0]==13 ) break; nPage *= (nCell+1); + if( 14+2*(nCell/2)>=pgsz ){ + rc = SQLITE_CORRUPT; + break; + } if( nCell<=1 ){ pgno = get_uint32(aData+8); }else{ @@ -339,7 +343,7 @@ static int binfoColumn( sqlite3 *db = sqlite3_context_db_handle(ctx); int rc = binfoCompute(db, pgno, pCsr); if( rc ){ - pCursor->pVtab->zErrMsg = sqlite3_mprintf("%s", sqlite3_errmsg(db)); + pCursor->pVtab->zErrMsg = sqlite3_mprintf("%s", sqlite3_errstr(rc)); return SQLITE_ERROR; } } diff --git a/ext/misc/closure.c b/ext/misc/closure.c index 14caf271f9..22bfd888f5 100644 --- a/ext/misc/closure.c +++ b/ext/misc/closure.c @@ -516,7 +516,7 @@ static int closureConnect( (void)pAux; *ppVtab = 0; - pNew = sqlite3_malloc( sizeof(*pNew) ); + pNew = sqlite3_malloc64( sizeof(*pNew) ); if( pNew==0 ) return SQLITE_NOMEM; rc = SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); @@ -579,7 +579,7 @@ static int closureConnect( static int closureOpen(sqlite3_vtab *pVTab, sqlite3_vtab_cursor **ppCursor){ closure_vtab *p = (closure_vtab*)pVTab; closure_cursor *pCur; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); pCur->pVtab = p; @@ -638,7 +638,7 @@ static int closureInsertNode( sqlite3_int64 id, /* The node ID */ int iGeneration /* The generation number for this node */ ){ - closure_avl *pNew = sqlite3_malloc( sizeof(*pNew) ); + closure_avl *pNew = sqlite3_malloc64( sizeof(*pNew) ); if( pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); pNew->id = id; diff --git a/ext/misc/completion.c b/ext/misc/completion.c index 67b40d84d1..37237d9c9f 100644 --- a/ext/misc/completion.c +++ 
b/ext/misc/completion.c @@ -132,7 +132,7 @@ static int completionConnect( " phase INT HIDDEN" /* Used for debugging only */ ")"); if( rc==SQLITE_OK ){ - pNew = sqlite3_malloc( sizeof(*pNew) ); + pNew = sqlite3_malloc64( sizeof(*pNew) ); *ppVtab = (sqlite3_vtab*)pNew; if( pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); @@ -154,7 +154,7 @@ static int completionDisconnect(sqlite3_vtab *pVtab){ */ static int completionOpen(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){ completion_cursor *pCur; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); pCur->db = ((completion_vtab*)p)->db; @@ -199,6 +199,7 @@ static int completionNext(sqlite3_vtab_cursor *cur){ completion_cursor *pCur = (completion_cursor*)cur; int eNextPhase = 0; /* Next phase to try if current phase reaches end */ int iCol = -1; /* If >=0, step pCur->pStmt and use the i-th column */ + int rc; pCur->iRowid++; while( pCur->ePhase!=COMPLETION_EOF ){ switch( pCur->ePhase ){ @@ -224,22 +225,27 @@ static int completionNext(sqlite3_vtab_cursor *cur){ case COMPLETION_TABLES: { if( pCur->pStmt==0 ){ sqlite3_stmt *pS2; + sqlite3_str* pStr = sqlite3_str_new(pCur->db); char *zSql = 0; const char *zSep = ""; sqlite3_prepare_v2(pCur->db, "PRAGMA database_list", -1, &pS2, 0); while( sqlite3_step(pS2)==SQLITE_ROW ){ const char *zDb = (const char*)sqlite3_column_text(pS2, 1); - zSql = sqlite3_mprintf( - "%z%s" + sqlite3_str_appendf(pStr, + "%s" "SELECT name FROM \"%w\".sqlite_schema", - zSql, zSep, zDb + zSep, zDb ); - if( zSql==0 ) return SQLITE_NOMEM; zSep = " UNION "; } - sqlite3_finalize(pS2); - sqlite3_prepare_v2(pCur->db, zSql, -1, &pCur->pStmt, 0); + rc = sqlite3_finalize(pS2); + zSql = sqlite3_str_finish(pStr); + if( zSql==0 ) return SQLITE_NOMEM; + if( rc==SQLITE_OK ){ + sqlite3_prepare_v2(pCur->db, zSql, -1, &pCur->pStmt, 0); + } sqlite3_free(zSql); + if( rc ) return rc; } iCol = 0; eNextPhase = 
COMPLETION_COLUMNS; @@ -248,24 +254,29 @@ static int completionNext(sqlite3_vtab_cursor *cur){ case COMPLETION_COLUMNS: { if( pCur->pStmt==0 ){ sqlite3_stmt *pS2; + sqlite3_str *pStr = sqlite3_str_new(pCur->db); char *zSql = 0; const char *zSep = ""; sqlite3_prepare_v2(pCur->db, "PRAGMA database_list", -1, &pS2, 0); while( sqlite3_step(pS2)==SQLITE_ROW ){ const char *zDb = (const char*)sqlite3_column_text(pS2, 1); - zSql = sqlite3_mprintf( - "%z%s" + sqlite3_str_appendf(pStr, + "%s" "SELECT pti.name FROM \"%w\".sqlite_schema AS sm" " JOIN pragma_table_xinfo(sm.name,%Q) AS pti" " WHERE sm.type='table'", - zSql, zSep, zDb, zDb + zSep, zDb, zDb ); - if( zSql==0 ) return SQLITE_NOMEM; zSep = " UNION "; } - sqlite3_finalize(pS2); - sqlite3_prepare_v2(pCur->db, zSql, -1, &pCur->pStmt, 0); + rc = sqlite3_finalize(pS2); + zSql = sqlite3_str_finish(pStr); + if( zSql==0 ) return SQLITE_NOMEM; + if( rc==SQLITE_OK ){ + sqlite3_prepare_v2(pCur->db, zSql, -1, &pCur->pStmt, 0); + } sqlite3_free(zSql); + if( rc ) return rc; } iCol = 0; eNextPhase = COMPLETION_EOF; @@ -282,9 +293,10 @@ static int completionNext(sqlite3_vtab_cursor *cur){ pCur->szRow = sqlite3_column_bytes(pCur->pStmt, iCol); }else{ /* When all rows are finished, advance to the next phase */ - sqlite3_finalize(pCur->pStmt); + rc = sqlite3_finalize(pCur->pStmt); pCur->pStmt = 0; pCur->ePhase = eNextPhase; + if( rc ) return rc; continue; } } diff --git a/ext/misc/compress.c b/ext/misc/compress.c index 6b034eb45f..48ea5182d7 100644 --- a/ext/misc/compress.c +++ b/ext/misc/compress.c @@ -59,7 +59,7 @@ static void compressFunc( pIn = sqlite3_value_blob(argv[0]); nIn = sqlite3_value_bytes(argv[0]); nOut = 13 + nIn + (nIn+999)/1000; - pOut = sqlite3_malloc( nOut+5 ); + pOut = sqlite3_malloc64( nOut+5 ); for(i=4; i>=0; i--){ x[i] = (nIn >> (7*(4-i)))&0x7f; } @@ -98,7 +98,7 @@ static void uncompressFunc( nOut = (nOut<<7) | (pIn[i]&0x7f); if( (pIn[i]&0x80)!=0 ){ i++; break; } } - pOut = sqlite3_malloc( nOut+1 ); + pOut = 
sqlite3_malloc64( nOut+1 ); rc = uncompress(pOut, &nOut, &pIn[i], nIn-i); if( rc==Z_OK ){ sqlite3_result_blob(context, pOut, nOut, sqlite3_free); diff --git a/ext/misc/csv.c b/ext/misc/csv.c index 8331265aa0..eaf9cbba78 100644 --- a/ext/misc/csv.c +++ b/ext/misc/csv.c @@ -24,8 +24,8 @@ ** schema= parameter, like this: ** ** CREATE VIRTUAL TABLE temp.csv2 USING csv( -** filename = "../http.log", -** schema = "CREATE TABLE x(date,ipaddr,url,referrer,userAgent)" +** filename = '../http.log', +** schema = 'CREATE TABLE x(date,ipaddr,url,referrer,userAgent)' ** ); ** ** Instead of specifying a file, the text of the CSV can be loaded using @@ -62,6 +62,10 @@ SQLITE_EXTENSION_INIT1 # define CSV_NOINLINE #endif +#ifndef SQLITEINT_H +typedef sqlite3_int64 i64; +typedef sqlite3_uint64 u64; +#endif /* Max size of the error message in a CsvReader */ #define CSV_MXERR 200 @@ -74,9 +78,9 @@ typedef struct CsvReader CsvReader; struct CsvReader { FILE *in; /* Read the CSV text from this input stream */ char *z; /* Accumulated text for a field */ - int n; /* Number of bytes in z */ - int nAlloc; /* Space allocated for z[] */ - int nLine; /* Current line number */ + i64 n; /* Number of bytes in z */ + i64 nAlloc; /* Space allocated for z[] */ + i64 nLine; /* Current line number */ int bNotFirst; /* True if prior text has been seen */ int cTerm; /* Character that terminated the most recent field */ size_t iIn; /* Next unread character in the input buffer */ @@ -125,7 +129,7 @@ static int csv_reader_open( const char *zData /* ... 
or use this data */ ){ if( zFilename ){ - p->zIn = sqlite3_malloc( CSV_INBUFSZ ); + p->zIn = sqlite3_malloc64( CSV_INBUFSZ ); if( p->zIn==0 ){ csv_errmsg(p, "out of memory"); return 1; @@ -174,7 +178,7 @@ static int csv_getc(CsvReader *p){ ** Return 0 on success and non-zero if there is an OOM error */ static CSV_NOINLINE int csv_resize_and_append(CsvReader *p, char c){ char *zNew; - int nNew = p->nAlloc*2 + 100; + i64 nNew = p->nAlloc*2 + 100; zNew = sqlite3_realloc64(p->z, nNew); if( zNew ){ p->z = zNew; @@ -218,7 +222,7 @@ static char *csv_read_one_field(CsvReader *p){ } if( c=='"' ){ int pc, ppc; - int startLine = p->nLine; + i64 startLine = p->nLine; pc = ppc = 0; while( 1 ){ c = csv_getc(p); @@ -322,7 +326,7 @@ typedef struct CsvCursor { sqlite3_vtab_cursor base; /* Base class. Must be first */ CsvReader rdr; /* The CsvReader object */ char **azVal; /* Value of the current row */ - int *aLen; /* Length of each entry */ + i64 *aLen; /* Length of each entry */ sqlite3_int64 iRowid; /* The current rowid. Negative for EOF */ } CsvCursor; @@ -494,7 +498,7 @@ static int csvtabConnect( CsvTable *pNew = 0; /* The CsvTable object to construct */ int bHeader = -1; /* header= flags. 
-1 means not seen yet */ int rc = SQLITE_OK; /* Result code from this routine */ - int i, j; /* Loop counters */ + u64 i, j; /* Loop counters */ #ifdef SQLITE_TEST int tstFlags = 0; /* Value for testflags=N parameter */ #endif @@ -510,11 +514,10 @@ static int csvtabConnect( # define CSV_DATA (azPValue[1]) # define CSV_SCHEMA (azPValue[2]) - assert( sizeof(azPValue)==sizeof(azParam) ); memset(&sRdr, 0, sizeof(sRdr)); memset(azPValue, 0, sizeof(azPValue)); - for(i=3; inCol; + nByte = sizeof(*pCur) + (sizeof(char*)+sizeof(i64))*pTab->nCol; pCur = sqlite3_malloc64( nByte ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, nByte); pCur->azVal = (char**)&pCur[1]; - pCur->aLen = (int*)&pCur->azVal[pTab->nCol]; + pCur->aLen = (i64*)&pCur->azVal[pTab->nCol]; *ppCursor = &pCur->base; if( csv_reader_open(&pCur->rdr, pTab->zFilename, pTab->zData) ){ csv_xfer_error(pTab, &pCur->rdr); diff --git a/ext/misc/decimal.c b/ext/misc/decimal.c index f87699f96b..66d4e3042f 100644 --- a/ext/misc/decimal.c +++ b/ext/misc/decimal.c @@ -31,6 +31,10 @@ SQLITE_EXTENSION_INIT1 #define IsSpace(X) isspace((unsigned char)X) #endif +#ifndef SQLITE_DECIMAL_MAX_DIGIT +# define SQLITE_DECIMAL_MAX_DIGIT 10000000 +#endif + /* A decimal object */ typedef struct Decimal Decimal; struct Decimal { @@ -69,7 +73,8 @@ static Decimal *decimalNewFromText(const char *zIn, int n){ int i; int iExp = 0; - p = sqlite3_malloc( sizeof(*p) ); + if( zIn==0 ) goto new_from_text_failed; + p = sqlite3_malloc64( sizeof(*p) ); if( p==0 ) goto new_from_text_failed; p->sign = 0; p->oom = 0; @@ -128,9 +133,10 @@ static Decimal *decimalNewFromText(const char *zIn, int n){ } } if( iExp>0 ){ - p->a = sqlite3_realloc64(p->a, (sqlite3_int64)p->nDigit + signed char *a = sqlite3_realloc64(p->a, (sqlite3_int64)p->nDigit + (sqlite3_int64)iExp + 1 ); - if( p->a==0 ) goto new_from_text_failed; + if( a==0 ) goto new_from_text_failed; + p->a = a; memset(p->a+p->nDigit, 0, iExp); p->nDigit += iExp; } @@ -148,9 +154,10 @@ static Decimal 
*decimalNewFromText(const char *zIn, int n){ } } if( iExp>0 ){ - p->a = sqlite3_realloc64(p->a, (sqlite3_int64)p->nDigit + signed char *a = sqlite3_realloc64(p->a, (sqlite3_int64)p->nDigit + (sqlite3_int64)iExp + 1 ); - if( p->a==0 ) goto new_from_text_failed; + if( a==0 ) goto new_from_text_failed; + p->a = a; memmove(p->a+iExp, p->a, p->nDigit); memset(p->a, 0, iExp); p->nDigit += iExp; @@ -161,6 +168,7 @@ static Decimal *decimalNewFromText(const char *zIn, int n){ for(i=0; i<p->nDigit && p->a[i]==0; i++){} if( i>=p->nDigit ) p->sign = 0; } + if( p->nDigit>SQLITE_DECIMAL_MAX_DIGIT ) goto new_from_text_failed; return p; new_from_text_failed: @@ -291,12 +299,38 @@ static void decimal_result(sqlite3_context *pCtx, Decimal *p){ sqlite3_result_text(pCtx, z, i, sqlite3_free); } +/* +** Round a decimal value to N significant digits. N must be positive. +*/ +static void decimal_round(Decimal *p, int N){ + int i; + int nZero; + if( N<1 ) return; + if( p==0 ) return; + if( p->nDigit<=N ) return; + for(nZero=0; nZero<p->nDigit && p->a[nZero]==0; nZero++){} + N += nZero; + if( p->nDigit<=N ) return; + if( p->a[N]>4 ){ + p->a[N-1]++; + for(i=N-1; i>0 && p->a[i]>9; i--){ + p->a[i] = 0; + p->a[i-1]++; + } + if( p->a[0]>9 ){ + p->a[0] = 1; + p->nFrac--; + } + } + memset(&p->a[N], 0, p->nDigit - N); +} + /* ** Make the given Decimal the result in an format similar to '%+#e'. ** In other words, show exponential notation with leading and trailing ** zeros omitted. 
*/ -static void decimal_result_sci(sqlite3_context *pCtx, Decimal *p){ +static void decimal_result_sci(sqlite3_context *pCtx, Decimal *p, int N){ char *z; /* The output buffer */ int i; /* Loop counter */ int nZero; /* Number of leading zeros */ @@ -314,7 +348,8 @@ static void decimal_result_sci(sqlite3_context *pCtx, Decimal *p){ sqlite3_result_null(pCtx); return; } - for(nDigit=p->nDigit; nDigit>0 && p->a[nDigit-1]==0; nDigit--){} + if( N<1 ) N = 0; + for(nDigit=p->nDigit; nDigit>N && p->a[nDigit-1]==0; nDigit--){} for(nZero=0; nZero<nDigit && p->a[nZero]==0; nZero++){} nFrac = p->nFrac + (nDigit - p->nDigit); nDigit -= nZero; @@ -430,15 +465,18 @@ static void decimalCmpFunc( static void decimal_expand(Decimal *p, int nDigit, int nFrac){ int nAddSig; int nAddFrac; + signed char *a; if( p==0 ) return; nAddFrac = nFrac - p->nFrac; nAddSig = (nDigit - p->nDigit) - nAddFrac; if( nAddFrac==0 && nAddSig==0 ) return; - p->a = sqlite3_realloc64(p->a, nDigit+1); - if( p->a==0 ){ + if( nDigit+1>SQLITE_DECIMAL_MAX_DIGIT ){ p->oom = 1; return; } + a = sqlite3_realloc64(p->a, nDigit+1); + if( a==0 ){ p->oom = 1; return; } + p->a = a; if( nAddSig ){ memmove(p->a+nAddSig, p->a, p->nDigit); memset(p->a, 0, nAddSig); @@ -533,14 +571,18 @@ static void decimalMul(Decimal *pA, Decimal *pB){ signed char *acc = 0; int i, j, k; int minFrac; + sqlite3_int64 sumDigit; if( pA==0 || pA->oom || pA->isNull || pB==0 || pB->oom || pB->isNull ){ goto mul_end; } - acc = sqlite3_malloc64( (sqlite3_int64)pA->nDigit + - (sqlite3_int64)pB->nDigit + 2 ); + sumDigit = pA->nDigit; + sumDigit += pB->nDigit; + sumDigit += 2; + if( sumDigit>SQLITE_DECIMAL_MAX_DIGIT ){ pA->oom = 1; return; } + acc = sqlite3_malloc64( sumDigit ); if( acc==0 ){ pA->oom = 1; goto mul_end; @@ -677,10 +719,16 @@ static void decimalFunc( sqlite3_value **argv ){ Decimal *p = decimal_new(context, argv[0], 0); - UNUSED_PARAMETER(argc); + int N; + if( argc==2 ){ + N = sqlite3_value_int(argv[1]); + if( N>0 ) decimal_round(p, N); + }else{ + N = 0; 
+ } if( p ){ if( sqlite3_user_data(context)!=0 ){ - decimal_result_sci(context, p); + decimal_result_sci(context, p, N); }else{ decimal_result(context, p); } @@ -766,7 +814,7 @@ static void decimalSumStep( if( p==0 ) return; if( !p->isInit ){ p->isInit = 1; - p->a = sqlite3_malloc(2); + p->a = sqlite3_malloc64(2); if( p->a==0 ){ p->oom = 1; }else{ @@ -850,7 +898,7 @@ static void decimalPow2Func( UNUSED_PARAMETER(argc); if( sqlite3_value_type(argv[0])==SQLITE_INTEGER ){ Decimal *pA = decimalPow2(sqlite3_value_int(argv[0])); - decimal_result_sci(context, pA); + decimal_result_sci(context, pA, 0); decimal_free(pA); } } @@ -871,7 +919,9 @@ int sqlite3_decimal_init( void (*xFunc)(sqlite3_context*,int,sqlite3_value**); } aFunc[] = { { "decimal", 1, 0, decimalFunc }, + { "decimal", 2, 0, decimalFunc }, { "decimal_exp", 1, 1, decimalFunc }, + { "decimal_exp", 2, 1, decimalFunc }, { "decimal_cmp", 2, 0, decimalCmpFunc }, { "decimal_add", 2, 0, decimalAddFunc }, { "decimal_sub", 2, 0, decimalSubFunc }, diff --git a/ext/misc/explain.c b/ext/misc/explain.c index 726af76b96..132041882c 100644 --- a/ext/misc/explain.c +++ b/ext/misc/explain.c @@ -92,7 +92,7 @@ static int explainConnect( rc = sqlite3_declare_vtab(db, "CREATE TABLE x(addr,opcode,p1,p2,p3,p4,p5,comment,sql HIDDEN)"); if( rc==SQLITE_OK ){ - pNew = sqlite3_malloc( sizeof(*pNew) ); + pNew = sqlite3_malloc64( sizeof(*pNew) ); *ppVtab = (sqlite3_vtab*)pNew; if( pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); @@ -114,7 +114,7 @@ static int explainDisconnect(sqlite3_vtab *pVtab){ */ static int explainOpen(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){ explain_cursor *pCur; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); pCur->db = ((explain_vtab*)p)->db; diff --git a/ext/misc/fileio.c b/ext/misc/fileio.c index d78b148779..91da383e75 100644 --- a/ext/misc/fileio.c +++ b/ext/misc/fileio.c @@ -73,11 
+73,6 @@ ** $path is a relative path, then $path is interpreted relative to $dir. ** And the paths returned in the "name" column of the table are also ** relative to directory $dir. -** -** Notes on building this extension for Windows: -** Unless linked statically with the SQLite library, a preprocessor -** symbol, FILEIO_WIN32_DLL, must be #define'd to create a stand-alone -** DLL form of this extension for WIN32. See its use below for details. */ #include "sqlite3ext.h" SQLITE_EXTENSION_INIT1 @@ -94,12 +89,16 @@ SQLITE_EXTENSION_INIT1 # include # include # define STRUCT_STAT struct stat +# include +# include #else # include "windirent.h" # include # define STRUCT_STAT struct _stat # define chmod(path,mode) fileio_chmod(path,mode) # define mkdir(path,mode) fileio_mkdir(path) + extern LPWSTR sqlite3_win32_utf8_to_unicode(const char*); + extern char *sqlite3_win32_unicode_to_utf8(LPCWSTR); #endif #include #include @@ -131,12 +130,9 @@ SQLITE_EXTENSION_INIT1 */ #if defined(_WIN32) || defined(WIN32) static int fileio_chmod(const char *zPath, int pmode){ - sqlite3_int64 sz = strlen(zPath); - wchar_t *b1 = sqlite3_malloc64( (sz+1)*sizeof(b1[0]) ); int rc; + wchar_t *b1 = sqlite3_win32_utf8_to_unicode(zPath); if( b1==0 ) return -1; - sz = MultiByteToWideChar(CP_UTF8, 0, zPath, sz, b1, sz); - b1[sz] = 0; rc = _wchmod(b1, pmode); sqlite3_free(b1); return rc; @@ -148,12 +144,9 @@ static int fileio_chmod(const char *zPath, int pmode){ */ #if defined(_WIN32) || defined(WIN32) static int fileio_mkdir(const char *zPath){ - sqlite3_int64 sz = strlen(zPath); - wchar_t *b1 = sqlite3_malloc64( (sz+1)*sizeof(b1[0]) ); int rc; + wchar_t *b1 = sqlite3_win32_utf8_to_unicode(zPath); if( b1==0 ) return -1; - sz = MultiByteToWideChar(CP_UTF8, 0, zPath, sz, b1, sz); - b1[sz] = 0; rc = _wmkdir(b1); sqlite3_free(b1); return rc; @@ -266,50 +259,7 @@ static sqlite3_uint64 fileTimeToUnixTime( return (fileIntervals.QuadPart - epochIntervals.QuadPart) / 10000000; } - - -#if 
defined(FILEIO_WIN32_DLL) && (defined(_WIN32) || defined(WIN32)) -# /* To allow a standalone DLL, use this next replacement function: */ -# undef sqlite3_win32_utf8_to_unicode -# define sqlite3_win32_utf8_to_unicode utf8_to_utf16 -# -LPWSTR utf8_to_utf16(const char *z){ - int nAllot = MultiByteToWideChar(CP_UTF8, 0, z, -1, NULL, 0); - LPWSTR rv = sqlite3_malloc(nAllot * sizeof(WCHAR)); - if( rv!=0 && 0 < MultiByteToWideChar(CP_UTF8, 0, z, -1, rv, nAllot) ) - return rv; - sqlite3_free(rv); - return 0; -} -#endif - -/* -** This function attempts to normalize the time values found in the stat() -** buffer to UTC. This is necessary on Win32, where the runtime library -** appears to return these values as local times. -*/ -static void statTimesToUtc( - const char *zPath, - STRUCT_STAT *pStatBuf -){ - HANDLE hFindFile; - WIN32_FIND_DATAW fd; - LPWSTR zUnicodeName; - extern LPWSTR sqlite3_win32_utf8_to_unicode(const char*); - zUnicodeName = sqlite3_win32_utf8_to_unicode(zPath); - if( zUnicodeName ){ - memset(&fd, 0, sizeof(WIN32_FIND_DATAW)); - hFindFile = FindFirstFileW(zUnicodeName, &fd); - if( hFindFile!=NULL ){ - pStatBuf->st_ctime = (time_t)fileTimeToUnixTime(&fd.ftCreationTime); - pStatBuf->st_atime = (time_t)fileTimeToUnixTime(&fd.ftLastAccessTime); - pStatBuf->st_mtime = (time_t)fileTimeToUnixTime(&fd.ftLastWriteTime); - FindClose(hFindFile); - } - sqlite3_free(zUnicodeName); - } -} -#endif +#endif /* _WIN32 */ /* ** This function is used in place of stat(). 
On Windows, special handling @@ -321,14 +271,23 @@ static int fileStat( STRUCT_STAT *pStatBuf ){ #if defined(_WIN32) - sqlite3_int64 sz = strlen(zPath); - wchar_t *b1 = sqlite3_malloc64( (sz+1)*sizeof(b1[0]) ); int rc; + wchar_t *b1 = sqlite3_win32_utf8_to_unicode(zPath); if( b1==0 ) return 1; - sz = MultiByteToWideChar(CP_UTF8, 0, zPath, sz, b1, sz); - b1[sz] = 0; rc = _wstat(b1, pStatBuf); - if( rc==0 ) statTimesToUtc(zPath, pStatBuf); + if( rc==0 ){ + HANDLE hFindFile; + WIN32_FIND_DATAW fd; + memset(&fd, 0, sizeof(WIN32_FIND_DATAW)); + hFindFile = FindFirstFileW(b1, &fd); + if( hFindFile!=NULL ){ + pStatBuf->st_ctime = (time_t)fileTimeToUnixTime(&fd.ftCreationTime); + pStatBuf->st_atime = (time_t)fileTimeToUnixTime(&fd.ftLastAccessTime); + pStatBuf->st_mtime = (time_t)fileTimeToUnixTime(&fd.ftLastWriteTime); + FindClose(hFindFile); + } + } + sqlite3_free(b1); return rc; #else return stat(zPath, pStatBuf); @@ -459,7 +418,6 @@ static int writeFile( if( mtime>=0 ){ #if defined(_WIN32) -#if !SQLITE_OS_WINRT /* Windows */ FILETIME lastAccess; FILETIME lastWrite; @@ -490,7 +448,6 @@ static int writeFile( }else{ return 1; } -#endif #elif defined(AT_FDCWD) && 0 /* utimensat() is not universally available */ /* Recent unix */ struct timespec times[2]; @@ -656,7 +613,7 @@ static int fsdirConnect( (void)pzErr; rc = sqlite3_declare_vtab(db, "CREATE TABLE x" FSDIR_SCHEMA); if( rc==SQLITE_OK ){ - pNew = (fsdir_tab*)sqlite3_malloc( sizeof(*pNew) ); + pNew = (fsdir_tab*)sqlite3_malloc64( sizeof(*pNew) ); if( pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); sqlite3_vtab_config(db, SQLITE_VTAB_DIRECTONLY); @@ -679,7 +636,7 @@ static int fsdirDisconnect(sqlite3_vtab *pVtab){ static int fsdirOpen(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){ fsdir_cursor *pCur; (void)p; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); pCur->iLvl = -1; @@ -1094,6 +1051,154 @@ 
static int fsdirRegister(sqlite3 *db){ # define fsdirRegister(x) SQLITE_OK #endif +/* +** This version of realpath() works on any system. The string +** returned is held in memory allocated using sqlite3_malloc64(). +** The caller is responsible for calling sqlite3_free(). +*/ +static char *portable_realpath(const char *zPath){ +#if !defined(_WIN32) /* BEGIN unix */ + + char *zOut = 0; /* Result */ + char *z; /* Temporary buffer */ +#if defined(PATH_MAX) + char zBuf[PATH_MAX+1]; /* Space for the temporary buffer */ +#endif + + if( zPath==0 ) return 0; +#if defined(PATH_MAX) + z = realpath(zPath, zBuf); + if( z ){ + zOut = sqlite3_mprintf("%s", zBuf); + } +#endif /* defined(PATH_MAX) */ + if( zOut==0 ){ + /* Try POSIX.1-2008 malloc behavior */ + z = realpath(zPath, NULL); + if( z ){ + zOut = sqlite3_mprintf("%s", z); + free(z); + } + } + return zOut; + +#else /* End UNIX, Begin WINDOWS */ + + wchar_t *zPath16; /* UTF16 translation of zPath */ + char *zOut = 0; /* Result */ + wchar_t *z = 0; /* Temporary buffer */ + + if( zPath==0 ) return 0; + + zPath16 = sqlite3_win32_utf8_to_unicode(zPath); + if( zPath16==0 ) return 0; + z = _wfullpath(NULL, zPath16, 0); + sqlite3_free(zPath16); + if( z ){ + zOut = sqlite3_win32_unicode_to_utf8(z); + free(z); + } + return zOut; + +#endif /* End WINDOWS, Begin common code */ +} + +/* +** SQL function: realpath(X) +** +** Try to convert file or pathname X into its real, absolute pathname. +** Return NULL if unable. +** +** The file or directory X is not required to exist. The answer is formed +** by calling system realpath() on the prefix of X that does exist and +** appending the tail of X that does not (yet) exist. 
+*/ +static void realpathFunc( + sqlite3_context *context, + int argc, + sqlite3_value **argv +){ + const char *zPath; /* Original input path */ + char *zCopy; /* An editable copy of zPath */ + char *zOut; /* The result */ + char cSep = 0; /* Separator turned into \000 */ + size_t len; /* Prefix length before cSep */ +#ifdef _WIN32 + const int isWin = 1; +#else + const int isWin = 0; +#endif + + (void)argc; + zPath = (const char*)sqlite3_value_text(argv[0]); + if( zPath==0 ) return; + if( zPath[0]==0 ) zPath = "."; + zCopy = sqlite3_mprintf("%s",zPath); + len = strlen(zCopy); + while( len>1 && (zCopy[len-1]=='/' || (isWin && zCopy[len-1]=='\\')) ){ + len--; + } + zCopy[len] = 0; + while( 1 /*exit-by-break*/ ){ + zOut = portable_realpath(zCopy); + zCopy[len] = cSep; + if( zOut ){ + if( cSep ){ + zOut = sqlite3_mprintf("%z%s",zOut,&zCopy[len]); + } + break; + }else{ + size_t i = len-1; + while( i>0 ){ + if( zCopy[i]=='/' || (isWin && zCopy[i]=='\\') ) break; + i--; + } + if( i<=0 ){ + if( zCopy[0]=='/' ){ + zOut = zCopy; + zCopy = 0; + }else if( (zOut = portable_realpath("."))!=0 ){ + zOut = sqlite3_mprintf("%z/%s", zOut, zCopy); + } + break; + } + cSep = zCopy[i]; + zCopy[i] = 0; + len = i; + } + } + sqlite3_free(zCopy); + if( zOut ){ + /* Simplify any "/./" or "/../" that might have snuck into the + ** pathname due to appending of zCopy. We only have to consider + ** unix "/" separators, because the _wfullpath() system call on + ** Windows will have already done this simplification for us. 
*/ + size_t i, j, n; + n = strlen(zOut); + for(i=j=0; i0 && zOut[j-1]!='/' ){ j--; } + if( j>0 ){ j--; } + i += 2; + continue; + } + } + zOut[j++] = zOut[i]; + } + zOut[j] = 0; + + /* Return the result */ + sqlite3_result_text(context, zOut, -1, sqlite3_free); + } +} + + #ifdef _WIN32 __declspec(dllexport) #endif @@ -1120,13 +1225,10 @@ int sqlite3_fileio_init( if( rc==SQLITE_OK ){ rc = fsdirRegister(db); } + if( rc==SQLITE_OK ){ + rc = sqlite3_create_function(db, "realpath", 1, + SQLITE_UTF8, 0, + realpathFunc, 0, 0); + } return rc; } - -#if defined(FILEIO_WIN32_DLL) && (defined(_WIN32) || defined(WIN32)) -/* To allow a standalone DLL, make test_windirent.c use the same - * redefined SQLite API calls as the above extension code does. - * Just pull in this .c to accomplish this. As a beneficial side - * effect, this extension becomes a single translation unit. */ -# include "test_windirent.c" -#endif diff --git a/ext/misc/fossildelta.c b/ext/misc/fossildelta.c index d24a87700e..e2de0ec40f 100644 --- a/ext/misc/fossildelta.c +++ b/ext/misc/fossildelta.c @@ -38,9 +38,11 @@ SQLITE_EXTENSION_INIT1 #ifndef SQLITE_AMALGAMATION /* -** The "u32" type must be an unsigned 32-bit integer. Adjust this +** The "u32" type must be an unsigned 32-bit integer. "u64" is +** an unsigned 64-bit integer. 
*/ typedef unsigned int u32; +typedef sqlite3_uint64 u64; /* ** Must be a 16-bit value @@ -541,8 +543,8 @@ static int delta_apply( int lenDelta, /* Length of the delta */ char *zOut /* Write the output into this preallocated buffer */ ){ - unsigned int limit; - unsigned int total = 0; + sqlite3_uint64 limit; + sqlite3_uint64 total = 0; #ifdef FOSSIL_ENABLE_DELTA_CKSUM_TEST char *zOrigOut = zOut; #endif @@ -570,7 +572,7 @@ static int delta_apply( /* ERROR: copy exceeds output file size */ return -1; } - if( ofst+cnt > lenSrc ){ + if( (u64)ofst+(u64)cnt > (u64)lenSrc ){ /* ERROR: copy extends past end of input */ return -1; } @@ -841,7 +843,7 @@ static int deltaparsevtabDisconnect(sqlite3_vtab *pVtab){ */ static int deltaparsevtabOpen(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){ deltaparsevtab_cursor *pCur; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); *ppCursor = &pCur->base; diff --git a/ext/misc/fuzzer.c b/ext/misc/fuzzer.c index e16d005d9c..12785e3a40 100644 --- a/ext/misc/fuzzer.c +++ b/ext/misc/fuzzer.c @@ -556,7 +556,7 @@ static int fuzzerConnect( static int fuzzerOpen(sqlite3_vtab *pVTab, sqlite3_vtab_cursor **ppCursor){ fuzzer_vtab *p = (fuzzer_vtab*)pVTab; fuzzer_cursor *pCur; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); pCur->pVtab = p; @@ -617,12 +617,12 @@ static int fuzzerRender( int *pnBuf /* Size of the buffer */ ){ const fuzzer_rule *pRule = pStem->pRule; - int n; /* Size of output term without nul-term */ + sqlite3_int64 n; /* Size of output term without nul-term */ char *z; /* Buffer to assemble output term in */ n = pStem->nBasis + pRule->nTo - pRule->nFrom; if( (*pnBuf)-1000 && e<1000 ){ sqlite3_result_double(context, 0.0); return; @@ -259,6 +259,38 @@ static void ieee754func_to_blob( } } +/* +** Functions to convert between 
64-bit integers and floats. +** +** The bit patterns are copied. The numeric values are different. +*/ +static void ieee754func_from_int( + sqlite3_context *context, + int argc, + sqlite3_value **argv +){ + UNUSED_PARAMETER(argc); + if( sqlite3_value_type(argv[0])==SQLITE_INTEGER ){ + double r; + sqlite3_int64 v = sqlite3_value_int64(argv[0]); + memcpy(&r, &v, sizeof(r)); + sqlite3_result_double(context, r); + } +} +static void ieee754func_to_int( + sqlite3_context *context, + int argc, + sqlite3_value **argv +){ + UNUSED_PARAMETER(argc); + if( sqlite3_value_type(argv[0])==SQLITE_FLOAT ){ + double r = sqlite3_value_double(argv[0]); + sqlite3_uint64 v; + memcpy(&v, &r, sizeof(v)); + sqlite3_result_int64(context, v); + } +} + /* ** SQL Function: ieee754_inc(r,N) ** @@ -311,6 +343,8 @@ int sqlite3_ieee_init( { "ieee754_exponent", 1, 2, ieee754func }, { "ieee754_to_blob", 1, 0, ieee754func_to_blob }, { "ieee754_from_blob", 1, 0, ieee754func_from_blob }, + { "ieee754_to_int", 1, 0, ieee754func_to_int }, + { "ieee754_from_int", 1, 0, ieee754func_from_int }, { "ieee754_inc", 2, 0, ieee754inc }, }; unsigned int i; diff --git a/ext/misc/memstat.c b/ext/misc/memstat.c index 8e69b46955..5002a1359c 100644 --- a/ext/misc/memstat.c +++ b/ext/misc/memstat.c @@ -85,7 +85,7 @@ static int memstatConnect( rc = sqlite3_declare_vtab(db,"CREATE TABLE x(name,schema,value,hiwtr)"); if( rc==SQLITE_OK ){ - pNew = sqlite3_malloc( sizeof(*pNew) ); + pNew = sqlite3_malloc64( sizeof(*pNew) ); *ppVtab = (sqlite3_vtab*)pNew; if( pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); @@ -107,7 +107,7 @@ static int memstatDisconnect(sqlite3_vtab *pVtab){ */ static int memstatOpen(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){ memstat_cursor *pCur; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); pCur->db = ((memstat_vtab*)p)->db; diff --git a/ext/misc/prefixes.c b/ext/misc/prefixes.c 
index e6517e7195..3c47933c06 100644 --- a/ext/misc/prefixes.c +++ b/ext/misc/prefixes.c @@ -75,7 +75,7 @@ static int prefixesConnect( "CREATE TABLE prefixes(prefix TEXT, original_string TEXT HIDDEN)" ); if( rc==SQLITE_OK ){ - pNew = sqlite3_malloc( sizeof(*pNew) ); + pNew = sqlite3_malloc64( sizeof(*pNew) ); *ppVtab = (sqlite3_vtab*)pNew; if( pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); @@ -98,7 +98,7 @@ static int prefixesDisconnect(sqlite3_vtab *pVtab){ */ static int prefixesOpen(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){ prefixes_cursor *pCur; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); *ppCursor = &pCur->base; diff --git a/ext/misc/qpvtab.c b/ext/misc/qpvtab.c index b7c2a05126..15071883c8 100644 --- a/ext/misc/qpvtab.c +++ b/ext/misc/qpvtab.c @@ -152,7 +152,7 @@ static int qpvtabConnect( #define QPVTAB_FLAGS 11 #define QPVTAB_NONE 12 if( rc==SQLITE_OK ){ - pNew = sqlite3_malloc( sizeof(*pNew) ); + pNew = sqlite3_malloc64( sizeof(*pNew) ); *ppVtab = (sqlite3_vtab*)pNew; if( pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); @@ -174,7 +174,7 @@ static int qpvtabDisconnect(sqlite3_vtab *pVtab){ */ static int qpvtabOpen(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){ qpvtab_cursor *pCur; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); *ppCursor = &pCur->base; diff --git a/ext/misc/regexp.c b/ext/misc/regexp.c index f1babf4ab7..4f40e3f1c7 100644 --- a/ext/misc/regexp.c +++ b/ext/misc/regexp.c @@ -32,7 +32,7 @@ ** ^X X occurring at the beginning of the string ** X$ X occurring at the end of the string ** . Match any single character -** \c Character c where c is one of \{}()[]|*+?. +** \c Character c where c is one of \{}()[]|*+?-. ** \c C-language escapes for c in afnrtv. 
ex: \t or \n ** \uXXXX Where XXXX is exactly 4 hex digits, unicode value XXXX ** \xXX Where XX is exactly 2 hex digits, unicode value XX @@ -417,7 +417,7 @@ static int re_hex(int c, int *pV){ ** return its interpretation. */ static unsigned re_esc_char(ReCompiled *p){ - static const char zEsc[] = "afnrtv\\()*.+?[$^{|}]"; + static const char zEsc[] = "afnrtv\\()*.+?[$^{|}]-"; static const char zTrans[] = "\a\f\n\r\t\v"; int i, v = 0; char c; @@ -676,7 +676,7 @@ static const char *re_compile( int i, j; *ppRe = 0; - pRe = sqlite3_malloc( sizeof(*pRe) ); + pRe = sqlite3_malloc64( sizeof(*pRe) ); if( pRe==0 ){ return "out of memory"; } @@ -740,11 +740,18 @@ static const char *re_compile( } /* -** Compute a reasonable limit on the length of the REGEXP NFA. +** The value of LIMIT_MAX_PATTERN_LENGTH. */ static int re_maxlen(sqlite3_context *context){ sqlite3 *db = sqlite3_context_db_handle(context); - return 75 + sqlite3_limit(db, SQLITE_LIMIT_LIKE_PATTERN_LENGTH,-1)/2; + return sqlite3_limit(db, SQLITE_LIMIT_LIKE_PATTERN_LENGTH,-1); +} + +/* +** Maximum NFA size given a maximum pattern length. 
+*/ +static int re_maxnfa(int mxlen){ + return 75+mxlen/2; } /* @@ -770,10 +777,17 @@ static void re_sql_func( (void)argc; /* Unused */ pRe = sqlite3_get_auxdata(context, 0); if( pRe==0 ){ + int mxLen = re_maxlen(context); + int nPattern; zPattern = (const char*)sqlite3_value_text(argv[0]); if( zPattern==0 ) return; - zErr = re_compile(&pRe, zPattern, re_maxlen(context), - sqlite3_user_data(context)!=0); + nPattern = sqlite3_value_bytes(argv[0]); + if( nPattern>mxLen ){ + zErr = "REGEXP pattern too big"; + }else{ + zErr = re_compile(&pRe, zPattern, re_maxnfa(mxLen), + sqlite3_user_data(context)!=0); + } if( zErr ){ re_free(pRe); sqlite3_result_error(context, zErr, -1); @@ -814,7 +828,6 @@ static void re_bytecode_func( int i; int n; char *z; - (void)argc; static const char *ReOpName[] = { "EOF", "MATCH", @@ -837,9 +850,10 @@ static void re_bytecode_func( "ATSTART", }; + (void)argc; zPattern = (const char*)sqlite3_value_text(argv[0]); if( zPattern==0 ) return; - zErr = re_compile(&pRe, zPattern, re_maxlen(context), + zErr = re_compile(&pRe, zPattern, re_maxnfa(re_maxlen(context)), sqlite3_user_data(context)!=0); if( zErr ){ re_free(pRe); diff --git a/ext/misc/scrub.c b/ext/misc/scrub.c index 9fbf2aed4a..2406d39f25 100644 --- a/ext/misc/scrub.c +++ b/ext/misc/scrub.c @@ -92,7 +92,7 @@ static void scrubBackupErr(ScrubState *p, const char *zFormat, ...){ static u8 *scrubBackupAllocPage(ScrubState *p){ u8 *pPage; if( p->rcErr ) return 0; - pPage = sqlite3_malloc( p->szPage ); + pPage = sqlite3_malloc64( p->szPage ); if( pPage==0 ) p->rcErr = SQLITE_NOMEM; return pPage; } diff --git a/ext/misc/series.c b/ext/misc/series.c index ffdb23c1a0..ac8f4597f0 100644 --- a/ext/misc/series.c +++ b/ext/misc/series.c @@ -239,7 +239,7 @@ static int seriesConnect( rc = sqlite3_declare_vtab(db, "CREATE TABLE x(value,start hidden,stop hidden,step hidden)"); if( rc==SQLITE_OK ){ - pNew = *ppVtab = sqlite3_malloc( sizeof(*pNew) ); + pNew = *ppVtab = sqlite3_malloc64( sizeof(*pNew) ); if( 
pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); sqlite3_vtab_config(db, SQLITE_VTAB_INNOCUOUS); @@ -261,7 +261,7 @@ static int seriesDisconnect(sqlite3_vtab *pVtab){ static int seriesOpen(sqlite3_vtab *pUnused, sqlite3_vtab_cursor **ppCursor){ series_cursor *pCur; (void)pUnused; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); *ppCursor = &pCur->base; diff --git a/ext/misc/sha1.c b/ext/misc/sha1.c index 07d7970609..fb8f625f51 100644 --- a/ext/misc/sha1.c +++ b/ext/misc/sha1.c @@ -230,13 +230,16 @@ static void hash_finish( *****************************************************************************/ /* -** Implementation of the sha1(X) function. +** Two SQL functions: sha1(X) and sha1b(X). ** -** Return a lower-case hexadecimal rendering of the SHA1 hash of the -** argument X. If X is a BLOB, it is hashed as is. For all other +** sha1(X) returns a lower-case hexadecimal rendering of the SHA1 hash +** of the argument X. If X is a BLOB, it is hashed as is. For all other ** types of input, X is converted into a UTF-8 string and the string -** is hash without the trailing 0x00 terminator. The hash of a NULL +** is hashed without the trailing 0x00 terminator. The hash of a NULL ** value is NULL. +** +** sha1b(X) is the same except that it returns a 20-byte BLOB containing +** the binary hash instead of a hexadecimal string. 
*/ static void sha1Func( sqlite3_context *context, @@ -246,22 +249,27 @@ static void sha1Func( SHA1Context cx; int eType = sqlite3_value_type(argv[0]); int nByte = sqlite3_value_bytes(argv[0]); + const unsigned char *pData; char zOut[44]; assert( argc==1 ); if( eType==SQLITE_NULL ) return; hash_init(&cx); if( eType==SQLITE_BLOB ){ - hash_step(&cx, sqlite3_value_blob(argv[0]), nByte); + pData = (const unsigned char*)sqlite3_value_blob(argv[0]); }else{ - hash_step(&cx, sqlite3_value_text(argv[0]), nByte); + pData = (const unsigned char*)sqlite3_value_text(argv[0]); } + if( pData==0 ) return; + hash_step(&cx, pData, nByte); if( sqlite3_user_data(context)!=0 ){ + /* sha1b() - binary result */ hash_finish(&cx, zOut, 1); sqlite3_result_blob(context, zOut, 20, SQLITE_TRANSIENT); }else{ + /* sha1() - hexadecimal text result */ hash_finish(&cx, zOut, 0); - sqlite3_result_blob(context, zOut, 40, SQLITE_TRANSIENT); + sqlite3_result_text(context, zOut, 40, SQLITE_TRANSIENT); } } @@ -315,6 +323,7 @@ static void sha1QueryFunc( } nCol = sqlite3_column_count(pStmt); z = sqlite3_sql(pStmt); + if( z==0 ) z = ""; n = (int)strlen(z); hash_step_vformat(&cx,"S%d:",n); hash_step(&cx,(unsigned char*)z,n); diff --git a/ext/misc/sqlar.c b/ext/misc/sqlar.c index 9f726f0b89..30ccc4f550 100644 --- a/ext/misc/sqlar.c +++ b/ext/misc/sqlar.c @@ -46,7 +46,7 @@ static void sqlarCompressFunc( uLongf nOut = compressBound(nData); Bytef *pOut; - pOut = (Bytef*)sqlite3_malloc(nOut); + pOut = (Bytef*)sqlite3_malloc64(nOut); if( pOut==0 ){ sqlite3_result_error_nomem(context); return; @@ -84,14 +84,14 @@ static void sqlarUncompressFunc( sqlite3_int64 sz; assert( argc==2 ); - sz = sqlite3_value_int(argv[1]); + sz = sqlite3_value_int64(argv[1]); if( sz<=0 || sz==(nData = sqlite3_value_bytes(argv[0])) ){ sqlite3_result_value(context, argv[0]); }else{ uLongf szf = sz; const Bytef *pData= sqlite3_value_blob(argv[0]); - Bytef *pOut = sqlite3_malloc(sz); + Bytef *pOut = sqlite3_malloc64(sz); if( pOut==0 ){ 
sqlite3_result_error_nomem(context); }else if( Z_OK!=uncompress(pOut, &szf, pData, nData) ){ diff --git a/ext/misc/sqlite3_stdio.c b/ext/misc/sqlite3_stdio.c index c9bceb1942..049dd51740 100644 --- a/ext/misc/sqlite3_stdio.c +++ b/ext/misc/sqlite3_stdio.c @@ -101,8 +101,8 @@ FILE *sqlite3_fopen(const char *zFilename, const char *zMode){ sz1 = (int)strlen(zFilename); sz2 = (int)strlen(zMode); - b1 = sqlite3_malloc( (sz1+1)*sizeof(b1[0]) ); - b2 = sqlite3_malloc( (sz2+1)*sizeof(b1[0]) ); + b1 = sqlite3_malloc64( (sz1+1)*sizeof(b1[0]) ); + b2 = sqlite3_malloc64( (sz2+1)*sizeof(b1[0]) ); if( b1 && b2 ){ sz1 = MultiByteToWideChar(CP_UTF8, 0, zFilename, sz1, b1, sz1); b1[sz1] = 0; @@ -127,8 +127,8 @@ FILE *sqlite3_popen(const char *zCommand, const char *zMode){ sz1 = (int)strlen(zCommand); sz2 = (int)strlen(zMode); - b1 = sqlite3_malloc( (sz1+1)*sizeof(b1[0]) ); - b2 = sqlite3_malloc( (sz2+1)*sizeof(b1[0]) ); + b1 = sqlite3_malloc64( (sz1+1)*sizeof(b1[0]) ); + b2 = sqlite3_malloc64( (sz2+1)*sizeof(b1[0]) ); if( b1 && b2 ){ sz1 = MultiByteToWideChar(CP_UTF8, 0, zCommand, sz1, b1, sz1); b1[sz1] = 0; @@ -151,7 +151,7 @@ char *sqlite3_fgets(char *buf, int sz, FILE *in){ ** that into UTF-8. Otherwise, non-ASCII characters all get translated ** into '?'. */ - wchar_t *b1 = sqlite3_malloc( sz*sizeof(wchar_t) ); + wchar_t *b1 = sqlite3_malloc64( sz*sizeof(wchar_t) ); if( b1==0 ) return 0; #ifdef SQLITE_USE_W32_FOR_CONSOLE_IO DWORD nRead = 0; @@ -226,7 +226,7 @@ int sqlite3_fputs(const char *z, FILE *out){ ** to the console on Windows. */ int sz = (int)strlen(z); - wchar_t *b1 = sqlite3_malloc( (sz+1)*sizeof(wchar_t) ); + wchar_t *b1 = sqlite3_malloc64( (sz+1)*sizeof(wchar_t) ); if( b1==0 ) return 0; sz = MultiByteToWideChar(CP_UTF8, 0, z, sz, b1, sz); b1[sz] = 0; @@ -258,7 +258,7 @@ int sqlite3_fputs(const char *z, FILE *out){ /* -** Work-alike for fprintf() from the standard C library. +** Work-alikes for fprintf() and vfprintf() from the standard C library. 
*/ int sqlite3_fprintf(FILE *out, const char *zFormat, ...){ int rc; @@ -285,6 +285,24 @@ int sqlite3_fprintf(FILE *out, const char *zFormat, ...){ } return rc; } +int sqlite3_vfprintf(FILE *out, const char *zFormat, va_list ap){ + int rc; + if( UseWtextForOutput(out) ){ + /* When writing to the command-prompt in Windows, it is necessary + ** to use _O_WTEXT input mode and write UTF-16 characters. + */ + char *z; + z = sqlite3_vmprintf(zFormat, ap); + sqlite3_fputs(z, out); + rc = (int)strlen(z); + sqlite3_free(z); + }else{ + /* Writing to a file or other destination, just write bytes without + ** any translation. */ + rc = vfprintf(out, zFormat, ap); + } + return rc; +} /* ** Set the mode for an output stream. mode argument is typically _O_BINARY or diff --git a/ext/misc/sqlite3_stdio.h b/ext/misc/sqlite3_stdio.h index dd0eefad04..75368df9f8 100644 --- a/ext/misc/sqlite3_stdio.h +++ b/ext/misc/sqlite3_stdio.h @@ -31,6 +31,7 @@ #ifdef _WIN32 /**** Definitions For Windows ****/ #include +#include #include FILE *sqlite3_fopen(const char *zFilename, const char *zMode); @@ -38,6 +39,7 @@ FILE *sqlite3_popen(const char *zCommand, const char *type); char *sqlite3_fgets(char *s, int size, FILE *stream); int sqlite3_fputs(const char *s, FILE *stream); int sqlite3_fprintf(FILE *stream, const char *format, ...); +int sqlite3_vfprintf(FILE *stream, const char *format, va_list); void sqlite3_fsetmode(FILE *stream, int mode); @@ -49,6 +51,7 @@ void sqlite3_fsetmode(FILE *stream, int mode); #define sqlite3_fgets fgets #define sqlite3_fputs fputs #define sqlite3_fprintf fprintf +#define sqlite3_vfprintf vfprintf #define sqlite3_fsetmode(F,X) /*no-op*/ #endif diff --git a/ext/misc/stmtrand.c b/ext/misc/stmtrand.c index b5e3b89a32..7e52ef25b2 100644 --- a/ext/misc/stmtrand.c +++ b/ext/misc/stmtrand.c @@ -52,7 +52,7 @@ static void stmtrandFunc( p = (Stmtrand*)sqlite3_get_auxdata(context, STMTRAND_KEY); if( p==0 ){ unsigned int seed; - p = sqlite3_malloc( sizeof(*p) ); + p = 
sqlite3_malloc64( sizeof(*p) ); if( p==0 ){ sqlite3_result_error_nomem(context); return; diff --git a/ext/misc/templatevtab.c b/ext/misc/templatevtab.c index 5865f5214b..ba79373437 100644 --- a/ext/misc/templatevtab.c +++ b/ext/misc/templatevtab.c @@ -101,7 +101,7 @@ static int templatevtabConnect( #define TEMPLATEVTAB_A 0 #define TEMPLATEVTAB_B 1 if( rc==SQLITE_OK ){ - pNew = sqlite3_malloc( sizeof(*pNew) ); + pNew = sqlite3_malloc64( sizeof(*pNew) ); *ppVtab = (sqlite3_vtab*)pNew; if( pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); @@ -123,7 +123,7 @@ static int templatevtabDisconnect(sqlite3_vtab *pVtab){ */ static int templatevtabOpen(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){ templatevtab_cursor *pCur; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); *ppCursor = &pCur->base; diff --git a/ext/misc/tmstmpvfs.c b/ext/misc/tmstmpvfs.c new file mode 100644 index 0000000000..6f1af36f74 --- /dev/null +++ b/ext/misc/tmstmpvfs.c @@ -0,0 +1,1042 @@ +/* +** 2026-01-05 +** +** The author disclaims copyright to this source code. In place of +** a legal notice, here is a blessing: +** +** May you do good and not evil. +** May you find forgiveness for yourself and forgive others. +** May you share freely, never taking more than you give. +** +****************************************************************************** +** +** This file implements a VFS shim that writes a timestamp and other tracing +** information into 16 bytes of reserved space at the end of each page of the +** database file. +** +** The VFS also tries to generate log-files with names of the form: +** +** $(DATABASE)-tmstmp/$(TIME)-$(PID)-$(ID) +** +** Log files are only generated if directory $(DATABASE)-tmstmp exists. 
+** The name of each log file is the current ISO 8601 time in milliseconds, +** the process ID, and a random 32-bit value (to disambiguate multiple +** connections from the same process) separated by dashes. The log file +** contains 16-byte records for various events, such as opening or closing +** of the database or WAL file, writes to the WAL file, checkpoints, and +** similar. The logfile is only generated if the connection attempts to +** modify the database. There is a separate log file for each open database +** connection. +** +** COMPILING +** +** To build this extension as a separately loaded shared library or +** DLL, use compiler command-lines similar to the following: +** +** (linux) gcc -fPIC -shared tmstmpvfs.c -o tmstmpvfs.so +** (mac) clang -fPIC -dynamiclib tmstmpvfs.c -o tmstmpvfs.dylib +** (windows) cl tmstmpvfs.c -link -dll -out:tmstmpvfs.dll +** +** You may want to add additional compiler options, of course, +** according to the needs of your project. +** +** Another option is to statically link both SQLite and this extension +** into your application. If both this file and "sqlite3.c" are statically +** linked, and if "sqlite3.c" is compiled with an option like: +** +** -DSQLITE_EXTRA_INIT=sqlite3_register_tmstmpvfs +** +** Then SQLite will use the tmstmp VFS by default throughout your +** application. +** +** LOADING +** +** To load this extension as a shared library, you first have to +** bring up a dummy SQLite database connection to use as the argument +** to the sqlite3_load_extension() API call. Then you invoke the +** sqlite3_load_extension() API and shut down the dummy database +** connection. All subsequent database connections that are opened +** will include this extension. For example: +** +** sqlite3 *db; +** sqlite3_open(":memory:", &db); +** sqlite3_load_extension(db, "./tmstmpvfs", 0, 0); +** sqlite3_close(db); +** +** Tmstmpvfs is a VFS Shim. 
When loaded, "tmstmpvfs" becomes the new +** default VFS and it uses the prior default VFS as the next VFS +** down in the stack. This is normally what you want. However, in +** complex situations where multiple VFS shims are being loaded, +** it might be important to ensure that tmstmpvfs is loaded in the +** correct order so that it takes its intended place in the default VFS +** shim stack. +** +** When running the CLI, you can load this extension at invocation by +** adding a command-line option like this: "--vfs ./tmstmpvfs.so". +** The --vfs option usually specifies the symbolic name of a built-in VFS. +** But if the argument to --vfs is not a built-in VFS but is instead the +** name of a file, the CLI tries to load that file as an extension. Note +** that the full name of the extension file must be provided, including +** the ".so" or ".dylib" or ".dll" suffix. +** +** An application can see if the tmstmpvfs is being used by examining +** the results from SQLITE_FCNTL_VFSNAME (or the .vfsname command in +** the CLI). If the answer includes "tmstmp", then this VFS is being +** used. +** +** USING +** +** Open database connections using the sqlite3_open() or +** sqlite3_open_v2() interfaces, as normal. Ordinary database files +** (without a timestamp) will operate normally. +** +** Timestamping only works on databases that have a reserve-bytes +** value of exactly 16. The default value for reserve-bytes is 0. +** Hence, newly created database files will omit the timestamp by +** default. To create a database that includes a timestamp, change +** the reserve-bytes value to 16 by running: +** +** int n = 16; +** sqlite3_file_control(db, 0, SQLITE_FCNTL_RESERVE_BYTES, &n); +** +** If you do this immediately after creating a new database file, +** before anything else has been written into the file, then that +** might be all that you need to do. 
Otherwise, the API call +** above should be followed by: +** +** sqlite3_exec(db, "VACUUM", 0, 0, 0); +** +** It never hurts to run the VACUUM, even if you don't need it. +** +** From the CLI, use the ".filectrl reserve_bytes 16" command, +** followed by "VACUUM;". +** +** SQLite allows the number of reserve-bytes to be increased, but +** not decreased. If you want to restore the reserve-bytes to 0 +** (to disable tmstmpvfs), the easiest approach is to use VACUUM INTO +** with a URI filename as the argument and include the "reserve=0" query +** parameter on the URI. Example: +** +** VACUUM INTO 'file:notimestamps.db?reserve=0'; +** +** Then switch over to using the new database file. The reserve=0 query +** parameter only works on SQLite 3.52.0 and later. +** +** IMPLEMENTATION NOTES +** +** The timestamp information is stored in the last 16 bytes of each page. +** This module only operates if the "bytes of reserved space on each page" +** value at offset 20 of the SQLite database header is exactly 16. If +** the reserved-space value is not 16, no timestamp information is added +** to database pages. Some, but not all, logfile entries will still be +** made, but the size of the logs will be greatly reduced. +** +** The timestamp layout is as follows: +** +** bytes 0,1 Zero. Reserved for future expansion +** bytes 2-7 Milliseconds since the Unix Epoch +** bytes 8-11 WAL frame number +** byte 12 0: WAL write 2: rollback write +** bytes 13-15 Lower 24 bits of Salt-1 +** +** For transactions that occur in rollback mode, only the timestamp +** in bytes 2-7 and byte 12 are non-zero. Byte 12 is set to 2 for +** rollback writes. +** +** The 16-byte tag is added to each database page when the content +** is written into the database file itself. This shim does not make +** any changes to the page as it is written to the WAL file, since +** that would mess up the WAL checksum. 
+** +** LOGGING +** +** An open database connection that attempts to write to the database +** will create a log file if a directory named $(DATABASE)-tmstmp exists. +** The name of the log file is: +** +** $(TIME)-$(PID)-$(RANDOM) +** +** Where TIME is an ISO 8601 date in milliseconds with no punctuation, +** PID is the process ID, and RANDOM is a 32-bit random number expressed +** as hexadecimal. +** +** The log consists of 16-byte records. Each record consists of five +** unsigned integers: +** +** 1 1 6 4 4 <--- bytes +** op a1 ts a2 a3 +** +** The meanings of the a1-a3 values depend on op. ts is the timestamp +** in milliseconds since the Unix epoch (1970-01-01 00:00:00). +** Opcodes are defined by the ELOG_* #defines below. +** +** ELOG_OPEN_DB "Open a connection to the database file" +** op = 0x01 +** a2 = process-ID +** +** ELOG_OPEN_WAL "Open a connection to the -wal file" +** op = 0x02 +** a2 = process-ID +** +** ELOG_WAL_PAGE "New page added to the WAL file" +** op = 0x03 +** a1 = 1 if last page of a txn. 0 otherwise. +** a2 = page number in the DB file +** a3 = frame number in the WAL file +** +** ELOG_DB_PAGE "Database page updated using rollback mode" +** op = 0x04 +** a2 = page number in the DB file +** +** ELOG_CKPT_START "Start of a checkpoint operation" +** op = 0x05 +** +** ELOG_CKPT_PAGE "Page xfer from WAL to database" +** op = 0x06 +** a2 = database page number +** a3 = frame number in the WAL file +** +** ELOG_CKPT_DONE "End of a checkpoint operation" +** op = 0x07 +** +** ELOG_WAL_RESET "WAL file header overwritten" +** op = 0x08 +** a3 = Salt1 value +** +** ELOG_CLOSE_WAL "Close the WAL file connection" +** op = 0x0e +** +** ELOG_CLOSE_DB "Close the DB connection" +** op = 0x0f +** +** VIEWING TIMESTAMPS AND LOGS +** +** The command-line utility at tool/showtmlog.c will read and display +** the content of one or more tmstmpvfs.c log files. 
If all of the +** log files are stored in directory $(DATABASE)-tmstmp, then you can +** view them all using a command like the one shown below (with an extra "?" +** inserted on the wildcard to avoid closing the C-language comment +** that contains this text): +** +** showtmlog $(DATABASE)-tmstmp/?* +** +** The command-line utility at tool/showdb.c can be used to show the +** timestamps on pages of a database file, using a command like this: +** +** showdb --tmstmp $(DATABASE) pgidx +** +** The command above shows the timestamp and the intended use of every +** page in the database, in human-readable form. If you also add +** the --csv option to the command above, then the command generates +** a Comma-Separated-Value (CSV) file as output, which contains a +** decoding of the complete timestamp tag on each page of the database. +** This CSV file can be easily imported into another SQLite database +** using a CLI command like the following: +** +** .import --csv '|showdb --tmstmp --csv orig.db pgidx' ts_table +** +** In the command above, the database containing the timestamps is +** "orig.db" and the content is imported into a new table named "ts_table". +** The "ts_table" is created automatically, using the column names found +** in the first line of the CSV file. All columns of the automatically +** created ts_table are of type TEXT. 
It might make more sense to +** create the table yourself, using more sensible datatypes, like this: +** +** CREATE TABLE ts_table ( +** pgno INT, -- page number +** tm REAL, -- seconds since 1970-01-01 +** frame INT, -- WAL frame number +** flg INT, -- flag (tag byte 12) +** salt INT, -- WAL salt (tag bytes 13-15) +** parent INT, -- Parent page number +** child INT, -- Index of this page in its parent +** ovfl INT, -- Index of this page on the overflow chain +** txt TEXT -- Description of this page +** ); +** +** Then import using: +** +** .import --csv --skip 1 '|showdb --tmstmp --csv orig.db pgidx' ts_table +** +** Note the addition of the "--skip 1" option on ".import" to bypass the +** first line of the CSV file that contains the column names. +** +** Both programs "showdb" and "showtmlog" can be built by running +** "make showtmlog showdb" from the top-level of a recent SQLite +** source tree. +*/ +#if defined(SQLITE_AMALGAMATION) && !defined(SQLITE_TMSTMPVFS_STATIC) +# define SQLITE_TMSTMPVFS_STATIC +#endif +#ifdef SQLITE_TMSTMPVFS_STATIC +# include "sqlite3.h" +#else +# include "sqlite3ext.h" + SQLITE_EXTENSION_INIT1 +#endif +#include +#include +#include + +/* +** Forward declaration of objects used by this utility +*/ +typedef struct sqlite3_vfs TmstmpVfs; +typedef struct TmstmpFile TmstmpFile; +typedef struct TmstmpLog TmstmpLog; + +/* +** Bytes of reserved space used by this extension +*/ +#define TMSTMP_RESERVE 16 + +/* +** The magic number used to identify TmstmpFile objects +*/ +#define TMSTMP_MAGIC 0x2a87b72d + +/* +** Useful datatype abbreviations +*/ +#if !defined(SQLITE_AMALGAMATION) + typedef unsigned char u8; + typedef unsigned int u32; +#endif + +/* +** Current process id +*/ +#if defined(_WIN32) +# include +# define GETPID (u32)GetCurrentProcessId() +#else +# include +# define GETPID (u32)getpid() +#endif + +/* Access to a lower-level VFS that (might) implement dynamic loading, +** access to randomness, etc. 
+*/ +#define ORIGVFS(p) ((sqlite3_vfs*)((p)->pAppData)) +#define ORIGFILE(p) ((sqlite3_file*)(((TmstmpFile*)(p))+1)) + +/* Information for the tmstmp log file. */ +struct TmstmpLog { + char *zLogname; /* Log filename */ + FILE *log; /* Open log file */ + int n; /* Bytes of a[] used */ + unsigned char a[16*6]; /* Buffered header for the log */ +}; + +/* An open WAL or DB file */ +struct TmstmpFile { + sqlite3_file base; /* IO methods */ + u32 uMagic; /* Magic number for sanity checking */ + u32 salt1; /* Last WAL salt-1 value */ + u32 iFrame; /* Last WAL frame number */ + u32 pgno; /* Current page number */ + u32 pgsz; /* Size of each page, in bytes */ + u8 isWal; /* True if this is a WAL file */ + u8 isDb; /* True if this is a DB file */ + u8 isCommit; /* Last WAL frame header was a transaction commit */ + u8 hasCorrectReserve; /* File has the correct reserve size */ + u8 inCkpt; /* True if in a checkpoint */ + TmstmpLog *pLog; /* Log file */ + TmstmpFile *pPartner; /* DB->WAL or WAL->DB mapping */ + sqlite3_int64 iOfst; /* Offset of last WAL frame header */ + sqlite3_vfs *pSubVfs; /* Underlying VFS */ +}; + +/* +** Event log opcodes +*/ +#define ELOG_OPEN_DB 0x01 +#define ELOG_OPEN_WAL 0x02 +#define ELOG_WAL_PAGE 0x03 +#define ELOG_DB_PAGE 0x04 +#define ELOG_CKPT_START 0x05 +#define ELOG_CKPT_PAGE 0x06 +#define ELOG_CKPT_DONE 0x07 +#define ELOG_WAL_RESET 0x08 +#define ELOG_CLOSE_WAL 0x0e +#define ELOG_CLOSE_DB 0x0f + +/* +** Methods for TmstmpFile +*/ +static int tmstmpClose(sqlite3_file*); +static int tmstmpRead(sqlite3_file*, void*, int iAmt, sqlite3_int64 iOfst); +static int tmstmpWrite(sqlite3_file*,const void*,int iAmt, sqlite3_int64 iOfst); +static int tmstmpTruncate(sqlite3_file*, sqlite3_int64 size); +static int tmstmpSync(sqlite3_file*, int flags); +static int tmstmpFileSize(sqlite3_file*, sqlite3_int64 *pSize); +static int tmstmpLock(sqlite3_file*, int); +static int tmstmpUnlock(sqlite3_file*, int); +static int tmstmpCheckReservedLock(sqlite3_file*, int 
*pResOut); +static int tmstmpFileControl(sqlite3_file*, int op, void *pArg); +static int tmstmpSectorSize(sqlite3_file*); +static int tmstmpDeviceCharacteristics(sqlite3_file*); +static int tmstmpShmMap(sqlite3_file*, int iPg, int pgsz, int, void volatile**); +static int tmstmpShmLock(sqlite3_file*, int offset, int n, int flags); +static void tmstmpShmBarrier(sqlite3_file*); +static int tmstmpShmUnmap(sqlite3_file*, int deleteFlag); +static int tmstmpFetch(sqlite3_file*, sqlite3_int64 iOfst, int iAmt, void **pp); +static int tmstmpUnfetch(sqlite3_file*, sqlite3_int64 iOfst, void *p); + +/* +** Methods for TmstmpVfs +*/ +static int tmstmpOpen(sqlite3_vfs*, const char *, sqlite3_file*, int , int *); +static int tmstmpDelete(sqlite3_vfs*, const char *zName, int syncDir); +static int tmstmpAccess(sqlite3_vfs*, const char *zName, int flags, int *); +static int tmstmpFullPathname(sqlite3_vfs*, const char *zName, int, char *zOut); +static void *tmstmpDlOpen(sqlite3_vfs*, const char *zFilename); +static void tmstmpDlError(sqlite3_vfs*, int nByte, char *zErrMsg); +static void (*tmstmpDlSym(sqlite3_vfs *pVfs, void *p, const char*zSym))(void); +static void tmstmpDlClose(sqlite3_vfs*, void*); +static int tmstmpRandomness(sqlite3_vfs*, int nByte, char *zOut); +static int tmstmpSleep(sqlite3_vfs*, int microseconds); +static int tmstmpCurrentTime(sqlite3_vfs*, double*); +static int tmstmpGetLastError(sqlite3_vfs*, int, char *); +static int tmstmpCurrentTimeInt64(sqlite3_vfs*, sqlite3_int64*); +static int tmstmpSetSystemCall(sqlite3_vfs*, const char*,sqlite3_syscall_ptr); +static sqlite3_syscall_ptr tmstmpGetSystemCall(sqlite3_vfs*, const char *z); +static const char *tmstmpNextSystemCall(sqlite3_vfs*, const char *zName); + +static sqlite3_vfs tmstmp_vfs = { + 3, /* iVersion (set when registered) */ + 0, /* szOsFile (set when registered) */ + 1024, /* mxPathname */ + 0, /* pNext */ + "tmstmpvfs", /* zName */ + 0, /* pAppData (set when registered) */ + tmstmpOpen, /* xOpen */ + 
tmstmpDelete, /* xDelete */ + tmstmpAccess, /* xAccess */ + tmstmpFullPathname, /* xFullPathname */ + tmstmpDlOpen, /* xDlOpen */ + tmstmpDlError, /* xDlError */ + tmstmpDlSym, /* xDlSym */ + tmstmpDlClose, /* xDlClose */ + tmstmpRandomness, /* xRandomness */ + tmstmpSleep, /* xSleep */ + tmstmpCurrentTime, /* xCurrentTime */ + tmstmpGetLastError, /* xGetLastError */ + tmstmpCurrentTimeInt64, /* xCurrentTimeInt64 */ + tmstmpSetSystemCall, /* xSetSystemCall */ + tmstmpGetSystemCall, /* xGetSystemCall */ + tmstmpNextSystemCall /* xNextSystemCall */ +}; + +static const sqlite3_io_methods tmstmp_io_methods = { + 3, /* iVersion */ + tmstmpClose, /* xClose */ + tmstmpRead, /* xRead */ + tmstmpWrite, /* xWrite */ + tmstmpTruncate, /* xTruncate */ + tmstmpSync, /* xSync */ + tmstmpFileSize, /* xFileSize */ + tmstmpLock, /* xLock */ + tmstmpUnlock, /* xUnlock */ + tmstmpCheckReservedLock, /* xCheckReservedLock */ + tmstmpFileControl, /* xFileControl */ + tmstmpSectorSize, /* xSectorSize */ + tmstmpDeviceCharacteristics, /* xDeviceCharacteristics */ + tmstmpShmMap, /* xShmMap */ + tmstmpShmLock, /* xShmLock */ + tmstmpShmBarrier, /* xShmBarrier */ + tmstmpShmUnmap, /* xShmUnmap */ + tmstmpFetch, /* xFetch */ + tmstmpUnfetch /* xUnfetch */ +}; + +/* +** Write a 6-byte millisecond timestamp into aOut[] +*/ +static void tmstmpPutTS(TmstmpFile *p, unsigned char *aOut){ + sqlite3_uint64 tm = 0; + p->pSubVfs->xCurrentTimeInt64(p->pSubVfs, (sqlite3_int64*)&tm); + tm -= 210866760000000LL; + aOut[0] = (tm>>40)&0xff; + aOut[1] = (tm>>32)&0xff; + aOut[2] = (tm>>24)&0xff; + aOut[3] = (tm>>16)&0xff; + aOut[4] = (tm>>8)&0xff; + aOut[5] = tm&0xff; +} + +/* +** Read a 32-bit big-endian unsigned integer and return it. 
+*/ +static u32 tmstmpGetU32(const unsigned char *a){ + return ((u32)a[0]<<24) + (a[1]<<16) + (a[2]<<8) + a[3]; +} + +/* Write a 32-bit integer as big-endian into a[] +*/ +static void tmstmpPutU32(u32 v, unsigned char *a){ + a[0] = (v>>24) & 0xff; + a[1] = (v>>16) & 0xff; + a[2] = (v>>8) & 0xff; + a[3] = v & 0xff; +} + +/* Free a TmstmpLog object */ +static void tmstmpLogFree(TmstmpLog *pLog){ + if( pLog==0 ) return; + if( pLog->log ) fclose(pLog->log); + sqlite3_free(pLog->zLogname); + sqlite3_free(pLog); +} + +/* Flush log content. Open the file if necessary. Return the +** number of errors. */ +static int tmstmpLogFlush(TmstmpFile *p){ + TmstmpLog *pLog = p->pLog; + assert( pLog!=0 ); + if( pLog->log==0 ){ + pLog->log = fopen(pLog->zLogname, "wb"); + if( pLog->log==0 ){ + tmstmpLogFree(pLog); + p->pLog = 0; + return 1; + } + } + (void)fwrite(pLog->a, pLog->n, 1, pLog->log); + fflush(pLog->log); + pLog->n = 0; + return 0; +} + +/* +** Write a record onto the event log +*/ +static void tmstmpEvent( + TmstmpFile *p, + u8 op, + u8 a1, + u32 a2, + u32 a3, + u8 *pTS +){ + unsigned char *a; + TmstmpLog *pLog; + if( p->isWal ){ + p = p->pPartner; + assert( p!=0 ); + assert( p->isDb ); + } + pLog = p->pLog; + if( pLog==0 ) return; + if( pLog->n >= (int)sizeof(pLog->a) ){ + if( tmstmpLogFlush(p) ) return; + } + a = pLog->a + pLog->n; + a[0] = op; + a[1] = a1; + if( pTS ){ + memcpy(a+2, pTS, 6); + }else{ + tmstmpPutTS(p, a+2); + } + tmstmpPutU32(a2, a+8); + tmstmpPutU32(a3, a+12); + pLog->n += 16; + if( pLog->log || (op>=ELOG_WAL_PAGE && op<=ELOG_WAL_RESET) ){ + (void)tmstmpLogFlush(p); + } +} + +/* +** Close a connection +*/ +static int tmstmpClose(sqlite3_file *pFile){ + TmstmpFile *p = (TmstmpFile *)pFile; + if( p->hasCorrectReserve ){ + tmstmpEvent(p, p->isDb ? 
ELOG_CLOSE_DB : ELOG_CLOSE_WAL, 0, 0, 0, 0); + } + tmstmpLogFree(p->pLog); + if( p->pPartner ){ + assert( p->pPartner->pPartner==p ); + p->pPartner->pPartner = 0; + p->pPartner = 0; + } + pFile = ORIGFILE(pFile); + return pFile->pMethods->xClose(pFile); +} + +/* +** Read bytes from a file +*/ +static int tmstmpRead( + sqlite3_file *pFile, + void *zBuf, + int iAmt, + sqlite_int64 iOfst +){ + int rc; + TmstmpFile *p = (TmstmpFile*)pFile; + pFile = ORIGFILE(pFile); + rc = pFile->pMethods->xRead(pFile, zBuf, iAmt, iOfst); + if( rc!=SQLITE_OK ) return rc; + if( p->isDb + && iOfst==0 + && iAmt>=100 + ){ + const unsigned char *a = (unsigned char*)zBuf; + p->hasCorrectReserve = (a[20]==TMSTMP_RESERVE); + p->pgsz = (a[16]<<8) + a[17]; + if( p->pgsz==1 ) p->pgsz = 65536; + if( p->pPartner ){ + p->pPartner->hasCorrectReserve = p->hasCorrectReserve; + p->pPartner->pgsz = p->pgsz; + } + } + if( p->isWal + && p->inCkpt + && iAmt>=512 && iAmt<=65535 && (iAmt&(iAmt-1))==0 + ){ + p->pPartner->iFrame = (iOfst-56)/(p->pgsz+24) + 1; + } + return rc; +} + +/* +** Write data to a tmstmp-file. +*/ +static int tmstmpWrite( + sqlite3_file *pFile, + const void *zBuf, + int iAmt, + sqlite_int64 iOfst +){ + TmstmpFile *p = (TmstmpFile*)pFile; + sqlite3_file *pSub = ORIGFILE(pFile); + if( !p->hasCorrectReserve ){ + /* The database does not have the correct reserve size. 
No-op */ + }else if( p->isWal ){ + /* Writing into a WAL file */ + if( iAmt==24 ){ + /* A frame header */ + u32 x = 0; + p->iFrame = (iOfst - 32)/(p->pgsz+24)+1; + p->pgno = tmstmpGetU32((const u8*)zBuf); + p->salt1 = tmstmpGetU32(((const u8*)zBuf)+8); + memcpy(&x, ((const u8*)zBuf)+4, 4); + p->isCommit = (x!=0); + p->iOfst = iOfst; + }else if( iAmt>=512 && iOfst==p->iOfst+24 ){ + unsigned char s[TMSTMP_RESERVE]; + memset(s, 0, TMSTMP_RESERVE); + tmstmpPutTS(p, s+2); + tmstmpEvent(p, ELOG_WAL_PAGE, p->isCommit, p->pgno, p->iFrame, s+2); + }else if( iAmt==32 && iOfst==0 ){ + p->salt1 = tmstmpGetU32(((const u8*)zBuf)+16); + tmstmpEvent(p, ELOG_WAL_RESET, 0, 0, p->salt1, 0); + } + }else if( p->inCkpt ){ + unsigned char *s = (unsigned char*)zBuf+iAmt-TMSTMP_RESERVE; + memset(s, 0, TMSTMP_RESERVE); + tmstmpPutTS(p, s+2); + tmstmpPutU32(p->iFrame, s+8); + tmstmpPutU32(p->pPartner->salt1 & 0xffffff, s+12); + assert( p->pgsz>0 ); + tmstmpEvent(p, ELOG_CKPT_PAGE, 0, (iOfst/p->pgsz)+1, p->iFrame, 0); + }else if( p->pPartner==0 ){ + /* Writing into a database in rollback mode */ + unsigned char *s = (unsigned char*)zBuf+iAmt-TMSTMP_RESERVE; + memset(s, 0, TMSTMP_RESERVE); + tmstmpPutTS(p, s+2); + s[12] = 2; + assert( p->pgsz>0 ); + tmstmpEvent(p, ELOG_DB_PAGE, 0, (u32)(iOfst/p->pgsz)+1, 0, s+2); + } + return pSub->pMethods->xWrite(pSub,zBuf,iAmt,iOfst); +} + +/* +** Truncate a tmstmp-file. +*/ +static int tmstmpTruncate(sqlite3_file *pFile, sqlite_int64 size){ + pFile = ORIGFILE(pFile); + return pFile->pMethods->xTruncate(pFile, size); +} + +/* +** Sync a tmstmp-file. +*/ +static int tmstmpSync(sqlite3_file *pFile, int flags){ + pFile = ORIGFILE(pFile); + return pFile->pMethods->xSync(pFile, flags); +} + +/* +** Return the current file-size of a tmstmp-file. +*/ +static int tmstmpFileSize(sqlite3_file *pFile, sqlite_int64 *pSize){ + TmstmpFile *p = (TmstmpFile *)pFile; + pFile = ORIGFILE(p); + return pFile->pMethods->xFileSize(pFile, pSize); +} + +/* +** Lock a tmstmp-file. 
+*/ +static int tmstmpLock(sqlite3_file *pFile, int eLock){ + pFile = ORIGFILE(pFile); + return pFile->pMethods->xLock(pFile, eLock); +} + +/* +** Unlock a tmstmp-file. +*/ +static int tmstmpUnlock(sqlite3_file *pFile, int eLock){ + pFile = ORIGFILE(pFile); + return pFile->pMethods->xUnlock(pFile, eLock); +} + +/* +** Check if another file-handle holds a RESERVED lock on a tmstmp-file. +*/ +static int tmstmpCheckReservedLock(sqlite3_file *pFile, int *pResOut){ + pFile = ORIGFILE(pFile); + return pFile->pMethods->xCheckReservedLock(pFile, pResOut); +} + +/* +** File control method. For custom operations on a tmstmp-file. +*/ +static int tmstmpFileControl(sqlite3_file *pFile, int op, void *pArg){ + int rc; + TmstmpFile *p = (TmstmpFile*)pFile; + pFile = ORIGFILE(pFile); + rc = pFile->pMethods->xFileControl(pFile, op, pArg); + switch( op ){ + case SQLITE_FCNTL_VFSNAME: { + if( p->hasCorrectReserve && rc==SQLITE_OK ){ + *(char**)pArg = sqlite3_mprintf("tmstmp/%z", *(char**)pArg); + } + break; + } + case SQLITE_FCNTL_CKPT_START: { + p->inCkpt = 1; + assert( p->isDb ); + assert( p->pPartner!=0 ); + p->pPartner->inCkpt = 1; + if( p->hasCorrectReserve ){ + tmstmpEvent(p, ELOG_CKPT_START, 0, 0, 0, 0); + } + rc = SQLITE_OK; + break; + } + case SQLITE_FCNTL_CKPT_DONE: { + p->inCkpt = 0; + assert( p->isDb ); + assert( p->pPartner!=0 ); + p->pPartner->inCkpt = 0; + if( p->hasCorrectReserve ){ + tmstmpEvent(p, ELOG_CKPT_DONE, 0, 0, 0, 0); + } + rc = SQLITE_OK; + break; + } + } + return rc; +} + +/* +** Return the sector-size in bytes for a tmstmp-file. +*/ +static int tmstmpSectorSize(sqlite3_file *pFile){ + pFile = ORIGFILE(pFile); + return pFile->pMethods->xSectorSize(pFile); +} + +/* +** Return the device characteristic flags supported by a tmstmp-file. 
+*/ +static int tmstmpDeviceCharacteristics(sqlite3_file *pFile){ + int devchar = 0; + pFile = ORIGFILE(pFile); + devchar = pFile->pMethods->xDeviceCharacteristics(pFile); + return (devchar & ~SQLITE_IOCAP_SUBPAGE_READ); +} + +/* Create a shared memory file mapping */ +static int tmstmpShmMap( + sqlite3_file *pFile, + int iPg, + int pgsz, + int bExtend, + void volatile **pp +){ + pFile = ORIGFILE(pFile); + return pFile->pMethods->xShmMap(pFile,iPg,pgsz,bExtend,pp); +} + +/* Perform locking on a shared-memory segment */ +static int tmstmpShmLock(sqlite3_file *pFile, int offset, int n, int flags){ + pFile = ORIGFILE(pFile); + return pFile->pMethods->xShmLock(pFile,offset,n,flags); +} + +/* Memory barrier operation on shared memory */ +static void tmstmpShmBarrier(sqlite3_file *pFile){ + pFile = ORIGFILE(pFile); + pFile->pMethods->xShmBarrier(pFile); +} + +/* Unmap a shared memory segment */ +static int tmstmpShmUnmap(sqlite3_file *pFile, int deleteFlag){ + pFile = ORIGFILE(pFile); + return pFile->pMethods->xShmUnmap(pFile,deleteFlag); +} + +/* Fetch a page of a memory-mapped file */ +static int tmstmpFetch( + sqlite3_file *pFile, + sqlite3_int64 iOfst, + int iAmt, + void **pp +){ + pFile = ORIGFILE(pFile); + return pFile->pMethods->xFetch(pFile, iOfst, iAmt, pp); +} + +/* Release a memory-mapped page */ +static int tmstmpUnfetch(sqlite3_file *pFile, sqlite3_int64 iOfst, void *pPage){ + pFile = ORIGFILE(pFile); + return pFile->pMethods->xUnfetch(pFile, iOfst, pPage); +} + + +/* +** Open a tmstmp file handle. 
+*/ +static int tmstmpOpen( + sqlite3_vfs *pVfs, + const char *zName, + sqlite3_file *pFile, + int flags, + int *pOutFlags +){ + TmstmpFile *p, *pDb; + sqlite3_file *pSubFile; + sqlite3_vfs *pSubVfs; + int rc; + + pSubVfs = ORIGVFS(pVfs); + if( (flags & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_WAL))==0 ){ + /* If the file is not a persistent database or a WAL file, then + ** bypass the timestamp logic altogether */ + return pSubVfs->xOpen(pSubVfs, zName, pFile, flags, pOutFlags); + } + if( (flags & SQLITE_OPEN_WAL)!=0 ){ + pDb = (TmstmpFile*)sqlite3_database_file_object(zName); + if( pDb==0 + || pDb->uMagic!=TMSTMP_MAGIC + || !pDb->isDb + || pDb->pPartner!=0 + ){ + return pSubVfs->xOpen(pSubVfs, zName, pFile, flags, pOutFlags); + } + }else{ + pDb = 0; + } + p = (TmstmpFile*)pFile; + memset(p, 0, sizeof(*p)); + pSubFile = ORIGFILE(pFile); + pFile->pMethods = &tmstmp_io_methods; + p->pSubVfs = pSubVfs; + p->uMagic = TMSTMP_MAGIC; + rc = pSubVfs->xOpen(pSubVfs, zName, pSubFile, flags, pOutFlags); + if( rc ) goto tmstmp_open_done; + if( pDb!=0 ){ + p->isWal = 1; + p->pPartner = pDb; + pDb->pPartner = p; + }else{ + u32 r2; + u32 pid; + TmstmpLog *pLog; + sqlite3_uint64 r1; /* Milliseconds since 1970-01-01 */ + sqlite3_uint64 days; /* Days since 1970-01-01 */ + sqlite3_uint64 sod; /* Start of date specified by r1 */ + sqlite3_uint64 z; /* Days since 0000-03-01 */ + sqlite3_uint64 era; /* 400-year era */ + int h; /* hour */ + int m; /* minute */ + int s; /* second */ + int f; /* millisecond */ + int Y; /* year */ + int M; /* month */ + int D; /* day */ + int y; /* year assuming March is first month */ + unsigned int doe; /* day of 400-year era */ + unsigned int yoe; /* year of 400-year era */ + unsigned int doy; /* day of year */ + unsigned int mp; /* month with March==0 */ + + p->isDb = 1; + r1 = 0; + pLog = sqlite3_malloc64( sizeof(TmstmpLog) ); + if( pLog==0 ){ + pSubFile->pMethods->xClose(pSubFile); + rc = SQLITE_NOMEM; + goto tmstmp_open_done; + } + memset(pLog, 0, 
sizeof(pLog[0])); + p->pLog = pLog; + p->pSubVfs->xCurrentTimeInt64(p->pSubVfs, (sqlite3_int64*)&r1); + r1 -= 210866760000000LL; + days = r1/86400000; + sod = (r1%86400000)/1000; + f = (int)(r1%1000); + + h = sod/3600; + m = (sod%3600)/60; + s = sod%60; + z = days + 719468; + era = z/146097; + doe = (unsigned)(z - era*146097); + yoe = (doe - doe/1460 + doe/36524 - doe/146096)/365; + y = (int)yoe + era*400; + doy = doe - (365*yoe + yoe/4 - yoe/100); + mp = (5*doy + 2)/153; + D = doy - (153*mp + 2)/5 + 1; + M = mp + (mp<10 ? 3 : -9); + Y = y + (M <=2); + sqlite3_randomness(sizeof(r2), &r2); + pid = GETPID; + pLog->zLogname = sqlite3_mprintf( + "%s-tmstmp/%04d%02d%02dT%02d%02d%02d%03d-%08d-%08x", + zName, Y, M, D, h, m, s, f, pid, r2); + } + tmstmpEvent(p, p->isWal ? ELOG_OPEN_WAL : ELOG_OPEN_DB, 0, GETPID, 0, 0); + +tmstmp_open_done: + if( rc ) pFile->pMethods = 0; + return rc; +} + +/* +** All VFS interfaces other than xOpen are passed down into the Sub-VFS. +*/ +static int tmstmpDelete(sqlite3_vfs *p, const char *zName, int syncDir){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xDelete(pSub,zName,syncDir); +} +static int tmstmpAccess(sqlite3_vfs *p, const char *zName, int flags, int *pR){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xAccess(pSub,zName,flags,pR); +} +static int tmstmpFullPathname(sqlite3_vfs*p,const char *zName,int n,char *zOut){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xFullPathname(pSub,zName,n,zOut); +} +static void *tmstmpDlOpen(sqlite3_vfs *p, const char *zFilename){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xDlOpen(pSub,zFilename); +} +static void tmstmpDlError(sqlite3_vfs *p, int nByte, char *zErrMsg){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xDlError(pSub,nByte,zErrMsg); +} +static void(*tmstmpDlSym(sqlite3_vfs *p, void *pDl, const char *zSym))(void){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xDlSym(pSub,pDl,zSym); +} +static void tmstmpDlClose(sqlite3_vfs *p, void *pDl){ + sqlite3_vfs *pSub = 
ORIGVFS(p); + return pSub->xDlClose(pSub,pDl); +} +static int tmstmpRandomness(sqlite3_vfs *p, int nByte, char *zOut){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xRandomness(pSub,nByte,zOut); +} +static int tmstmpSleep(sqlite3_vfs *p, int microseconds){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xSleep(pSub,microseconds); +} +static int tmstmpCurrentTime(sqlite3_vfs *p, double *prNow){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xCurrentTime(pSub,prNow); +} +static int tmstmpGetLastError(sqlite3_vfs *p, int a, char *b){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xGetLastError(pSub,a,b); +} +static int tmstmpCurrentTimeInt64(sqlite3_vfs *p, sqlite3_int64 *piNow){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xCurrentTimeInt64(pSub,piNow); +} +static int tmstmpSetSystemCall(sqlite3_vfs *p, const char *zName, + sqlite3_syscall_ptr x){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xSetSystemCall(pSub,zName,x); +} +static sqlite3_syscall_ptr tmstmpGetSystemCall(sqlite3_vfs *p, const char *z){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xGetSystemCall(pSub,z); +} +static const char *tmstmpNextSystemCall(sqlite3_vfs *p, const char *zName){ + sqlite3_vfs *pSub = ORIGVFS(p); + return pSub->xNextSystemCall(pSub,zName); +} + +/* +** Register the tmstmp VFS as the default VFS for the system. +*/ +static int tmstmpRegisterVfs(void){ + int rc = SQLITE_OK; + sqlite3_vfs *pOrig = sqlite3_vfs_find(0); + if( pOrig==0 ) return SQLITE_ERROR; + if( pOrig==&tmstmp_vfs ) return SQLITE_OK; + tmstmp_vfs.iVersion = pOrig->iVersion; + tmstmp_vfs.pAppData = pOrig; + tmstmp_vfs.szOsFile = pOrig->szOsFile + sizeof(TmstmpFile); + rc = sqlite3_vfs_register(&tmstmp_vfs, 1); + return rc; +} + +#if defined(SQLITE_TMSTMPVFS_STATIC) +/* This variant of the initializer runs when the extension is +** statically linked. 
+*/ +int sqlite3_register_tmstmpvfs(const char *NotUsed){ + (void)NotUsed; + return tmstmpRegisterVfs(); +} +int sqlite3_unregister_tmstmpvfs(void){ + if( sqlite3_vfs_find("tmstmpvfs") ){ + sqlite3_vfs_unregister(&tmstmp_vfs); + } + return SQLITE_OK; +} +#endif /* defined(SQLITE_TMSTMPVFS_STATIC */ + +#if !defined(SQLITE_TMSTMPVFS_STATIC) +/* This variant of the initializer function is used when the +** extension is shared library to be loaded at run-time. +*/ +#ifdef _WIN32 +__declspec(dllexport) +#endif +/* +** This routine is called by sqlite3_load_extension() when the +** extension is first loaded. +***/ +int sqlite3_tmstmpvfs_init( + sqlite3 *db, + char **pzErrMsg, + const sqlite3_api_routines *pApi +){ + int rc; + SQLITE_EXTENSION_INIT2(pApi); + (void)pzErrMsg; /* not used */ + (void)db; /* not used */ + rc = tmstmpRegisterVfs(); + if( rc==SQLITE_OK ) rc = SQLITE_OK_LOAD_PERMANENTLY; + return rc; +} +#endif /* !defined(SQLITE_TMSTMPVFS_STATIC) */ diff --git a/ext/misc/vfsstat.c b/ext/misc/vfsstat.c index 504c0b31d5..a7a17fffd2 100644 --- a/ext/misc/vfsstat.c +++ b/ext/misc/vfsstat.c @@ -613,7 +613,7 @@ static int vstattabConnect( rc = sqlite3_declare_vtab(db,"CREATE TABLE x(file,stat,count)"); if( rc==SQLITE_OK ){ - pNew = *ppVtab = sqlite3_malloc( sizeof(*pNew) ); + pNew = *ppVtab = sqlite3_malloc64( sizeof(*pNew) ); if( pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); } @@ -633,7 +633,7 @@ static int vstattabDisconnect(sqlite3_vtab *pVtab){ */ static int vstattabOpen(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){ VfsStatCursor *pCur; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); *ppCursor = &pCur->base; diff --git a/ext/misc/vfstrace.c b/ext/misc/vfstrace.c index 9d75a8b647..24e75e1171 100644 --- a/ext/misc/vfstrace.c +++ b/ext/misc/vfstrace.c @@ -892,7 +892,7 @@ static int vfstraceOpen( vfstrace_printf(pInfo, 
"%s.xOpen(%s,flags=0x%x)", pInfo->zVfsName, p->zFName, flags); if( p->pReal->pMethods ){ - sqlite3_io_methods *pNew = sqlite3_malloc( sizeof(*pNew) ); + sqlite3_io_methods *pNew = sqlite3_malloc64( sizeof(*pNew) ); const sqlite3_io_methods *pSub = p->pReal->pMethods; memset(pNew, 0, sizeof(*pNew)); pNew->iVersion = pSub->iVersion; diff --git a/ext/misc/vtablog.c b/ext/misc/vtablog.c index 2b3e303559..a48f3a632f 100644 --- a/ext/misc/vtablog.c +++ b/ext/misc/vtablog.c @@ -14,6 +14,8 @@ ** on stdout when its key interfaces are called. This is intended for ** interactive analysis and debugging of virtual table interfaces. ** +** HOW TO COMPILE: +** ** To build this extension as a separately loaded shared library or ** DLL, use compiler command-lines similar to the following: ** @@ -21,7 +23,7 @@ ** (mac) clang -fPIC -dynamiclib vtablog.c -o vtablog.dylib ** (windows) cl vtablog.c -link -dll -out:vtablog.dll ** -** Usage example: +** USAGE EXAMPLE: ** ** .load ./vtablog ** CREATE VIRTUAL TABLE temp.log USING vtablog( @@ -29,6 +31,23 @@ ** rows=25 ** ); ** SELECT * FROM log; +** +** ARGUMENTS TO CREATE VIRTUAL TABLE: +** +** In "CREATE VIRTUAL TABLE temp.log AS vtablog(ARGS....)" statement, the +** ARGS argument is a list of key-value pairs that can be any of the +** following. +** +** schema=TEXT Text is a CREATE TABLE statement that defines +** the schema of the new virtual table. +** +** rows=N The table as N rows. +** +** consume_order_by=N If the left-most ORDER BY terms is ASC and +** against column N (where the leftmost column +** is #1) then set the orderByConsumed=1 flag in +** xBestIndex. Or if the left-most ORDER BY is +** DESC and against column -N, do likewise. */ #include "sqlite3ext.h" SQLITE_EXTENSION_INIT1 @@ -49,6 +68,8 @@ struct vtablog_vtab { char *zName; /* Table name. 
argv[2] of xConnect/xCreate */ int nRow; /* Number of rows in the table */ int nCursor; /* Number of cursors created */ + int iConsumeOB; /* Consume the ORDER BY clause if on column N-th + ** and consumeOB=N or consumeOB=(-N) and DESC */ }; /* vtablog_cursor is a subclass of sqlite3_vtab_cursor which will @@ -180,6 +201,7 @@ static int vtablogConnectCreate( int rc; char *zSchema = 0; char *zNRow = 0; + char *zConsumeOB = 0; printf("%s.%s.%s():\n", argv[1], argv[2], isCreate ? "xCreate" : "xConnect"); @@ -203,6 +225,10 @@ static int vtablogConnectCreate( rc = SQLITE_ERROR; goto vtablog_end_connect; } + if( vtablog_string_parameter(pzErr, "consume_order_by", z, &zConsumeOB) ){ + rc = SQLITE_ERROR; + goto vtablog_end_connect; + } } if( zSchema==0 ){ zSchema = sqlite3_mprintf("%s","CREATE TABLE x(a,b);"); @@ -214,13 +240,17 @@ static int vtablogConnectCreate( printf(" schema = '%s'\n", zSchema); rc = sqlite3_declare_vtab(db, zSchema); if( rc==SQLITE_OK ){ - pNew = sqlite3_malloc( sizeof(*pNew) ); + pNew = sqlite3_malloc64( sizeof(*pNew) ); *ppVtab = (sqlite3_vtab*)pNew; if( pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); pNew->nRow = 10; if( zNRow ) pNew->nRow = atoi(zNRow); printf(" nrow = %d\n", pNew->nRow); + if( zConsumeOB ) pNew->iConsumeOB = atoi(zConsumeOB); + if( pNew->iConsumeOB ){ + printf(" consume_order_by = %d\n", pNew->iConsumeOB); + } pNew->zDb = sqlite3_mprintf("%s", argv[1]); pNew->zName = sqlite3_mprintf("%s", argv[2]); } @@ -228,6 +258,7 @@ static int vtablogConnectCreate( vtablog_end_connect: sqlite3_free(zSchema); sqlite3_free(zNRow); + sqlite3_free(zConsumeOB); return rc; } static int vtablogCreate( @@ -282,7 +313,7 @@ static int vtablogOpen(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){ vtablog_cursor *pCur; printf("%s.%s.xOpen(cursor=%d)\n", pTab->zDb, pTab->zName, ++pTab->nCursor); - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, 
sizeof(*pCur)); pCur->iCursor = pTab->nCursor; @@ -514,16 +545,27 @@ static int vtablogBestIndex( } } printf(" nOrderBy: %d\n", p->nOrderBy); - for(i=0; i<p->nOrderBy; i++){ - printf(" orderby[%d]: col=%d desc=%d\n", - i, - p->aOrderBy[i].iColumn, - p->aOrderBy[i].desc); + if( p->nOrderBy ){ + for(i=0; i<p->nOrderBy; i++){ + printf(" orderby[%d]: col=%d desc=%d\n", + i, + p->aOrderBy[i].iColumn, + p->aOrderBy[i].desc); + } + if( pTab->iConsumeOB ){ + int N = p->aOrderBy[0].iColumn+1; + if( (p->aOrderBy[0].desc && N==-pTab->iConsumeOB) + || (!p->aOrderBy[0].desc && N==pTab->iConsumeOB) + ){ + p->orderByConsumed = 1; + } + } } p->estimatedCost = (double)500; p->estimatedRows = 500; printf(" idxNum=%d\n", p->idxNum); printf(" idxStr=NULL\n"); + printf(" sqlite3_vtab_distinct()=%d\n", sqlite3_vtab_distinct(p)); printf(" orderByConsumed=%d\n", p->orderByConsumed); printf(" estimatedCost=%g\n", p->estimatedCost); printf(" estimatedRows=%lld\n", p->estimatedRows); diff --git a/ext/misc/vtshim.c b/ext/misc/vtshim.c index 3f7945724c..ed6c568f63 100644 --- a/ext/misc/vtshim.c +++ b/ext/misc/vtshim.c @@ -86,7 +86,7 @@ static int vtshimCreate( } return SQLITE_ERROR; } - pNew = sqlite3_malloc( sizeof(*pNew) ); + pNew = sqlite3_malloc64( sizeof(*pNew) ); *ppVtab = (sqlite3_vtab*)pNew; if( pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); @@ -125,7 +125,7 @@ static int vtshimConnect( } return SQLITE_ERROR; } - pNew = sqlite3_malloc( sizeof(*pNew) ); + pNew = sqlite3_malloc64( sizeof(*pNew) ); *ppVtab = (sqlite3_vtab*)pNew; if( pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, sizeof(*pNew)); @@ -192,7 +192,7 @@ static int vtshimOpen(sqlite3_vtab *pBase, sqlite3_vtab_cursor **ppCursor){ int rc; *ppCursor = 0; if( pAux->bDisposed ) return SQLITE_ERROR; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); rc = pAux->pMod->xOpen(pVtab->pChild, &pCur->pChild); @@ -444,7 +444,7 @@ 
static int vtshimCopyModule( ){ sqlite3_module *p; if( !pMod || !ppMod ) return SQLITE_ERROR; - p = sqlite3_malloc( sizeof(*p) ); + p = sqlite3_malloc64( sizeof(*p) ); if( p==0 ) return SQLITE_NOMEM; memcpy(p, pMod, sizeof(*p)); *ppMod = p; @@ -464,7 +464,7 @@ void *sqlite3_create_disposable_module( vtshim_aux *pAux; sqlite3_module *pMod; int rc; - pAux = sqlite3_malloc( sizeof(*pAux) ); + pAux = sqlite3_malloc64( sizeof(*pAux) ); if( pAux==0 ){ if( xDestroy ) xDestroy(pClientData); return 0; diff --git a/ext/misc/wholenumber.c b/ext/misc/wholenumber.c index 4c955925da..fe5fc83ab0 100644 --- a/ext/misc/wholenumber.c +++ b/ext/misc/wholenumber.c @@ -47,7 +47,7 @@ static int wholenumberConnect( char **pzErr ){ sqlite3_vtab *pNew; - pNew = *ppVtab = sqlite3_malloc( sizeof(*pNew) ); + pNew = *ppVtab = sqlite3_malloc64( sizeof(*pNew) ); if( pNew==0 ) return SQLITE_NOMEM; sqlite3_declare_vtab(db, "CREATE TABLE x(value)"); sqlite3_vtab_config(db, SQLITE_VTAB_INNOCUOUS); @@ -69,7 +69,7 @@ static int wholenumberDisconnect(sqlite3_vtab *pVtab){ */ static int wholenumberOpen(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){ wholenumber_cursor *pCur; - pCur = sqlite3_malloc( sizeof(*pCur) ); + pCur = sqlite3_malloc64( sizeof(*pCur) ); if( pCur==0 ) return SQLITE_NOMEM; memset(pCur, 0, sizeof(*pCur)); *ppCursor = &pCur->base; diff --git a/ext/misc/zipfile.c b/ext/misc/zipfile.c index 58cfba658a..9b127cc5a6 100644 --- a/ext/misc/zipfile.c +++ b/ext/misc/zipfile.c @@ -393,7 +393,7 @@ static int zipfileConnect( rc = sqlite3_declare_vtab(db, ZIPFILE_SCHEMA); if( rc==SQLITE_OK ){ - pNew = (ZipfileTab*)sqlite3_malloc64((sqlite3_int64)nByte+nFile); + pNew = (ZipfileTab*)sqlite3_malloc64((i64)nByte+nFile); if( pNew==0 ) return SQLITE_NOMEM; memset(pNew, 0, nByte+nFile); pNew->db = db; @@ -456,7 +456,7 @@ static int zipfileDisconnect(sqlite3_vtab *pVtab){ static int zipfileOpen(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCsr){ ZipfileTab *pTab = (ZipfileTab*)p; ZipfileCsr *pCsr; - pCsr = 
sqlite3_malloc(sizeof(*pCsr)); + pCsr = sqlite3_malloc64(sizeof(*pCsr)); *ppCsr = (sqlite3_vtab_cursor*)pCsr; if( pCsr==0 ){ return SQLITE_NOMEM; @@ -539,14 +539,15 @@ static void zipfileCursorErr(ZipfileCsr *pCsr, const char *zFmt, ...){ static int zipfileReadData( FILE *pFile, /* Read from this file */ u8 *aRead, /* Read into this buffer */ - int nRead, /* Number of bytes to read */ + i64 nRead, /* Number of bytes to read */ i64 iOff, /* Offset to read from */ char **pzErrmsg /* OUT: Error message (from sqlite3_malloc) */ ){ size_t n; fseek(pFile, (long)iOff, SEEK_SET); - n = fread(aRead, 1, nRead, pFile); - if( (int)n!=nRead ){ + n = fread(aRead, 1, (long)nRead, pFile); + if( n!=(size_t)nRead ){ + sqlite3_free(*pzErrmsg); *pzErrmsg = sqlite3_mprintf("error in fread()"); return SQLITE_ERROR; } @@ -563,7 +564,7 @@ static int zipfileAppendData( fseek(pTab->pWriteFd, (long)pTab->szCurrent, SEEK_SET); n = fwrite(aWrite, 1, nWrite, pTab->pWriteFd); if( (int)n!=nWrite ){ - pTab->base.zErrMsg = sqlite3_mprintf("error in fwrite()"); + zipfileTableErr(pTab,"error in fwrite()"); return SQLITE_ERROR; } pTab->szCurrent += nWrite; @@ -704,7 +705,12 @@ static int zipfileScanExtra(u8 *aExtra, int nExtra, u32 *pmTime){ u8 *p = aExtra; u8 *pEnd = &aExtra[nExtra]; - while( pcds); if( rc!=SQLITE_OK ){ - *pzErr = sqlite3_mprintf("failed to read CDS at offset %lld", iOff); + zipfileTableErr(pTab, "failed to read CDS at offset %lld", iOff); }else if( aBlob==0 ){ rc = zipfileReadData( pFile, aRead, nExtra+nFile, iOff+ZIPFILE_CDS_FIXED_SZ, pzErr ); }else{ aRead = (u8*)&aBlob[iOff + ZIPFILE_CDS_FIXED_SZ]; - if( (iOff + ZIPFILE_LFH_FIXED_SZ + nFile + nExtra)>nBlob ){ + if( (iOff + ZIPFILE_CDS_FIXED_SZ + nFile + nExtra)>nBlob ){ rc = zipfileCorrupt(pzErr); } } @@ -900,14 +907,15 @@ static int zipfileGetEntry( rc = zipfileReadData(pFile, aRead, szFix, pNew->cds.iOffset, pzErr); }else{ aRead = (u8*)&aBlob[pNew->cds.iOffset]; - if( (pNew->cds.iOffset + ZIPFILE_LFH_FIXED_SZ)>nBlob ){ + if( 
((i64)pNew->cds.iOffset + ZIPFILE_LFH_FIXED_SZ)>nBlob ){ rc = zipfileCorrupt(pzErr); } } + memset(&lfh, 0, sizeof(lfh)); if( rc==SQLITE_OK ) rc = zipfileReadLFH(aRead, &lfh); if( rc==SQLITE_OK ){ - pNew->iDataOff = pNew->cds.iOffset + ZIPFILE_LFH_FIXED_SZ; + pNew->iDataOff = (i64)pNew->cds.iOffset + ZIPFILE_LFH_FIXED_SZ; pNew->iDataOff += lfh.nFile + lfh.nExtra; if( aBlob && pNew->cds.szCompressed ){ if( pNew->iDataOff + pNew->cds.szCompressed > nBlob ){ @@ -918,7 +926,7 @@ static int zipfileGetEntry( } } }else{ - *pzErr = sqlite3_mprintf("failed to read LFH at offset %d", + zipfileTableErr(pTab, "failed to read LFH at offset %d", (int)pNew->cds.iOffset ); } @@ -942,7 +950,7 @@ static int zipfileNext(sqlite3_vtab_cursor *cur){ int rc = SQLITE_OK; if( pCsr->pFile ){ - i64 iEof = pCsr->eocd.iOffset + pCsr->eocd.nSize; + i64 iEof = (i64)pCsr->eocd.iOffset + (i64)pCsr->eocd.nSize; zipfileEntryFree(pCsr->pCurrent); pCsr->pCurrent = 0; if( pCsr->iNextOff>=iEof ){ @@ -987,7 +995,7 @@ static void zipfileInflate( int nIn, /* Size of buffer aIn[] in bytes */ int nOut /* Expected output size */ ){ - u8 *aRes = sqlite3_malloc(nOut); + u8 *aRes = sqlite3_malloc64(nOut); if( aRes==0 ){ sqlite3_result_error_nomem(pCtx); }else{ @@ -1008,7 +1016,7 @@ static void zipfileInflate( if( err!=Z_STREAM_END ){ zipfileCtxErrorMsg(pCtx, "inflate() failed (%d)", err); }else{ - sqlite3_result_blob(pCtx, aRes, nOut, zipfileFree); + sqlite3_result_blob(pCtx, aRes, (int)str.total_out, zipfileFree); aRes = 0; } } @@ -1180,12 +1188,12 @@ static int zipfileEof(sqlite3_vtab_cursor *cur){ static int zipfileReadEOCD( ZipfileTab *pTab, /* Return errors here */ const u8 *aBlob, /* Pointer to in-memory file image */ - int nBlob, /* Size of aBlob[] in bytes */ + i64 nBlob, /* Size of aBlob[] in bytes */ FILE *pFile, /* Read from this file if aBlob==0 */ ZipfileEOCD *pEOCD /* Object to populate */ ){ u8 *aRead = pTab->aBuffer; /* Temporary buffer */ - int nRead; /* Bytes to read from file */ + i64 nRead; /* 
Bytes to read from file */ int rc = SQLITE_OK; memset(pEOCD, 0, sizeof(ZipfileEOCD)); @@ -1206,7 +1214,7 @@ static int zipfileReadEOCD( } if( rc==SQLITE_OK ){ - int i; + i64 i; /* Scan backwards looking for the signature bytes */ for(i=nRead-20; i>=0; i--){ @@ -1217,9 +1225,7 @@ static int zipfileReadEOCD( } } if( i<0 ){ - pTab->base.zErrMsg = sqlite3_mprintf( - "cannot find end of central directory record" - ); + zipfileTableErr(pTab, "cannot find end of central directory record"); return SQLITE_ERROR; } @@ -1264,7 +1270,7 @@ static void zipfileAddEntry( } } -static int zipfileLoadDirectory(ZipfileTab *pTab, const u8 *aBlob, int nBlob){ +static int zipfileLoadDirectory(ZipfileTab *pTab, const u8 *aBlob, i64 nBlob){ ZipfileEOCD eocd; int rc; int i; @@ -1312,7 +1318,7 @@ static int zipfileFilter( }else if( sqlite3_value_type(argv[0])==SQLITE_BLOB ){ static const u8 aEmptyBlob = 0; const u8 *aBlob = (const u8*)sqlite3_value_blob(argv[0]); - int nBlob = sqlite3_value_bytes(argv[0]); + i64 nBlob = sqlite3_value_bytes(argv[0]); assert( pTab->pFirstEntry==0 ); if( aBlob==0 ){ aBlob = &aEmptyBlob; @@ -1386,7 +1392,7 @@ static int zipfileBestIndex( static ZipfileEntry *zipfileNewEntry(const char *zPath){ ZipfileEntry *pNew; - pNew = sqlite3_malloc(sizeof(ZipfileEntry)); + pNew = sqlite3_malloc64(sizeof(ZipfileEntry)); if( pNew ){ memset(pNew, 0, sizeof(ZipfileEntry)); pNew->cds.zFile = sqlite3_mprintf("%s", zPath); @@ -1510,7 +1516,7 @@ static int zipfileBegin(sqlite3_vtab *pVtab){ assert( pTab->pWriteFd==0 ); if( pTab->zFile==0 || pTab->zFile[0]==0 ){ - pTab->base.zErrMsg = sqlite3_mprintf("zipfile: missing filename"); + zipfileTableErr(pTab, "zipfile: missing filename"); return SQLITE_ERROR; } @@ -1520,9 +1526,9 @@ static int zipfileBegin(sqlite3_vtab *pVtab){ ** in main-memory until the transaction is committed. 
*/ pTab->pWriteFd = sqlite3_fopen(pTab->zFile, "ab+"); if( pTab->pWriteFd==0 ){ - pTab->base.zErrMsg = sqlite3_mprintf( - "zipfile: failed to open file %s for writing", pTab->zFile - ); + zipfileTableErr(pTab, + "zipfile: failed to open file %s for writing", pTab->zFile + ); rc = SQLITE_ERROR; }else{ fseek(pTab->pWriteFd, 0, SEEK_END); @@ -1987,7 +1993,7 @@ struct ZipfileCtx { ZipfileBuffer cds; }; -static int zipfileBufferGrow(ZipfileBuffer *pBuf, int nByte){ +static int zipfileBufferGrow(ZipfileBuffer *pBuf, i64 nByte){ if( pBuf->n+nByte>pBuf->nAlloc ){ u8 *aNew; sqlite3_int64 nNew = pBuf->n ? pBuf->n*2 : 512; @@ -2036,7 +2042,7 @@ static void zipfileStep(sqlite3_context *pCtx, int nVal, sqlite3_value **apVal){ char *zName = 0; /* Path (name) of new entry */ int nName = 0; /* Size of zName in bytes */ char *zFree = 0; /* Free this before returning */ - int nByte; + i64 nByte; memset(&e, 0, sizeof(e)); p = (ZipfileCtx*)sqlite3_aggregate_context(pCtx, sizeof(ZipfileCtx)); diff --git a/ext/qrf/README.md b/ext/qrf/README.md new file mode 100644 index 0000000000..8555cb0780 --- /dev/null +++ b/ext/qrf/README.md @@ -0,0 +1,775 @@ +# SQLite Query Result Formatting Subsystem + +The "Query Result Formatter" or "QRF" subsystem is a C-language +subroutine that formats the output from an SQLite query for display using +a fixed-width font, for example on a terminal window over an SSH connection. +The output format is configurable. The application can request various +table formats, with flexible column widths and alignments, row-oriented +formats, such as CSV and similar, as well as various special-purpose formats +like JSON. + +For the first 25 years of SQLite's existence, the +[command-line interface](https://sqlite.org/cli.html) (CLI) +formatted query results using a hodge-podge of routines +that had grown slowly by accretion. The QRF was created +in the fall of 2025 to refactor and reorganize this code into +a more usable form. 
The idea behind QRF is to implement all the +query result formatting capabilities of the CLI in a subroutine +that can be incorporated and reused by other applications. + +## 1.0 Overview Of Operation + +Suppose variable `sqlite3_stmt *pStmt` is a pointer to an SQLite +prepared statement that has been reset and bound and is ready to run. +Then to format the output from this prepared statement, use code +similar to the following: + +> ~~~ +sqlite3_qrf_spec spec; /* Format specification */ +char *zErrMsg; /* Text error message (optional) */ +char *zResult = 0; /* Formatted output written here */ +int rc; /* Result code */ + +memset(&spec, 0, sizeof(spec)); /* Initialize the spec */ +spec.iVersion = 1; /* Version number must be 1 */ +spec.pzOutput = &zResult; /* Write results in variable zResult */ +/* Optionally fill in other settings in spec here, as needed */ +zErrMsg = 0; /* Not required; just being pedantic */ +rc = sqlite3_format_query_result(pStmt, &spec, &zErrMsg); /* Format results */ +if( rc ){ + printf("Error (%d): %s\n", rc, zErrMsg); /* Report an error */ + sqlite3_free(zErrMsg); /* Free the error message text */ +}else{ + printf("%s", zResult); /* Report the results */ +} +sqlite3_free(zResult); /* Free memory used to hold results */ +~~~ + +The `sqlite3_qrf_spec` object describes the desired output format +and where to send the generated output. Most of the work in using +the QRF involves filling out the sqlite3_qrf_spec. + +### 1.1 Using QRF with SQL text + +If you start with SQL text instead of an sqlite3_stmt pointer, and +especially if the SQL text might comprise two or more statements, then +the SQL text needs to be converted into sqlite3_stmt objects separately. 
+If the original SQL text is in a variable `const char *zSql` and the +database connection is in variable `sqlite3 *db`, then code +similar to the following should work: + +> ~~~ +sqlite3_qrf_spec spec; /* Format specification */ +char *zErrMsg; /* Text error message (optional) */ +char *zResult = 0; /* Formatted output written here */ +sqlite3_stmt *pStmt; /* Next prepared statement */ +int rc; /* Result code */ + +memset(&spec, 0, sizeof(spec)); /* Initialize the spec */ +spec.iVersion = 1; /* Version number must be 1 */ +spec.pzOutput = &zResult; /* Write results in variable zResult */ +/* Optionally fill in other settings in spec here, as needed */ +zErrMsg = 0; /* Not required; just being pedantic */ +while( zSql && zSql[0] ){ + pStmt = 0; /* Not required; just being pedantic */ + rc = sqlite3_prepare_v2(db, zSql, -1, &pStmt, &zSql); + if( rc!=SQLITE_OK ){ + printf("Error: %s\n", sqlite3_errmsg(db)); + }else{ + rc = sqlite3_format_query_result(pStmt, &spec, &zErrMsg); /* Get results */ + if( rc ){ + printf("Error (%d): %s\n", rc, zErrMsg); /* Report an error */ + sqlite3_free(zErrMsg); /* Free the error message text */ + }else{ + printf("%s", zResult); /* Report the results */ + sqlite3_free(zResult); /* Free memory used to hold results */ + zResult = 0; + } + } + sqlite3_finalize(pStmt); +} +~~~ + + +## 2.0 The `sqlite3_qrf_spec` object + +The `sqlite3_qrf_spec` looks like this: + +> ~~~ +typedef struct sqlite3_qrf_spec sqlite3_qrf_spec; +struct sqlite3_qrf_spec { + unsigned char iVersion; /* Version number of this structure */ + unsigned char eStyle; /* Formatting style. "box", "csv", etc... 
*/ + unsigned char eEsc; /* How to escape control characters in text */ + unsigned char eText; /* Quoting style for text */ + unsigned char eTitle; /* Quoting style for the text of column names */ + unsigned char eBlob; /* Quoting style for BLOBs */ + unsigned char bTitles; /* True to show column names */ + unsigned char bWordWrap; /* Try to wrap on word boundaries */ + unsigned char bTextJsonb; /* Render JSONB blobs as JSON text */ + unsigned char eDfltAlign; /* Default alignment, if not covered by aAlign */ + unsigned char eTitleAlign; /* Alignment for column headers */ + unsigned char bSplitColumn; /* Wrap single-column output into many columns */ + unsigned char bBorder; /* Show outer border in Box and Table styles */ + short int nWrap; /* Wrap columns wider than this */ + short int nScreenWidth; /* Maximum overall table width */ + short int nLineLimit; /* Maximum number of lines for any row */ + short int nTitleLimit; /* Maximum number of characters in a title */ + unsigned int nMultiInsert; /* Add rows to one INSERT until size exceeds */ + int nCharLimit; /* Maximum number of characters in a cell */ + int nWidth; /* Number of entries in aWidth[] */ + int nAlign; /* Number of entries in aAlign[] */ + short int *aWidth; /* Column widths */ + unsigned char *aAlign; /* Column alignments */ + char *zColumnSep; /* Alternative column separator */ + char *zRowSep; /* Alternative row separator */ + char *zTableName; /* Output table name */ + char *zNull; /* Rendering of NULL */ + char *(*xRender)(void*,sqlite3_value*); /* Render a value */ + int (*xWrite)(void*,const char*,sqlite3_int64); /* Write output */ + void *pRenderArg; /* First argument to the xRender callback */ + void *pWriteArg; /* First argument to the xWrite callback */ + char **pzOutput; /* Storage location for output string */ + /* Additional fields may be added in the future */ +}; +~~~ + +Do not be alarmed by the complexity of this structure. 
Everything can +be zeroed except for: + + * `.iVersion` + * One of `.pzOutput` or `.xWrite`. + +You do not need to understand and configure every field of this object +in order to use QRF effectively. Start by zeroing out the whole structure, +then initializing iVersion and one of pzOutput or xWrite. Then maybe +tweak one or two other settings to get the output you want. + +Further detail on the meaning of each field in the +`sqlite3_qrf_spec` object is in the subsequent sections. + +### 2.1 Structure Version Number + +The sqlite3_qrf_spec.iVersion field must be 1. Future enhancements to +the QRF might add new fields to the bottom of the sqlite3_qrf_spec +object. Those new fields will only be accessible if the iVersion is greater +than 1. Thus the iVersion field is used to support upgradability. + +### 2.2 Output Deposition (xWrite and pzOutput) + +The formatted output can either be sent to a callback function +or accumulated into an output buffer in memory obtained +from sqlite3_malloc(). If the sqlite3_qrf_spec.xWrite field is not NULL, +then that function is invoked (using sqlite3_qrf_spec.pWriteArg as its +first argument) to transmit the formatted output. Or, if +sqlite3_qrf_spec.pzOutput points to a pointer to a character, then that +pointer is made to point to memory obtained from sqlite3_malloc() that +contains the complete text of the formatted output. If spec.pzOutput\[0\] +is initially non-NULL, then it is assumed to already point to memory obtained +from sqlite3_malloc(). In that case, the buffer is resized using +sqlite3_realloc() and the new text is appended. + +Exactly one of sqlite3_qrf_spec.xWrite and sqlite3_qrf_spec.pzOutput must be +non-NULL; the other must be NULL. + +The return value from xWrite is an SQLite result code. The usual return +should be SQLITE_OK. But if for some reason the write fails, a different +value might be returned. 
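As a rough illustration of the xWrite path, the sketch below shows a callback that appends each chunk of output to a growable buffer. It is not taken from the QRF sources. The `OutBuf` type and the `appendChunk` name are hypothetical, `qrf_int64` is a local stand-in for `sqlite3_int64` so the fragment compiles without sqlite3.h, the literals 0, 1 and 7 stand in for SQLITE_OK, SQLITE_ERROR and SQLITE_NOMEM, and it assumes that the third callback argument is the number of bytes in the chunk.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef long long qrf_int64;   /* Stand-in for sqlite3_int64 (assumption) */

/* Hypothetical accumulator that would be passed through spec.pWriteArg */
typedef struct OutBuf {
  char *z;          /* Accumulated text, NUL-terminated once non-NULL */
  size_t n;         /* Bytes used, not counting the terminator */
  size_t nAlloc;    /* Bytes allocated for z */
} OutBuf;

/* Callback with the shape int(*)(void*,const char*,sqlite3_int64).
** Appends nByte bytes of zText to the OutBuf and returns a result code. */
static int appendChunk(void *pArg, const char *zText, qrf_int64 nByte){
  OutBuf *p = (OutBuf*)pArg;
  size_t nNew;
  assert( p!=0 );
  if( nByte<0 ) return 1;                  /* SQLITE_ERROR */
  nNew = (size_t)nByte;
  if( p->n + nNew + 1 > p->nAlloc ){
    size_t nAlloc = (p->n + nNew + 1)*2;   /* Grow geometrically */
    char *zNew = realloc(p->z, nAlloc);
    if( zNew==0 ) return 7;                /* SQLITE_NOMEM */
    p->z = zNew;
    p->nAlloc = nAlloc;
  }
  if( nNew>0 ) memcpy(p->z + p->n, zText, nNew);
  p->n += nNew;
  p->z[p->n] = 0;
  return 0;                                /* SQLITE_OK */
}
```

With the real headers, the typedef would be replaced by `sqlite3_int64`, and an application would presumably set `spec.xWrite = appendChunk` and `spec.pWriteArg` to the address of an `OutBuf`, leaving `spec.pzOutput` NULL as this section requires.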
+ +### 2.3 Output Format + +The sqlite3_qrf_spec.eStyle field is an integer code that defines the +specific output format that will be generated. See [section 4.0](#style) +below for details on the meaning of the various style options. + +Other fields in sqlite3_qrf_spec might be used or might be +ignored, depending on the value of eStyle. + +### 2.4 Show Column Names (bTitles) + +The sqlite3_qrf_spec.bTitles field can be QRF_SW_Auto, +QRF_SW_On, or QRF_SW_Off. Those three constants also have shorter +alternative spellings: QRF_Auto, QRF_Yes, and +QRF_No, respectively. + +> ~~~ +#define QRF_SW_Auto 0 /* Let QRF choose the best value */ +#define QRF_SW_Off 1 /* This setting is forced off */ +#define QRF_SW_On 2 /* This setting is forced on */ +#define QRF_Auto 0 /* Alternate spelling for QRF_SW_Auto and others */ +#define QRF_No 1 /* Alternate spelling for QRF_SW_Off */ +#define QRF_Yes 2 /* Alternate spelling for QRF_SW_On */ +~~~ + +If the value is QRF_Yes, then column names appear in the output. +If the value is QRF_No, column names are omitted. If the +value is QRF_Auto, then an appropriate default is chosen. + +### 2.5 Control Character Escapes (eEsc) + +The sqlite3_qrf_spec.eEsc field determines how ASCII control characters are +formatted when displaying TEXT values in the result. These are the allowed +values: + +> ~~~ +#define QRF_ESC_Auto 0 /* Choose the ctrl-char escape automatically */ +#define QRF_ESC_Off 1 /* Do not escape control characters */ +#define QRF_ESC_Ascii 2 /* Unix-style escapes. Ex: U+0007 shows ^G */ +#define QRF_ESC_Symbol 3 /* Unicode escapes. Ex: U+0007 shows U+2407 */ +~~~ + +If the value of eEsc is QRF_ESC_Ascii, then the control character +with value X is displayed as ^Y where Y is X+0x40. Hence, a +backspace character (U+0008) is shown as "^H". + +If eEsc is QRF_ESC_Symbol, then control characters in the range of U+0001 +through U+001f are mapped into U+2401 through U+241f, respectively. 
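The two mappings just described are simple arithmetic, which the following sketch spells out. It is only an illustration under stated assumptions, not the QRF implementation: the function names `escAscii` and `escSymbol` are hypothetical, the output is assumed to be UTF-8, and the sketch handles a single control character at a time without any of the special cases (such as TAB and LF) that this section describes for the real formatter.

```c
#include <assert.h>
#include <stddef.h>

/* QRF_ESC_Ascii style: control character c (0x01..0x1f) becomes '^'
** followed by the character c+0x40.  Writes 2 bytes into zOut. */
static size_t escAscii(unsigned char c, char *zOut){
  assert( c>=0x01 && c<=0x1f );
  zOut[0] = '^';
  zOut[1] = (char)(c + 0x40);
  return 2;
}

/* QRF_ESC_Symbol style: control character c (0x01..0x1f) becomes the
** code point U+2400+c, written here as 3 bytes of UTF-8 into zOut. */
static size_t escSymbol(unsigned char c, char *zOut){
  unsigned int cp = 0x2400u + c;
  assert( c>=0x01 && c<=0x1f );
  zOut[0] = (char)(0xE0 | (cp>>12));          /* 1110xxxx */
  zOut[1] = (char)(0x80 | ((cp>>6) & 0x3f));  /* 10xxxxxx */
  zOut[2] = (char)(0x80 | (cp & 0x3f));       /* 10xxxxxx */
  return 3;
}
```

For example, `escAscii(0x08, z)` stores "^H", and `escSymbol(0x07, z)` stores the bytes E2 90 87, which is the UTF-8 encoding of U+2407.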
+ +If the value of eEsc is QRF_ESC_Off, then no translation occurs +and control characters that appear in TEXT strings are transmitted +to the formatted output as-is. This can be dangerous in applications, +since an adversary who can control TEXT values might be able to +inject ANSI cursor movement sequences to hide nefarious values. + +The QRF_ESC_Auto value for eEsc means that the query result formatter +gets to pick whichever control-character encoding it thinks is best for +the situation. This will usually be QRF_ESC_Ascii. + +The TAB (U+0009) and LF (U+000a) characters and the CR-LF (U+000d,U+000a) +character sequence are always output literally and are not mapped to alternative +display values, regardless of this setting. + +### 2.6 Display of TEXT values (eText, eTitle) + +The sqlite3_qrf_spec.eText field controls how text values are rendered in the +display. sqlite3_qrf_spec.eTitle controls how column names are rendered. +Both fields can have one of the following values: + +> ~~~ +#define QRF_TEXT_Auto 0 /* Choose text encoding automatically */ +#define QRF_TEXT_Plain 1 /* Literal text */ +#define QRF_TEXT_Sql 2 /* Quote as an SQL literal */ +#define QRF_TEXT_Csv 3 /* CSV-style quoting */ +#define QRF_TEXT_Html 4 /* HTML-style quoting */ +#define QRF_TEXT_Tcl 5 /* C/Tcl quoting */ +#define QRF_TEXT_Json 6 /* JSON quoting */ +#define QRF_TEXT_Relaxed 7 /* Relaxed SQL quoting */ +~~~ + +A value of QRF_TEXT_Auto means that the query result formatter will choose +what it thinks will be the best text encoding. + +A value of QRF_TEXT_Plain means that text values appear in the output exactly +as they are found in the database file, with no translation. + +A value of QRF_TEXT_Sql means that text values are escaped so that they +look like SQL literals. That means the value will be surrounded by +single-quotes (U+0027) and any single-quotes contained within the text +will be doubled. 
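The SQL-literal rule in the preceding paragraph can be sketched in a few lines of C. This is a minimal illustration of the quoting rule only, not the code QRF uses: the `sqlQuote` name is hypothetical, and plain malloc() is used in place of the SQLite allocator so that the fragment stands alone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc()ed copy of zIn quoted as an SQL string literal:
** wrapped in single-quotes, with each embedded single-quote doubled.
** Returns NULL if memory cannot be allocated.  Caller must free(). */
static char *sqlQuote(const char *zIn){
  size_t n = 2;       /* Room for the two surrounding quotes */
  size_t i, j = 0;
  char *zOut;
  assert( zIn!=0 );
  for(i=0; zIn[i]; i++){
    n += (zIn[i]=='\'') ? 2 : 1;
  }
  zOut = malloc(n + 1);
  if( zOut==0 ) return 0;
  zOut[j++] = '\'';
  for(i=0; zIn[i]; i++){
    if( zIn[i]=='\'' ) zOut[j++] = '\'';   /* Double the quote */
    zOut[j++] = zIn[i];
  }
  zOut[j++] = '\'';
  zOut[j] = 0;
  return zOut;
}
```

Under this rule the input `it's` comes out as `'it''s'`, and the empty string comes out as `''`.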
+ +QRF_TEXT_Relaxed is similar to QRF_TEXT_Sql, except that it automatically +reverts to QRF_TEXT_Plain if the value to be displayed does not contain +special characters and is not easily confused with a NULL or a numeric +value. QRF_TEXT_Relaxed strives to minimize the amount of quoting syntax +while keeping the result unambiguous and easy for humans to read. The +precise rules for when quoting is omitted in QRF_TEXT_Relaxed, and when +it is applied, might be adjusted in future releases. + +A value of QRF_TEXT_Csv means that text values are escaped in accordance +with RFC 4180, which defines Comma-Separated-Value or CSV files. +Text strings that contain no special characters appear as-is. Text strings +that contain special characters are enclosed in double-quotes (U+0022) and +any double-quotes within the value are doubled. + +A value of QRF_TEXT_Html means that text values are escaped for use in +HTML. Special characters `<`, `&`, `>`, `"`, and `'` +are displayed as `&lt;`, `&amp;`, `&gt;`, `&quot;`, +and `&#39;`, respectively. + +A value of QRF_TEXT_Tcl means that text values are displayed inside of +double-quotes and special characters within the string are escaped using +backslash escapes, as in ANSI-C or TCL or Perl or other popular programming +languages. + +A value of QRF_TEXT_Json gives results similar to QRF_TEXT_Tcl except that the +rules are adjusted so that the displayed string conforms strictly to +the JSON specification. + +### 2.7 How to display BLOB values (eBlob and bTextJsonb) + +If the sqlite3_qrf_spec.bTextJsonb flag is QRF_SW_On and if the value to be +displayed is JSONB, then the JSONB is translated into text JSON and the +text is shown according to the sqlite3_qrf_spec.eText setting as +described in the previous section. + +If the bTextJsonb flag is QRF_SW_Off (the usual case) or if the BLOB value to +be displayed is not JSONB, then the sqlite3_qrf_spec.eBlob field determines +how the BLOB value is formatted. 
The following options are available: + +> ~~~ +#define QRF_BLOB_Auto 0 /* Determine BLOB quoting using eText */ +#define QRF_BLOB_Text 1 /* Display content exactly as it is */ +#define QRF_BLOB_Sql 2 /* Quote as an SQL literal */ +#define QRF_BLOB_Hex 3 /* Hexadecimal representation */ +#define QRF_BLOB_Tcl 4 /* "\000" notation */ +#define QRF_BLOB_Json 5 /* A JSON string */ +#define QRF_BLOB_Size 6 /* Display the blob size only */ +~~~ + +A value of QRF_BLOB_Auto means that the display format is selected automatically +by sqlite3_format_query_result() based on eStyle and eText. + +A value of QRF_BLOB_Text means that BLOB values are interpreted as UTF8 +text and are displayed using the formatting rules set by eEsc and +eText. + +A value of QRF_BLOB_Sql means that BLOB values are shown as SQL BLOB +literals: a prefix "`x'`" followed by hexadecimal and ending with a +final "`'`". + +A value of QRF_BLOB_Hex means that BLOB values are shown as +hexadecimal text with no delimiters. + +A value of QRF_BLOB_Tcl means that BLOB values are shown as a +C/Tcl/Perl string literal where every byte is an octal backslash +escape. So a BLOB of `x'052881f3'` would be displayed as +`"\005\050\201\363"`. + +A value of QRF_BLOB_Json is similar to QRF_BLOB_Tcl except that it +uses Unicode backslash escapes, since JSON does not understand +the C/Tcl/Perl octal backslash escapes. So the BLOB from the +previous paragraph would be shown as +`"\u0005\u0028\u0081\u00f3"`. + +A value of QRF_BLOB_Size does not show any BLOB content at all. +Instead, it substitutes a text string that says how many bytes +the BLOB contains. + +### 2.8 Maximum size of displayed content (nLineLimit, nCharLimit, nTitleLimit) + +If the sqlite3_qrf_spec.nCharLimit setting is non-zero, then the formatter +will display only the first nCharLimit characters of each value. +Only characters that take up space are counted when enforcing this +limit. 
Zero-width characters and VT100 escape sequences do not count +toward this limit. The count is in characters, not bytes. When +imposing this limit, the formatter adds the three characters "..." +to the end of the value. Those added characters are not counted +as part of the limit. Very small limits still result in truncation, +but might render a few more characters than the limit. + +If the sqlite3_qrf_spec.nLineLimit setting is non-zero, then the +formatter will only display the first nLineLimit lines of each value. +It does not matter if the value is split because it contains a newline +character, or if it is split by wrapping. This setting merely limits +the number of displayed lines. The nLineLimit setting currently only +works for **Box**, **Column**, **Line**, **Markdown**, and **Table** +styles, though that limitation might change in future releases. + +The idea behind both of these settings is to prevent large renderings +when doing a query that (unexpectedly) contains very large text or +blob values: perhaps megabytes of text. + +If the sqlite3_qrf_spec.nTitleLimit is non-zero, then the formatter +attempts to limit the size of column titles to at most nTitleLimit +display characters in width and a single line of text. The nTitleLimit +is useful for queries that have result columns that are scalar +subqueries or complex expressions. If those columns lack an AS +clause, then the name of the column will be a copy of the expression +that defines the column, which in some queries can be hundreds of +characters and multiple lines in length, which can reduce the readability +of tabular displays. An nTitleLimit somewhere in the range of 10 to 20 +can improve readability. The nTitleLimit setting currently only +works for **Box**, **Column**, **Line**, **Markdown**, and **Table** +styles, though that limitation might change in future releases. 
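The nCharLimit behavior described earlier in this section (count characters rather than bytes, then append "..." without counting it against the limit) can be sketched as follows. This is a simplified illustration under stated assumptions, not the QRF implementation: the helper name `truncate_chars` is hypothetical, every UTF-8 character is treated as one display character, and zero-width characters and VT100 escape sequences are not given the special treatment that the real formatter applies.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of QRF): return a malloc()ed copy of the
** UTF-8 string zIn limited to its first nCharLimit characters.  If content
** was cut off, "..." is appended.  A limit of zero or less means "no limit".
** Assumes zIn is valid UTF-8.  The caller must free() the result. */
static char *truncate_chars(const char *zIn, int nCharLimit){
  size_t nByte, i = 0;
  int nChar = 0;
  int truncated = 0;
  char *zOut;
  assert( zIn!=0 );
  nByte = strlen(zIn);
  if( nCharLimit>0 ){
    while( i<nByte && nChar<nCharLimit ){
      i++;                                   /* lead byte of the character */
      /* Skip UTF-8 continuation bytes (10xxxxxx) of the same character */
      while( i<nByte && ((unsigned char)zIn[i] & 0xC0)==0x80 ) i++;
      nChar++;
    }
    truncated = i<nByte;
  }else{
    i = nByte;
  }
  zOut = malloc(i + 4);                      /* room for "..." and NUL */
  if( zOut==0 ) return 0;
  memcpy(zOut, zIn, i);
  if( truncated ){
    memcpy(zOut+i, "...", 4);                /* copies the NUL as well */
  }else{
    zOut[i] = 0;
  }
  return zOut;
}
```

Counting by lead byte keeps multi-byte characters intact, so a limit never splits a character in the middle.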
+ +### 2.9 Multiple Tuples Per INSERT In QRF_STYLE_Insert (nMultiInsert) + +If the sqlite3_qrf_spec.nMultiInsert value is positive, then the +QRF_STYLE_Insert output mode will generate multiple tuples in +each INSERT statement until the total number of bytes in the +statement exceeds nMultiInsert. A value of a few thousand is +recommended here, in order to generate SQL output that is parsed +and inserted at maximum speed by SQLite. + +### 2.10 Word Wrapping In Columnar Styles (nWrap, bWordWrap) + +When using columnar formatting modes (QRF_STYLE_Box, QRF_STYLE_Column, +QRF_STYLE_Markdown, or QRF_STYLE_Table), the formatter attempts to limit +the width of any individual column to sqlite3_qrf_spec.nWrap characters +if nWrap is non-zero. A zero value for nWrap means "unlimited". +The nWrap limit might be exceeded if the limit is very small. + +In order to keep individual columns within requested width limits, +it is sometimes necessary to wrap the content for a single row of +a single column across multiple lines. When this +becomes necessary and if the bWordWrap setting is QRF_Yes, then the +formatter attempts to split the content on whitespace or at a word boundary. +If bWordWrap is QRF_No, then the formatter is free to split content +anywhere, including in the middle of a word. + +For narrow columns and wide words, it might sometimes be necessary to split +content in the middle of a word, even when bWordWrap is QRF_Yes. + +### 2.11 Helping The Output To Fit On The Terminal (nScreenWidth) + +The sqlite3_qrf_spec.nScreenWidth field can be set to the number of +characters that will fit on one line of the viewer's output device. +This is typically a number like 80 or 132. The formatter will attempt +to reduce the length of output lines, depending on the style, so +that all output fits on that screen. + +A value of zero for nScreenWidth means "unknown" or "no width limit". +When the value is zero, the formatter makes no attempt to keep the +lines of output short. 
+ +The nScreenWidth is a hint to the formatter, not a requirement. +The formatter tries to keep lines below the nScreenWidth limit, +but it does not guarantee that it will. + +The nScreenWidth field currently only makes a difference in +columnar styles (**Box**, **Column**, **Markdown**, and **Table**) +and in the **Line** style. + +### 2.12 Individual Column Width (nWidth and aWidth) + +The sqlite3_qrf_spec.aWidth field is a pointer to an array of +signed 16-bit integers that control the width of individual columns +in columnar output modes (QRF_STYLE_Box, QRF_STYLE_Column, +QRF_STYLE_Markdown, or QRF_STYLE_Table). The sqlite3_qrf_spec.nWidth +field is the number of integers in the aWidth array. + +If aWidth is a NULL pointer or if nWidth is zero, then the array is +assumed to be all zeros. If nWidth is less than the number of +columns in the output, then zero is used for the width +for all columns past the end of the aWidth array. + +The aWidth array is deliberately an array of 16-bit signed integers. +Only 16 bits are used because no good comes from having very large +column widths. The range is further restricted as follows: + +> ~~~ +#define QRF_MAX_WIDTH 10000 /* Maximum column width */ +#define QRF_MIN_WIDTH 0 /* Minimum column width */ +~~~ + +A width greater than QRF_MAX_WIDTH is interpreted as QRF_MAX_WIDTH. + +Any aWidth\[\] value of zero means the formatter should use a flexible +width column (limited only by sqlite3_qrf_spec.mxWidth) that is just +big enough to hold the largest row. + +For historical compatibility, aWidth\[\] can contain negative values, +down to -QRF_MAX_WIDTH. The column width used is the absolute value +of the number in aWidth\[\]. The only difference is that negative +values cause the default horizontal alignment to be QRF_ALIGN_Right. +The sign of the aWidth\[\] values only affects alignment if the +alignment is not otherwise specified by aAlign\[\] or eDfltAlign. 
+Again, negative values for aWidth\[\] entries are supported for +backwards compatibility only, and are not recommended for new +applications. + +### 2.13 Alignment (nAlign, aAlign, eDfltAlign, eTitleAlign) + +Some cells in a display table might contain a lot of text and thus +be wide, or they might contain newline characters or be wrapped by +width constraints so that they span many rows of text. Other cells +might be narrower and shorter. In columnar formats, the display width +of a cell is the width of the widest value in the same column, and the +display height is the height of the tallest value in the same row. +So some cells might be much taller and wider than necessary to hold +their values. + +Alignment determines where smaller values are placed within larger cells. + +The sqlite3_qrf_spec.aAlign field points to an array of unsigned characters +that specifies alignment (both vertical and horizontal) of individual +columns within the table. The sqlite3_qrf_spec.nAlign field holds +the number of entries in the aAlign\[\] array. + +If sqlite3_qrf_spec.aAlign is a NULL pointer or if sqlite3_qrf_spec.nAlign +is zero, or for columns to the right of what are specified by +sqlite3_qrf_spec.nAlign, the sqlite3_qrf_spec.eDfltAlign value is used +for the alignment. Column names can be (and often are) aligned +differently, as specified by sqlite3_qrf_spec.eTitleAlign. + +Each alignment value specifies both vertical and horizontal alignment. +Horizontal alignment can be left, center, right, or no preference. +Vertical alignment can be top, middle, bottom, or no preference. 
+Thus there are 16 possible alignment values, as follows: + +> ~~~ +/* +** Horizontal Vertical +** ---------- -------- */ +#define QRF_ALIGN_Auto 0 /* auto auto */ +#define QRF_ALIGN_Left 1 /* left auto */ +#define QRF_ALIGN_Center 2 /* center auto */ +#define QRF_ALIGN_Right 3 /* right auto */ +#define QRF_ALIGN_Top 4 /* auto top */ +#define QRF_ALIGN_NW 5 /* left top */ +#define QRF_ALIGN_N 6 /* center top */ +#define QRF_ALIGN_NE 7 /* right top */ +#define QRF_ALIGN_Middle 8 /* auto middle */ +#define QRF_ALIGN_W 9 /* left middle */ +#define QRF_ALIGN_C 10 /* center middle */ +#define QRF_ALIGN_E 11 /* right middle */ +#define QRF_ALIGN_Bottom 12 /* auto bottom */ +#define QRF_ALIGN_SW 13 /* left bottom */ +#define QRF_ALIGN_S 14 /* center bottom */ +#define QRF_ALIGN_SE 15 /* right bottom */ +~~~ + +Notice how alignment values with an unspecified horizontal +or vertical component can be added to another alignment value +for which that component is specified, to get a fully +specified alignment. For example: + +> QRF_ALIGN_Center + QRF_ALIGN_Bottom == QRF_ALIGN_S. + +The alignment for column names is always determined by the +eTitleAlign setting. If eTitleAlign is QRF_ALIGN_Auto, then column +names use center-bottom alignment, QRF_ALIGN_S, value 14. +The aAlign\[\] and eDfltAlign settings have no effect on +column names. + +For data in the first nAlign columns, the aAlign\[\] array +entry for that column takes precedence. If either the horizontal +or vertical alignment has an "auto" value for that column or if +a column is beyond the first nAlign entries, then eDfltAlign +is used as a backup. If neither aAlign\[\] nor eDfltAlign +specifies a horizontal alignment, then values are right-aligned +(QRF_ALIGN_Right) if they are numeric and left-aligned +(QRF_ALIGN_Left) otherwise. If neither aAlign\[\] nor eDfltAlign +specifies a vertical alignment, then values are top-aligned +(QRF_ALIGN_Top). + +*As of 2025-11-08, only horizontal alignment is implemented. 
+The vertical alignment settings are currently ignored and +the vertical alignment is always QRF_ALIGN_Top.* + +### 2.14 Row and Column Separator Strings + +The sqlite3_qrf_spec.zColumnSep and sqlite3_qrf_spec.zRowSep strings +are alternative column and row separator character sequences. If not +specified (if these pointers are left as NULL) then appropriate defaults +are used. Some output styles have hard-coded column and row separators +and these settings are ignored for those styles. + +### 2.15 The Output Table Name + +The sqlite3_qrf_spec.zTableName value is the name of the output table +when eStyle is QRF_STYLE_Insert. + +### 2.16 The Rendering Of NULL (zNull) + +If a value is NULL then show the NULL using the string +found in sqlite3_qrf_spec.zNull. If zNull is itself a NULL pointer +then NULL values are rendered as an empty string. + +### 2.17 Optional Value Rendering Callback + +If the sqlite3_qrf_spec.xRender field is not NULL, then each +sqlite3_value coming out of the query is first passed to the +xRender function, giving that function an opportunity to render +the results itself, using whatever custom format is desired. +If xRender chooses to render, it should write the rendering +into memory obtained from sqlite3_malloc() and return a pointer +to that memory. The xRender function can decline +to render (for example, based on the sqlite3_value_type() or other +characteristics of the value) in which case it can simply return a +NULL pointer and the usual default rendering will be used instead. + +The sqlite3_format_query_result() function (which calls xRender) +will take responsibility for freeing the string returned by xRender +after it has finished using it. + +The eText, eBlob, and eEsc settings above become no-ops if the xRender +routine returns non-NULL. In other words, the application-supplied +xRender routine is expected to do all of its own quoting and formatting. + +The xRender routine is expected to do character length limiting itself. 
+So the nCharLimit setting becomes a no-op if xRender is used. However +the nLineLimit setting is still applied. The nTitleLimit setting is +not applicable to xRender because title values come from the +sqlite3_column_name() interface, not from sqlite3_column_value(), +and so the names of columns are never processed by xRender. + +## 3.0 The `sqlite3_format_query_result()` Interface + +Invoke the `sqlite3_format_query_result(P,S,E)` interface to run +the prepared statement P and format its results according to the +specification found in S. The sqlite3_format_query_result() function +will return an SQLite result code, usually SQLITE_OK, but perhaps +SQLITE_NOMEM or SQLITE_ERROR or similar. If an error occurs and if +the E parameter is not NULL, then error message text might be written +into *E. Any error message text will be stored in memory obtained +from sqlite3_malloc() and it is the responsibility of the caller to +free that memory by a subsequent call to sqlite3_free(). + + +## 4.0 Output Styles + +The result formatter supports a variety of output styles. The +output style (sometimes called "output mode") is determined by +the eStyle field of the sqlite3_qrf_spec object. The set of +supported output modes might increase in future versions. 
+The following output modes are currently defined: + +> ~~~ +#define QRF_STYLE_Auto 0 /* Choose a style automatically */ +#define QRF_STYLE_Box 1 /* Unicode box-drawing characters */ +#define QRF_STYLE_Column 2 /* One record per line in neat columns */ +#define QRF_STYLE_Count 3 /* Output only a count of the rows of output */ +#define QRF_STYLE_Csv 4 /* Comma-separated-value */ +#define QRF_STYLE_Eqp 5 /* Format EXPLAIN QUERY PLAN output */ +#define QRF_STYLE_Explain 6 /* EXPLAIN output */ +#define QRF_STYLE_Html 7 /* Generate an XHTML table */ +#define QRF_STYLE_Insert 8 /* Generate SQL "insert" statements */ +#define QRF_STYLE_Json 9 /* Output is a list of JSON objects */ +#define QRF_STYLE_JObject 10 /* Independent JSON objects for each row */ +#define QRF_STYLE_Line 11 /* One column per line. */ +#define QRF_STYLE_List 12 /* One record per line with a separator */ +#define QRF_STYLE_Markdown 13 /* Markdown formatting */ +#define QRF_STYLE_Off 14 /* No query output shown */ +#define QRF_STYLE_Quote 15 /* SQL-quoted, comma-separated */ +#define QRF_STYLE_Stats 16 /* EQP-like output but with performance stats */ +#define QRF_STYLE_StatsEst 17 /* EQP-like output with planner estimates */ +#define QRF_STYLE_StatsVm 18 /* EXPLAIN-like output with performance stats */ +#define QRF_STYLE_Table 19 /* MySQL-style table formatting */ +~~~ + +In the following subsections, these styles will often be referred +to without the "QRF_STYLE_" prefix. + +### 4.1 Default Style (Auto) + +The **Auto** style means QRF gets to choose an appropriate output +style. It will usually choose **Box**, but might also pick one of +**Explain** or **Eqp** if the `sqlite3_stmt_explain()` function +returns 1 or 2, respectively. + +### 4.2 Columnar Styles (Box, Column, Markdown, Table) + +The **Box**, **Column**, **Markdown**, and **Table** +modes are columnar. This means the output is arranged into neat, +uniform-width columns. 
These styles can use more memory, especially when +the query result has many rows, because they need to load the entire output +into memory first in order to determine how wide to make each column. + +The nWidth, aWidth, and mxWidth fields of the `sqlite3_qrf_spec` object +are used by these styles only, and are ignored by all other styles. +The zRowSep and zColumnSep settings are ignored by these styles. The +bTitles setting is honored by these styles; it defaults to QRF_SW_On. + +The **Box** style uses Unicode box-drawing characters to draw a grid +of columns and rows to show the result. The **Table** style is the same, +except that it uses ASCII-art rather than Unicode box-drawing characters +to draw the grid. The **Column** style arranges the results in neat columns +but does not draw column or row separators, except that it does draw +horizontal lines using "`-`" characters to separate the column names +from the data below. This is very similar to the default output styling in +psql. The **Markdown** style renders its result in the Markdown table format. + +The **Box** and **Table** styles normally have a border that surrounds +the entire result. However, if sqlite3_qrf_spec.bBorder is QRF_No, then +that border is omitted, saving a little space both horizontally and +vertically. + +#### 4.2.1 Split Column Mode + +If the bSplitColumn field is QRF_Yes, and eStyle is QRF_STYLE_Column, +and bTitles is QRF_No, and nScreenWidth is greater than zero, and if +the query only returns a single column, then a special rendering known +as "Split Column Mode" will be used. In split column mode, instead +of showing all results in one tall column, the content wraps vertically +so that it appears on the screen as multiple columns, as many as will +fit in the available screen width. + +### 4.3 Line-oriented Styles + +The line-oriented styles output each row of result as it is received from +the prepared statement. + +The **List** style is the most familiar line-oriented output format. 
+The **List** style shows output columns for each row on the +same line, each separated by a single "`|`" character and with lines +terminated by a single newline (\\u000a or \\n). These column +and row separator choices can be overridden using the zColumnSep +and zRowSep fields of the `sqlite3_qrf_spec` structure. The text +formatting is QRF_TEXT_Plain, and BLOB encoding is QRF_BLOB_Text. So +characters appear in the output exactly as they appear in the database, +except that the eEsc mode defaults to `QRF_ESC_On`, so that control +characters are escaped, for safety. + +The **Csv** and **Quote** styles are simply variations on **List** +with hard-coded values for some of the sqlite3_qrf_spec settings: + + +
| Setting | Quote | Csv | +
|---|---|---| +
| zColumnSep | "," | "," | +
| zRowSep | "\\n" | "\\r\\n" | +
| zNull | "NULL" | "" | +
| eText | QRF_TEXT_Sql | QRF_TEXT_Csv | +
| eBlob | QRF_BLOB_Sql | QRF_BLOB_Text | +
+ +The **Html** style generates HTML table content, just without +the `<table>..</table>` around the outside. + +The **Insert** style generates a series of SQL "INSERT" statements +that will insert the data that is output into a table whose name is defined +by the zTableName field of `sqlite3_qrf_spec`. If zTableName is NULL, +then a substitute name is used. If nMultiInsert is positive, then the +output will add multiple rows to each INSERT statement until the size +of the INSERT statement exceeds nMultiInsert bytes, and then start +a new INSERT statement. + +The **Json** and **JObject** styles generate JSON text for the query result. +The **Json** style produces a JSON array of structures with one +structure per row. **JObject** outputs independent JSON objects, one per +row, with each structure on a separate line all by itself, and not +part of a larger array. In both cases, the labels on the elements of the +JSON objects are taken from the column names of the SQL query. So if +you have an SQL query that has two or more output columns with the same +name, you will end up with JSON structures that have duplicate elements. + +Finally, the **Line** style paints each column of a row on a +separate line with the column name on the left and a "`=`" separating the +column name from its value. A single blank line appears between rows. + +### 4.4 EXPLAIN Styles (Eqp, Explain) + +The **Eqp** and **Explain** styles format output for +EXPLAIN QUERY PLAN and EXPLAIN statements, respectively. If the input +statement is not already an EXPLAIN QUERY PLAN or EXPLAIN statement, it +is temporarily converted for the duration of the rendering, but +is converted back before `sqlite3_format_query_result()` returns. + +### 4.5 ScanStatus Styles (Stats, StatsEst, StatsVm) + +The **Stats**, **StatsEst**, and **StatsVm** styles are similar to **Eqp** +and **Explain** except that they include profiling information +from prior executions of the input prepared statement. 
+These modes only work if SQLite has been compiled with +-DSQLITE_ENABLE_STMT_SCANSTATUS and if the SQLITE_DBCONFIG_STMT_SCANSTATUS +is enabled for the database connection. The **StatsVm** style +also requires the bytecode() virtual table which is enabled using +the -DSQLITE_ENABLE_BYTECODE_VTAB compile-time option. + +### 4.6 Other Styles (Count, Off) + +The **Count** style discards all query results and returns +a count of the number of rows of output at the end. The **Off** +style is completely silent; it generates no output. These corner-case +modes are sometimes useful for debugging. + +## 5.0 Source Code Files + +The SQLite Query Result Formatter is implemented in three source code files: + + * `qrf.c` → The implementation, written in portable C99 + * `qrf.h` → A header file defining interfaces + * `README.md` → This documentation + +To use the SQLite result formatter, include the "`qrf.h`" header file +and link the application against the "`qrf.c`" source file. diff --git a/ext/qrf/dev-notes.md b/ext/qrf/dev-notes.md new file mode 100644 index 0000000000..a46aada834 --- /dev/null +++ b/ext/qrf/dev-notes.md @@ -0,0 +1,14 @@ +# Developer Notes + +## Measuring Test Coverage On Linux + +On Mint Linux, as of 2025-12-02: + +> ~~~ +./configure --dev CFLAGS='-O0 -g -fprofile-arcs -ftest-coverage' +make clean testfixture +./testfixture test/qrf*.test +gcov -b -c testfixture-tclsqlite-ex.c +~~~ + +View results in tclsqlite-ex.c.gcov diff --git a/ext/qrf/qrf.c b/ext/qrf/qrf.c new file mode 100644 index 0000000000..4d559581cb --- /dev/null +++ b/ext/qrf/qrf.c @@ -0,0 +1,3007 @@ +/* +** 2025-10-20 +** +** The author disclaims copyright to this source code. In place of +** a legal notice, here is a blessing: +** +** May you do good and not evil. +** May you find forgiveness for yourself and forgive others. +** May you share freely, never taking more than you give. 
+** +************************************************************************* +** Implementation of the Query Result-Format or "qrf" utility library for +** SQLite. See the README.md documentation for additional information. +*/ +#ifndef SQLITE_QRF_H +#include "qrf.h" +#endif +#include +#include +#include + +#ifndef SQLITE_AMALGAMATION +typedef sqlite3_int64 i64; +#endif + +/* A single line in the EQP output */ +typedef struct qrfEQPGraphRow qrfEQPGraphRow; +struct qrfEQPGraphRow { + int iEqpId; /* ID for this row */ + int iParentId; /* ID of the parent row */ + qrfEQPGraphRow *pNext; /* Next row in sequence */ + char zText[1]; /* Text to display for this row */ +}; + +/* All EQP output is collected into an instance of the following */ +typedef struct qrfEQPGraph qrfEQPGraph; +struct qrfEQPGraph { + qrfEQPGraphRow *pRow; /* Linked list of all rows of the EQP output */ + qrfEQPGraphRow *pLast; /* Last element of the pRow list */ + int nWidth; /* Width of the graph */ + char zPrefix[400]; /* Graph prefix */ +}; + +/* +** Private state information. Subject to change from one release to the +** next. 
+*/ +typedef struct Qrf Qrf; +struct Qrf { + sqlite3_stmt *pStmt; /* The statement whose output is to be rendered */ + sqlite3 *db; /* The corresponding database connection */ + sqlite3_stmt *pJTrans; /* JSONB to JSON translator statement */ + char **pzErr; /* Write error message here, if not NULL */ + sqlite3_str *pOut; /* Accumulated output */ + int iErr; /* Error code */ + int nCol; /* Number of output columns */ + int expMode; /* Original sqlite3_stmt_isexplain() plus 1 */ + int mxWidth; /* Screen width */ + int mxHeight; /* nLineLimit */ + union { + struct { /* Content for QRF_STYLE_Line */ + int mxColWth; /* Maximum display width of any column */ + char **azCol; /* Names of output columns (MODE_Line) */ + } sLine; + qrfEQPGraph *pGraph; /* EQP graph (Eqp, Stats, and StatsEst) */ + struct { /* Content for QRF_STYLE_Explain */ + int nIndent; /* Slots allocated for aiIndent */ + int iIndent; /* Current slot */ + int *aiIndent; /* Indentation for each opcode */ + } sExpln; + unsigned int nIns; /* Bytes used for current INSERT stmt */ + } u; + sqlite3_int64 nRow; /* Number of rows handled so far */ + int *actualWidth; /* Actual width of each column */ + sqlite3_qrf_spec spec; /* Copy of the original spec */ +}; + +/* +** Data for substitute ctype.h functions. Used for x-platform +** consistency and so that '_' is counted as an alphabetic +** character. 
+** +** 0x01 - space +** 0x02 - digit +** 0x04 - alphabetic, including '_' +*/ +static const char qrfCType[] = { + 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, + 0, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, + 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 0, 0, 0, 0, 4, + 0, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, + 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 +}; +#define qrfSpace(x) ((qrfCType[(unsigned char)x]&1)!=0) +#define qrfDigit(x) ((qrfCType[(unsigned char)x]&2)!=0) +#define qrfAlpha(x) ((qrfCType[(unsigned char)x]&4)!=0) +#define qrfAlnum(x) ((qrfCType[(unsigned char)x]&6)!=0) + +#ifndef deliberate_fall_through +/* Quiet some compilers about some of our intentional code. */ +# if defined(GCC_VERSION) && GCC_VERSION>=7000000 +# define deliberate_fall_through __attribute__((fallthrough)); +# else +# define deliberate_fall_through +# endif +#endif + +/* +** Set an error code and error message. +*/ +static void qrfError( + Qrf *p, /* Query result state */ + int iCode, /* Error code */ + const char *zFormat, /* Message format (or NULL) */ + ... +){ + p->iErr = iCode; + if( p->pzErr!=0 ){ + sqlite3_free(*p->pzErr); + *p->pzErr = 0; + if( zFormat ){ + va_list ap; + va_start(ap, zFormat); + *p->pzErr = sqlite3_vmprintf(zFormat, ap); + va_end(ap); + } + } +} + +/* +** Out-of-memory error. +*/ +static void qrfOom(Qrf *p){ + qrfError(p, SQLITE_NOMEM, "out of memory"); +} + +/* +** Transfer any error in pStr over into p. 
+*/ +static void qrfStrErr(Qrf *p, sqlite3_str *pStr){ + int rc = pStr ? sqlite3_str_errcode(pStr) : 0; + if( rc ){ + qrfError(p, rc, sqlite3_errstr(rc)); + } +} + + +/* +** Add a new entry to the EXPLAIN QUERY PLAN data +*/ +static void qrfEqpAppend(Qrf *p, int iEqpId, int p2, const char *zText){ + qrfEQPGraphRow *pNew; + sqlite3_int64 nText; + if( zText==0 ) return; + if( p->u.pGraph==0 ){ + p->u.pGraph = sqlite3_malloc64( sizeof(qrfEQPGraph) ); + if( p->u.pGraph==0 ){ + qrfOom(p); + return; + } + memset(p->u.pGraph, 0, sizeof(qrfEQPGraph) ); + } + nText = strlen(zText); + pNew = sqlite3_malloc64( sizeof(*pNew) + nText ); + if( pNew==0 ){ + qrfOom(p); + return; + } + pNew->iEqpId = iEqpId; + pNew->iParentId = p2; + memcpy(pNew->zText, zText, nText+1); + pNew->pNext = 0; + if( p->u.pGraph->pLast ){ + p->u.pGraph->pLast->pNext = pNew; + }else{ + p->u.pGraph->pRow = pNew; + } + p->u.pGraph->pLast = pNew; +} + +/* +** Free and reset the EXPLAIN QUERY PLAN data that has been collected +** in p->u.pGraph. +*/ +static void qrfEqpReset(Qrf *p){ + qrfEQPGraphRow *pRow, *pNext; + if( p->u.pGraph ){ + for(pRow = p->u.pGraph->pRow; pRow; pRow = pNext){ + pNext = pRow->pNext; + sqlite3_free(pRow); + } + sqlite3_free(p->u.pGraph); + p->u.pGraph = 0; + } +} + +/* Return the next EXPLAIN QUERY PLAN line with iEqpId that occurs after +** pOld, or return the first such line if pOld is NULL +*/ +static qrfEQPGraphRow *qrfEqpNextRow(Qrf *p, int iEqpId, qrfEQPGraphRow *pOld){ + qrfEQPGraphRow *pRow = pOld ? pOld->pNext : p->u.pGraph->pRow; + while( pRow && pRow->iParentId!=iEqpId ) pRow = pRow->pNext; + return pRow; +} + +/* Render a single level of the graph that has iEqpId as its parent. Called +** recursively to render sublevels. 
+*/ +static void qrfEqpRenderLevel(Qrf *p, int iEqpId){ + qrfEQPGraphRow *pRow, *pNext; + i64 n = strlen(p->u.pGraph->zPrefix); + char *z; + for(pRow = qrfEqpNextRow(p, iEqpId, 0); pRow; pRow = pNext){ + pNext = qrfEqpNextRow(p, iEqpId, pRow); + z = pRow->zText; + sqlite3_str_appendf(p->pOut, "%s%s%s\n", p->u.pGraph->zPrefix, + pNext ? "|--" : "`--", z); + if( n<(i64)sizeof(p->u.pGraph->zPrefix)-7 ){ + memcpy(&p->u.pGraph->zPrefix[n], pNext ? "| " : " ", 4); + qrfEqpRenderLevel(p, pRow->iEqpId); + p->u.pGraph->zPrefix[n] = 0; + } + } +} + +/* +** Render the 64-bit value N in a more human-readable format into +** pOut. +** +** + Only show the first three significant digits. +** + Append suffixes K, M, G, T, P, and E for 1e3, 1e6, ... 1e18 +*/ +static void qrfApproxInt64(sqlite3_str *pOut, i64 N){ + static const char aSuffix[] = { 'K', 'M', 'G', 'T', 'P', 'E' }; + int i; + if( N<0 ){ + N = N==INT64_MIN ? INT64_MAX : -N; + sqlite3_str_append(pOut, "-", 1); + } + if( N<10000 ){ + sqlite3_str_appendf(pOut, "%4lld ", N); + return; + } + for(i=1; i<=18; i++){ + N = (N+5)/10; + if( N<10000 ){ + int n = (int)N; + switch( i%3 ){ + case 0: + sqlite3_str_appendf(pOut, "%d.%02d", n/1000, (n%1000)/10); + break; + case 1: + sqlite3_str_appendf(pOut, "%2d.%d", n/100, (n%100)/10); + break; + case 2: + sqlite3_str_appendf(pOut, "%4d", n/10); + break; + } + sqlite3_str_append(pOut, &aSuffix[i/3], 1); + break; + } + } +} + +/* +** Display and reset the EXPLAIN QUERY PLAN data +*/ +static void qrfEqpRender(Qrf *p, i64 nCycle){ + qrfEQPGraphRow *pRow; + if( p->u.pGraph!=0 && (pRow = p->u.pGraph->pRow)!=0 ){ + if( pRow->zText[0]=='-' ){ + if( pRow->pNext==0 ){ + qrfEqpReset(p); + return; + } + sqlite3_str_appendf(p->pOut, "%s\n", pRow->zText+3); + p->u.pGraph->pRow = pRow->pNext; + sqlite3_free(pRow); + }else if( nCycle>0 ){ + int nSp = p->u.pGraph->nWidth - 2; + if( p->spec.eStyle==QRF_STYLE_StatsEst ){ + sqlite3_str_appendchar(p->pOut, nSp, ' '); + sqlite3_str_appendall(p->pOut, + 
"Cycles Loops (est) Rows (est)\n"); + sqlite3_str_appendchar(p->pOut, nSp, ' '); + sqlite3_str_appendall(p->pOut, + "---------- ------------ ------------\n"); + }else{ + sqlite3_str_appendchar(p->pOut, nSp, ' '); + sqlite3_str_appendall(p->pOut, + "Cycles Loops Rows \n"); + sqlite3_str_appendchar(p->pOut, nSp, ' '); + sqlite3_str_appendall(p->pOut, + "---------- ----- -----\n"); + } + sqlite3_str_appendall(p->pOut, "QUERY PLAN"); + sqlite3_str_appendchar(p->pOut, nSp - 10, ' '); + qrfApproxInt64(p->pOut, nCycle); + sqlite3_str_appendall(p->pOut, " 100%\n"); + }else{ + sqlite3_str_appendall(p->pOut, "QUERY PLAN\n"); + } + p->u.pGraph->zPrefix[0] = 0; + qrfEqpRenderLevel(p, 0); + qrfEqpReset(p); + } +} + +#ifdef SQLITE_ENABLE_STMT_SCANSTATUS +/* +** Helper function for qrfExpStats(). +** +*/ +static int qrfStatsHeight(sqlite3_stmt *p, int iEntry){ + int iPid = 0; + int ret = 1; + sqlite3_stmt_scanstatus_v2(p, iEntry, + SQLITE_SCANSTAT_SELECTID, SQLITE_SCANSTAT_COMPLEX, (void*)&iPid + ); + while( iPid!=0 ){ + int ii; + for(ii=0; 1; ii++){ + int iId; + int res; + res = sqlite3_stmt_scanstatus_v2(p, ii, + SQLITE_SCANSTAT_SELECTID, SQLITE_SCANSTAT_COMPLEX, (void*)&iId + ); + if( res ) break; + if( iId==iPid ){ + sqlite3_stmt_scanstatus_v2(p, ii, + SQLITE_SCANSTAT_PARENTID, SQLITE_SCANSTAT_COMPLEX, (void*)&iPid + ); + } + } + ret++; + } + return ret; +} +#endif /* SQLITE_ENABLE_STMT_SCANSTATUS */ + + +/* +** Generate ".scanstatus est" style of EQP output. 
+*/ +static void qrfEqpStats(Qrf *p){ +#ifndef SQLITE_ENABLE_STMT_SCANSTATUS + qrfError(p, SQLITE_ERROR, "not available in this build"); +#else + static const int f = SQLITE_SCANSTAT_COMPLEX; + sqlite3_stmt *pS = p->pStmt; + int i = 0; + i64 nTotal = 0; + int nWidth = 0; + int prevPid = -1; /* Previous iPid */ + double rEstCum = 1.0; /* Cumulative row estimate */ + sqlite3_str *pLine = sqlite3_str_new(p->db); + sqlite3_str *pStats = sqlite3_str_new(p->db); + qrfEqpReset(p); + + for(i=0; 1; i++){ + const char *z = 0; + int n = 0; + if( sqlite3_stmt_scanstatus_v2(pS,i,SQLITE_SCANSTAT_EXPLAIN,f,(void*)&z) ){ + break; + } + n = (int)strlen(z) + qrfStatsHeight(pS,i)*3; + if( n>nWidth ) nWidth = n; + } + nWidth += 2; + + sqlite3_stmt_scanstatus_v2(pS,-1, SQLITE_SCANSTAT_NCYCLE, f, (void*)&nTotal); + for(i=0; 1; i++){ + i64 nLoop = 0; + i64 nRow = 0; + i64 nCycle = 0; + int iId = 0; + int iPid = 0; + const char *zo = 0; + const char *zName = 0; + double rEst = 0.0; + + if( sqlite3_stmt_scanstatus_v2(pS,i,SQLITE_SCANSTAT_EXPLAIN,f,(void*)&zo) ){ + break; + } + sqlite3_stmt_scanstatus_v2(pS,i, SQLITE_SCANSTAT_PARENTID,f,(void*)&iPid); + if( iPid!=prevPid ){ + prevPid = iPid; + rEstCum = 1.0; + } + sqlite3_stmt_scanstatus_v2(pS,i, SQLITE_SCANSTAT_EST,f,(void*)&rEst); + rEstCum *= rEst; + sqlite3_stmt_scanstatus_v2(pS,i, SQLITE_SCANSTAT_NLOOP,f,(void*)&nLoop); + sqlite3_stmt_scanstatus_v2(pS,i, SQLITE_SCANSTAT_NVISIT,f,(void*)&nRow); + sqlite3_stmt_scanstatus_v2(pS,i, SQLITE_SCANSTAT_NCYCLE,f,(void*)&nCycle); + sqlite3_stmt_scanstatus_v2(pS,i, SQLITE_SCANSTAT_SELECTID,f,(void*)&iId); + sqlite3_stmt_scanstatus_v2(pS,i, SQLITE_SCANSTAT_NAME,f,(void*)&zName); + + if( nCycle>=0 || nLoop>=0 || nRow>=0 ){ + int nSp = 0; + sqlite3_str_reset(pStats); + if( nCycle>=0 && nTotal>0 ){ + qrfApproxInt64(pStats, nCycle); + sqlite3_str_appendf(pStats, " %3d%%", + ((nCycle*100)+nTotal/2) / nTotal + ); + nSp = 2; + } + if( nLoop>=0 ){ + if( nSp ) sqlite3_str_appendchar(pStats, nSp, ' '); + 
qrfApproxInt64(pStats, nLoop); + nSp = 2; + if( p->spec.eStyle==QRF_STYLE_StatsEst ){ + sqlite3_str_appendf(pStats, " "); + qrfApproxInt64(pStats, (i64)(rEstCum/rEst)); + } + } + if( nRow>=0 ){ + if( nSp ) sqlite3_str_appendchar(pStats, nSp, ' '); + qrfApproxInt64(pStats, nRow); + nSp = 2; + if( p->spec.eStyle==QRF_STYLE_StatsEst ){ + sqlite3_str_appendf(pStats, " "); + qrfApproxInt64(pStats, (i64)rEstCum); + } + } + sqlite3_str_appendf(pLine, + "% *s %s", -1*(nWidth-qrfStatsHeight(pS,i)*3), zo, + sqlite3_str_value(pStats) + ); + sqlite3_str_reset(pStats); + qrfEqpAppend(p, iId, iPid, sqlite3_str_value(pLine)); + sqlite3_str_reset(pLine); + }else{ + qrfEqpAppend(p, iId, iPid, zo); + } + } + if( p->u.pGraph ) p->u.pGraph->nWidth = nWidth; + qrfStrErr(p, pLine); + sqlite3_free(sqlite3_str_finish(pLine)); + qrfStrErr(p, pStats); + sqlite3_free(sqlite3_str_finish(pStats)); +#endif +} + + +/* +** Reset the prepared statement. +*/ +static void qrfResetStmt(Qrf *p){ + int rc = sqlite3_reset(p->pStmt); + if( rc!=SQLITE_OK && p->iErr==SQLITE_OK ){ + qrfError(p, rc, "%s", sqlite3_errmsg(p->db)); + } +} + +/* +** If xWrite is defined, send all content of pOut to xWrite and +** reset pOut. +*/ +static void qrfWrite(Qrf *p){ + int n; + if( p->spec.xWrite && (n = sqlite3_str_length(p->pOut))>0 ){ + int rc = p->spec.xWrite(p->spec.pWriteArg, + sqlite3_str_value(p->pOut), + (sqlite3_int64)n); + sqlite3_str_reset(p->pOut); + if( rc ){ + qrfError(p, rc, "Failed to write %d bytes of output", n); + } + } +} + +/* Lookup table to estimate the number of columns consumed by a Unicode +** character. 
+*/ +static const struct { + unsigned char w; /* Width of the character in columns */ + int iFirst; /* First character in a span having this width */ +} aQrfUWidth[] = { + /* {1, 0x00000}, */ + {0, 0x00300}, {1, 0x00370}, {0, 0x00483}, {1, 0x00487}, {0, 0x00488}, + {1, 0x0048a}, {0, 0x00591}, {1, 0x005be}, {0, 0x005bf}, {1, 0x005c0}, + {0, 0x005c1}, {1, 0x005c3}, {0, 0x005c4}, {1, 0x005c6}, {0, 0x005c7}, + {1, 0x005c8}, {0, 0x00600}, {1, 0x00604}, {0, 0x00610}, {1, 0x00616}, + {0, 0x0064b}, {1, 0x0065f}, {0, 0x00670}, {1, 0x00671}, {0, 0x006d6}, + {1, 0x006e5}, {0, 0x006e7}, {1, 0x006e9}, {0, 0x006ea}, {1, 0x006ee}, + {0, 0x0070f}, {1, 0x00710}, {0, 0x00711}, {1, 0x00712}, {0, 0x00730}, + {1, 0x0074b}, {0, 0x007a6}, {1, 0x007b1}, {0, 0x007eb}, {1, 0x007f4}, + {0, 0x00901}, {1, 0x00903}, {0, 0x0093c}, {1, 0x0093d}, {0, 0x00941}, + {1, 0x00949}, {0, 0x0094d}, {1, 0x0094e}, {0, 0x00951}, {1, 0x00955}, + {0, 0x00962}, {1, 0x00964}, {0, 0x00981}, {1, 0x00982}, {0, 0x009bc}, + {1, 0x009bd}, {0, 0x009c1}, {1, 0x009c5}, {0, 0x009cd}, {1, 0x009ce}, + {0, 0x009e2}, {1, 0x009e4}, {0, 0x00a01}, {1, 0x00a03}, {0, 0x00a3c}, + {1, 0x00a3d}, {0, 0x00a41}, {1, 0x00a43}, {0, 0x00a47}, {1, 0x00a49}, + {0, 0x00a4b}, {1, 0x00a4e}, {0, 0x00a70}, {1, 0x00a72}, {0, 0x00a81}, + {1, 0x00a83}, {0, 0x00abc}, {1, 0x00abd}, {0, 0x00ac1}, {1, 0x00ac6}, + {0, 0x00ac7}, {1, 0x00ac9}, {0, 0x00acd}, {1, 0x00ace}, {0, 0x00ae2}, + {1, 0x00ae4}, {0, 0x00b01}, {1, 0x00b02}, {0, 0x00b3c}, {1, 0x00b3d}, + {0, 0x00b3f}, {1, 0x00b40}, {0, 0x00b41}, {1, 0x00b44}, {0, 0x00b4d}, + {1, 0x00b4e}, {0, 0x00b56}, {1, 0x00b57}, {0, 0x00b82}, {1, 0x00b83}, + {0, 0x00bc0}, {1, 0x00bc1}, {0, 0x00bcd}, {1, 0x00bce}, {0, 0x00c3e}, + {1, 0x00c41}, {0, 0x00c46}, {1, 0x00c49}, {0, 0x00c4a}, {1, 0x00c4e}, + {0, 0x00c55}, {1, 0x00c57}, {0, 0x00cbc}, {1, 0x00cbd}, {0, 0x00cbf}, + {1, 0x00cc0}, {0, 0x00cc6}, {1, 0x00cc7}, {0, 0x00ccc}, {1, 0x00cce}, + {0, 0x00ce2}, {1, 0x00ce4}, {0, 0x00d41}, {1, 0x00d44}, {0, 0x00d4d}, + {1, 
0x00d4e}, {0, 0x00dca}, {1, 0x00dcb}, {0, 0x00dd2}, {1, 0x00dd5}, + {0, 0x00dd6}, {1, 0x00dd7}, {0, 0x00e31}, {1, 0x00e32}, {0, 0x00e34}, + {1, 0x00e3b}, {0, 0x00e47}, {1, 0x00e4f}, {0, 0x00eb1}, {1, 0x00eb2}, + {0, 0x00eb4}, {1, 0x00eba}, {0, 0x00ebb}, {1, 0x00ebd}, {0, 0x00ec8}, + {1, 0x00ece}, {0, 0x00f18}, {1, 0x00f1a}, {0, 0x00f35}, {1, 0x00f36}, + {0, 0x00f37}, {1, 0x00f38}, {0, 0x00f39}, {1, 0x00f3a}, {0, 0x00f71}, + {1, 0x00f7f}, {0, 0x00f80}, {1, 0x00f85}, {0, 0x00f86}, {1, 0x00f88}, + {0, 0x00f90}, {1, 0x00f98}, {0, 0x00f99}, {1, 0x00fbd}, {0, 0x00fc6}, + {1, 0x00fc7}, {0, 0x0102d}, {1, 0x01031}, {0, 0x01032}, {1, 0x01033}, + {0, 0x01036}, {1, 0x0103b}, {0, 0x01058}, + {1, 0x0105a}, {2, 0x01100}, {0, 0x01160}, {1, 0x01200}, {0, 0x0135f}, + {1, 0x01360}, {0, 0x01712}, {1, 0x01715}, {0, 0x01732}, {1, 0x01735}, + {0, 0x01752}, {1, 0x01754}, {0, 0x01772}, {1, 0x01774}, {0, 0x017b4}, + {1, 0x017b6}, {0, 0x017b7}, {1, 0x017be}, {0, 0x017c6}, {1, 0x017c7}, + {0, 0x017c9}, {1, 0x017d4}, {0, 0x017dd}, {1, 0x017de}, {0, 0x0180b}, + {1, 0x0180e}, {0, 0x018a9}, {1, 0x018aa}, {0, 0x01920}, {1, 0x01923}, + {0, 0x01927}, {1, 0x01929}, {0, 0x01932}, {1, 0x01933}, {0, 0x01939}, + {1, 0x0193c}, {0, 0x01a17}, {1, 0x01a19}, {0, 0x01b00}, {1, 0x01b04}, + {0, 0x01b34}, {1, 0x01b35}, {0, 0x01b36}, {1, 0x01b3b}, {0, 0x01b3c}, + {1, 0x01b3d}, {0, 0x01b42}, {1, 0x01b43}, {0, 0x01b6b}, {1, 0x01b74}, + {0, 0x01dc0}, {1, 0x01dcb}, {0, 0x01dfe}, {1, 0x01e00}, {0, 0x0200b}, + {1, 0x02010}, {0, 0x0202a}, {1, 0x0202f}, {0, 0x02060}, {1, 0x02064}, + {0, 0x0206a}, {1, 0x02070}, {0, 0x020d0}, {1, 0x020f0}, {2, 0x02329}, + {1, 0x0232b}, {2, 0x02e80}, {0, 0x0302a}, {2, 0x03030}, {1, 0x0303f}, + {2, 0x03040}, {0, 0x03099}, {2, 0x0309b}, {1, 0x0a4d0}, {0, 0x0a806}, + {1, 0x0a807}, {0, 0x0a80b}, {1, 0x0a80c}, {0, 0x0a825}, {1, 0x0a827}, + {2, 0x0ac00}, {1, 0x0d7a4}, {2, 0x0f900}, {1, 0x0fb00}, {0, 0x0fb1e}, + {1, 0x0fb1f}, {0, 0x0fe00}, {2, 0x0fe10}, {1, 0x0fe1a}, {0, 0x0fe20}, + {1, 0x0fe24}, 
{2, 0x0fe30}, {1, 0x0fe70}, {0, 0x0feff}, {2, 0x0ff00}, + {1, 0x0ff61}, {2, 0x0ffe0}, {1, 0x0ffe7}, {0, 0x0fff9}, {1, 0x0fffc}, + {0, 0x10a01}, {1, 0x10a04}, {0, 0x10a05}, {1, 0x10a07}, {0, 0x10a0c}, + {1, 0x10a10}, {0, 0x10a38}, {1, 0x10a3b}, {0, 0x10a3f}, {1, 0x10a40}, + {0, 0x1d167}, {1, 0x1d16a}, {0, 0x1d173}, {1, 0x1d183}, {0, 0x1d185}, + {1, 0x1d18c}, {0, 0x1d1aa}, {1, 0x1d1ae}, {0, 0x1d242}, {1, 0x1d245}, + {2, 0x20000}, {1, 0x2fffe}, {2, 0x30000}, {1, 0x3fffe}, {0, 0xe0001}, + {1, 0xe0002}, {0, 0xe0020}, {1, 0xe0080}, {0, 0xe0100}, {1, 0xe01f0} +}; + +/* +** Return an estimate of the width, in columns, for the single Unicode +** character c. For normal characters, the answer is always 1. But the +** estimate might be 0 or 2 for zero-width and double-width characters. +** +** Different display devices display unicode using different widths. So +** it is impossible to know the true display width with 100% accuracy. +** Inaccuracies in the width estimates might cause columns to be misaligned. +** Unfortunately, there is nothing we can do about that. +*/ +int sqlite3_qrf_wcwidth(int c){ + int iFirst, iLast; + + /* Fast path for common characters */ + if( c<0x300 ) return 1; + + /* The general case */ + iFirst = 0; + iLast = sizeof(aQrfUWidth)/sizeof(aQrfUWidth[0]) - 1; + while( iFirst<iLast-1 ){ + int iMid = (iFirst+iLast)/2; + int cMid = aQrfUWidth[iMid].iFirst; + if( cMid < c ){ + iFirst = iMid; + }else if( cMid > c ){ + iLast = iMid - 1; + }else{ + return aQrfUWidth[iMid].w; + } + } + if( aQrfUWidth[iLast].iFirst > c ) return aQrfUWidth[iFirst].w; + return aQrfUWidth[iLast].w; +} + +/* +** Compute the value and length of a multi-byte UTF-8 character that +** begins at z[0]. Return the length. Write the Unicode value into *pU. +** +** This routine only works for *multi-byte* UTF-8 characters. It does +** not attempt to detect illegal characters. 
+*/ +int sqlite3_qrf_decode_utf8(const unsigned char *z, int *pU){ + if( (z[0] & 0xe0)==0xc0 && (z[1] & 0xc0)==0x80 ){ + *pU = ((z[0] & 0x1f)<<6) | (z[1] & 0x3f); + return 2; + } + if( (z[0] & 0xf0)==0xe0 && (z[1] & 0xc0)==0x80 && (z[2] & 0xc0)==0x80 ){ + *pU = ((z[0] & 0x0f)<<12) | ((z[1] & 0x3f)<<6) | (z[2] & 0x3f); + return 3; + } + if( (z[0] & 0xf8)==0xf0 && (z[1] & 0xc0)==0x80 && (z[2] & 0xc0)==0x80 + && (z[3] & 0xc0)==0x80 + ){ + *pU = ((z[0] & 0x0f)<<18) | ((z[1] & 0x3f)<<12) | ((z[2] & 0x3f))<<6 + | (z[3] & 0x3f); + return 4; + } + *pU = 0; + return 1; +} + +/* +** Check to see if z[] is a valid VT100 escape. If it is, then +** return the number of bytes in the escape sequence. Return 0 if +** z[] is not a VT100 escape. +** +** This routine assumes that z[0] is \033 (ESC). +*/ +static int qrfIsVt100(const unsigned char *z){ + int i; + if( z[1]!='[' ) return 0; + i = 2; + while( z[i]>=0x30 && z[i]<=0x3f ){ i++; } + while( z[i]>=0x20 && z[i]<=0x2f ){ i++; } + if( z[i]<0x40 || z[i]>0x7e ) return 0; + return i+1; +} + +/* +** Return the length of a string in display characters. +** +** Most characters of the input string count as 1, including +** multi-byte UTF8 characters. However, zero-width unicode +** characters and VT100 escape sequences count as zero, and +** double-width characters count as two. +** +** The definition of "zero-width" and "double-width" characters +** is not precise. It depends on the output device, to some extent, +** and it varies according to the Unicode version. This routine +** makes the best guess that it can. 
+*/ +size_t sqlite3_qrf_wcswidth(const char *zIn){ + const unsigned char *z = (const unsigned char*)zIn; + size_t n = 0; + while( *z ){ + if( z[0]<' ' ){ + int k; + if( z[0]=='\033' && (k = qrfIsVt100(z))>0 ){ + z += k; + }else{ + z++; + } + }else if( (0x80&z[0])==0 ){ + n++; + z++; + }else{ + int u = 0; + int len = sqlite3_qrf_decode_utf8(z, &u); + z += len; + n += sqlite3_qrf_wcwidth(u); + } + } + return n; +} + +/* +** Return the display width of the longest line of text +** in the (possibly) multi-line input string zIn[0..nByte]. +** zIn[] is not necessarily zero-terminated. Take +** into account tab characters, zero- and double-width +** characters, CR and NL, and VT100 escape codes. +** +** Write the number of newlines into *pnNL. So, *pnNL will +** return 0 if everything fits on one line, or positive if +** it will need to be split. +*/ +static int qrfDisplayWidth(const char *zIn, sqlite3_int64 nByte, int *pnNL){ + const unsigned char *z; + const unsigned char *zEnd; + int mx = 0; + int n = 0; + int nNL = 0; + if( zIn==0 ) zIn = ""; + z = (const unsigned char*)zIn; + zEnd = &z[nByte]; + while( z<zEnd ){ + if( z[0]<' ' ){ + int k; + if( z[0]=='\033' && (k = qrfIsVt100(z))>0 ){ + z += k; + }else{ + if( z[0]=='\t' ){ + n = (n+8)&~7; + }else if( z[0]=='\n' || z[0]=='\r' ){ + nNL++; + if( n>mx ) mx = n; + n = 0; + } + z++; + } + }else if( (0x80&z[0])==0 ){ + n++; + z++; + }else{ + int u = 0; + int len = sqlite3_qrf_decode_utf8(z, &u); + z += len; + n += sqlite3_qrf_wcwidth(u); + } + } + if( mx>n ) n = mx; + if( pnNL ) *pnNL = nNL; + return n; +} + +/* +** Escape the input string if it is needed and in accordance with +** eEsc, which is either QRF_ESC_Ascii or QRF_ESC_Symbol. +** +** Escaping is needed if the string contains any control characters +** other than \t, \n, and \r\n +** +** If no escaping is needed (the common case) then set *ppOut to NULL +** and return 0. If escaping is needed, write the escaped string into +** memory obtained from sqlite3_malloc64() and make *ppOut point to that +** memory and return 0. 
If an error occurs, return non-zero. +** +** The caller is responsible for freeing *ppFree if it is non-NULL in order +** to reclaim memory. +*/ +static void qrfEscape( + int eEsc, /* QRF_ESC_Ascii or QRF_ESC_Symbol */ + sqlite3_str *pStr, /* String to be escaped */ + int iStart /* Begin escaping on this byte of pStr */ +){ + sqlite3_int64 i, j; /* Loop counters */ + sqlite3_int64 sz; /* Size of the string prior to escaping */ + sqlite3_int64 nCtrl = 0;/* Number of control characters to escape */ + unsigned char *zIn; /* Text to be escaped */ + unsigned char c; /* A single character of the text */ + unsigned char *zOut; /* Where to write the results */ + + /* Find the text to be escaped */ + zIn = (unsigned char*)sqlite3_str_value(pStr); + if( zIn==0 ) return; + zIn += iStart; + + /* Count the control characters */ + for(i=0; (c = zIn[i])!=0; i++){ + if( c<=0x1f + && c!='\t' + && c!='\n' + && (c!='\r' || zIn[i+1]!='\n') + ){ + nCtrl++; + } + } + if( nCtrl==0 ) return; /* Early out if no control characters */ + + /* Make space to hold the escapes. Copy the original text to the end + ** of the available space. */ + sz = sqlite3_str_length(pStr) - iStart; + if( eEsc==QRF_ESC_Symbol ) nCtrl *= 2; + sqlite3_str_appendchar(pStr, nCtrl, ' '); + zOut = (unsigned char*)sqlite3_str_value(pStr); + if( zOut==0 ) return; + zOut += iStart; + zIn = zOut + nCtrl; + memmove(zIn,zOut,sz); + + /* Convert the control characters */ + for(i=j=0; (c = zIn[i])!=0; i++){ + if( c>0x1f + || c=='\t' + || c=='\n' + || (c=='\r' && zIn[i+1]=='\n') + ){ + continue; + } + if( i>0 ){ + memmove(&zOut[j], zIn, i); + j += i; + } + zIn += i+1; + i = -1; + if( eEsc==QRF_ESC_Symbol ){ + zOut[j++] = 0xe2; + zOut[j++] = 0x90; + zOut[j++] = 0x80+c; + }else{ + zOut[j++] = '^'; + zOut[j++] = 0x40+c; + } + } +} + +/* +** Determine if the string z[] can be shown as plain text. Return true +** if z[] is unambiguously text. Return false if z[] needs to be +** quoted. 
+** +** All of the following must be true in order for z[] to be relaxable: +** +** (1) z[] does not begin or end with ' or whitespace +** (2) z[] is not the same as the NULL rendering +** (3) z[] does not look like a numeric literal +*/ +static int qrfRelaxable(Qrf *p, const char *z){ + size_t i, n; + if( z[0]=='\'' || qrfSpace(z[0]) ) return 0; + if( z[0]==0 ){ + return (p->spec.zNull!=0 && p->spec.zNull[0]!=0); + } + n = strlen(z); + if( n==0 || z[n-1]=='\'' || qrfSpace(z[n-1]) ) return 0; + if( p->spec.zNull && strcmp(p->spec.zNull,z)==0 ) return 0; + i = (z[0]=='-' || z[0]=='+'); + if( strcmp(z+i,"Inf")==0 ) return 0; + if( !qrfDigit(z[i]) ) return 1; + i++; + while( qrfDigit(z[i]) ){ i++; } + if( z[i]==0 ) return 0; + if( z[i]=='.' ){ + i++; + while( qrfDigit(z[i]) ){ i++; } + if( z[i]==0 ) return 0; + } + if( z[i]=='e' || z[i]=='E' ){ + i++; + if( z[i]=='+' || z[i]=='-' ){ i++; } + if( !qrfDigit(z[i]) ) return 1; + i++; + while( qrfDigit(z[i]) ){ i++; } + } + return z[i]!=0; +} + +/* +** If a field contains any character identified by a 1 in the following +** array, then the string must be quoted for CSV. 
+*/ +static const char qrfCsvQuote[] = { + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, + 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, +}; + +/* +** Encode text appropriately and append it to pOut. +*/ +static void qrfEncodeText(Qrf *p, sqlite3_str *pOut, const char *zTxt){ + int iStart = sqlite3_str_length(pOut); + switch( p->spec.eText ){ + case QRF_TEXT_Relaxed: + if( qrfRelaxable(p, zTxt) ){ + sqlite3_str_appendall(pOut, zTxt); + break; + } + deliberate_fall_through; /* FALLTHRU */ + case QRF_TEXT_Sql: { + if( p->spec.eEsc==QRF_ESC_Off ){ + sqlite3_str_appendf(pOut, "%Q", zTxt); + }else{ + sqlite3_str_appendf(pOut, "%#Q", zTxt); + } + break; + } + case QRF_TEXT_Csv: { + unsigned int i; + for(i=0; zTxt[i]; i++){ + if( qrfCsvQuote[((const unsigned char*)zTxt)[i]] ){ + i = 0; + break; + } + } + if( i==0 || strstr(zTxt, p->spec.zColumnSep)!=0 ){ + sqlite3_str_appendf(pOut, "\"%w\"", zTxt); + }else{ + sqlite3_str_appendall(pOut, zTxt); + } + break; + } + case QRF_TEXT_Html: { + const unsigned char *z = (const unsigned char*)zTxt; + while( *z ){ + unsigned int i = 0; + unsigned char c; + while( (c=z[i])>'>' + || (c && c!='<' && c!='>' && c!='&' && c!='\"' && c!='\'') + ){ + i++; + } + if( i>0 ){ + sqlite3_str_append(pOut, (const char*)z, i); + } + switch( z[i] ){ + case '>': 
sqlite3_str_append(pOut, "&gt;", 4); break; + case '&': sqlite3_str_append(pOut, "&amp;", 5); break; + case '<': sqlite3_str_append(pOut, "&lt;", 4); break; + case '"': sqlite3_str_append(pOut, "&quot;", 6); break; + case '\'': sqlite3_str_append(pOut, "&#39;", 5); break; + default: i--; + } + z += i + 1; + } + break; + } + case QRF_TEXT_Tcl: + case QRF_TEXT_Json: { + const unsigned char *z = (const unsigned char*)zTxt; + sqlite3_str_append(pOut, "\"", 1); + while( *z ){ + unsigned int i; + for(i=0; z[i]>=0x20 && z[i]!='\\' && z[i]!='"'; i++){} + if( i>0 ){ + sqlite3_str_append(pOut, (const char*)z, i); + } + if( z[i]==0 ) break; + switch( z[i] ){ + case '"': sqlite3_str_append(pOut, "\\\"", 2); break; + case '\\': sqlite3_str_append(pOut, "\\\\", 2); break; + case '\b': sqlite3_str_append(pOut, "\\b", 2); break; + case '\f': sqlite3_str_append(pOut, "\\f", 2); break; + case '\n': sqlite3_str_append(pOut, "\\n", 2); break; + case '\r': sqlite3_str_append(pOut, "\\r", 2); break; + case '\t': sqlite3_str_append(pOut, "\\t", 2); break; + default: { + if( p->spec.eText==QRF_TEXT_Json ){ + sqlite3_str_appendf(pOut, "\\u%04x", z[i]); + }else{ + sqlite3_str_appendf(pOut, "\\%03o", z[i]); + } + break; + } + } + z += i + 1; + } + sqlite3_str_append(pOut, "\"", 1); + break; + } + default: { + sqlite3_str_appendall(pOut, zTxt); + break; + } + } + if( p->spec.eEsc!=QRF_ESC_Off ){ + qrfEscape(p->spec.eEsc, pOut, iStart); + } +} + +/* +** Do a quick sanity check to see if aBlob[0..nBlob-1] is valid JSONB. +** Return true if it is and false if it is not. +** +** False positives are possible, but not false negatives. +*/ +static int qrfJsonbQuickCheck(unsigned char *aBlob, int nBlob){ + unsigned char x; /* Payload size half-byte */ + int i; /* Loop counter */ + int n; /* Bytes in the payload size integer */ + sqlite3_uint64 sz; /* value of the payload size integer */ + + if( nBlob==0 ) return 0; + x = aBlob[0]>>4; + if( x<=11 ) return nBlob==(1+x); + n = x<14 ? 
x-11 : 4*(x-13); + if( nBlob<1+n ) return 0; + sz = aBlob[1]; + for(i=1; i<n; i++) sz = (sz<<8) + aBlob[i+1]; + return sz+n+1==(sqlite3_uint64)nBlob; +} + +/* +** The current iCol-th column of p->pStmt is known to be a BLOB. Check +** to see if that BLOB is really a JSONB blob. If it is, then translate +** it into a text JSON representation and return a pointer to that text JSON. +** If the BLOB is not JSONB, then return a NULL pointer. +** +** The memory used to hold the JSON text is managed internally by the +** "p" object and is overwritten and/or deallocated upon the next call +** to this routine (with the same p argument) or when the p object is +** finalized. +*/ +static const char *qrfJsonbToJson(Qrf *p, int iCol){ + int nByte; + const void *pBlob; + int rc; + nByte = sqlite3_column_bytes(p->pStmt, iCol); + pBlob = sqlite3_column_blob(p->pStmt, iCol); + if( qrfJsonbQuickCheck((unsigned char*)pBlob, nByte)==0 ){ + return 0; + } + if( p->pJTrans==0 ){ + sqlite3 *db; + rc = sqlite3_open(":memory:",&db); + if( rc ){ + sqlite3_close(db); + return 0; + } + rc = sqlite3_prepare_v2(db, "SELECT json(?1)", -1, &p->pJTrans, 0); + if( rc ){ + sqlite3_finalize(p->pJTrans); + p->pJTrans = 0; + sqlite3_close(db); + return 0; + } + }else{ + sqlite3_reset(p->pJTrans); + } + sqlite3_bind_blob(p->pJTrans, 1, (void*)pBlob, nByte, SQLITE_STATIC); + rc = sqlite3_step(p->pJTrans); + if( rc==SQLITE_ROW ){ + return (const char*)sqlite3_column_text(p->pJTrans, 0); + }else{ + return 0; + } +} + +/* +** Adjust the input string zIn[] such that it is no more than N display +** characters wide. If it is wider than that, then truncate and add +** ellipsis. Or if zIn[] contains a \r or \n, truncate at that point, +** adding ellipsis. Embedded tabs in zIn[] are converted into ordinary +** spaces. +** +** Return the display width of the modified title string. 
+*/ +static int qrfTitleLimit(char *zIn, int N){ + unsigned char *z = (unsigned char*)zIn; + int n = 0; + unsigned char *zEllipsis = 0; + while( 1 /*exit-by-break*/ ){ + if( z[0]<' ' ){ + int k; + if( z[0]==0 ){ + zEllipsis = 0; + break; + }else if( z[0]=='\033' && (k = qrfIsVt100(z))>0 ){ + z += k; + }else if( z[0]=='\t' ){ + z[0] = ' '; + }else if( z[0]=='\n' || z[0]=='\r' ){ + z[0] = ' '; + }else{ + z++; + } + }else if( (0x80&z[0])==0 ){ + if( n>=(N-3) && zEllipsis==0 ) zEllipsis = z; + if( n==N ){ z[0] = 0; break; } + n++; + z++; + }else{ + int u = 0; + int len = sqlite3_qrf_decode_utf8(z, &u); + if( n+len>(N-3) && zEllipsis==0 ) zEllipsis = z; + if( n+len>N ){ z[0] = 0; break; } + z += len; + n += sqlite3_qrf_wcwidth(u); + } + } + if( zEllipsis && N>=3 ) memcpy(zEllipsis,"...",4); + return n; +} + + +/* +** Render value pVal into pOut +*/ +static void qrfRenderValue(Qrf *p, sqlite3_str *pOut, int iCol){ +#if SQLITE_VERSION_NUMBER>=3052000 + int iStartLen = sqlite3_str_length(pOut); +#endif + if( p->spec.xRender ){ + sqlite3_value *pVal; + char *z; + pVal = sqlite3_value_dup(sqlite3_column_value(p->pStmt,iCol)); + z = p->spec.xRender(p->spec.pRenderArg, pVal); + sqlite3_value_free(pVal); + if( z ){ + sqlite3_str_appendall(pOut, z); + sqlite3_free(z); + return; + } + } + switch( sqlite3_column_type(p->pStmt,iCol) ){ + case SQLITE_INTEGER: { + sqlite3_str_appendf(pOut, "%lld", sqlite3_column_int64(p->pStmt,iCol)); + break; + } + case SQLITE_FLOAT: { + const char *zTxt = (const char*)sqlite3_column_text(p->pStmt,iCol); + sqlite3_str_appendall(pOut, zTxt); + break; + } + case SQLITE_BLOB: { + if( p->spec.bTextJsonb==QRF_Yes ){ + const char *zJson = qrfJsonbToJson(p, iCol); + if( zJson ){ + if( p->spec.eText==QRF_TEXT_Sql ){ + sqlite3_str_append(pOut,"jsonb(",6); + qrfEncodeText(p, pOut, zJson); + sqlite3_str_append(pOut,")",1); + }else{ + qrfEncodeText(p, pOut, zJson); + } + break; + } + } + switch( p->spec.eBlob ){ + case QRF_BLOB_Hex: + case QRF_BLOB_Sql: { + int 
iStart; + int nBlob = sqlite3_column_bytes(p->pStmt,iCol); + int i, j; + char *zVal; + const unsigned char *a = sqlite3_column_blob(p->pStmt,iCol); + if( p->spec.eBlob==QRF_BLOB_Sql ){ + sqlite3_str_append(pOut, "x'", 2); + } + iStart = sqlite3_str_length(pOut); + sqlite3_str_appendchar(pOut, nBlob, ' '); + sqlite3_str_appendchar(pOut, nBlob, ' '); + if( p->spec.eBlob==QRF_BLOB_Sql ){ + sqlite3_str_appendchar(pOut, 1, '\''); + } + if( sqlite3_str_errcode(pOut) ) return; + zVal = sqlite3_str_value(pOut); + for(i=0, j=iStart; i<nBlob; i++, j+=2){ + unsigned char c = a[i]; + zVal[j] = "0123456789abcdef"[(c>>4)&0xf]; + zVal[j+1] = "0123456789abcdef"[(c)&0xf]; + } + break; + } + case QRF_BLOB_Tcl: + case QRF_BLOB_Json: { + int iStart; + int nBlob = sqlite3_column_bytes(p->pStmt,iCol); + int i, j; + char *zVal; + const unsigned char *a = sqlite3_column_blob(p->pStmt,iCol); + int szC = p->spec.eBlob==QRF_BLOB_Json ? 6 : 4; + sqlite3_str_append(pOut, "\"", 1); + iStart = sqlite3_str_length(pOut); + for(i=szC; i>0; i--){ + sqlite3_str_appendchar(pOut, nBlob, ' '); + } + sqlite3_str_appendchar(pOut, 1, '"'); + if( sqlite3_str_errcode(pOut) ) return; + zVal = sqlite3_str_value(pOut); + for(i=0, j=iStart; i<nBlob; i++, j+=szC){ + unsigned char c = a[i]; + zVal[j] = '\\'; + if( szC==4 ){ + zVal[j+1] = '0' + ((c>>6)&3); + zVal[j+2] = '0' + ((c>>3)&7); + zVal[j+3] = '0' + (c&7); + }else{ + zVal[j+1] = 'u'; + zVal[j+2] = '0'; + zVal[j+3] = '0'; + zVal[j+4] = "0123456789abcdef"[(c>>4)&0xf]; + zVal[j+5] = "0123456789abcdef"[(c)&0xf]; + } + } + break; + } + case QRF_BLOB_Size: { + int nBlob = sqlite3_column_bytes(p->pStmt,iCol); + sqlite3_str_appendf(pOut, "(%d-byte blob)", nBlob); + break; + } + default: { + const char *zTxt = (const char*)sqlite3_column_text(p->pStmt,iCol); + qrfEncodeText(p, pOut, zTxt); + } + } + break; + } + case SQLITE_NULL: { + sqlite3_str_appendall(pOut, p->spec.zNull); + break; + } + case SQLITE_TEXT: { + const char *zTxt = (const char*)sqlite3_column_text(p->pStmt,iCol); + qrfEncodeText(p, pOut, zTxt); + break; + } + } +#if SQLITE_VERSION_NUMBER>=3052000 + if( p->spec.nCharLimit>0 + && (sqlite3_str_length(pOut) - iStartLen) > 
p->spec.nCharLimit + ){ + const unsigned char *z; + int ii = 0, w = 0, limit = p->spec.nCharLimit; + z = (const unsigned char*)sqlite3_str_value(pOut) + iStartLen; + if( limit<4 ) limit = 4; + while( 1 ){ + if( z[ii]<' ' ){ + int k; + if( z[ii]=='\033' && (k = qrfIsVt100(z+ii))>0 ){ + ii += k; + }else if( z[ii]==0 ){ + break; + }else{ + ii++; + } + }else if( (0x80&z[ii])==0 ){ + w++; + if( w>limit ) break; + ii++; + }else{ + int u = 0; + int len = sqlite3_qrf_decode_utf8(&z[ii], &u); + w += sqlite3_qrf_wcwidth(u); + if( w>limit ) break; + ii += len; + } + } + if( w>limit ){ + sqlite3_str_truncate(pOut, iStartLen+ii); + sqlite3_str_append(pOut, "...", 3); + } + } +#endif +} + +/* Trim spaces of the end if pOut +*/ +static void qrfRTrim(sqlite3_str *pOut){ +#if SQLITE_VERSION_NUMBER>=3052000 + int nByte = sqlite3_str_length(pOut); + const char *zOut = sqlite3_str_value(pOut); + while( nByte>0 && zOut[nByte-1]==' ' ){ nByte--; } + sqlite3_str_truncate(pOut, nByte); +#endif +} + +/* +** Store string zUtf to pOut as w characters. If w is negative, +** then right-justify the text. W is the width in display characters, not +** in bytes. Double-width unicode characters count as two characters. +** VT100 escape sequences count as zero. And so forth. +*/ +static void qrfWidthPrint(Qrf *p, sqlite3_str *pOut, int w, const char *zUtf){ + const unsigned char *a = (const unsigned char*)zUtf; + static const int mxW = 10000000; + unsigned char c; + int i = 0; + int n = 0; + int k; + int aw; + (void)p; + if( w<-mxW ){ + w = -mxW; + }else if( w>mxW ){ + w= mxW; + } + aw = w<0 ? 
-w : w; + if( a==0 ) a = (const unsigned char*)""; + while( (c = a[i])!=0 ){ + if( (c&0xc0)==0xc0 ){ + int u; + int len = sqlite3_qrf_decode_utf8(a+i, &u); + int x = sqlite3_qrf_wcwidth(u); + if( x+n>aw ){ + break; + } + i += len; + n += x; + }else if( c==0x1b && (k = qrfIsVt100(&a[i]))>0 ){ + i += k; + }else if( n>=aw ){ + break; + }else{ + n++; + i++; + } + } + if( n>=aw ){ + sqlite3_str_append(pOut, zUtf, i); + }else if( w<0 ){ + if( aw>n ) sqlite3_str_appendchar(pOut, aw-n, ' '); + sqlite3_str_append(pOut, zUtf, i); + }else{ + sqlite3_str_append(pOut, zUtf, i); + if( aw>n ) sqlite3_str_appendchar(pOut, aw-n, ' '); + } +} + +/* +** zIn[] is a line of text that is to be displayed in the box or table or +** similar tabular formats. It might contain newlines or might be too wide +** to fit in the column, so it might need to be split into multiple lines. +** +** This routine determines: +** +** * How many bytes of z[] should be shown on the current line. +** * How many character positions those bytes will cover. +** * The byte offset to the start of the next line. 
+*/ +static void qrfWrapLine( + const char *zIn, /* Input text to be displayed */ + int w, /* Column width in characters (not bytes) */ + int bWrap, /* True if we should do word-wrapping */ + int *pnThis, /* OUT: How many bytes of z[] for the current line */ + int *pnWide, /* OUT: How wide is the text of this line */ + int *piNext /* OUT: Offset into z[] to start of the next line */ +){ + int i; /* Input bytes consumed */ + int k; /* Bytes in a VT100 code */ + int n; /* Output column number */ + const unsigned char *z = (const unsigned char*)zIn; + unsigned char c = 0; + + if( z[0]==0 ){ + *pnThis = 0; + *pnWide = 0; + *piNext = 0; + return; + } + n = 0; + for(i=0; n<=w; i++){ + c = z[i]; + if( c>=0xc0 ){ + int u; + int len = sqlite3_qrf_decode_utf8(&z[i], &u); + int wcw = sqlite3_qrf_wcwidth(u); + if( wcw+n>w ) break; + i += len-1; + n += wcw; + continue; + } + if( c>=' ' ){ + if( n==w ) break; + n++; + continue; + } + if( c==0 || c=='\n' ) break; + if( c=='\r' && z[i+1]=='\n' ){ c = z[++i]; break; } + if( c=='\t' ){ + int wcw = 8 - (n&7); + if( n+wcw>w ) break; + n += wcw; + continue; + } + if( c==0x1b && (k = qrfIsVt100(&z[i]))>0 ){ + i += k-1; + }else if( n==w ){ + break; + }else{ + n++; + } + } + if( c==0 ){ + *pnThis = i; + *pnWide = n; + *piNext = i; + return; + } + if( c=='\n' ){ + *pnThis = i; + *pnWide = n; + *piNext = i+1; + return; + } + + /* If we get this far, that means the current line will end at some + ** point that is neither a "\n" or a 0x00. 
Figure out where that + ** split should occur + */ + if( bWrap && z[i]!=0 && !qrfSpace(z[i]) && qrfAlnum(c)==qrfAlnum(z[i]) ){ + /* Perhaps try to back up to a better place to break the line */ + for(k=i-1; k>=i/2; k--){ + if( qrfSpace(z[k]) ) break; + } + if( k<i/2 ){ + for(k=i; k>=i/2; k--){ + if( qrfAlnum(z[k-1])!=qrfAlnum(z[k]) && (z[k]&0xc0)!=0x80 ) break; + } + } + if( k>=i/2 ){ + i = k; + n = qrfDisplayWidth((const char*)z, k, 0); + } + } + *pnThis = i; + *pnWide = n; + while( zIn[i]==' ' || zIn[i]=='\t' || zIn[i]=='\r' ){ i++; } + *piNext = i; +} + +/* +** Append nVal bytes of text from zVal onto the end of pOut. +** Convert tab characters in zVal to the appropriate number of +** spaces. +*/ +static void qrfAppendWithTabs( + sqlite3_str *pOut, /* Append text here */ + const char *zVal, /* Text to append */ + int nVal /* Use only the first nVal bytes of zVal[] */ +){ + int i = 0; + unsigned int col = 0; + unsigned char *z = (unsigned char *)zVal; + while( i<nVal ){ + unsigned char c = z[i]; + if( c<' ' ){ + int k; + sqlite3_str_append(pOut, (const char*)z, i); + nVal -= i; + z += i; + i = 0; + if( c=='\033' && (k = qrfIsVt100(z))>0 ){ + sqlite3_str_append(pOut, (const char*)z, k); + z += k; + nVal -= k; + }else if( c=='\t' ){ + k = 8 - (col&7); + sqlite3_str_appendchar(pOut, k, ' '); + col += k; + z++; + nVal--; + }else if( c=='\r' && nVal==1 ){ + z++; + nVal--; + }else{ + char zCtrlPik[4]; + col++; + zCtrlPik[0] = 0xe2; + zCtrlPik[1] = 0x90; + zCtrlPik[2] = 0x80+c; + sqlite3_str_append(pOut, zCtrlPik, 3); + z++; + nVal--; + } + }else if( (0x80&c)==0 ){ + i++; + col++; + }else{ + int u = 0; + int len = sqlite3_qrf_decode_utf8(&z[i], &u); + i += len; + col += sqlite3_qrf_wcwidth(u); + } + } + sqlite3_str_append(pOut, (const char*)z, i); +} + +/* +** GCC does not define the offsetof() macro so we'll have to do it +** ourselves. +*/ +#ifndef offsetof +# define offsetof(ST,M) ((size_t)((char*)&((ST*)0)->M - (char*)0)) +#endif + +/* +** Data for columnar layout, collected into a single object so +** that it can be more easily passed into subroutines. 
+*/ +typedef struct qrfColData qrfColData; +struct qrfColData { + Qrf *p; /* The QRF instance */ + int nCol; /* Number of columns in the table */ + unsigned char bMultiRow; /* One or more cells will span multiple lines */ + unsigned char nMargin; /* Width of column margins */ + sqlite3_int64 nRow; /* Number of rows */ + sqlite3_int64 nAlloc; /* Number of cells allocated */ + sqlite3_int64 n; /* Number of cells. nCol*nRow */ + char **az; /* Content of all cells */ + int *aiWth; /* Width of each cell */ + unsigned char *abNum; /* True for each numeric cell */ + struct qrfPerCol { /* Per-column data */ + char *z; /* Cache of text for current row */ + int w; /* Computed width of this column */ + int mxW; /* Maximum natural (unwrapped) width */ + unsigned char e; /* Alignment */ + unsigned char fx; /* Width is fixed */ + unsigned char bNum; /* True if is numeric */ + } *a; /* One per column */ +}; + +/* +** Output horizontally justified text into pOut. The text is the +** first nVal bytes of zVal. Include nWS bytes of whitespace, either +** split between both sides, or on the left, or on the right, depending +** on eAlign. 
+*/ +static void qrfPrintAligned( + sqlite3_str *pOut, /* Append text here */ + struct qrfPerCol *pCol, /* Information about the text to print */ + int nVal, /* Use only the first nVal bytes of zVal[] */ + int nWS /* Whitespace for horizontal alignment */ +){ + unsigned char eAlign = pCol->e & QRF_ALIGN_HMASK; + if( eAlign==QRF_Auto && pCol->bNum ) eAlign = QRF_ALIGN_Right; + if( eAlign==QRF_ALIGN_Center ){ + /* Center the text */ + sqlite3_str_appendchar(pOut, nWS/2, ' '); + qrfAppendWithTabs(pOut, pCol->z, nVal); + sqlite3_str_appendchar(pOut, nWS - nWS/2, ' '); + }else if( eAlign==QRF_ALIGN_Right ){ + /* Right justify the text */ + sqlite3_str_appendchar(pOut, nWS, ' '); + qrfAppendWithTabs(pOut, pCol->z, nVal); + }else{ + /* Left justify the text */ + qrfAppendWithTabs(pOut, pCol->z, nVal); + sqlite3_str_appendchar(pOut, nWS, ' '); + } +} + +/* +** Free all the memory allocated in the qrfColData object +*/ +static void qrfColDataFree(qrfColData *p){ + sqlite3_int64 i; + for(i=0; i<p->n; i++) sqlite3_free(p->az[i]); + sqlite3_free(p->az); + sqlite3_free(p->aiWth); + sqlite3_free(p->abNum); + sqlite3_free(p->a); + memset(p, 0, sizeof(*p)); +} + +/* +** Allocate space for more cells in the qrfColData object. +** Return non-zero if a memory allocation fails. 
+*/ +static int qrfColDataEnlarge(qrfColData *p){ + char **azData; + int *aiWth; + unsigned char *abNum; + p->nAlloc = 2*p->nAlloc + 10*p->nCol; + azData = sqlite3_realloc64(p->az, p->nAlloc*sizeof(char*)); + if( azData==0 ){ + qrfOom(p->p); + qrfColDataFree(p); + return 1; + } + p->az = azData; + aiWth = sqlite3_realloc64(p->aiWth, p->nAlloc*sizeof(int)); + if( aiWth==0 ){ + qrfOom(p->p); + qrfColDataFree(p); + return 1; + } + p->aiWth = aiWth; + abNum = sqlite3_realloc64(p->abNum, p->nAlloc); + if( abNum==0 ){ + qrfOom(p->p); + qrfColDataFree(p); + return 1; + } + p->abNum = abNum; + return 0; +} + +/* +** Print a markdown or table-style row separator using ascii-art +*/ +static void qrfRowSeparator(sqlite3_str *pOut, qrfColData *p, char cSep){ + int i; + if( p->nCol>0 ){ + int useBorder = p->p->spec.bBorder!=QRF_No; + if( useBorder ){ + sqlite3_str_append(pOut, &cSep, 1); + } + sqlite3_str_appendchar(pOut, p->a[0].w+p->nMargin, '-'); + for(i=1; i<p->nCol; i++){ + sqlite3_str_append(pOut, &cSep, 1); + sqlite3_str_appendchar(pOut, p->a[i].w+p->nMargin, '-'); + } + if( useBorder ){ + sqlite3_str_append(pOut, &cSep, 1); + } + } + sqlite3_str_append(pOut, "\n", 1); +} + +/* +** UTF8 box-drawing characters. Imagine box lines like this: +** +** 1 +** | +** 4 --+-- 2 +** | +** 3 +** +** Each box character has between 2 and 4 of the lines leading from +** the center. The characters are here identified by the numbers of +** their corresponding lines. 
+*/ +#define BOX_24 "\342\224\200" /* U+2500 --- */ +#define BOX_13 "\342\224\202" /* U+2502 | */ +#define BOX_23 "\342\224\214" /* U+250c ,- */ +#define BOX_34 "\342\224\220" /* U+2510 -, */ +#define BOX_12 "\342\224\224" /* U+2514 '- */ +#define BOX_14 "\342\224\230" /* U+2518 -' */ +#define BOX_123 "\342\224\234" /* U+251c |- */ +#define BOX_134 "\342\224\244" /* U+2524 -| */ +#define BOX_234 "\342\224\254" /* U+252c -,- */ +#define BOX_124 "\342\224\264" /* U+2534 -'- */ +#define BOX_1234 "\342\224\274" /* U+253c -|- */ + +/* Rounded corners: */ +#define BOX_R12 "\342\225\260" /* U+2570 '- */ +#define BOX_R23 "\342\225\255" /* U+256d ,- */ +#define BOX_R34 "\342\225\256" /* U+256e -, */ +#define BOX_R14 "\342\225\257" /* U+256f -' */ + +/* Doubled horizontal lines: */ +#define DBL_24 "\342\225\220" /* U+2550 === */ +#define DBL_123 "\342\225\236" /* U+255e |= */ +#define DBL_134 "\342\225\241" /* U+2561 =| */ +#define DBL_1234 "\342\225\252" /* U+256a =|= */ + +/* Draw horizontal line N characters long using unicode box +** characters +*/ +static void qrfBoxLine(sqlite3_str *pOut, int N, int bDbl){ + const char *azDash[2] = { + BOX_24 BOX_24 BOX_24 BOX_24 BOX_24 BOX_24 BOX_24 BOX_24 BOX_24 BOX_24, + DBL_24 DBL_24 DBL_24 DBL_24 DBL_24 DBL_24 DBL_24 DBL_24 DBL_24 DBL_24 + };/* 0 1 2 3 4 5 6 7 8 9 */ + const int nDash = 30; + N *= 3; + while( N>nDash ){ + sqlite3_str_append(pOut, azDash[bDbl], nDash); + N -= nDash; + } + sqlite3_str_append(pOut, azDash[bDbl], N); +} + +/* +** Draw a horizontal separator for a QRF_STYLE_Box table. 
+*/ +static void qrfBoxSeparator( + sqlite3_str *pOut, + qrfColData *p, + const char *zSep1, + const char *zSep2, + const char *zSep3, + int bDbl +){ + int i; + if( p->nCol>0 ){ + int useBorder = p->p->spec.bBorder!=QRF_No; + if( useBorder ){ + sqlite3_str_appendall(pOut, zSep1); + } + qrfBoxLine(pOut, p->a[0].w+p->nMargin, bDbl); + for(i=1; i<p->nCol; i++){ + sqlite3_str_appendall(pOut, zSep2); + qrfBoxLine(pOut, p->a[i].w+p->nMargin, bDbl); + } + if( useBorder ){ + sqlite3_str_appendall(pOut, zSep3); + } + } + sqlite3_str_append(pOut, "\n", 1); +} + +/* +** Load into pData the default alignment for the body of a table. +*/ +static void qrfLoadAlignment(qrfColData *pData, Qrf *p){ + sqlite3_int64 i; + for(i=0; i<pData->nCol; i++){ + pData->a[i].e = p->spec.eDfltAlign; + if( i<p->spec.nAlign ){ + unsigned char ax = p->spec.aAlign[i]; + if( (ax & QRF_ALIGN_HMASK)!=0 ){ + pData->a[i].e = (ax & QRF_ALIGN_HMASK) | + (pData->a[i].e & QRF_ALIGN_VMASK); + } + }else if( i<p->spec.nWidth ){ + if( p->spec.aWidth[i]<0 ){ + pData->a[i].e = QRF_ALIGN_Right | + (pData->a[i].e & QRF_ALIGN_VMASK); + } + } + } +} + +/* +** If the single column in pData->a[] with pData->n entries can be +** laid out as nCol columns with a 2-space gap between each such +** that all columns fit within nSW, then return a pointer to an array +** of integers which is the width of each column from left to right. +** +** If the layout is not possible, return a NULL pointer. +** +** Space to hold the returned array is from sqlite3_malloc64(). 
+*/ +static int *qrfValidLayout( + qrfColData *pData, /* Collected query results */ + Qrf *p, /* On which to report an OOM */ + int nCol, /* Attempt this many columns */ + int nSW /* Screen width */ +){ + int i; /* Loop counter */ + int nr; /* Number of rows */ + int w = 0; /* Width of the current column */ + int t; /* Total width of all columns */ + int *aw; /* Array of individual column widths */ + + aw = sqlite3_malloc64( sizeof(int)*nCol ); + if( aw==0 ){ + qrfOom(p); + return 0; + } + nr = (pData->n + nCol - 1)/nCol; + for(i=0; in; i++){ + if( (i%nr)==0 ){ + if( i>0 ) aw[i/nr-1] = w; + w = pData->aiWth[i]; + }else if( pData->aiWth[i]>w ){ + w = pData->aiWth[i]; + } + } + aw[nCol-1] = w; + for(t=i=0; inSW ){ + sqlite3_free(aw); + return 0; + } + return aw; +} + +/* +** The output is single-column and the bSplitColumn flag is set. +** Check to see if the single-column output can be split into multiple +** columns that appear side-by-side. Adjust pData appropriately. +*/ +static void qrfSplitColumn(qrfColData *pData, Qrf *p){ + int nCol = 1; + int *aw = 0; + char **az = 0; + int *aiWth = 0; + unsigned char *abNum = 0; + int nColNext = 2; + int w; + struct qrfPerCol *a = 0; + sqlite3_int64 nRow = 1; + sqlite3_int64 i; + while( 1/*exit-by-break*/ ){ + int *awNew = qrfValidLayout(pData, p, nColNext, p->spec.nScreenWidth); + if( awNew==0 ) break; + sqlite3_free(aw); + aw = awNew; + nCol = nColNext; + nRow = (pData->n + nCol - 1)/nCol; + if( nRow==1 ) break; + nColNext++; + while( (pData->n + nColNext - 1)/nColNext == nRow ) nColNext++; + } + if( nCol==1 ){ + sqlite3_free(aw); + return; /* Cannot do better than 1 column */ + } + az = sqlite3_malloc64( nRow*nCol*sizeof(char*) ); + if( az==0 ){ + qrfOom(p); + return; + } + aiWth = sqlite3_malloc64( nRow*nCol*sizeof(int) ); + if( aiWth==0 ){ + sqlite3_free(az); + qrfOom(p); + return; + } + a = sqlite3_malloc64( nCol*sizeof(struct qrfPerCol) ); + if( a==0 ){ + sqlite3_free(az); + sqlite3_free(aiWth); + qrfOom(p); + 
return; + } + abNum = sqlite3_malloc64( nRow*nCol ); + if( abNum==0 ){ + sqlite3_free(az); + sqlite3_free(aiWth); + sqlite3_free(a); + qrfOom(p); + return; + } + for(i=0; in; i++){ + sqlite3_int64 j = (i%nRow)*nCol + (i/nRow); + az[j] = pData->az[i]; + abNum[j]= pData->abNum[i]; + pData->az[i] = 0; + aiWth[j] = pData->aiWth[i]; + } + while( ia[0].e; + } + sqlite3_free(pData->az); + sqlite3_free(pData->aiWth); + sqlite3_free(pData->a); + sqlite3_free(pData->abNum); + sqlite3_free(aw); + pData->az = az; + pData->aiWth = aiWth; + pData->a = a; + pData->abNum = abNum; + pData->nCol = nCol; + pData->n = pData->nAlloc = nRow*nCol; + for(i=w=0; inMargin = (p->spec.nScreenWidth - w)/(nCol - 1); + if( pData->nMargin>5 ) pData->nMargin = 5; +} + +/* +** Adjust the layout for the screen width restriction +*/ +static void qrfRestrictScreenWidth(qrfColData *pData, Qrf *p){ + int sepW; /* Width of all box separators and margins */ + int sumW; /* Total width of data area over all columns */ + int targetW; /* Desired total data area */ + int i; /* Loop counters */ + int nCol; /* Number of columns */ + + pData->nMargin = 2; /* Default to normal margins */ + if( p->spec.nScreenWidth==0 ) return; + if( p->spec.eStyle==QRF_STYLE_Column ){ + sepW = pData->nCol*2 - 2; + }else{ + sepW = pData->nCol*3 + 1; + if( p->spec.bBorder==QRF_No ) sepW -= 2; + } + nCol = pData->nCol; + for(i=sumW=0; ia[i].w; + if( p->spec.nScreenWidth >= sumW+sepW ) return; + + /* First thing to do is reduce the separation between columns */ + pData->nMargin = 0; + if( p->spec.eStyle==QRF_STYLE_Column ){ + sepW = pData->nCol - 1; + }else{ + sepW = pData->nCol + 1; + if( p->spec.bBorder==QRF_No ) sepW -= 2; + } + targetW = p->spec.nScreenWidth - sepW; + +#define MIN_SQUOZE 8 +#define MIN_EX_SQUOZE 16 + /* Reduce the width of the widest eligible column. 
A column is + ** eligible for narrowing if: + ** + ** * It is not a fixed-width column (a[].fx is false) + ** * The current width is more than MIN_SQUOZE + ** * Either: + ** + The current width is more than MIN_EX_SQUOZE, or + ** + The current width is more than half the max width (a[].mxW) + ** + ** Keep making reductions until either no more reductions are + ** possible or until the size target is reached. + */ + while( sumW > targetW ){ + int gain, w; + int ix = -1; + int mx = 0; + for(i=0; i<nCol; i++){ + if( pData->a[i].fx==0 + && (w = pData->a[i].w)>mx + && w>MIN_SQUOZE + && (w>MIN_EX_SQUOZE || w*2>pData->a[i].mxW) + ){ + ix = i; + mx = w; + } + } + if( ix<0 ) break; + if( mx>=MIN_SQUOZE*2 ){ + gain = mx/2; + }else{ + gain = mx - MIN_SQUOZE; + } + if( sumW - gain < targetW ){ + gain = sumW - targetW; + } + sumW -= gain; + pData->a[ix].w -= gain; + pData->bMultiRow = 1; + } +} + +/* +** Columnar modes require that the entire query be evaluated first, with +** results written into memory, so that we can compute appropriate column +** widths. 
+*/ +static void qrfColumnar(Qrf *p){ + sqlite3_int64 i, j; /* Loop counters */ + const char *colSep = 0; /* Column separator text */ + const char *rowSep = 0; /* Row terminator text */ + const char *rowStart = 0; /* Row start text */ + int szColSep, szRowSep, szRowStart; /* Size in bytes of previous 3 */ + int rc; /* Result code */ + int nColumn = p->nCol; /* Number of columns */ + int bWW; /* True to do word-wrap */ + sqlite3_str *pStr; /* Temporary rendering */ + qrfColData data; /* Columnar layout data */ + int bRTrim; /* Trim trailing space */ + + rc = sqlite3_step(p->pStmt); + if( rc!=SQLITE_ROW || nColumn==0 ){ + return; /* No output */ + } + + /* Initialize the data container */ + memset(&data, 0, sizeof(data)); + data.nCol = p->nCol; + data.p = p; + data.a = sqlite3_malloc64( nColumn*sizeof(struct qrfPerCol) ); + if( data.a==0 ){ + qrfOom(p); + return; + } + memset(data.a, 0, nColumn*sizeof(struct qrfPerCol) ); + if( qrfColDataEnlarge(&data) ) return; + assert( data.az!=0 ); + + /* Load the column header names and all cell content into data */ + if( p->spec.bTitles==QRF_Yes ){ + unsigned char saved_eText = p->spec.eText; + p->spec.eText = p->spec.eTitle; + memset(data.abNum, 0, nColumn); + for(i=0; ipStmt,i); + int nNL = 0; + int n, w; + pStr = sqlite3_str_new(p->db); + qrfEncodeText(p, pStr, z ? 
z : ""); + n = sqlite3_str_length(pStr); + qrfStrErr(p, pStr); + z = data.az[data.n] = sqlite3_str_finish(pStr); + if( p->spec.nTitleLimit ){ + nNL = 0; + data.aiWth[data.n] = w = qrfTitleLimit(data.az[data.n], + p->spec.nTitleLimit ); + }else{ + data.aiWth[data.n] = w = qrfDisplayWidth(z, n, &nNL); + } + data.n++; + if( w>data.a[i].mxW ) data.a[i].mxW = w; + if( nNL ) data.bMultiRow = 1; + } + p->spec.eText = saved_eText; + p->nRow++; + } + do{ + if( data.n+nColumn > data.nAlloc ){ + if( qrfColDataEnlarge(&data) ) return; + } + for(i=0; ipStmt,i); + pStr = sqlite3_str_new(p->db); + qrfRenderValue(p, pStr, i); + n = sqlite3_str_length(pStr); + qrfStrErr(p, pStr); + z = data.az[data.n] = sqlite3_str_finish(pStr); + data.abNum[data.n] = eType==SQLITE_INTEGER || eType==SQLITE_FLOAT; + data.aiWth[data.n] = w = qrfDisplayWidth(z, n, &nNL); + data.n++; + if( w>data.a[i].mxW ) data.a[i].mxW = w; + if( nNL ) data.bMultiRow = 1; + } + p->nRow++; + }while( sqlite3_step(p->pStmt)==SQLITE_ROW && p->iErr==SQLITE_OK ); + if( p->iErr ){ + qrfColDataFree(&data); + return; + } + + /* Compute the width and alignment of every column */ + if( p->spec.bTitles==QRF_No ){ + qrfLoadAlignment(&data, p); + }else{ + unsigned char e; + if( p->spec.eTitleAlign==QRF_Auto ){ + e = QRF_ALIGN_Center; + }else{ + e = p->spec.eTitleAlign; + } + for(i=0; ispec.nWidth ){ + w = p->spec.aWidth[i]; + if( w==(-32768) ){ + w = 0; + if( p->spec.nAlign>i && (p->spec.aAlign[i] & QRF_ALIGN_HMASK)==0 ){ + data.a[i].e |= QRF_ALIGN_Right; + } + }else if( w<0 ){ + w = -w; + if( p->spec.nAlign>i && (p->spec.aAlign[i] & QRF_ALIGN_HMASK)==0 ){ + data.a[i].e |= QRF_ALIGN_Right; + } + } + if( w ) data.a[i].fx = 1; + } + if( w==0 ){ + w = data.a[i].mxW; + if( p->spec.nWrap>0 && w>p->spec.nWrap ){ + w = p->spec.nWrap; + data.bMultiRow = 1; + } + }else if( (data.bMultiRow==0 || w==1) && data.a[i].mxW>w ){ + data.bMultiRow = 1; + if( w==1 ){ + /* If aiWth[j] is 2 or more, then there might be a double-wide + ** character 
somewhere. So make the column width at least 2. */ + w = 2; + } + } + data.a[i].w = w; + } + + if( nColumn==1 + && data.n>1 + && p->spec.bSplitColumn==QRF_Yes + && p->spec.eStyle==QRF_STYLE_Column + && p->spec.bTitles==QRF_No + && p->spec.nScreenWidth>data.a[0].w+3 + ){ + /* Attempt to convert single-column tables into multi-column by + ** vertical wrapping, if the screen is wide enough and if the + ** bSplitColumn flag is set. */ + qrfSplitColumn(&data, p); + nColumn = data.nCol; + }else{ + /* Adjust the column widths due to screen width restrictions */ + qrfRestrictScreenWidth(&data, p); + } + + /* Draw the line across the top of the table. Also initialize + ** the row boundary and column separator texts. */ + switch( p->spec.eStyle ){ + case QRF_STYLE_Box: + if( data.nMargin ){ + rowStart = BOX_13 " "; + colSep = " " BOX_13 " "; + rowSep = " " BOX_13 "\n"; + }else{ + rowStart = BOX_13; + colSep = BOX_13; + rowSep = BOX_13 "\n"; + } + if( p->spec.bBorder==QRF_No){ + rowStart += 3; + rowSep = "\n"; + }else{ + qrfBoxSeparator(p->pOut, &data, BOX_R23, BOX_234, BOX_R34, 0); + } + break; + case QRF_STYLE_Table: + if( data.nMargin ){ + rowStart = "| "; + colSep = " | "; + rowSep = " |\n"; + }else{ + rowStart = "|"; + colSep = "|"; + rowSep = "|\n"; + } + if( p->spec.bBorder==QRF_No ){ + rowStart += 1; + rowSep = "\n"; + }else{ + qrfRowSeparator(p->pOut, &data, '+'); + } + break; + case QRF_STYLE_Column: { + static const char zSpace[] = "     "; + rowStart = ""; + if( data.nMargin<2 ){ + colSep = " "; + }else if( data.nMargin<=5 ){ + colSep = &zSpace[5-data.nMargin]; + }else{ + colSep = zSpace; + } + rowSep = "\n"; + break; + } + default: /*case QRF_STYLE_Markdown:*/ + if( data.nMargin ){ + rowStart = "| "; + colSep = " | "; + rowSep = " |\n"; + }else{ + rowStart = "|"; + colSep = "|"; + rowSep = "|\n"; + } + break; + } + szRowStart = (int)strlen(rowStart); + szRowSep = (int)strlen(rowSep); + szColSep = (int)strlen(colSep); + + bWW = (p->spec.bWordWrap==QRF_Yes && 
data.bMultiRow); + if( p->spec.eStyle==QRF_STYLE_Column + || (p->spec.bBorder==QRF_No + && (p->spec.eStyle==QRF_STYLE_Box || p->spec.eStyle==QRF_STYLE_Table) + ) + ){ + bRTrim = 1; + }else{ + bRTrim = 0; + } + for(i=0; ipOut)==SQLITE_OK; i+=nColumn){ + int bMore; + int nRow = 0; + + /* Draw a single row of the table. This might be the title line + ** (if there is a title line) or a row in the body of the table. + ** The column number will be j. The row number is i/nColumn. + */ + for(j=0; jpOut, rowStart, szRowStart); + bMore = 0; + for(j=0; jpOut, &data.a[j], nThis, nWS); + data.a[j].z += iNext; + if( data.a[j].z[0]!=0 ){ + bMore = 1; + } + if( jpOut, colSep, szColSep); + }else{ + if( bRTrim ) qrfRTrim(p->pOut); + sqlite3_str_append(p->pOut, rowSep, szRowSep); + } + } + }while( bMore && ++nRow < p->mxHeight ); + if( bMore ){ + /* This row was terminated by nLineLimit. Show ellipsis. */ + sqlite3_str_append(p->pOut, rowStart, szRowStart); + for(j=0; jpOut, data.a[j].w, ' '); + }else{ + int nE = 3; + if( nE>data.a[j].w ) nE = data.a[j].w; + data.a[j].z = "..."; + qrfPrintAligned(p->pOut, &data.a[j], nE, data.a[j].w-nE); + } + if( jpOut, colSep, szColSep); + }else{ + if( bRTrim ) qrfRTrim(p->pOut); + sqlite3_str_append(p->pOut, rowSep, szRowSep); + } + } + } + + /* Draw either (1) the separator between the title line and the body + ** of the table, or (2) separators between individual rows of the table + ** body. isTitleDataSeparator will be true if we are doing (1). 
+ */ + if( (i==0 || data.bMultiRow) && i+nColumnspec.bTitles==QRF_Yes); + if( isTitleDataSeparator ){ + qrfLoadAlignment(&data, p); + } + switch( p->spec.eStyle ){ + case QRF_STYLE_Table: { + if( isTitleDataSeparator || data.bMultiRow ){ + qrfRowSeparator(p->pOut, &data, '+'); + } + break; + } + case QRF_STYLE_Box: { + if( isTitleDataSeparator ){ + qrfBoxSeparator(p->pOut, &data, DBL_123, DBL_1234, DBL_134, 1); + }else if( data.bMultiRow ){ + qrfBoxSeparator(p->pOut, &data, BOX_123, BOX_1234, BOX_134, 0); + } + break; + } + case QRF_STYLE_Markdown: { + if( isTitleDataSeparator ){ + qrfRowSeparator(p->pOut, &data, '|'); + } + break; + } + case QRF_STYLE_Column: { + if( isTitleDataSeparator ){ + for(j=0; jpOut, data.a[j].w, '-'); + if( jpOut, colSep, szColSep); + }else{ + qrfRTrim(p->pOut); + sqlite3_str_append(p->pOut, rowSep, szRowSep); + } + } + }else if( data.bMultiRow ){ + qrfRTrim(p->pOut); + sqlite3_str_append(p->pOut, "\n", 1); + } + break; + } + } + } + } + + /* Draw the line across the bottom of the table */ + if( p->spec.bBorder!=QRF_No ){ + switch( p->spec.eStyle ){ + case QRF_STYLE_Box: + qrfBoxSeparator(p->pOut, &data, BOX_R12, BOX_124, BOX_R14, 0); + break; + case QRF_STYLE_Table: + qrfRowSeparator(p->pOut, &data, '+'); + break; + } + } + qrfWrite(p); + + qrfColDataFree(&data); + return; +} + +/* +** Parameter azArray points to a zero-terminated array of strings. zStr +** points to a single nul-terminated string. Return non-zero if zStr +** is equal, according to strcmp(), to any of the strings in the array. +** Otherwise, return zero. +*/ +static int qrfStringInArray(const char *zStr, const char **azArray){ + int i; + if( zStr==0 ) return 0; + for(i=0; azArray[i]; i++){ + if( 0==strcmp(zStr, azArray[i]) ) return 1; + } + return 0; +} + +/* +** Print out an EXPLAIN with indentation. This is a two-pass algorithm. +** +** On the first pass, we compute aiIndent[iOp] which is the amount of +** indentation to apply to the iOp-th opcode. 
The output actually occurs +** on the second pass. +** +** The indenting rules are: +** +** * For each "Next", "Prev", "VNext" or "VPrev" instruction, indent +** all opcodes that occur between the p2 jump destination and the opcode +** itself by 2 spaces. +** +** * Do the same for "Return" instructions when P2 is positive. +** See tag-20220407a in wherecode.c and vdbe.c. +** +** * For each "Goto", if the jump destination is earlier in the program +** and ends on one of: +** Yield SeekGT SeekLT RowSetRead Rewind +** or if the P1 parameter is one instead of zero, +** then indent all opcodes between the earlier instruction +** and "Goto" by 2 spaces. +*/ +static void qrfExplain(Qrf *p){ + int *abYield = 0; /* abYield[iOp] is true if opcode iOp is an OP_Yield */ + int *aiIndent = 0; /* Indent the iOp-th opcode by aiIndent[iOp] */ + i64 nAlloc = 0; /* Allocated size of aiIndent[] and abYield[] */ + int nIndent = 0; /* Number of entries in aiIndent[] */ + int iOp; /* Opcode number */ + int i; /* Column loop counter */ + + const char *azNext[] = { "Next", "Prev", "VPrev", "VNext", "SorterNext", + "Return", 0 }; + const char *azYield[] = { "Yield", "SeekLT", "SeekGT", "RowSetRead", + "Rewind", 0 }; + const char *azGoto[] = { "Goto", 0 }; + + /* The caller guarantees that the leftmost 4 columns of the statement + ** passed to this function are equivalent to the leftmost 4 columns + ** of EXPLAIN statement output. In practice the statement may be + ** an EXPLAIN, or it may be a query on the bytecode() virtual table. 
*/ + assert( sqlite3_column_count(p->pStmt)>=4 ); + assert( 0==sqlite3_stricmp( sqlite3_column_name(p->pStmt, 0), "addr" ) ); + assert( 0==sqlite3_stricmp( sqlite3_column_name(p->pStmt, 1), "opcode" ) ); + assert( 0==sqlite3_stricmp( sqlite3_column_name(p->pStmt, 2), "p1" ) ); + assert( 0==sqlite3_stricmp( sqlite3_column_name(p->pStmt, 3), "p2" ) ); + + for(iOp=0; SQLITE_ROW==sqlite3_step(p->pStmt) && !p->iErr; iOp++){ + int iAddr = sqlite3_column_int(p->pStmt, 0); + const char *zOp = (const char*)sqlite3_column_text(p->pStmt, 1); + int p1 = sqlite3_column_int(p->pStmt, 2); + int p2 = sqlite3_column_int(p->pStmt, 3); + + /* Assuming that p2 is an instruction address, set variable p2op to the + ** index of that instruction in the aiIndent[] array. p2 and p2op may be + ** different if the current instruction is part of a sub-program generated + ** by an SQL trigger or foreign key. */ + int p2op = (p2 + (iOp-iAddr)); + + /* Grow the aiIndent array as required */ + if( iOp>=nAlloc ){ + nAlloc += 100; + aiIndent = (int*)sqlite3_realloc64(aiIndent, nAlloc*sizeof(int)); + abYield = (int*)sqlite3_realloc64(abYield, nAlloc*sizeof(int)); + if( aiIndent==0 || abYield==0 ){ + qrfOom(p); + sqlite3_free(aiIndent); + sqlite3_free(abYield); + return; + } + } + + abYield[iOp] = qrfStringInArray(zOp, azYield); + aiIndent[iOp] = 0; + nIndent = iOp+1; + if( qrfStringInArray(zOp, azNext) && p2op>0 ){ + for(i=p2op; ipStmt); + if( p->iErr==SQLITE_OK ){ + static const int aExplainWidth[] = {4, 13, 4, 4, 4, 13, 2, 13}; + static const int aExplainMap[] = {0, 1, 2, 3, 4, 5, 6, 7 }; + static const int aScanExpWidth[] = {4,15, 6, 13, 4, 4, 4, 13, 2, 13}; + static const int aScanExpMap[] = {0, 9, 8, 1, 2, 3, 4, 5, 6, 7 }; + const int *aWidth = aExplainWidth; + const int *aMap = aExplainMap; + int nWidth = sizeof(aExplainWidth)/sizeof(int); + int iIndent = 1; + int nArg = p->nCol; + if( p->spec.eStyle==QRF_STYLE_StatsVm ){ + aWidth = aScanExpWidth; + aMap = aScanExpMap; + nWidth = 
sizeof(aScanExpWidth)/sizeof(int); + iIndent = 3; + } + if( nArg>nWidth ) nArg = nWidth; + + for(iOp=0; sqlite3_step(p->pStmt)==SQLITE_ROW && !p->iErr; iOp++){ + /* If this is the first row seen, print out the headers */ + if( iOp==0 ){ + for(i=0; ipStmt, aMap[i]); + qrfWidthPrint(p,p->pOut, aWidth[i], zCol); + if( i==nArg-1 ){ + sqlite3_str_append(p->pOut, "\n", 1); + }else{ + sqlite3_str_append(p->pOut, " ", 2); + } + } + for(i=0; ipOut, "%.*c", aWidth[i], '-'); + if( i==nArg-1 ){ + sqlite3_str_append(p->pOut, "\n", 1); + }else{ + sqlite3_str_append(p->pOut, " ", 2); + } + } + } + + for(i=0; ipStmt, aMap[i]); + int len; + if( i==nArg-1 ) w = 0; + if( zVal==0 ) zVal = ""; + len = (int)sqlite3_qrf_wcswidth(zVal); + if( len>w ){ + w = len; + zSep = " "; + } + if( i==iIndent && aiIndent && iOppOut, aiIndent[iOp], ' '); + } + qrfWidthPrint(p, p->pOut, w, zVal); + if( i==nArg-1 ){ + sqlite3_str_append(p->pOut, "\n", 1); + }else{ + sqlite3_str_appendall(p->pOut, zSep); + } + } + p->nRow++; + } + qrfWrite(p); + } + sqlite3_free(aiIndent); +} + +/* +** Do a "scanstatus vm" style EXPLAIN listing on p->pStmt. +** +** p->pStmt is probably not an EXPLAIN query. Instead, construct a +** new query that is a bytecode() rendering of p->pStmt with extra +** columns for the "scanstatus vm" outputs, and run the results of +** that new query through the normal EXPLAIN formatting. 
+*/ +static void qrfScanStatusVm(Qrf *p){ + sqlite3_stmt *pOrigStmt = p->pStmt; + sqlite3_stmt *pExplain; + int rc; + static const char *zSql = + " SELECT addr, opcode, p1, p2, p3, p4, p5, comment, nexec," + " format('% 6s (%.2f%%)'," + " CASE WHEN ncycle<100_000 THEN ncycle || ' '" + " WHEN ncycle<100_000_000 THEN (ncycle/1_000) || 'K'" + " WHEN ncycle<100_000_000_000 THEN (ncycle/1_000_000) || 'M'" + " ELSE (ncycle/1000_000_000) || 'G' END," + " ncycle*100.0/(sum(ncycle) OVER ())" + " ) AS cycles" + " FROM bytecode(?1)"; + rc = sqlite3_prepare_v2(p->db, zSql, -1, &pExplain, 0); + if( rc ){ + qrfError(p, rc, "%s", sqlite3_errmsg(p->db)); + sqlite3_finalize(pExplain); + return; + } + sqlite3_bind_pointer(pExplain, 1, pOrigStmt, "stmt-pointer", 0); + p->pStmt = pExplain; + p->nCol = 10; + qrfExplain(p); + sqlite3_finalize(pExplain); + p->pStmt = pOrigStmt; +} + +/* +** Attempt to determine if identifier zName needs to be quoted, either +** because it contains non-alphanumeric characters, or because it is an +** SQLite keyword. Be conservative in this estimate: When in doubt assume +** that quoting is required. +** +** Return 1 if quoting is required. Return 0 if no quoting is required. +*/ + +static int qrf_need_quote(const char *zName){ + int i; + const unsigned char *z = (const unsigned char*)zName; + if( z==0 ) return 1; + if( !qrfAlpha(z[0]) ) return 1; + for(i=0; z[i]; i++){ + if( !qrfAlnum(z[i]) ) return 1; + } + return sqlite3_keyword_check(zName, i)!=0; +} + +/* +** Helper function for QRF_STYLE_Json and QRF_STYLE_JObject. +** The initial "{" for a JSON object that will contain row content +** has been output. Now output all the content. 
+*/ +static void qrfOneJsonRow(Qrf *p){ + int i, nItem; + for(nItem=i=0; inCol; i++){ + const char *zCName; + zCName = sqlite3_column_name(p->pStmt, i); + if( nItem>0 ) sqlite3_str_append(p->pOut, ",", 1); + nItem++; + qrfEncodeText(p, p->pOut, zCName); + sqlite3_str_append(p->pOut, ":", 1); + qrfRenderValue(p, p->pOut, i); + } + qrfWrite(p); +} + +/* +** Render a single row of output for non-columnar styles - any +** style that lets us render row by row as the content is received +** from the query. +*/ +static void qrfOneSimpleRow(Qrf *p){ + int i; + switch( p->spec.eStyle ){ + case QRF_STYLE_Off: + case QRF_STYLE_Count: { + /* No-op */ + break; + } + case QRF_STYLE_Json: { + if( p->nRow==0 ){ + sqlite3_str_append(p->pOut, "[{", 2); + }else{ + sqlite3_str_append(p->pOut, "},\n{", 4); + } + qrfOneJsonRow(p); + break; + } + case QRF_STYLE_JObject: { + if( p->nRow==0 ){ + sqlite3_str_append(p->pOut, "{", 1); + }else{ + sqlite3_str_append(p->pOut, "}\n{", 3); + } + qrfOneJsonRow(p); + break; + } + case QRF_STYLE_Html: { + if( p->nRow==0 && p->spec.bTitles==QRF_Yes ){ + sqlite3_str_append(p->pOut, "", 4); + for(i=0; inCol; i++){ + const char *zCName = sqlite3_column_name(p->pStmt, i); + sqlite3_str_append(p->pOut, "\n", 5); + qrfEncodeText(p, p->pOut, zCName); + } + sqlite3_str_append(p->pOut, "\n\n", 7); + } + sqlite3_str_append(p->pOut, "", 4); + for(i=0; inCol; i++){ + sqlite3_str_append(p->pOut, "\n", 5); + qrfRenderValue(p, p->pOut, i); + } + sqlite3_str_append(p->pOut, "\n\n", 7); + qrfWrite(p); + break; + } + case QRF_STYLE_Insert: { + unsigned int mxIns = p->spec.nMultiInsert; + int szStart = sqlite3_str_length(p->pOut); + if( p->u.nIns==0 || p->u.nIns>=mxIns ){ + if( p->u.nIns ){ + sqlite3_str_append(p->pOut, ";\n", 2); + p->u.nIns = 0; + } + if( qrf_need_quote(p->spec.zTableName) ){ + sqlite3_str_appendf(p->pOut,"INSERT INTO \"%w\"",p->spec.zTableName); + }else{ + sqlite3_str_appendf(p->pOut,"INSERT INTO %s",p->spec.zTableName); + } + if( 
p->spec.bTitles==QRF_Yes ){ + for(i=0; inCol; i++){ + const char *zCName = sqlite3_column_name(p->pStmt, i); + if( qrf_need_quote(zCName) ){ + sqlite3_str_appendf(p->pOut, "%c\"%w\"", + i==0 ? '(' : ',', zCName); + }else{ + sqlite3_str_appendf(p->pOut, "%c%s", + i==0 ? '(' : ',', zCName); + } + } + sqlite3_str_append(p->pOut, ")", 1); + } + sqlite3_str_append(p->pOut," VALUES(", 8); + }else{ + sqlite3_str_append(p->pOut,",\n (", 5); + } + for(i=0; inCol; i++){ + if( i>0 ) sqlite3_str_append(p->pOut, ",", 1); + qrfRenderValue(p, p->pOut, i); + } + p->u.nIns += sqlite3_str_length(p->pOut) + 2 - szStart; + if( p->u.nIns>=mxIns ){ + sqlite3_str_append(p->pOut, ");\n", 3); + p->u.nIns = 0; + }else{ + sqlite3_str_append(p->pOut, ")", 1); + } + qrfWrite(p); + break; + } + case QRF_STYLE_Line: { + sqlite3_str *pVal; + int mxW; + int bWW; + int nSep; + if( p->u.sLine.azCol==0 ){ + p->u.sLine.azCol = sqlite3_malloc64( p->nCol*sizeof(char*) ); + if( p->u.sLine.azCol==0 ){ + qrfOom(p); + break; + } + p->u.sLine.mxColWth = 0; + for(i=0; inCol; i++){ + int sz; + const char *zCName = sqlite3_column_name(p->pStmt, i); + if( zCName==0 ) zCName = "unknown"; + p->u.sLine.azCol[i] = sqlite3_mprintf("%s", zCName); + if( p->spec.nTitleLimit>0 ){ + (void)qrfTitleLimit(p->u.sLine.azCol[i], p->spec.nTitleLimit); + } + sz = (int)sqlite3_qrf_wcswidth(p->u.sLine.azCol[i]); + if( sz > p->u.sLine.mxColWth ) p->u.sLine.mxColWth = sz; + } + } + if( p->nRow ) sqlite3_str_append(p->pOut, "\n", 1); + pVal = sqlite3_str_new(p->db); + nSep = (int)strlen(p->spec.zColumnSep); + mxW = p->mxWidth - (nSep + p->u.sLine.mxColWth); + bWW = p->spec.bWordWrap==QRF_Yes; + for(i=0; inCol; i++){ + const char *zVal; + int cnt = 0; + qrfWidthPrint(p, p->pOut, -p->u.sLine.mxColWth, p->u.sLine.azCol[i]); + sqlite3_str_append(p->pOut, p->spec.zColumnSep, nSep); + qrfRenderValue(p, pVal, i); + zVal = sqlite3_str_value(pVal); + if( zVal==0 ) zVal = ""; + do{ + int nThis, nWide, iNext; + qrfWrapLine(zVal, mxW, bWW, 
&nThis, &nWide, &iNext); + if( cnt ){ + sqlite3_str_appendchar(p->pOut,p->u.sLine.mxColWth+nSep,' '); + } + cnt++; + if( cnt>p->mxHeight ){ + zVal = "..."; + nThis = iNext = 3; + } + sqlite3_str_append(p->pOut, zVal, nThis); + sqlite3_str_append(p->pOut, "\n", 1); + zVal += iNext; + }while( zVal[0] ); + sqlite3_str_reset(pVal); + } + qrfStrErr(p, pVal); + sqlite3_free(sqlite3_str_finish(pVal)); + qrfWrite(p); + break; + } + case QRF_STYLE_Eqp: { + const char *zEqpLine = (const char*)sqlite3_column_text(p->pStmt,3); + int iEqpId = sqlite3_column_int(p->pStmt, 0); + int iParentId = sqlite3_column_int(p->pStmt, 1); + if( zEqpLine==0 ) zEqpLine = ""; + if( zEqpLine[0]=='-' ) qrfEqpRender(p, 0); + qrfEqpAppend(p, iEqpId, iParentId, zEqpLine); + break; + } + default: { /* QRF_STYLE_List */ + if( p->nRow==0 && p->spec.bTitles==QRF_Yes ){ + int saved_eText = p->spec.eText; + p->spec.eText = p->spec.eTitle; + for(i=0; inCol; i++){ + const char *zCName = sqlite3_column_name(p->pStmt, i); + if( i>0 ) sqlite3_str_appendall(p->pOut, p->spec.zColumnSep); + qrfEncodeText(p, p->pOut, zCName); + } + sqlite3_str_appendall(p->pOut, p->spec.zRowSep); + qrfWrite(p); + p->spec.eText = saved_eText; + } + for(i=0; inCol; i++){ + if( i>0 ) sqlite3_str_appendall(p->pOut, p->spec.zColumnSep); + qrfRenderValue(p, p->pOut, i); + } + sqlite3_str_appendall(p->pOut, p->spec.zRowSep); + qrfWrite(p); + break; + } + } + p->nRow++; +} + +/* +** Initialize the internal Qrf object. 
+*/ +static void qrfInitialize( + Qrf *p, /* State object to be initialized */ + sqlite3_stmt *pStmt, /* Query whose output to be formatted */ + const sqlite3_qrf_spec *pSpec, /* Format specification */ + char **pzErr /* Write errors here */ +){ + size_t sz; /* Size of pSpec[], based on pSpec->iVersion */ + memset(p, 0, sizeof(*p)); + p->pzErr = pzErr; + if( pSpec->iVersion>1 ){ + qrfError(p, SQLITE_ERROR, + "unusable sqlite3_qrf_spec.iVersion (%d)", + pSpec->iVersion); + return; + } + p->pStmt = pStmt; + p->db = sqlite3_db_handle(pStmt); + p->pOut = sqlite3_str_new(p->db); + if( p->pOut==0 ){ + qrfOom(p); + return; + } + p->iErr = SQLITE_OK; + p->nCol = sqlite3_column_count(p->pStmt); + p->nRow = 0; + sz = sizeof(sqlite3_qrf_spec); + memcpy(&p->spec, pSpec, sz); + if( p->spec.zNull==0 ) p->spec.zNull = ""; + p->mxWidth = p->spec.nScreenWidth; + if( p->mxWidth<=0 ) p->mxWidth = QRF_MAX_WIDTH; + p->mxHeight = p->spec.nLineLimit; + if( p->mxHeight<=0 ) p->mxHeight = 2147483647; + if( p->spec.eStyle>QRF_STYLE_Table ) p->spec.eStyle = QRF_Auto; + if( p->spec.eEsc>QRF_ESC_Symbol ) p->spec.eEsc = QRF_Auto; + if( p->spec.eText>QRF_TEXT_Relaxed ) p->spec.eText = QRF_Auto; + if( p->spec.eTitle>QRF_TEXT_Relaxed ) p->spec.eTitle = QRF_Auto; + if( p->spec.eBlob>QRF_BLOB_Size ) p->spec.eBlob = QRF_Auto; +qrf_reinit: + switch( p->spec.eStyle ){ + case QRF_Auto: { + switch( sqlite3_stmt_isexplain(pStmt) ){ + case 0: p->spec.eStyle = QRF_STYLE_Box; break; + case 1: p->spec.eStyle = QRF_STYLE_Explain; break; + default: p->spec.eStyle = QRF_STYLE_Eqp; break; + } + goto qrf_reinit; + } + case QRF_STYLE_List: { + if( p->spec.zColumnSep==0 ) p->spec.zColumnSep = "|"; + if( p->spec.zRowSep==0 ) p->spec.zRowSep = "\n"; + break; + } + case QRF_STYLE_JObject: + case QRF_STYLE_Json: { + p->spec.eText = QRF_TEXT_Json; + p->spec.zNull = "null"; + break; + } + case QRF_STYLE_Html: { + p->spec.eText = QRF_TEXT_Html; + p->spec.zNull = "null"; + break; + } + case QRF_STYLE_Insert: { + 
p->spec.eText = QRF_TEXT_Sql; + p->spec.zNull = "NULL"; + if( p->spec.zTableName==0 || p->spec.zTableName[0]==0 ){ + p->spec.zTableName = "tab"; + } + p->u.nIns = 0; + break; + } + case QRF_STYLE_Line: { + if( p->spec.zColumnSep==0 ){ + p->spec.zColumnSep = ": "; + } + break; + } + case QRF_STYLE_Csv: { + p->spec.eStyle = QRF_STYLE_List; + p->spec.eText = QRF_TEXT_Csv; + p->spec.zColumnSep = ","; + p->spec.zRowSep = "\r\n"; + p->spec.zNull = ""; + break; + } + case QRF_STYLE_Quote: { + p->spec.eText = QRF_TEXT_Sql; + p->spec.zNull = "NULL"; + p->spec.zColumnSep = ","; + p->spec.zRowSep = "\n"; + break; + } + case QRF_STYLE_Eqp: { + int expMode = sqlite3_stmt_isexplain(p->pStmt); + if( expMode!=2 ){ + sqlite3_stmt_explain(p->pStmt, 2); + p->expMode = expMode+1; + } + break; + } + case QRF_STYLE_Explain: { + int expMode = sqlite3_stmt_isexplain(p->pStmt); + if( expMode!=1 ){ + sqlite3_stmt_explain(p->pStmt, 1); + p->expMode = expMode+1; + } + break; + } + } + if( p->spec.eEsc==QRF_Auto ){ + p->spec.eEsc = QRF_ESC_Ascii; + } + if( p->spec.eText==QRF_Auto ){ + p->spec.eText = QRF_TEXT_Plain; + } + if( p->spec.eTitle==QRF_Auto ){ + switch( p->spec.eStyle ){ + case QRF_STYLE_Box: + case QRF_STYLE_Column: + case QRF_STYLE_Table: + p->spec.eTitle = QRF_TEXT_Plain; + break; + default: + p->spec.eTitle = p->spec.eText; + break; + } + } + if( p->spec.eBlob==QRF_Auto ){ + switch( p->spec.eText ){ + case QRF_TEXT_Sql: p->spec.eBlob = QRF_BLOB_Sql; break; + case QRF_TEXT_Csv: p->spec.eBlob = QRF_BLOB_Tcl; break; + case QRF_TEXT_Tcl: p->spec.eBlob = QRF_BLOB_Tcl; break; + case QRF_TEXT_Json: p->spec.eBlob = QRF_BLOB_Json; break; + default: p->spec.eBlob = QRF_BLOB_Text; break; + } + } + if( p->spec.bTitles==QRF_Auto ){ + switch( p->spec.eStyle ){ + case QRF_STYLE_Box: + case QRF_STYLE_Csv: + case QRF_STYLE_Column: + case QRF_STYLE_Table: + case QRF_STYLE_Markdown: + p->spec.bTitles = QRF_Yes; + break; + default: + p->spec.bTitles = QRF_No; + break; + } + } + if( 
p->spec.bWordWrap==QRF_Auto ){ + p->spec.bWordWrap = QRF_Yes; + } + if( p->spec.bTextJsonb==QRF_Auto ){ + p->spec.bTextJsonb = QRF_No; + } + if( p->spec.zColumnSep==0 ) p->spec.zColumnSep = ","; + if( p->spec.zRowSep==0 ) p->spec.zRowSep = "\n"; +} + +/* +** Finish rendering the results +*/ +static void qrfFinalize(Qrf *p){ + switch( p->spec.eStyle ){ + case QRF_STYLE_Count: { + sqlite3_str_appendf(p->pOut, "%lld\n", p->nRow); + break; + } + case QRF_STYLE_Json: { + if( p->nRow>0 ){ + sqlite3_str_append(p->pOut, "}]\n", 3); + } + break; + } + case QRF_STYLE_JObject: { + if( p->nRow>0 ){ + sqlite3_str_append(p->pOut, "}\n", 2); + } + break; + } + case QRF_STYLE_Insert: { + if( p->u.nIns ){ + sqlite3_str_append(p->pOut, ";\n", 2); + } + break; + } + case QRF_STYLE_Line: { + if( p->u.sLine.azCol ){ + int i; + for(i=0; i<p->nCol; i++) sqlite3_free(p->u.sLine.azCol[i]); + sqlite3_free(p->u.sLine.azCol); + } + break; + } + case QRF_STYLE_Stats: + case QRF_STYLE_StatsEst: { + i64 nCycle = 0; +#ifdef SQLITE_ENABLE_STMT_SCANSTATUS + sqlite3_stmt_scanstatus_v2(p->pStmt, -1, SQLITE_SCANSTAT_NCYCLE, + SQLITE_SCANSTAT_COMPLEX, (void*)&nCycle); +#endif + qrfEqpRender(p, nCycle); + break; + } + case QRF_STYLE_Eqp: { + qrfEqpRender(p, 0); + break; + } + } + qrfWrite(p); + qrfStrErr(p, p->pOut); + if( p->spec.pzOutput ){ + if( p->spec.pzOutput[0] ){ + sqlite3_int64 n, sz; + char *zCombined; + sz = strlen(p->spec.pzOutput[0]); + n = sqlite3_str_length(p->pOut); + zCombined = sqlite3_realloc64(p->spec.pzOutput[0], sz+n+1); + if( zCombined==0 ){ + sqlite3_free(p->spec.pzOutput[0]); + p->spec.pzOutput[0] = 0; + qrfOom(p); + }else{ + p->spec.pzOutput[0] = zCombined; + memcpy(zCombined+sz, sqlite3_str_value(p->pOut), n+1); + } + sqlite3_free(sqlite3_str_finish(p->pOut)); + }else{ + p->spec.pzOutput[0] = sqlite3_str_finish(p->pOut); + } + }else if( p->pOut ){ + sqlite3_free(sqlite3_str_finish(p->pOut)); + } + if( p->expMode>0 ){ + sqlite3_stmt_explain(p->pStmt, p->expMode-1); + } + if( 
p->actualWidth ){ + sqlite3_free(p->actualWidth); + } + if( p->pJTrans ){ + sqlite3 *db = sqlite3_db_handle(p->pJTrans); + sqlite3_finalize(p->pJTrans); + sqlite3_close(db); + } +} + +/* +** Run the prepared statement pStmt and format the results according +** to the specification provided in pSpec. Return an error code. +** If pzErr is not NULL and if an error occurs, write an error message +** into *pzErr. +*/ +int sqlite3_format_query_result( + sqlite3_stmt *pStmt, /* Statement to evaluate */ + const sqlite3_qrf_spec *pSpec, /* Format specification */ + char **pzErr /* Write error message here */ +){ + Qrf qrf; /* The new Qrf being created */ + + if( pStmt==0 ) return SQLITE_OK; /* No-op */ + if( pSpec==0 ) return SQLITE_MISUSE; + qrfInitialize(&qrf, pStmt, pSpec, pzErr); + switch( qrf.spec.eStyle ){ + case QRF_STYLE_Box: + case QRF_STYLE_Column: + case QRF_STYLE_Markdown: + case QRF_STYLE_Table: { + /* Columnar modes require that the entire query be evaluated and the + ** results stored in memory, so that we can compute column widths */ + qrfColumnar(&qrf); + break; + } + case QRF_STYLE_Explain: { + qrfExplain(&qrf); + break; + } + case QRF_STYLE_StatsVm: { + qrfScanStatusVm(&qrf); + break; + } + case QRF_STYLE_Stats: + case QRF_STYLE_StatsEst: { + qrfEqpStats(&qrf); + break; + } + default: { + /* Non-columnar modes where the output can occur after each row + ** of result is received */ + while( qrf.iErr==SQLITE_OK && sqlite3_step(pStmt)==SQLITE_ROW ){ + qrfOneSimpleRow(&qrf); + } + break; + } + } + qrfResetStmt(&qrf); + qrfFinalize(&qrf); + return qrf.iErr; +} diff --git a/ext/qrf/qrf.h b/ext/qrf/qrf.h new file mode 100644 index 0000000000..e5171b01a0 --- /dev/null +++ b/ext/qrf/qrf.h @@ -0,0 +1,201 @@ +/* +** 2025-10-20 +** +** The author disclaims copyright to this source code. In place of +** a legal notice, here is a blessing: +** +** May you do good and not evil. +** May you find forgiveness for yourself and forgive others. 
+** May you share freely, never taking more than you give. +** +************************************************************************* +** Header file for the Query Result-Format or "qrf" utility library for +** SQLite. See the README.md documentation for additional information. +*/ +#ifndef SQLITE_QRF_H +#define SQLITE_QRF_H +#ifdef __cplusplus +extern "C" { +#endif +#include +#include "sqlite3.h" + +/* +** Specification used by clients to define the output format they want +*/ +typedef struct sqlite3_qrf_spec sqlite3_qrf_spec; +struct sqlite3_qrf_spec { + unsigned char iVersion; /* Version number of this structure */ + unsigned char eStyle; /* Formatting style. "box", "csv", etc... */ + unsigned char eEsc; /* How to escape control characters in text */ + unsigned char eText; /* Quoting style for text */ + unsigned char eTitle; /* Quoting style for the text of column names */ + unsigned char eBlob; /* Quoting style for BLOBs */ + unsigned char bTitles; /* True to show column names */ + unsigned char bWordWrap; /* Try to wrap on word boundaries */ + unsigned char bTextJsonb; /* Render JSONB blobs as JSON text */ + unsigned char eDfltAlign; /* Default alignment, not covered by aAlignment */ + unsigned char eTitleAlign; /* Alignment for column headers */ + unsigned char bSplitColumn; /* Wrap single-column output into many columns */ + unsigned char bBorder; /* Show outer border in Box and Table styles */ + short int nWrap; /* Wrap columns wider than this */ + short int nScreenWidth; /* Maximum overall table width */ + short int nLineLimit; /* Maximum number of lines for any row */ + short int nTitleLimit; /* Maximum number of characters in a title */ + unsigned int nMultiInsert; /* Add rows to one INSERT until size exceeds */ + int nCharLimit; /* Maximum number of characters in a cell */ + int nWidth; /* Number of entries in aWidth[] */ + int nAlign; /* Number of entries in aAlignment[] */ + short int *aWidth; /* Column widths */ + unsigned char *aAlign; /* Column 
alignments */ + char *zColumnSep; /* Alternative column separator */ + char *zRowSep; /* Alternative row separator */ + char *zTableName; /* Output table name */ + char *zNull; /* Rendering of NULL */ + char *(*xRender)(void*,sqlite3_value*); /* Render a value */ + int (*xWrite)(void*,const char*,sqlite3_int64); /* Write output */ + void *pRenderArg; /* First argument to the xRender callback */ + void *pWriteArg; /* First argument to the xWrite callback */ + char **pzOutput; /* Storage location for output string */ + /* Additional fields may be added in the future */ +}; + +/* +** Interfaces +*/ +int sqlite3_format_query_result( + sqlite3_stmt *pStmt, /* SQL statement to run */ + const sqlite3_qrf_spec *pSpec, /* Result format specification */ + char **pzErr /* OUT: Write error message here */ +); + +/* +** Range of values for sqlite3_qrf_spec.aWidth[] entries and for +** sqlite3_qrf_spec.mxColWidth and .nScreenWidth +*/ +#define QRF_MAX_WIDTH 10000 +#define QRF_MIN_WIDTH 0 + +/* +** Output styles: +*/ +#define QRF_STYLE_Auto 0 /* Choose a style automatically */ +#define QRF_STYLE_Box 1 /* Unicode box-drawing characters */ +#define QRF_STYLE_Column 2 /* One record per line in neat columns */ +#define QRF_STYLE_Count 3 /* Output only a count of the rows of output */ +#define QRF_STYLE_Csv 4 /* Comma-separated-value */ +#define QRF_STYLE_Eqp 5 /* Format EXPLAIN QUERY PLAN output */ +#define QRF_STYLE_Explain 6 /* EXPLAIN output */ +#define QRF_STYLE_Html 7 /* Generate an XHTML table */ +#define QRF_STYLE_Insert 8 /* Generate SQL "insert" statements */ +#define QRF_STYLE_Json 9 /* Output is a list of JSON objects */ +#define QRF_STYLE_JObject 10 /* Independent JSON objects for each row */ +#define QRF_STYLE_Line 11 /* One column per line. 
*/ +#define QRF_STYLE_List 12 /* One record per line with a separator */ +#define QRF_STYLE_Markdown 13 /* Markdown formatting */ +#define QRF_STYLE_Off 14 /* No query output shown */ +#define QRF_STYLE_Quote 15 /* SQL-quoted, comma-separated */ +#define QRF_STYLE_Stats 16 /* EQP-like output but with performance stats */ +#define QRF_STYLE_StatsEst 17 /* EQP-like output with planner estimates */ +#define QRF_STYLE_StatsVm 18 /* EXPLAIN-like output with performance stats */ +#define QRF_STYLE_Table 19 /* MySQL-style table formatting */ + +/* +** Quoting styles for text. +** Allowed values for sqlite3_qrf_spec.eText +*/ +#define QRF_TEXT_Auto 0 /* Choose text encoding automatically */ +#define QRF_TEXT_Plain 1 /* Literal text */ +#define QRF_TEXT_Sql 2 /* Quote as an SQL literal */ +#define QRF_TEXT_Csv 3 /* CSV-style quoting */ +#define QRF_TEXT_Html 4 /* HTML-style quoting */ +#define QRF_TEXT_Tcl 5 /* C/Tcl quoting */ +#define QRF_TEXT_Json 6 /* JSON quoting */ +#define QRF_TEXT_Relaxed 7 /* Relaxed SQL quoting */ + +/* +** Quoting styles for BLOBs +** Allowed values for sqlite3_qrf_spec.eBlob +*/ +#define QRF_BLOB_Auto 0 /* Determine BLOB quoting using eText */ +#define QRF_BLOB_Text 1 /* Display content exactly as it is */ +#define QRF_BLOB_Sql 2 /* Quote as an SQL literal */ +#define QRF_BLOB_Hex 3 /* Hexadecimal representation */ +#define QRF_BLOB_Tcl 4 /* "\000" notation */ +#define QRF_BLOB_Json 5 /* A JSON string */ +#define QRF_BLOB_Size 6 /* Display the blob size only */ + +/* +** Control-character escape modes. +** Allowed values for sqlite3_qrf_spec.eEsc +*/ +#define QRF_ESC_Auto 0 /* Choose the ctrl-char escape automatically */ +#define QRF_ESC_Off 1 /* Do not escape control characters */ +#define QRF_ESC_Ascii 2 /* Unix-style escapes. Ex: U+0007 shows ^G */ +#define QRF_ESC_Symbol 3 /* Unicode escapes. Ex: U+0007 shows U+2407 */ + +/* +** Allowed values for "boolean" fields, such as "bColumnNames", "bWordWrap", +** and "bTextJsonb". 
There is an extra "auto" variant so these are actually +** tri-state settings, not booleans. +*/ +#define QRF_SW_Auto 0 /* Let QRF choose the best value */ +#define QRF_SW_Off 1 /* This setting is forced off */ +#define QRF_SW_On 2 /* This setting is forced on */ +#define QRF_Auto 0 /* Alternate spelling for QRF_*_Auto */ +#define QRF_No 1 /* Alternate spelling for QRF_SW_Off */ +#define QRF_Yes 2 /* Alternate spelling for QRF_SW_On */ + +/* +** Possible values for alignment settings +** +** Horizontal Vertical +** ---------- -------- */ +#define QRF_ALIGN_Auto 0 /* auto auto */ +#define QRF_ALIGN_Left 1 /* left auto */ +#define QRF_ALIGN_Center 2 /* center auto */ +#define QRF_ALIGN_Right 3 /* right auto */ +#define QRF_ALIGN_Top 4 /* auto top */ +#define QRF_ALIGN_NW 5 /* left top */ +#define QRF_ALIGN_N 6 /* center top */ +#define QRF_ALIGN_NE 7 /* right top */ +#define QRF_ALIGN_Middle 8 /* auto middle */ +#define QRF_ALIGN_W 9 /* left middle */ +#define QRF_ALIGN_C 10 /* center middle */ +#define QRF_ALIGN_E 11 /* right middle */ +#define QRF_ALIGN_Bottom 12 /* auto bottom */ +#define QRF_ALIGN_SW 13 /* left bottom */ +#define QRF_ALIGN_S 14 /* center bottom */ +#define QRF_ALIGN_SE 15 /* right bottom */ +#define QRF_ALIGN_HMASK 3 /* Horizontal alignment mask */ +#define QRF_ALIGN_VMASK 12 /* Vertical alignment mask */ + +/* +** Auxiliary routines contained within this module that might be useful +** in other contexts, and which are therefore exported. +*/ +/* +** Return an estimate of the width, in columns, for the single Unicode +** character c. For normal characters, the answer is always 1. But the +** estimate might be 0 or 2 for zero-width and double-width characters. +** +** Different devices display unicode using different widths. So +** it is impossible to know the true display width with 100% accuracy. +** Inaccuracies in the width estimates might cause columns to be misaligned. +** Unfortunately, there is nothing we can do about that. 
+*/ +int sqlite3_qrf_wcwidth(int c); + +/* +** Return an estimate of the number of display columns used by the +** string in the argument. The width of individual characters is +** determined as for sqlite3_qrf_wcwidth(). VT100 escape code sequences +** are assigned a width of zero. +*/ +size_t sqlite3_qrf_wcswidth(const char*); + + +#ifdef __cplusplus +} +#endif +#endif /* !defined(SQLITE_QRF_H) */ diff --git a/ext/rbu/rbu11.test b/ext/rbu/rbu11.test index a42163cce6..513ab29f63 100644 --- a/ext/rbu/rbu11.test +++ b/ext/rbu/rbu11.test @@ -192,4 +192,32 @@ do_test 4.7.2 { list [catch {rbu close} msg] $msg } {1 {SQLITE_ERROR - rbu_state mismatch error}} +#------------------------------------------------------------------------- +# https://sqlite.org/forum/info/6d0a31e22a435877 +# +reset_db +forcedelete rbu.db + +do_execsql_test 5.0 { + CREATE TABLE t1(a BLOB); + INSERT INTO t1 VALUES(x'41'); +} + +sqlite3 dbRbu rbu.db +dbRbu eval { + CREATE TABLE data_t1(a, rbu_control, rbu_rowid); + INSERT INTO data_t1 VALUES(X'310a313a5a337e7e7e7e7e40302c303b','f',1); +} +dbRbu close + +do_test 5.1 { + sqlite3rbu rbu test.db rbu.db + rbu step +} {SQLITE_ERROR} + +do_test 5.2 { + list [catch {rbu close} msg] $msg +} {1 {SQLITE_ERROR - corrupt fossil delta}} + + finish_test diff --git a/ext/rbu/sqlite3rbu.c b/ext/rbu/sqlite3rbu.c index 4509986ee7..f377d5c30d 100644 --- a/ext/rbu/sqlite3rbu.c +++ b/ext/rbu/sqlite3rbu.c @@ -623,7 +623,7 @@ static int rbuDeltaApply( /* ERROR: copy exceeds output file size */ return -1; } - if( (int)(ofst+cnt) > lenSrc ){ + if( (u64)ofst+(u64)cnt > (u64)lenSrc ){ /* ERROR: copy extends past end of input */ return -1; } @@ -2269,8 +2269,8 @@ static char *rbuObjIterGetIndexWhere(sqlite3rbu *p, RbuObjIter *pIter){ /* If necessary, grow the pIter->aIdxCol[] array */ if( iIdxCol==nIdxAlloc ){ - RbuSpan *aIdxCol = (RbuSpan*)sqlite3_realloc( - pIter->aIdxCol, (nIdxAlloc+16)*sizeof(RbuSpan) + RbuSpan *aIdxCol = (RbuSpan*)sqlite3_realloc64( + pIter->aIdxCol, 
nIdxAlloc*sizeof(RbuSpan) + 16*sizeof(RbuSpan) ); if( aIdxCol==0 ){ rc = SQLITE_NOMEM; diff --git a/ext/repair/README.md b/ext/repair/README.md deleted file mode 100644 index 927ceb7c44..0000000000 --- a/ext/repair/README.md +++ /dev/null @@ -1,16 +0,0 @@ -This folder contains extensions and utility programs intended to analyze -live database files, detect problems, and possibly fix them. - -As SQLite is being used on larger and larger databases, database sizes -are growing into the terabyte range. At that size, hardware malfunctions -and/or cosmic rays will occasionally corrupt a database file. Detecting -problems and fixing errors a terabyte-sized databases can take hours or days, -and it is undesirable to take applications that depend on the databases -off-line for such a long time. -The utilities in the folder are intended to provide mechanisms for -detecting and fixing problems in large databases while those databases -are in active use. - -The utilities and extensions in this folder are experimental and under -active development at the time of this writing (2017-10-12). If and when -they stabilize, this README will be updated to reflect that fact. diff --git a/ext/repair/checkfreelist.c b/ext/repair/checkfreelist.c deleted file mode 100644 index d1d3d54074..0000000000 --- a/ext/repair/checkfreelist.c +++ /dev/null @@ -1,310 +0,0 @@ -/* -** 2017 October 11 -** -** The author disclaims copyright to this source code. In place of -** a legal notice, here is a blessing: -** -** May you do good and not evil. -** May you find forgiveness for yourself and forgive others. -** May you share freely, never taking more than you give. -** -************************************************************************* -** -** This module exports a single C function: -** -** int sqlite3_check_freelist(sqlite3 *db, const char *zDb); -** -** This function checks the free-list in database zDb (one of "main", -** "temp", etc.) 
and reports any errors by invoking the sqlite3_log() -** function. It returns SQLITE_OK if successful, or an SQLite error -** code otherwise. It is not an error if the free-list is corrupted but -** no IO or OOM errors occur. -** -** If this file is compiled and loaded as an SQLite loadable extension, -** it adds an SQL function "checkfreelist" to the database handle, to -** be invoked as follows: -** -** SELECT checkfreelist(); -** -** This function performs the same checks as sqlite3_check_freelist(), -** except that it returns all error messages as a single text value, -** separated by newline characters. If the freelist is not corrupted -** in any way, an empty string is returned. -** -** To compile this module for use as an SQLite loadable extension: -** -** gcc -Os -fPIC -shared checkfreelist.c -o checkfreelist.so -*/ - -#include "sqlite3ext.h" -SQLITE_EXTENSION_INIT1 - -#ifndef SQLITE_AMALGAMATION -# include -# include -# include -# include -# if defined(SQLITE_COVERAGE_TEST) || defined(SQLITE_MUTATION_TEST) -# define SQLITE_OMIT_AUXILIARY_SAFETY_CHECKS 1 -# endif -# if defined(SQLITE_OMIT_AUXILIARY_SAFETY_CHECKS) -# define ALWAYS(X) (1) -# define NEVER(X) (0) -# elif !defined(NDEBUG) -# define ALWAYS(X) ((X)?1:(assert(0),0)) -# define NEVER(X) ((X)?(assert(0),1):0) -# else -# define ALWAYS(X) (X) -# define NEVER(X) (X) -# endif - typedef unsigned char u8; - typedef unsigned short u16; - typedef unsigned int u32; -#define get4byte(x) ( \ - ((u32)((x)[0])<<24) + \ - ((u32)((x)[1])<<16) + \ - ((u32)((x)[2])<<8) + \ - ((u32)((x)[3])) \ -) -#endif - -/* -** Execute a single PRAGMA statement and return the integer value returned -** via output parameter (*pnOut). -** -** The SQL statement passed as the third argument should be a printf-style -** format string containing a single "%s" which will be replace by the -** value passed as the second argument. e.g. 
-** -** sqlGetInteger(db, "main", "PRAGMA %s.page_count", pnOut) -** -** executes "PRAGMA main.page_count" and stores the results in (*pnOut). -*/ -static int sqlGetInteger( - sqlite3 *db, /* Database handle */ - const char *zDb, /* Database name ("main", "temp" etc.) */ - const char *zFmt, /* SQL statement format */ - u32 *pnOut /* OUT: Integer value */ -){ - int rc, rc2; - char *zSql; - sqlite3_stmt *pStmt = 0; - int bOk = 0; - - zSql = sqlite3_mprintf(zFmt, zDb); - if( zSql==0 ){ - rc = SQLITE_NOMEM; - }else{ - rc = sqlite3_prepare_v2(db, zSql, -1, &pStmt, 0); - sqlite3_free(zSql); - } - - if( rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pStmt) ){ - *pnOut = (u32)sqlite3_column_int(pStmt, 0); - bOk = 1; - } - - rc2 = sqlite3_finalize(pStmt); - if( rc==SQLITE_OK ) rc = rc2; - if( rc==SQLITE_OK && bOk==0 ) rc = SQLITE_ERROR; - return rc; -} - -/* -** Argument zFmt must be a printf-style format string and must be -** followed by its required arguments. If argument pzOut is NULL, -** then the results of printf()ing the format string are passed to -** sqlite3_log(). Otherwise, they are appended to the string -** at (*pzOut). -*/ -static int checkFreelistError(char **pzOut, const char *zFmt, ...){ - int rc = SQLITE_OK; - char *zErr = 0; - va_list ap; - - va_start(ap, zFmt); - zErr = sqlite3_vmprintf(zFmt, ap); - if( zErr==0 ){ - rc = SQLITE_NOMEM; - }else{ - if( pzOut ){ - *pzOut = sqlite3_mprintf("%s%z%s", *pzOut?"\n":"", *pzOut, zErr); - if( *pzOut==0 ) rc = SQLITE_NOMEM; - }else{ - sqlite3_log(SQLITE_ERROR, "checkfreelist: %s", zErr); - } - sqlite3_free(zErr); - } - va_end(ap); - return rc; -} - -static int checkFreelist( - sqlite3 *db, - const char *zDb, - char **pzOut -){ - /* This query returns one row for each page on the free list. Each row has - ** two columns - the page number and page content. 
*/ - const char *zTrunk = - "WITH freelist_trunk(i, d, n) AS (" - "SELECT 1, NULL, sqlite_readint32(data, 32) " - "FROM sqlite_dbpage(:1) WHERE pgno=1 " - "UNION ALL " - "SELECT n, data, sqlite_readint32(data) " - "FROM freelist_trunk, sqlite_dbpage(:1) WHERE pgno=n " - ")" - "SELECT i, d FROM freelist_trunk WHERE i!=1;"; - - int rc, rc2; /* Return code */ - sqlite3_stmt *pTrunk = 0; /* Compilation of zTrunk */ - u32 nPage = 0; /* Number of pages in db */ - u32 nExpected = 0; /* Expected number of free pages */ - u32 nFree = 0; /* Number of pages on free list */ - - if( zDb==0 ) zDb = "main"; - - if( (rc = sqlGetInteger(db, zDb, "PRAGMA %s.page_count", &nPage)) - || (rc = sqlGetInteger(db, zDb, "PRAGMA %s.freelist_count", &nExpected)) - ){ - return rc; - } - - rc = sqlite3_prepare_v2(db, zTrunk, -1, &pTrunk, 0); - if( rc!=SQLITE_OK ) return rc; - sqlite3_bind_text(pTrunk, 1, zDb, -1, SQLITE_STATIC); - while( rc==SQLITE_OK && sqlite3_step(pTrunk)==SQLITE_ROW ){ - u32 i; - u32 iTrunk = (u32)sqlite3_column_int(pTrunk, 0); - const u8 *aData = (const u8*)sqlite3_column_blob(pTrunk, 1); - u32 nData = (u32)sqlite3_column_bytes(pTrunk, 1); - u32 iNext = get4byte(&aData[0]); - u32 nLeaf = get4byte(&aData[4]); - - if( nLeaf>((nData/4)-2-6) ){ - rc = checkFreelistError(pzOut, - "leaf count out of range (%d) on trunk page %d", - (int)nLeaf, (int)iTrunk - ); - nLeaf = (nData/4) - 2 - 6; - } - - nFree += 1+nLeaf; - if( iNext>nPage ){ - rc = checkFreelistError(pzOut, - "trunk page %d is out of range", (int)iNext - ); - } - - for(i=0; rc==SQLITE_OK && inPage ){ - rc = checkFreelistError(pzOut, - "leaf page %d is out of range (child %d of trunk page %d)", - (int)iLeaf, (int)i, (int)iTrunk - ); - } - } - } - - if( rc==SQLITE_OK && nFree!=nExpected ){ - rc = checkFreelistError(pzOut, - "free-list count mismatch: actual=%d header=%d", - (int)nFree, (int)nExpected - ); - } - - rc2 = sqlite3_finalize(pTrunk); - if( rc==SQLITE_OK ) rc = rc2; - return rc; -} - -int 
sqlite3_check_freelist(sqlite3 *db, const char *zDb){ - return checkFreelist(db, zDb, 0); -} - -static void checkfreelist_function( - sqlite3_context *pCtx, - int nArg, - sqlite3_value **apArg -){ - const char *zDb; - int rc; - char *zOut = 0; - sqlite3 *db = sqlite3_context_db_handle(pCtx); - - assert( nArg==1 ); - zDb = (const char*)sqlite3_value_text(apArg[0]); - rc = checkFreelist(db, zDb, &zOut); - if( rc==SQLITE_OK ){ - sqlite3_result_text(pCtx, zOut?zOut:"ok", -1, SQLITE_TRANSIENT); - }else{ - sqlite3_result_error_code(pCtx, rc); - } - - sqlite3_free(zOut); -} - -/* -** An SQL function invoked as follows: -** -** sqlite_readint32(BLOB) -- Decode 32-bit integer from start of blob -*/ -static void readint_function( - sqlite3_context *pCtx, - int nArg, - sqlite3_value **apArg -){ - const u8 *zBlob; - int nBlob; - int iOff = 0; - u32 iRet = 0; - - if( nArg!=1 && nArg!=2 ){ - sqlite3_result_error( - pCtx, "wrong number of arguments to function sqlite_readint32()", -1 - ); - return; - } - if( nArg==2 ){ - iOff = sqlite3_value_int(apArg[1]); - } - - zBlob = sqlite3_value_blob(apArg[0]); - nBlob = sqlite3_value_bytes(apArg[0]); - - if( nBlob>=(iOff+4) ){ - iRet = get4byte(&zBlob[iOff]); - } - - sqlite3_result_int64(pCtx, (sqlite3_int64)iRet); -} - -/* -** Register the SQL functions. -*/ -static int cflRegister(sqlite3 *db){ - int rc = sqlite3_create_function( - db, "sqlite_readint32", -1, SQLITE_UTF8, 0, readint_function, 0, 0 - ); - if( rc!=SQLITE_OK ) return rc; - rc = sqlite3_create_function( - db, "checkfreelist", 1, SQLITE_UTF8, 0, checkfreelist_function, 0, 0 - ); - return rc; -} - -/* -** Extension load function. 
-*/ -#ifdef _WIN32 -__declspec(dllexport) -#endif -int sqlite3_checkfreelist_init( - sqlite3 *db, - char **pzErrMsg, - const sqlite3_api_routines *pApi -){ - SQLITE_EXTENSION_INIT2(pApi); - return cflRegister(db); -} diff --git a/ext/repair/checkindex.c b/ext/repair/checkindex.c deleted file mode 100644 index ed30357e5d..0000000000 --- a/ext/repair/checkindex.c +++ /dev/null @@ -1,929 +0,0 @@ -/* -** 2017 October 27 -** -** The author disclaims copyright to this source code. In place of -** a legal notice, here is a blessing: -** -** May you do good and not evil. -** May you find forgiveness for yourself and forgive others. -** May you share freely, never taking more than you give. -** -************************************************************************* -*/ - -#include "sqlite3ext.h" -SQLITE_EXTENSION_INIT1 - -/* -** Stuff that is available inside the amalgamation, but which we need to -** declare ourselves if this module is compiled separately. -*/ -#ifndef SQLITE_AMALGAMATION -# include -# include -# include -# include -typedef unsigned char u8; -typedef unsigned short u16; -typedef unsigned int u32; -#define get4byte(x) ( \ - ((u32)((x)[0])<<24) + \ - ((u32)((x)[1])<<16) + \ - ((u32)((x)[2])<<8) + \ - ((u32)((x)[3])) \ -) -#endif - -typedef struct CidxTable CidxTable; -typedef struct CidxCursor CidxCursor; - -struct CidxTable { - sqlite3_vtab base; /* Base class. Must be first */ - sqlite3 *db; -}; - -struct CidxCursor { - sqlite3_vtab_cursor base; /* Base class. 
Must be first */ - sqlite3_int64 iRowid; /* Row number of the output */ - char *zIdxName; /* Copy of the index_name parameter */ - char *zAfterKey; /* Copy of the after_key parameter */ - sqlite3_stmt *pStmt; /* SQL statement that generates the output */ -}; - -typedef struct CidxColumn CidxColumn; -struct CidxColumn { - char *zExpr; /* Text for indexed expression */ - int bDesc; /* True for DESC columns, otherwise false */ - int bKey; /* Part of index, not PK */ -}; - -typedef struct CidxIndex CidxIndex; -struct CidxIndex { - char *zWhere; /* WHERE clause, if any */ - int nCol; /* Elements in aCol[] array */ - CidxColumn aCol[1]; /* Array of indexed columns */ -}; - -static void *cidxMalloc(int *pRc, int n){ - void *pRet = 0; - assert( n!=0 ); - if( *pRc==SQLITE_OK ){ - pRet = sqlite3_malloc(n); - if( pRet ){ - memset(pRet, 0, n); - }else{ - *pRc = SQLITE_NOMEM; - } - } - return pRet; -} - -static void cidxCursorError(CidxCursor *pCsr, const char *zFmt, ...){ - va_list ap; - va_start(ap, zFmt); - assert( pCsr->base.pVtab->zErrMsg==0 ); - pCsr->base.pVtab->zErrMsg = sqlite3_vmprintf(zFmt, ap); - va_end(ap); -} - -/* -** Connect to the incremental_index_check virtual table. 
-*/ -static int cidxConnect( - sqlite3 *db, - void *pAux, - int argc, const char *const*argv, - sqlite3_vtab **ppVtab, - char **pzErr -){ - int rc = SQLITE_OK; - CidxTable *pRet; - -#define IIC_ERRMSG 0 -#define IIC_CURRENT_KEY 1 -#define IIC_INDEX_NAME 2 -#define IIC_AFTER_KEY 3 -#define IIC_SCANNER_SQL 4 - rc = sqlite3_declare_vtab(db, - "CREATE TABLE xyz(" - " errmsg TEXT," /* Error message or NULL if everything is ok */ - " current_key TEXT," /* SQLite quote() text of key values */ - " index_name HIDDEN," /* IN: name of the index being scanned */ - " after_key HIDDEN," /* IN: Start scanning after this key */ - " scanner_sql HIDDEN" /* debugging info: SQL used for scanner */ - ")" - ); - pRet = cidxMalloc(&rc, sizeof(CidxTable)); - if( pRet ){ - pRet->db = db; - } - - *ppVtab = (sqlite3_vtab*)pRet; - return rc; -} - -/* -** Disconnect from or destroy an incremental_index_check virtual table. -*/ -static int cidxDisconnect(sqlite3_vtab *pVtab){ - CidxTable *pTab = (CidxTable*)pVtab; - sqlite3_free(pTab); - return SQLITE_OK; -} - -/* -** idxNum and idxStr are not used. There are only three possible plans, -** which are all distinguished by the number of parameters. -** -** No parameters: A degenerate plan. The result is zero rows. -** 1 Parameter: Scan all of the index starting with first entry -** 2 parameters: Scan the index starting after the "after_key". -** -** Provide successively smaller costs for each of these plans to encourage -** the query planner to select the one with the most parameters. 
-*/ -static int cidxBestIndex(sqlite3_vtab *tab, sqlite3_index_info *pInfo){ - int iIdxName = -1; - int iAfterKey = -1; - int i; - - for(i=0; i<pInfo->nConstraint; i++){ - struct sqlite3_index_constraint *p = &pInfo->aConstraint[i]; - if( p->usable==0 ) continue; - if( p->op!=SQLITE_INDEX_CONSTRAINT_EQ ) continue; - - if( p->iColumn==IIC_INDEX_NAME ){ - iIdxName = i; - } - if( p->iColumn==IIC_AFTER_KEY ){ - iAfterKey = i; - } - } - - if( iIdxName<0 ){ - pInfo->estimatedCost = 1000000000.0; - }else{ - pInfo->aConstraintUsage[iIdxName].argvIndex = 1; - pInfo->aConstraintUsage[iIdxName].omit = 1; - if( iAfterKey<0 ){ - pInfo->estimatedCost = 1000000.0; - }else{ - pInfo->aConstraintUsage[iAfterKey].argvIndex = 2; - pInfo->aConstraintUsage[iAfterKey].omit = 1; - pInfo->estimatedCost = 1000.0; - } - } - - return SQLITE_OK; -} - -/* -** Open a new btreeinfo cursor. -*/ -static int cidxOpen(sqlite3_vtab *pVTab, sqlite3_vtab_cursor **ppCursor){ - CidxCursor *pRet; - int rc = SQLITE_OK; - - pRet = cidxMalloc(&rc, sizeof(CidxCursor)); - - *ppCursor = (sqlite3_vtab_cursor*)pRet; - return rc; -} - -/* -** Close a btreeinfo cursor. -*/ -static int cidxClose(sqlite3_vtab_cursor *pCursor){ - CidxCursor *pCsr = (CidxCursor*)pCursor; - sqlite3_finalize(pCsr->pStmt); - sqlite3_free(pCsr->zIdxName); - sqlite3_free(pCsr->zAfterKey); - sqlite3_free(pCsr); - return SQLITE_OK; -} - -/* -** Move a btreeinfo cursor to the next entry in the file. 
-*/ -static int cidxNext(sqlite3_vtab_cursor *pCursor){ - CidxCursor *pCsr = (CidxCursor*)pCursor; - int rc = sqlite3_step(pCsr->pStmt); - if( rc!=SQLITE_ROW ){ - rc = sqlite3_finalize(pCsr->pStmt); - pCsr->pStmt = 0; - if( rc!=SQLITE_OK ){ - sqlite3 *db = ((CidxTable*)pCsr->base.pVtab)->db; - cidxCursorError(pCsr, "Cursor error: %s", sqlite3_errmsg(db)); - } - }else{ - pCsr->iRowid++; - rc = SQLITE_OK; - } - return rc; -} - -/* We have reached EOF if previous sqlite3_step() returned -** anything other than SQLITE_ROW; -*/ -static int cidxEof(sqlite3_vtab_cursor *pCursor){ - CidxCursor *pCsr = (CidxCursor*)pCursor; - return pCsr->pStmt==0; -} - -static char *cidxMprintf(int *pRc, const char *zFmt, ...){ - char *zRet = 0; - va_list ap; - va_start(ap, zFmt); - zRet = sqlite3_vmprintf(zFmt, ap); - if( *pRc==SQLITE_OK ){ - if( zRet==0 ){ - *pRc = SQLITE_NOMEM; - } - }else{ - sqlite3_free(zRet); - zRet = 0; - } - va_end(ap); - return zRet; -} - -static sqlite3_stmt *cidxPrepare( - int *pRc, CidxCursor *pCsr, const char *zFmt, ... -){ - sqlite3_stmt *pRet = 0; - char *zSql; - va_list ap; /* ... 
printf arguments */ - va_start(ap, zFmt); - - zSql = sqlite3_vmprintf(zFmt, ap); - if( *pRc==SQLITE_OK ){ - if( zSql==0 ){ - *pRc = SQLITE_NOMEM; - }else{ - sqlite3 *db = ((CidxTable*)pCsr->base.pVtab)->db; - *pRc = sqlite3_prepare_v2(db, zSql, -1, &pRet, 0); - if( *pRc!=SQLITE_OK ){ - cidxCursorError(pCsr, "SQL error: %s", sqlite3_errmsg(db)); - } - } - } - sqlite3_free(zSql); - va_end(ap); - - return pRet; -} - -static void cidxFinalize(int *pRc, sqlite3_stmt *pStmt){ - int rc = sqlite3_finalize(pStmt); - if( *pRc==SQLITE_OK ) *pRc = rc; -} - -char *cidxStrdup(int *pRc, const char *zStr){ - char *zRet = 0; - if( *pRc==SQLITE_OK ){ - int n = (int)strlen(zStr); - zRet = cidxMalloc(pRc, n+1); - if( zRet ) memcpy(zRet, zStr, n+1); - } - return zRet; -} - -static void cidxFreeIndex(CidxIndex *pIdx){ - if( pIdx ){ - int i; - for(i=0; i<pIdx->nCol; i++){ - sqlite3_free(pIdx->aCol[i].zExpr); - } - sqlite3_free(pIdx->zWhere); - sqlite3_free(pIdx); - } -} - -static int cidx_isspace(char c){ - return c==' ' || c=='\t' || c=='\r' || c=='\n'; -} - -static int cidx_isident(char c){ - return c<0 - || (c>='0' && c<='9') || (c>='a' && c<='z') - || (c>='A' && c<='Z') || c=='_'; -} - -#define CIDX_PARSE_EOF 0 -#define CIDX_PARSE_COMMA 1 /* "," */ -#define CIDX_PARSE_OPEN 2 /* "(" */ -#define CIDX_PARSE_CLOSE 3 /* ")" */ - -/* -** Argument zIn points into the start, middle or end of a CREATE INDEX -** statement. If argument pbDoNotTrim is non-NULL, then this function -** scans the input until it finds EOF, a comma (",") or an open or -** close parenthesis character. It then sets (*pzOut) to point to said -** character and returns a CIDX_PARSE_XXX constant as appropriate. The -** parser is smart enough that special characters inside SQL strings -** or comments are not returned for. -** -** Or, if argument pbDoNotTrim is NULL, then this function sets *pzOut -** to point to the first character of the string that is not whitespace -** or part of an SQL comment and returns CIDX_PARSE_EOF. 
-** -** Additionally, if pbDoNotTrim is not NULL and the element immediately -** before (*pzOut) is an SQL comment of the form "-- comment", then -** (*pbDoNotTrim) is set before returning. In all other cases it is -** cleared. -*/ -static int cidxFindNext( - const char *zIn, - const char **pzOut, - int *pbDoNotTrim /* OUT: True if prev is -- comment */ -){ - const char *z = zIn; - - while( 1 ){ - while( cidx_isspace(*z) ) z++; - if( z[0]=='-' && z[1]=='-' ){ - z += 2; - while( z[0]!='\n' ){ - if( z[0]=='\0' ) return CIDX_PARSE_EOF; - z++; - } - while( cidx_isspace(*z) ) z++; - if( pbDoNotTrim ) *pbDoNotTrim = 1; - }else - if( z[0]=='/' && z[1]=='*' ){ - z += 2; - while( z[0]!='*' || z[1]!='/' ){ - if( z[1]=='\0' ) return CIDX_PARSE_EOF; - z++; - } - z += 2; - }else{ - *pzOut = z; - if( pbDoNotTrim==0 ) return CIDX_PARSE_EOF; - switch( *z ){ - case '\0': - return CIDX_PARSE_EOF; - case '(': - return CIDX_PARSE_OPEN; - case ')': - return CIDX_PARSE_CLOSE; - case ',': - return CIDX_PARSE_COMMA; - - case '"': - case '\'': - case '`': { - char q = *z; - z++; - while( *z ){ - if( *z==q ){ - z++; - if( *z!=q ) break; - } - z++; - } - break; - } - - case '[': - while( *z++!=']' ); - break; - - default: - z++; - break; - } - *pbDoNotTrim = 0; - } - } - - assert( 0 ); - return -1; -} - -static int cidxParseSQL(CidxCursor *pCsr, CidxIndex *pIdx, const char *zSql){ - const char *z = zSql; - const char *z1; - int e; - int rc = SQLITE_OK; - int nParen = 1; - int bDoNotTrim = 0; - CidxColumn *pCol = pIdx->aCol; - - e = cidxFindNext(z, &z, &bDoNotTrim); - if( e!=CIDX_PARSE_OPEN ) goto parse_error; - z1 = z+1; - z++; - while( nParen>0 ){ - e = cidxFindNext(z, &z, &bDoNotTrim); - if( e==CIDX_PARSE_EOF ) goto parse_error; - if( (e==CIDX_PARSE_COMMA || e==CIDX_PARSE_CLOSE) && nParen==1 ){ - const char *z2 = z; - if( pCol->zExpr ) goto parse_error; - - if( bDoNotTrim==0 ){ - while( cidx_isspace(z[-1]) ) z--; - if( !sqlite3_strnicmp(&z[-3], "asc", 3) && 0==cidx_isident(z[-4]) ){ - z -= 
3; - while( cidx_isspace(z[-1]) ) z--; - }else - if( !sqlite3_strnicmp(&z[-4], "desc", 4) && 0==cidx_isident(z[-5]) ){ - z -= 4; - while( cidx_isspace(z[-1]) ) z--; - } - while( cidx_isspace(z1[0]) ) z1++; - } - - pCol->zExpr = cidxMprintf(&rc, "%.*s", z-z1, z1); - pCol++; - z = z1 = z2+1; - } - if( e==CIDX_PARSE_OPEN ) nParen++; - if( e==CIDX_PARSE_CLOSE ) nParen--; - z++; - } - - /* Search for a WHERE clause */ - cidxFindNext(z, &z, 0); - if( 0==sqlite3_strnicmp(z, "where", 5) ){ - pIdx->zWhere = cidxMprintf(&rc, "%s\n", &z[5]); - }else if( z[0]!='\0' ){ - goto parse_error; - } - - return rc; - - parse_error: - cidxCursorError(pCsr, "Parse error in: %s", zSql); - return SQLITE_ERROR; -} - -static int cidxLookupIndex( - CidxCursor *pCsr, /* Cursor object */ - const char *zIdx, /* Name of index to look up */ - CidxIndex **ppIdx, /* OUT: Description of columns */ - char **pzTab /* OUT: Table name */ -){ - int rc = SQLITE_OK; - char *zTab = 0; - CidxIndex *pIdx = 0; - - sqlite3_stmt *pFindTab = 0; - sqlite3_stmt *pInfo = 0; - - /* Find the table for this index. 
*/ - pFindTab = cidxPrepare(&rc, pCsr, - "SELECT tbl_name, sql FROM sqlite_schema WHERE name=%Q AND type='index'", - zIdx - ); - if( rc==SQLITE_OK && sqlite3_step(pFindTab)==SQLITE_ROW ){ - const char *zSql = (const char*)sqlite3_column_text(pFindTab, 1); - zTab = cidxStrdup(&rc, (const char*)sqlite3_column_text(pFindTab, 0)); - - pInfo = cidxPrepare(&rc, pCsr, "PRAGMA index_xinfo(%Q)", zIdx); - if( rc==SQLITE_OK ){ - int nAlloc = 0; - int iCol = 0; - - while( sqlite3_step(pInfo)==SQLITE_ROW ){ - const char *zName = (const char*)sqlite3_column_text(pInfo, 2); - const char *zColl = (const char*)sqlite3_column_text(pInfo, 4); - CidxColumn *p; - if( zName==0 ) zName = "rowid"; - if( iCol==nAlloc ){ - int nByte = sizeof(CidxIndex) + sizeof(CidxColumn)*(nAlloc+8); - pIdx = (CidxIndex*)sqlite3_realloc(pIdx, nByte); - nAlloc += 8; - } - p = &pIdx->aCol[iCol++]; - p->bDesc = sqlite3_column_int(pInfo, 3); - p->bKey = sqlite3_column_int(pInfo, 5); - if( zSql==0 || p->bKey==0 ){ - p->zExpr = cidxMprintf(&rc, "\"%w\" COLLATE %s",zName,zColl); - }else{ - p->zExpr = 0; - } - pIdx->nCol = iCol; - pIdx->zWhere = 0; - } - cidxFinalize(&rc, pInfo); - } - - if( rc==SQLITE_OK && zSql ){ - rc = cidxParseSQL(pCsr, pIdx, zSql); - } - } - - cidxFinalize(&rc, pFindTab); - if( rc==SQLITE_OK && zTab==0 ){ - rc = SQLITE_ERROR; - } - - if( rc!=SQLITE_OK ){ - sqlite3_free(zTab); - cidxFreeIndex(pIdx); - }else{ - *pzTab = zTab; - *ppIdx = pIdx; - } - - return rc; -} - -static int cidxDecodeAfter( - CidxCursor *pCsr, - int nCol, - const char *zAfterKey, - char ***pazAfter -){ - char **azAfter; - int rc = SQLITE_OK; - int nAfterKey = (int)strlen(zAfterKey); - - azAfter = cidxMalloc(&rc, sizeof(char*)*nCol + nAfterKey+1); - if( rc==SQLITE_OK ){ - int i; - char *zCopy = (char*)&azAfter[nCol]; - char *p = zCopy; - memcpy(zCopy, zAfterKey, nAfterKey+1); - for(i=0; i='0' && *p<='9') - || *p=='.' 
|| *p=='+' || *p=='-' || *p=='e' || *p=='E' - ){ - p++; - } - } - - while( *p==' ' ) p++; - if( *p!=(i==(nCol-1) ? '\0' : ',') ){ - goto parse_error; - } - *p++ = '\0'; - } - } - - *pazAfter = azAfter; - return rc; - - parse_error: - sqlite3_free(azAfter); - *pazAfter = 0; - cidxCursorError(pCsr, "%s", "error parsing after value"); - return SQLITE_ERROR; -} - -static char *cidxWhere( - int *pRc, CidxColumn *aCol, char **azAfter, int iGt, int bLastIsNull -){ - char *zRet = 0; - const char *zSep = ""; - int i; - - for(i=0; i"), - azAfter[iGt] - ); - }else{ - zRet = cidxMprintf(pRc, "%z%s(%s) IS NOT NULL", zRet, zSep,aCol[iGt].zExpr); - } - - return zRet; -} - -#define CIDX_CLIST_ALL 0 -#define CIDX_CLIST_ORDERBY 1 -#define CIDX_CLIST_CURRENT_KEY 2 -#define CIDX_CLIST_SUBWHERE 3 -#define CIDX_CLIST_SUBEXPR 4 - -/* -** This function returns various strings based on the contents of the -** CidxIndex structure and the eType parameter. -*/ -static char *cidxColumnList( - int *pRc, /* IN/OUT: Error code */ - const char *zIdx, - CidxIndex *pIdx, /* Indexed columns */ - int eType /* True to include ASC/DESC */ -){ - char *zRet = 0; - if( *pRc==SQLITE_OK ){ - const char *aDir[2] = {"", " DESC"}; - int i; - const char *zSep = ""; - - for(i=0; i<pIdx->nCol; i++){ - CidxColumn *p = &pIdx->aCol[i]; - assert( pIdx->aCol[i].bDesc==0 || pIdx->aCol[i].bDesc==1 ); - switch( eType ){ - - case CIDX_CLIST_ORDERBY: - zRet = cidxMprintf(pRc, "%z%s%d%s", zRet, zSep, i+1, aDir[p->bDesc]); - zSep = ","; - break; - - case CIDX_CLIST_CURRENT_KEY: - zRet = cidxMprintf(pRc, "%z%squote(i%d)", zRet, zSep, i); - zSep = "||','||"; - break; - - case CIDX_CLIST_SUBWHERE: - if( p->bKey==0 ){ - zRet = cidxMprintf(pRc, "%z%s%s IS i.i%d", zRet, - zSep, p->zExpr, i - ); - zSep = " AND "; - } - break; - - case CIDX_CLIST_SUBEXPR: - if( p->bKey==1 ){ - zRet = cidxMprintf(pRc, "%z%s%s IS i.i%d", zRet, - zSep, p->zExpr, i - ); - zSep = " AND "; - } - break; - - default: - assert( eType==CIDX_CLIST_ALL ); - zRet = 
cidxMprintf(pRc, "%z%s(%s) AS i%d", zRet, zSep, p->zExpr, i); - zSep = ", "; - break; - } - } - } - - return zRet; -} - -/* -** Generate SQL (in memory obtained from sqlite3_malloc()) that will -** continue the index scan for zIdxName starting after zAfterKey. -*/ -int cidxGenerateScanSql( - CidxCursor *pCsr, /* The cursor which needs the new statement */ - const char *zIdxName, /* index to be scanned */ - const char *zAfterKey, /* start after this key, if not NULL */ - char **pzSqlOut /* OUT: Write the generated SQL here */ -){ - int rc; - char *zTab = 0; - char *zCurrentKey = 0; - char *zOrderBy = 0; - char *zSubWhere = 0; - char *zSubExpr = 0; - char *zSrcList = 0; - char **azAfter = 0; - CidxIndex *pIdx = 0; - - *pzSqlOut = 0; - rc = cidxLookupIndex(pCsr, zIdxName, &pIdx, &zTab); - - zOrderBy = cidxColumnList(&rc, zIdxName, pIdx, CIDX_CLIST_ORDERBY); - zCurrentKey = cidxColumnList(&rc, zIdxName, pIdx, CIDX_CLIST_CURRENT_KEY); - zSubWhere = cidxColumnList(&rc, zIdxName, pIdx, CIDX_CLIST_SUBWHERE); - zSubExpr = cidxColumnList(&rc, zIdxName, pIdx, CIDX_CLIST_SUBEXPR); - zSrcList = cidxColumnList(&rc, zIdxName, pIdx, CIDX_CLIST_ALL); - - if( rc==SQLITE_OK && zAfterKey ){ - rc = cidxDecodeAfter(pCsr, pIdx->nCol, zAfterKey, &azAfter); - } - - if( rc==SQLITE_OK ){ - if( zAfterKey==0 ){ - *pzSqlOut = cidxMprintf(&rc, - "SELECT (SELECT %s FROM %Q AS t WHERE %s), %s " - "FROM (SELECT %s FROM %Q INDEXED BY %Q %s%sORDER BY %s) AS i", - zSubExpr, zTab, zSubWhere, zCurrentKey, - zSrcList, zTab, zIdxName, - (pIdx->zWhere ? "WHERE " : ""), (pIdx->zWhere ? 
pIdx->zWhere : ""), - zOrderBy - ); - }else{ - const char *zSep = ""; - char *zSql; - int i; - - zSql = cidxMprintf(&rc, - "SELECT (SELECT %s FROM %Q WHERE %s), %s FROM (", - zSubExpr, zTab, zSubWhere, zCurrentKey - ); - for(i=pIdx->nCol-1; i>=0; i--){ - int j; - if( pIdx->aCol[i].bDesc && azAfter[i]==0 ) continue; - for(j=0; j<2; j++){ - char *zWhere = cidxWhere(&rc, pIdx->aCol, azAfter, i, j); - zSql = cidxMprintf(&rc, "%z" - "%sSELECT * FROM (" - "SELECT %s FROM %Q INDEXED BY %Q WHERE %s%s%z ORDER BY %s" - ")", - zSql, zSep, zSrcList, zTab, zIdxName, - pIdx->zWhere ? pIdx->zWhere : "", - pIdx->zWhere ? " AND " : "", - zWhere, zOrderBy - ); - zSep = " UNION ALL "; - if( pIdx->aCol[i].bDesc==0 ) break; - } - } - *pzSqlOut = cidxMprintf(&rc, "%z) AS i", zSql); - } - } - - sqlite3_free(zTab); - sqlite3_free(zCurrentKey); - sqlite3_free(zOrderBy); - sqlite3_free(zSubWhere); - sqlite3_free(zSubExpr); - sqlite3_free(zSrcList); - cidxFreeIndex(pIdx); - sqlite3_free(azAfter); - return rc; -} - - -/* -** Position a cursor back to the beginning. -*/ -static int cidxFilter( - sqlite3_vtab_cursor *pCursor, - int idxNum, const char *idxStr, - int argc, sqlite3_value **argv -){ - int rc = SQLITE_OK; - CidxCursor *pCsr = (CidxCursor*)pCursor; - const char *zIdxName = 0; - const char *zAfterKey = 0; - - sqlite3_free(pCsr->zIdxName); - pCsr->zIdxName = 0; - sqlite3_free(pCsr->zAfterKey); - pCsr->zAfterKey = 0; - sqlite3_finalize(pCsr->pStmt); - pCsr->pStmt = 0; - - if( argc>0 ){ - zIdxName = (const char*)sqlite3_value_text(argv[0]); - if( argc>1 ){ - zAfterKey = (const char*)sqlite3_value_text(argv[1]); - } - } - - if( zIdxName ){ - char *zSql = 0; - pCsr->zIdxName = sqlite3_mprintf("%s", zIdxName); - pCsr->zAfterKey = zAfterKey ? 
sqlite3_mprintf("%s", zAfterKey) : 0; - rc = cidxGenerateScanSql(pCsr, zIdxName, zAfterKey, &zSql); - if( zSql ){ - pCsr->pStmt = cidxPrepare(&rc, pCsr, "%z", zSql); - } - } - - if( pCsr->pStmt ){ - assert( rc==SQLITE_OK ); - rc = cidxNext(pCursor); - } - pCsr->iRowid = 1; - return rc; -} - -/* -** Return a column value. -*/ -static int cidxColumn( - sqlite3_vtab_cursor *pCursor, - sqlite3_context *ctx, - int iCol -){ - CidxCursor *pCsr = (CidxCursor*)pCursor; - assert( iCol>=IIC_ERRMSG && iCol<=IIC_SCANNER_SQL ); - switch( iCol ){ - case IIC_ERRMSG: { - const char *zVal = 0; - if( sqlite3_column_type(pCsr->pStmt, 0)==SQLITE_INTEGER ){ - if( sqlite3_column_int(pCsr->pStmt, 0)==0 ){ - zVal = "row data mismatch"; - } - }else{ - zVal = "row missing"; - } - sqlite3_result_text(ctx, zVal, -1, SQLITE_STATIC); - break; - } - case IIC_CURRENT_KEY: { - sqlite3_result_value(ctx, sqlite3_column_value(pCsr->pStmt, 1)); - break; - } - case IIC_INDEX_NAME: { - sqlite3_result_text(ctx, pCsr->zIdxName, -1, SQLITE_TRANSIENT); - break; - } - case IIC_AFTER_KEY: { - sqlite3_result_text(ctx, pCsr->zAfterKey, -1, SQLITE_TRANSIENT); - break; - } - case IIC_SCANNER_SQL: { - char *zSql = 0; - cidxGenerateScanSql(pCsr, pCsr->zIdxName, pCsr->zAfterKey, &zSql); - sqlite3_result_text(ctx, zSql, -1, sqlite3_free); - break; - } - } - return SQLITE_OK; -} - -/* Return the ROWID for the sqlite_btreeinfo table */ -static int cidxRowid(sqlite3_vtab_cursor *pCursor, sqlite_int64 *pRowid){ - CidxCursor *pCsr = (CidxCursor*)pCursor; - *pRowid = pCsr->iRowid; - return SQLITE_OK; -} - -/* -** Register the virtual table modules with the database handle passed -** as the only argument. 
-*/ -static int ciInit(sqlite3 *db){ - static sqlite3_module cidx_module = { - 0, /* iVersion */ - 0, /* xCreate */ - cidxConnect, /* xConnect */ - cidxBestIndex, /* xBestIndex */ - cidxDisconnect, /* xDisconnect */ - 0, /* xDestroy */ - cidxOpen, /* xOpen - open a cursor */ - cidxClose, /* xClose - close a cursor */ - cidxFilter, /* xFilter - configure scan constraints */ - cidxNext, /* xNext - advance a cursor */ - cidxEof, /* xEof - check for end of scan */ - cidxColumn, /* xColumn - read data */ - cidxRowid, /* xRowid - read data */ - 0, /* xUpdate */ - 0, /* xBegin */ - 0, /* xSync */ - 0, /* xCommit */ - 0, /* xRollback */ - 0, /* xFindMethod */ - 0, /* xRename */ - 0, /* xSavepoint */ - 0, /* xRelease */ - 0, /* xRollbackTo */ - 0, /* xShadowName */ - 0 /* xIntegrity */ - }; - return sqlite3_create_module(db, "incremental_index_check", &cidx_module, 0); -} - -/* -** Extension load function. -*/ -#ifdef _WIN32 -__declspec(dllexport) -#endif -int sqlite3_checkindex_init( - sqlite3 *db, - char **pzErrMsg, - const sqlite3_api_routines *pApi -){ - SQLITE_EXTENSION_INIT2(pApi); - return ciInit(db); -} diff --git a/ext/repair/sqlite3_checker.c.in b/ext/repair/sqlite3_checker.c.in deleted file mode 100644 index 96b15f2713..0000000000 --- a/ext/repair/sqlite3_checker.c.in +++ /dev/null @@ -1,85 +0,0 @@ -/* -** Read an SQLite database file and analyze its space utilization. Generate -** text on standard output. 
-*/ -#define TCLSH_INIT_PROC sqlite3_checker_init_proc -#define SQLITE_ENABLE_DBPAGE_VTAB 1 -#undef SQLITE_THREADSAFE -#define SQLITE_THREADSAFE 0 -#undef SQLITE_ENABLE_COLUMN_METADATA -#define SQLITE_OMIT_DECLTYPE 1 -#define SQLITE_OMIT_DEPRECATED 1 -#define SQLITE_OMIT_PROGRESS_CALLBACK 1 -#define SQLITE_OMIT_SHARED_CACHE 1 -#define SQLITE_DEFAULT_MEMSTATUS 0 -#define SQLITE_MAX_EXPR_DEPTH 0 -INCLUDE sqlite3.c -INCLUDE $ROOT/src/tclsqlite.c -INCLUDE $ROOT/ext/misc/btreeinfo.c -INCLUDE $ROOT/ext/repair/checkindex.c -INCLUDE $ROOT/ext/repair/checkfreelist.c - -/* -** Decode a pointer to an sqlite3 object. -*/ -int getDbPointer(Tcl_Interp *interp, const char *zA, sqlite3 **ppDb){ - struct SqliteDb *p; - Tcl_CmdInfo cmdInfo; - if( Tcl_GetCommandInfo(interp, zA, &cmdInfo) ){ - p = (struct SqliteDb*)cmdInfo.objClientData; - *ppDb = p->db; - return TCL_OK; - }else{ - *ppDb = 0; - return TCL_ERROR; - } - return TCL_OK; -} - -/* -** sqlite3_imposter db main rootpage {CREATE TABLE...} ;# setup an imposter -** sqlite3_imposter db main ;# rm all imposters -*/ -static int sqlite3_imposter( - void *clientData, - Tcl_Interp *interp, - int objc, - Tcl_Obj *CONST objv[] -){ - sqlite3 *db; - const char *zSchema; - int iRoot; - const char *zSql; - - if( objc!=3 && objc!=5 ){ - Tcl_WrongNumArgs(interp, 1, objv, "DB SCHEMA [ROOTPAGE SQL]"); - return TCL_ERROR; - } - if( getDbPointer(interp, Tcl_GetString(objv[1]), &db) ) return TCL_ERROR; - zSchema = Tcl_GetString(objv[2]); - if( objc==3 ){ - sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, db, zSchema, 0, 1); - }else{ - if( Tcl_GetIntFromObj(interp, objv[3], &iRoot) ) return TCL_ERROR; - zSql = Tcl_GetString(objv[4]); - sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, db, zSchema, 1, iRoot); - sqlite3_exec(db, zSql, 0, 0, 0); - sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, db, zSchema, 0, 0); - } - return TCL_OK; -} - -#include - -const char *sqlite3_checker_init_proc(Tcl_Interp *interp){ - Tcl_CreateObjCommand(interp, 
"sqlite3_imposter", - (Tcl_ObjCmdProc*)sqlite3_imposter, 0, 0); - sqlite3_auto_extension((void(*)(void))sqlite3_btreeinfo_init); - sqlite3_auto_extension((void(*)(void))sqlite3_checkindex_init); - sqlite3_auto_extension((void(*)(void))sqlite3_checkfreelist_init); - return -BEGIN_STRING -INCLUDE $ROOT/ext/repair/sqlite3_checker.tcl -END_STRING -; -} diff --git a/ext/repair/sqlite3_checker.tcl b/ext/repair/sqlite3_checker.tcl deleted file mode 100644 index 2ae6e15b12..0000000000 --- a/ext/repair/sqlite3_checker.tcl +++ /dev/null @@ -1,264 +0,0 @@ -# This TCL script is the main driver script for the sqlite3_checker utility -# program. -# - -# Special case: -# -# sqlite3_checker --test FILENAME ARGS -# -# uses FILENAME in place of this script. -# -if {[lindex $argv 0]=="--test" && [llength $argv]>1} { - set ::argv0 [lindex $argv 1] - set argv [lrange $argv 2 end] - source $argv0 - exit 0 -} - -# Emulate a TCL shell -# -proc tclsh {} { - set line {} - while {![eof stdin]} { - if {$line!=""} { - puts -nonewline "> " - } else { - puts -nonewline "% " - } - flush stdout - append line [gets stdin] - if {[info complete $line]} { - if {[catch {uplevel #0 $line} result]} { - puts stderr "Error: $result" - } elseif {$result!=""} { - puts $result - } - set line {} - } else { - append line \n - } - } -} - -# Do an incremental integrity check of a single index -# -proc check_index {idxname batchsize bTrace} { - set i 0 - set more 1 - set nerr 0 - set pct 00.0 - set max [db one {SELECT nEntry FROM sqlite_btreeinfo('main') - WHERE name=$idxname}] - puts -nonewline "$idxname: $i of $max rows ($pct%)\r" - flush stdout - if {$bTrace} { - set sql {SELECT errmsg, current_key AS key, - CASE WHEN rowid=1 THEN scanner_sql END AS traceOut - FROM incremental_index_check($idxname) - WHERE after_key=$key - LIMIT $batchsize} - } else { - set sql {SELECT errmsg, current_key AS key, NULL AS traceOut - FROM incremental_index_check($idxname) - WHERE after_key=$key - LIMIT $batchsize} - } - while 
{$more} { - set more 0 - db eval $sql { - set more 1 - if {$errmsg!=""} { - incr nerr - puts "$idxname: key($key): $errmsg" - } elseif {$traceOut!=""} { - puts "$idxname: $traceOut" - } - incr i - - } - set x [format {%.1f} [expr {($i*100.0)/$max}]] - if {$x!=$pct} { - puts -nonewline "$idxname: $i of $max rows ($pct%)\r" - flush stdout - set pct $x - } - } - puts "$idxname: $nerr errors out of $i entries" -} - -# Print a usage message on standard error, then quit. -# -proc usage {} { - set argv0 [file rootname [file tail [info nameofexecutable]]] - puts stderr "Usage: $argv0 OPTIONS database-filename" - puts stderr { -Do sanity checking on a live SQLite3 database file specified by the -"database-filename" argument. - -Options: - - --batchsize N Number of rows to check per transaction - - --freelist Perform a freelist check - - --index NAME Run a check of the index NAME - - --summary Print summary information about the database - - --table NAME Run a check of all indexes for table NAME - - --tclsh Run the built-in TCL interpreter (for debugging) - - --trace (Debugging only:) Output trace information on the scan - - --version Show the version number of SQLite -} - exit 1 -} - -set file_to_analyze {} -append argv {} -set bFreelistCheck 0 -set bSummary 0 -set zIndex {} -set zTable {} -set batchsize 1000 -set bAll 1 -set bTrace 0 -set argc [llength $argv] -for {set i 0} {$i<$argc} {incr i} { - set arg [lindex $argv $i] - if {[regexp {^-+tclsh$} $arg]} { - tclsh - exit 0 - } - if {[regexp {^-+version$} $arg]} { - sqlite3 mem :memory: - puts [mem one {SELECT sqlite_version()||' '||sqlite_source_id()}] - mem close - exit 0 - } - if {[regexp {^-+freelist$} $arg]} { - set bFreelistCheck 1 - set bAll 0 - continue - } - if {[regexp {^-+summary$} $arg]} { - set bSummary 1 - set bAll 0 - continue - } - if {[regexp {^-+trace$} $arg]} { - set bTrace 1 - continue - } - if {[regexp {^-+batchsize$} $arg]} { - incr i - if {$i>=$argc} { - puts stderr "missing argument on $arg" - exit 
1 - } - set batchsize [lindex $argv $i] - continue - } - if {[regexp {^-+index$} $arg]} { - incr i - if {$i>=$argc} { - puts stderr "missing argument on $arg" - exit 1 - } - set zIndex [lindex $argv $i] - set bAll 0 - continue - } - if {[regexp {^-+table$} $arg]} { - incr i - if {$i>=$argc} { - puts stderr "missing argument on $arg" - exit 1 - } - set zTable [lindex $argv $i] - set bAll 0 - continue - } - if {[regexp {^-} $arg]} { - puts stderr "Unknown option: $arg" - usage - } - if {$file_to_analyze!=""} { - usage - } else { - set file_to_analyze $arg - } -} -if {$file_to_analyze==""} usage - -# If a TCL script is specified on the command-line, then run that -# script. -# -if {[file extension $file_to_analyze]==".tcl"} { - source $file_to_analyze - exit 0 -} - -set root_filename $file_to_analyze -regexp {^file:(//)?([^?]*)} $file_to_analyze all x1 root_filename -if {![file exists $root_filename]} { - puts stderr "No such file: $root_filename" - exit 1 -} -if {![file readable $root_filename]} { - puts stderr "File is not readable: $root_filename" - exit 1 -} - -if {[catch {sqlite3 db $file_to_analyze} res]} { - puts stderr "Cannot open datababase $root_filename: $res" - exit 1 -} - -if {$bFreelistCheck || $bAll} { - puts -nonewline "freelist-check: " - flush stdout - db eval BEGIN - puts [db one {SELECT checkfreelist('main')}] - db eval END -} -if {$bSummary} { - set scale 0 - set pgsz [db one {PRAGMA page_size}] - db eval {SELECT nPage*$pgsz AS sz, name, tbl_name - FROM sqlite_btreeinfo - WHERE type='index' - ORDER BY 1 DESC, name} { - if {$scale==0} { - if {$sz>10000000} { - set scale 1000000.0 - set unit MB - } else { - set scale 1000.0 - set unit KB - } - } - puts [format {%7.1f %s index %s of table %s} \ - [expr {$sz/$scale}] $unit $name $tbl_name] - } -} -if {$zIndex!=""} { - check_index $zIndex $batchsize $bTrace -} -if {$zTable!=""} { - foreach idx [db eval {SELECT name FROM sqlite_master - WHERE type='index' AND rootpage>0 - AND tbl_name=$zTable}] { - 
check_index $idx $batchsize $bTrace - } -} -if {$bAll} { - set allidx [db eval {SELECT name FROM sqlite_btreeinfo('main') - WHERE type='index' AND rootpage>0 - ORDER BY nEntry}] - foreach idx $allidx { - check_index $idx $batchsize $bTrace - } -} diff --git a/ext/repair/test/README.md b/ext/repair/test/README.md deleted file mode 100644 index 8cc954adf5..0000000000 --- a/ext/repair/test/README.md +++ /dev/null @@ -1,13 +0,0 @@ -To run these tests, first build sqlite3_checker: - - -> make sqlite3_checker - - -Then run the "test.tcl" script using: - - -> ./sqlite3_checker --test $path/test.tcl - - -Optionally add the full pathnames of individual *.test modules diff --git a/ext/repair/test/checkfreelist01.test b/ext/repair/test/checkfreelist01.test deleted file mode 100644 index 7e2dd51c37..0000000000 --- a/ext/repair/test/checkfreelist01.test +++ /dev/null @@ -1,92 +0,0 @@ -# 2017-10-11 - -set testprefix checkfreelist - -do_execsql_test 1.0 { - PRAGMA page_size=1024; - CREATE TABLE t1(a, b); -} - -do_execsql_test 1.2 { SELECT checkfreelist('main') } {ok} -do_execsql_test 1.3 { - WITH s(i) AS ( - SELECT 1 UNION ALL SELECT i+1 FROM s WHERE i<10000 - ) - INSERT INTO t1 SELECT randomblob(400), randomblob(400) FROM s; - DELETE FROM t1 WHERE rowid%3; - PRAGMA freelist_count; -} {6726} - -do_execsql_test 1.4 { SELECT checkfreelist('main') } {ok} -do_execsql_test 1.5 { - WITH freelist_trunk(i, d, n) AS ( - SELECT 1, NULL, sqlite_readint32(data, 32) FROM sqlite_dbpage WHERE pgno=1 - UNION ALL - SELECT n, data, sqlite_readint32(data) - FROM freelist_trunk, sqlite_dbpage WHERE pgno=n - ) - SELECT i FROM freelist_trunk WHERE i!=1; -} { - 10009 9715 9343 8969 8595 8222 7847 7474 7102 6727 6354 5982 5608 5234 - 4860 4487 4112 3740 3367 2992 2619 2247 1872 1499 1125 752 377 5 -} - -do_execsql_test 1.6 { SELECT checkfreelist('main') } {ok} - -proc set_int {blob idx newval} { - binary scan $blob I* ints - lset ints $idx $newval - binary format I* $ints -} -db func set_int set_int - 
-proc get_int {blob idx} { - binary scan $blob I* ints - lindex $ints $idx -} -db func get_int get_int - -do_execsql_test 1.7 { - BEGIN; - UPDATE sqlite_dbpage - SET data = set_int(data, 1, get_int(data, 1)-1) - WHERE pgno=4860; - SELECT checkfreelist('main'); - ROLLBACK; -} {{free-list count mismatch: actual=6725 header=6726}} - -do_execsql_test 1.8 { - BEGIN; - UPDATE sqlite_dbpage - SET data = set_int(data, 5, (SELECT * FROM pragma_page_count)+1) - WHERE pgno=4860; - SELECT checkfreelist('main'); - ROLLBACK; -} {{leaf page 10092 is out of range (child 3 of trunk page 4860)}} - -do_execsql_test 1.9 { - BEGIN; - UPDATE sqlite_dbpage - SET data = set_int(data, 5, 0) - WHERE pgno=4860; - SELECT checkfreelist('main'); - ROLLBACK; -} {{leaf page 0 is out of range (child 3 of trunk page 4860)}} - -do_execsql_test 1.10 { - BEGIN; - UPDATE sqlite_dbpage - SET data = set_int(data, get_int(data, 1)+1, 0) - WHERE pgno=5; - SELECT checkfreelist('main'); - ROLLBACK; -} {{leaf page 0 is out of range (child 247 of trunk page 5)}} - -do_execsql_test 1.11 { - BEGIN; - UPDATE sqlite_dbpage - SET data = set_int(data, 1, 249) - WHERE pgno=5; - SELECT checkfreelist('main'); - ROLLBACK; -} {{leaf count out of range (249) on trunk page 5}} diff --git a/ext/repair/test/checkindex01.test b/ext/repair/test/checkindex01.test deleted file mode 100644 index 97973aee71..0000000000 --- a/ext/repair/test/checkindex01.test +++ /dev/null @@ -1,349 +0,0 @@ -# 2017-10-11 -# -set testprefix checkindex - -do_execsql_test 1.0 { - CREATE TABLE t1(a, b); - CREATE INDEX i1 ON t1(a); - INSERT INTO t1 VALUES('one', 2); - INSERT INTO t1 VALUES('two', 4); - INSERT INTO t1 VALUES('three', 6); - INSERT INTO t1 VALUES('four', 8); - INSERT INTO t1 VALUES('five', 10); - - CREATE INDEX i2 ON t1(a DESC); -} {} - -proc incr_index_check {idx nStep} { - set Q { - SELECT errmsg, current_key FROM incremental_index_check($idx, $after) - LIMIT $nStep - } - - set res [list] - while {1} { - unset -nocomplain current_key - 
set res1 [db eval $Q] - if {[llength $res1]==0} break - set res [concat $res $res1] - set after [lindex $res end] - } - - return $res -} - -proc do_index_check_test {tn idx res} { - uplevel [list do_execsql_test $tn.1 " - SELECT errmsg, current_key FROM incremental_index_check('$idx'); - " $res] - - uplevel [list do_test $tn.2 "incr_index_check $idx 1" [list {*}$res]] - uplevel [list do_test $tn.3 "incr_index_check $idx 2" [list {*}$res]] - uplevel [list do_test $tn.4 "incr_index_check $idx 5" [list {*}$res]] -} - - -do_execsql_test 1.2.1 { - SELECT rowid, errmsg IS NULL, current_key FROM incremental_index_check('i1'); -} { - 1 1 'five',5 - 2 1 'four',4 - 3 1 'one',1 - 4 1 'three',3 - 5 1 'two',2 -} -do_execsql_test 1.2.2 { - SELECT errmsg IS NULL, current_key, index_name, after_key, scanner_sql - FROM incremental_index_check('i1') LIMIT 1; -} { - 1 - 'five',5 - i1 - {} - {SELECT (SELECT a IS i.i0 FROM 't1' AS t WHERE "rowid" COLLATE BINARY IS i.i1), quote(i0)||','||quote(i1) FROM (SELECT (a) AS i0, ("rowid" COLLATE BINARY) AS i1 FROM 't1' INDEXED BY 'i1' ORDER BY 1,2) AS i} -} - -do_index_check_test 1.3 i1 { - {} 'five',5 - {} 'four',4 - {} 'one',1 - {} 'three',3 - {} 'two',2 -} - -do_index_check_test 1.4 i2 { - {} 'two',2 - {} 'three',3 - {} 'one',1 - {} 'four',4 - {} 'five',5 -} - -do_test 1.5 { - set tblroot [db one { SELECT rootpage FROM sqlite_master WHERE name='t1' }] - sqlite3_imposter db main $tblroot {CREATE TABLE xt1(a,b)} - db eval { - UPDATE xt1 SET a='six' WHERE rowid=3; - DELETE FROM xt1 WHERE rowid = 5; - } - sqlite3_imposter db main -} {} - -do_index_check_test 1.6 i1 { - {row missing} 'five',5 - {} 'four',4 - {} 'one',1 - {row data mismatch} 'three',3 - {} 'two',2 -} - -do_index_check_test 1.7 i2 { - {} 'two',2 - {row data mismatch} 'three',3 - {} 'one',1 - {} 'four',4 - {row missing} 'five',5 -} - -#-------------------------------------------------------------------------- -do_execsql_test 2.0 { - - CREATE TABLE t2(a INTEGER PRIMARY KEY, b, c, 
d); - - INSERT INTO t2 VALUES(1, NULL, 1, 1); - INSERT INTO t2 VALUES(2, 1, NULL, 1); - INSERT INTO t2 VALUES(3, 1, 1, NULL); - - INSERT INTO t2 VALUES(4, 2, 2, 1); - INSERT INTO t2 VALUES(5, 2, 2, 2); - INSERT INTO t2 VALUES(6, 2, 2, 3); - - INSERT INTO t2 VALUES(7, 2, 2, 1); - INSERT INTO t2 VALUES(8, 2, 2, 2); - INSERT INTO t2 VALUES(9, 2, 2, 3); - - CREATE INDEX i3 ON t2(b, c, d); - CREATE INDEX i4 ON t2(b DESC, c DESC, d DESC); - CREATE INDEX i5 ON t2(d, c DESC, b); -} {} - -do_index_check_test 2.1 i3 { - {} NULL,1,1,1 - {} 1,NULL,1,2 - {} 1,1,NULL,3 - {} 2,2,1,4 - {} 2,2,1,7 - {} 2,2,2,5 - {} 2,2,2,8 - {} 2,2,3,6 - {} 2,2,3,9 -} - -do_index_check_test 2.2 i4 { - {} 2,2,3,6 - {} 2,2,3,9 - {} 2,2,2,5 - {} 2,2,2,8 - {} 2,2,1,4 - {} 2,2,1,7 - {} 1,1,NULL,3 - {} 1,NULL,1,2 - {} NULL,1,1,1 -} - -do_index_check_test 2.3 i5 { - {} NULL,1,1,3 - {} 1,2,2,4 - {} 1,2,2,7 - {} 1,1,NULL,1 - {} 1,NULL,1,2 - {} 2,2,2,5 - {} 2,2,2,8 - {} 3,2,2,6 - {} 3,2,2,9 -} - -#-------------------------------------------------------------------------- -do_execsql_test 3.0 { - - CREATE TABLE t3(w, x, y, z PRIMARY KEY) WITHOUT ROWID; - CREATE INDEX t3wxy ON t3(w, x, y); - CREATE INDEX t3wxy2 ON t3(w DESC, x DESC, y DESC); - - INSERT INTO t3 VALUES(NULL, NULL, NULL, 1); - INSERT INTO t3 VALUES(NULL, NULL, NULL, 2); - INSERT INTO t3 VALUES(NULL, NULL, NULL, 3); - - INSERT INTO t3 VALUES('a', NULL, NULL, 4); - INSERT INTO t3 VALUES('a', NULL, NULL, 5); - INSERT INTO t3 VALUES('a', NULL, NULL, 6); - - INSERT INTO t3 VALUES('a', 'b', NULL, 7); - INSERT INTO t3 VALUES('a', 'b', NULL, 8); - INSERT INTO t3 VALUES('a', 'b', NULL, 9); - -} {} - -do_index_check_test 3.1 t3wxy { - {} NULL,NULL,NULL,1 {} NULL,NULL,NULL,2 {} NULL,NULL,NULL,3 - {} 'a',NULL,NULL,4 {} 'a',NULL,NULL,5 {} 'a',NULL,NULL,6 - {} 'a','b',NULL,7 {} 'a','b',NULL,8 {} 'a','b',NULL,9 -} -do_index_check_test 3.2 t3wxy2 { - {} 'a','b',NULL,7 {} 'a','b',NULL,8 {} 'a','b',NULL,9 - {} 'a',NULL,NULL,4 {} 'a',NULL,NULL,5 {} 'a',NULL,NULL,6 
- {} NULL,NULL,NULL,1 {} NULL,NULL,NULL,2 {} NULL,NULL,NULL,3 -} - -#-------------------------------------------------------------------------- -# Test with an index that uses non-default collation sequences. -# -do_execsql_test 4.0 { - CREATE TABLE t4(a INTEGER PRIMARY KEY, c1 TEXT, c2 TEXT); - INSERT INTO t4 VALUES(1, 'aaa', 'bbb'); - INSERT INTO t4 VALUES(2, 'AAA', 'CCC'); - INSERT INTO t4 VALUES(3, 'aab', 'ddd'); - INSERT INTO t4 VALUES(4, 'AAB', 'EEE'); - - CREATE INDEX t4cc ON t4(c1 COLLATE nocase, c2 COLLATE nocase); -} - -do_index_check_test 4.1 t4cc { - {} 'aaa','bbb',1 - {} 'AAA','CCC',2 - {} 'aab','ddd',3 - {} 'AAB','EEE',4 -} - -do_test 4.2 { - set tblroot [db one { SELECT rootpage FROM sqlite_master WHERE name='t4' }] - sqlite3_imposter db main $tblroot \ - {CREATE TABLE xt4(a INTEGER PRIMARY KEY, c1 TEXT, c2 TEXT)} - - db eval { - UPDATE xt4 SET c1='hello' WHERE rowid=2; - DELETE FROM xt4 WHERE rowid = 3; - } - sqlite3_imposter db main -} {} - -do_index_check_test 4.3 t4cc { - {} 'aaa','bbb',1 - {row data mismatch} 'AAA','CCC',2 - {row missing} 'aab','ddd',3 - {} 'AAB','EEE',4 -} - -#-------------------------------------------------------------------------- -# Test an index on an expression. 
-# -do_execsql_test 5.0 { - CREATE TABLE t5(x INTEGER PRIMARY KEY, y TEXT, UNIQUE(y)); - INSERT INTO t5 VALUES(1, '{"x":1, "y":1}'); - INSERT INTO t5 VALUES(2, '{"x":2, "y":2}'); - INSERT INTO t5 VALUES(3, '{"x":3, "y":3}'); - INSERT INTO t5 VALUES(4, '{"w":4, "z":4}'); - INSERT INTO t5 VALUES(5, '{"x":5, "y":5}'); - - CREATE INDEX t5x ON t5( json_extract(y, '$.x') ); - CREATE INDEX t5y ON t5( json_extract(y, '$.y') DESC ); -} - -do_index_check_test 5.1.1 t5x { - {} NULL,4 {} 1,1 {} 2,2 {} 3,3 {} 5,5 -} - -do_index_check_test 5.1.2 t5y { - {} 5,5 {} 3,3 {} 2,2 {} 1,1 {} NULL,4 -} - -do_index_check_test 5.1.3 sqlite_autoindex_t5_1 { - {} {'{"w":4, "z":4}',4} - {} {'{"x":1, "y":1}',1} - {} {'{"x":2, "y":2}',2} - {} {'{"x":3, "y":3}',3} - {} {'{"x":5, "y":5}',5} -} - -do_test 5.2 { - set tblroot [db one { SELECT rootpage FROM sqlite_master WHERE name='t5' }] - sqlite3_imposter db main $tblroot \ - {CREATE TABLE xt5(a INTEGER PRIMARY KEY, c1 TEXT);} - db eval { - UPDATE xt5 SET c1='{"x":22, "y":11}' WHERE rowid=1; - DELETE FROM xt5 WHERE rowid = 4; - } - sqlite3_imposter db main -} {} - -do_index_check_test 5.3.1 t5x { - {row missing} NULL,4 - {row data mismatch} 1,1 - {} 2,2 - {} 3,3 - {} 5,5 -} - -do_index_check_test 5.3.2 sqlite_autoindex_t5_1 { - {row missing} {'{"w":4, "z":4}',4} - {row data mismatch} {'{"x":1, "y":1}',1} - {} {'{"x":2, "y":2}',2} - {} {'{"x":3, "y":3}',3} - {} {'{"x":5, "y":5}',5} -} - -#------------------------------------------------------------------------- -# -do_execsql_test 6.0 { - CREATE TABLE t6(x INTEGER PRIMARY KEY, y, z); - CREATE INDEX t6x1 ON t6(y, /* one,two,three */ z); - CREATE INDEX t6x2 ON t6(z, -- hello,world, - y); - - CREATE INDEX t6x3 ON t6(z -- hello,world - , y); - - INSERT INTO t6 VALUES(1, 2, 3); - INSERT INTO t6 VALUES(4, 5, 6); -} - -do_index_check_test 6.1 t6x1 { - {} 2,3,1 - {} 5,6,4 -} -do_index_check_test 6.2 t6x2 { - {} 3,2,1 - {} 6,5,4 -} -do_index_check_test 6.2 t6x3 { - {} 3,2,1 - {} 6,5,4 -} - 
-#------------------------------------------------------------------------- -# -do_execsql_test 7.0 { - CREATE TABLE t7(x INTEGER PRIMARY KEY, y, z); - INSERT INTO t7 VALUES(1, 1, 1); - INSERT INTO t7 VALUES(2, 2, 0); - INSERT INTO t7 VALUES(3, 3, 1); - INSERT INTO t7 VALUES(4, 4, 0); - - CREATE INDEX t7i1 ON t7(y) WHERE z=1; - CREATE INDEX t7i2 ON t7(y) /* hello,world */ WHERE z=1; - CREATE INDEX t7i3 ON t7(y) WHERE -- yep - z=1; - CREATE INDEX t7i4 ON t7(y) WHERE z=1 -- yep; -} -do_index_check_test 7.1 t7i1 { - {} 1,1 {} 3,3 -} -do_index_check_test 7.2 t7i2 { - {} 1,1 {} 3,3 -} -do_index_check_test 7.3 t7i3 { - {} 1,1 {} 3,3 -} -do_index_check_test 7.4 t7i4 { - {} 1,1 {} 3,3 -} diff --git a/ext/repair/test/test.tcl b/ext/repair/test/test.tcl deleted file mode 100644 index c073bb73c5..0000000000 --- a/ext/repair/test/test.tcl +++ /dev/null @@ -1,67 +0,0 @@ -# Run this script using -# -# sqlite3_checker --test $thisscript $testscripts -# -# The $testscripts argument is optional. If omitted, all *.test files -# in the same directory as $thisscript are run. -# -set NTEST 0 -set NERR 0 - - -# Invoke the do_test procedure to run a single test -# -# The $expected parameter is the expected result. The result is the return -# value from the last TCL command in $cmd. -# -# Normally, $expected must match exactly. But if $expected is of the form -# "/regexp/" then regular expression matching is used. If $expected is -# "~/regexp/" then the regular expression must NOT match. If $expected is -# of the form "#/value-list/" then each term in value-list must be numeric -# and must approximately match the corresponding numeric term in $result. -# Values must match within 10%. Or if the $expected term is A..B then the -# $result term must be in between A and B. -# -proc do_test {name cmd expected} { - if {[info exists ::testprefix]} { - set name "$::testprefix$name" - } - - incr ::NTEST - puts -nonewline $name... 
- flush stdout - - if {[catch {uplevel #0 "$cmd;\n"} result]} { - puts -nonewline $name... - puts "\nError: $result" - incr ::NERR - } else { - set ok [expr {[string compare $result $expected]==0}] - if {!$ok} { - puts "\n! $name expected: \[$expected\]\n! $name got: \[$result\]" - incr ::NERR - } else { - puts " Ok" - } - } - flush stdout -} - -# -# do_execsql_test TESTNAME SQL RES -# -proc do_execsql_test {testname sql {result {}}} { - uplevel [list do_test $testname [list db eval $sql] [list {*}$result]] -} - -if {[llength $argv]==0} { - set dir [file dirname $argv0] - set argv [glob -nocomplain $dir/*.test] -} -foreach testfile $argv { - file delete -force test.db - sqlite3 db test.db - source $testfile - catch {db close} -} -puts "$NERR errors out of $NTEST tests" diff --git a/ext/rtree/geopoly.c b/ext/rtree/geopoly.c index 0ae42e7b72..22166a6f9e 100644 --- a/ext/rtree/geopoly.c +++ b/ext/rtree/geopoly.c @@ -200,7 +200,7 @@ static int geopolyParseNumber(GeoParse *p, GeoCoord *pVal){ /* The sqlite3AtoF() routine is much much faster than atof(), if it ** is available */ double r; - (void)sqlite3AtoF((const char*)p->z, &r, j, SQLITE_UTF8); + (void)sqlite3AtoF((const char*)p->z, &r); *pVal = r; #else *pVal = (GeoCoord)atof((const char*)p->z); diff --git a/ext/rtree/rtree.c b/ext/rtree/rtree.c index 8b913ef2df..faebdce78d 100644 --- a/ext/rtree/rtree.c +++ b/ext/rtree/rtree.c @@ -1037,7 +1037,17 @@ static void rtreeRelease(Rtree *pRtree){ pRtree->inWrTrans = 0; assert( pRtree->nCursor==0 ); nodeBlobReset(pRtree); - assert( pRtree->nNodeRef==0 || pRtree->bCorrupt ); + if( pRtree->nNodeRef ){ + int i; + assert( pRtree->bCorrupt ); + for(i=0; iaHash[i] ){ + RtreeNode *pNext = pRtree->aHash[i]->pNext; + sqlite3_free(pRtree->aHash[i]); + pRtree->aHash[i] = pNext; + } + } + } sqlite3_finalize(pRtree->pWriteNode); sqlite3_finalize(pRtree->pDeleteNode); sqlite3_finalize(pRtree->pReadRowid); @@ -2329,7 +2339,7 @@ static int AdjustTree( int iCell; cnt++; - if( NEVER(cnt>100) 
){ + if( cnt>100 ){ RTREE_IS_CORRUPT(pRtree); return SQLITE_CORRUPT_VTAB; } @@ -2687,15 +2697,6 @@ static int SplitNode( rc = updateMapping(pRtree, pCell->iRowid, pLeft, iHeight); } - if( rc==SQLITE_OK ){ - rc = nodeRelease(pRtree, pRight); - pRight = 0; - } - if( rc==SQLITE_OK ){ - rc = nodeRelease(pRtree, pLeft); - pLeft = 0; - } - splitnode_out: nodeRelease(pRtree, pRight); nodeRelease(pRtree, pLeft); @@ -2880,7 +2881,7 @@ static int rtreeInsertCell( rc = SplitNode(pRtree, pNode, pCell, iHeight); }else{ rc = AdjustTree(pRtree, pNode, pCell); - if( ALWAYS(rc==SQLITE_OK) ){ + if( rc==SQLITE_OK ){ if( iHeight==0 ){ rc = rowidWrite(pRtree, pCell->iRowid, pNode->iNode); }else{ @@ -3775,7 +3776,7 @@ static void rtreenode(sqlite3_context *ctx, int nArg, sqlite3_value **apArg){ if( node.zData==0 ) return; nData = sqlite3_value_bytes(apArg[1]); if( nData<4 ) return; - if( nData> 0) & 0xFF; } +/* +** Write a double value to the buffer aBuf[]. +*/ +static void sessionPutDouble(u8 *aBuf, double r){ + /* TODO: SQLite does something special to deal with mixed-endian + ** floating point values (e.g. ARM7). This code probably should + ** too. */ + u64 i; + assert( sizeof(double)==8 && sizeof(u64)==8 ); + memcpy(&i, &r, 8); + sessionPutI64(aBuf, i); +} + /* ** This function is used to serialize the contents of value pValue (see ** comment titled "RECORD FORMAT" above). @@ -414,16 +442,13 @@ static int sessionSerializeValue( /* TODO: SQLite does something special to deal with mixed-endian ** floating point values (e.g. ARM7). This code probably should ** too. 
*/ - u64 i; if( eType==SQLITE_INTEGER ){ - i = (u64)sqlite3_value_int64(pValue); + u64 i = (u64)sqlite3_value_int64(pValue); + sessionPutI64(&aBuf[1], i); }else{ - double r; - assert( sizeof(double)==8 && sizeof(u64)==8 ); - r = sqlite3_value_double(pValue); - memcpy(&i, &r, 8); + double r = sqlite3_value_double(pValue); + sessionPutDouble(&aBuf[1], r); } - sessionPutI64(&aBuf[1], i); } nByte = 9; break; @@ -643,14 +668,10 @@ static unsigned int sessionChangeHash( int isPK = pTab->abPK[i]; if( bPkOnly && isPK==0 ) continue; - /* It is not possible for eType to be SQLITE_NULL here. The session - ** module does not record changes for rows with NULL values stored in - ** primary key columns. */ assert( eType==SQLITE_INTEGER || eType==SQLITE_FLOAT || eType==SQLITE_TEXT || eType==SQLITE_BLOB || eType==SQLITE_NULL || eType==0 ); - assert( !isPK || (eType!=0 && eType!=SQLITE_NULL) ); if( isPK ){ a++; @@ -658,12 +679,16 @@ static unsigned int sessionChangeHash( if( eType==SQLITE_INTEGER || eType==SQLITE_FLOAT ){ h = sessionHashAppendI64(h, sessionGetI64(a)); a += 8; - }else{ + }else if( eType==SQLITE_TEXT || eType==SQLITE_BLOB ){ int n; a += sessionVarintGet(a, &n); h = sessionHashAppendBlob(h, n, a); a += n; } + /* It should not be possible for eType to be SQLITE_NULL or 0x00 here, + ** as the session module does not record changes for rows with NULL + ** values stored in primary key columns. But a corrupt changeset + ** may contain such a value. 
*/ }else{ a += sessionSerialLen(a); } @@ -1355,9 +1380,7 @@ static void sessionUpdateOneChange( case SQLITE_FLOAT: { double rVal = sqlite3_column_double(pDflt, iField); - i64 iVal = 0; - memcpy(&iVal, &rVal, sizeof(rVal)); - sessionPutI64(&pNew->aRecord[pNew->nRecord], iVal); + sessionPutDouble(&pNew->aRecord[pNew->nRecord], rVal); pNew->nRecord += 8; break; } @@ -2614,15 +2637,14 @@ static void sessionAppendCol( int eType = sqlite3_column_type(pStmt, iCol); sessionAppendByte(p, (u8)eType, pRc); if( eType==SQLITE_INTEGER || eType==SQLITE_FLOAT ){ - sqlite3_int64 i; u8 aBuf[8]; if( eType==SQLITE_INTEGER ){ - i = sqlite3_column_int64(pStmt, iCol); + sqlite3_int64 i = sqlite3_column_int64(pStmt, iCol); + sessionPutI64(aBuf, i); }else{ double r = sqlite3_column_double(pStmt, iCol); - memcpy(&i, &r, 8); + sessionPutDouble(aBuf, r); } - sessionPutI64(aBuf, i); sessionAppendBlob(p, aBuf, 8, pRc); } if( eType==SQLITE_BLOB || eType==SQLITE_TEXT ){ @@ -3072,10 +3094,13 @@ static int sessionGenerateChangeset( } if( pSession->rc ) return pSession->rc; - rc = sqlite3_exec(pSession->db, "SAVEPOINT changeset", 0, 0, 0); - if( rc!=SQLITE_OK ) return rc; sqlite3_mutex_enter(sqlite3_db_mutex(db)); + rc = sqlite3_exec(pSession->db, "SAVEPOINT changeset", 0, 0, 0); + if( rc!=SQLITE_OK ){ + sqlite3_mutex_leave(sqlite3_db_mutex(db)); + return rc; + } for(pTab=pSession->pTable; rc==SQLITE_OK && pTab; pTab=pTab->pNext){ if( pTab->nEntry ){ @@ -3558,7 +3583,8 @@ static int sessionReadRecord( u8 *aVal = &pIn->aData[pIn->iNext]; if( eType==SQLITE_TEXT || eType==SQLITE_BLOB ){ int nByte; - pIn->iNext += sessionVarintGet(aVal, &nByte); + int nRem = pIn->nData - pIn->iNext; + pIn->iNext += sessionVarintGetSafe(aVal, nRem, &nByte); rc = sessionInputBuffer(pIn, nByte); if( rc==SQLITE_OK ){ if( nByte<0 || nByte>pIn->nData-pIn->iNext ){ @@ -3611,7 +3637,8 @@ static int sessionChangesetBufferTblhdr(SessionInput *pIn, int *pnByte){ rc = sessionInputBuffer(pIn, 9); if( rc==SQLITE_OK ){ - nRead += 
sessionVarintGet(&pIn->aData[pIn->iNext + nRead], &nCol); + int nBuf = pIn->nData - pIn->iNext; + nRead += sessionVarintGetSafe(&pIn->aData[pIn->iNext], nBuf, &nCol); /* The hard upper limit for the number of columns in an SQLite ** database table is, according to sqliteLimit.h, 32676. So ** consider any table-header that purports to have more than 65536 @@ -3631,8 +3658,15 @@ static int sessionChangesetBufferTblhdr(SessionInput *pIn, int *pnByte){ while( (pIn->iNext + nRead)<pIn->nData && pIn->aData[pIn->iNext + nRead] ){ nRead++; } + + /* Break out of the loop if the nul-terminator byte has been found. + ** Otherwise, read some more input data and keep seeking. If there is + ** no more input data, consider the changeset corrupt. */ if( (pIn->iNext + nRead)<pIn->nData ) break; rc = sessionInputBuffer(pIn, nRead + 100); + if( rc==SQLITE_OK && (pIn->iNext + nRead)>=pIn->nData ){ + rc = SQLITE_CORRUPT_BKPT; + } } *pnByte = nRead+1; return rc; @@ -3653,7 +3687,7 @@ static int sessionChangesetBufferRecord( int *pnByte /* OUT: Size of record in bytes */ ){ int rc = SQLITE_OK; - int nByte = 0; + i64 nByte = 0; int i; for(i=0; rc==SQLITE_OK && iaData[pIn->iNext + nByte++]; if( eType==SQLITE_TEXT || eType==SQLITE_BLOB ){ int n; - nByte += sessionVarintGet(&pIn->aData[pIn->iNext+nByte], &n); + int nRem = pIn->nData - (pIn->iNext + nByte); + nByte += sessionVarintGetSafe(&pIn->aData[pIn->iNext+nByte], nRem, &n); nByte += n; rc = sessionInputBuffer(pIn, nByte); }else if( eType==SQLITE_INTEGER || eType==SQLITE_FLOAT ){ nByte += 8; } } + if( (pIn->iNext+nByte)>pIn->nData ){ + rc = SQLITE_CORRUPT_BKPT; + } } *pnByte = nByte; return rc; @@ -3764,10 +3802,10 @@ static int sessionChangesetNextOne( memset(p->apValue, 0, sizeof(sqlite3_value*)*p->nCol*2); } - /* Make sure the buffer contains at least 10 bytes of input data, or all - ** remaining data if there are less than 10 bytes available. 
This is - ** sufficient either for the 'T' or 'P' byte and the varint that follows - ** it, or for the two single byte values otherwise. */ + /* Make sure the buffer contains at least 2 bytes of input data, or all + ** remaining data if there are less than 2 bytes available. This is + ** sufficient either for the 'T' or 'P' byte that begins a new table, + ** or for the "op" and "bIndirect" single bytes otherwise. */ p->rc = sessionInputBuffer(&p->in, 2); if( p->rc!=SQLITE_OK ) return p->rc; @@ -3797,11 +3835,13 @@ static int sessionChangesetNextOne( return (p->rc = SQLITE_CORRUPT_BKPT); } - p->op = op; - p->bIndirect = p->in.aData[p->in.iNext++]; - if( p->op!=SQLITE_UPDATE && p->op!=SQLITE_DELETE && p->op!=SQLITE_INSERT ){ + if( (op!=SQLITE_UPDATE && op!=SQLITE_DELETE && op!=SQLITE_INSERT) + || (p->in.iNext>=p->in.nData) + ){ return (p->rc = SQLITE_CORRUPT_BKPT); } + p->op = op; + p->bIndirect = p->in.aData[p->in.iNext++]; if( paRec ){ int nVal; /* Number of values to buffer */ @@ -5649,6 +5689,21 @@ int sqlite3changeset_apply_strm( ); } +/* +** The parts of the sqlite3_changegroup structure used by the +** sqlite3changegroup_change_xxx() APIs. +*/ +typedef struct ChangeData ChangeData; +struct ChangeData { + SessionTable *pTab; + int bIndirect; + int eOp; + + int nBufAlloc; + SessionBuffer *aBuf; + SessionBuffer record; +}; + /* ** sqlite3_changegroup handle. */ @@ -5660,12 +5715,17 @@ struct sqlite3_changegroup { sqlite3 *db; /* Configured by changegroup_schema() */ char *zDb; /* Configured by changegroup_schema() */ + ChangeData cd; /* Used by changegroup_change_xxx() APIs. */ }; /* ** This function is called to merge two changes to the same row together as ** part of an sqlite3changeset_concat() operation. A new change object is ** allocated and a pointer to it stored in *ppNew. +** +** Because they have been vetted by sqlite3changegroup_add() or similar, +** both the aRec[] change and the pExist change are safe to use without +** checking for buffer overflows. 
*/ static int sessionChangeMerge( SessionTable *pTab, /* Table structure */ @@ -5806,7 +5866,7 @@ static int sessionChangeMerge( memcpy(aCsr, aRec, nRec); aCsr += nRec; }else{ - if( 0==sessionMergeUpdate(&aCsr, pTab, bPatchset, aExist, 0,aRec,0) ){ + if( 0==sessionMergeUpdate(&aCsr, pTab, bPatchset, aExist,0,aRec,0) ){ sqlite3_free(pNew); pNew = 0; } @@ -5899,15 +5959,14 @@ static int sessionChangesetExtendRecord( switch( eType ){ case SQLITE_FLOAT: case SQLITE_INTEGER: { - i64 iVal; - if( eType==SQLITE_INTEGER ){ - iVal = sqlite3_column_int64(pTab->pDfltStmt, ii); - }else{ - double rVal = sqlite3_column_int64(pTab->pDfltStmt, ii); - memcpy(&iVal, &rVal, sizeof(i64)); - } if( SQLITE_OK==sessionBufferGrow(pOut, 8, &rc) ){ - sessionPutI64(&pOut->aBuf[pOut->nBuf], iVal); + if( eType==SQLITE_INTEGER ){ + sqlite3_int64 iVal = sqlite3_column_int64(pTab->pDfltStmt, ii); + sessionPutI64(&pOut->aBuf[pOut->nBuf], iVal); + }else{ + double rVal = sqlite3_column_double(pTab->pDfltStmt, ii); + sessionPutDouble(&pOut->aBuf[pOut->nBuf], rVal); + } pOut->nBuf += 8; } break; @@ -5978,13 +6037,19 @@ static int sessionChangesetFindTable( int nCol = 0; *ppTab = 0; - sqlite3changeset_pk(pIter, &abPK, &nCol); /* Search the list for an existing table */ for(pTab = pGrp->pList; pTab; pTab=pTab->pNext){ if( 0==sqlite3_strnicmp(pTab->zName, zTab, nTab+1) ) break; } + + if( pIter ){ + sqlite3changeset_pk(pIter, &abPK, &nCol); + }else if( !pTab && !pGrp->db ){ + return SQLITE_OK; + } + /* If one was not found above, create a new table now */ if( !pTab ){ SessionTable **ppNew; @@ -5996,15 +6061,17 @@ static int sessionChangesetFindTable( memset(pTab, 0, sizeof(SessionTable)); pTab->nCol = nCol; pTab->abPK = (u8*)&pTab[1]; - memcpy(pTab->abPK, abPK, nCol); + if( nCol>0 ){ + memcpy(pTab->abPK, abPK, nCol); + } pTab->zName = (char*)&pTab->abPK[nCol]; memcpy(pTab->zName, zTab, nTab+1); if( pGrp->db ){ pTab->nCol = 0; rc = sessionInitTable(0, pTab, pGrp->db, pGrp->zDb); - if( rc ){ - assert( 
pTab->azCol==0 ); + if( rc || pTab->nCol==0 ){ + sqlite3_free(pTab->azCol); sqlite3_free(pTab); return rc; } @@ -6019,7 +6086,7 @@ static int sessionChangesetFindTable( } /* Check that the table is compatible. */ - if( !sessionChangesetCheckCompat(pTab, nCol, abPK) ){ + if( pIter && !sessionChangesetCheckCompat(pTab, nCol, abPK) ){ rc = SQLITE_SCHEMA; } @@ -6028,44 +6095,27 @@ static int sessionChangesetFindTable( } /* -** Add the change currently indicated by iterator pIter to the hash table -** belonging to changegroup pGrp. +** Add a single change to the changegroup pGrp. */ static int sessionOneChangeToHash( - sqlite3_changegroup *pGrp, - sqlite3_changeset_iter *pIter, - int bRebase + sqlite3_changegroup *pGrp, /* Changegroup to update */ + SessionTable *pTab, /* Table change pertains to */ + int op, /* One of SQLITE_INSERT, UPDATE, DELETE */ + int bIndirect, /* True to flag change as "indirect" */ + int nCol, /* Number of columns in record(s) */ + u8 *aRec, /* Serialized change record(s) */ + int nRec, /* Size of aRec[] in bytes */ + int bRebase /* True if this is a rebase blob */ ){ int rc = SQLITE_OK; - int nCol = 0; - int op = 0; int iHash = 0; - int bIndirect = 0; SessionChange *pChange = 0; SessionChange *pExist = 0; SessionChange **pp = 0; - SessionTable *pTab = 0; - u8 *aRec = &pIter->in.aData[pIter->in.iCurrent + 2]; - int nRec = (pIter->in.iNext - pIter->in.iCurrent) - 2; assert( nRec>0 ); - /* Ensure that only changesets, or only patchsets, but not a mixture - ** of both, are being combined. It is an error to try to combine a - ** changeset and a patchset. 
*/ - if( pGrp->pList==0 ){ - pGrp->bPatch = pIter->bPatchset; - }else if( pIter->bPatchset!=pGrp->bPatch ){ - rc = SQLITE_ERROR; - } - - if( rc==SQLITE_OK ){ - const char *zTab = 0; - sqlite3changeset_op(pIter, &zTab, &nCol, &op, &bIndirect); - rc = sessionChangesetFindTable(pGrp, zTab, pIter, &pTab); - } - - if( rc==SQLITE_OK && nColnCol ){ + if( nColnCol ){ SessionBuffer *pBuf = &pGrp->rec; rc = sessionChangesetExtendRecord(pGrp, pTab, nCol, op, aRec, nRec, pBuf); aRec = pBuf->aBuf; @@ -6073,7 +6123,7 @@ static int sessionOneChangeToHash( assert( pGrp->db ); } - if( rc==SQLITE_OK && sessionGrowHash(0, pIter->bPatchset, pTab) ){ + if( rc==SQLITE_OK && sessionGrowHash(0, pGrp->bPatch, pTab) ){ rc = SQLITE_NOMEM; } @@ -6081,12 +6131,12 @@ static int sessionOneChangeToHash( /* Search for existing entry. If found, remove it from the hash table. ** Code below may link it back in. */ iHash = sessionChangeHash( - pTab, (pIter->bPatchset && op==SQLITE_DELETE), aRec, pTab->nChange + pTab, (pGrp->bPatch && op==SQLITE_DELETE), aRec, pTab->nChange ); for(pp=&pTab->apChange[iHash]; *pp; pp=&(*pp)->pNext){ int bPkOnly1 = 0; int bPkOnly2 = 0; - if( pIter->bPatchset ){ + if( pGrp->bPatch ){ bPkOnly1 = (*pp)->op==SQLITE_DELETE; bPkOnly2 = op==SQLITE_DELETE; } @@ -6101,7 +6151,7 @@ static int sessionOneChangeToHash( if( rc==SQLITE_OK ){ rc = sessionChangeMerge(pTab, bRebase, - pIter->bPatchset, pExist, op, bIndirect, aRec, nRec, &pChange + pGrp->bPatch, pExist, op, bIndirect, aRec, nRec, &pChange ); } if( rc==SQLITE_OK && pChange ){ @@ -6110,6 +6160,47 @@ static int sessionOneChangeToHash( pTab->nEntry++; } + return rc; +} + +/* +** Add the change currently indicated by iterator pIter to the hash table +** belonging to changegroup pGrp. 
+*/ +static int sessionOneChangeIterToHash( + sqlite3_changegroup *pGrp, + sqlite3_changeset_iter *pIter, + int bRebase +){ + u8 *aRec = &pIter->in.aData[pIter->in.iCurrent + 2]; + int nRec = (pIter->in.iNext - pIter->in.iCurrent) - 2; + const char *zTab = 0; + int nCol = 0; + int op = 0; + int bIndirect = 0; + int rc = SQLITE_OK; + SessionTable *pTab = 0; + + /* Ensure that only changesets, or only patchsets, but not a mixture + ** of both, are being combined. It is an error to try to combine a + ** changeset and a patchset. */ + if( pGrp->pList==0 ){ + pGrp->bPatch = pIter->bPatchset; + }else if( pIter->bPatchset!=pGrp->bPatch ){ + rc = SQLITE_ERROR; + } + + if( rc==SQLITE_OK ){ + sqlite3changeset_op(pIter, &zTab, &nCol, &op, &bIndirect); + rc = sessionChangesetFindTable(pGrp, zTab, pIter, &pTab); + } + + if( rc==SQLITE_OK ){ + rc = sessionOneChangeToHash( + pGrp, pTab, op, bIndirect, nCol, aRec, nRec, bRebase + ); + } + if( rc==SQLITE_OK ) rc = pIter->rc; return rc; } @@ -6129,7 +6220,7 @@ static int sessionChangesetToHash( pIter->in.bNoDiscard = 1; while( SQLITE_ROW==(sessionChangesetNext(pIter, &aRec, &nRec, 0)) ){ - rc = sessionOneChangeToHash(pGrp, pIter, bRebase); + rc = sessionOneChangeIterToHash(pGrp, pIter, bRebase); if( rc!=SQLITE_OK ) break; } @@ -6219,6 +6310,33 @@ int sqlite3changegroup_new(sqlite3_changegroup **pp){ return rc; } +/* +** Configure a changegroup object. +*/ +int sqlite3changegroup_config( + sqlite3_changegroup *pGrp, + int op, + void *pArg +){ + int rc = SQLITE_OK; + + switch( op ){ + case SQLITE_CHANGEGROUP_CONFIG_PATCHSET: { + int arg = *(int*)pArg; + if( pGrp->pList==0 && arg>=0 ){ + pGrp->bPatch = (arg>0); + } + *(int*)pArg = pGrp->bPatch; + break; + } + default: + rc = SQLITE_MISUSE; + break; + } + + return rc; +} + /* ** Provide a database schema to the changegroup object. 
*/ @@ -6277,7 +6395,7 @@ int sqlite3changegroup_add_change( rc = SQLITE_ERROR; }else{ pIter->in.bNoDiscard = 1; - rc = sessionOneChangeToHash(pGrp, pIter, 0); + rc = sessionOneChangeIterToHash(pGrp, pIter, 0); } return rc; } @@ -6329,6 +6447,12 @@ int sqlite3changegroup_output_strm( */ void sqlite3changegroup_delete(sqlite3_changegroup *pGrp){ if( pGrp ){ + int ii; + for(ii=0; iicd.nBufAlloc; ii++){ + sqlite3_free(pGrp->cd.aBuf[ii].aBuf); + } + sqlite3_free(pGrp->cd.record.aBuf); + sqlite3_free(pGrp->cd.aBuf); sqlite3_free(pGrp->zDb); sessionDeleteTable(0, pGrp->pList); sqlite3_free(pGrp->rec.aBuf); @@ -6759,4 +6883,326 @@ int sqlite3session_config(int op, void *pArg){ return rc; } +/* +** Begin adding a change to a changegroup object. +*/ +int sqlite3changegroup_change_begin( + sqlite3_changegroup *pGrp, + int eOp, + const char *zTab, + int bIndirect, + char **pzErr +){ + SessionTable *pTab = 0; + int rc = SQLITE_OK; + + if( pGrp->cd.pTab ){ + rc = SQLITE_MISUSE; + }else if( eOp!=SQLITE_INSERT && eOp!=SQLITE_UPDATE && eOp!=SQLITE_DELETE ){ + rc = SQLITE_ERROR; + }else{ + rc = sessionChangesetFindTable(pGrp, zTab, 0, &pTab); + } + if( rc==SQLITE_OK ){ + if( pTab==0 ){ + if( pzErr ){ + *pzErr = sqlite3_mprintf("no such table: %s", zTab); + } + rc = SQLITE_ERROR; + }else{ + int nReq = pTab->nCol * (eOp==SQLITE_UPDATE ? 
2 : 1); + pGrp->cd.pTab = pTab; + pGrp->cd.eOp = eOp; + pGrp->cd.bIndirect = bIndirect; + + if( pGrp->cd.nBufAlloccd.aBuf, nReq * sizeof(SessionBuffer) + ); + if( aBuf==0 ){ + rc = SQLITE_NOMEM; + }else{ + memset(&aBuf[pGrp->cd.nBufAlloc], 0, + sizeof(SessionBuffer) * (nReq - pGrp->cd.nBufAlloc) + ); + pGrp->cd.aBuf = aBuf; + pGrp->cd.nBufAlloc = nReq; + } + } + +#ifdef SQLITE_DEBUG + { + /* Assert that all column values are currently undefined */ + int ii; + for(ii=0; iicd.nBufAlloc; ii++){ + assert( pGrp->cd.aBuf[ii].nBuf==0 ); + } + } +#endif + } + } + + return rc; +} + +/* +** This function does processing common to the _change_int64(), _change_text() +** and other similar APIs. +*/ +static int checkChangeParams( + sqlite3_changegroup *pGrp, + int bNew, + int iCol, + sqlite3_int64 nReq, + SessionBuffer **ppBuf +){ + int rc = SQLITE_OK; + if( pGrp->cd.pTab==0 ){ + rc = SQLITE_MISUSE; + }else if( iCol<0 || iCol>=pGrp->cd.pTab->nCol ){ + rc = SQLITE_RANGE; + }else if( + (bNew && pGrp->cd.eOp==SQLITE_DELETE) + || (!bNew && pGrp->cd.eOp==SQLITE_INSERT) + ){ + rc = SQLITE_ERROR; + }else{ + SessionBuffer *pBuf = &pGrp->cd.aBuf[iCol]; + if( pGrp->cd.eOp==SQLITE_UPDATE && bNew ){ + pBuf += pGrp->cd.pTab->nCol; + } + pBuf->nBuf = 0; + sessionBufferGrow(pBuf, nReq, &rc); + pBuf->nBuf = nReq; + *ppBuf = pBuf; + } + return rc; +} + +/* +** Configure the change currently under construction with an integer value. +*/ +int sqlite3changegroup_change_int64( + sqlite3_changegroup *pGrp, + int bNew, + int iCol, + sqlite3_int64 iVal +){ + int rc = SQLITE_OK; + SessionBuffer *pBuf = 0; + + if( SQLITE_OK!=(rc = checkChangeParams(pGrp, bNew, iCol, 9, &pBuf)) ){ + return rc; + } + + pBuf->aBuf[0] = SQLITE_INTEGER; + sessionPutI64(&pBuf->aBuf[1], iVal); + return SQLITE_OK; +} + +/* +** Configure the change currently under construction with a null value. 
+*/ +int sqlite3changegroup_change_null( + sqlite3_changegroup *pGrp, + int bNew, + int iCol +){ + int rc = SQLITE_OK; + SessionBuffer *pBuf = 0; + + if( SQLITE_OK!=(rc = checkChangeParams(pGrp, bNew, iCol, 1, &pBuf)) ){ + return rc; + } + + pBuf->aBuf[0] = SQLITE_NULL; + return SQLITE_OK; +} + +/* +** Configure the change currently under construction with a real value. +*/ +int sqlite3changegroup_change_double( + sqlite3_changegroup *pGrp, + int bNew, + int iCol, + double fVal +){ + int rc = SQLITE_OK; + SessionBuffer *pBuf = 0; + + if( SQLITE_OK!=(rc = checkChangeParams(pGrp, bNew, iCol, 9, &pBuf)) ){ + return rc; + } + + pBuf->aBuf[0] = SQLITE_FLOAT; + sessionPutDouble(&pBuf->aBuf[1], fVal); + return SQLITE_OK; +} + +/* +** Configure the change currently under construction with a text value. +*/ +int sqlite3changegroup_change_text( + sqlite3_changegroup *pGrp, + int bNew, + int iCol, + const char *pVal, + int nVal +){ + int nText = nVal>=0 ? nVal : strlen(pVal); + sqlite3_int64 nByte = 1 + sessionVarintLen(nText) + nText; + int rc = SQLITE_OK; + SessionBuffer *pBuf = 0; + + if( SQLITE_OK!=(rc = checkChangeParams(pGrp, bNew, iCol, nByte, &pBuf)) ){ + return rc; + } + + pBuf->aBuf[0] = SQLITE_TEXT; + pBuf->nBuf = (1 + sessionVarintPut(&pBuf->aBuf[1], nText)); + memcpy(&pBuf->aBuf[pBuf->nBuf], pVal, nText); + pBuf->nBuf += nText; + + return SQLITE_OK; +} + +/* +** Configure the change currently under construction with a blob value. 
+*/ +int sqlite3changegroup_change_blob( + sqlite3_changegroup *pGrp, + int bNew, + int iCol, + const void *pVal, + int nVal +){ + sqlite3_int64 nByte = 1 + sessionVarintLen(nVal) + nVal; + int rc = SQLITE_OK; + SessionBuffer *pBuf = 0; + + if( SQLITE_OK!=(rc = checkChangeParams(pGrp, bNew, iCol, nByte, &pBuf)) ){ + return rc; + } + + pBuf->aBuf[0] = SQLITE_BLOB; + pBuf->nBuf = (1 + sessionVarintPut(&pBuf->aBuf[1], nVal)); + memcpy(&pBuf->aBuf[pBuf->nBuf], pVal, nVal); + pBuf->nBuf += nVal; + + return SQLITE_OK; +} + +/* +** Finish any change currently being constructed by the changegroup object. +*/ +int sqlite3changegroup_change_finish( + sqlite3_changegroup *pGrp, + int bDiscard, + char **pzErr +){ + int rc = SQLITE_OK; + if( pGrp->cd.pTab ){ + SessionBuffer *aBuf = pGrp->cd.aBuf; + int ii; + + if( bDiscard==0 ){ + int nBuf = pGrp->cd.pTab->nCol; + u8 eUndef = SQLITE_NULL; + if( pGrp->cd.eOp==SQLITE_UPDATE ){ + for(ii=0; iicd.pTab->abPK[ii] ){ + if( aBuf[ii].nBuf<=1 ){ + *pzErr = sqlite3_mprintf( + "invalid change: %s value in PK of old.* record", + aBuf[ii].nBuf==1 ? "null" : "undefined" + ); + rc = SQLITE_ERROR; + break; + }else if( aBuf[ii + nBuf].nBuf>0 ){ + *pzErr = sqlite3_mprintf( + "invalid change: defined value in PK of new.* record" + ); + rc = SQLITE_ERROR; + break; + } + }else + if( pGrp->bPatch==0 && (aBuf[ii].nBuf>0)!=(aBuf[ii+nBuf].nBuf>0) ){ + *pzErr = sqlite3_mprintf( + "invalid change: column %d " + "- old.* value is %sdefined but new.* is %sdefined", + ii, aBuf[ii].nBuf ? "" : "un", aBuf[ii+nBuf].nBuf ? 
"" : "un" + ); + rc = SQLITE_ERROR; + break; + } + } + eUndef = 0x00; + if( pGrp->bPatch==0 ) nBuf = nBuf * 2; + }else{ + for(ii=0; iicd.pTab->abPK[ii]; + if( (pGrp->cd.eOp==SQLITE_INSERT || pGrp->bPatch==0 || isPK) + && aBuf[ii].nBuf==0 + ){ + *pzErr = sqlite3_mprintf( + "invalid change: column %d is undefined", ii + ); + rc = SQLITE_ERROR; + break; + } + if( aBuf[ii].nBuf==1 && isPK ){ + *pzErr = sqlite3_mprintf( + "invalid change: null value in PK" + ); + rc = SQLITE_ERROR; + break; + } + } + } + + pGrp->cd.record.nBuf = 0; + for(ii=0; iicd.aBuf[ii]; + if( pGrp->bPatch ){ + if( pGrp->cd.pTab->abPK[ii]==0 ){ + if( pGrp->cd.eOp==SQLITE_UPDATE ){ + p += pGrp->cd.pTab->nCol; + }else if( pGrp->cd.eOp==SQLITE_DELETE ){ + continue; + } + } + } + if( 0==sessionBufferGrow(&pGrp->cd.record, p->nBuf?p->nBuf:1, &rc) ){ + if( p->nBuf ){ + memcpy(&pGrp->cd.record.aBuf[pGrp->cd.record.nBuf],p->aBuf,p->nBuf); + pGrp->cd.record.nBuf += p->nBuf; + }else{ + pGrp->cd.record.aBuf[pGrp->cd.record.nBuf++] = eUndef; + } + } + } + if( rc==SQLITE_OK ){ + rc = sessionOneChangeToHash( + pGrp, pGrp->cd.pTab, + pGrp->cd.eOp, pGrp->cd.bIndirect, pGrp->cd.pTab->nCol, + pGrp->cd.record.aBuf, pGrp->cd.record.nBuf, 0 + ); + } + } + + /* Reset all aBuf[] entries to "undefined". */ + { + int nZero = pGrp->cd.pTab->nCol; + if( pGrp->cd.eOp==SQLITE_UPDATE ) nZero += nZero; + for(ii=0; iicd.aBuf[ii].nBuf = 0; + } + } + pGrp->cd.pTab = 0; + } + + return rc; +} + #endif /* SQLITE_ENABLE_SESSION && SQLITE_ENABLE_PREUPDATE_HOOK */ diff --git a/ext/session/sqlite3session.h b/ext/session/sqlite3session.h index 28b90eb6b5..fb2336d326 100644 --- a/ext/session/sqlite3session.h +++ b/ext/session/sqlite3session.h @@ -1853,6 +1853,232 @@ int sqlite3session_config(int op, void *pArg); */ #define SQLITE_SESSION_CONFIG_STRMSIZE 1 +/* +** CAPI3REF: Configure a changegroup object +** +** Configure the changegroup object passed as the first argument. 
+** At present the only valid value for the second parameter is +** [SQLITE_CHANGEGROUP_CONFIG_PATCHSET]. +*/ +int sqlite3changegroup_config(sqlite3_changegroup*, int, void *pArg); + +/* +** CAPI3REF: Options for sqlite3changegroup_config(). +** +** The following values may be passed as the 2nd parameter to +** sqlite3changegroup_config(). +** +**
    SQLITE_CHANGEGROUP_CONFIG_PATCHSET
    +** A changegroup object generates either a changeset or patchset. Usually, +** this is determined by whether the first call to sqlite3changegroup_add() +** is passed a changeset or a patchset. Or, if the first changes are added +** to the changegroup object using the sqlite3changegroup_change_xxx() +** APIs, then this option may be used to configure whether the changegroup +** object generates a changeset or patchset. +** +** When this option is invoked, parameter pArg must point to a value of +** type int. If the changegroup currently contains zero changes, and the +** value of the int variable is zero or greater than zero, then the +** changegroup is configured to generate a changeset or patchset, +** respectively. It is a no-op, not an error, if the changegroup is not +** configured because it has already started accumulating changes. +** +** Before returning, the int variable is set to 0 if the changegroup is +** configured to generate a changeset, or 1 if it is configured to generate +** a patchset. +*/ +#define SQLITE_CHANGEGROUP_CONFIG_PATCHSET 1 + + +/* +** CAPI3REF: Begin adding a change to a changegroup +** +** This API is used, in concert with other sqlite3changegroup_change_xxx() +** APIs, to add changes to a changegroup object one at a time. To add a +** single change, the caller must: +** +** 1. Invoke sqlite3changegroup_change_begin() to indicate the type of +** change (INSERT, UPDATE or DELETE), the affected table and whether +** or not the change should be marked as indirect. +** +** 2. Invoke sqlite3changegroup_change_int64() or one of the other four +** value functions - _null(), _double(), _text() or _blob() - one or +** more times to specify old.* and new.* values for the change being +** constructed. +** +** 3. Invoke sqlite3changegroup_change_finish() to either finish adding +** the change to the group, or to discard the change altogether. 
+** +** The first argument to this function must be a pointer to the existing +** changegroup object that the change will be added to. The second argument +** must be SQLITE_INSERT, SQLITE_UPDATE or SQLITE_DELETE. The third is the +** name of the table that the change affects, and the fourth is a boolean +** flag specifying whether the change should be marked as "indirect" (if +** bIndirect is non-zero) or not indirect (if bIndirect is zero). +** +** Following a successful call to this function, this function may not be +** called again on the same changegroup object until after +** sqlite3changegroup_change_finish() has been called. Doing so is an +** SQLITE_MISUSE error. +** +** The changegroup object passed as the first argument must be already +** configured with schema data for the specified table. It may be configured +** either by calling sqlite3changegroup_schema() with a database that contains +** the table, or sqlite3changegroup_add() with a changeset that contains the +** table. If the changegroup object has not been configured with a schema for +** the specified table when this function is called, SQLITE_ERROR is returned. +** +** If successful, SQLITE_OK is returned. Otherwise, if an error occurs, an +** SQLite error code is returned. In this case, if argument pzErr is non-NULL, +** then (*pzErr) may be set to point to a buffer containing a utf-8 formatted, +** nul-terminated, English language error message. It is the responsibility +** of the caller to eventually free this buffer using sqlite3_free(). +*/ +int sqlite3changegroup_change_begin( + sqlite3_changegroup*, + int eOp, + const char *zTab, + int bIndirect, + char **pzErr +); + +/* +** CAPI3REF: Add a 64-bit integer to a changegroup +** +** This function may only be called between a successful call to +** sqlite3changegroup_change_begin() and its matching +** sqlite3changegroup_change_finish() call. If it is called at any +** other time, it is an SQLITE_MISUSE error. 
Calling this function +** specifies a 64-bit integer value to be used in the change currently being +** added to the changegroup object passed as the first argument. +** +** The second parameter, bNew, specifies whether the value is to be part of +** the new.* (if bNew is non-zero) or old.* (if bNew is zero) record of +** the change under construction. If this does not match the type of change +** specified by the preceding call to sqlite3changegroup_change_begin() (i.e. +** an old.* value for an SQLITE_INSERT change, or a new.* value for an +** SQLITE_DELETE), then SQLITE_ERROR is returned. +** +** The third parameter specifies the column of the old.* or new.* record that +** the value will be a part of. If the specified table has an explicit primary +** key, then this is the index of the table column, numbered from 0 in the order +** specified within the CREATE TABLE statement. Or, if the table uses an +** implicit rowid key, then column 0 is the rowid and the explicit columns +** are numbered starting from 1. If the iCol parameter is less than 0 or greater +** than the index of the last column in the table, SQLITE_RANGE is returned. +** +** The fourth parameter is the integer value to use as part of the old.* or +** new.* record. +** +** If this call is successful, SQLITE_OK is returned. Otherwise, if an +** error occurs, an SQLite error code is returned. +*/ +int sqlite3changegroup_change_int64( + sqlite3_changegroup*, + int bNew, + int iCol, + sqlite3_int64 iVal +); + +/* +** CAPI3REF: Add a NULL to a changegroup +** +** This function is similar to sqlite3changegroup_change_int64(). Except that +** it configures the change currently under construction with a NULL value +** instead of a 64-bit integer. +*/ +int sqlite3changegroup_change_null(sqlite3_changegroup*, int, int); + +/* +** CAPI3REF: Add a double to a changegroup +** +** This function is similar to sqlite3changegroup_change_int64(). 
Except that +** it configures the change currently being constructed with a real value +** instead of a 64-bit integer. +*/ +int sqlite3changegroup_change_double(sqlite3_changegroup*, int, int, double); + +/* +** CAPI3REF: Add a text value to a changegroup +** +** This function is similar to sqlite3changegroup_change_int64(). It configures +** the currently accumulated change with a text value instead of a 64-bit +** integer. Parameter pVal points to a buffer containing the text encoded using +** utf-8. Parameter nVal may either be the size of the text value in bytes, or +** else a negative value, in which case the buffer pVal points to is assumed to +** be nul-terminated. +*/ +int sqlite3changegroup_change_text( + sqlite3_changegroup*, int, int, const char *pVal, int nVal +); + +/* +** CAPI3REF: Add a blob to a changegroup +** +** This function is similar to sqlite3changegroup_change_int64(). It configures +** the currently accumulated change with a blob value instead of a 64-bit +** integer. Parameter pVal points to a buffer containing the blob. Parameter +** nVal is the size of the blob in bytes. +*/ +int sqlite3changegroup_change_blob( + sqlite3_changegroup*, int, int, const void *pVal, int nVal +); + +/* +** CAPI3REF: Finish adding one-at-a-time changes to a changegroup +** +** This function may only be called following a successful call to +** sqlite3changegroup_change_begin(). Otherwise, it is an SQLITE_MISUSE error. +** +** If parameter bDiscard is non-zero, then the current change is simply +** discarded. In this case this function is always successful and SQLITE_OK is +** returned. +** +** If parameter bDiscard is zero, then an attempt is made to add the current +** change to the changegroup. Assuming the changegroup is configured to +** produce a changeset (not a patchset), this requires that: +** +** * If the change is an INSERT or DELETE, then a value must be specified +** for all columns of the new.* or old.* record, respectively. 
+** +** * If the change is an UPDATE record, then values must be provided for +** the PRIMARY KEY columns of the old.* record, but must not be provided +** for PRIMARY KEY columns of the new.* record. +** +** * If the change is an UPDATE record, then for each non-PRIMARY KEY +** column in the old.* record for which a value has been provided, a +** value must also be provided for the same column in the new.* record. +** Similarly, for each non-PK column in the old.* record for which +** a value is not provided, a value must not be provided for the same +** column in the new.* record. +** +** * All values specified for PRIMARY KEY columns must be non-NULL. +** +** Otherwise, it is an error. +** +** If the changegroup already contains a change for the same row (identified +** by PRIMARY KEY columns), then the current change is combined with the +** existing change in the same way as for sqlite3changegroup_add(). +** +** For a patchset, all of the above rules apply except that it doesn't matter +** whether or not values are provided for the non-PK old.* record columns +** for an UPDATE or DELETE change. This means that code used to produce +** a changeset using the sqlite3changegroup_change_xxx() APIs may also +** be used to produce patchsets. +** +** If the call is successful, SQLITE_OK is returned. Otherwise, if an error +** occurs, an SQLite error code is returned. If an error is returned and +** parameter pzErr is not NULL, then (*pzErr) may be set to point to a buffer +** containing a nul-terminated, utf-8 encoded, English language error message. +** It is the responsibility of the caller to eventually free any such error +** message buffer using sqlite3_free(). +*/ +int sqlite3changegroup_change_finish( + sqlite3_changegroup*, + int bDiscard, + char **pzErr +); + /* ** Make sure we can call this stuff from C++. 
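The changeset rules listed for sqlite3changegroup_change_finish() can be restated as a small standalone check. This is a minimal sketch of those documented rules only, not the session extension's implementation: the `ModelCol` struct, the `ModelOp` enum and `model_change_ok()` are hypothetical names introduced for illustration, and the patchset relaxation is not modeled.

```c
/* Hypothetical stand-ins for SQLITE_INSERT, SQLITE_UPDATE, SQLITE_DELETE. */
enum ModelOp { MODEL_INSERT, MODEL_UPDATE, MODEL_DELETE };

/* State of one column in the change under construction (hypothetical). */
typedef struct ModelCol {
  int isPk;       /* True if the column is part of the PRIMARY KEY */
  int hasOld;     /* True if an old.* value was supplied */
  int oldIsNull;  /* True if the supplied old.* value is NULL */
  int hasNew;     /* True if a new.* value was supplied */
  int newIsNull;  /* True if the supplied new.* value is NULL */
} ModelCol;

/*
** Return 1 if the change described by aCol[0..nCol-1] satisfies the
** documented changeset rules, or 0 if it does not.
*/
static int model_change_ok(enum ModelOp op, const ModelCol *aCol, int nCol){
  int i;
  for(i=0; i<nCol; i++){
    const ModelCol *p = &aCol[i];
    switch( op ){
      case MODEL_INSERT:
        /* Every column needs a new.* value. An INSERT has no old.* record. */
        if( !p->hasNew || p->hasOld ) return 0;
        break;
      case MODEL_DELETE:
        /* Every column needs an old.* value. A DELETE has no new.* record. */
        if( !p->hasOld || p->hasNew ) return 0;
        break;
      case MODEL_UPDATE:
        if( p->isPk ){
          /* PK columns: old.* value required, new.* value forbidden. */
          if( !p->hasOld || p->hasNew ) return 0;
        }else if( !p->hasOld != !p->hasNew ){
          /* Non-PK columns: old.* and new.* are supplied together or not
          ** at all. */
          return 0;
        }
        break;
    }
    /* All values supplied for PRIMARY KEY columns must be non-NULL. */
    if( p->isPk ){
      if( p->hasOld && p->oldIsNull ) return 0;
      if( p->hasNew && p->newIsNull ) return 0;
    }
  }
  return 1;
}
```

In the real API a failed check is reported by sqlite3changegroup_change_finish() as an SQLite error code, optionally with a message in (*pzErr); the sketch collapses that to a 0/1 result.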
*/ diff --git a/ext/session/test_session.c b/ext/session/test_session.c index 6ad5b37749..1b09714225 100644 --- a/ext/session/test_session.c +++ b/ext/session/test_session.c @@ -7,10 +7,14 @@ #include #include "tclsqlite.h" +#include + #ifndef SQLITE_AMALGAMATION typedef unsigned char u8; #endif +extern const char *sqlite3ErrName(int); + typedef struct TestSession TestSession; struct TestSession { sqlite3_session *pSession; @@ -395,7 +399,6 @@ static int SQLITE_TCLAPI test_session_cmd( } rc = sqlite3session_object_config(pSession, aOpt[iOpt].opt, &iArg); if( rc!=SQLITE_OK ){ - extern const char *sqlite3ErrName(int); Tcl_SetObjResult(interp, Tcl_NewStringObj(sqlite3ErrName(rc), -1)); }else{ Tcl_SetObjResult(interp, Tcl_NewIntObj(iArg)); @@ -856,6 +859,21 @@ static int testStreamInput( return SQLITE_OK; } +/* +** This works like Tcl_GetByteArrayFromObj(), except that it returns a buffer +** allocated using malloc() that must be freed by the caller. This is done +** because Tcl's buffers are often padded by a few bytes, which prevents +** small overreads from being detected when tests are run under asan. +*/ +static void *testGetByteArrayFromObj(Tcl_Obj *p, Tcl_Size *pnByte){ + Tcl_Size nByte = 0; + void *aByte = Tcl_GetByteArrayFromObj(p, &nByte); + void *aCopy = malloc(nByte ? (size_t)nByte : 1); + memcpy(aCopy, aByte, (size_t)nByte); + *pnByte = nByte; + return aCopy; +} + static int SQLITE_TCLAPI testSqlite3changesetApply( int iVersion, @@ -920,7 +938,7 @@ static int SQLITE_TCLAPI testSqlite3changesetApply( return TCL_ERROR; } db = *(sqlite3 **)info.objClientData; - pChangeset = (void *)Tcl_GetByteArrayFromObj(objv[2], &nChangeset); + pChangeset = (void *)testGetByteArrayFromObj(objv[2], &nChangeset); ctx.pConflictScript = objv[3]; ctx.pFilterScript = objc==5 ? 
objv[4] : 0; ctx.interp = interp; @@ -972,6 +990,7 @@ static int SQLITE_TCLAPI testSqlite3changesetApply( } } + free(pChangeset); if( rc!=SQLITE_OK ){ return test_session_error(interp, rc, 0); }else{ @@ -1096,6 +1115,21 @@ static int SQLITE_TCLAPI test_sqlite3changeset_invert( return rc; } +/* +** Copy buffer aIn[] to a new nIn byte buffer obtained from malloc(). Use +** plain malloc() instead of any Tcl function because valgrind and asan are +** better at detecting small overflows in that case. Avoid sqlite3_malloc() +** here because that means dealing with injected OOM errors. +** +** The caller is responsible for eventually calling free() on the returned +** value. +*/ +static u8 *copyToMalloc(const u8 *aIn, int nIn){ + u8 *pRet = malloc(nIn); + memcpy(pRet, aIn, nIn); + return pRet; +} + /* ** sqlite3changeset_concat LEFT RIGHT */ @@ -1126,6 +1160,9 @@ static int SQLITE_TCLAPI test_sqlite3changeset_concat( sLeft.nStream = test_tcl_integer(interp, SESSION_STREAM_TCL_VAR); sRight.nStream = sLeft.nStream; + sLeft.aData = copyToMalloc(sLeft.aData, sLeft.nData); + sRight.aData = copyToMalloc(sRight.aData, sRight.nData); + if( sLeft.nStream>0 ){ rc = sqlite3changeset_concat_strm( testStreamInput, (void*)&sLeft, @@ -1138,6 +1175,9 @@ static int SQLITE_TCLAPI test_sqlite3changeset_concat( ); } + free(sLeft.aData); + free(sRight.aData); + if( rc!=SQLITE_OK ){ rc = test_session_error(interp, rc, 0); }else{ @@ -1195,7 +1235,12 @@ static int SQLITE_TCLAPI test_sqlite3session_foreach( pCS = objv[2]; pScript = objv[3]; - pChangeset = (void *)Tcl_GetByteArrayFromObj(pCS, &nChangeset); + /* Take a copy of the changeset into an exact sized buffer allocated + ** using malloc(). The Tcl buffer will be padded by a few bytes, which + ** prevents small overreads from being detected by ASAN when the tests + ** are run. 
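The rationale given in these comments is that a padded Tcl buffer lets a small overread land in valid memory, so ASan and valgrind stay silent. The same idea can be shown outside the test harness. The following is a minimal sketch in plain C with no Tcl or SQLite dependency; `copy_exact()` is a hypothetical name, not a function in test_session.c. Unlike copyToMalloc(), it also handles a zero length and a failed allocation.

```c
#include <stdlib.h>
#include <string.h>

/*
** Hypothetical helper: return a copy of the first n bytes of src in a
** buffer obtained from malloc(). For n>0 the allocation is exactly n
** bytes, so a read of byte n falls outside the allocation, where ASan
** or valgrind can report it. A zero-length input still gets a 1-byte
** allocation so that the result can always be passed to free().
** Returns NULL if malloc() fails.
*/
static void *copy_exact(const void *src, size_t n){
  void *copy = malloc(n ? n : 1);
  if( copy && n ) memcpy(copy, src, n);
  return copy;
}
```

The caller owns the returned buffer and must free() it on every exit path, which is why the diff adds free(pChangeset) before each return in test_sqlite3session_foreach().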
*/ + pChangeset = (void*)testGetByteArrayFromObj(pCS, &nChangeset); + sStr.nStream = test_tcl_integer(interp, SESSION_STREAM_TCL_VAR); if( isInvert ){ int f = SQLITE_CHANGESETSTART_INVERT; @@ -1216,32 +1261,33 @@ static int SQLITE_TCLAPI test_sqlite3session_foreach( rc = sqlite3changeset_start_strm(&pIter, testStreamInput, (void*)&sStr); } } - if( rc!=SQLITE_OK ){ - return test_session_error(interp, rc, 0); - } - while( SQLITE_ROW==sqlite3changeset_next(pIter) ){ - Tcl_Obj *pVar = 0; /* Tcl value to set $VARNAME to */ - pVar = testIterData(pIter); - Tcl_ObjSetVar2(interp, pVarname, 0, pVar, 0); - rc = Tcl_EvalObjEx(interp, pScript, 0); - if( rc!=TCL_OK && rc!=TCL_CONTINUE ){ - sqlite3changeset_finalize(pIter); - return rc==TCL_BREAK ? TCL_OK : rc; + if( rc==SQLITE_OK ){ + while( SQLITE_ROW==sqlite3changeset_next(pIter) ){ + Tcl_Obj *pVar = 0; /* Tcl value to set $VARNAME to */ + pVar = testIterData(pIter); + Tcl_ObjSetVar2(interp, pVarname, 0, pVar, 0); + rc = Tcl_EvalObjEx(interp, pScript, 0); + if( rc!=TCL_OK && rc!=TCL_CONTINUE ){ + sqlite3changeset_finalize(pIter); + free(pChangeset); + return rc==TCL_BREAK ? 
TCL_OK : rc; + } } - } - if( isCheckNext ){ - int rc2 = sqlite3changeset_next(pIter); - rc = sqlite3changeset_finalize(pIter); - assert( (rc2==SQLITE_DONE && rc==SQLITE_OK) || rc2==rc ); - }else{ - rc = sqlite3changeset_finalize(pIter); + if( isCheckNext ){ + int rc2 = sqlite3changeset_next(pIter); + rc = sqlite3changeset_finalize(pIter); + assert( (rc2==SQLITE_DONE && rc==SQLITE_OK) || rc2==rc ); + }else{ + rc = sqlite3changeset_finalize(pIter); + } } + + free(pChangeset); if( rc!=SQLITE_OK ){ return test_session_error(interp, rc, 0); } - return TCL_OK; } @@ -1530,6 +1576,14 @@ static void test_changegroup_del(void *clientData){ ckfree(pGrp); } +static int testGetNewOrOld(Tcl_Interp *interp, Tcl_Obj *pObj, int *pbNew){ + const char *azVal[] = { "old", "new", 0 }; + int iIdx = 0; + int rc = Tcl_GetIndexFromObj(interp, pObj, azVal, "record", 0, &iIdx); + *pbNew = iIdx; + return rc; +} + /* ** Tclcmd: $changegroup schema DB DBNAME ** Tclcmd: $changegroup add CHANGESET @@ -1547,14 +1601,25 @@ static int SQLITE_TCLAPI test_changegroup_cmd( const char *zSub; int nArg; const char *zMsg; - int iSub; } aSub[] = { - { "schema", 2, "DB DBNAME", }, /* 0 */ - { "add", 1, "CHANGESET", }, /* 1 */ - { "output", 0, "", }, /* 2 */ - { "delete", 0, "", }, /* 3 */ - { "add_change", 1, "ITERATOR", }, /* 4 */ - { 0 } + { "schema", 2, "DB DBNAME" }, /* 0 */ + { "add", 1, "CHANGESET" }, /* 1 */ + { "output", 0, "" }, /* 2 */ + { "delete", 0, "" }, /* 3 */ + { "add_change", 1, "ITERATOR" }, /* 4 */ + + { "change_begin", 3, "TYPE TABLE INDIRECT" }, /* 5 */ + { "change_int64", 3, "[new|old] ICOL VALUE" }, /* 6 */ + { "change_null", 2, "[new|old] ICOL" }, /* 7 */ + { "change_double", 3, "[new|old] ICOL VALUE" }, /* 8 */ + { "change_text", 3, "[new|old] ICOL VALUE" }, /* 9 */ + { "change_blob", 3, "[new|old] ICOL VALUE" }, /* 10 */ + { "change_finish", 1, "BDISCARD" }, /* 11 */ + + { "config", 2, "OPTION INTVAL" }, /* 12 */ + { "change_text-1", 3, "[new|old] ICOL VALUE" }, /* 13 */ + { 
"change_begin_ne", 3, "TYPE TABLE INDIRECT" }, /* 14 */ + { 0, 0, 0 } }; int rc = TCL_OK; int iSub = 0; @@ -1623,6 +1688,193 @@ static int SQLITE_TCLAPI test_changegroup_cmd( break; }; + case 14: /* change_begin_ne */ + case 5: { /* change_begin */ + struct ChangeType { + const char *zType; + int eType; + } aType[] = { + { "INSERT", SQLITE_INSERT }, + { "UPDATE", SQLITE_UPDATE }, + { "DELETE", SQLITE_DELETE }, + { 0, 0 } + }; + int eType = 0; + const char *zTab = 0; + int bIndirect; + int iIdx = 0; + char *zErr = 0; + char **pz = ((iSub==5) ? &zErr : 0); + + if( TCL_OK!=Tcl_GetIntFromObj(0, objv[2], &eType) ){ + rc = Tcl_GetIndexFromObjStruct( + interp, objv[2], aType, sizeof(aType[0]), "TYPE", 0, &iIdx + ); + if( rc!=TCL_OK ) return rc; + eType = aType[iIdx].eType; + } + zTab = Tcl_GetString(objv[3]); + if( Tcl_GetBooleanFromObj(interp, objv[4], &bIndirect) ){ + return TCL_ERROR; + } + + rc = sqlite3changegroup_change_begin(p->pGrp, eType, zTab, bIndirect, pz); + assert( zErr==0 || rc!=SQLITE_OK ); + if( rc!=SQLITE_OK ){ + rc = test_session_error(interp, rc, zErr); + } + + break; + } + + case 6: { /* change_int64 */ + int bNew = 0; + int iCol = 0; + sqlite3_int64 iVal = 0; + if( TCL_OK!=testGetNewOrOld(interp, objv[2], &bNew) + || TCL_OK!=Tcl_GetIntFromObj(interp, objv[3], &iCol) + || TCL_OK!=Tcl_GetWideIntFromObj(interp, objv[4], &iVal) + ){ + rc = TCL_ERROR; + }else{ + rc = sqlite3changegroup_change_int64(p->pGrp, bNew, iCol, iVal); + if( rc!=SQLITE_OK ){ + rc = test_session_error(interp, rc, 0); + } + } + break; + } + + case 7: { /* change_null */ + int bNew = 0; + int iCol = 0; + if( TCL_OK!=testGetNewOrOld(interp, objv[2], &bNew) + || TCL_OK!=Tcl_GetIntFromObj(interp, objv[3], &iCol) + ){ + rc = TCL_ERROR; + }else{ + rc = sqlite3changegroup_change_null(p->pGrp, bNew, iCol); + if( rc!=SQLITE_OK ){ + rc = test_session_error(interp, rc, 0); + } + } + break; + } + + case 8: { /* change_double */ + int bNew = 0; + int iCol = 0; + double rVal = 0; + if( 
TCL_OK!=testGetNewOrOld(interp, objv[2], &bNew) + || TCL_OK!=Tcl_GetIntFromObj(interp, objv[3], &iCol) + || TCL_OK!=Tcl_GetDoubleFromObj(interp, objv[4], &rVal) + ){ + rc = TCL_ERROR; + }else{ + rc = sqlite3changegroup_change_double(p->pGrp, bNew, iCol, rVal); + if( rc!=SQLITE_OK ){ + rc = test_session_error(interp, rc, 0); + } + } + break; + } + + case 9: { /* change_text */ + int bNew = 0; + int iCol = 0; + if( TCL_OK!=testGetNewOrOld(interp, objv[2], &bNew) + || TCL_OK!=Tcl_GetIntFromObj(interp, objv[3], &iCol) + ){ + rc = TCL_ERROR; + }else{ + Tcl_Size nVal = 0; + const char *pVal = Tcl_GetStringFromObj(objv[4], &nVal); + rc = sqlite3changegroup_change_text(p->pGrp, bNew, iCol, pVal, nVal); + if( rc!=SQLITE_OK ){ + rc = test_session_error(interp, rc, 0); + } + } + break; + } + + case 10: { /* change_blob */ + int bNew = 0; + int iCol = 0; + if( TCL_OK!=testGetNewOrOld(interp, objv[2], &bNew) + || TCL_OK!=Tcl_GetIntFromObj(interp, objv[3], &iCol) + ){ + rc = TCL_ERROR; + }else{ + Tcl_Size nVal = 0; + const u8 *pVal = Tcl_GetByteArrayFromObj(objv[4], &nVal); + rc = sqlite3changegroup_change_blob(p->pGrp, bNew, iCol, pVal, nVal); + if( rc!=SQLITE_OK ){ + rc = test_session_error(interp, rc, 0); + } + } + break; + } + + case 11: { /* change_finish */ + int bDiscard = 0; + if( TCL_OK!=Tcl_GetBooleanFromObj(interp, objv[2], &bDiscard) ){ + rc = TCL_ERROR; + }else{ + char *zErr = 0; + rc = sqlite3changegroup_change_finish(p->pGrp, bDiscard, &zErr); + if( rc!=SQLITE_OK ){ + rc = test_session_error(interp, rc, zErr); + } + } + break; + } + + case 12: { /* config */ + struct OptionName { + const char *zOpt; + int op; + } aOp[] = { + { "patchset", SQLITE_CHANGEGROUP_CONFIG_PATCHSET }, + { 0, 0 } + }; + int iIdx = 0; + int iArg = 0; + rc = Tcl_GetIndexFromObjStruct( + interp, objv[2], aOp, sizeof(aOp[0]), "option", 0, &iIdx + ); + if( rc==TCL_OK + && (rc = Tcl_GetIntFromObj(interp, objv[3], &iArg))==TCL_OK + ){ + int op = aOp[iIdx].op; + void *pArg = (void*)&iArg; + + rc = 
sqlite3changegroup_config(p->pGrp, op, pArg); + if( rc!=SQLITE_OK ){ + rc = test_session_error(interp, rc, 0); + }else{ + Tcl_SetObjResult(interp, Tcl_NewIntObj(iArg)); + } + } + break; + } + + case 13: { /* change_text-1 */ + int bNew = 0; + int iCol = 0; + if( TCL_OK!=testGetNewOrOld(interp, objv[2], &bNew) + || TCL_OK!=Tcl_GetIntFromObj(interp, objv[3], &iCol) + ){ + rc = TCL_ERROR; + }else{ + const char *pVal = Tcl_GetString(objv[4]); + rc = sqlite3changegroup_change_text(p->pGrp, bNew, iCol, pVal, -1); + if( rc!=SQLITE_OK ){ + rc = test_session_error(interp, rc, 0); + } + } + break; + } + default: { /* delete */ assert( iSub==3 ); Tcl_DeleteCommand(interp, Tcl_GetString(objv[0])); diff --git a/ext/wasm/EXPORTED_FUNCTIONS.fiddle.in b/ext/wasm/EXPORTED_FUNCTIONS.fiddle.in deleted file mode 100644 index 103704df10..0000000000 --- a/ext/wasm/EXPORTED_FUNCTIONS.fiddle.in +++ /dev/null @@ -1,10 +0,0 @@ -_fiddle_db_arg -_fiddle_db_filename -_fiddle_exec -_fiddle_experiment -_fiddle_interrupt -_fiddle_main -_fiddle_reset_db -_fiddle_db_handle -_fiddle_db_vfs -_fiddle_export_db diff --git a/ext/wasm/GNUmakefile b/ext/wasm/GNUmakefile index 937e16d6ef..05375d42bd 100644 --- a/ext/wasm/GNUmakefile +++ b/ext/wasm/GNUmakefile @@ -1,5 +1,6 @@ -####################################################################### -# This GNU makefile creates the canonical sqlite3 WASM builds. +# +# This GNU makefile creates the canonical sqlite3 WASM builds. Plus some +# others. # # This build assumes a Linux platform and is not intended for # general-purpose client-level use, except for creating builds with @@ -10,28 +11,33 @@ # # default, all = build in dev mode # -# o0, o1, o2, o3, os, oz = full clean/rebuild with the -Ox level indicated -# by the target name. Rebuild is necessary for all components to get -# the desired optimization level. +# o0, o1, o2, o3, os, oz = full clean/rebuild with the -Ox level +# indicated by the target name. 
A clean rebuild is necessary for +# all components to get the desired optimization level. # # dist = create end user deliverables. Add dist.build=oX to build -# with a specific optimization level, where oX is one of the -# above-listed o? or qo? target names. +# with a specific optimization level, where oX is one of the +# above-listed o? target names. # # snapshot = like dist, but uses a zip file name which clearly -# marks it as a prerelease/snapshot build. +# marks it as a prerelease/snapshot build. # # clean = clean up # # Required tools beyond those needed for the canonical builds: # # - Emscripten SDK: https://emscripten.org/docs/getting_started/downloads.html +# # - The bash shell +# # - GNU make, GNU sed, GNU awk, GNU grep (all in the $PATH and without # a "g" prefix like they have on some non-GNU systems) -# - wasm-strip for release builds: https://github.com/WebAssembly/wabt +# +# - wasm-strip for release builds: https://github.com/WebAssembly/wabt. +# It will build without this but the .wasm files will be huge. +# # - InfoZip for 'dist' zip file -######################################################################## +# default: all MAKEFILE = $(lastword $(MAKEFILE_LIST)) CLEAN_FILES = @@ -80,6 +86,7 @@ emo.strip = 💈 emo.test = 🧪 emo.tool = 🔨 emo.wasm-opt = 🧼 +emo.cleanup = 🧼 # 👷🪄🧮🧫🧽🍿⛽🚧🎱🪚🏆🧼 # @@ -199,7 +206,7 @@ b.mkdir@ = if [ ! -d $(dir $@) ]; then \ # $1 = logtag, $2 = src file(s). $3 = dest dir b.cp = $(call b.mkdir@); \ echo '$(logtag.$(1)) $(emo.disk) $(2) ==> $3'; \ - cp -p $(2) $(3) || exit + cp -f -p $(2) $(3) || exit # # $(call b.c-pp.shcmd,LOGTAG,src,dest,-Dx=y...) 
@@ -213,9 +220,8 @@ b.cp = $(call b.mkdir@); \ # $4 = optional $(bin.c-pp) flags define b.c-pp.shcmd $(call b.mkdir@); \ -$(call b.echo,$(1),$(emo.disk)$(emo.lock) $(bin.c-pp) $(4) $(if $(loud.if),$(2))); \ -rm -f $(3); \ -$(bin.c-pp) -o $(3) $(4) $(2) || exit; \ +$(call b.echo,$(1),$(emo.disk)$(emo.lock)[$(3)] $(4)); \ +rm -f $(3); $(bin.c-pp) -o $(3) $(4) $(2) || exit; \ chmod -w $(3) endef @@ -227,10 +233,38 @@ endef # Args: as for $(b.c-pp.shcmd). define b.c-pp.target $(3): $$(MAKEFILE_LIST) $$(bin.c-pp) $(2) - @$$(call b.c-pp.shcmd,$(1),$(2),$(3),$(4) $(b.c-pp.target.flags)) + @$$(call b.mkdir@) + @$$(call b.c-pp.shcmd,$(1),$(2),$(3),$(4) $$(b.c-pp.target.flags)) CLEAN_FILES += $(3) endef + +# +# The various -D... values used by *.c-pp.js include: +# +# -Dtarget:es6-module: for all ESM module builds +# +# -Dtarget:node: for node.js builds +# +# -Dtarget:es6-module -Dtarget:es6-bundler-friendly: intended for +# "bundler-friendly" ESM module build. These have some restrictions +# on how URL() objects are constructed in some contexts: URLs which +# refer to files which are part of this project must be referenced +# as string literals so that bundlers' static-analysis tools can +# find those files and include them in their bundles. +# +# -Dtarget:es6-module -Dtarget:node: is intended for use by node.js +# for node.js, as opposed to by node.js on behalf of a +# browser. Mixing -sENVIRONMENT=web and -sENVIRONMENT=node leads to +# ambiguity and confusion on node's part, as it's unable to +# reliably determine whether the target is a browser or node. +# +# To repeat: all node.js builds are 100% untested and unsupported. +# +# Most c-pp.D.X are set via $(bin.mkwb) and X is a build name. +# Those make rules reference c-pp.D.64bit, so it should be defined in +# advance. 
+# c-pp.D.64bit = -Dbits64 # @@ -248,11 +282,21 @@ c-pp.D.64bit = -Dbits64 # # This is intended to be used in makefile targets which generate an # Emscripten module and where $@ is the module's .js/.mjs file. +# +# This is inherently fragile and has been broken by Emscripten updates +# before. +# +ifeq (1,1) +define b.strip-js-emcc-bindings +echo "$(1) $(emo.garbage) Stripping export wrappers."; \ +sed -i -e '/var _sqlite3.*makeInvalidEarly.*;/d' \ +-e '/assert.*sqlite3.*missing.*;/d' \ +-e '/_sqlite.*createExportWrapper.*;/d' $@ +endef +else b.strip-js-emcc-bindings = \ - sed -i -e '/^.*= $_sqlite3\|_fiddle$[^=]*=.*createExportWrapper/d' \ - -e '/^var $_sqlite3\|_fiddle$[^=]*=.*makeInvalidEarlyAccess/d' $@ || exit; \ - echo '$(1) $(emo.garbage) (Probably) /createExportWrapper()/d and /makeInvalidEarlyAccess()/d' - + echo '$(1) $(emo.bug) (disabled because it breaks emsdk 4.0.16+)' +endif # # Set up sqlite3.c and sqlite3.h... @@ -271,33 +315,36 @@ b.strip-js-emcc-bindings = \ # $(sqlite3.canonical.c) must point to the sqlite3.c in # the sqlite3 canonical source tree, as that source file # is required for certain utility and test code. +# sqlite3.canonical.c = $(dir.top)/sqlite3.c sqlite3.c ?= $(firstword $(wildcard $(dir.top)/sqlite3-see.c) $(sqlite3.canonical.c)) -sqlite3.h = $(dir $(sqlite3.c))/sqlite3.h +sqlite3.h = $(dir $(sqlite3.c))sqlite3.h # # bin.version-info = binary to output various sqlite3 version info for # embedding in the JS files and in building the distribution zip file. # It must NOT be in $(dir.tmp) because we need it to survive the # cleanup process for the dist build to work properly. +# bin.version-info = ./version-info $(bin.version-info): $(dir.tool)/version-info.c $(sqlite3.h) $(dir.top)/Makefile $(CC) -o $@ -I$(dir $(sqlite3.h)) $(dir.tool)/version-info.c t-version-info: $(bin.version-info) DISTCLEAN_FILES += $(bin.version-info) + # # bin.stripcomments is used for stripping C/C++-style comments from JS # files. 
The JS files contain large chunks of documentation which we # don't need for all builds. That app's -k flag is of particular # importance here, as it allows us to retain the opening comment # block(s), which contain the license header and version info. +# bin.stripccomments = $(dir.tool)/stripccomments $(bin.stripccomments): $(bin.stripccomments).c $(MAKEFILE) $(CC) -o $@ $< t-stripccomments: $(bin.stripccomments) DISTCLEAN_FILES += $(bin.stripccomments) - ifeq (1,$(MAKING_CLEAN)) SQLITE_C_IS_SEE = 0 else @@ -325,7 +372,7 @@ endif # undefine barebones # relatively new gmake feature, not ubiquitous # -# It's important that sqlite3.h be built to completion before any +# It's important that sqlite3.[ch] be built to completion before any # other parts of the build run, thus we use .NOTPARALLEL to disable # parallel build of that file and its dependants. However, that makes # the whole build non-parallelizable because everything has a dep on @@ -342,8 +389,10 @@ $(sqlite3.h): # $(MAKE) -C $(dir.top) sqlite3.c $(sqlite3.c): $(sqlite3.h) +# # Common options for building sqlite3-wasm.c and speedtest1.c. # Explicit ENABLEs... +# SQLITE_OPT.common = \ -DSQLITE_THREADSAFE=0 \ -DSQLITE_TEMP_STORE=2 \ @@ -360,11 +409,15 @@ SQLITE_OPT.common = \ # removing them from this list will serve only to break the speedtest1 # builds. +# # Currently always needed but TODO is paring tester1.c-pp.js down # to be able to run without this: +# SQLITE_OPT.common += -DSQLITE_WASM_ENABLE_C_TESTS +# # Extra flags for full-featured builds... +# SQLITE_OPT.full-featured = \ -DSQLITE_ENABLE_BYTECODE_VTAB \ -DSQLITE_ENABLE_DBPAGE_VTAB \ @@ -421,13 +474,14 @@ else # -DSQLITE_OMIT_WINDOWFUNC endif +# #SQLITE_OPT += -DSQLITE_DEBUG # Enabling SQLITE_DEBUG will break sqlite3_wasm_vfs_create_file() # (and thus sqlite3_js_vfs_create_file()). 
Those functions are # deprecated and alternatives are in place, but this crash behavior # can be used to find errant uses of sqlite3_js_vfs_create_file() # in client code. -######################################################################## +# # The following flags are hard-coded into sqlite3-wasm.c and cannot be # modified via the build process: # @@ -436,10 +490,9 @@ endif # SQLITE_OMIT_DEPRECATED # SQLITE_OMIT_UTF16 # SQLITE_OMIT_SHARED_CACHE -######################################################################## - +# -######################################################################## +# # Adding custom C code via sqlite3_wasm_extra_init.c: # # If the canonical build process finds the file @@ -460,7 +513,7 @@ endif # make sqlite3_wasm_extra_init.c=my_custom_stuff.c # # See example_extra_init.c for an example implementation. -######################################################################## +# sqlite3_wasm_extra_init.c ?= $(wildcard sqlite3_wasm_extra_init.c) cflags.wasm_extra_init = ifneq (,$(sqlite3_wasm_extra_init.c)) @@ -481,7 +534,7 @@ endif # WASM_CUSTOM_INSTANTIATE = 1 -######################################################################## +# # $(bin.c-pp): a minimal text file preprocessor. Like C's but much # less so. # @@ -512,18 +565,26 @@ WASM_CUSTOM_INSTANTIATE = 1 # # -D... flags which should be included in all invocations should be # appended to $(b.c-pp.target.flags). 
-bin.c-pp = ./c-pp-lite -$(bin.c-pp): c-pp-lite.c $(sqlite3.c) $(MAKEFILE) - $(CC) -O0 -o $@ c-pp-lite.c $(sqlite3.c) '-DCMPP_DEFAULT_DELIM="//#"' -I$(dir.top) \ +# +bin.c-pp = ./c-pp +$(bin.c-pp): libcmpp.c $(sqlite3.c) $(MAKEFILE) + $(CC) -O0 -o $@ libcmpp.c $(dir.top)/sqlite3.c -I$(dir.top) \ -DSQLITE_OMIT_LOAD_EXTENSION -DSQLITE_OMIT_DEPRECATED -DSQLITE_OMIT_UTF16 \ -DSQLITE_OMIT_SHARED_CACHE -DSQLITE_OMIT_WAL -DSQLITE_THREADSAFE=0 \ - -DSQLITE_TEMP_STORE=3 + -DSQLITE_TEMP_STORE=3 \ + '-DCMPP_DEFAULT_DELIM="//#"' -DCMPP_MAIN -DCMPP_OMIT_D_MODULE \ + -DCMPP_OMIT_D_PIPE DISTCLEAN_FILES += $(bin.c-pp) b.c-pp.target.flags ?= ifeq (1,$(SQLITE_C_IS_SEE)) b.c-pp.target.flags += -Denable-see endif +api.oo1 ?= 1 +ifeq (0,$(api.oo1)) + b.c-pp.target.flags += -Domit-oo1 +endif +# # cflags.common = C compiler flags for all builds cflags.common = -I. -I$(dir $(sqlite3.c)) -std=c99 -fPIC # emcc.WASM_BIGINT = 1 for BigInt (C int64) support, else 0. The API @@ -531,13 +592,14 @@ cflags.common = -I. -I$(dir $(sqlite3.c)) -std=c99 -fPIC # _are not tested_ on any regular basis. emcc.WASM_BIGINT ?= 1 emcc.MEMORY64 ?= 0 -######################################################################## +# # https://emscripten.org/docs/tools_reference/settings_reference.html#memory64 # # 64-bit build requires wasm-strip 1.0.36 (maybe 1.0.35, but not # 1.0.34) or will fail to strip with "tables may not be 64-bit". -######################################################################## +# +# # emcc_opt = optimization-related flags. These are primarily used by # the various oX targets. build times for -O levels higher than 0 are # painful at dev-time. @@ -549,14 +611,17 @@ emcc.MEMORY64 ?= 0 # -O2 (which consistently creates the fastest-running deliverables). # Build time suffers greatly compared to -O0, which is why -O0 is the # default. 
+# ifeq (,$(filter $(OPTIMIZED_TARGETS),$(MAKECMDGOALS))) emcc_opt ?= -O0 else emcc_opt ?= -Oz endif +# # When passing emcc_opt from the CLI, += and re-assignment have no # effect, so emcc_opt+=-g3 doesn't work. So... +# emcc_opt_full = $(emcc_opt) -g3 # ^^^ ALWAYS use -g3. See below for why. # @@ -583,26 +648,26 @@ emcc_opt_full = $(emcc_opt) -g3 # Much practice has demonstrated that -O2 consistently gives the best # runtime speeds, but not by a large enough factor to rule out use of # -Oz when smaller deliverable size is a priority. -######################################################################## +# -######################################################################## +# # EXPORTED_FUNCTIONS.* = files for use with Emscripten's # -sEXPORTED_FUNCTION flag. -EXPORTED_FUNCTIONS.api.core = $(dir.api)/EXPORTED_FUNCTIONS.sqlite3-core -EXPORTED_FUNCTIONS.api.in = $(EXPORTED_FUNCTIONS.api.core) -ifeq (1,$(SQLITE_C_IS_SEE)) - EXPORTED_FUNCTIONS.api.in += $(dir.api)/EXPORTED_FUNCTIONS.sqlite3-see -endif -ifeq (0,$(wasm-bare-bones)) - EXPORTED_FUNCTIONS.api.in += $(dir.api)/EXPORTED_FUNCTIONS.sqlite3-extras -endif +# +EXPORTED_FUNCTIONS.api.in = $(dir.api)/EXPORTED_FUNCTIONS.c-pp EXPORTED_FUNCTIONS.api = $(dir.tmp)/EXPORTED_FUNCTIONS.api -$(EXPORTED_FUNCTIONS.api): $(EXPORTED_FUNCTIONS.api.in) $(sqlite3.c) $(MAKEFILE) - @$(call b.mkdir@) - cat $(EXPORTED_FUNCTIONS.api.in) > $@ +EXPORTED_FUNCTIONS.c-pp.flags = +ifeq (1,$(wasm-bare-bones)) + EXPORTED_FUNCTIONS.c-pp.flags += -Dbare-bones +endif +$(eval $(call b.c-pp.target,filter,\ + $(EXPORTED_FUNCTIONS.api.in),\ + $(EXPORTED_FUNCTIONS.api),\ + $(EXPORTED_FUNCTIONS.c-pp.flags))) -######################################################################## +# # emcc flags for .c/.o/.wasm/.js. 
+# emcc.flags = ifeq (1,$(emcc.verbose)) emcc.flags += -v @@ -690,9 +755,9 @@ ifeq (,$(emcc.INITIAL_MEMORY.$(emcc.INITIAL_MEMORY))) $(error emcc.INITIAL_MEMORY must be one of: 4, 8, 16, 32, 64, 96, 128 (megabytes)) endif emcc.jsflags += -sINITIAL_MEMORY=$(emcc.INITIAL_MEMORY.$(emcc.INITIAL_MEMORY)) +# # /INITIAL_MEMORY -######################################################################## -#emcc.jsflags += -sMEMORY64=$(emcc.MEMORY64) +# emcc.jsflags += $(emcc.environment) emcc.jsflags += -sSTACK_SIZE=512KB @@ -701,7 +766,8 @@ emcc.jsflags += -sSTACK_SIZE=512KB # VFS, which requires twice that for its xRead() and xWrite() methods. # 2023-03: those methods have since been adapted to use a malloc()'d # buffer. -######################################################################## + +# # $(sqlite3.js.init-func) is the name Emscripten assigns our exported # module init/load function. This symbol name is hard-coded in # $(extern-post-js.js) as well as in numerous docs. @@ -741,7 +807,7 @@ emcc.jsflags += -sLLD_REPORT_UNDEFINED #emcc.jsflags += --experimental-pic --unresolved-symbols=ingore-all --import-undefined #emcc.jsflags += --unresolved-symbols=ignore-all -######################################################################## +# # -sSINGLE_FILE: # https://github.com/emscripten-core/emscripten/blob/main/src/settings.js # @@ -750,12 +816,7 @@ emcc.jsflags += -sLLD_REPORT_UNDEFINED # cannot wasm-strip the binary before it gets encoded into the JS # file. The result is that the generated JS file is, because of the # -g3 debugging info, _huge_. -######################################################################## - - -sqlite3.wasm = $(dir.dout)/sqlite3.wasm -sqlite3-wasm.c = $(dir.api)/sqlite3-wasm.c -sqlite3-wasm.c.in = $(sqlite3-wasm.c) $(sqlite3_wasm_extra_init.c) +# # # b.call.patch-export-default is used by mkwasmbuilds.c and the @@ -798,51 +859,6 @@ if [ x1 = x$(1) ]; then \ fi endef -# -# The various -D... 
values used by *.c-pp.js include: -# -# -Dtarget:es6-module: for all ESM module builds -# -# -Dtarget:node: for node.js builds -# -# -Dtarget:es6-module -Dtarget:es6-bundler-friendly: intended for -# "bundler-friendly" ESM module build. These have some restrictions -# on how URL() objects are constructed in some contexts: URLs which -# refer to files which are part of this project must be referenced -# as string literals so that bundlers' static-analysis tools can -# find those files and include them in their bundles. -# -# -Dtarget:es6-module -Dtarget:node: is intended for use by node.js -# for node.js, as opposed to by node.js on behalf of a -# browser. Mixing -sENVIRONMENT=web and -sENVIRONMENT=node leads to -# ambiguity and confusion on node's part, as it's unable to -# reliably determine whether the target is a browser or node. -# -# To repeat: all node.js builds are 100% untested and unsupported. -# -######################################################################## - -# -# Inputs/outputs for the sqlite3-api.js family. -# -# sqlite3-api.jses = the list of JS files which make up -# sqlite3-api.js, in the order they need to be assembled. 
-sqlite3-api.jses = $(sqlite3-license-version.js) -sqlite3-api.jses += $(dir.api)/sqlite3-api-prologue.js -sqlite3-api.jses += $(dir.common)/whwasmutil.js -sqlite3-api.jses += $(dir.jacc)/jaccwabyt.js -sqlite3-api.jses += $(dir.api)/sqlite3-api-glue.c-pp.js -sqlite3-api.jses += $(sqlite3-api-build-version.js) -sqlite3-api.jses += $(dir.api)/sqlite3-api-oo1.c-pp.js -sqlite3-api.jses += $(dir.api)/sqlite3-api-worker1.c-pp.js -sqlite3-api.jses += $(dir.api)/sqlite3-vfs-helper.c-pp.js -ifeq (0,$(wasm-bare-bones)) - sqlite3-api.jses += $(dir.api)/sqlite3-vtab-helper.c-pp.js -endif -sqlite3-api.jses += $(dir.api)/sqlite3-vfs-opfs.c-pp.js -sqlite3-api.jses += $(dir.api)/sqlite3-vfs-opfs-sahpool.c-pp.js -sqlite3-api.jses += $(dir.api)/sqlite3-api-cleanup.js - # # $(sqlite3-license-version.js) contains the license header and # in-comment build version info. @@ -852,17 +868,15 @@ sqlite3-api.jses += $(dir.api)/sqlite3-api-cleanup.js # sqlite3-license-version.js = $(dir.tmp)/sqlite3-license-version.js $(sqlite3-license-version.js): $(bin.version-info) \ - $(dir.api)/sqlite3-license-version-header.js - @echo '$(logtag.@) $(emo.disk)'; { \ - $(call b.mkdir@); \ + $(dir.api)/sqlite3-license-version-header.js $(MAKEFILE) + @$(call b.mkdir@); echo '$(logtag.@) $(emo.disk)'; { \ cat $(dir.api)/sqlite3-license-version-header.js || exit $$?; \ - echo '/*'; \ + echo '/* @preserve'; \ echo '** This code was built from sqlite3 version...'; \ echo '**'; \ awk '/define SQLITE_VERSION/{$$1=""; print "**" $$0}' $(sqlite3.h); \ awk '/define SQLITE_SOURCE_ID/{$$1=""; print "**" $$0}' $(sqlite3.h); \ echo '**'; echo '** Emscripten SDK: $(emcc.version)'; \ - echo '**'; \ echo '*/'; \ } > $@ @@ -881,6 +895,36 @@ $(sqlite3-api-build-version.js): $(bin.version-info) $(MAKEFILE) echo '});'; \ } > $@ +# +# Inputs/outputs for the sqlite3-api.js family. +# +# sqlite3-api.jses = the list of JS files which make up +# sqlite3-api.js, in the order they need to be assembled. 
+sqlite3-api.jses = $(sqlite3-license-version.js) +sqlite3-api.jses += $(dir.api)/sqlite3-api-prologue.js +sqlite3-api.jses += $(sqlite3-api-build-version.js) +sqlite3-api.jses += $(dir.common)/whwasmutil.js +sqlite3-api.jses += $(dir.jacc)/jaccwabyt.js +sqlite3-api.jses += $(dir.api)/sqlite3-api-glue.c-pp.js +sqlite3-api.jses += $(dir.api)/sqlite3-api-oo1.c-pp.js +sqlite3-api.jses += $(dir.api)/sqlite3-api-worker1.c-pp.js +sqlite3-api.jses += $(dir.api)/sqlite3-vfs-helper.c-pp.js +ifeq (0,$(wasm-bare-bones)) + sqlite3-api.jses += $(dir.api)/sqlite3-vtab-helper.c-pp.js +endif +sqlite3-api.jses += $(dir.api)/sqlite3-vfs-kvvfs.c-pp.js +sqlite3-api.jses += $(dir.api)/opfs-common-shared.c-pp.js +sqlite3-api.jses += $(dir.api)/sqlite3-vfs-opfs.c-pp.js +sqlite3-api.jses += $(dir.api)/sqlite3-vfs-opfs-sahpool.c-pp.js +sqlite3-api.jses += $(dir.api)/sqlite3-vfs-opfs-wl.c-pp.js + +# Parallel builds can fail if $(sqlite3-license-version.js) is not +# created early enough, so make all files in $(sqlite3-api.jses) except +# for $(sqlite3-license-version.js) depend on +# $(sqlite3-license-version.js). +deps.jses = $(filter-out $(sqlite3-license-version.js),$(sqlite3-api.jses)) +$(deps.jses): $(sqlite3-license-version.js) + # # extern-post-js* and extern-pre-js* are files for use with # Emscripten's --extern-pre-js and --extern-post-js flags. @@ -918,7 +962,7 @@ $(post-js.in.js): $(MKDIR.bld) $(post-jses.js) $(MAKEFILE) done > $@ # -# speedtest1 decls needed before the $(bin.mkws)-generated makefile +# speedtest1 decls needed before the $(bin.mkwb)-generated makefile # is included.
# bin.speedtest1 = ../../speedtest1 @@ -957,12 +1001,21 @@ endif # /shell.c ######################################################################## +# +# Fiddle-related decls we need before .wasmbuilds is included +# + +fiddle.c.in = $(dir.top)/shell.c $(sqlite3-wasm.c) + EXPORTED_FUNCTIONS.fiddle = $(dir.tmp)/EXPORTED_FUNCTIONS.fiddle -$(EXPORTED_FUNCTIONS.fiddle): $(fiddle.EXPORTED_FUNCTIONS.in) $(MAKEFILE_LIST) - @$(b.mkdir@) - @sort -u $(fiddle.EXPORTED_FUNCTIONS.in) > $@ +$(EXPORTED_FUNCTIONS.fiddle): $(EXPORTED_FUNCTIONS.api.in) \ + $(MAKEFILE_LIST) $(bin.c-pp) + @$(call b.mkdir@) + @$(call b.c-pp.shcmd,fiddle,$(EXPORTED_FUNCTIONS.api.in),\ + $@,$(EXPORTED_FUNCTIONS.c-pp.flags) -Dfiddle) @echo $(logtag.@) $(emo.disk) + emcc.flags.fiddle = \ $(emcc.cflags) $(emcc_opt_full) \ --minify 0 \ @@ -987,26 +1040,23 @@ emcc.flags.fiddle = \ -USQLITE_WASM_BARE_BONES \ -DSQLITE_SHELL_FIDDLE -clean: clean-fiddle -clean-fiddle: - rm -f $(dir.fiddle)/fiddle-module.js \ - $(dir.fiddle)/*.wasm \ - $(dir.fiddle)/sqlite3-opfs-*.js \ - $(dir.fiddle)/*.gz \ - EXPORTED_FUNCTIONS.fiddle - rm -fr $(dir.fiddle-debug) - emcc.flags.fiddle.debug = $(emcc.flags.fiddle) \ -DSQLITE_DEBUG \ -DSQLITE_ENABLE_SELECTTRACE \ -DSQLITE_ENABLE_WHERETRACE -fiddle.EXPORTED_FUNCTIONS.in = \ - EXPORTED_FUNCTIONS.fiddle.in \ - $(dir.api)/EXPORTED_FUNCTIONS.sqlite3-core \ - $(dir.api)/EXPORTED_FUNCTIONS.sqlite3-extras - -fiddle.c.in = $(dir.top)/shell.c $(sqlite3-wasm.c) +clean: clean-fiddle +clean-fiddle: + rm -f $(dir.fiddle)/fiddle-module.js \ + $(dir.fiddle)/*.wasm \ + $(dir.fiddle)/sqlite3-opfs-*.js \ + $(dir.fiddle)/*.gz \ + $(dir.fiddle)/index.html \ + $(EXPORTED_FUNCTIONS.fiddle) + rm -fr $(dir.fiddle-debug) +distclean: distclean-fiddle +distclean-fiddle: + rm -fr $(dir.fiddle)/jqterm # # WASMFS build - unsupported and untested. 
We used WASMFS @@ -1028,6 +1078,14 @@ cflags.wasmfs = -DSQLITE_ENABLE_WASMFS # end wasmfs (the rest is in mkwasmbuilds.c) # +# +# +# +sqlite3-wasm.c = $(dir.api)/sqlite3-wasm.c +# List of input files for compiling $(sqlite3-wasm.c). That file +# #include's sqlite3.c directly, so it's implicitly included here. +sqlite3-wasm.c.in = $(sqlite3-wasm.c) $(sqlite3_wasm_extra_init.c) + # # $(bin.mkwb) is used for generating much of the makefile code for the # various wasm builds. It used to be generated in this makefile via a @@ -1083,10 +1141,10 @@ sqlite3.ext.js = define gen-worker1 # $1 = X.ext part of sqlite3-worker1X.ext # $2 = $(c-pp.D.NAME) -$(call b.c-pp.target,filter,$(dir.api)/sqlite3-worker1.c-pp.js,\ - $(dir.dout)/sqlite3-worker1$(1),$(2)) -sqlite3.ext.js += $(dir.dout)/sqlite3-worker1$(1) -all: $(dir.dout)/sqlite3-worker1$(1) +$(call b.c-pp.target,filter,$$(dir.api)/sqlite3-worker1.c-pp.js,\ + $$(dir.dout)/sqlite3-worker1$(1),$(2)) +sqlite3.ext.js += $$(dir.dout)/sqlite3-worker1$(1) +all: $$(dir.dout)/sqlite3-worker1$(1) endef $(eval $(call gen-worker1,.js,$(c-pp.D.vanilla))) @@ -1100,10 +1158,10 @@ $(eval $(call gen-worker1,-bundler-friendly.mjs,$(c-pp.D.bundler))) define gen-promiser # $1 = X.ext part of sqlite3-worker1-promiserX.ext # $2 = $(c-pp.D.NAME) -$(call b.c-pp.target,filter,$(dir.api)/sqlite3-worker1-promiser.c-pp.js,\ - $(dir.dout)/sqlite3-worker1-promiser$(1),$(2)) -sqlite3.ext.js += $(dir.dout)/sqlite3-worker1-promiser$(1) -all: $(dir.dout)/sqlite3-worker1-promiser$(1) +$(call b.c-pp.target,filter,$$(dir.api)/sqlite3-worker1-promiser.c-pp.js,\ + $$(dir.dout)/sqlite3-worker1-promiser$(1),$(2)) +sqlite3.ext.js += $$(dir.dout)/sqlite3-worker1-promiser$(1) +all: $$(dir.dout)/sqlite3-worker1-promiser$(1) endef $(eval $(call gen-promiser,.js,$(c-pp.D.vanilla))) @@ -1128,13 +1186,13 @@ all: demos ####################################################################### # -# "SOAP" is a static file which is not part of the amalgamation but -# gets copied
into the build output folder and into each of the fiddle -# builds. +# "SOAP" is not part of the amalgamation but gets copied into the +# build output folder and into each of the fiddle builds. # sqlite3.ext.js += $(dir.dout)/sqlite3-opfs-async-proxy.js -$(dir.dout)/sqlite3-opfs-async-proxy.js: $(dir.api)/sqlite3-opfs-async-proxy.js - @$(call b.cp,@,$<,$@) +$(eval $(call b.c-pp.target,soap,\ + $(dir.api)/sqlite3-opfs-async-proxy.c-pp.js,\ + $(dir.dout)/sqlite3-opfs-async-proxy.js)) # # Add a dep of $(sqlite3.ext.js) on every individual build's JS file. @@ -1145,12 +1203,19 @@ $(dir.dout)/sqlite3-opfs-async-proxy.js: $(dir.api)/sqlite3-opfs-async-proxy.js # we don't otherwise have a great place to attach them such that # they're always copied when we need them. # +# The var $(out.$(B).js) comes from $(bin.mkwb) and $(B) is the name +# of a build set up by that tool, e.g. b-vanilla or b-esm64. +# $(foreach B,$(b.names),$(eval $(out.$(B).js): $(sqlite3.ext.js))) + # # b-all: builds all available js/wasm builds. # $(foreach B,$(b.names),$(eval b-all: $(out.$(B).js))) +#$(foreach B,$(b.names),$(eval pre: $(pre-js.$(B).js))) +$(foreach B,$(b.names),$(eval post: $(post-js.$(B).js))) + # # speedtest1 is our primary benchmarking tool. # @@ -1160,7 +1225,7 @@ $(foreach B,$(b.names),$(eval b-all: $(out.$(B).js))) # These flags get applied via $(bin.mkwb). emcc.speedtest1.common = $(emcc_opt_full) emcc.speedtest1 = -I. -I$(dir $(sqlite3.canonical.c)) -emcc.speedtest1 += -sENVIRONMENT=web +emcc.speedtest1 += -sENVIRONMENT=web,worker emcc.speedtest1 += -sALLOW_MEMORY_GROWTH emcc.speedtest1 += -sINITIAL_MEMORY=$(emcc.INITIAL_MEMORY.32) emcc.speedtest1.common += -sINVOKE_RUN=0 @@ -1199,28 +1264,37 @@ speedtest1.exit-runtime1 = -sEXIT_RUNTIME=1 # -sEXIT_RUNTIME=1 but we need EXIT_RUNTIME=0 for the worker-based app # which runs speedtest1 multiple times. 
-$(EXPORTED_FUNCTIONS.speedtest1): $(EXPORTED_FUNCTIONS.api.core) - @$(call b.echo,@,$(emo.disk)); \ - $(call b.mkdir@); \ - { echo _wasm_main; cat $(EXPORTED_FUNCTIONS.api.core); } > $@ || exit +$(EXPORTED_FUNCTIONS.speedtest1): $(EXPORTED_FUNCTIONS.api) + @$(call b.mkdir@); $(call b.echo,@,$(emo.disk)); \ + { echo _wasm_main; cat $(EXPORTED_FUNCTIONS.api); } > $@ || exit speedtest1: b-speedtest1 +st: speedtest1 # # Generate 64-bit variants of speedtest1*.{js,html} # +# $1 = input file +# $2 = output file +# +# TODO: preprocess these like we do the rest. +# define gen-st64 $(2): $(1) @$$(call b.echo,speedtest164,$$(emo.disk)$(emo.lock) Creating from $$<) - @rm -f $$@; \ - sed -e 's/$(3)\.js/$(3)-64bit\.js/' < $$< > $$@; \ + rm -f $$@; \ + sed -e 's/speedtest1\.js/speedtest1-64bit\.js/' \ + -e 's/speedtest1-worker\.js/speedtest1-worker-64bit\.js/' \ + < $$< > $$@; \ chmod -w $$@ -b-speedtest164: $(2) +$(2): b-speedtest164 +speedtest1: $(1) $(2) CLEAN_FILES += $(2) endef +speedtest1: b-speedtest164 -$(eval $(call gen-st64,speedtest1.html,speedtest1-64bit.html,speedtest1)) -$(eval $(call gen-st64,speedtest1-worker.html,speedtest1-worker-64bit.html,speedtest1-worker)) -$(eval $(call gen-st64,speedtest1-worker.js,speedtest1-worker-64bit.js,speedtest1-worker)) +$(eval $(call gen-st64,speedtest1.html,speedtest1-64bit.html)) +$(eval $(call gen-st64,speedtest1-worker.html,speedtest1-worker-64bit.html)) +$(eval $(call gen-st64,speedtest1-worker.js,speedtest1-worker-64bit.js)) # end speedtest1.js ######################################################################## @@ -1244,9 +1318,11 @@ $(eval $(call gen-st64,speedtest1-worker.js,speedtest1-worker-64bit.js,speedtest # # To create those, we filter tester1.c-pp.js/html with $(bin.c-pp)... 
+# # tester1.js variants: +# define gen-tester1.js -# $1 = build name to have dep on +# $1 = build name to have a dep on # $2 = suffix for tester1SUFFIX JS # $3 = $(bin.c-pp) flags $(call b.c-pp.target,test,tester1.c-pp.js,tester1$(2),$(3)) @@ -1255,20 +1331,18 @@ tester1-$(1): tester1$(2) tester1: tester1$(2) endef -$(eval $(call gen-tester1.js,vanilla,.js, \ - $(c-pp.D.vanilla) \ - -Dsqlite3.js=$(dir.dout)/sqlite3.js)) -$(eval $(call gen-tester1.js,vanilla64,-64bit.js, \ - $(c-pp.D.vanilla64) \ - -Dsqlite3.js=$(dir.dout)/sqlite3-64bit.js)) -$(eval $(call gen-tester1.js,esm,.mjs, \ - $(c-pp.D.esm) \ - -Dsqlite3.js=$(dir.dout)/sqlite3.mjs)) -$(eval $(call gen-tester1.js,esm64,-64bit.mjs, \ - $(c-pp.D.esm64) \ - -Dsqlite3.js=$(dir.dout)/sqlite3-64bit.mjs)) +$(eval $(call gen-tester1.js,vanilla,.js,\ + $(c-pp.D.vanilla) -Dsqlite3.js=$(dir.dout)/sqlite3.js)) +$(eval $(call gen-tester1.js,vanilla64,-64bit.js,\ + $(c-pp.D.vanilla64) -Dsqlite3.js=$(dir.dout)/sqlite3-64bit.js)) +$(eval $(call gen-tester1.js,esm,.mjs,\ + $(c-pp.D.esm) -Dsqlite3.js=$(dir.dout)/sqlite3.mjs)) +$(eval $(call gen-tester1.js,esm64,-64bit.mjs,\ + $(c-pp.D.esm64) -Dsqlite3.js=$(dir.dout)/sqlite3-64bit.mjs)) +# # tester1.html variants: +# define gen-tester1.html # $1 = build name to have a dep on # $2 = filename suffix: empty, -64bit, -esm, esm-64bit @@ -1303,9 +1377,11 @@ $(eval $(call gen-tester1.html,esm64,-esm-64bit,\ -Dtester1.js=tester1-64bit.mjs \ -Dsqlite3.js=$(dir.dout)/sqlite3-64bit.mjs)) -# tester1-worker.html variants: -# There is no ESM variant of this file. Instead, that page accepts a -# ?esm URL flag to switch to ESM mode. +# +# tester1-worker.html variants: There is no ESM variant of this +# file. Instead, that page accepts the ?esm URL flag to switch to ESM +# mode. 
+# $(eval $(call b.c-pp.target,test,tester1-worker.c-pp.html,\ tester1-worker.html,-Dbitness=32)) $(eval $(call b.c-pp.target,test,tester1-worker.c-pp.html,\ @@ -1315,8 +1391,57 @@ tester1-worker.html: tester1.mjs tester1-worker-64bit.html: tester1-64bit.mjs all: tester1 +# # end tester1 -######################################################################## +# + +# +# jquery.terminal support for fiddle: +# +# If a clone of https://github.com/jcubic/jquery.terminal +# is found in $(JQTERM), defaulting to $(HOME)/src/jquery.terminal +# then add jquery.terminal support to fiddle. +# +# To build that package, from its checkout dir: +# +# npm install +# make +# +c-pp.D.fiddle ?= +JQTERM ?= $(HOME)/src/jquery.terminal +dir.jqtermExt = $(firstword $(wildcard $(JQTERM))) +#$(info dir.jqtermExt=$(dir.jqtermExt)) +ifeq (0,$(MAKING_CLEAN)) +ifeq (,$(wildcard $(dir.jqtermExt)/js/jquery.terminal.min.js)) +$(info $(emo.magic) To add jquery.terminal support to fiddle, set JQTERM=/path/to/its/built/checkout) +else +$(info $(emo.magic) jquery.terminal found in $(dir.jqtermExt) - adding it to fiddle. Make sure it is built!) + +dir.jqterm = $(dir.fiddle)/jqterm +$(dir.fiddle)/jqterm/jquery.terminal.bundle.min.js: + @$(call b.mkdir@) + cat $(dir.jqtermExt)/js/jquery-1*.min.js \ + $(dir.jqtermExt)/js/jquery.terminal.min.js > $@ + +$(dir.fiddle)/jqterm/jquery.terminal.min.css: $(dir.jqtermExt)/css/jquery.terminal.min.css + @$(call b.mkdir@) + @$(call b.cp,fiddle,$<,$(dir $@)) + +$(dir.fiddle)/index.html: $(dir.fiddle)/jqterm/jquery.terminal.bundle.min.js \ + $(dir.fiddle)/jqterm/jquery.terminal.min.css +c-pp.D.fiddle += -Djqterm +endif +endif +# ^^^ JQTERM/MAKING_CLEAN + +# +# Generate fiddle/index.html. Must come after JQTERM is handled. 
+# +$(dir.fiddle)/index.html: $(dir.fiddle)/index.c-pp.html +$(eval $(call b.c-pp.target,fiddle,\ + $(dir.fiddle)/index.c-pp.html,$(dir.fiddle)/index.html,$(c-pp.D.fiddle))) +$(out.fiddle.wasm): $(dir.fiddle)/index.html + # # Convenience rules to rebuild with various -Ox levels. Much @@ -1327,6 +1452,7 @@ all: tester1 # # Achtung: build times with anything higher than -O0 are somewhat # painful, which is why -O0 is the default. +# .PHONY: o0 o1 o2 o3 os oz emcc-opt-extra = #ifeq (1,$(wasm-bare-bones)) @@ -1377,7 +1503,9 @@ push-testing: ssh wasm-testing 'cd $(wasm-testing.dir) && bash .gzip' || \ echo "SSH failed: it's likely that stale content will be served via old gzip files." +# # build everything needed by push-testing with -Oz +# .PHONY: for-testing for-testing: emcc_opt=-Oz for-testing: loud=1 @@ -1404,9 +1532,9 @@ update-docs: exit 127 else wasm.docs.jswasm = $(wasm.docs.home)/jswasm -update-docs: $(bin.stripccomments) $(out.sqlite3.js) $(out.sqlite3.wasm) +update-docs: $(bin.stripccomments) $(out.vanilla.js) $(out.vanilla.wasm) @echo "Copying files to the /wasm docs. Be sure to use an -Oz build for this!"; - cp -p $(sqlite3.wasm) $(wasm.docs.jswasm)/. + cp -p $(out.vanilla.wasm) $(wasm.docs.jswasm)/. $(bin.stripccomments) -k -k < $(out.vanilla.js) \ | sed -e '/^[ \t]*$$/d' > $(wasm.docs.jswasm)/sqlite3.js cp -p demo-123.js demo-123.html demo-123-worker.html $(wasm.docs.home)/. @@ -1462,13 +1590,46 @@ endif dist-name-prefix = sqlite-wasm$(dist-name-extra) .PHONY: dist dist: - ./mkdist.sh $(dist-name-prefix) + $(bin.bash) ./mkdist.sh $(dist-name-prefix) snapshot: - ./mkdist.sh $(dist-name-prefix) --snapshot + $(bin.bash) ./mkdist.sh $(dist-name-prefix) --snapshot endif # ^^^ making dist/snapshot CLEAN_FILES += $(wildcard sqlite-wasm-*.zip) +######################################################################## +# The npm target is specifically for preparing files for the downstream +# https://github.com/sqlite/sqlite-wasm (npm) distribution. 
+# +# Per agreement with that project's maintainers, these filenames need +# to remain stable. To avoid breakage in their deployment process, any +# changes (like renaming files) which potentially break their deployment +# need to be communicated to that project via opening a new ticket or +# direct coordination with its maintainers. +# +# This target does a full clean/rebuild so that we can ensure that the +# optimization level is set consistently across all files. +# +npm.bundle.zip = npm-bundle.zip +CLEAN_FILES += $(npm.bundle.zip) +# Distributables which need to be built for npm: +npm_files = $(addprefix $(dir.dout)/, \ +sqlite3-bundler-friendly.mjs \ +sqlite3-opfs-async-proxy.js \ +sqlite3-worker1-bundler-friendly.mjs \ +sqlite3-worker1-promiser.mjs \ +sqlite3-worker1.mjs \ +sqlite3.mjs \ +sqlite3.wasm \ +sqlite3-node.mjs \ +) +npm: $(sqlite3.canonical.c) + @echo "$(emo.cleanup) Forcing a clean rebuild to ensure consistent optimization flags." + $(MAKE) clean + $(MAKE) -e "emcc_opt=-Oz $(emcc-opt-extra)" $(npm_files) + rm -f $(npm.bundle.zip); zip -r $(npm.bundle.zip) $(npm_files) + unzip -l $(npm.bundle.zip) + ######################################################################## # Explanation of, and some commentary on, various emcc build flags # follows.
Full docs for these can be found at: diff --git a/ext/wasm/api/EXPORTED_FUNCTIONS.sqlite3-core b/ext/wasm/api/EXPORTED_FUNCTIONS.c-pp similarity index 60% rename from ext/wasm/api/EXPORTED_FUNCTIONS.sqlite3-core rename to ext/wasm/api/EXPORTED_FUNCTIONS.c-pp index 5060545102..2b8397a6d7 100644 --- a/ext/wasm/api/EXPORTED_FUNCTIONS.sqlite3-core +++ b/ext/wasm/api/EXPORTED_FUNCTIONS.c-pp @@ -13,6 +13,7 @@ _sqlite3_bind_parameter_index _sqlite3_bind_parameter_name _sqlite3_bind_pointer _sqlite3_bind_text +_sqlite3_bind_zeroblob _sqlite3_busy_handler _sqlite3_busy_timeout _sqlite3_cancel_auto_extension @@ -75,6 +76,7 @@ _sqlite3_limit _sqlite3_malloc _sqlite3_malloc64 _sqlite3_msize +_sqlite3_next_stmt _sqlite3_open _sqlite3_open_v2 _sqlite3_overload_function @@ -155,3 +157,92 @@ _sqlite3_vtab_in_next _sqlite3_vtab_nochange _sqlite3_vtab_on_conflict _sqlite3_vtab_rhs_value +//#if not bare-bones +_sqlite3_column_database_name +_sqlite3_column_origin_name +_sqlite3_column_table_name +_sqlite3_create_module +_sqlite3_create_module_v2 +_sqlite3_create_window_function +_sqlite3_declare_vtab +_sqlite3_drop_modules +_sqlite3_preupdate_blobwrite +_sqlite3_preupdate_count +_sqlite3_preupdate_depth +_sqlite3_preupdate_hook +_sqlite3_preupdate_new +_sqlite3_preupdate_old +_sqlite3_progress_handler +_sqlite3_set_authorizer +_sqlite3_vtab_collation +_sqlite3_vtab_distinct +_sqlite3_vtab_in +_sqlite3_vtab_in_first +_sqlite3_vtab_in_next +_sqlite3_vtab_nochange +_sqlite3_vtab_on_conflict +_sqlite3_vtab_rhs_value +_sqlite3changegroup_add +_sqlite3changegroup_add_strm +_sqlite3changegroup_delete +_sqlite3changegroup_new +_sqlite3changegroup_output +_sqlite3changegroup_output_strm +_sqlite3changeset_apply +_sqlite3changeset_apply_strm +_sqlite3changeset_apply_v2 +_sqlite3changeset_apply_v2_strm +_sqlite3changeset_apply_v3 +_sqlite3changeset_apply_v3_strm +_sqlite3changeset_concat +_sqlite3changeset_concat_strm +_sqlite3changeset_conflict +_sqlite3changeset_finalize 
+_sqlite3changeset_fk_conflicts +_sqlite3changeset_invert +_sqlite3changeset_invert_strm +_sqlite3changeset_new +_sqlite3changeset_next +_sqlite3changeset_old +_sqlite3changeset_op +_sqlite3changeset_pk +_sqlite3changeset_start +_sqlite3changeset_start_strm +_sqlite3changeset_start_v2 +_sqlite3changeset_start_v2_strm +_sqlite3session_attach +_sqlite3session_changeset +_sqlite3session_changeset_size +_sqlite3session_changeset_strm +_sqlite3session_config +_sqlite3session_create +_sqlite3session_delete +_sqlite3session_diff +_sqlite3session_enable +_sqlite3session_indirect +_sqlite3session_isempty +_sqlite3session_memory_used +_sqlite3session_object_config +_sqlite3session_patchset +_sqlite3session_patchset_strm +_sqlite3session_table_filter +//#/if not bare-bones +//#if enable-see +_sqlite3_key +_sqlite3_key_v2 +_sqlite3_rekey +_sqlite3_rekey_v2 +_sqlite3_activate_see +//#/if enable-see +//#if fiddle +_fiddle_db_arg +_fiddle_db_filename +_fiddle_exec +_fiddle_experiment +_fiddle_interrupt +_fiddle_main +_fiddle_reset_db +_fiddle_db_handle +_fiddle_db_vfs +_fiddle_export_db +//#/if fiddle diff --git a/ext/wasm/api/EXPORTED_FUNCTIONS.sqlite3-extras b/ext/wasm/api/EXPORTED_FUNCTIONS.sqlite3-extras deleted file mode 100644 index e8304b5f2a..0000000000 --- a/ext/wasm/api/EXPORTED_FUNCTIONS.sqlite3-extras +++ /dev/null @@ -1,68 +0,0 @@ -_sqlite3_column_database_name -_sqlite3_column_origin_name -_sqlite3_column_table_name -_sqlite3_create_module -_sqlite3_create_module_v2 -_sqlite3_create_window_function -_sqlite3_declare_vtab -_sqlite3_drop_modules -_sqlite3_preupdate_blobwrite -_sqlite3_preupdate_count -_sqlite3_preupdate_depth -_sqlite3_preupdate_hook -_sqlite3_preupdate_new -_sqlite3_preupdate_old -_sqlite3_progress_handler -_sqlite3_set_authorizer -_sqlite3_vtab_collation -_sqlite3_vtab_distinct -_sqlite3_vtab_in -_sqlite3_vtab_in_first -_sqlite3_vtab_in_next -_sqlite3_vtab_nochange -_sqlite3_vtab_on_conflict -_sqlite3_vtab_rhs_value -_sqlite3changegroup_add 
-_sqlite3changegroup_add_strm -_sqlite3changegroup_delete -_sqlite3changegroup_new -_sqlite3changegroup_output -_sqlite3changegroup_output_strm -_sqlite3changeset_apply -_sqlite3changeset_apply_strm -_sqlite3changeset_apply_v2 -_sqlite3changeset_apply_v2_strm -_sqlite3changeset_apply_v3 -_sqlite3changeset_apply_v3_strm -_sqlite3changeset_concat -_sqlite3changeset_concat_strm -_sqlite3changeset_conflict -_sqlite3changeset_finalize -_sqlite3changeset_fk_conflicts -_sqlite3changeset_invert -_sqlite3changeset_invert_strm -_sqlite3changeset_new -_sqlite3changeset_next -_sqlite3changeset_old -_sqlite3changeset_op -_sqlite3changeset_pk -_sqlite3changeset_start -_sqlite3changeset_start_strm -_sqlite3changeset_start_v2 -_sqlite3changeset_start_v2_strm -_sqlite3session_attach -_sqlite3session_changeset -_sqlite3session_changeset_size -_sqlite3session_changeset_strm -_sqlite3session_config -_sqlite3session_create -_sqlite3session_delete -_sqlite3session_diff -_sqlite3session_enable -_sqlite3session_indirect -_sqlite3session_isempty -_sqlite3session_memory_used -_sqlite3session_object_config -_sqlite3session_patchset -_sqlite3session_patchset_strm -_sqlite3session_table_filter diff --git a/ext/wasm/api/EXPORTED_FUNCTIONS.sqlite3-see b/ext/wasm/api/EXPORTED_FUNCTIONS.sqlite3-see deleted file mode 100644 index 83f3a97dbc..0000000000 --- a/ext/wasm/api/EXPORTED_FUNCTIONS.sqlite3-see +++ /dev/null @@ -1,5 +0,0 @@ -_sqlite3_key -_sqlite3_key_v2 -_sqlite3_rekey -_sqlite3_rekey_v2 -_sqlite3_activate_see diff --git a/ext/wasm/api/EXPORTED_RUNTIME_METHODS.sqlite3-api b/ext/wasm/api/EXPORTED_RUNTIME_METHODS.sqlite3-api deleted file mode 100644 index aab1d8bd37..0000000000 --- a/ext/wasm/api/EXPORTED_RUNTIME_METHODS.sqlite3-api +++ /dev/null @@ -1,3 +0,0 @@ -FS -wasmMemory - diff --git a/ext/wasm/api/README.md b/ext/wasm/api/README.md index 3c9669e6ba..279d216bf0 100644 --- a/ext/wasm/api/README.md +++ b/ext/wasm/api/README.md @@ -23,7 +23,7 @@ this writing, but is not set in stone 
forever and may change at any time. This doc targets maintainers of this code and those wanting to dive in to the details, not end user. -First off, a pikchr of the proverbial onion: +First off, a [pikchr][] of the proverbial onion: ```pikchr toggle center scale = 0.85 @@ -60,14 +60,14 @@ maintenance point of view. At the center of the onion is `sqlite3-api.js`, which gets generated by concatenating the following files together in their listed order: -- **`sqlite3-api-prologue.js`**\ +- **`sqlite3-api-prologue.js`** Contains the initial bootstrap setup of the sqlite3 API - objects. This is exposed as a function, rather than objects, so that + objects. This is exposed as a bootstrapping function so that the next step can pass in a config object which abstracts away parts of the WASM environment, to facilitate plugging it in to arbitrary WASM toolchains. The bootstrapping function gets removed from the global scope in a later stage of the bootstrapping process. -- **`../common/whwasmutil.js`**\ +- **`../common/whwasmutil.js`** A semi-third-party collection of JS/WASM utility code intended to replace much of the Emscripten glue. The sqlite3 APIs internally use these APIs instead of their Emscripten counterparts, in order to be @@ -77,79 +77,78 @@ by concatenating the following files together in their listed order: toolchains. It is "semi-third-party" in that it was created in order to support this tree but is standalone and maintained together with... -- **`../jaccwabyt/jaccwabyt.js`**\ +- **`../jaccwabyt/jaccwabyt.js`** Another semi-third-party API which creates bindings between JS and C structs, such that changes to the struct state from either JS or C are visible to the other end of the connection. This is also an independent spinoff project, conceived for the sqlite3 project but maintained separately. 
-- **`sqlite3-api-glue.js`**\ +- **`sqlite3-api-glue.js`** Invokes functionality exposed by the previous two files to flesh out low-level parts of `sqlite3-api-prologue.js`. Most of these pieces involve populating the `sqlite3.capi.wasm` object and creating `sqlite3.capi.sqlite3_...()` bindings. This file also deletes most global-scope symbols the above files create, effectively moving them into the scope being used for initializing the API. -- **`/sqlite3-api-build-version.js`**\ +- **`/sqlite3-api-build-version.js`** Gets created by the build process and populates the `sqlite3.version` object. This part is not critical, but records the version of the library against which this module was built. -- **`sqlite3-api-oo1.js`**\ +- **`sqlite3-api-oo1.js`** Provides a high-level object-oriented wrapper to the lower-level C API, colloquially known as OO API #1. Its API is similar to other high-level sqlite3 JS wrappers and should feel relatively familiar to anyone familiar with such APIs. It is not a "required component" and can be elided from builds which do not want it. -- **`sqlite3-api-worker1.js`**\ +- **`sqlite3-api-worker1.js`** A Worker-thread-based API which uses OO API #1 to provide an interface to a database which can be driven from the main Window thread via the Worker message-passing interface. Like OO API #1, this is an optional component, offering one of any number of potential implementations for such an API. - - **`sqlite3-worker1.js`**\ + - **`sqlite3-worker1.js`** Is not part of the amalgamated sources and is intended to be loaded by a client Worker thread. It loads the sqlite3 module and runs the Worker #1 API which is implemented in `sqlite3-api-worker1.js`. - - **`sqlite3-worker1-promiser.js`**\ + - **`sqlite3-worker1-promiser.js`** Is likewise not part of the amalgamated sources and provides a Promise-based interface into the Worker #1 API. This is a far user-friendlier way to interface with databases running in a Worker thread. 
-- **`sqlite3-vfs-helper.js`**\ +- **`sqlite3-vfs-helper.c-pp.js`** Installs the `sqlite3.vfs` namespace, which contain helpers for use by downstream code which creates `sqlite3_vfs` implementations. -- **`sqlite3-vtab-helper.js`**\ +- **`sqlite3-vtab-helper.c-pp.js`** Installs the `sqlite3.vtab` namespace, which contain helpers for use by downstream code which creates `sqlite3_module` implementations. -- **`sqlite3-vfs-opfs.c-pp.js`**\ +- **`sqlite3-vfs-opfs.c-pp.js`** is an sqlite3 VFS implementation which supports the [Origin-Private FileSystem (OPFS)][OPFS] as a storage layer to provide persistent storage for database files in a browser. It requires... - - **`sqlite3-opfs-async-proxy.js`**\ + - **`sqlite3-opfs-async-proxy.js`** is the asynchronous backend part of the [OPFS][] proxy. It speaks directly to the (async) OPFS API and channels those results back to its synchronous counterpart. This file, because it must be started in its own Worker, is not part of the amalgamation. -- **`sqlite3-vfs-opfs-sahpool.c-pp.js`**\ +- **`sqlite3-vfs-opfs-sahpool.c-pp.js`** is another sqlite3 VFS supporting the [OPFS][], but uses a completely different approach than the above-listed one. -- **`sqlite3-api-cleanup.js`**\ - The previous files do not immediately extend the library. Instead - they add callback functions to be called during its - bootstrapping. Some also temporarily create global objects in order - to communicate their state to the files which follow them. This file - cleans up any dangling globals and runs the API bootstrapping - process, which is what finally executes the initialization code - installed by the previous files. As of this writing, this code - ensures that the previous files leave no more than a single global - symbol installed - `sqlite3InitModule()`. When adapting the API for - non-Emscripten toolchains, this "should" be the only file, of those - in this list, where changes are needed. 
The Emscripten-specific - pieces described below may also require counterparts in any as-yet - hypothetical alternative build. +The previous files do not immediately extend the library. Instead they +install a global function `sqlite3ApiBootstrap()`, which downstream +code must call to configure the library for the current JS/WASM +environment. Each file listed above pushes a callback into the +bootstrapping queue, to be called as part of `sqlite3ApiBootstrap()`. +Some files also temporarily create global objects in order to +communicate their state to the files which follow them. Those +get cleaned up via `post-js-footer.js`, described below. + +Adapting the build for non-Emscripten toolchains essentially requires packaging +the above files, concatenated together, into that toolchain's "JS glue" +and, in the final stage of that glue, calling `sqlite3ApiBootstrap()` and +returning its result to the end user. **Files with the extension `.c-pp.js`** are intended [to be processed with `c-pp`](#c-pp), noting that such preprocessing may be applied @@ -171,23 +170,27 @@ from this file rather than `sqlite3.c`. The following Emscripten-specific files are injected into the build-generated `sqlite3.js` along with `sqlite3-api.js`. -- **`extern-pre-js.js`**\ +- **`extern-pre-js.js`** Emscripten-specific header for Emscripten's `--extern-pre-js` flag. As of this writing, that file is only used for experimentation purposes and holds no code relevant to the production deliverables. -- **`pre-js.c-pp.js`**\ +- **`pre-js.c-pp.js`** Emscripten-specific header for Emscripten's `--pre-js` flag. This file overrides certain Emscripten behavior before Emscripten does most of its work. -- **`post-js-header.js`**\ +- **`post-js-header.js`** Emscripten-specific header for the `--post-js` input. It opens up, but does not close, a function used for initializing the library. -- (**`sqlite3-api.js`** gets sandwiched between these ↑ two - ↓ files.)
-- **`post-js-footer.js`**\ +- **`sqlite3-api.js`** gets sandwiched between these ↑ two + ↓ files. +- **`post-js-footer.js`** Emscripten-specific footer for the `--post-js` input. This closes - off the function opened by `post-js-header.js`. -- **`extern-post-js.c-pp.js`**\ + off the function opened by `post-js-header.js`. This file cleans up + any dangling globals and runs `sqlite3ApiBootstrap()`. As of this + writing, this code ensures that the previous files leave no more + than a single global symbol installed - `sqlite3InitModule()`. + +- **`extern-post-js.c-pp.js`** Emscripten-specific header for Emscripten's `--extern-post-js` flag. This file is run in the global scope. It overwrites the Emscripten-installed `sqlite3InitModule()` function with one which @@ -205,9 +208,10 @@ Preprocessing of Source Files Certain files in the build require preprocessing to filter in/out parts which differ between vanilla JS, ES6 Modules, and node.js builds. The preprocessor application itself is in -[`c-pp.c`](/file/ext/wasm/c-pp.c) and the complete technical details -of such preprocessing are maintained in +[`c-pp-lite.c`](/file/ext/wasm/c-pp-lite.c) and the complete technical +details of such preprocessing are maintained in [`GNUMakefile`](/file/ext/wasm/GNUmakefile). 
[OPFS]: https://developer.mozilla.org/en-US/docs/Web/API/File_System_API/Origin_private_file_system +[pikchr]: https://pikchr.org diff --git a/ext/wasm/api/extern-post-js.c-pp.js b/ext/wasm/api/extern-post-js.c-pp.js index 606e02ae28..b2e760d6a0 100644 --- a/ext/wasm/api/extern-post-js.c-pp.js +++ b/ext/wasm/api/extern-post-js.c-pp.js @@ -14,7 +14,7 @@ */ //#if target:es6-module const toExportForESM = -//#endif +//#/if (function(){ //console.warn("this is extern-post-js"); /** @@ -26,7 +26,7 @@ const toExportForESM = */ const originalInit = sqlite3InitModule; if(!originalInit){ - throw new Error("Expecting globalThis.sqlite3InitModule to be defined by the Emscripten build."); + throw new Error("Expecting sqlite3InitModule to be defined by the Emscripten build."); } /** We need to add some state which our custom Module.locateFile() @@ -73,6 +73,8 @@ const toExportForESM = const sIM = globalThis.sqlite3InitModule = function ff(...args){ //console.warn("Using replaced sqlite3InitModule()",globalThis.location); + sIMS.emscriptenLocateFile = args[0]?.locateFile /* see pre-js.c-pp.js [tag:locateFile] */; + sIMS.emscriptenInstantiateWasm = args[0]?.instantiateWasm /* see pre-js.c-pp.js [tag:locateFile] */; return originalInit(...args).then((EmscriptenModule)=>{ sIMS.debugModule("sqlite3InitModule() sIMS =",sIMS); sIMS.debugModule("sqlite3InitModule() EmscriptenModule =",EmscriptenModule); @@ -95,7 +97,7 @@ const toExportForESM = //console.warn("sqlite3InitModule() returning E-module.",EmscriptenModule); return EmscriptenModule; } -//#endif +//#/if return s; }).catch((e)=>{ console.error("Exception loading sqlite3 module:",e); @@ -126,10 +128,10 @@ const toExportForESM = } /* AMD modules get injected in a way we cannot override, so we can't handle those here. 
*/ -//#endif // !target:es6-module +//#/if // !target:es6-module return sIM; })(); //#if target:es6-module sqlite3InitModule = toExportForESM; export default sqlite3InitModule; -//#endif +//#/if diff --git a/ext/wasm/api/opfs-common-inline.c-pp.js b/ext/wasm/api/opfs-common-inline.c-pp.js new file mode 100644 index 0000000000..74e911e56d --- /dev/null +++ b/ext/wasm/api/opfs-common-inline.c-pp.js @@ -0,0 +1,191 @@ +//#if 0 +/** + This file is for preprocessor #include into the "opfs" and + "opfs-wl" impls, as well as their async-proxy part. It must be + inlined in those files, as opposed to being a shared copy in the + library, because (A) the async proxy does not load the library and + (B) it references an object which is local to each of those files + but which has a 99% identical structure for each. +*/ +//#/if +//#// vfs.metrics.enable is a refactoring crutch. +//#define vfs.metrics.enable 0 +const initS11n = function(){ + /** + This proxy de/serializes cross-thread function arguments and + output-pointer values via the state.sabIO SharedArrayBuffer, + using the region defined by (state.sabS11nOffset, + state.sabS11nOffset + state.sabS11nSize]. Only one dataset is + recorded at a time. + + This is not a general-purpose format. It only supports the + range of operations, and data sizes, needed by the + sqlite3_vfs and sqlite3_io_methods operations. Serialized + data are transient and this serialization algorithm may + change at any time. + + The data format can be succinctly summarized as: + + Nt...Td...D + + Where: + + - N = number of entries (1 byte) + + - t = type ID of first argument (1 byte) + + - ...T = type IDs of the 2nd and subsequent arguments (1 byte + each). + + - d = raw bytes of first argument (per-type size). + + - ...D = raw bytes of the 2nd and subsequent arguments (per-type + size). + + All types except strings have fixed sizes. Strings are stored + using their TextEncoder/TextDecoder representations. 
It would + arguably make more sense to store them as Int16Arrays of + their JS character values, but how best/fastest to get that + in and out of string form is an open point. Initial + experimentation with that approach did not gain us any speed. + + Historical note: this impl was initially about 1% this size by + using JSON.stringify/parse(), but using fit-to-purpose + serialization saves considerable runtime. + */ + if(state.s11n) return state.s11n; + const textDecoder = new TextDecoder(), + textEncoder = new TextEncoder('utf-8'), + viewU8 = new Uint8Array(state.sabIO, state.sabS11nOffset, state.sabS11nSize), + viewDV = new DataView(state.sabIO, state.sabS11nOffset, state.sabS11nSize); + state.s11n = Object.create(null); + /* Only arguments and return values of these types may be + serialized. This covers the whole range of types needed by the + sqlite3_vfs API. */ + const TypeIds = Object.create(null); + TypeIds.number = { id: 1, size: 8, getter: 'getFloat64', setter: 'setFloat64' }; + TypeIds.bigint = { id: 2, size: 8, getter: 'getBigInt64', setter: 'setBigInt64' }; + TypeIds.boolean = { id: 3, size: 4, getter: 'getInt32', setter: 'setInt32' }; + TypeIds.string = { id: 4 }; + + const getTypeId = (v)=>( + TypeIds[typeof v] + || toss("Maintenance required: this value type cannot be serialized.",v) + ); + const getTypeIdById = (tid)=>{ + switch(tid){ + case TypeIds.number.id: return TypeIds.number; + case TypeIds.bigint.id: return TypeIds.bigint; + case TypeIds.boolean.id: return TypeIds.boolean; + case TypeIds.string.id: return TypeIds.string; + default: toss("Invalid type ID:",tid); + } + }; + + /** + Returns an array of the deserialized state stored by the most + recent serialize() operation (from this thread or the + counterpart thread), or null if the serialization buffer is + empty. If passed a truthy argument, the serialization buffer + is cleared after deserialization. 
+ */ + state.s11n.deserialize = function(clear=false){ +//#if vfs.metrics.enable + ++metrics.s11n.deserialize.count; +//#/if + const t = performance.now(); + const argc = viewU8[0]; + const rc = argc ? [] : null; + if(argc){ + const typeIds = []; + let offset = 1, i, n, v; + for(i = 0; i < argc; ++i, ++offset){ + typeIds.push(getTypeIdById(viewU8[offset])); + } + for(i = 0; i < argc; ++i){ + const t = typeIds[i]; + if(t.getter){ + v = viewDV[t.getter](offset, state.littleEndian); + offset += t.size; + }else{/*String*/ + n = viewDV.getInt32(offset, state.littleEndian); + offset += 4; + v = textDecoder.decode(viewU8.slice(offset, offset+n)); + offset += n; + } + rc.push(v); + } + } + if(clear) viewU8[0] = 0; + //log("deserialize:",argc, rc); +//#if vfs.metrics.enable + metrics.s11n.deserialize.time += performance.now() - t; +//#/if + return rc; + }; + + /** + Serializes all arguments to the shared buffer for consumption + by the counterpart thread. + + This routine is only intended for serializing OPFS VFS + arguments and (in at least one special case) result values, + and the buffer is sized to be able to comfortably handle + those. + + If passed no arguments then it zeroes out the serialization + state. + */ + state.s11n.serialize = function(...args){ + const t = performance.now(); +//#if vfs.metrics.enable + ++metrics.s11n.serialize.count; +//#/if + if(args.length){ + //log("serialize():",args); + const typeIds = []; + let i = 0, offset = 1; + viewU8[0] = args.length & 0xff /* header = # of args */; + for(; i < args.length; ++i, ++offset){ + /* Write the TypeIds.id value into the next args.length + bytes. */ + typeIds.push(getTypeId(args[i])); + viewU8[offset] = typeIds[i].id; + } + for(i = 0; i < args.length; ++i) { + /* Write the raw bytes of each argument based on its + corresponding TypeIds.id from the header. 
*/ + const t = typeIds[i]; + if(t.setter){ + viewDV[t.setter](offset, args[i], state.littleEndian); + offset += t.size; + }else{/*String*/ + const s = textEncoder.encode(args[i]); + viewDV.setInt32(offset, s.byteLength, state.littleEndian); + offset += 4; + viewU8.set(s, offset); + offset += s.byteLength; + } + } + //log("serialize() result:",viewU8.slice(0,offset)); + }else{ + viewU8[0] = 0; + } +//#if vfs.metrics.enable + metrics.s11n.serialize.time += performance.now() - t; +//#/if + }; + +//#if defined opfs-async-proxy + state.s11n.storeException = state.asyncS11nExceptions + ? ((priority,e)=>{ + if(priority<=state.asyncS11nExceptions){ + state.s11n.serialize([e.name,': ',e.message].join("")); + } + }) + : ()=>{}; +//#/if + + return state.s11n; +//#undef vfs.metrics.enable +}/*initS11n()*/; diff --git a/ext/wasm/api/opfs-common-shared.c-pp.js b/ext/wasm/api/opfs-common-shared.c-pp.js new file mode 100644 index 0000000000..24ae2632fb --- /dev/null +++ b/ext/wasm/api/opfs-common-shared.c-pp.js @@ -0,0 +1,1301 @@ +//#if not target:node +/* + 2026-03-04 + + The author disclaims copyright to this source code. In place of a + legal notice, here is a blessing: + + * May you do good and not evil. + * May you find forgiveness for yourself and forgive others. + * May you share freely, never taking more than you give. + + *********************************************************************** + + This file holds code shared by sqlite3-vfs-opfs{,-wl}.c-pp.js. It + creates a private/internal sqlite3.opfs namespace common to the two + and used (only) by them and the test framework. It is not part of + the public API. The library deletes sqlite3.opfs in its final + bootstrapping steps unless it's specifically told to keep them (for + testing purposes only) using an undocumented and unsupported + mechanism. 
+*/ +globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ + 'use strict'; + if( sqlite3.config.disable?.vfs?.opfs && + sqlite3.config.disable.vfs['opfs-vfs'] ){ + return; + } + const toss = sqlite3.util.toss, + capi = sqlite3.capi, + util = sqlite3.util, + wasm = sqlite3.wasm; + + /** + Generic utilities for working with OPFS. This will get filled out + by the Promise setup and, on success, installed as sqlite3.opfs. + + This is an internal/private namespace intended for use solely by + the OPFS VFSes and test code for them. The library bootstrapping + process removes this object in non-testing contexts. + */ + const opfsUtil = sqlite3.opfs = Object.create(null); + + /** + Returns true if _this_ thread has access to the OPFS APIs. + */ + opfsUtil.thisThreadHasOPFS = ()=>{ + return globalThis.FileSystemHandle && + globalThis.FileSystemDirectoryHandle && + globalThis.FileSystemFileHandle && + globalThis.FileSystemFileHandle.prototype.createSyncAccessHandle && + navigator?.storage?.getDirectory; + }; + + /** + Must be called by the OPFS VFSes immediately after they determine + whether OPFS is available by calling + thisThreadHasOPFS(). Resolves to the OPFS storage root directory + and sets opfsUtil.rootDirectory to that value. + */ + opfsUtil.getRootDir = async function f(){ + return f.promise ??= navigator.storage.getDirectory().then(d=>{ + opfsUtil.rootDirectory = d; + return d; + }).catch(e=>{ + delete f.promise; + throw e; + }); + }; + + /** + Expects an OPFS file path. It gets resolved, such that ".." + components are properly expanded, and returned. If the 2nd arg + is true, the result is returned as an array of path elements, + else an absolute path string is returned. + */ + opfsUtil.getResolvedPath = function(filename,splitIt){ + const p = new URL(filename, "file://irrelevant").pathname; + return splitIt ? p.split('/').filter((v)=>!!v) : p; + }; + + /** + Takes the absolute path to a filesystem element. 
Returns an + array of [handleOfContainingDir, filename]. If the 2nd argument + is truthy then each directory element leading to the file is + created along the way. Throws if any creation or resolution + fails. + */ + opfsUtil.getDirForFilename = async function f(absFilename, createDirs = false){ + const path = opfsUtil.getResolvedPath(absFilename, true); + const filename = path.pop(); + let dh = await opfsUtil.getRootDir(); + for(const dirName of path){ + if(dirName){ + dh = await dh.getDirectoryHandle(dirName, {create: !!createDirs}); + } + } + return [dh, filename]; + }; + + /** + Creates the given directory name, recursively, in + the OPFS filesystem. Returns true if it succeeds or the + directory already exists, else false. + */ + opfsUtil.mkdir = async function(absDirName){ + try { + await opfsUtil.getDirForFilename(absDirName+"/filepart", true); + return true; + }catch(e){ + //sqlite3.config.warn("mkdir(",absDirName,") failed:",e); + return false; + } + }; + + /** + Checks whether the given OPFS filesystem entry exists, + returning true if it does, false if it doesn't or if an + exception is intercepted while trying to make the + determination. + */ + opfsUtil.entryExists = async function(fsEntryName){ + try { + const [dh, fn] = await opfsUtil.getDirForFilename(fsEntryName); + await dh.getFileHandle(fn); + return true; + }catch(e){ + return false; + } + }; + + /** + Generates a random ASCII string len characters long, intended for + use as a temporary file name. + */ + opfsUtil.randomFilename = function f(len=16){ + if(!f._chars){ + f._chars = "abcdefghijklmnopqrstuvwxyz"+ + "ABCDEFGHIJKLMNOPQRSTUVWXYZ"+ + "0123456789"; + f._n = f._chars.length; + } + const a = []; + let i = 0; + for( ; i < len; ++i){ + const ndx = Math.random() * (f._n * 64) % f._n | 0; + a[i] = f._chars[ndx]; + } + return a.join(""); + /* + An alternative impl. 
with an unpredictable length + but much simpler: + + Math.floor(Math.random() * Number.MAX_SAFE_INTEGER).toString(36) + */ + }; + + /** + Returns a promise which resolves to an object which represents + all files and directories in the OPFS tree. The top-most object + has two properties: `dirs` is an array of directory entries + (described below) and `files` is a list of file names for all + files in that directory. + + Traversal starts at sqlite3.opfs.rootDirectory. + + Each `dirs` entry is an object in this form: + + ``` + { name: directoryName, + dirs: [...subdirs], + files: [...file names] + } + ``` + + The `files` and `subdirs` entries are always set but may be + empty arrays. + + The returned object has the same structure but its `name` is + an empty string. All returned objects are created with + Object.create(null), so have no prototype. + + Design note: the entries do not contain more information, + e.g. file sizes, because getting such info is not only + expensive but is subject to locking-related errors. + */ + opfsUtil.treeList = async function(){ + const doDir = async function callee(dirHandle,tgt){ + tgt.name = dirHandle.name; + tgt.dirs = []; + tgt.files = []; + for await (const handle of dirHandle.values()){ + if('directory' === handle.kind){ + const subDir = Object.create(null); + tgt.dirs.push(subDir); + await callee(handle, subDir); + }else{ + tgt.files.push(handle.name); + } + } + }; + const root = Object.create(null); + const dir = await opfsUtil.getRootDir(); + await doDir(dir, root); + return root; + }; + + /** + Irrevocably deletes _all_ files in the current origin's OPFS. + Obviously, this must be used with great caution. It may throw + an exception if removal of anything fails (e.g. a file is + locked), but the precise conditions under which the underlying + APIs will throw are not documented (so we cannot tell you what + they are). 
+ */ + opfsUtil.rmfr = async function(){ + const rd = await opfsUtil.getRootDir(); + const dir = rd, opt = {recursive: true}; + for await (const handle of dir.values()){ + await dir.removeEntry(handle.name, opt); + } + }; + + /** + Deletes the given OPFS filesystem entry. As this environment + has no notion of "current directory", the given name must be an + absolute path. If the 2nd argument is truthy, deletion is + recursive (use with caution!). + + The returned Promise resolves to true if the deletion was + successful, else false (but...). The OPFS API reports the + reason for the failure only in human-readable form, not + exceptions which can be type-checked to determine the + failure. Because of that... + + If the final argument is truthy then this function will + propagate any exception on error, rather than returning false. + */ + opfsUtil.unlink = async function(fsEntryName, recursive = false, + throwOnError = false){ + try { + const [hDir, filenamePart] = + await opfsUtil.getDirForFilename(fsEntryName, false); + await hDir.removeEntry(filenamePart, {recursive}); + return true; + }catch(e){ + if(throwOnError){ + throw new Error("unlink("+fsEntryName+") failed: "+e.message,{ + cause: e + }); + } + return false; + } + }; + + /** + Traverses the OPFS filesystem, calling a callback for each + entry. The argument may be either a callback function or an + options object with any of the following properties: + + - `callback`: function which gets called for each filesystem + entry. It gets passed 3 arguments: 1) the + FileSystemFileHandle or FileSystemDirectoryHandle of each + entry (noting that both are instanceof FileSystemHandle). 2) + the FileSystemDirectoryHandle of the parent directory. 3) the + current depth level, with 0 being at the top of the tree + relative to the starting directory. If the callback returns a + literal false, as opposed to any other falsy value, traversal + stops without an error. Any exceptions it throws are + propagated. 
Results are undefined if the callback manipulates + the filesystem (e.g. removing or adding entries) because + how OPFS iterators behave in the face of such changes is + undocumented. + + - `recursive` [bool=true]: specifies whether to recurse into + subdirectories or not. Whether recursion is depth-first or + breadth-first is unspecified! + + - `directory` [FileSystemDirectoryHandle=sqlite3.opfs.rootDirectory] + specifies the starting directory. + + If this function is passed a function, it is assumed to be the + callback. + + Returns a promise because it has to (by virtue of being async) + but that promise has no specific meaning: the traversal it + performs is synchronous. The promise must be used to catch any + exceptions propagated by the callback, however. + */ + opfsUtil.traverse = async function(opt){ + const defaultOpt = { + recursive: true, + directory: await opfsUtil.getRootDir() + }; + if('function'===typeof opt){ + opt = {callback:opt}; + } + opt = Object.assign(defaultOpt, opt||{}); + const doDir = async function callee(dirHandle, depth){ + for await (const handle of dirHandle.values()){ + if(false === opt.callback(handle, dirHandle, depth)) return false; + else if(opt.recursive && 'directory' === handle.kind){ + if(false === await callee(handle, depth + 1)) return false; + } + } + }; + await doDir(opt.directory, 0); + }; + + /** + Impl of opfsUtil.importDb() when it's given a function as its + second argument. 
+ */ + const importDbChunked = async function(filename, callback){ + const [hDir, fnamePart] = await opfsUtil.getDirForFilename(filename, true); + const hFile = await hDir.getFileHandle(fnamePart, {create:true}); + let sah = await hFile.createSyncAccessHandle(); + let nWrote = 0, chunk, checkedHeader = false, err = false; + try{ + sah.truncate(0); + while( undefined !== (chunk = await callback()) ){ + if(chunk instanceof ArrayBuffer) chunk = new Uint8Array(chunk); + if( !checkedHeader && 0===nWrote && chunk.byteLength>=15 ){ + util.affirmDbHeader(chunk); + checkedHeader = true; + } + sah.write(chunk, {at: nWrote}); + nWrote += chunk.byteLength; + } + if( nWrote < 512 || 0!==nWrote % 512 ){ + toss("Input size",nWrote,"is not correct for an SQLite database."); + } + if( !checkedHeader ){ + const header = new Uint8Array(20); + sah.read( header, {at: 0} ); + util.affirmDbHeader( header ); + } + sah.write(new Uint8Array([1,1]), {at: 18}/*force db out of WAL mode*/); + return nWrote; + }catch(e){ + await sah.close(); + sah = undefined; + await hDir.removeEntry( fnamePart ).catch(()=>{}); + throw e; + }finally { + if( sah ) await sah.close(); + } + }; + + /** + Asynchronously imports the given bytes (a byte array or + ArrayBuffer) into the given database file. + + Results are undefined if the given db name refers to an opened + db. + + If passed a function for its second argument, its behaviour + changes: imports its data in chunks fed to it by the given + callback function. It calls the callback (which may be async) + repeatedly, expecting either a Uint8Array or ArrayBuffer (to + denote new input) or undefined (to denote EOF). For so long as + the callback continues to return non-undefined, it will append + incoming data to the given VFS-hosted database file. When + called this way, the resolved value of the returned Promise is + the number of bytes written to the target file. 
+ + It very specifically requires the input to be an SQLite3 + database and throws if that's not the case. It does so in + order to prevent this function from taking on a larger scope + than it is specifically intended to. i.e. we do not want it to + become a convenience for importing arbitrary files into OPFS. + + This routine rewrites the database header bytes in the output + file (not the input array) to force disabling of WAL mode. + + On error this throws and the state of the input file is + undefined (it depends on where the exception was triggered). + + On success, resolves to the number of bytes written. + */ + opfsUtil.importDb = async function(filename, bytes){ + if( bytes instanceof Function ){ + return importDbChunked(filename, bytes); + } + if(bytes instanceof ArrayBuffer) bytes = new Uint8Array(bytes); + util.affirmIsDb(bytes); + const n = bytes.byteLength; + const [hDir, fnamePart] = await opfsUtil.getDirForFilename(filename, true); + let sah, err, nWrote = 0; + try { + const hFile = await hDir.getFileHandle(fnamePart, {create:true}); + sah = await hFile.createSyncAccessHandle(); + sah.truncate(0); + nWrote = sah.write(bytes, {at: 0}); + if(nWrote != n){ + toss("Expected to write "+n+" bytes but wrote "+nWrote+"."); + } + sah.write(new Uint8Array([1,1]), {at: 18}) /* force db out of WAL mode */; + return nWrote; + }catch(e){ + if( sah ){ await sah.close(); sah = undefined; } + await hDir.removeEntry( fnamePart ).catch(()=>{}); + throw e; + }finally{ + if( sah ) await sah.close(); + } + }; + + /** + Checks for features required for OPFS VFSes and throws with a + descriptive error message if they're not found. This is intended + to be run as part of async VFS installation steps. 
+ */ + opfsUtil.vfsInstallationFeatureCheck = function(vfsName){ + if( !globalThis.SharedArrayBuffer || !globalThis.Atomics ){ + toss("Cannot install OPFS: Missing SharedArrayBuffer and/or Atomics.", + "The server must emit the COOP/COEP response headers to enable those.", + "See https://sqlite.org/wasm/doc/trunk/persistence.md#coop-coep"); + }else if( 'undefined'===typeof WorkerGlobalScope ){ + toss("The OPFS sqlite3_vfs cannot run in the main thread", + "because it requires Atomics.wait()."); + }else if( !globalThis.FileSystemHandle || + !globalThis.FileSystemDirectoryHandle || + !globalThis.FileSystemFileHandle?.prototype?.createSyncAccessHandle || + !navigator?.storage?.getDirectory ){ + toss("Missing required OPFS APIs."); + }else if( 'opfs-wl'===vfsName && !globalThis.Atomics.waitAsync ){ + toss('The',vfsName,'VFS requires Atomics.waitAsync(), which is not available.'); + } + }; + + /** + Must be called by the VFS's main installation routine and passed + the options object that function receives and a reference to that + function itself (we don't need this anymore). + + It throws if OPFS is not available. + + If it returns falsy, it detected that OPFS should be disabled, in + which case the callee should immediately return/resolve to the + sqlite3 object. + + Else it returns a new copy of the options object, fleshed out + with any missing defaults. The caller must: + + - Set up any local state they need. + + - Call opfsUtil.createVfsState(vfsName,opt), where opt is the + object returned by this function. + + - Set up any references they may need to state returned + by the previous step. 
+ + - Call opfvs.bindVfs() + */ + opfsUtil.initOptions = function callee(vfsName, options){ + const urlParams = new URL(globalThis.location.href).searchParams; + if( urlParams.has(vfsName+'-disable') ){ + //sqlite3.config.warn('Explicitly not installing "opfs" VFS due to opfs-disable flag.'); + return; + } + try{ + opfsUtil.vfsInstallationFeatureCheck(vfsName); + }catch(e){ + return; + } + options = util.nu(options); + options.vfsName = vfsName; + options.verbose ??= urlParams.has('opfs-verbose') + ? +urlParams.get('opfs-verbose') : 1; + options.sanityChecks ??= urlParams.has('opfs-sanity-check'); + + if( !opfsUtil.proxyUri ){ + opfsUtil.proxyUri = "sqlite3-opfs-async-proxy.js"; + if( sqlite3.scriptInfo?.sqlite3Dir ){ + /* Doing this from one scope up, outside of this function, does + not work. */ + opfsUtil.proxyUri = ( + sqlite3.scriptInfo.sqlite3Dir + opfsUtil.proxyUri + ); + } + } + options.proxyUri ??= opfsUtil.proxyUri; + if('function' === typeof options.proxyUri){ + options.proxyUri = options.proxyUri(); + } + //sqlite3.config.warn("opfsUtil options =",JSON.stringify(options), 'urlParams =', urlParams); + return opfsUtil.options = options; + }; + + /** + Creates, populates, and returns the main state object used by the + "opfs" and "opfs-wl" VFSes, and transferred from those to their + async counterparts. + + The returned object's vfs property holds the fully-populated + capi.sqlite3_vfs instance, tagged with lots of extra state which + the current VFSes need to have exposed to them. + + After setting up any local state needed, the caller must call + theVfs.bindVfs(X,Y), where X is an object containing the + sqlite3_io_methods to override and Y is a callback which gets + triggered if init succeeds, before the final Promise decides + whether or not to reject. + + This object must, when it's passed to the async part, contain + only cloneable or sharable objects. After the worker's "inited" + message arrives, other types of data may be added to it. 
+ */ + opfsUtil.createVfsState = function(){ + const state = util.nu(); + const options = opfsUtil.options; + state.verbose = options.verbose; + + const loggers = [ + sqlite3.config.error, + sqlite3.config.warn, + sqlite3.config.log + ]; + const vfsName = options.vfsName + || toss("Maintenance required: missing VFS name"); + const logImpl = (level,...args)=>{ + if(state.verbose>level) loggers[level](vfsName+":",...args); + }; + const log = (...args)=>logImpl(2, ...args), + warn = (...args)=>logImpl(1, ...args), + error = (...args)=>logImpl(0, ...args), + capi = sqlite3.capi, + wasm = sqlite3.wasm; + + const opfsVfs = state.vfs = new capi.sqlite3_vfs(); + const opfsIoMethods = opfsVfs.ioMethods = new capi.sqlite3_io_methods(); + + opfsIoMethods.$iVersion = 1; + opfsVfs.$iVersion = 2/*yes, two*/; + opfsVfs.$szOsFile = capi.sqlite3_file.structInfo.sizeof; + opfsVfs.$mxPathname = 1024/* sure, why not? The OPFS name length limit + is undocumented/unspecified. */; + opfsVfs.$zName = wasm.allocCString(vfsName); + opfsVfs.addOnDispose( + '$zName', opfsVfs.$zName, opfsIoMethods + /** + Pedantic sidebar: the entries in this array are items to + clean up when opfsVfs.dispose() is called, but in this + environment it will never be called. The VFS instance simply + hangs around until the WASM module instance is cleaned up. We + "could" _hypothetically_ clean it up by "importing" an + sqlite3_os_end() impl into the wasm build, but the shutdown + order of the wasm engine and the JS one are undefined so + there is no guaranty that the opfsVfs instance would be + available in one environment or the other when + sqlite3_os_end() is called (_if_ it gets called at all in a + wasm build, which is undefined). i.e. addOnDispose() here is + a matter of "correctness", not necessity. It just wouldn't do + to leave the impression that we're blindly leaking memory. 
+ */ + ); + + opfsVfs.metrics = util.nu({ + counters: util.nu(), + dump: function(){ + let k, n = 0, t = 0, w = 0; + for(k in state.opIds){ + const m = metrics[k]; + n += m.count; + t += m.time; + w += m.wait; + m.avgTime = (m.count && m.time) ? (m.time / m.count) : 0; + m.avgWait = (m.count && m.wait) ? (m.wait / m.count) : 0; + } + sqlite3.config.log(globalThis.location.href, + "metrics for",globalThis.location.href,":",metrics, + "\nTotal of",n,"op(s) for",t, + "ms (incl. "+w+" ms of waiting on the async side)"); + sqlite3.config.log("Serialization metrics:",opfsVfs.metrics.counters.s11n); + opfsVfs.worker?.postMessage?.({type:'opfs-async-metrics'}); + }, + reset: function(){ + let k; + const r = (m)=>(m.count = m.time = m.wait = 0); + const m = opfsVfs.metrics.counters; + for(k in state.opIds){ + r(m[k] = Object.create(null)); + } + let s = m.s11n = Object.create(null); + s = s.serialize = Object.create(null); + s.count = s.time = 0; + s = m.s11n.deserialize = Object.create(null); + s.count = s.time = 0; + } + })/*opfsVfs.metrics*/; + + /** + asyncIdleWaitTime is how long (ms) to wait, in the async proxy, + for each Atomics.wait() when waiting on inbound VFS API calls. + We need to wake up periodically to give the thread a chance to + do other things. If this is too high (e.g. 500ms) then even two + workers/tabs can easily run into locking errors. Some multiple + of this value is also used for determining how long to wait on + lock contention to free up. + */ + state.asyncIdleWaitTime = 150; + + /** + Whether the async counterpart should log exceptions to + the serialization channel. That produces a great deal of + noise for seemingly innocuous things like xAccess() checks + for missing files, so this option may have one of 3 values: + + 0 = no exception logging. + + 1 = only log exceptions for "significant" ops like xOpen(), + xRead(), and xWrite(). Exceptions related to, e.g., wait/retry + loops in acquiring SyncAccessHandles are not logged. 
+ + 2 = log all exceptions. + */ + state.asyncS11nExceptions = 1; + /* Size of file I/O buffer block. 64k = max sqlite3 page size, and + xRead/xWrite() will never deal in blocks larger than that. */ + state.fileBufferSize = 1024 * 64; + state.sabS11nOffset = state.fileBufferSize; + /** + The size of the block in our SAB for serializing arguments and + result values. Needs to be large enough to hold serialized + values of any of the proxied APIs. Filenames are the largest + part but are limited to opfsVfs.$mxPathname bytes. We also + store exceptions there, so it needs to be long enough to hold + a reasonably long exception string. + */ + state.sabS11nSize = opfsVfs.$mxPathname * 2; + /** + The SAB used for all data I/O between the synchronous and + async halves (file i/o and arg/result s11n). + */ + state.sabIO = new SharedArrayBuffer( + state.fileBufferSize/* file i/o block */ + + state.sabS11nSize/* argument/result serialization block */ + ); + + /** + For purposes of Atomics.wait() and Atomics.notify(), we use a + SharedArrayBuffer with one slot reserved for each of the API + proxy's methods. The sync side of the API uses Atomics.wait() + on the corresponding slot and the async side uses + Atomics.notify() on that slot. state.opIds holds the SAB slot + IDs of each of those. + */ + state.opIds = Object.create(null); + { + /* Indexes for use in our SharedArrayBuffer... */ + let i = 0; + /* SAB slot used to communicate which operation is desired + between both workers. This worker writes to it and the other + listens for changes and clears it. The values written to it + are state.opIds.x[A-Z][a-z]+, defined below.*/ + state.opIds.whichOp = i++; + /* Slot for storing return values. This side listens to that + slot and the async proxy writes to it. */ + state.opIds.rc = i++; + /* Each function gets an ID which this worker writes to the + state.opIds.whichOp slot. 
The async-api worker uses + Atomics.wait() on the whichOp slot to figure out which + operation to run next. */ + state.opIds.xAccess = i++; + state.opIds.xClose = i++; + state.opIds.xDelete = i++; + state.opIds.xDeleteNoWait = i++; + state.opIds.xFileSize = i++; + state.opIds.xLock = i++; + state.opIds.xOpen = i++; + state.opIds.xRead = i++; + state.opIds.xSleep = i++; + state.opIds.xSync = i++; + state.opIds.xTruncate = i++; + state.opIds.xUnlock = i++; + state.opIds.xWrite = i++; + state.opIds.mkdir = i++ /*currently unused*/; + /** Internal signals which are used only during development and + testing via the dev console. */ + state.opIds['opfs-async-metrics'] = i++; + state.opIds['opfs-async-shutdown'] = i++; + /* The retry slot is used by the async part for wait-and-retry + semantics. It is never written to, only used as a convenient + place to wait-with-timeout for a value which will never be + written, i.e. sleep()ing, before retrying a failed attempt to + acquire a SyncAccessHandle. */ + state.opIds.retry = i++; + state.sabOP = new SharedArrayBuffer( + i * 4/* 4==sizeof int32, noting that Atomics.wait() and + friends can only function on Int32Array views of an + SAB. */); + } + /** + SQLITE_xxx constants to export to the async worker + counterpart... 
+ */ + state.sq3Codes = Object.create(null); + for(const k of [ + 'SQLITE_ACCESS_EXISTS', + 'SQLITE_ACCESS_READWRITE', + 'SQLITE_BUSY', + 'SQLITE_CANTOPEN', + 'SQLITE_ERROR', + 'SQLITE_IOERR', + 'SQLITE_IOERR_ACCESS', + 'SQLITE_IOERR_CLOSE', + 'SQLITE_IOERR_DELETE', + 'SQLITE_IOERR_FSYNC', + 'SQLITE_IOERR_LOCK', + 'SQLITE_IOERR_READ', + 'SQLITE_IOERR_SHORT_READ', + 'SQLITE_IOERR_TRUNCATE', + 'SQLITE_IOERR_UNLOCK', + 'SQLITE_IOERR_WRITE', + 'SQLITE_LOCK_EXCLUSIVE', + 'SQLITE_LOCK_NONE', + 'SQLITE_LOCK_PENDING', + 'SQLITE_LOCK_RESERVED', + 'SQLITE_LOCK_SHARED', + 'SQLITE_LOCKED', + 'SQLITE_MISUSE', + 'SQLITE_NOTFOUND', + 'SQLITE_OPEN_CREATE', + 'SQLITE_OPEN_DELETEONCLOSE', + 'SQLITE_OPEN_MAIN_DB', + 'SQLITE_OPEN_READONLY', + 'SQLITE_LOCK_NONE', + 'SQLITE_LOCK_SHARED', + 'SQLITE_LOCK_RESERVED', + 'SQLITE_LOCK_PENDING', + 'SQLITE_LOCK_EXCLUSIVE' + ]){ + state.sq3Codes[k] = + capi[k] ?? toss("Maintenance required: not found:",k); + } + + state.opfsFlags = Object.assign(Object.create(null),{ + /** + Flag for use with xOpen(). URI flag "opfs-unlock-asap=1" + enables this. See defaultUnlockAsap, below. + */ + OPFS_UNLOCK_ASAP: 0x01, + /** + Flag for use with xOpen(). URI flag "delete-before-open=1" + tells the VFS to delete the db file before attempting to open + it. This can be used, e.g., to replace a db which has been + corrupted (without forcing us to expose a delete/unlink() + function in the public API). + + Failure to unlink the file is ignored but may lead to + downstream errors. An unlink can fail if, e.g., another tab + has the handle open. + + It goes without saying that deleting a file out from under + another instance results in Undefined Behavior. + */ + OPFS_UNLINK_BEFORE_OPEN: 0x02, + /** + If true, any async routine which must implicitly acquire a + sync access handle (i.e. an OPFS lock), without an active + xLock(), will release that lock at the end of the call which + acquires it. 
If false, such implicit locks are not released + until the VFS is idle for some brief amount of time, as + defined by state.asyncIdleWaitTime. + + The benefit of enabling this is higher concurrency. The + down-side is much-reduced performance (as much as a 4x + decrease in speedtest1). + */ + defaultUnlockAsap: false + }); + + opfsVfs.metrics.reset()/*must not be called until state.opIds is set up*/; + const metrics = opfsVfs.metrics.counters; + + /** + Runs the given operation (by name) in the async worker + counterpart, waits for its response, and returns the result + which the async worker writes to SAB[state.opIds.rc]. The 2nd + and subsequent arguments must be the arguments for the async op + (see sqlite3-opfs-async-proxy.c-pp.js). + */ + const opRun = opfsVfs.opRun = (op,...args)=>{ + const opNdx = state.opIds[op] || toss(opfsVfs.vfsName+": Invalid op ID:",op); + state.s11n.serialize(...args); + Atomics.store(state.sabOPView, state.opIds.rc, -1); + Atomics.store(state.sabOPView, state.opIds.whichOp, opNdx); + Atomics.notify(state.sabOPView, state.opIds.whichOp) + /* async thread will take over here */; + const t = performance.now(); + while('not-equal'!==Atomics.wait(state.sabOPView, state.opIds.rc, -1)){ + /* + The reason for this loop is buried in the details of a long + discussion at: + + https://github.com/sqlite/sqlite-wasm/issues/12 + + Summary: in at least one browser flavor, under high loads, + the wait()/notify() pairings can get out of sync and/or + spuriously wake up. Calling wait() here until it returns + 'not-equal' gets them back in sync. + */ + } + /* When the above wait() call returns 'not-equal', the async + half will have completed the operation and reported its + results in the state.opIds.rc slot of the SAB. It may have + also serialized an exception for us. 
*/ + const rc = Atomics.load(state.sabOPView, state.opIds.rc); + metrics[op].wait += performance.now() - t; + if(rc && state.asyncS11nExceptions){ + const err = state.s11n.deserialize(); + if(err) error(op+"() async error:",...err); + } + return rc; + }; + + const opTimer = Object.create(null); + opTimer.op = undefined; + opTimer.start = undefined; + const mTimeStart = opfsVfs.mTimeStart = (op)=>{ + opTimer.start = performance.now(); + opTimer.op = op; + ++metrics[op].count; + }; + const mTimeEnd = opfsVfs.mTimeEnd = ()=>( + metrics[opTimer.op].time += performance.now() - opTimer.start + ); + + /** + Map of sqlite3_file pointers to objects constructed by xOpen(). + */ + const __openFiles = opfsVfs.__openFiles = Object.create(null); + + /** + Impls for the sqlite3_io_methods methods. Maintenance reminder: + members are in alphabetical order to simplify finding them. + */ + const ioSyncWrappers = opfsVfs.ioSyncWrappers = util.nu({ + xCheckReservedLock: function(pFile,pOut){ + /** + After consultation with a topic expert: "opfs-wl" will + continue to use the same no-op impl which "opfs" does + because: + + - xCheckReservedLock() is just a hint. If SQLite needs to + lock, it's still going to try to lock. + + - We cannot do this check synchronously in "opfs-wl", + so would need to pass it to the async proxy. That would + make it inordinately expensive considering that it's + just a hint. 
+ */ + wasm.poke(pOut, 0, 'i32'); + return 0; + }, + xClose: function(pFile){ + mTimeStart('xClose'); + let rc = 0; + const f = __openFiles[pFile]; + if(f){ + delete __openFiles[pFile]; + rc = opRun('xClose', pFile); + if(f.sq3File) f.sq3File.dispose(); + } + mTimeEnd(); + return rc; + }, + xDeviceCharacteristics: function(pFile){ + return capi.SQLITE_IOCAP_UNDELETABLE_WHEN_OPEN; + }, + xFileControl: function(pFile, opId, pArg){ + /*mTimeStart('xFileControl'); + mTimeEnd();*/ + return capi.SQLITE_NOTFOUND; + }, + xFileSize: function(pFile,pSz64){ + mTimeStart('xFileSize'); + let rc = opRun('xFileSize', pFile); + if(0==rc){ + try { + const sz = state.s11n.deserialize()[0]; + wasm.poke(pSz64, sz, 'i64'); + }catch(e){ + error("Unexpected error reading xFileSize() result:",e); + rc = state.sq3Codes.SQLITE_IOERR; + } + } + mTimeEnd(); + return rc; + }, + xRead: function(pFile,pDest,n,offset64){ + mTimeStart('xRead'); + const f = __openFiles[pFile]; + let rc; + try { + rc = opRun('xRead',pFile, n, Number(offset64)); + if(0===rc || capi.SQLITE_IOERR_SHORT_READ===rc){ + /** + Results get written to the SharedArrayBuffer f.sabView. + Because the heap is _not_ a SharedArrayBuffer, we have + to copy the results. TypedArray.set() seems to be the + fastest way to copy this. 
*/ + wasm.heap8u().set(f.sabView.subarray(0, n), Number(pDest)); + } + }catch(e){ + error("xRead(",arguments,") failed:",e,f); + rc = capi.SQLITE_IOERR_READ; + } + mTimeEnd(); + return rc; + }, + xSync: function(pFile,flags){ + mTimeStart('xSync'); + const rc = opRun('xSync', pFile, flags); + mTimeEnd(); + return rc; + }, + xTruncate: function(pFile,sz64){ + mTimeStart('xTruncate'); + const rc = opRun('xTruncate', pFile, Number(sz64)); + mTimeEnd(); + return rc; + }, + xWrite: function(pFile,pSrc,n,offset64){ + mTimeStart('xWrite'); + const f = __openFiles[pFile]; + let rc; + try { + f.sabView.set(wasm.heap8u().subarray( + Number(pSrc), Number(pSrc) + n + )); + rc = opRun('xWrite', pFile, n, Number(offset64)); + }catch(e){ + error("xWrite(",arguments,") failed:",e,f); + rc = capi.SQLITE_IOERR_WRITE; + } + mTimeEnd(); + return rc; + } + })/*ioSyncWrappers*/; + + /** + Impls for the sqlite3_vfs methods. Maintenance reminder: members + are in alphabetical order to simplify finding them. + */ + const vfsSyncWrappers = opfsVfs.vfsSyncWrappers = { + xAccess: function(pVfs,zName,flags,pOut){ + mTimeStart('xAccess'); + const rc = opRun('xAccess', wasm.cstrToJs(zName)); + wasm.poke( pOut, (rc ? 0 : 1), 'i32' ); + mTimeEnd(); + return 0; + }, + xCurrentTime: function(pVfs,pOut){ + wasm.poke(pOut, 2440587.5 + (new Date().getTime()/86400000), + 'double'); + return 0; + }, + xCurrentTimeInt64: function(pVfs,pOut){ + wasm.poke(pOut, (2440587.5 * 86400000) + new Date().getTime(), + 'i64'); + return 0; + }, + xDelete: function(pVfs, zName, doSyncDir){ + mTimeStart('xDelete'); + const rc = opRun('xDelete', wasm.cstrToJs(zName), doSyncDir, false); + mTimeEnd(); + return rc; + }, + xFullPathname: function(pVfs,zName,nOut,pOut){ + /* Until/unless we have some notion of "current dir" + in OPFS, simply copy zName to pOut... */ + const i = wasm.cstrncpy(pOut, zName, nOut); + return ipMethods is NULL. 
*/ + if(fh.readOnly){ + wasm.poke(pOutFlags, capi.SQLITE_OPEN_READONLY, 'i32'); + } + __openFiles[pFile] = fh; + fh.sabView = state.sabFileBufView; + fh.sq3File = new capi.sqlite3_file(pFile); + if( zToFree ) fh.sq3File.addOnDispose(zToFree); + fh.sq3File.$pMethods = opfsIoMethods.pointer; + fh.lockType = capi.SQLITE_LOCK_NONE; + } + mTimeEnd(); + return rc; + }/*xOpen()*/ + }/*vfsSyncWrappers*/; + + const pDVfs = capi.sqlite3_vfs_find(null)/*pointer to default VFS*/; + if(pDVfs){ + const dVfs = new capi.sqlite3_vfs(pDVfs); + opfsVfs.$xRandomness = dVfs.$xRandomness; + opfsVfs.$xSleep = dVfs.$xSleep; + dVfs.dispose(); + } + if(!opfsVfs.$xRandomness){ + /* If the default VFS has no xRandomness(), add a basic JS impl... */ + opfsVfs.vfsSyncWrappers.xRandomness = function(pVfs, nOut, pOut){ + const heap = wasm.heap8u(); + let i = 0; + const npOut = Number(pOut); + for(; i < nOut; ++i) heap[npOut + i] = (Math.random()*255000) & 0xFF; + return i; + }; + } + if(!opfsVfs.$xSleep){ + /* If we can inherit an xSleep() impl from the default VFS then + assume it's sane and use it, otherwise install a JS-based + one. */ + opfsVfs.vfsSyncWrappers.xSleep = function(pVfs,ms){ + mTimeStart('xSleep'); + Atomics.wait(state.sabOPView, state.opIds.xSleep, 0, ms); + mTimeEnd(); + return 0; + }; + } + +//#define vfs.metrics.enable +//#// import initS11n() +//#include api/opfs-common-inline.c-pp.js +//#undef vfs.metrics.enable + opfsVfs.initS11n = initS11n; + + /** + To be called by the VFS's main installation routine after it has + wired up enough state to provide its overridden io-method impls + (which must be properties of the ioMethods argument). Returns a + Promise which the installation routine must return. callback must + be a function which performs any post-bootstrap touchups, namely + plugging in a sqlite3.oo1 wrapper. It is passed (sqlite3, opfsVfs), + where opfsVfs is the sqlite3_vfs object which was set up by + opfsUtil.createVfsState(). 
+ */ + opfsVfs.bindVfs = function(ioMethods, callback){ + Object.assign(opfsVfs.ioSyncWrappers, ioMethods); + const thePromise = new Promise(function(promiseResolve_, promiseReject_){ + let promiseWasRejected = undefined; + const promiseReject = (err)=>{ + promiseWasRejected = true; + opfsVfs.dispose(); + return promiseReject_(err); + }; + const promiseResolve = ()=>{ + try{ + callback(sqlite3, opfsVfs); + }catch(e){ + return promiseReject(e); + } + promiseWasRejected = false; + return promiseResolve_(sqlite3); + }; + const options = opfsUtil.options; + let proxyUri = options.proxyUri +( + (options.proxyUri.indexOf('?')<0) ? '?' : '&' + )+'vfs='+vfsName; + //sqlite3.config.error("proxyUri",options.proxyUri, (new Error())); + const W = opfsVfs.worker = +//#if target:es6-bundler-friendly + (()=>{ + /* _Sigh_... */ + switch(vfsName){ + case 'opfs': + return new Worker(new URL("sqlite3-opfs-async-proxy.js?vfs=opfs", import.meta.url)); + case 'opfs-wl': + return new Worker(new URL("sqlite3-opfs-async-proxy.js?vfs=opfs-wl", import.meta.url)); + } + })(); +//#elif target:es6-module + new Worker(new URL(proxyUri, import.meta.url)); +//#else + new Worker(proxyUri); +//#/if + let zombieTimer = setTimeout(()=>{ + /* An attempt to work around a browser-specific quirk in which + the Worker load is failing in such a way that we neither + resolve nor reject it. This workaround gives that resolve/reject + a time limit and rejects if that timer expires. Discussion: + https://sqlite.org/forum/forumpost/a708c98dcb3ef */ + if(undefined===promiseWasRejected){ + promiseReject( + new Error("Timeout while waiting for OPFS async proxy worker.") + ); + } + }, 4000); + W._originalOnError = W.onerror /* will be restored later */; + W.onerror = function(err){ + // The error object doesn't contain any useful info when the + // failure is, e.g., that the remote script is 404. 
+ error("Error initializing OPFS asyncer:",err); + promiseReject(new Error("Loading OPFS async Worker failed for unknown reasons.")); + }; + + const opRun = opfsVfs.opRun; +//#if 0 + /** + Not part of the public API. Only for test/development use. + */ + opfsVfs.debug = { + asyncShutdown: ()=>{ + warn("Shutting down OPFS async listener. The OPFS VFS will no longer work."); + opRun('opfs-async-shutdown'); + }, + asyncRestart: ()=>{ + warn("Attempting to restart OPFS VFS async listener. Might work, might not."); + W.postMessage({type: 'opfs-async-restart'}); + } + }; +//#/if + + const sanityCheck = function(){ + const scope = wasm.scopedAllocPush(); + const sq3File = new capi.sqlite3_file(); + try{ + const fid = sq3File.pointer; + const openFlags = capi.SQLITE_OPEN_CREATE + | capi.SQLITE_OPEN_READWRITE + //| capi.SQLITE_OPEN_DELETEONCLOSE + | capi.SQLITE_OPEN_MAIN_DB; + const pOut = wasm.scopedAlloc(8); + const dbFile = "/sanity/check/file"+randomFilename(8); + const zDbFile = wasm.scopedAllocCString(dbFile); + let rc; + state.s11n.serialize("This is ä string."); + rc = state.s11n.deserialize(); + log("deserialize() says:",rc); + if("This is ä string."!==rc[0]) toss("String d13n error."); + opfsVfs.vfsSyncWrappers.xAccess(opfsVfs.pointer, zDbFile, 0, pOut); + rc = wasm.peek(pOut,'i32'); + log("xAccess(",dbFile,") exists ?=",rc); + rc = opfsVfs.vfsSyncWrappers.xOpen(opfsVfs.pointer, zDbFile, + fid, openFlags, pOut); + log("open rc =",rc,"state.sabOPView[xOpen] =", + state.sabOPView[state.opIds.xOpen]); + if(0!==rc){ + error("open failed with code",rc); + return; + } + opfsVfs.vfsSyncWrappers.xAccess(opfsVfs.pointer, zDbFile, 0, pOut); + rc = wasm.peek(pOut,'i32'); + if(!rc) toss("xAccess() failed to detect file."); + rc = opfsVfs.ioSyncWrappers.xSync(sq3File.pointer, 0); + if(rc) toss('sync failed w/ rc',rc); + rc = opfsVfs.ioSyncWrappers.xTruncate(sq3File.pointer, 1024); + if(rc) toss('truncate failed w/ rc',rc); + wasm.poke(pOut,0,'i64'); + rc = 
opfsVfs.ioSyncWrappers.xFileSize(sq3File.pointer, pOut); + if(rc) toss('xFileSize failed w/ rc',rc); + log("xFileSize says:",wasm.peek(pOut, 'i64')); + rc = opfsVfs.ioSyncWrappers.xWrite(sq3File.pointer, zDbFile, 10, 1); + if(rc) toss("xWrite() failed!"); + const readBuf = wasm.scopedAlloc(16); + rc = opfsVfs.ioSyncWrappers.xRead(sq3File.pointer, readBuf, 6, 2); + wasm.poke(readBuf+6,0); + let jRead = wasm.cstrToJs(readBuf); + log("xRead() got:",jRead); + if("sanity"!==jRead) toss("Unexpected xRead() value."); + if(opfsVfs.vfsSyncWrappers.xSleep){ + log("xSleep()ing before close()ing..."); + opfsVfs.vfsSyncWrappers.xSleep(opfsVfs.pointer,2000); + log("waking up from xSleep()"); + } + rc = opfsVfs.ioSyncWrappers.xClose(fid); + log("xClose rc =",rc,"sabOPView =",state.sabOPView); + log("Deleting file:",dbFile); + opfsVfs.vfsSyncWrappers.xDelete(opfsVfs.pointer, zDbFile, 0x1234); + opfsVfs.vfsSyncWrappers.xAccess(opfsVfs.pointer, zDbFile, 0, pOut); + rc = wasm.peek(pOut,'i32'); + if(rc) toss("Expecting 0 from xAccess(",dbFile,") after xDelete()."); + warn("End of OPFS sanity checks."); + }finally{ + sq3File.dispose(); + wasm.scopedAllocPop(scope); + } + }/*sanityCheck()*/; + + W.onmessage = function({data}){ + //sqlite3.config.warn(vfsName,"Worker.onmessage:",data); + switch(data.type){ + case 'opfs-unavailable': + /* Async proxy has determined that OPFS is unavailable. There's + nothing more for us to do here. */ + promiseReject(new Error(data.payload.join(' '))); + break; + case 'opfs-async-loaded': + /* Arrives as soon as the async proxy finishes loading. + Pass our config and shared state on to the async + worker. */ + delete state.vfs; + W.postMessage({type: 'opfs-async-init', args: util.nu(state)}); + break; + case 'opfs-async-inited': { + /* Indicates that the async partner has received the 'init' + and has finished initializing, so the real work can + begin... 
*/ + if(true===promiseWasRejected){ + break /* promise was already rejected via timer */; + } + clearTimeout(zombieTimer); + zombieTimer = null; + try { + sqlite3.vfs.installVfs({ + io: {struct: opfsVfs.ioMethods, methods: opfsVfs.ioSyncWrappers}, + vfs: {struct: opfsVfs, methods: opfsVfs.vfsSyncWrappers} + }); + state.sabOPView = new Int32Array(state.sabOP); + state.sabFileBufView = new Uint8Array(state.sabIO, 0, state.fileBufferSize); + state.sabS11nView = new Uint8Array(state.sabIO, state.sabS11nOffset, state.sabS11nSize); + opfsVfs.initS11n(); + delete opfsVfs.initS11n; + if(options.sanityChecks){ + warn("Running sanity checks because of opfs-sanity-check URL arg..."); + sanityCheck(); + } + if(opfsUtil.thisThreadHasOPFS()){ + opfsUtil.getRootDir().then((d)=>{ + W.onerror = W._originalOnError; + delete W._originalOnError; + log("End of OPFS sqlite3_vfs setup.", opfsVfs); + promiseResolve(); + }).catch(promiseReject); + }else{ + promiseResolve(); + } + }catch(e){ + error(e); + promiseReject(e); + } + break; + } + case 'debug': + warn("debug message from worker:",data); + break; + default: { + const errMsg = ( + "Unexpected message from the OPFS async worker: " + + JSON.stringify(data) + ); + error(errMsg); + promiseReject(new Error(errMsg)); + break; + } + }/*switch(data.type)*/ + }/*W.onmessage()*/; + })/*thePromise*/; + return thePromise; + }/*bindVfs()*/; + + return state; + }/*createVfsState()*/; + +}/*sqlite3ApiBootstrap.initializers*/); +//#/if target:node diff --git a/ext/wasm/api/post-js-footer.js b/ext/wasm/api/post-js-footer.js index c6a2e1517c..f8050ddd3d 100644 --- a/ext/wasm/api/post-js-footer.js +++ b/ext/wasm/api/post-js-footer.js @@ -1,3 +1,72 @@ +/* + 2022-07-22 + + The author disclaims copyright to this source code. In place of a + legal notice, here is a blessing: + + * May you do good and not evil. + * May you find forgiveness for yourself and forgive others. + * May you share freely, never taking more than you give. 
+ + *********************************************************************** + + This file is the tail end of the sqlite3-api.js constellation, + closing the function scope opened by post-js-header.js. + + In terms of amalgamation code placement, this file is appended + immediately after the final sqlite3-api-*.js piece. Those files + cooperate to prepare sqlite3ApiBootstrap() and this file calls it. + It is run within a context which gives it access to Emscripten's + Module object, after sqlite3.wasm is loaded but before + sqlite3ApiBootstrap() has been called. + + Because this code resides (after building) inside the function + installed by post-js-header.js, it has access to state set up by + pre-js.c-pp.js and friends. +*/ +try{ + /* We are in the closing block of Module.runSQLite3PostLoadInit(), so + its arguments are visible here. */ + + /* Config options for sqlite3ApiBootstrap(). */ + const bootstrapConfig = Object.assign( + Object.create(null), + /** The WASM-environment-dependent configuration for sqlite3ApiBootstrap() */ + { + memory: ('undefined'!==typeof wasmMemory) + ? wasmMemory + : EmscriptenModule['wasmMemory'], + exports: ('undefined'!==typeof wasmExports) + ? wasmExports /* emscripten >=3.1.44 */ + : (Object.prototype.hasOwnProperty.call(EmscriptenModule,'wasmExports') + ? EmscriptenModule['wasmExports'] + : EmscriptenModule['asm']/* emscripten <=3.1.43 */) + }, + globalThis.sqlite3ApiBootstrap.defaultConfig, // default options + globalThis.sqlite3ApiConfig || {} // optional client-provided options + ); + + sqlite3InitScriptInfo.debugModule("Bootstrapping lib config", bootstrapConfig); + + /** + For purposes of the Emscripten build, call sqlite3ApiBootstrap(). + Ideally clients should be able to inject their own config here, + but that's not practical in this particular build constellation + because of the order everything happens in. 
Clients may either + define globalThis.sqlite3ApiConfig or modify + globalThis.sqlite3ApiBootstrap.defaultConfig to tweak the default + configuration used by a no-args call to sqlite3ApiBootstrap(), + but must have first loaded their WASM module in order to be able + to provide the necessary configuration state. + */ + const p = globalThis.sqlite3ApiBootstrap(bootstrapConfig); + delete globalThis.sqlite3ApiBootstrap; + return p /* the eventual result of globalThis.sqlite3InitModule() */; +}catch(e){ + console.error("sqlite3ApiBootstrap() error:",e); + throw e; +} + //console.warn("This is the end of the Module.runSQLite3PostLoadInit handler."); }/*Module.runSQLite3PostLoadInit(...)*/; //console.warn("This is the end of the setup of the (pending) Module.runSQLite3PostLoadInit"); diff --git a/ext/wasm/api/post-js-header.js b/ext/wasm/api/post-js-header.js index cdc6b3a38d..348f80ea0b 100644 --- a/ext/wasm/api/post-js-header.js +++ b/ext/wasm/api/post-js-header.js @@ -3,14 +3,13 @@ post-js.js for use with Emscripten's --post-js flag, so it gets injected in the earliest stages of sqlite3InitModule(). - This function wraps the whole SQLite3 library but does not - bootstrap it. - Running this function will bootstrap the library and return a Promise to the sqlite3 namespace object. 
+ + In the canonical builds, this gets called by extern-post-js.c-pp.js */ -Module.runSQLite3PostLoadInit = function( - sqlite3InitScriptInfo /* populated by extern-post-js.c-pp.js */, +Module.runSQLite3PostLoadInit = async function( + sqlite3InitScriptInfo, EmscriptenModule/*the Emscripten-style module object*/, sqlite3IsUnderTest ){ @@ -35,7 +34,7 @@ Module.runSQLite3PostLoadInit = function( - sqlite3-vtab-helper.c-pp.js => Utilities for virtual table impls - sqlite3-vfs-opfs.c-pp.js => OPFS VFS - sqlite3-vfs-opfs-sahpool.c-pp.js => OPFS SAHPool VFS - - sqlite3-api-cleanup.js => final bootstrapping phase + - sqlite3-vfs-opfs-wl.c-pp.js => WebLock-using OPFS VFS - post-js-footer.js => this file's epilogue And all of that gets sandwiched between extern-pre-js.js and diff --git a/ext/wasm/api/pre-js.c-pp.js b/ext/wasm/api/pre-js.c-pp.js index 8a4a0f9fd0..fbb48f9eac 100644 --- a/ext/wasm/api/pre-js.c-pp.js +++ b/ext/wasm/api/pre-js.c-pp.js @@ -14,12 +14,37 @@ itself. i.e. try to keep file-local symbol names obnoxiously collision-resistant. */ +/** + This file was preprocessed using: + +//#@ policy error + @c-pp::argv@ +//#@ policy off +*/ +//#if unsupported-build +/** + UNSUPPORTED BUILD: + + This SQLite JS build configuration is entirely unsupported! It has + not been tested beyond the ability to compile it. It may not + load. It may not work properly. Only builds _directly_ targeting + browser environments ("vanilla" JS and ESM modules) are supported + and tested. Builds which _indirectly_ target browsers (namely + bundler-friendly builds and any node builds) are not supported + deliverables. 
+*/ +//#/if +//#if not target:es6-bundler-friendly (function(Module){ const sIMS = globalThis.sqlite3InitModuleState/*from extern-post-js.c-pp.js*/ || Object.assign(Object.create(null),{ - debugModule: ()=>{ - console.warn("globalThis.sqlite3InitModuleState is missing"); + /* In WASMFS builds this file gets loaded once per thread, + but sqlite3InitModuleState is not getting set for the + worker threads? That those workers seem to function fine + despite that is curious. */ + debugModule: function(){ + console.warn("globalThis.sqlite3InitModuleState is missing",arguments); } }); delete globalThis.sqlite3InitModuleState; @@ -47,6 +72,14 @@ approach. */ Module['locateFile'] = function(path, prefix) { + if( this.emscriptenLocateFile instanceof Function ){ + /* [tag:locateFile] Client-overridden impl. We do not support + this but offer it as a back-door which will go away the + moment either Emscripten changes that interface or we manage + to get non-Emscripten builds working. + https://sqlite.org/forum/forumpost/1eec339854c935bd */ + return this.emscriptenLocateFile(path, prefix); + } //#if target:es6-module return new URL(path, import.meta.url).href; //#else @@ -69,11 +102,10 @@ "result =", theFile ); return theFile; -//#endif target:es6-module +//#/if target:es6-module }.bind(sIMS); -//#if Module.instantiateWasm -//#if not wasmfs +//#if Module.instantiateWasm and not wasmfs and not target:node /** Override Module.instantiateWasm(). @@ -82,8 +114,15 @@ https://github.com/emscripten-core/emscripten/issues/17951 In such builds we must disable this. + + It's disabled in the (unsupported/untested) node builds because + node does not do fetch(). */ Module['instantiateWasm'] = function callee(imports,onSuccess){ + if( this.emscriptenInstantiateWasm instanceof Function ){ + /* See [tag:locateFile]. 
Same story here */ + return this.emscriptenInstantiateWasm(imports, onSuccess); + } const sims = this; const uri = Module.locateFile( sims.wasmFilename, ( @@ -109,7 +148,7 @@ .then(finalThen) return loadWasm(); }.bind(sIMS); -//#endif not wasmfs -//#endif Module.instantiateWasm +//#/if Module.instantiateWasm and not wasmfs })(Module); +//#/if not target:es6-bundler-friendly /* END FILE: api/pre-js.js. */ diff --git a/ext/wasm/api/sqlite3-api-cleanup.js b/ext/wasm/api/sqlite3-api-cleanup.js deleted file mode 100644 index 2235663261..0000000000 --- a/ext/wasm/api/sqlite3-api-cleanup.js +++ /dev/null @@ -1,83 +0,0 @@ -/* - 2022-07-22 - - The author disclaims copyright to this source code. In place of a - legal notice, here is a blessing: - - * May you do good and not evil. - * May you find forgiveness for yourself and forgive others. - * May you share freely, never taking more than you give. - - *********************************************************************** - - This file is the tail end of the sqlite3-api.js constellation, - intended to be appended after all other sqlite3-api-*.js files so - that it can finalize any setup and clean up any global symbols - temporarily used for setting up the API's various subsystems. - - In Emscripten builds it's run in the context of what amounts to a - Module.postRun handler, though it's no longer actually a postRun - handler because Emscripten 4.0 changed postRun semantics in an - incompatible way. - - In terms of amalgamation code placement, this file is appended - immediately after the final sqlite3-api-*.js piece. Those files - cooperate to prepare sqlite3ApiBootstrap() and this file calls it. - It is run within a context which gives it access to Emscripten's - Module object, after sqlite3.wasm is loaded but before - sqlite3ApiBootstrap() has been called. 
- - Because this code resides (after building) inside the function - installed by post-js-header.js, it has access to the -*/ -'use strict'; -if( 'undefined' === typeof EmscriptenModule/*from post-js-header.js*/ ){ - console.warn("This is not running in the context of Module.runSQLite3PostLoadInit()"); - throw new Error("sqlite3-api-cleanup.js expects to be running in the "+ - "context of its Emscripten module loader."); -} -try{ - /* Config options for sqlite3ApiBootstrap(). */ - const bootstrapConfig = Object.assign( - Object.create(null), - globalThis.sqlite3ApiBootstrap.defaultConfig, // default options - globalThis.sqlite3ApiConfig || {}, // optional client-provided options - /** The WASM-environment-dependent configuration for sqlite3ApiBootstrap() */ - { - memory: ('undefined'!==typeof wasmMemory) - ? wasmMemory - : EmscriptenModule['wasmMemory'], - exports: ('undefined'!==typeof wasmExports) - ? wasmExports /* emscripten >=3.1.44 */ - : (Object.prototype.hasOwnProperty.call(EmscriptenModule,'wasmExports') - ? EmscriptenModule['wasmExports'] - : EmscriptenModule['asm']/* emscripten <=3.1.43 */) - } - ); - - /** Figure out if this is a 32- or 64-bit WASM build. */ - bootstrapConfig.wasmPtrIR = - 'number'===(typeof bootstrapConfig.exports.sqlite3_libversion()) - ? 'i32' :'i64'; - const sIMS = sqlite3InitScriptInfo; - sIMS.debugModule("Bootstrapping lib config", sIMS); - - /** - For purposes of the Emscripten build, call sqlite3ApiBootstrap(). - Ideally clients should be able to inject their own config here, - but that's not practical in this particular build constellation - because of the order everything happens in. Clients may either - define globalThis.sqlite3ApiConfig or modify - globalThis.sqlite3ApiBootstrap.defaultConfig to tweak the default - configuration used by a no-args call to sqlite3ApiBootstrap(), - but must have first loaded their WASM module in order to be able - to provide the necessary configuration state. 
- */ - const p = globalThis.sqlite3ApiBootstrap(bootstrapConfig); - delete globalThis.sqlite3ApiBootstrap; - return p /* the eventual result of globalThis.sqlite3InitModule() */; -}catch(e){ - console.error("sqlite3ApiBootstrap() error:",e); - throw e; -} -throw new Error("Maintenance required: this line should never be reached"); diff --git a/ext/wasm/api/sqlite3-api-glue.c-pp.js b/ext/wasm/api/sqlite3-api-glue.c-pp.js index 1c42b01508..9c525bd4c7 100644 --- a/ext/wasm/api/sqlite3-api-glue.c-pp.js +++ b/ext/wasm/api/sqlite3-api-glue.c-pp.js @@ -98,6 +98,8 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ ["sqlite3_bind_parameter_name", "string", "sqlite3_stmt*", "int"], ["sqlite3_bind_pointer", "int", "sqlite3_stmt*", "int", "*", "string:static", "*"], + /* sqlite3_bind_text() is hand-written */ + ["sqlite3_bind_zeroblob", "int", "sqlite3_stmt*", "int", "int"], ["sqlite3_busy_handler","int", [ "sqlite3*", new wasm.xWrap.FuncPtrAdapter({ @@ -120,7 +122,11 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ ["sqlite3_column_double","f64", "sqlite3_stmt*", "int"], ["sqlite3_column_int","int", "sqlite3_stmt*", "int"], ["sqlite3_column_name","string", "sqlite3_stmt*", "int"], +//#define proxy-text-apis 1 +//#if not proxy-text-apis +/* Search this file for tag:proxy-text-apis to see what this is about. 
*/ ["sqlite3_column_text","string", "sqlite3_stmt*", "int"], +//#/if ["sqlite3_column_type","int", "sqlite3_stmt*", "int"], ["sqlite3_column_value","sqlite3_value*", "sqlite3_stmt*", "int"], ["sqlite3_commit_hook", "void*", [ @@ -196,6 +202,7 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ ["sqlite3_libversion_number", "int"], ["sqlite3_limit", "int", ["sqlite3*", "int", "int"]], ["sqlite3_malloc", "*","int"], + ["sqlite3_next_stmt", "sqlite3_stmt*", ["sqlite3*","sqlite3_stmt*"]], ["sqlite3_open", "int", "string", "*"], ["sqlite3_open_v2", "int", "string", "*", "int", "string"], /* sqlite3_prepare_v2() and sqlite3_prepare_v3() are handled @@ -317,7 +324,9 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ ["sqlite3_value_numeric_type", "int", "sqlite3_value*"], ["sqlite3_value_pointer", "*", "sqlite3_value*", "string:static"], ["sqlite3_value_subtype", "int", "sqlite3_value*"], +//#if not proxy-text-apis ["sqlite3_value_text", "string", "sqlite3_value*"], +//#/if ["sqlite3_value_type", "int", "sqlite3_value*"], ["sqlite3_vfs_find", "*", "string"], ["sqlite3_vfs_register", "int", "sqlite3_vfs*", "int"], @@ -493,7 +502,7 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ ["sqlite3_activate_see", undefined, "string"] ); } -//#endif enable-see +//#/if enable-see if( wasm.bigIntEnabled && !!wasm.exports.sqlite3_declare_vtab ){ bindingSignatures.int64.push( @@ -969,10 +978,8 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ "entry SQLITE_WASM_DEALLOC (=="+capi.SQLITE_WASM_DEALLOC+")."); } const __rcMap = Object.create(null); - for(const t of ['resultCodes']){ - for(const e of Object.entries(wasm.ctype[t])){ - __rcMap[e[1]] = e[0]; - } + for(const e of Object.entries(wasm.ctype['resultCodes'])){ + __rcMap[e[1]] = e[0]; } /** For the given integer, returns the SQLITE_xxx result code as a @@ -984,8 +991,6 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ const notThese = 
Object.assign(Object.create(null),{ // For each struct to NOT register, map its name to true: WasmTestStruct: true, - /* We unregister the kvvfs VFS from Worker threads below. */ - sqlite3_kvvfs_methods: !util.isUIThread(), /* sqlite3_index_info and friends require int64: */ sqlite3_index_info: !wasm.bigIntEnabled, sqlite3_index_constraint: !wasm.bigIntEnabled, @@ -1654,6 +1659,45 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ }/*sqlite3_bind_text/blob()*/ +//#if proxy-text-apis + if(!capi.sqlite3_column_text){ + /*[tag:proxy-text-apis] + As discussed at: + + https://sqlite.org/forum/forumpost/d77281aec2df9ada + + Summary: there are opinions that sqlite3_column_text() and + sqlite3_value_text() should handle strings such that embedded + NULs are retained. This block does that. This block does _not_ + apply that special-case behavior to any number of _other_ + APIs which return C-strings. That discrepancy makes this + block highly arguable, but one can also argue that these two + specific functions can get away with such acrobatics without + it being called voodoo in a pejorative sense. + */ + const argStmt = wasm.xWrap.argAdapter('sqlite3_stmt*'), + argInt = wasm.xWrap.argAdapter('int'), + argValue = wasm.xWrap.argAdapter('sqlite3_value*'), + newStr = + (cstr,n)=>wasm.typedArrayToString(wasm.heap8u(), + Number(cstr), Number(cstr)+n) + capi.sqlite3_column_text = function(stmt, colIndex){ + const a0 = argStmt(stmt), a1 = argInt(colIndex); + const cstr = wasm.exports.sqlite3_column_text(a0, a1); + return cstr + ? newStr(cstr,wasm.exports.sqlite3_column_bytes(a0, a1)) + : null; + }; + capi.sqlite3_value_text = function(val){ + const a0 = argValue(val); + const cstr = wasm.exports.sqlite3_value_text(a0); + return cstr + ? newStr(cstr,wasm.exports.sqlite3_value_bytes(a0)) + : null; + }; + }/*text-return-related bindings*/ +//#/if proxy-text-apis + {/* sqlite3_config() */ /** Wraps a small subset of the C API's sqlite3_config() options. 
@@ -1740,105 +1784,6 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ }; }/* auto-extension */ - const pKvvfs = capi.sqlite3_vfs_find("kvvfs"); - if( pKvvfs ){/* kvvfs-specific glue */ - if(util.isUIThread()){ - const kvvfsMethods = new capi.sqlite3_kvvfs_methods( - wasm.exports.sqlite3__wasm_kvvfs_methods() - ); - delete capi.sqlite3_kvvfs_methods; - - const kvvfsMakeKey = wasm.exports.sqlite3__wasm_kvvfsMakeKeyOnPstack, - pstack = wasm.pstack; - - const kvvfsStorage = (zClass)=> - ((115/*=='s'*/===wasm.peek(zClass)) - ? sessionStorage : localStorage); - - /** - Implementations for members of the object referred to by - sqlite3__wasm_kvvfs_methods(). We swap out the native - implementations with these, which use localStorage or - sessionStorage for their backing store. - */ - const kvvfsImpls = { - xRead: (zClass, zKey, zBuf, nBuf)=>{ - const stack = pstack.pointer, - astack = wasm.scopedAllocPush(); - try { - const zXKey = kvvfsMakeKey(zClass,zKey); - if(!zXKey) return -3/*OOM*/; - const jKey = wasm.cstrToJs(zXKey); - const jV = kvvfsStorage(zClass).getItem(jKey); - if(!jV) return -1; - const nV = jV.length /* We are relying 100% on v being - ASCII so that jV.length is equal - to the C-string's byte length. 
*/; - if(nBuf<=0) return nV; - else if(1===nBuf){ - wasm.poke(zBuf, 0); - return nV; - } - const zV = wasm.scopedAllocCString(jV); - if(nBuf > nV + 1) nBuf = nV + 1; - wasm.heap8u().copyWithin( - Number(zBuf), Number(zV), wasm.ptr.addn(zV, nBuf,- 1) - ); - wasm.poke(wasm.ptr.add(zBuf, nBuf, -1), 0); - return nBuf - 1; - }catch(e){ - sqlite3.config.error("kvstorageRead()",e); - return -2; - }finally{ - pstack.restore(stack); - wasm.scopedAllocPop(astack); - } - }, - xWrite: (zClass, zKey, zData)=>{ - const stack = pstack.pointer; - try { - const zXKey = kvvfsMakeKey(zClass,zKey); - if(!zXKey) return 1/*OOM*/; - const jKey = wasm.cstrToJs(zXKey); - kvvfsStorage(zClass).setItem(jKey, wasm.cstrToJs(zData)); - return 0; - }catch(e){ - sqlite3.config.error("kvstorageWrite()",e); - return capi.SQLITE_IOERR; - }finally{ - pstack.restore(stack); - } - }, - xDelete: (zClass, zKey)=>{ - const stack = pstack.pointer; - try { - const zXKey = kvvfsMakeKey(zClass,zKey); - if(!zXKey) return 1/*OOM*/; - kvvfsStorage(zClass).removeItem(wasm.cstrToJs(zXKey)); - return 0; - }catch(e){ - sqlite3.config.error("kvstorageDelete()",e); - return capi.SQLITE_IOERR; - }finally{ - pstack.restore(stack); - } - } - }/*kvvfsImpls*/; - for(const k of Object.keys(kvvfsImpls)){ - kvvfsMethods[kvvfsMethods.memberKey(k)] = - wasm.installFunction( - kvvfsMethods.memberSignature(k), - kvvfsImpls[k] - ); - } - }else{ - /* Worker thread: unregister kvvfs to avoid it being used - for anything other than local/sessionStorage. It "can" - be used that way but it's not really intended to be. */ - capi.sqlite3_vfs_unregister(pKvvfs); - } - }/*pKvvfs*/ - /* Warn if client-level code makes use of FuncPtrAdapter. 
*/ wasm.xWrap.FuncPtrAdapter.warnOnUse = true; @@ -1944,7 +1889,7 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ } tgt[memKey] = fProxy; }else{ - const pFunc = wasm.installFunction(fProxy, tgt.memberSignature(name)); + const pFunc = wasm.installFunction(fProxy, sigN); tgt[memKey] = pFunc; if(!tgt.ondispose || !tgt.ondispose.__removeFuncList){ tgt.addOnDispose('ondispose.__removeFuncList handler', diff --git a/ext/wasm/api/sqlite3-api-oo1.c-pp.js b/ext/wasm/api/sqlite3-api-oo1.c-pp.js index 8c2f35e677..13aa427194 100644 --- a/ext/wasm/api/sqlite3-api-oo1.c-pp.js +++ b/ext/wasm/api/sqlite3-api-oo1.c-pp.js @@ -26,6 +26,20 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ the sqlite3 binding if, e.g., the wrapper is in the main thread and the sqlite3 API is in a worker. */ + const outWrapper = function(f){ + return (...args)=>f("sqlite3.oo1:",...args); + }; + + const debug = sqlite3.__isUnderTest + ? outWrapper(console.debug.bind(console)) + : outWrapper(sqlite3.config.debug); + const warn = sqlite3.__isUnderTest + ? outWrapper(console.warn.bind(console)) + : outWrapper(sqlite3.config.warn); + const error = sqlite3.__isUnderTest + ? 
outWrapper(console.error.bind(console)) + : outWrapper(sqlite3.config.error); + /** In order to keep clients from manipulating, perhaps inadvertently, the underlying pointer values of DB and Stmt @@ -88,7 +102,8 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ wasm.installFunction('i(ippp)', function(t,c,p,x){ if(capi.SQLITE_TRACE_STMT===t){ // x == SQL, p == sqlite3_stmt* - console.log("SQL TRACE #"+(++this.counter)+' via sqlite3@'+c+':', + console.log("SQL TRACE #"+(++this.counter), + 'via sqlite3@'+c+'['+capi.sqlite3_db_filename(c,null)+']', wasm.cstrToJs(x)); } }.bind({counter: 0})); @@ -200,7 +215,7 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ if(stmt) stmt.finalize(); } }; -//#endif enable-see +//#/if enable-see /** A proxy for DB class constructors. It must be called with the @@ -213,41 +228,18 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ properties: - `.filename`: the db filename. It may be a special name like ":memory:" - or "". + or "". It may also be a URI-style name. - `.flags`: as documented in the DB constructor. - `.vfs`: as documented in the DB constructor. It also accepts those as the first 3 arguments. + + In non-default builds it may accept additional configuration + options. */ const dbCtorHelper = function ctor(...args){ - if(!ctor._name2vfs){ - /** - Map special filenames which we handle here (instead of in C) - to some helpful metadata... - - As of 2022-09-20, the C API supports the names :localStorage: - and :sessionStorage: for kvvfs. However, C code cannot - determine (without embedded JS code, e.g. via Emscripten's - EM_JS()) whether the kvvfs is legal in the current browser - context (namely the main UI thread). In order to help client - code fail early on, instead of it being delayed until they - try to read or write a kvvfs-backed db, we'll check for those - names here and throw if they're not legal in the current - context. 
- */ - ctor._name2vfs = Object.create(null); - const isWorkerThread = ('function'===typeof importScripts/*===running in worker thread*/) - ? (n)=>toss3("The VFS for",n,"is only available in the main window thread.") - : false; - ctor._name2vfs[':localStorage:'] = { - vfs: 'kvvfs', filename: isWorkerThread || (()=>'local') - }; - ctor._name2vfs[':sessionStorage:'] = { - vfs: 'kvvfs', filename: isWorkerThread || (()=>'session') - }; - } const opt = ctor.normalizeArgs(...args); //sqlite3.config.debug("DB ctor",opt); let pDb; @@ -269,12 +261,6 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ sqlite3.config.error("Invalid DB ctor args",opt,arguments); toss3("Invalid arguments for DB constructor:", arguments, "opts:", opt); } - let fnJs = wasm.isPtr(fn) ? wasm.cstrToJs(fn) : fn; - const vfsCheck = ctor._name2vfs[fnJs]; - if(vfsCheck){ - vfsName = vfsCheck.vfs; - fn = fnJs = vfsCheck.filename(fnJs); - } let oflags = 0; if( flagsStr.indexOf('c')>=0 ){ oflags |= capi.SQLITE_OPEN_CREATE | capi.SQLITE_OPEN_READWRITE; @@ -299,7 +285,15 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ }finally{ wasm.pstack.restore(stack); } - this.filename = fnJs; + this.filename = + /* A poor design choice we have to keep: this.filename may be + in the form "file:....?....". It really should have been + sqlite3_db_filename(pDb) but that discrepancy went too long + unnoticed to be able to change without risk of + breakage. DB.dbFilename() can be used to fetch _just_ the + name part. + */ wasm.isPtr(fn) ? wasm.cstrToJs(fn) : fn; + } __ptrMap.set(this, pDb); __stmtMap.set(this, Object.create(null)); @@ -307,7 +301,7 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ try{ //#if enable-see dbCtorApplySEEKey(this,opt); -//#endif +//#/if // Check for per-VFS post-open SQL/callback... 
const pVfs = capi.sqlite3_js_db_vfs(pDb) || toss3("Internal error: cannot get VFS for new db handle."); @@ -390,12 +384,12 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ The given db filename must be resolvable using whatever filesystem layer (virtual or otherwise) is set up for the default - sqlite3 VFS. + sqlite3 VFS or a VFS which can resolve it must be specified. - Note that the special sqlite3 db names ":memory:" and "" - (temporary db) have their normal special meanings here and need - not resolve to real filenames, but "" uses an on-storage - temporary database and requires that the VFS support that. + The special sqlite3 db names ":memory:" and "" (temporary db) + have their normal special meanings here and need not resolve to + real filenames, but "" uses an on-storage temporary database and + requires that the VFS support that. The second argument specifies the open/create mode for the database. It must be string containing a sequence of letters (in @@ -459,7 +453,7 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ is supplied and the database is encrypted, execution of the post-initialization SQL will fail, causing the constructor to throw. -//#endif enable-see +//#/if enable-see The `filename` and `vfs` arguments may be either JS strings or C-strings allocated via WASM. `flags` is required to be a JS @@ -808,6 +802,9 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ sqlite3_db_filename() value for the given database name, defaulting to "main". The argument may be either a JS string or a pointer to a WASM-allocated C-string. + + this.filename may be in the form of a URI-style string, whereas + the returned string contains only the filename part. 
*/ dbFilename: function(dbName='main'){ return capi.sqlite3_db_filename(affirmDbOpen(this).pointer, dbName); @@ -929,15 +926,15 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ result set, but only if that statement has any result rows. The callback's "this" is the options object, noting that this function synthesizes one if the caller does not pass one to - exec(). The second argument passed to the callback is always - the current Stmt object, as it's needed if the caller wants to - fetch the column names or some such (noting that they could - also be fetched via `this.columnNames`, if the client provides - the `columnNames` option). If the callback returns a literal - `false` (as opposed to any other falsy value, e.g. an implicit - `undefined` return), any ongoing statement-`step()` iteration - stops without an error. The return value of the callback is - otherwise ignored. + exec(). The first argument passed to the callback is described + below. The second argument is always the current Stmt object, + as it's needed if the caller wants to fetch the column names or + some such (noting that they could also be fetched via + `this.columnNames`, if the client provides the `columnNames` + option). If the callback returns a literal `false` (as opposed + to any other falsy value, e.g. an implicit `undefined` return), + any ongoing statement-`step()` iteration stops without an + error. The return value of the callback is otherwise ignored. ACHTUNG: The callback MUST NOT modify the Stmt object. Calling any of the Stmt.get() variants, Stmt.getColumnName(), or @@ -970,20 +967,23 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ A.3) `'stmt'` causes the current Stmt to be passed to the callback, but this mode will trigger an exception if `resultRows` is an array because appending the transient - statement to the array would be downright unhelpful. + statement to the array would be downright unhelpful. 
This + option is a legacy feature, retained for backwards + compatibility. The statement object is passed as the second + argument to the callback, as described above. B) An integer, indicating a zero-based column in the result - row. Only that one single value will be passed on. + row. Only that one single value, in JS form, will be passed on. C) A string with a minimum length of 2 and leading character of '$' will fetch the row as an object, extract that one field, - and pass that field's value to the callback. Note that these - keys are case-sensitive so must match the case used in the + and pass that field's value to the callback. These keys are + case-sensitive so must match the case used in the SQL. e.g. `"select a A from t"` with a `rowMode` of `'$A'` would work but `'$a'` would not. A reference to a column not in the result set will trigger an exception on the first row (as - the check is not performed until rows are fetched). Note also - that `$` is a legal identifier character in JS so need not be + the check is not performed until rows are fetched). Note that + `$` is a legal identifier character in JS so need not be quoted. Any other `rowMode` value triggers an exception. @@ -1023,6 +1023,7 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ - `callback` and `resultRows`: permit an array entries with semantics similar to those described for `bind` above. + OTOH, this function already does too much. */ exec: function(/*(sql [,obj]) || (obj)*/){ affirmDbOpen(this); @@ -1046,7 +1047,7 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ /* Optimization: if the SQL is a TypedArray we can save some string conversion costs. */; /* Allocate the two output pointers (ppStmt, pzTail) and heap - space for the SQL (pSql). When prepare_v2() returns, pzTail + space for the SQL (pSql). When prepare_v3() returns, pzTail will point to somewhere in pSql. */ let sqlByteLen = isTA ? 
arg.sql.byteLength : wasm.jstrlen(arg.sql); const ppStmt = wasm.scopedAlloc( @@ -1058,8 +1059,8 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ const pSqlEnd = wasm.ptr.add(pSql, sqlByteLen); if(isTA) wasm.heap8().set(arg.sql, pSql); else wasm.jstrcpy(arg.sql, wasm.heap8(), pSql, sqlByteLen, false); - wasm.poke(wasm.ptr.add(pSql, sqlByteLen), 0/*NUL terminator*/); - while(pSql && wasm.peek(pSql, 'i8') + wasm.poke8(wasm.ptr.add(pSql, sqlByteLen), 0/*NUL terminator*/); + while(pSql && wasm.peek8(pSql) /* Maintenance reminder:^^^ _must_ be 'i8' or else we will very likely cause an endless loop. What that's doing is checking for a terminating NUL byte. If we @@ -1123,6 +1124,7 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ /* In order to trigger an exception in the INSERT...RETURNING locking scenario: https://sqlite.org/forum/forumpost/36f7a2e7494897df + [tag:insert-returning-reset] */).finalize(); stmt = null; }/*prepare() loop*/ @@ -1130,15 +1132,141 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ sqlite3.config.warn("DB.exec() is propagating exception",opt,e); throw e; }*/finally{ - wasm.scopedAllocPop(stack); if(stmt){ __execLock.delete(stmt); stmt.finalize(); } + wasm.scopedAllocPop(stack); } return arg.returnVal(); }/*exec()*/, +//#if 0 + /** + Experimental and untested - do not use. + + Prepares one or more SQL statements, passing each to a callback + for processing. + + It requires an options object with the following properties: + + - "sql": SQL in any format accepted by exec(). + + - "callback" (function): gets passed each prepared statement, + as described below. + + - "asPointer" (bool=false): if true, the callback is passed the + WASM (sqlite3*) pointer instead of a Stmt object. + + - "saveSql" (array): if set, the SQL of each prepared statement + is appended to this array. This can be used without a callback + to split SQL into its component statements. 
Purely empty + statements (for which sqlite3_prepare() returns a NULL + sqlite3_stmt, i.e. spaces and comments) are not added to this + list unless... + + - "saveEmpty" (bool=false): If true, empty statements are + retained in opt.saveSql, but their leading/trailing whitespace + is trimmed (as for queries) so they may be empty. + + For each statement in the input SQL: + + 1) If opt.saveSql is set, the SQL is appended to it. + + 2) If callback is set, callback(S) is called, where S is either + a Stmt object (by default) or an (sqlite3*) WASM pointer (if + opt.asPointer is true). If the callback returns a literal true + (as opposed to any other truthy value), ownership of S is + transferred to the callback, otherwise S is reset and finalized + as soon as the callback returns. If the callback throws, S is + unconditionally finalized. + + If neither of opt.saveSql nor opt.callback are set, this + function does nothing more than prepare and finalize each + statement, which will trigger an exception if any of them + contain invalid SQL. + */ + forEachStmt: function(opt){ + affirmDbOpen(this); + opt ??= Object.create(null); + if(!opt.sql){ + return toss3("exec() requires an SQL string."); + } + const sql = util.flexibleString(opt.sql); + const callback = opt.callback; + let stmt, pStmt; + const stack = wasm.scopedAllocPush(); + const saveSql = Array.isArray(opt.saveSql) ? opt.saveSql : undefined; + try{ + const isTA = util.isSQLableTypedArray(opt.sql) + /* Optimization: if the SQL is a TypedArray we can save some string + conversion costs. */; + /* Allocate the two output pointers (ppStmt, pzTail) and heap + space for the SQL (pSql). When prepare_v3() returns, pzTail + will point to somewhere in pSql. */ + let sqlByteLen = isTA ? 
opt.sql.byteLength : wasm.jstrlen(sql); + const ppStmt = wasm.scopedAlloc( + /* output (sqlite3_stmt**) arg and pzTail */ + (2 * wasm.ptr.size) + (sqlByteLen + 1/* SQL + NUL */) + ); + const pzTail = wasm.ptr.add(ppStmt, wasm.ptr.size) /* final arg to sqlite3_prepare_v2() */; + let pSql = wasm.ptr.add(pzTail, wasm.ptr.size) /* start of the SQL string */; + const pSqlEnd = wasm.ptr.add(pSql, sqlByteLen); + if(isTA) wasm.heap8().set(sql, pSql); + else wasm.jstrcpy(sql, wasm.heap8(), pSql, sqlByteLen, false); + wasm.poke8(wasm.ptr.add(pSql, sqlByteLen), 0/*NUL terminator*/); + while( pSql && wasm.peek8(pSql) ){ + pStmt = stmt = null; + wasm.pokePtr([ppStmt, pzTail], 0); + const zHead = pSql; + DB.checkRc(this, capi.sqlite3_prepare_v3( + this.pointer, pSql, sqlByteLen, 0, ppStmt, pzTail + )); + [pStmt, pSql] = wasm.peekPtr([ppStmt, pzTail]); + sqlByteLen = wasm.ptr.addn(pSqlEnd,-pSql); + if(opt.saveSql){ + if( pStmt ) opt.saveSql.push(capi.sqlite3_sql(pStmt).trim()); + else if( opt.saveEmpty ){ + saveSql.push(wasm.typedArrayToString( + wasm.heap8u(), Number(zHead), + wasm.ptr.addn(zHead, sqlByteLen) + ).trim(/*arguable*/)); + } + } + if(!pStmt) continue; + //sqlite3.config.debug("forEachStmt() pSql =",capi.sqlite3_sql(pStmt)); + if( !opt.callback ){ + capi.sqlite3_finalize(pStmt); + pStmt = null; + continue; + } + stmt = opt.asPointer ? null : new Stmt(this, pStmt, BindTypes); + if( true===callback(stmt || pStmt) ){ + stmt = pStmt = null /*callback took ownership */; + }else if(stmt){ + pStmt = null; + stmt.reset( + /* See [tag:insert-returning-reset]. The thinking here is + that if the callback didn't throw for this, it + probably should have. 
+ */).finalize(); + stmt = null; + }else{ + const rx = capi.sqlite3_reset(pStmt/*[tag:insert-returning-reset]*/); + capi.sqlite3_finalize(pStmt); + pStmt = null; + DB.checkRc(this, rx); + } + }/*prepare() loop*/ + }finally{ + if(stmt) stmt.finalize(); + else if(pStmt) capi.sqlite3_finalize(pStmt); + wasm.scopedAllocPop(stack); + } + return this; + }/*forEachStmt()*/, +//#/if nope + /** Creates a new UDF (User-Defined Function) which is accessible via SQL code. This function may be called in any of the @@ -2295,56 +2423,7 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ Stmt }/*oo1 object*/; - if(util.isUIThread()){ - /** - Functionally equivalent to DB(storageName,'c','kvvfs') except - that it throws if the given storage name is not one of 'local' - or 'session'. - - As of version 3.46, the argument may optionally be an options - object in the form: - - { - filename: 'session'|'local', - ... etc. (all options supported by the DB ctor) - } - - noting that the 'vfs' option supported by main DB - constructor is ignored here: the vfs is always 'kvvfs'. - */ - sqlite3.oo1.JsStorageDb = function(storageName='session'){ - const opt = dbCtorHelper.normalizeArgs(...arguments); - storageName = opt.filename; - if('session'!==storageName && 'local'!==storageName){ - toss3("JsStorageDb db name must be one of 'session' or 'local'."); - } - opt.vfs = 'kvvfs'; - dbCtorHelper.call(this, opt); - }; - const jdb = sqlite3.oo1.JsStorageDb; - jdb.prototype = Object.create(DB.prototype); - /** Equivalent to sqlite3_js_kvvfs_clear(). */ - jdb.clearStorage = capi.sqlite3_js_kvvfs_clear; - /** - Clears this database instance's storage or throws if this - instance has been closed. Returns the number of - database blocks which were cleaned up. - */ - jdb.prototype.clearStorage = function(){ - return jdb.clearStorage(affirmDbOpen(this).filename); - }; - /** Equivalent to sqlite3_js_kvvfs_size(). 
*/ - jdb.storageSize = capi.sqlite3_js_kvvfs_size; - /** - Returns the _approximate_ number of bytes this database takes - up in its storage or throws if this instance has been closed. - */ - jdb.prototype.storageSize = function(){ - return jdb.storageSize(affirmDbOpen(this).filename); - }; - }/*main-window-only bits*/ - }); //#else /* Built with the omit-oo1 flag. */ -//#endif if not omit-oo1 +//#/if if not omit-oo1 diff --git a/ext/wasm/api/sqlite3-api-prologue.js b/ext/wasm/api/sqlite3-api-prologue.js index 069f3fdb5c..e7b775fe51 100644 --- a/ext/wasm/api/sqlite3-api-prologue.js +++ b/ext/wasm/api/sqlite3-api-prologue.js @@ -27,13 +27,14 @@ /** sqlite3ApiBootstrap() is the only global symbol persistently exposed by this API. It is intended to be called one time at the - end of the API amalgamation process, passed configuration details - for the current environment, and then optionally be removed from - the global object using `delete globalThis.sqlite3ApiBootstrap`. + end of the API amalgamation process and passed configuration details + for the current environment. This function is not intended for client-level use. It is intended for use in creating bundles configured for specific WASM - environments. + environments. That said, the "sqlite3-api.js" intermediary build + file aims to be suitable for dropping in to custom builds, and it + exposes only this function. This function expects a configuration object, intended to abstract away details specific to any given WASM environment, primarily so @@ -93,9 +94,17 @@ can be replaced with (e.g.) empty functions to squelch all such output. - - `wasmfsOpfsDir`[^1]: Specifies the "mount point" of the OPFS-backed - filesystem in WASMFS-capable builds. + - `wasmfsOpfsDir`[^1]: Specifies the "mount point" of the + OPFS-backed filesystem in WASMFS-capable builds. This is only + used in WASMFS-capable builds of the library (which the canonical + builds do not include). 
+ - `disable` (as of 3.53.0) may be an object with the following + properties: + - `vfs`, an object, may contain a map of VFS names to booleans. + Any mapping to falsy are disabled. The supported names + are: "kvvfs", "opfs", "opfs-sahpool", "opfs-wl". + - Other disabling options may be added in the future. [^1] = This property may optionally be a function, in which case this function calls that function to fetch the value, @@ -125,7 +134,8 @@ Both sqlite3ApiBootstrap.defaultConfig and globalThis.sqlite3ApiConfig get deleted by sqlite3ApiBootstrap() because any changes to them made after that point would have no - useful effect. + useful effect. This function also deletes itself from globalThis + when it's called. This function returns a Promise to the sqlite3 namespace object, which resolves after the async pieces of the library init are @@ -142,7 +152,8 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( ); return sqlite3ApiBootstrap.sqlite3; } - const config = Object.assign(Object.create(null),{ + const nu = (...obj)=>Object.assign(Object.create(null),...obj); + const config = nu({ exports: undefined, memory: undefined, bigIntEnabled: !!globalThis.BigInt64Array, @@ -159,7 +170,7 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( certain wasm.xWrap.resultAdapter()s. */ useStdAlloc: false - }, apiConfig || {}); + }, apiConfig); Object.assign(config, { allocExportName: config.useStdAlloc ? 'malloc' : 'sqlite3_malloc', @@ -177,14 +188,6 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( } }); - /** - Eliminate any confusion about whether these config objects may - be used after library initialization by eliminating the outward-facing - objects... - */ - delete globalThis.sqlite3ApiConfig; - delete sqlite3ApiBootstrap.defaultConfig; - /** The main sqlite3 binding API gets installed into this object, mimicking the C API as closely as we can. 
The numerous members @@ -200,7 +203,7 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( not documented are installed as 1-to-1 proxies for their C-side counterparts. */ - const capi = Object.create(null); + const capi = nu(); /** Holds state which are specific to the WASM-related infrastructure and glue code. @@ -209,7 +212,7 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( dynamically after the api object is fully constructed, so not all are documented in this file. */ - const wasm = Object.create(null); + const wasm = nu(); /** Internal helper for SQLite3Error ctor. */ const __rcStr = (rc)=>{ @@ -757,6 +760,12 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( toss: function(...args){throw new Error(args.join(' '))}, toss3, typedArrayPart: wasm.typedArrayPart, + nu, + assert: function(arg,msg){ + if( !arg ){ + util.toss("Assertion failed:",msg); + } + }, /** Given a byte array or ArrayBuffer, this function throws if the lead bytes of that buffer do not hold a SQLite3 database header, @@ -796,25 +805,10 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( /** wasm.X properties which are used for configuring the wasm - environment via whwashutil.js. + environment via whwashutil.js. This object gets fleshed out with + a number of WASM-specific utilities, in sqlite3-api-glue.c-pp.js. */ Object.assign(wasm, { - /** - The WASM IR (Intermediate Representation) value for - pointer-type values. If set then it MUST be one of 'i32' or - 'i64' (else an exception will be thrown). If it's not set, it - will default to 'i32'. - */ - pointerIR: config.wasmPtrIR, - - /** - True if BigInt support was enabled via (e.g.) the - Emscripten -sWASM_BIGINT flag, else false. When - enabled, certain 64-bit sqlite3 APIs are enabled which - are not otherwise enabled due to JS/WASM int64 - impedance mismatches. - */ - bigIntEnabled: !!config.bigIntEnabled, /** The symbols exported by the WASM environment. 
@@ -825,8 +819,9 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( /** When Emscripten compiles with `-sIMPORTED_MEMORY`, it initializes the heap and imports it into wasm, as opposed to - the other way around. In this case, the memory is not - available via this.exports.memory. + the other way around. In this case, the memory is not available + via this.exports.memory so the client must pass it in via + config.memory. */ memory: config.memory || config.exports['memory'] @@ -834,6 +829,29 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( "in either config.exports.memory (exported)", "or config.memory (imported)."), + /** + The WASM pointer size. If set then it MUST be one of 4 or 8 and + it MUST correspond to the WASM environment's pointer size. We + figure out the size by calling some un-JS-wrapped WASM function + which returns a pointer-type value. If that value is a BigInt, + it's 64-bit, else it's 32-bit. The pieces which populate + sqlite3.wasm (whwasmutil.js) can figure this out _if_ they can + allocate, but we have a chicken/egg situation there which makes + it illegal for that code to invoke wasm.dealloc() at the time + it would be needed. So we need to configure it ahead of time + (here) instead. + */ + pointerSize: ('number'===typeof config.exports.sqlite3_libversion()) ? 4 : 8, + + /** + True if BigInt support was enabled via (e.g.) the + Emscripten -sWASM_BIGINT flag, else false. When + enabled, certain 64-bit sqlite3 APIs are enabled which + are not otherwise enabled due to JS/WASM int64 + impedance mismatches. + */ + bigIntEnabled: !!config.bigIntEnabled, + /** WebAssembly.Table object holding the indirect function call table. Defaults to exports.__indirect_function_table. 
@@ -883,7 +901,7 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( Like this.alloc.impl(), this.realloc.impl() is a direct binding to the underlying realloc() implementation which does not throw - exceptions, instead returning 0 on allocation error. + exceptions, instead returning 0 (or 0n) on allocation error. */ realloc: undefined/*installed later*/, @@ -949,7 +967,11 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( }; wasm.realloc.impl = wasm.exports[keyRealloc]; wasm.dealloc = function f(m){ - f.impl(wasm.ptr.coerce(m)/*tag:64bit*/); + f.impl(wasm.ptr.coerce(m)/*tag:64bit*/) + /* This coerce() is the reason we have to set wasm.pointerSize before + calling WhWasmUtilInstaller(). If we don't, that code will call + into this very early in its init, before wasm.ptr has been set up, + resulting in a null deref here. */; }; wasm.dealloc.impl = wasm.exports[keyDealloc]; } @@ -995,7 +1017,7 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( rv[1] = m ? (f._rxInt.test(m[2]) ? +m[2] : m[2]) : true; }; } - const rc = Object.create(null), ov = [0,0]; + const rc = nu(), ov = [0,0]; let i = 0, k; while((k = capi.sqlite3_compileoption_get(i++))){ f._opt(k,ov); @@ -1003,7 +1025,7 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( } return f._result = rc; }else if(Array.isArray(optName)){ - const rc = Object.create(null); + const rc = nu(); optName.forEach((v)=>{ rc[v] = capi.sqlite3_compileoption_used(v); }); @@ -1020,18 +1042,18 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( }/*compileOptionUsed()*/; /** - sqlite3.wasm.pstack (pseudo-stack) holds a special-case intended - solely for short-lived, small data. In practice, it's primarily - used to allocate output pointers. It mus not be used for any - memory which needs to outlive the scope in which it's obtained - from pstack. 
+ sqlite3.wasm.pstack (pseudo-stack) holds a special-case allocator + intended solely for short-lived, small data. In practice, it's + primarily used to allocate output pointers. It must not be used + for any memory which needs to outlive the scope in which it's + obtained from pstack. The library guarantees only that a minimum of 2kb are available in this allocator, and it may provide more (it's a build-time value). pstack.quota and pstack.remaining can be used to get the total resp. remaining amount of memory. - It has only a single intended usage: + It has only a single intended usage pattern: ``` const stackPos = pstack.pointer; @@ -1048,15 +1070,13 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( ``` This allocator is much faster than a general-purpose one but is - limited to usage patterns like the one shown above. + limited to usage patterns like the one shown above (which are + pretty common when using sqlite3.capi). - It operates from a static range of memory which lives outside of - space managed by Emscripten's stack-management, so does not - collide with Emscripten-provided stack allocation APIs. The - memory lives in the WASM heap and can be used with routines such - as wasm.poke() and wasm.heap8u().slice(). + The memory lives in the WASM heap and can be used with routines + such as wasm.poke() and wasm.heap8u().slice(). */ - wasm.pstack = Object.assign(Object.create(null),{ + wasm.pstack = nu({ /** Sets the current pstack position to the given pointer. Results are undefined if the passed-in value did not come from @@ -1128,7 +1148,7 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( argument: if it's 1, it returns a single pointer value. If it's more than 1, it returns the same as allocChunks(). - When a returned pointers will refer to a 64-bit value, e.g. a + When a returned pointer will refer to a 64-bit value, e.g. a double or int64, and that value must be written or fetched, e.g. 
using wasm.poke() or wasm.peek(), it is important that the pointer in question be aligned to an 8-byte @@ -1196,6 +1216,9 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( } })/*wasm.pstack properties*/; + /** + Docs: https://sqlite.org/wasm/doc/trunk/api-c-style.md#sqlite3_randomness + */ capi.sqlite3_randomness = (...args)=>{ if(1===args.length && util.isTypedArray(args[0]) @@ -1230,8 +1253,6 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( wasm.exports.sqlite3_randomness(...args); }; - /** State for sqlite3_wasmfs_opfs_dir(). */ - let __wasmfsOpfsDir = undefined; /** If the wasm environment has a WASMFS/OPFS-backed persistent storage directory, its path is returned by this function. If it @@ -1255,7 +1276,7 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( WASMFS capability requires a custom build. */ capi.sqlite3_wasmfs_opfs_dir = function(){ - if(undefined !== __wasmfsOpfsDir) return __wasmfsOpfsDir; + if(undefined !== this.dir) return this.dir; // If we have no OPFS, there is no persistent dir const pdir = config.wasmfsOpfsDir; if(!pdir @@ -1263,21 +1284,21 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( || !globalThis.FileSystemDirectoryHandle || !globalThis.FileSystemFileHandle || !wasm.exports.sqlite3__wasm_init_wasmfs){ - return __wasmfsOpfsDir = ""; + return this.dir = ""; } try{ if(pdir && 0===wasm.xCallWrapped( 'sqlite3__wasm_init_wasmfs', 'i32', ['string'], pdir )){ - return __wasmfsOpfsDir = pdir; + return this.dir = pdir; }else{ - return __wasmfsOpfsDir = ""; + return this.dir = ""; } }catch(e){ // sqlite3__wasm_init_wasmfs() is not available - return __wasmfsOpfsDir = ""; + return this.dir = ""; } - }; + }.bind(nu()); /** Returns true if sqlite3.capi.sqlite3_wasmfs_opfs_dir() is a @@ -1333,7 +1354,7 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( */ capi.sqlite3_js_vfs_list = function(){ const rc = []; - let pVfs = 
capi.sqlite3_vfs_find(wasm.ptr.coerce(0)); + let pVfs = capi.sqlite3_vfs_find(wasm.ptr.null); while(pVfs){ const oVfs = new capi.sqlite3_vfs(pVfs); rc.push(wasm.cstrToJs(oVfs.$zName)); @@ -1401,7 +1422,7 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( or not provided, then "main" is assumed. */ capi.sqlite3_js_db_vfs = - (dbPointer, dbName=0)=>util.sqlite3__wasm_db_vfs(dbPointer, dbName); + (dbPointer, dbName=wasm.ptr.null)=>util.sqlite3__wasm_db_vfs(dbPointer, dbName); /** A thin wrapper around capi.sqlite3_aggregate_context() which @@ -1597,86 +1618,6 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( return x===v ? undefined : x; } - if( util.isUIThread() ){ - /* Features specific to the main window thread... */ - - /** - Internal helper for sqlite3_js_kvvfs_clear() and friends. - Its argument should be one of ('local','session',""). - */ - const __kvvfsInfo = function(which){ - const rc = Object.create(null); - rc.prefix = 'kvvfs-'+which; - rc.stores = []; - if('session'===which || ""===which) rc.stores.push(globalThis.sessionStorage); - if('local'===which || ""===which) rc.stores.push(globalThis.localStorage); - return rc; - }; - - /** - Clears all storage used by the kvvfs DB backend, deleting any - DB(s) stored there. Its argument must be either 'session', - 'local', or "". In the first two cases, only sessionStorage - resp. localStorage is cleared. If it's an empty string (the - default) then both are cleared. Only storage keys which match - the pattern used by kvvfs are cleared: any other client-side - data are retained. - - This function is only available in the main window thread. - - Returns the number of entries cleared. 
- */ - capi.sqlite3_js_kvvfs_clear = function(which=""){ - let rc = 0; - const kvinfo = __kvvfsInfo(which); - kvinfo.stores.forEach((s)=>{ - const toRm = [] /* keys to remove */; - let i; - for( i = 0; i < s.length; ++i ){ - const k = s.key(i); - if(k.startsWith(kvinfo.prefix)) toRm.push(k); - } - toRm.forEach((kk)=>s.removeItem(kk)); - rc += toRm.length; - }); - return rc; - }; - - /** - This routine guesses the approximate amount of - window.localStorage and/or window.sessionStorage in use by the - kvvfs database backend. Its argument must be one of - ('session', 'local', ""). In the first two cases, only - sessionStorage resp. localStorage is counted. If it's an empty - string (the default) then both are counted. Only storage keys - which match the pattern used by kvvfs are counted. The returned - value is the "length" value of every matching key and value, - noting that JavaScript stores each character in 2 bytes. - - Note that the returned size is not authoritative from the - perspective of how much data can fit into localStorage and - sessionStorage, as the precise algorithms for determining - those limits are unspecified and may include per-entry - overhead invisible to clients. - */ - capi.sqlite3_js_kvvfs_size = function(which=""){ - let sz = 0; - const kvinfo = __kvvfsInfo(which); - kvinfo.stores.forEach((s)=>{ - let i; - for(i = 0; i < s.length; ++i){ - const k = s.key(i); - if(k.startsWith(kvinfo.prefix)){ - sz += k.length; - sz += s.getItem(k).length; - } - } - }); - return sz * 2 /* because JS uses 2-byte char encoding */; - }; - - }/* main-window-only bits */ - /** Wraps all known variants of the C-side variadic sqlite3_db_config(). 
@@ -1711,6 +1652,7 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( case capi.SQLITE_DBCONFIG_ENABLE_ATTACH_CREATE: case capi.SQLITE_DBCONFIG_ENABLE_ATTACH_WRITE: case capi.SQLITE_DBCONFIG_ENABLE_COMMENTS: + case capi.SQLITE_DBCONFIG_FP_DIGITS: if( !this.ip ){ this.ip = wasm.xWrap('sqlite3__wasm_db_config_ip','int', ['sqlite3*', 'int', 'int', '*']); @@ -1732,7 +1674,7 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( default: return capi.SQLITE_MISUSE; } - }.bind(Object.create(null)); + }.bind(nu()); /** Given a (sqlite3_value*), this function attempts to convert it @@ -1953,55 +1895,114 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( return (0===v) ? undefined : capi.sqlite3_value_to_js(v, throwIfCannotConvert); }; - /** - Internal impl of sqlite3_preupdate_new/old_js() and - sqlite3changeset_new/old_js(). - */ - const __newOldValue = function(pObj, iCol, impl){ - impl = capi[impl]; - if(!this.ptr) this.ptr = wasm.allocPtr(); - else wasm.pokePtr(this.ptr, 0); - const rc = impl(pObj, iCol, this.ptr); - if(rc) return SQLite3Error.toss(rc,arguments[2]+"() failed with code "+rc); - const pv = wasm.peekPtr(this.ptr); - return pv ? capi.sqlite3_value_to_js( pv, true ) : undefined; - }.bind(Object.create(null)); + if( true ){ /* changeset/preupdate additions... */ + /** + Internal impl of sqlite3_preupdate_new/old_js() and + sqlite3changeset_new/old_js(). + */ + const __newOldValue = function(pObj, iCol, impl){ + impl = capi[impl]; + if(!this.ptr) this.ptr = wasm.allocPtr(); + else wasm.pokePtr(this.ptr, 0); + const rc = impl(pObj, iCol, this.ptr); + if(rc) return SQLite3Error.toss(rc,arguments[2]+"() failed with code "+rc); + const pv = wasm.peekPtr(this.ptr); + return pv ? capi.sqlite3_value_to_js( pv, true ) : undefined; + }.bind(nu()); - /** - A wrapper around sqlite3_preupdate_new() which fetches the - sqlite3_value at the given index and returns the result of - passing it to sqlite3_value_to_js(). 
Throws on error. - */ - capi.sqlite3_preupdate_new_js = - (pDb, iCol)=>__newOldValue(pDb, iCol, 'sqlite3_preupdate_new'); + /** + A wrapper around sqlite3_preupdate_new() which fetches the + sqlite3_value at the given index and returns the result of + passing it to sqlite3_value_to_js(). Throws on error. + */ + capi.sqlite3_preupdate_new_js = + (pDb, iCol)=>__newOldValue(pDb, iCol, 'sqlite3_preupdate_new'); - /** - The sqlite3_preupdate_old() counterpart of - sqlite3_preupdate_new_js(), with an identical interface. - */ - capi.sqlite3_preupdate_old_js = - (pDb, iCol)=>__newOldValue(pDb, iCol, 'sqlite3_preupdate_old'); + /** + The sqlite3_preupdate_old() counterpart of + sqlite3_preupdate_new_js(), with an identical interface. + */ + capi.sqlite3_preupdate_old_js = + (pDb, iCol)=>__newOldValue(pDb, iCol, 'sqlite3_preupdate_old'); - /** - A wrapper around sqlite3changeset_new() which fetches the - sqlite3_value at the given index and returns the result of - passing it to sqlite3_value_to_js(). Throws on error. + /** + A wrapper around sqlite3changeset_new() which fetches the + sqlite3_value at the given index and returns the result of + passing it to sqlite3_value_to_js(). Throws on error. + + If sqlite3changeset_new() succeeds but has no value to report, + this function returns the undefined value, noting that + undefined is not a valid conversion from an `sqlite3_value`, so + is unambiguous. + */ + capi.sqlite3changeset_new_js = + (pChangesetIter, iCol) => __newOldValue(pChangesetIter, iCol, + 'sqlite3changeset_new'); - If sqlite3changeset_new() succeeds but has no value to report, - this function returns the undefined value, noting that undefined - is a valid conversion from an `sqlite3_value`, so is unambiguous. - */ - capi.sqlite3changeset_new_js = - (pChangesetIter, iCol) => __newOldValue(pChangesetIter, iCol, - 'sqlite3changeset_new'); + /** + The sqlite3changeset_old() counterpart of + sqlite3changeset_new_js(), with an identical interface. 
+ */ + capi.sqlite3changeset_old_js = + (pChangesetIter, iCol)=>__newOldValue(pChangesetIter, iCol, + 'sqlite3changeset_old'); + }/*changeset/preupdate additions*/ /** - The sqlite3changeset_old() counterpart of - sqlite3changeset_new_js(), with an identical interface. + EXPERIMENTAL. For tentative addition in 3.53.0. + + sqlite3_js_retry_busy(maxTimes,callback[,beforeRetry]) + + Calls the given _synchronous_ callback function. If that function + returns sqlite3.capi.SQLITE_BUSY _or_ throws an SQLite3Error + with a resultCode property of that value then it will suppress + that error and try again, up to the given maximum number of + times. If the callback returns any other value than that, + it is returned. If the maximum number of retries has been + reached, an SQLite3Error with a resultCode value of + sqlite3.capi.SQLITE_BUSY is thrown. If the callback throws any + exception other than the aforementioned BUSY exception, it is + propagated. If it throws a BUSY exception on its final attempt, + that is propagated as well. + + If the beforeRetry argument is given, it must be a _synchronous_ + function. It is called immediately before each retry of the + callback (not for the initial call), passed the attempt number + (so it starts with 2, not 1). If it throws, the exception is + handled as described above. Its result value is ignored. + + To effectively retry "forever", pass a negative maxTimes value, + with the caveat that there is no recovery from that unless the + beforeRetry() can figure out when to throw. + + TODO: an async variant of this. 
*/ - capi.sqlite3changeset_old_js = - (pChangesetIter, iCol)=>__newOldValue(pChangesetIter, iCol, - 'sqlite3changeset_old'); + capi.sqlite3_js_retry_busy = function(maxTimes, callback, beforeRetry){ + for(let n = 1; n <= maxTimes; ++n){ + try{ + if( beforeRetry && n>1 ) beforeRetry(n); + const rc = callback(); + if( capi.SQLITE_BUSY===rc ){ + if( n===maxTimes ){ + throw new SQLite3Error(rc, [ + "sqlite3_js_retry_busy() max retry attempts (", + maxTimes, + ") reached." + ].join('')); + } + continue; + } + return rc; + }catch(e){ + if( n{ if(!sqlite3.__isUnderTest){ /* Delete references to internal-only APIs which are used by some initializers. Retain them when running in test mode so that we can add tests for them. */ delete sqlite3.util; - /* It's conceivable that we might want to expose - StructBinder to client-side code, but it's only useful if - clients build their own sqlite3.wasm which contains their - own C struct types. */ delete sqlite3.StructBinder; + delete sqlite3.opfs; } return sqlite3; }; @@ -2090,21 +2086,26 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( let p = Promise.resolve(sqlite3); while(lia.length) p = p.then(lia.shift()); return ff.isReady = p.catch(catcher); - }, + }.bind(sqlite3ApiBootstrap), /** - scriptInfo ideally gets injected into this object by the - infrastructure which assembles the JS/WASM module. It contains - state which must be collected before sqlite3ApiBootstrap() can - be declared. It is not necessarily available to any - sqlite3ApiBootstrap.initializers but "should" be in place (if - it's added at all) by the time that - sqlite3ApiBootstrap.initializersAsync is processed. + scriptInfo holds information about the currently-loading script + so that we can locate the WASM file if it's somewhere other + than the build-time-defined directory. It ideally gets injected + into this object by the infrastructure which assembles the + JS/WASM module.
It contains state which must be collected + before sqlite3ApiBootstrap() can be declared. It is not + necessarily available to any sqlite3ApiBootstrap.initializers + but "should" be in place (if it's added at all) by the time + that sqlite3ApiBootstrap.initializersAsync is processed. This state is not part of the public API, only intended for use with the sqlite3 API bootstrapping and wasm-loading process. */ scriptInfo: undefined }; + if( 'undefined'!==typeof sqlite3IsUnderTest/* from post-js-header.js */ ){ + sqlite3.__isUnderTest = !!sqlite3IsUnderTest; + } try{ sqlite3ApiBootstrap.initializers.forEach((f)=>{ f(sqlite3); @@ -2117,16 +2118,34 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( } delete sqlite3ApiBootstrap.initializers; sqlite3ApiBootstrap.sqlite3 = sqlite3; - delete globalThis.sqlite3ApiBootstrap; - delete globalThis.sqlite3ApiConfig; - sqlite3InitScriptInfo.debugModule( - "sqlite3ApiBootstrap() complete", sqlite3 - ); - sqlite3.scriptInfo /* used by some async init code */ = - sqlite3InitScriptInfo /* from post-js-header.js */; - if( (sqlite3.__isUnderTest = sqlite3IsUnderTest /* from post-js-header.js */) ){ - sqlite3.config.emscripten = EmscriptenModule; - const iw = sqlite3InitScriptInfo.instantiateWasm; + if( 'undefined'!==typeof sqlite3InitScriptInfo/* from post-js-header.js */ ){ + sqlite3InitScriptInfo.debugModule( + "sqlite3ApiBootstrap() complete", sqlite3 + ); + sqlite3.scriptInfo + /* Used by some async init code. As of 2025-11-15 this is still + in use by the OPFS VFS for locating its worker. In non-Emscripten + builds, this would need to be injected in somewhere to get + that VFS loading. 
*/ = sqlite3InitScriptInfo; + } + if( sqlite3.__isUnderTest ){ + if( 'undefined'!==typeof EmscriptenModule ){ + sqlite3.config.emscripten = EmscriptenModule; + } + /* + The problem with exposing these pieces (in non-testing runs) via + sqlite3.wasm is that it exposes non-SQLite pieces to the + clients, who may come to expect it to remain. _We_ only have + these data because we've overridden Emscripten's wasm file + loader, and if we lose that capability for some reason then + we'll lose access to this metadata. + + These data are interesting for exploring how the wasm/JS pieces + connect, e.g. for exploring exactly what Emscripten imports into + WASM from its JS glue, but it's not SQLite-related and is not + required for the library to work. + */ + const iw = sqlite3.scriptInfo?.instantiateWasm; if( iw ){ /* Metadata injected by the custom Module.instantiateWasm() in pre-js.c-pp.js. */ @@ -2135,10 +2154,21 @@ globalThis.sqlite3ApiBootstrap = async function sqlite3ApiBootstrap( sqlite3.wasm.imports = iw.imports; } } + + /** + Eliminate any confusion about whether these config objects may + be used after library initialization by eliminating the outward-facing + objects... 
+ */ + delete globalThis.sqlite3ApiConfig; + delete globalThis.sqlite3ApiBootstrap; + delete sqlite3ApiBootstrap.defaultConfig; return sqlite3.asyncPostInit().then((s)=>{ - sqlite3InitScriptInfo.debugModule( - "sqlite3.asyncPostInit() complete", sqlite3 - ); + if( 'undefined'!==typeof sqlite3InitScriptInfo/* from post-js-header.js */ ){ + sqlite3InitScriptInfo.debugModule( + "sqlite3.asyncPostInit() complete", s + ); + } delete s.asyncPostInit; delete s.scriptInfo; delete s.emscripten; diff --git a/ext/wasm/api/sqlite3-api-worker1.c-pp.js b/ext/wasm/api/sqlite3-api-worker1.c-pp.js index 25262abf85..d2103ca850 100644 --- a/ext/wasm/api/sqlite3-api-worker1.c-pp.js +++ b/ext/wasm/api/sqlite3-api-worker1.c-pp.js @@ -677,4 +677,4 @@ sqlite3.initWorker1API = function(){ }); //#else /* Built with the omit-oo1 flag. */ -//#endif if not omit-oo1 +//#/if if not omit-oo1 diff --git a/ext/wasm/api/sqlite3-license-version-header.js b/ext/wasm/api/sqlite3-license-version-header.js index 4829894638..dd32f4666e 100644 --- a/ext/wasm/api/sqlite3-license-version-header.js +++ b/ext/wasm/api/sqlite3-license-version-header.js @@ -1,4 +1,5 @@ -/* +/* @preserve +** ** LICENSE for the sqlite3 WebAssembly/JavaScript APIs. ** ** This bundle (typically released as sqlite3.js or sqlite3.mjs) diff --git a/ext/wasm/api/sqlite3-opfs-async-proxy.js b/ext/wasm/api/sqlite3-opfs-async-proxy.c-pp.js similarity index 62% rename from ext/wasm/api/sqlite3-opfs-async-proxy.js rename to ext/wasm/api/sqlite3-opfs-async-proxy.c-pp.js index e10d0dd505..d5df0b9f79 100644 --- a/ext/wasm/api/sqlite3-opfs-async-proxy.js +++ b/ext/wasm/api/sqlite3-opfs-async-proxy.c-pp.js @@ -1,4 +1,4 @@ -/* +/* @preserve 2022-09-16 The author disclaims copyright to this source code. In place of a @@ -46,10 +46,52 @@ theFunc().then(...) is not compatible with the change to synchronous, but we do do not use those APIs that way. i.e. 
we don't _need_ to change anything for this, but at some point (after Chrome - versions (approximately) 104-107 are extinct) should change our + versions (approximately) 104-107 are extinct) we should change our usage of those methods to remove the "await". */ +//#if 0 +/** + 2026-04-04: this file gets included by both the "opfs" and "opfs-wl" + VFSes. It would, in hindsight, hypothetically be possible to restructure + it very slightly to support both VFSes via a single Worker instance. + + Some of the changes we would need for that: + + - The xLock/xUnlock "op codes" would need to differ for each impl. + i.e. we'd need state.opIds.xLock{,WL} and state.opIds.xUnlock{,WL} + to distinguish between the two, rather than doing so when this Worker + is loaded. + + - We would need to centralize loading of this Worker, outside of + the VFS-specific pieces, and change the handshake in order to be + able to distinguish between clients which support + Atomics.waitAsync() and those which do not ("opfs-wl" requires + waitAsync()). + + One down-side would be for clients which, for whatever reason, want + to use both "opfs" and "opfs-wl" within the same session: because + both would go through the same Worker, any operations for one VFS + would, while they're being processed on this side of the proxy, + effectively block the other VFS from doing anything, potentially + deadlocking. This use case seems unlikely enough that it can + possibly be ruled out (or even reasonably flat-out prohibited by + the library). +*/ +//#/if + "use strict"; +const urlParams = new URL(globalThis.location.href).searchParams; +const vfsName = urlParams.get('vfs'); +if( !vfsName ){ + throw new Error("Expecting vfs=opfs|opfs-wl URL argument for this worker"); +} +/** + We use this to allow us to differentiate debug output from + multiple instances, e.g. multiple Workers to the "opfs" + VFS or both the "opfs" and "opfs-wl" VFSes. 
+*/ +const workerId = (Math.random() * 10000000) | 0; +const isWebLocker = 'opfs-wl'===urlParams.get('vfs'); const wPost = (type,...args)=>postMessage({type, payload:args}); const installAsyncProxy = function(){ const toss = function(...args){throw new Error(args.join(' '))}; @@ -66,6 +108,13 @@ const installAsyncProxy = function(){ */ const state = Object.create(null); + /* initS11n() is preprocessor-injected so that we have identical + copies in the synchronous and async halves. This side does not + load the SQLite library, so does not have access to that copy. */ +//#define opfs-async-proxy +//#include api/opfs-common-inline.c-pp.js +//#undef opfs-async-proxy + /** verbose: @@ -82,7 +131,7 @@ const installAsyncProxy = function(){ 2:console.log.bind(console) }; const logImpl = (level,...args)=>{ - if(state.verbose>level) loggers[level]("OPFS asyncer:",...args); + if(state.verbose>level) loggers[level](vfsName+' async-proxy',workerId+":",...args); }; const log = (...args)=>logImpl(2, ...args); const warn = (...args)=>logImpl(1, ...args); @@ -97,12 +146,13 @@ const installAsyncProxy = function(){ */ const __openFiles = Object.create(null); /** - __implicitLocks is a Set of sqlite3_file pointers (integers) which were - "auto-locked". i.e. those for which we obtained a sync access - handle without an explicit xLock() call. Such locks will be - released during db connection idle time, whereas a sync access - handle obtained via xLock(), or subsequently xLock()'d after - auto-acquisition, will not be released until xUnlock() is called. + __implicitLocks is a Set of sqlite3_file pointers (integers) + which were "auto-locked". i.e. those for which we necessarily + obtain a sync access handle without an explicit xLock() call + guarding access. Such locks will be released during + `waitLoop()`'s idle time, whereas a sync access handle obtained + via xLock(), or subsequently xLock()'d after auto-acquisition, + will not be released until xUnlock() is called. 
Maintenance reminder: if we relinquish auto-locks at the end of the operation which acquires them, we pay a massive performance @@ -271,10 +321,11 @@ const installAsyncProxy = function(){ In order to help alleviate cross-tab contention for a dabase, if an exception is thrown while acquiring the handle, this routine - will wait briefly and try again, up to some fixed number of - times. If acquisition still fails at that point it will give up - and propagate the exception. Client-level code will see that as - an I/O error. + will wait briefly and try again, up to `maxTries` times. If + acquisition still fails at that point it will give up and + propagate the exception. Client-level code will see that either + as an I/O error or SQLITE_BUSY, depending on the exception and + the context. 2024-06-12: there is a rare race condition here which has been reported a single time: @@ -289,13 +340,31 @@ const installAsyncProxy = function(){ there's another race condition there). That's easy to say but creating a viable test for that condition has proven challenging so far. + + Interface quirk: if fh.xLock is falsy and the handle is acquired + then fh.fid is added to __implicitLocks(). If fh.xLock is truthy, + it is not added as an implicit lock. i.e. xLock() impls must set + fh.xLock immediately _before_ calling this and must arrange to + restore it to its previous value if this function throws. + + 2026-03-06: + + - baseWaitTime is the number of milliseconds to wait for the + first retry, increasing by one factor for each retry. It defaults + to (state.asyncIdleWaitTime*2). + + - maxTries is the number of attempts to make, each one spaced out + by one additional factor of the baseWaitTime (e.g. 300, then 600, + then 900, then 1200...). This MUST be an integer >0. + + Only the Web Locks impl should use the 3rd and 4th parameters.
*/ - const getSyncHandle = async (fh,opName)=>{ + const getSyncHandle = async (fh, opName, baseWaitTime, maxTries = 6)=>{ if(!fh.syncHandle){ const t = performance.now(); log("Acquiring sync handle for",fh.filenameAbs); - const maxTries = 6, - msBase = state.asyncIdleWaitTime * 2; + const msBase = baseWaitTime ?? (state.asyncIdleWaitTime * 2); + maxTries ??= 6; let i = 1, ms = msBase; for(; true; ms = msBase * ++i){ try { @@ -329,6 +398,9 @@ const installAsyncProxy = function(){ /** Stores the given value at state.sabOPView[state.opIds.rc] and then Atomics.notify()'s it. + + The opName is only used for logging and debugging - all result + codes are expected on the same state.sabOPView slot. */ const storeAndNotify = (opName, value)=>{ log(opName+"() => notify(",value,")"); @@ -458,24 +530,12 @@ const installAsyncProxy = function(){ await releaseImplicitLock(fh); storeAndNotify('xFileSize', rc); }, - xLock: async function(fid/*sqlite3_file pointer*/, - lockType/*SQLITE_LOCK_...*/){ - const fh = __openFiles[fid]; - let rc = 0; - const oldLockType = fh.xLock; - fh.xLock = lockType; - if( !fh.syncHandle ){ - try { - await getSyncHandle(fh,'xLock'); - __implicitLocks.delete(fid); - }catch(e){ - state.s11n.storeException(1,e); - rc = GetSyncHandleError.convertRc(e,state.sq3Codes.SQLITE_IOERR_LOCK); - fh.xLock = oldLockType; - } - } - storeAndNotify('xLock',rc); - }, + /** + The first argument is semantically invalid here - it's an + address in the synchronous side's heap. We can do nothing with + it here except use it as a unique-per-file identifier. + i.e. a lookup key. 
+ */ xOpen: async function(fid/*sqlite3_file pointer*/, filename, flags/*SQLITE_OPEN_...*/, opfsFlags/*OPFS_...*/){ @@ -533,7 +593,7 @@ const installAsyncProxy = function(){ rc = state.sq3Codes.SQLITE_IOERR_SHORT_READ; } }catch(e){ - error("xRead() failed",e,fh); + //error("xRead() failed",e,fh); state.s11n.storeException(1,e); rc = GetSyncHandleError.convertRc(e,state.sq3Codes.SQLITE_IOERR_READ); } @@ -560,29 +620,13 @@ const installAsyncProxy = function(){ affirmNotRO('xTruncate', fh); await (await getSyncHandle(fh,'xTruncate')).truncate(size); }catch(e){ - error("xTruncate():",e,fh); + //error("xTruncate():",e,fh); state.s11n.storeException(2,e); rc = GetSyncHandleError.convertRc(e,state.sq3Codes.SQLITE_IOERR_TRUNCATE); } await releaseImplicitLock(fh); storeAndNotify('xTruncate',rc); }, - xUnlock: async function(fid/*sqlite3_file pointer*/, - lockType/*SQLITE_LOCK_...*/){ - let rc = 0; - const fh = __openFiles[fid]; - if( fh.syncHandle - && state.sq3Codes.SQLITE_LOCK_NONE===lockType - /* Note that we do not differentiate between lock types in - this VFS. We're either locked or unlocked. */ ){ - try { await closeSyncHandle(fh) } - catch(e){ - state.s11n.storeException(1,e); - rc = state.sq3Codes.SQLITE_IOERR_UNLOCK; - } - } - storeAndNotify('xUnlock',rc); - }, xWrite: async function(fid/*sqlite3_file pointer*/,n,offset64){ let rc; const fh = __openFiles[fid]; @@ -594,7 +638,7 @@ const installAsyncProxy = function(){ {at: Number(offset64)}) ) ? 0 : state.sq3Codes.SQLITE_IOERR_WRITE; }catch(e){ - error("xWrite():",e,fh); + //error("xWrite():",e,fh); state.s11n.storeException(1,e); rc = GetSyncHandleError.convertRc(e,state.sq3Codes.SQLITE_IOERR_WRITE); } @@ -603,152 +647,274 @@ const installAsyncProxy = function(){ } }/*vfsAsyncImpls*/; - const initS11n = ()=>{ - /** - ACHTUNG: this code is 100% duplicated in the other half of this - proxy! The documentation is maintained in the "synchronous half". 
+ if( isWebLocker ){ + /* We require separate xLock() and xUnlock() implementations for the + original and Web Lock implementations. The ones in this block + are for the WebLock impl. + + The Golden Rule for this impl is: if we have a web lock, we + must also hold the SAH. When "upgrading" an implicit lock to a + requested (explicit) lock, we must remove the SAH from the + __implicitLocks set. When we unlock, we release both the web + lock and the SAH. That invariant must be kept intact or race + conditions on SAHs will ensue. */ - if(state.s11n) return state.s11n; - const textDecoder = new TextDecoder(), - textEncoder = new TextEncoder('utf-8'), - viewU8 = new Uint8Array(state.sabIO, state.sabS11nOffset, state.sabS11nSize), - viewDV = new DataView(state.sabIO, state.sabS11nOffset, state.sabS11nSize); - state.s11n = Object.create(null); - const TypeIds = Object.create(null); - TypeIds.number = { id: 1, size: 8, getter: 'getFloat64', setter: 'setFloat64' }; - TypeIds.bigint = { id: 2, size: 8, getter: 'getBigInt64', setter: 'setBigInt64' }; - TypeIds.boolean = { id: 3, size: 4, getter: 'getInt32', setter: 'setInt32' }; - TypeIds.string = { id: 4 }; - const getTypeId = (v)=>( - TypeIds[typeof v] - || toss("Maintenance required: this value type cannot be serialized.",v) - ); - const getTypeIdById = (tid)=>{ - switch(tid){ - case TypeIds.number.id: return TypeIds.number; - case TypeIds.bigint.id: return TypeIds.bigint; - case TypeIds.boolean.id: return TypeIds.boolean; - case TypeIds.string.id: return TypeIds.string; - default: toss("Invalid type ID:",tid); - } - }; - state.s11n.deserialize = function(clear=false){ - const argc = viewU8[0]; - const rc = argc ? 
[] : null; - if(argc){ - const typeIds = []; - let offset = 1, i, n, v; - for(i = 0; i < argc; ++i, ++offset){ - typeIds.push(getTypeIdById(viewU8[offset])); + /** Registry of active Web Locks: fid -> { mode, resolveRelease } */ + const __activeWebLocks = Object.create(null); + + vfsAsyncImpls.xLock = async function(fid/*sqlite3_file pointer*/, + lockType/*SQLITE_LOCK_...*/, + isFromUnlock/*only if called from this.xUnlock()*/){ + const whichOp = isFromUnlock ? 'xUnlock' : 'xLock'; + const fh = __openFiles[fid]; + //error("xLock()",fid, lockType, isFromUnlock, fh); + const requestedMode = (lockType >= state.sq3Codes.SQLITE_LOCK_RESERVED) + ? 'exclusive' : 'shared'; + const existing = __activeWebLocks[fid]; + if( existing ){ + if( existing.mode === requestedMode + || (existing.mode === 'exclusive' + && requestedMode === 'shared') ) { + fh.xLock = lockType; + storeAndNotify(whichOp, 0); + /* Don't do this: existing.mode = requestedMode; + + Paraphrased from advice given by a consulting developer: + + If you hold an exclusive lock and SQLite requests shared, + you should keep existing.mode as exclusive because the + underlying Web Lock is still exclusive. Changing it to + shared would trick xLock into thinking it needs to + perform a release/re-acquire dance if an exclusive is + later requested. + */ + return 0 /* Already held at required or higher level */; } - for(i = 0; i < argc; ++i){ - const t = typeIds[i]; - if(t.getter){ - v = viewDV[t.getter](offset, state.littleEndian); - offset += t.size; - }else{/*String*/ - n = viewDV.getInt32(offset, state.littleEndian); - offset += 4; - v = textDecoder.decode(viewU8.slice(offset, offset+n)); - offset += n; + /* + Upgrade path: we must release shared and acquire exclusive. + This transition is NOT atomic in Web Locks API. + + It _effectively_ is atomic if we don't call + closeSyncHandle(fh), as no other worker can lock that until + we let it go.
But we can't do that without eventually + leading to deadly embrace situations, so we don't do that. + (That's not a hypothetical, it has happened.) + */ + await closeSyncHandle(fh); + existing.resolveRelease(); + delete __activeWebLocks[fid]; + } + + const lockName = "sqlite3-vfs-opfs:" + fh.filenameAbs; + const oldLockType = fh.xLock; + return new Promise((resolveWaitLoop) => { + //log("xLock() initial promise entered..."); + navigator.locks.request(lockName, { mode: requestedMode }, async (lock) => { + //log("xLock() Web Lock entered.", fh); + __implicitLocks.delete(fid); + let rc = 0; + try{ + fh.xLock = lockType/*must be set before getSyncHandle() is called!*/; + await getSyncHandle(fh, 'xLock', state.asyncIdleWaitTime, 5); + }catch(e){ + fh.xLock = oldLockType; + state.s11n.storeException(1, e); + rc = GetSyncHandleError.convertRc(e, state.sq3Codes.SQLITE_BUSY); } - rc.push(v); - } + const releasePromise = rc + ? undefined + : new Promise((resolveRelease) => { + __activeWebLocks[fid] = { mode: requestedMode, resolveRelease }; + }); + storeAndNotify(whichOp, rc) /* unblock the C side */; + resolveWaitLoop(0) /* unblock waitLoop() */; + await releasePromise /* hold the lock until xUnlock */; + }); + }); + }; + + /** Internal helper for the opfs-wl xUnlock() */ + const wlCloseHandle = async(fh)=>{ + let rc = 0; + try{ + /* For the record, we've never once seen closeSyncHandle() + throw, nor should it because destructors do not throw. */ + await closeSyncHandle(fh); + }catch(e){ + state.s11n.storeException(1,e); + rc = state.sq3Codes.SQLITE_IOERR_UNLOCK; } - if(clear) viewU8[0] = 0; - //log("deserialize:",argc, rc); return rc; }; - state.s11n.serialize = function(...args){ - if(args.length){ - //log("serialize():",args); - const typeIds = []; - let i = 0, offset = 1; - viewU8[0] = args.length & 0xff /* header = # of args */; - for(; i < args.length; ++i, ++offset){ - /* Write the TypeIds.id value into the next args.length - bytes. 
*/ - typeIds.push(getTypeId(args[i])); - viewU8[offset] = typeIds[i].id; - } - for(i = 0; i < args.length; ++i) { - /* Deserialize the following bytes based on their - corresponding TypeIds.id from the header. */ - const t = typeIds[i]; - if(t.setter){ - viewDV[t.setter](offset, args[i], state.littleEndian); - offset += t.size; - }else{/*String*/ - const s = textEncoder.encode(args[i]); - viewDV.setInt32(offset, s.byteLength, state.littleEndian); - offset += 4; - viewU8.set(s, offset); - offset += s.byteLength; - } + + vfsAsyncImpls.xUnlock = async function(fid/*sqlite3_file pointer*/, + lockType/*SQLITE_LOCK_...*/){ + const fh = __openFiles[fid]; + const existing = __activeWebLocks[fid]; + if( !existing ){ + const rc = await wlCloseHandle(fh); + storeAndNotify('xUnlock', rc); + return rc; + } + //log("xUnlock()",fid, lockType, fh); + let rc = 0; + if( lockType === state.sq3Codes.SQLITE_LOCK_NONE ){ + /* SQLite usually unlocks all the way to NONE */ + rc = await wlCloseHandle(fh); + existing.resolveRelease(); + delete __activeWebLocks[fid]; + fh.xLock = lockType; + }else if( lockType === state.sq3Codes.SQLITE_LOCK_SHARED + && existing.mode === 'exclusive' ){ + /* downgrade EXCLUSIVE -> SHARED */ + rc = await wlCloseHandle(fh); + if( 0===rc ){ + fh.xLock = lockType; + existing.resolveRelease(); + delete __activeWebLocks[fid]; + return vfsAsyncImpls.xLock(fid, lockType, true); } - //log("serialize() result:",viewU8.slice(0,offset)); }else{ - viewU8[0] = 0; + /* ??? 
*/ + error("xUnlock() unhandled condition", fh); + } + storeAndNotify('xUnlock', rc); + return 0; + } + + }else{ + /* Original/"legacy" xLock() and xUnlock() */ + + vfsAsyncImpls.xLock = async function(fid/*sqlite3_file pointer*/, + lockType/*SQLITE_LOCK_...*/){ + const fh = __openFiles[fid]; + let rc = 0; + const oldLockType = fh.xLock; + fh.xLock = lockType; + if( !fh.syncHandle ){ + try { + await getSyncHandle(fh,'xLock'); + __implicitLocks.delete(fid); + }catch(e){ + state.s11n.storeException(1,e); + rc = GetSyncHandleError.convertRc(e,state.sq3Codes.SQLITE_IOERR_LOCK); + fh.xLock = oldLockType; + } } + storeAndNotify('xLock',rc); }; - state.s11n.storeException = state.asyncS11nExceptions - ? ((priority,e)=>{ - if(priority<=state.asyncS11nExceptions){ - state.s11n.serialize([e.name,': ',e.message].join("")); + vfsAsyncImpls.xUnlock = async function(fid/*sqlite3_file pointer*/, + lockType/*SQLITE_LOCK_...*/){ + let rc = 0; + const fh = __openFiles[fid]; + if( fh.syncHandle + && state.sq3Codes.SQLITE_LOCK_NONE===lockType + /* Note that we do not differentiate between lock types in + this VFS. We're either locked or unlocked. 
*/ ){ + try { await closeSyncHandle(fh) } + catch(e){ + state.s11n.storeException(1,e); + rc = state.sq3Codes.SQLITE_IOERR_UNLOCK; } - }) - : ()=>{}; + } + storeAndNotify('xUnlock',rc); + } - return state.s11n; - }/*initS11n()*/; + }/*xLock() and xUnlock() impls*/ const waitLoop = async function f(){ - const opHandlers = Object.create(null); - for(let k of Object.keys(state.opIds)){ - const vi = vfsAsyncImpls[k]; - if(!vi) continue; - const o = Object.create(null); - opHandlers[state.opIds[k]] = o; - o.key = k; - o.f = vi; + if( !f.inited ){ + f.inited = true; + f.opHandlers = Object.create(null); + for(let k of Object.keys(state.opIds)){ + const vi = vfsAsyncImpls[k]; + if(!vi) continue; + const o = Object.create(null); + f.opHandlers[state.opIds[k]] = o; + o.key = k; + o.f = vi; + } } + const opIds = state.opIds; + const opView = state.sabOPView; + const slotWhichOp = opIds.whichOp; + const idleWaitTime = state.asyncIdleWaitTime; + const hasWaitAsync = !!Atomics.waitAsync; +//#if 0 + error("waitLoop init: isWebLocker",isWebLocker, + "idleWaitTime",idleWaitTime, + "hasWaitAsync",hasWaitAsync); +//#/if while(!flagAsyncShutdown){ try { - if('not-equal'!==Atomics.wait( - state.sabOPView, state.opIds.whichOp, 0, state.asyncIdleWaitTime - )){ - /* Maintenance note: we compare against 'not-equal' because - - https://github.com/tomayac/sqlite-wasm/issues/12 - - is reporting that this occasionally, under high loads, - returns 'ok', which leads to the whichOp being 0 (which - isn't a valid operation ID and leads to an exception, - along with a corresponding ugly console log - message). Unfortunately, the conditions for that cannot - be reliably reproduced. The only place in our code which - writes a 0 to the state.opIds.whichOp SharedArrayBuffer - index is a few lines down from here, and that instance - is required in order for clear communication between - the sync half of this proxy and this half. 
+ let opId; + if( hasWaitAsync ){ + opId = Atomics.load(opView, slotWhichOp); + if( 0===opId ){ + const rv = Atomics.waitAsync(opView, slotWhichOp, 0, + idleWaitTime); + if( rv.async ) await rv.value; + await releaseImplicitLocks(); + continue; + } + }else{ + /** + For browsers without Atomics.waitAsync(), we require + the legacy implementation. Browser versions where + waitAsync() arrived: + + Chrome: 90 (2021-04-13) + Firefox: 145 (2025-11-11) + Safari: 16.4 (2023-03-27) + + The "opfs" VFS was not born until Chrome was somewhere in + the v104-108 range (Summer/Autumn 2022) and did not work + with Safari < v17 (2023-09-18) due to a WebKit bug which + restricted OPFS access from sub-Workers. + + The waitAsync() counterpart of this block can be used by + both "opfs" and "opfs-wl", whereas this block can only be + used by "opfs". Performance comparisons between the two + in high-contention tests have been indecisive. */ - await releaseImplicitLocks(); - continue; + if('not-equal'!==Atomics.wait( + state.sabOPView, slotWhichOp, 0, state.asyncIdleWaitTime + )){ + /* Maintenance note: we compare against 'not-equal' because + + https://github.com/tomayac/sqlite-wasm/issues/12 + + is reporting that this occasionally, under high loads, + returns 'ok', which leads to the whichOp being 0 (which + isn't a valid operation ID and leads to an exception, + along with a corresponding ugly console log + message). Unfortunately, the conditions for that cannot + be reliably reproduced. The only place in our code which + writes a 0 to the state.opIds.whichOp SharedArrayBuffer + index is a few lines down from here, and that instance + is required in order for clear communication between + the sync half of this proxy and this half. + + Much later (2026-03-07): that phenomenon is apparently + called a spurious wakeup. 
+ */ + await releaseImplicitLocks(); + continue; + } + opId = Atomics.load(state.sabOPView, slotWhichOp); } - const opId = Atomics.load(state.sabOPView, state.opIds.whichOp); - Atomics.store(state.sabOPView, state.opIds.whichOp, 0); - const hnd = opHandlers[opId] ?? toss("No waitLoop handler for whichOp #",opId); + Atomics.store(opView, slotWhichOp, 0); + const hnd = f.opHandlers[opId]?.f ?? toss("No waitLoop handler for whichOp #",opId); const args = state.s11n.deserialize( true /* clear s11n to keep the caller from confusing this with an exception string written by the upcoming operation */ ) || []; - //warn("waitLoop() whichOp =",opId, hnd, args); - if(hnd.f) await hnd.f(...args); - else error("Missing callback for opId",opId); + //error("waitLoop() whichOp =",opId, f.opHandlers[opId].key, args); + await hnd(...args); }catch(e){ - error('in waitLoop():',e); + error('in waitLoop():', e); } } }; @@ -756,6 +922,7 @@ const installAsyncProxy = function(){ navigator.storage.getDirectory().then(function(d){ state.rootDir = d; globalThis.onmessage = function({data}){ + //log(globalThis.location.href,"onmessage()",data); switch(data.type){ case 'opfs-async-init':{ /* Receive shared state from synchronous partner */ @@ -771,6 +938,7 @@ const installAsyncProxy = function(){ } }); initS11n(); + //warn("verbosity =",opt.verbose, state.verbose); log("init state",state); wPost('opfs-async-inited'); waitLoop(); @@ -782,22 +950,27 @@ const installAsyncProxy = function(){ flagAsyncShutdown = false; waitLoop(); } - break; + break; } }; wPost('opfs-async-loaded'); }).catch((e)=>error("error initializing OPFS asyncer:",e)); }/*installAsyncProxy()*/; -if(!globalThis.SharedArrayBuffer){ +if(globalThis.window === globalThis){ + wPost('opfs-unavailable', + "This code cannot run from the main thread.", + "Load it as a Worker from a separate Worker."); +}else if(!globalThis.SharedArrayBuffer){ wPost('opfs-unavailable', "Missing SharedArrayBuffer API.", "The server must emit the COOP/COEP 
response headers to enable that."); }else if(!globalThis.Atomics){ wPost('opfs-unavailable', "Missing Atomics API.", "The server must emit the COOP/COEP response headers to enable that."); +}else if(isWebLocker && !globalThis.Atomics.waitAsync){ + wPost('opfs-unavailable',"Missing required Atomics.waitAsync() for "+vfsName); }else if(!globalThis.FileSystemHandle || !globalThis.FileSystemDirectoryHandle || - !globalThis.FileSystemFileHandle || - !globalThis.FileSystemFileHandle.prototype.createSyncAccessHandle || + !globalThis.FileSystemFileHandle?.prototype?.createSyncAccessHandle || !navigator?.storage?.getDirectory){ wPost('opfs-unavailable',"Missing required OPFS APIs."); }else{ diff --git a/ext/wasm/api/sqlite3-vfs-kvvfs.c-pp.js b/ext/wasm/api/sqlite3-vfs-kvvfs.c-pp.js new file mode 100644 index 0000000000..0db303bc46 --- /dev/null +++ b/ext/wasm/api/sqlite3-vfs-kvvfs.c-pp.js @@ -0,0 +1,2102 @@ +/* + 2025-11-21 + + The author disclaims copyright to this source code. In place of a + legal notice, here is a blessing: + + * May you do good and not evil. + * May you find forgiveness for yourself and forgive others. + * May you share freely, never taking more than you give. + + *********************************************************************** + + This file houses the "kvvfs" pieces of the SQLite3 JS API. Most of + kvvfs is implemented in src/os_kv.c and exposed/extended for use + here via sqlite3-wasm.c. + + Main project home page: https://sqlite.org + + Documentation home page: https://sqlite.org/wasm +*/ +//#if omit-kvvfs +globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ + /* These are JS plumbing, not part of the public API */ + delete sqlite3.capi.sqlite3_kvvfs_methods; + delete sqlite3.capi.KVVfsFile; +} +//#else +//#@ policy error +//#savepoint begin +//#define kvvfs-v2-added-in "3.52.0" + +/** + kvvfs - the Key/Value VFS - is an SQLite3 VFS which delegates + storage of its pages and metadata to a key-value store. 
+ + It was conceived in order to support JS's localStorage and + sessionStorage objects. Its native implementation uses files as + key/value storage (one file per record) but the JS implementation + replaces a few methods so that it can use the aforementioned + objects as storage. + + It uses a bespoke ASCII encoding to store each db page as a + separate record and stores some metadata, like the db's unencoded + size and its journal, as individual records. + + kvvfs is significantly less efficient than a plain in-memory db but + it also, as a side effect of its design, offers a JSON-friendly + interchange format for exporting and importing databases. + + kvvfs is _not_ designed for heavy db loads. It is relatively + malloc()-heavy, having to de/allocate frequently, and it + spends much of its time converting the raw db pages into and out of + an ASCII encoding. + + But it _does_ work and is "performant enough" for db work of the + scale of a db which will fit within sessionStorage or localStorage + (just 2-3mb). + + "Version 2" extends it to support using Storage-like objects as + backing storage, Storage being the JS class which localStorage and + sessionStorage both derive from. This essentially moves the backing + store from whatever localStorage and sessionStorage use to an + in-memory object. + + This effort is primarily a stepping stone towards eliminating, if + it proves possible, the POSIX I/O API dependencies in SQLite's WASM + builds. That is: if this VFS works properly, it can be set as the + default VFS and we can eliminate the "unix" VFS from the JS/WASM + builds (as opposed to server-wise/WASI builds). That is still, as of + 2025-11-23, a ways away, but it's the main driver for version 2 of + kvvfs. + + Version 2 remains compatible with version 1 databases and always + writes localStorage/sessionStorage metadata in the v1 format, so + such dbs can be manipulated freely by either version. 
For transient + storage objects (new in version 2), the format of its record keys + is simplified, requiring less space than v1 keys by eliding + redundant (in this context) info from the keys. + + Another benefit of v2 is its ability to export dbs into a + JSON-friendly (but not human-friendly) format. + + A potential, as-yet-unproven, benefit, would be the ability to plug + arbitrary Storage-compatible objects in so that clients could, + e.g. asynchronously post updates to db pages to some back-end for + backups. +*/ +globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ + if( sqlite3.config.disable?.vfs?.kvvfs ){ + return; + } + 'use strict'; + const capi = sqlite3.capi, + sqlite3_kvvfs_methods = capi.sqlite3_kvvfs_methods, + KVVfsFile = capi.KVVfsFile, + pKvvfs = sqlite3.capi.sqlite3_vfs_find("kvvfs") + + /* These are JS plumbing, not part of the public API */ + delete capi.sqlite3_kvvfs_methods; + delete capi.KVVfsFile; + + if( !pKvvfs ) return /* nothing to do */; + if( 0 ){ + /* This working would be our proverbial holy grail, in that it + would allow us to eliminate the current default VFS, which + relies on POSIX I/O APIs. Eliminating that dependency would get + us one giant step closer to creating wasi-sdk builds. */ + capi.sqlite3_vfs_register(pKvvfs, 1); + } + + const util = sqlite3.util, + wasm = sqlite3.wasm, + toss3 = util.toss3, + hop = (o,k)=>Object.prototype.hasOwnProperty.call(o,k); + + const kvvfsMethods = new sqlite3_kvvfs_methods( + /* Wraps the static sqlite3_api_methods singleton */ + wasm.exports.sqlite3__wasm_kvvfs_methods() + ); + util.assert( 32<=kvvfsMethods.$nKeySize, "unexpected kvvfsMethods.$nKeySize: "+kvvfsMethods.$nKeySize); + + /** + Most of the VFS-internal state. + */ + const cache = Object.assign(Object.create(null),{ + /** Regex matching journal file names. */ + rxJournalSuffix: /-journal$/, + /** Frequently-used C-string. */ + zKeyJrnl: wasm.allocCString("jrnl"), + /** Frequently-used C-string. 
*/ + zKeySz: wasm.allocCString("sz"), + /** + The maximum size of a kvvfs record key. It is historically only + 32, a limitation currently retained only because it's convenient to + do so (the underlying code has outgrown the need for the artificially + low limit). + + We cache this value here because the end of this init code will + dispose of kvvfsMethods, invalidating it. + */ + keySize: kvvfsMethods.$nKeySize, + /** + WASM heap memory buffers to optimize out some frequent + allocations. + */ + buffer: Object.assign(Object.create(null),{ + /** + The size of each buffer in this.pool. + + kvvfsMethods.$nBufferSize is slightly larger than the output + space needed for a kvvfs-encoded 64kb db page in a worst-case + encoding (128kb). It is not suitable for arbitrary buffer + use, only page de/encoding. + */ + n: kvvfsMethods.$nBufferSize, + /** + Map of buffer ids to wasm.alloc()'d pointers of size + this.n. (Re)used by various internals. + + Buffer ids 0 and 1 are used in the API internals. Other + names are used in higher-level APIs. + + See memBuffer() and memBufferFree(). + */ + pool: Object.create(null) + }) + }); + + /** + Returns a (cached) wasm.alloc()'d buffer of cache.buffer.n size, + throwing on OOM. + + We leak this one-time alloc because we've no better option. + sqlite3_vfs does not have a finalizer, so we've no place to hook + in the cleanup. We "could" extend sqlite3_shutdown() to have a + cleanup list for stuff like this but that function is never + used in JS, so it's hardly worth it. + */ + cache.memBuffer = (id=0)=>cache.buffer.pool[id] ??= wasm.alloc(cache.buffer.n); + + /** Frees the buffer with the given id. */ + cache.memBufferFree = (id)=>{ + const b = cache.buffer.pool[id]; + if( b ){ + wasm.dealloc(b); + delete cache.buffer.pool[id]; + } + }; + + const noop = ()=>{}; + const debug = sqlite3.__isUnderTest + ? 
(...args)=>sqlite3.config.debug?.("kvvfs:", ...args) + : noop; + const warn = (...args)=>sqlite3.config.warn?.("kvvfs:", ...args); + const error = (...args)=>sqlite3.config.error?.("kvvfs:", ...args); + + /** + Implementation of JS's Storage interface for use as backing store + of the kvvfs. Storage is a native class and its constructor + cannot be legally called from JS, making it impossible to + directly subclass Storage. This class implements (only) the + Storage interface, to make it a drop-in replacement for + localStorage/sessionStorage. (Any behavioral discrepancies are to + be considered bugs.) + + This impl simply proxies a plain, prototype-less Object, suitable + for JSON-ing. + + Design note: Storage has a bit of an odd iteration-related + interface, as it does not (AFAIK) specify behavior regarding + modification during traversal. Because of that, this class does + some seemingly unnecessary things with its #keys member, deleting + and recreating it whenever a property index might be invalidated. + */ + class KVVfsStorage { + #map = Object.create(null); + #keys = null; + #size = 0; + + constructor(){ + this.clear(); + } + + #getKeys(){ + return this.#keys ??= Object.keys(this.#map); + } + + key(n){ + if(n < 0 || n >= this.#size) return null; + return this.#getKeys()[n]; + } + + getItem(k){ + return this.#map[k] ?? null; + } + + setItem(k,v){ + if( !(k in this.#map) ){ + ++this.#size; + this.#keys = null; + } + this.#map[k] = ''+v; + } + + removeItem(k){ + if( k in this.#map ){ + delete this.#map[k]; + --this.#size; + this.#keys = null; + } + } + + clear(){ + this.#map = Object.create(null); + this.#keys = null; + this.#size = 0; + } + + get length() { + return this.#size; + } + }/*KVVfsStorage*/; + + /** True if v is the name of one of the special persistent Storage + objects. */ + const kvvfsIsPersistentName = (v)=>'local'===v || 'session'===v; + + /** + Keys in kvvfs have a prefix of "kvvfs-NAME-", where NAME is the + db name. 
This key is redundant in JS but it's how kvvfs works (it + saves each key to a separate file, so needs a distinct namespace + per data source name). We retain this prefix in 'local' and + 'session' storage for backwards compatibility and so that they + can co-exist with client data in their storage, but we elide them + from "v2" storage, where they're superfluous. + */ + const kvvfsKeyPrefix = (v)=>kvvfsIsPersistentName(v) ? 'kvvfs-'+v+'-' : ''; + + /** + Throws if storage name n (JS string) is not valid for use as a + storage name. Much of this goes back to kvvfs having a fixed + buffer size for its keys, and the storage name needing to be + encoded in the keys for local/session storage. + + The second argument must only be true when called from xOpen() - + it makes names with a "-journal" suffix legal. + */ + const validateStorageName = function(n,mayBeJournal=false){ + if( kvvfsIsPersistentName(n) ) return; + const len = (new Blob([n])).size/*byte length*/; + if( !len ) toss3(capi.SQLITE_MISUSE, "Empty name is not permitted."); + let maxLen = cache.keySize - 1; + if( cache.rxJournalSuffix.test(n) ){ + if( !mayBeJournal ){ + toss3(capi.SQLITE_MISUSE, + "Storage names may not have a '-journal' suffix."); + } + }else if( ['-wal','-shm'].filter(v=>n.endsWith(v)).length ){ + toss3(capi.SQLITE_MISUSE, + "Storage names may not have a -wal or -shm suffix."); + }else{ + maxLen -= 8 /* so we have space for a matching "-journal" suffix */; + } + if( len > maxLen ){ + toss3(capi.SQLITE_RANGE, "Storage name is too long. Limit =", maxLen); + } + let i; + for( i = 0; i < len; ++i ){ + const ch = n.codePointAt(i); + if( ch<32 ){ + toss3(capi.SQLITE_RANGE, + "Illegal character ("+ch+"d) in storage name:",n); + } + } + }; + + /** + Create a new instance of the objects which go into + cache.storagePool, with a refcount of 1. If passed a Storage-like + object as its second argument, it is used for the storage, + otherwise it creates a new KVVfsStorage object. 
+ */ + const newStorageObj = (name,storage=undefined)=>Object.assign(Object.create(null),{ + /** + JS string value of this KVVfsFile::$zClass. i.e. the storage's + name. + */ + jzClass: name, + /** + Refcount. This keeps dbs and journals pointing to the same + storage for the life of both and enables kvvfs to behave more + like a conventional filesystem (a stepping stone towards + downstream API goals). Managed by xOpen() and xClose(). + */ + refc: 1, + /** + If true, this storage will be removed by xClose() or + sqlite3_js_kvvfs_unlink() when refc reaches 0. The others will + persist when refc==0, to give the illusion of real back-end + storage. Managed by xOpen() and sqlite3_js_kvvfs_reserve(). By + default this is false but the delete-on-close=1 flag can be + used to set this to true. + */ + deleteAtRefc0: false, + /** + The backing store. Must implement the Storage interface. + */ + storage: storage || new KVVfsStorage, + /** + The storage prefix used for kvvfs keys. It is + "kvvfs-STORAGENAME-" for local/session storage and an empty + string for other storage. local/session storage must use the + long form (A) for backwards compatibility and (B) so that kvvfs + can coexist with non-db client data in those backends. Neither + (A) nor (B) are concerns for KVVfsStorage objects. + + This prefix mirrors the one generated by os_kv.c's + kvrecordMakeKey() and must stay in sync with that one. + */ + keyPrefix: kvvfsKeyPrefix(name), + /** + KVVfsFile instances currently using this storage. Managed by + xOpen() and xClose(). + */ + files: [], + /** + If set, it's an array of objects with various event + callbacks. See sqlite3_js_kvvfs_listen(). When there are no + listeners, this member is set to undefined (instead of an empty + array) to allow us to more easily optimize out calls to + notifyListeners() for the common case of no listeners. + */ + listeners: undefined + }); + + /** + Public interface for kvvfs v2. 
The capi.sqlite3_js_kvvfs_...() + routines remain in place for v1. Some members of this class proxy + to those functions but use different default argument values in + some cases. + */ + const kvvfs = sqlite3.kvvfs = Object.create(null); + if( sqlite3.__isUnderTest ){ + /* For inspection via the dev tools console. */ + kvvfs.log = Object.assign(Object.create(null),{ + xOpen: false, + xClose: false, + xWrite: false, + xRead: false, + xSync: false, + xAccess: false, + xFileControl: false, + xRcrdRead: false, + xRcrdWrite: false, + xRcrdDelete: false, + }); + } + + /** + Deletes the cache.storagePool entries for store (a + cache.storagePool entry) and its db/journal counterpart. + */ + const deleteStorage = function(store){ + const other = cache.rxJournalSuffix.test(store.jzClass) + ? store.jzClass.replace(cache.rxJournalSuffix,'') + : store.jzClass+'-journal'; + kvvfs?.log?.xClose + && debug("cleaning up storage handles [", store.jzClass, other,"]",store); + delete cache.storagePool[store.jzClass]; + delete cache.storagePool[other]; + if( !sqlite3.__isUnderTest ){ + /* In test runs, leave these for inspection. If we delete them here, + any prior dumps of them emitted via the console get cleared out + because the console shows live objects instead of call-time + static dumps. */ + delete store.storage; + delete store.refc; + } + }; + + /** + Add both store.jzClass and store.jzClass+"-journal" + to cache.storagePool. + */ + const installStorageAndJournal = (store)=> + cache.storagePool[store.jzClass] = + cache.storagePool[store.jzClass+'-journal'] = store; + + /** + The public name of the current thread's transient storage + object. A storage object with this name gets preinstalled. + */ + const nameOfThisThreadStorage = '.'; + + /** + Map of JS-stringified KVVfsFile::zClass names to + reference-counted Storage objects. These objects are created in + xOpen(). Their refcount is decremented in xClose(), and the + record is destroyed if the refcount reaches 0. 
We refcount so + that concurrent active xOpen()s on a given name, and within a + given thread, use the same storage object. + */ + cache.storagePool = Object.assign(Object.create(null),{ + /* Start off with mappings for well-known names. */ + [nameOfThisThreadStorage]: newStorageObj(nameOfThisThreadStorage) + }); + + if( globalThis.Storage ){ + /* If available, install local/session storage. */ + if( globalThis.localStorage instanceof globalThis.Storage ){ + cache.storagePool.local = newStorageObj('local', globalThis.localStorage); + } + if( globalThis.sessionStorage instanceof globalThis.Storage ){ + cache.storagePool.session = newStorageObj('session', globalThis.sessionStorage); + } + } + + cache.builtinStorageNames = Object.keys(cache.storagePool); + + const isBuiltinName = (n)=>cache.builtinStorageNames.indexOf(n)>-1; + + /* Add "-journal" twins for each cache.storagePool entry... */ + for(const k of Object.keys(cache.storagePool)){ + /* Journals in kvvfs are stored as individual records within + their Storage-ish object, named "{storage.keyPrefix}jrnl". We + always map the db and its journal to the same Storage + object. */ + const orig = cache.storagePool[k]; + cache.storagePool[k+'-journal'] = orig; + } + + cache.setError = (e=undefined, dfltErrCode=capi.SQLITE_ERROR)=>{ + if( e ){ + cache.lastError = e; + return (e.resultCode | 0) || dfltErrCode; + } + delete cache.lastError; + return 0; + }; + + cache.popError = ()=>{ + const e = cache.lastError; + delete cache.lastError; + return e; + }; + + /** Exception handler for notifyListeners(). */ + const catchForNotify = (e)=>{ + warn("kvvfs.listener handler threw:",e); + }; + + const kvvfsDecode = wasm.exports.sqlite3__wasm_kvvfs_decode; + const kvvfsEncode = wasm.exports.sqlite3__wasm_kvvfs_encode; + + /** + Listener events and their argument(s) (via the callback(ev) + ev.data member): + + 'open': number of opened handles on this storage. + + 'close': number of opened handles on this storage. 
+ + 'write': key, value + + 'delete': key + + 'sync': true if it's from xSync(), false if it's from + xFileControl(). + + For efficiency's sake, all calls to this function should + be in the form: + + store.listeners && notifyListeners(...); + + Failing to do so will trigger an exception in this function (which + will be ignored but may produce a console warning). + */ + const notifyListeners = async function(eventName,store,...args){ + try{ + //cache.rxPageNoSuffix ??= /(\d+)$/; + if( store.keyPrefix && args[0] ){ + args[0] = args[0].replace(store.keyPrefix,''); + } + let u8enc, z0, z1, wcache; + for(const ear of store.listeners){ + const ev = Object.create(null); + ev.storageName = store.jzClass; + ev.type = eventName; + const decodePages = ear.decodePages; + const f = ear.events[eventName]; + if( f ){ + if( !ear.includeJournal && args[0]==='jrnl' ){ + continue; + } + if( 'write'===eventName && ear.decodePages && +args[0]>0 ){ + /* Decode pages to Uint8Array, caching the result in + wcache in case we have more listeners. */ + ev.data = [args[0]]; + if( wcache?.[args[0]] ){ + ev.data[1] = wcache[args[0]]; + continue; + } + u8enc ??= new TextEncoder('utf-8'); + z0 ??= cache.memBuffer(10); + z1 ??= cache.memBuffer(11); + const u = u8enc.encode(args[1]); + const heap = wasm.heap8u(); + heap.set(u, Number(z0)); + heap[wasm.ptr.addn(z0, u.length)] = 0; + const rc = kvvfsDecode(z0, z1, cache.buffer.n); + if( rc>0 ){ + wcache ??= Object.create(null); + wcache[args[0]] + = ev.data[1] + = heap.slice(Number(z1), wasm.ptr.addn(z1,rc)); + }else{ + continue; + } + }else{ + ev.data = args.length + ? ((args.length===1) ? args[0] : args) + : undefined; + } + try{f(ev)?.catch?.(catchForNotify)} + catch(e){ + warn("notifyListeners [",store.jzClass,"]",eventName,e); + } + } + } + }catch(e){ + catchForNotify(e); + } + }/*notifyListeners()*/; + + /** + Returns the storage object mapped to the given string zClass + (C-string pointer or JS string). 
+ */ + const storageForZClass = (zClass)=> + 'string'===typeof zClass + ? cache.storagePool[zClass] + : cache.storagePool[wasm.cstrToJs(zClass)]; + +//#if 0 + // fileForDb() works but we don't have a current need for it. + /** + Expects an (sqlite3*). Uses sqlite3_file_control() to extract its + (sqlite3_file*). On success it returns a new KVVfsFile instance + wrapping that pointer, which the caller must eventually call + dispose() on (which won't free the underlying pointer, just the + wrapper). Returns null if no handle is found (which would + indicate either that pDb is not using kvvfs or a severe bug in + its management). + */ + const fileForDb = function(pDb){ + const stack = wasm.pstack.pointer; + try{ + const pOut = wasm.pstack.allocPtr(); + return wasm.exports.sqlite3_file_control( + pDb, wasm.ptr.null, capi.SQLITE_FCNTL_FILE_POINTER, pOut + ) + ? null + : new KVVfsFile(wasm.peekPtr(pOut)); + }finally{ + wasm.pstack.restore(stack); + } + }; + + /** + Expects an object from the storagePool map. The $szPage and + $szDb members of each store.files entry are set to -1 in an attempt + to trigger those values to reload. + */ + const alertFilesToReload = (store)=>{ + try{ + for( const f of store.files ){ + // FIXME: we need to use one of the C APIs for this, maybe an + // fcntl. + f.$szPage = -1; + f.$szDb = -1n + } + }catch(e){ + error("alertFilesToReload()",store,e); + throw e; + } + }; +//#/if + + const kvvfsMakeKey = wasm.exports.sqlite3__wasm_kvvfsMakeKey; + /** + Returns a C string from kvvfsMakeKey() OR returns zKey. In the + former case the memory is static, so must be copied before a + second call. zKey MUST be a pointer passed to a VFS/file method, + to allow us to avoid an alloc and/or an snprintf(). It requires + C-string arguments for zClass and zKey. zClass may be NULL but + zKey may not. 
+ */ + const zKeyForStorage = (store, zClass, zKey)=>{ + //debug("zKeyForStorage(",store, wasm.cstrToJs(zClass), wasm.cstrToJs(zKey)); + return (zClass && store.keyPrefix) ? kvvfsMakeKey(zClass, zKey) : zKey; + }; + + const jsKeyForStorage = (store,zClass,zKey)=> + wasm.cstrToJs(zKeyForStorage(store, zClass, zKey)); + + const storageGetDbSize = (store)=>+store.storage.getItem(store.keyPrefix + "sz"); + + /** + sqlite3_file pointers => objects, each of which has: + + .file = KVVfsFile instance + + .jzClass = JS-string form of f.$zClass + + .storage = Storage object. It is shared between a db and its + journal. + */ + const pFileHandles = new Map(); + + /** + Original WASM functions for methods we partially override. + */ + const originalMethods = { + vfs: Object.create(null), + ioDb: Object.create(null), + ioJrnl: Object.create(null) + }; + + /** Returns the appropriate originalMethods[X] instance for the + given KVVfsFile instance. */ + const originalIoMethods = (kvvfsFile)=> + originalMethods[kvvfsFile.$isJournal ? 'ioJrnl' : 'ioDb']; + + const pVfs = new capi.sqlite3_vfs(kvvfsMethods.$pVfs); + const pIoDb = new capi.sqlite3_io_methods(kvvfsMethods.$pIoDb); + const pIoJrnl = new capi.sqlite3_io_methods(kvvfsMethods.$pIoJrnl); + const recordHandler = + Object.create(null)/** helper for some vfs + routines. Populated later. */; + const kvvfsInternal = Object.assign(Object.create(null),{ + pFileHandles, + cache, + storageForZClass, + KVVfsStorage, + /** + BUG: changing to a page size other than the default, + then vacuuming, corrupts the db. As a workaround, + until this is resolved, we forcibly disable + (pragma page_size=...) changes. + */ + disablePageSizeChange: true + }); + if( kvvfs.log ){ + // this is a test build + kvvfs.internal = kvvfsInternal; + } + + /** + Implementations for members of the object referred to by + sqlite3__wasm_kvvfs_methods(). We swap out some native + implementations with these so that we can use JS Storage for + their backing store. 
+ */ + const methodOverrides = { + + /** + sqlite3_kvvfs_methods's member methods. These perform the + fetching, setting, and removal of storage keys on behalf of + kvvfs. In the native impl these write each db page to a + separate file. This impl stores each db page as a single + record in a Storage object which is mapped to zClass. + + A db's size is stored in a record named kvvfs[-storagename]-sz + and the journal is stored in kvvfs[-storagename]-jrnl. The + [-storagename] part is a remnant of the native impl (so that + it has unique filenames per db) and is only used for + localStorage and sessionStorage. We elide that part (to save + space) from other storage objects but retain it on those two + to avoid invalidating pre-version-2 session/localStorage dbs. + + The interface docs for these methods are in src/os_kv.c's + kvrecordRead(), kvrecordWrite(), and kvrecordDelete(). + */ + recordHandler: { + xRcrdRead: (zClass, zKey, zBuf, nBuf)=>{ + try{ + const jzClass = wasm.cstrToJs(zClass); + const store = storageForZClass(jzClass); + if( !store ) return -1; + const jXKey = jsKeyForStorage(store, zClass, zKey); + kvvfs?.log?.xRcrdRead && warn("xRcrdRead", jzClass, jXKey, nBuf, store ); + const jV = store.storage.getItem(jXKey); + if(null===jV) return -1; + const nV = jV.length /* We are relying 100% on v being + ** ASCII so that jV.length is equal + ** to the C-string's byte length. 
*/; + if( 0 ){ + debug("xRcrdRead", jXKey, store, jV); + } + if(nBuf<=0) return nV; + else if(1===nBuf){ + wasm.poke(zBuf, 0); + return nV; + } + if( nBuf+1{ + try { + const store = storageForZClass(zClass); + const jxKey = jsKeyForStorage(store, zClass, zKey); + const jData = wasm.cstrToJs(zData); + kvvfs?.log?.xRcrdWrite && warn("xRcrdWrite",jxKey, store); + store.storage.setItem(jxKey, jData); + store.listeners && notifyListeners('write', store, jxKey, jData); + return 0; + }catch(e){ + error("kvrecordWrite()",e); + return cache.setError(e, capi.SQLITE_IOERR); + } + }, + + xRcrdDelete: (zClass, zKey)=>{ + try { + const store = storageForZClass(zClass); + const jxKey = jsKeyForStorage(store, zClass, zKey); + kvvfs?.log?.xRcrdDelete && warn("xRcrdDelete",jxKey, store); + store.storage.removeItem(jxKey); + store.listeners && notifyListeners('delete', store, jxKey); + return 0; + }catch(e){ + error("kvrecordDelete()",e); + return cache.setError(e, capi.SQLITE_IOERR); + } + } + }/*recordHandler*/, + + /** + Override certain operations of the underlying sqlite3_vfs and + the two sqlite3_io_methods instances so that we can tie + Storage objects to db names. + */ + vfs:{ + /* sqlite3_kvvfs_methods::pVfs's methods */ + xOpen: function(pProtoVfs,zName,pProtoFile,flags,pOutFlags){ + cache.popError(); + let zToFree /* alloc()'d memory for temp db name */; + if( 0 ){ + /* tester1.js makes it a lot further if we do this. */ + flags |= capi.SQLITE_OPEN_CREATE; + } + try{ + if( !zName ){ + zToFree = wasm.allocCString(""+pProtoFile+"." 
+ +(Math.random() * 100000 | 0)); + zName = zToFree; + } + const jzClass = wasm.cstrToJs(zName); + kvvfs?.log?.xOpen && debug("xOpen",jzClass,"flags =",flags); + validateStorageName(jzClass, true); + if( (flags & (capi.SQLITE_OPEN_MAIN_DB + | capi.SQLITE_OPEN_TEMP_DB + | capi.SQLITE_OPEN_TRANSIENT_DB)) + && cache.rxJournalSuffix.test(jzClass) ){ + toss3(capi.SQLITE_ERROR, + "DB files may not have a '-journal' suffix."); + } + let s = storageForZClass(jzClass); + if( !s && !(flags & capi.SQLITE_OPEN_CREATE) ){ + toss3(capi.SQLITE_ERROR, "Storage not found:", jzClass); + } + const rc = originalMethods.vfs.xOpen(pProtoVfs, zName, pProtoFile, + flags, pOutFlags); + if( rc ) return rc; + let deleteAt0 = !!(capi.SQLITE_OPEN_DELETEONCLOSE & flags); + if(wasm.isPtr(arguments[1]/*original zName*/)){ + if(capi.sqlite3_uri_boolean(zName, "delete-on-close", 0)){ + deleteAt0 = true; + } + } + const f = new KVVfsFile(pProtoFile); + util.assert(f.$zClass, "Missing f.$zClass"); + f.addOnDispose(zToFree); + zToFree = undefined; + //debug("xOpen", jzClass, s); + if( s ){ + ++s.refc; + //no if( true===deleteAt0 ) s.deleteAtRefc0 = true; + s.files.push(f); + wasm.poke32(pOutFlags, flags); + }else{ + wasm.poke32(pOutFlags, flags | capi.SQLITE_OPEN_CREATE); + util.assert( !f.$isJournal, "Opening a journal before its db? "+jzClass ); + /* Map both zName and zName-journal to the same storage. 
*/ + const nm = jzClass.replace(cache.rxJournalSuffix,''); + s = newStorageObj(nm); + installStorageAndJournal(s); + s.files.push(f); + s.deleteAtRefc0 = deleteAt0; + kvvfs?.log?.xOpen + && debug("xOpen installed storage handle [",nm, nm+"-journal","]", s); + } + pFileHandles.set(pProtoFile, {store: s, file: f, jzClass}); + s.listeners && notifyListeners('open', s, s.files.length); + return 0; + }catch(e){ + warn("xOpen:",e); + return cache.setError(e); + }finally{ + zToFree && wasm.dealloc(zToFree); + } + }/*xOpen()*/, + + xDelete: function(pVfs, zName, iSyncFlag){ + cache.popError(); + try{ + const jzName = wasm.cstrToJs(zName); + if( cache.rxJournalSuffix.test(jzName) ){ + recordHandler.xRcrdDelete(zName, cache.zKeyJrnl); + }/* + else: historically not done, but maybe otherwise delete + all db pages from storageForZClass(zName)? + */ + return 0; + }catch(e){ + warn("xDelete",e); + return cache.setError(e); + } + }, + + xAccess: function(pProtoVfs, zPath, flags, pResOut){ + cache.popError(); + try{ + const s = storageForZClass(zPath); + const jzPath = s?.jzClass || wasm.cstrToJs(zPath); + if( kvvfs?.log?.xAccess ){ + debug("xAccess",jzPath,"flags =", + flags,"*pResOut =",wasm.peek32(pResOut), + "store =",s); + } + if( !s ){ + // From the API docs: + /** The xAccess method returns [SQLITE_OK] on success or some + ** non-zero error code if there is an I/O error or if the name of + ** the file given in the second argument is illegal. + */ + // However, returning non-0 from here is fatal, so we don't do that. + try{validateStorageName(jzPath)} + catch(e){ + //warn("xAccess is ignoring name validation failure:",e); + wasm.poke32(pResOut, 0); + return 0; + } + } + if( s ){ + const key = s.keyPrefix+ + (cache.rxJournalSuffix.test(jzPath) ? "jrnl" : "1"); + const res = s.storage.getItem(key) ? 
0 : 1; + /* This res value looks completely backwards to me, and + is the opposite of the native kvvfs's impl, but it's + working, whereas reimplementing the native one + faithfully does not. Reading the lib-level code where + this is invoked, my expectation is that we set res to 0 + for not-exists. */ + //warn("access res",jzPath,res); + wasm.poke32(pResOut, res); + }else{ + wasm.poke32(pResOut, 0); + } + return 0; + }catch(e){ + error('xAccess',e); + return cache.setError(e); + } + }, + + xRandomness: function(pVfs, nOut, pOut){ + const heap = wasm.heap8u(); + let i = 0; + const npOut = Number(pOut); + for(; i < nOut; ++i) heap[npOut + i] = (Math.random()*255000) & 0xFF; + return nOut; + }, + + xGetLastError: function(pVfs,nOut,pOut){ + const e = cache.popError(); + debug('xGetLastError',e); + if(e){ + const scope = wasm.scopedAllocPush(); + try{ + const [cMsg, n] = wasm.scopedAllocCString(e.message, true); + wasm.cstrncpy(pOut, cMsg, nOut); + if(n > nOut) wasm.poke8(wasm.ptr.add(pOut,nOut,-1), 0); + debug("set xGetLastError",e.message); + return (e.resultCode | 0) || capi.SQLITE_IOERR; + }catch(e){ + return capi.SQLITE_NOMEM; + }finally{ + wasm.scopedAllocPop(scope); + } + } + return 0; + } + +//#if 0 + // these impls work but there's currently no pressing need to _not_ use + // the native impls. + xCurrentTime: function(pVfs,pOut){ + wasm.poke64f(pOut, 2440587.5 + (Date.now()/86400000)); + return 0; + }, + + xCurrentTimeInt64: function(pVfs,pOut){ + wasm.poke64(pOut, (2440587.5 * 86400000) + Date.now()); + return 0; + } +//#/if + }/*.vfs*/, + + /** + kvvfs has separate sqlite3_io_methods impls for some of the + methods depending on whether it's a db or journal file. Some + of the methods use shared impls but others are specific to + either db or journal files. 
+ */ + ioDb:{ + /* sqlite3_kvvfs_methods::pIoDb's methods */ + xClose: function(pFile){ + cache.popError(); + try{ + const h = pFileHandles.get(pFile); + kvvfs?.log?.xClose && debug("xClose", pFile, h); + if( h ){ + pFileHandles.delete(pFile); + const s = h.store;//storageForZClass(h.jzClass); + s.files = s.files.filter((v)=>v!==h.file); + if( --s.refc<=0 && s.deleteAtRefc0 ){ + deleteStorage(s); + } + originalMethods.ioDb/*same for journals*/.xClose(pFile); + h.file.dispose(); + s.listeners && notifyListeners('close', s, s.files.length); + }else{ + /* Can happen if xOpen fails */ + } + return 0; + }catch(e){ + error("xClose",e); + return cache.setError(e); + } + }, + + xFileControl: function(pFile, opId, pArg){ + cache.popError(); + try{ + const h = pFileHandles.get(pFile); + util.assert(h, "Missing KVVfsFile handle"); + kvvfs?.log?.xFileControl && debug("xFileControl",h,'op =',opId); + if( opId===capi.SQLITE_FCNTL_PRAGMA + && kvvfsInternal.disablePageSizeChange ){ + /* pArg== length-3 (char**) */ + //const argv = wasm.cArgvToJs(3, pArg); // the easy way + const zName = wasm.peekPtr(wasm.ptr.add(pArg, wasm.ptr.size)); + if( "page_size"===wasm.cstrToJs(zName) ){ + kvvfs?.log?.xFileControl + && debug("xFileControl pragma",wasm.cstrToJs(zName)); + const zVal = wasm.peekPtr(wasm.ptr.add(pArg, 2*wasm.ptr.size)); + if( zVal ){ + /* Without this, pragma page_size=N; followed by a + vacuum breaks the db. With this, it continues + working but does not actually change the page + size. 
*/ + kvvfs?.log?.xFileControl + && warn("xFileControl pragma", h, + "NOT setting page size to", wasm.cstrToJs(zVal)); + h.file.$szPage = -1; + return 0/*corrupts: capi.SQLITE_NOTFOUND*/; + }else if( h.file.$szPage>0 ){ + kvvfs?.log?.xFileControl && + warn("xFileControl", h, "getting page size",h.file.$szPage); + wasm.pokePtr(pArg, wasm.allocCString(""+h.file.$szPage) + /* memory now owned by the library */); + return 0;//capi.SQLITE_NOTFOUND; + } + } + } + const rc = originalMethods.ioDb.xFileControl(pFile, opId, pArg); + if( 0==rc && capi.SQLITE_FCNTL_SYNC===opId ){ + h.store.listeners && notifyListeners('sync', h.store, false); + } + return rc; + }catch(e){ + error("xFileControl",e); + return cache.setError(e); + } + }, + + xSync: function(pFile,flags){ + cache.popError(); + try{ + const h = pFileHandles.get(pFile); + kvvfs?.log?.xSync && debug("xSync", h); + util.assert(h, "Missing KVVfsFile handle"); + const rc = originalMethods.ioDb.xSync(pFile, flags); + if( 0==rc && h.store.listeners ) notifyListeners('sync', h.store, true); + return rc; + }catch(e){ + error("xSync",e); + return cache.setError(e); + } + }, + +//#if 0 + // We override xRead/xWrite only for logging/debugging. They + // should otherwise be disabled (it's faster that way). 
+ xRead: function(pFile,pTgt,n,iOff64){ + cache.popError(); + try{ + if( kvvfs?.log?.xRead ){ + const h = pFileHandles.get(pFile); + util.assert(h, "Missing KVVfsFile handle"); + debug("xRead", n, iOff64, h); + } + return originalMethods.ioDb.xRead(pFile, pTgt, n, iOff64); + }catch(e){ + error("xRead",e); + return cache.setError(e); + } + }, + xWrite: function(pFile,pSrc,n,iOff64){ + cache.popError(); + try{ + if( kvvfs?.log?.xWrite ){ + const h = pFileHandles.get(pFile); + util.assert(h, "Missing KVVfsFile handle"); + debug("xWrite", n, iOff64, h); + } + return originalMethods.ioDb.xWrite(pFile, pSrc, n, iOff64); + }catch(e){ + error("xWrite",e); + return cache.setError(e); + } + }, +//#/if + +//#if 0 + xTruncate: function(pFile,i64){}, + xFileSize: function(pFile,pi64Out){}, + xLock: function(pFile,iLock){}, + xUnlock: function(pFile,iLock){}, + xCheckReservedLock: function(pFile,piOut){}, + xSectorSize: function(pFile){}, + xDeviceCharacteristics: function(pFile){} +//#/if + }/*.ioDb*/, + + ioJrnl:{ + /* sqlite3_kvvfs_methods::pIoJrnl's methods. Those set to true + are copied as-is from the ioDb objects. Others are specific + to journal files. 
*/ + xClose: true, +//#if 0 + xRead: function(pFile,pTgt,n,iOff64){}, + xWrite: function(pFile,pSrc,n,iOff64){}, + xTruncate: function(pFile,i64){}, + xSync: function(pFile,flags){}, + xFileControl: function(pFile, opId, pArg){}, + xFileSize: function(pFile,pi64Out){}, + xLock: true, + xUnlock: true, + xCheckReservedLock: true, + xSectorSize: true, + xDeviceCharacteristics: true +//#/if + }/*.ioJrnl*/ + }/*methodOverrides*/; + +//#if 0 + debug("pVfs and friends", pVfs, pIoDb, pIoJrnl, + kvvfsMethods, capi.sqlite3_file.structInfo, + KVVfsFile.structInfo); +//#/if + + try { + util.assert( cache.buffer.n>1024*129, "Heap buffer is not large enough" + /* Native is SQLITE_KVOS_SZ is 133073 as of this writing */ ); + for(const e of Object.entries(methodOverrides.recordHandler)){ + // Overwrite kvvfsMethods's callbacks + const k = e[0], f = e[1]; + recordHandler[k] = f; + if( 0 ){ + // bug: this should work + kvvfsMethods.installMethod(k, f); + }else{ + kvvfsMethods[kvvfsMethods.memberKey(k)] = + wasm.installFunction(kvvfsMethods.memberSignature(k), f); + } + } + for(const e of Object.entries(methodOverrides.vfs)){ + // Overwrite some pVfs entries and stash the original impls + const k = e[0], f = e[1], km = pVfs.memberKey(k), + member = pVfs.structInfo.members[k] + || util.toss("Missing pVfs.structInfo[",k,"]"); + originalMethods.vfs[k] = wasm.functionEntry(pVfs[km]); + pVfs[km] = wasm.installFunction(member.signature, f); + } + for(const e of Object.entries(methodOverrides.ioDb)){ + // Similar treatment for pVfs.$pIoDb a.k.a. pIoDb... + const k = e[0], f = e[1], km = pIoDb.memberKey(k); + originalMethods.ioDb[k] = wasm.functionEntry(pIoDb[km]) + || util.toss("Missing native pIoDb[",km,"]"); + pIoDb[km] = wasm.installFunction(pIoDb.memberSignature(k), f); + } + for(const e of Object.entries(methodOverrides.ioJrnl)){ + // Similar treatment for pVfs.$pIoJrnl a.k.a. pIoJrnl... 
+ const k = e[0], f = e[1], km = pIoJrnl.memberKey(k); + originalMethods.ioJrnl[k] = wasm.functionEntry(pIoJrnl[km]) + || util.toss("Missing native pIoJrnl[",km,"]"); + if( true===f ){ + /* use pIoDb's copy */ + pIoJrnl[km] = pIoDb[km] || util.toss("Missing copied pIoDb[",km,"]"); + }else{ + pIoJrnl[km] = wasm.installFunction(pIoJrnl.memberSignature(k), f); + } + } + }finally{ + kvvfsMethods.dispose(); + pVfs.dispose(); + pIoDb.dispose(); + pIoJrnl.dispose(); + } + + /* + That gets all of the low-level bits out of the way. What follows + are the public API additions. + */ + + /** + Clears all storage used by the kvvfs DB backend, deleting any + DB(s) stored there. + + Its argument must be the name of a kvvfs storage object: + + - 'session' + - 'local' + - '' - see below. + - A transient kvvfs storage object name. + + In the first two cases, only sessionStorage resp. localStorage is + cleared. An empty string resolves to both 'local' and 'session' + storage. + + Returns the number of entries cleared. + + As of kvvfs version 2: + + This API is available in Worker threads but does not have access + to localStorage or sessionStorage in them. Prior versions did not + include this API in Worker threads. + + Differences in this function in version 2: + + - It accepts an arbitrary storage name. In v1 this was a silent + no-op for any names other than ('local','session',''). + + - It throws if a db currently has the storage opened UNLESS the + storage object is localStorage or sessionStorage. That version 1 + did not throw for this case was due to an architectural + limitation which has since been overcome, but removal of + JsStorageDb.prototype.clearStorage() would be a backwards compatibility + break, so this function permits wiping the storage for those two + cases even if they are opened. Use with care. 
+ */ + const sqlite3_js_kvvfs_clear = function callee(which){ + if( ''===which ){ + return callee('local') + callee('session'); + } + const store = storageForZClass(which); + if( !store ) return 0; + if( store.files.length ){ + if( globalThis.localStorage===store.storage + || globalThis.sessionStorage===store.storage ){ + /* backwards compatibility: allow these to be cleared + while opened. */ + }else{ + /* Interestingly, kvvfs recovers just fine when the storage is + wiped, so long as the db is not in use and its schema is + recreated before it's used, but client apps should not have + to be faced with that eventuality mid-query (where it + _will_ cause failures). Therefore we disallow it when + storage handles are opened. Kvvfs version 1 could not + detect this case - see the if() block above. + */ + toss3(capi.SQLITE_ACCESS, + "Cannot clear in-use database storage."); + } + } + const s = store.storage; + const toRm = [] /* keys to remove */; + let i, n = s.length; + //debug("kvvfs_clear",store,s); + for( i = 0; i < n; ++i ){ + const k = s.key(i); + //debug("kvvfs_clear ?",k); + if(!store.keyPrefix || k.startsWith(store.keyPrefix)) toRm.push(k); + } + toRm.forEach((kk)=>s.removeItem(kk)); + //alertFilesToReload(store); + return toRm.length; + }; + + /** + This routine estimates the approximate amount of + storage used by the given kvvfs back-end. + + Its arguments are as documented for sqlite3_js_kvvfs_clear(), + only the operation this performs is different. + + The returned value is twice the "length" value of every matching + key and value, noting that JavaScript stores each character in 2 + bytes. + + The returned size is not authoritative from the perspective of + how much data can fit into localStorage and sessionStorage, as + the precise algorithms for determining those limits are + unspecified and may include per-entry overhead invisible to + clients. 
+ */ + const sqlite3_js_kvvfs_size = function callee(which){ + if( ''===which ){ + return callee('local') + callee('session'); + } + const store = storageForZClass(which); + if( !store ) return 0; + const s = store.storage; + let i, sz = 0; + for(i = 0; i < s.length; ++i){ + const k = s.key(i); + if(!store.keyPrefix || k.startsWith(store.keyPrefix)){ + sz += k.length; + sz += s.getItem(k).length; + } + } + return sz * 2 /* because JS uses 2-byte char encoding */; + }; + + /** + Exports a kvvfs storage object to an object, optionally + JSON-friendly. + + Usages: + + thisfunc(storageName); + thisfunc(options); + + In the latter case, the options argument must be an object with + the following properties: + + - "name" (string) required. The storage to export. + + - "decodePages" (bool=false). If true, the .pages result property + holds Uint8Array objects holding the raw binary-format db + pages. The default is to use kvvfs-encoded string pages + (JSON-friendly). + + - "includeJournal" (bool=false). If true and the db has a current + journal, it is exported as well. (Kvvfs journals are stored as a + single record within the db's storage object.) + + The returned object is structured as follows... + + - "name": the name of the storage. This is 'local' or 'session' + for localStorage resp. sessionStorage, and an arbitrary name for + transient storage. This property may be changed before passing + this object to sqlite3_js_kvvfs_import() in order to + import into a different storage object. + + - "timestamp": the time this function was called, in Unix + epoch milliseconds. + + - "size": the unencoded db size. + + - "journal": if options.includeJournal is true and this db has a + journal, it is stored as a string here, otherwise this property + is not set. + + - "pages": An array holding the raw encoded db pages in their + proper order. + + Throws if this db is not opened. 
+ + The encoding of the underlying database is not part of this + interface - it is simply passed on as-is. Interested parties are + directed to src/os_kv.c in the SQLite source tree, with the + caveat that that code also does not offer a public interface. + i.e. the encoding is a private implementation detail of kvvfs. + The format may be changed in the future but kvvfs will continue + to support the current form. + + Added in version @kvvfs-v2-added-in@. + */ + const sqlite3_js_kvvfs_export = function callee(...args){ + let opt; + if( 1===args.length && 'object'===typeof args[0] ){ + opt = args[0]; + }else if(args.length){ + opt = Object.assign(Object.create(null),{ + name: args[0], + //decodePages: true + }); + } + const store = opt ? storageForZClass(opt.name) : null; + if( !store ){ + toss3(capi.SQLITE_NOTFOUND, + "There is no kvvfs storage named",opt?.name); + } + //debug("store to export=",store); + const s = store.storage; + const rc = Object.assign(Object.create(null),{ + name: store.jzClass, + timestamp: Date.now(), + pages: [] + }); + const pages = Object.create(null); + let xpages; + const keyPrefix = store.keyPrefix; + const rxTail = keyPrefix + ? /^kvvfs-[^-]+-(\w+)/ /* X... part of kvvfs-NAME-X... */ + : undefined; + let i = 0, n = s.length; + for( ; i < n; ++i ){ + const k = s.key(i); + if( !keyPrefix || k.startsWith(keyPrefix) ){ + let kk = (keyPrefix ? rxTail.exec(k) : undefined)?.[1] ?? 
k; + switch( kk ){ + case 'jrnl': + if( opt.includeJournal ) rc.journal = s.getItem(k); + break; + case 'sz': + rc.size = +s.getItem(k); + break; + default: + kk = +kk /* coerce to number */; + if( !util.isInt32(kk) || kk<=0 ){ + toss3(capi.SQLITE_RANGE, "Malformed kvvfs key: "+k); + } + if( opt.decodePages ){ + const spg = s.getItem(k), + n = spg.length, + z = cache.memBuffer(0), + zDec = cache.memBuffer(1), + heap = wasm.heap8u()/* MUST be inited last*/; + let i = 0; + for( ; i < n; ++i ){ + heap[wasm.ptr.add(z, i)] = spg.codePointAt(i) & 0xff; + } + heap[wasm.ptr.add(z, i)] = 0; + //debug("Decoding",i,"page bytes"); + const nDec = kvvfsDecode( + z, zDec, cache.buffer.n + ); + //debug("Decoded",nDec,"page bytes"); + pages[kk] = heap.slice(Number(zDec), wasm.ptr.addn(zDec, nDec)); + }else{ + pages[kk] = s.getItem(k); + } + break; + } + } + } + if( opt.decodePages ) cache.memBufferFree(1); + /* Now sort the page numbers and move them into an array. In JS + property keys are always strings, so we have to coerce them to + numbers and sort with a numeric comparator (the default sort() + compares as strings, which would put page 10 before page 2). */ + Object.keys(pages).map((v)=>+v).sort((a,b)=>a-b).forEach( + (v)=>rc.pages.push(pages[v]) + ); + return rc; + }/* sqlite3_js_kvvfs_export */; + + /** + The counterpart of sqlite3_js_kvvfs_export(). Its + argument must be the result of that function or + a compatible one. + + This either replaces the contents of an existing transient + storage object or installs one named exp.name, setting + the storage's db contents to that of the exp object. + + Throws on error. Error conditions include: + + - The given storage object is currently opened by any db. + Performing this page-by-page import would invoke undefined + behavior on them. + + - Malformed input object. + + If it throws after starting the import then it clears the storage + before returning, to avoid leaving the db in an undefined + state. 
It may throw for any of the above-listed conditions before + reaching that step, in which case the db is not modified. If + exp.name refers to a new storage name then if it throws, the name + does not get installed. + + Added in version @kvvfs-v2-added-in@. + */ + const sqlite3_js_kvvfs_import = function(exp, overwrite=false){ + if( !exp?.timestamp + || !exp.name + || undefined===exp.size + || !Array.isArray(exp.pages) ){ + toss3(capi.SQLITE_MISUSE, "Malformed export object."); + }else if( !exp.size + || (exp.size !== (exp.size | 0)) + //|| (exp.size % cache.fixedPageSize) + || exp.size>=0x7fffffff ){ + toss3(capi.SQLITE_RANGE, "Invalid db size: "+exp.size); + } + + validateStorageName(exp.name); + let store = storageForZClass(exp.name); + const isNew = !store; + if( store ){ + if( !overwrite ){ + //warn("Storage exists:",arguments,store); + toss3(capi.SQLITE_ACCESS, + "Storage '"+exp.name+"' already exists and", + "overwrite was not specified."); + }else if( !store.files || !store.jzClass ){ + toss3(capi.SQLITE_ERROR, + "Internal storage object", exp.name,"seems to be malformed."); + }else if( store.files.length ){ + toss3(capi.SQLITE_IOERR_ACCESS, + "Cannot import db storage while it is in use."); + } + sqlite3_js_kvvfs_clear(exp.name); + }else{ + store = newStorageObj(exp.name); + //warn("Installing new storage:",store); + } + //debug("Importing store",store.poolEntry.files.length, store); + //debug("object to import:",exp); + const keyPrefix = kvvfsKeyPrefix(exp.name); + let zEnc; + try{ + /* Force the native KVVfsFile instances to re-read the db + and page size. 
*/; + const s = store.storage; + s.setItem(keyPrefix+'sz', exp.size); + if( exp.journal ) s.setItem(keyPrefix+'jrnl', exp.journal); + if( exp.pages[0] instanceof Uint8Array ){ + /* raw binary pages */ + //debug("pages",exp.pages); + exp.pages.forEach((u,ndx)=>{ + const n = u.length; + if( 0 && cache.fixedPageSize !== n ){ + util.toss3(capi.SQLITE_RANGE,"Unexpected page size:", n); + } + zEnc ??= cache.memBuffer(1); + const zBin = cache.memBuffer(0), + heap = wasm.heap8u()/*MUST be inited last*/; + /* Copy u to the heap and encode the heap copy via C. This + is _presumably_ faster than porting the encoding algo to + JS. */ + heap.set(u, Number(zBin)); + heap[wasm.ptr.addn(zBin,n)] = 0; + const rc = kvvfsEncode(zBin, n, zEnc); + util.assert( rc < cache.buffer.n, + "Impossibly long output - possibly smashed the heap" ); + util.assert( 0===wasm.peek8(wasm.ptr.add(zEnc,rc)), + "Expecting NUL-terminated encoded output" ); + const jenc = wasm.cstrToJs(zEnc); + //debug("(un)encoded page:",u,jenc); + s.setItem(keyPrefix+(ndx+1), jenc); + }); + }else if( exp.pages[0] ){ + /* kvvfs-encoded pages */ + exp.pages.forEach((v,ndx)=>s.setItem(keyPrefix+(ndx+1), v)); + } + if( isNew ) installStorageAndJournal(store); + }catch{ + if( !isNew ){ + try{sqlite3_js_kvvfs_clear(exp.name);}catch(ee){/*ignored*/} + } + }finally{ + if( zEnc ) cache.memBufferFree(1); + } + return this; + }; + + /** + If no kvvfs storage exists with the given name, one is + installed. If one exists, its reference count is increased so + that it won't be freed by the closing of a database or journal + file. + + Throws if the name is not valid for a new storage object. + + Added in version @kvvfs-v2-added-in@. 
+ */ + const sqlite3_js_kvvfs_reserve = function(name){ + let store = storageForZClass(name); + if( store ){ + ++store.refc; + return; + } + validateStorageName(name); + installStorageAndJournal(newStorageObj(name)); + }; + + /** + Conditionally "unlinks" a kvvfs storage object, reducing its + reference count by 1. + + This is a no-op if name ends in "-journal" or refers to a + built-in storage object. + + It will not lower the refcount below the number of + currently-opened db/journal files for the storage (so that it + cannot delete it out from under them). + + If the refcount reaches 0 then the storage object is + removed. + + Returns true if it reduces the refcount, else false. A result of + true does not necessarily mean that the storage unit was removed, + just that its refcount was lowered. Similarly, a result of false + does not mean that the storage is removed - it may still have + opened handles. + + Added in version @kvvfs-v2-added-in@. + */ + const sqlite3_js_kvvfs_unlink = function(name){ + const store = storageForZClass(name); + if( !store + || kvvfsIsPersistentName(store.jzClass) + || isBuiltinName(store.jzClass) + || cache.rxJournalSuffix.test(name) ) return false; + if( store.refc > store.files.length || 0===store.files.length ){ + if( --store.refc<=0 ){ + /* Ignoring deleteAtRefc0 for an explicit unlink */ + deleteStorage(store); + } + return true; + } + return false; + }; + + /** + Adds an event listener to a kvvfs storage object. The idea is + that this can be used to asynchronously back up one kvvfs storage + object to another or another channel entirely. (The caveat in the + latter case is that kvvfs's format is not readily consumable by + downstream code.) + + Its argument must be an object with the following properties: + + - storage: the name of the kvvfs storage object. + + - reserve [=false]: if true, sqlite3_js_kvvfs_reserve() is used + to ensure that the storage exists if it does not already. 
+ If this is false and the storage does not exist then an + exception is thrown. + + - events: an object which may have any of the following + callback function properties: open, close, write, delete. + + - decodePages [=false]: if true, write events will receive each + db page write in the form of a Uint8Array holding the raw binary + db page. The default is to emit the kvvfs-format page because it + requires no extra work, we already have it in hand, and it's + often smaller. It's not great for interchange, though. + + - includeJournal [=false]: if true, writes and deletes of + "jrnl" records are included. If false, no events are sent + for journal updates. + + Passing the same object to sqlite3_js_kvvfs_unlisten() will + remove the listener. + + Each one of the events callbacks will be called asynchronously + when the given storage performs those operations. They may be + asynchronous functions but are not required to be (the events are + fired async either way, but making the event callbacks async may + be advantageous when multiple listeners are involved). All + exceptions, including those via Promises, are ignored but may (or + may not) trigger warning output on the console. + + Each callback gets passed a single object with the following + properties: + + .type = the same as the name of the callback + + .storageName = the name of the storage object + + .data = callback-dependent: + + - 'open' and 'close' get an integer, the number of + currently-opened handles on the storage. + + - 'write' gets a length-two array holding the key and value which + were written. The key is always a string, even if it's a db page + number. For db-page records, the value's type depends on + opt.decodePages. All others, including the journal, are strings. + (The journal, being a kvvfs-specific format, is delivered in + that same JSON-friendly format.) More details below. + + - 'delete' gets the string-type key of the deleted record. 
+ + - 'sync' gets a boolean value: true if it was triggered by the db + file's xSync(), false if it was triggered by xFileControl(). The + latter triggers before the xSync() and also triggers if the DB + has PRAGMA SYNCHRONOUS=OFF (in which case xSync() is not + triggered). + + The key/value arguments to 'write', and key argument to 'delete', + are in one of the following forms: + + - 'sz' = the unencoded db size as a string. This specific key + is never deleted, so is only ever passed to 'write' events. + + - 'jrnl' = the current db journal as a kvvfs-encoded string. This + journal format is not useful anywhere except in the kvvfs + internals. These events are not fired if opt.includeJournal is + false. + + - '[1-9][0-9]*' (a db page number) = Its type depends on + opt.decodePages. These may be written and deleted in arbitrary + order. + + Design note: JS has StorageEvents but only in the main thread, + which is why the listeners are not based on that. + + Added in version @kvvfs-v2-added-in@. + */ + const sqlite3_js_kvvfs_listen = function(opt){ + if( !opt || 'object'!==typeof opt ){ + toss3(capi.SQLITE_MISUSE, "Expecting a listener object."); + } + let store = storageForZClass(opt.storage); + if( !store ){ + if( opt.storage && opt.reserve ){ + sqlite3_js_kvvfs_reserve(opt.storage); + store = storageForZClass(opt.storage); + util.assert(store, + "Unexpectedly cannot fetch reserved storage " + +opt.storage); + }else{ + toss3(capi.SQLITE_NOTFOUND,"No such storage:",opt.storage); + } + } + if( opt.events ){ + (store.listeners ??= []).push(opt); + } + }; + + /** + Removes the kvvfs event listeners for the given options + object. It must be passed the same object instance which was + passed to sqlite3_js_kvvfs_listen(). + + This has no side effects if opt is invalid or is not a match for + any listeners. + + Returns true if it unregisters its argument, else false. + + Added in version @kvvfs-v2-added-in@. 
+ */ + const sqlite3_js_kvvfs_unlisten = function(opt){ + const store = storageForZClass(opt?.storage); + if( store?.listeners && opt.events ){ + const n = store.listeners.length; + store.listeners = store.listeners.filter((v)=>v!==opt); + const rc = n>store.listeners.length; + if( !store.listeners.length ){ + // to speed up downstream checks for listeners + store.listeners = undefined; + } + return rc; + } + return false; + }; + + sqlite3.kvvfs.reserve = sqlite3_js_kvvfs_reserve; + sqlite3.kvvfs.import = sqlite3_js_kvvfs_import; + sqlite3.kvvfs.export = sqlite3_js_kvvfs_export; + sqlite3.kvvfs.unlink = sqlite3_js_kvvfs_unlink; + sqlite3.kvvfs.listen = sqlite3_js_kvvfs_listen; + sqlite3.kvvfs.unlisten = sqlite3_js_kvvfs_unlisten; + sqlite3.kvvfs.exists = (name)=>!!storageForZClass(name); + sqlite3.kvvfs.estimateSize = sqlite3_js_kvvfs_size; + sqlite3.kvvfs.clear = sqlite3_js_kvvfs_clear; + + + if( globalThis.Storage ){ + /** + Prior to version 2, kvvfs was only available in the main + thread. We retain that for the v1 APIs, exposing them only in + the main UI thread. As of version 2, kvvfs is available in all + threads but only via its v2 interface (sqlite3.kvvfs). + + These versions have a default argument value of "" which the v2 + versions lack. + */ + capi.sqlite3_js_kvvfs_size = (which="")=>sqlite3_js_kvvfs_size(which); + capi.sqlite3_js_kvvfs_clear = (which="")=>sqlite3_js_kvvfs_clear(which); + } + +//#if not omit-oo1 + if(sqlite3.oo1?.DB){ + /** + Functionally equivalent to DB(storageName,'c','kvvfs') except + that it throws if the given storage name is not one of 'local' + or 'session'. + + As of version 3.46, the argument may optionally be an options + object in the form: + + { + filename: 'session'|'local', + ... etc. (all options supported by the DB ctor) + } + + noting that the 'vfs' option supported by main DB + constructor is ignored here: the vfs is always 'kvvfs'. 
+ */ + const DB = sqlite3.oo1.DB; + sqlite3.oo1.JsStorageDb = function( + storageName = sqlite3.oo1.JsStorageDb.defaultStorageName + ){ + const opt = DB.dbCtorHelper.normalizeArgs(...arguments); + opt.vfs = 'kvvfs'; + if( 0 ){ + // Current tests rely on these, but that's arguably a bug + if( opt.flags ) opt.flags = 'cw'+opt.flags; + else opt.flags = 'cw'; + } + switch( opt.filename ){ + /* sqlite3_open(), in these builds, recognizes the names + below and performs some magic which we want to bypass + here for sanity's sake. */ + case ":sessionStorage:": opt.filename = 'session'; break; + case ":localStorage:": opt.filename = 'local'; break; + } + const m = /(file:(\/\/)?)([^?]+)/.exec(opt.filename); + validateStorageName( m ? m[3] : opt.filename); + DB.dbCtorHelper.call(this, opt); + }; + sqlite3.oo1.JsStorageDb.defaultStorageName + = cache.storagePool.session ? 'session' : nameOfThisThreadStorage; + const jdb = sqlite3.oo1.JsStorageDb; + jdb.prototype = Object.create(DB.prototype); + jdb.clearStorage = sqlite3_js_kvvfs_clear; + /** + DEPRECATED: the inherited method of this name (as opposed to + the "static" class method) is deprecated with version 2 of + kvvfs. This function will, for backwards compatibility, + continue to work with localStorage and sessionStorage, but will + throw for all other storage because they are opened. Version 1 + was not capable of recognizing that the storage was opened so + permitted wiping it out at any time, but that was arguably a + bug. + + Clears this database instance's storage or throws if this + instance has been closed. Returns the number of + database pages which were cleaned up. + */ + jdb.prototype.clearStorage = function(){ + return jdb.clearStorage(this.affirmOpen().dbFilename(), true); + }; + /** Equivalent to sqlite3_js_kvvfs_size(). */ + jdb.storageSize = sqlite3_js_kvvfs_size; + /** + Returns the _approximate_ number of bytes this database takes + up in its storage or throws if this instance has been closed. 
+ */ + jdb.prototype.storageSize = function(){ + return jdb.storageSize(this.affirmOpen().dbFilename(), true); + }; + }/*sqlite3.oo1.JsStorageDb*/ +//#/if not omit-oo1 + + if( sqlite3.__isUnderTest && sqlite3.vtab ){ + /** + An eponymous vtab for inspecting the kvvfs state. This is only + intended for use in testing and development, not part of the + public API. + */ + const cols = Object.assign(Object.create(null),{ + rowid: {type: 'INTEGER'}, + name: {type: 'TEXT'}, + nRef: {type: 'INTEGER'}, + nOpen: {type: 'INTEGER'}, + isTransient: {type: 'INTEGER'}, + dbSize: {type: 'INTEGER'} + }); + Object.keys(cols).forEach((v,i)=>cols[v].colId = i); + + const VT = sqlite3.vtab; + const ProtoCursor = Object.assign(Object.create(null),{ + row: function(){ + return cache.storagePool[this.names[this.rowid]]; + } + }); + Object.assign(Object.create(ProtoCursor),{ + rowid: 0, + names: Object.keys(cache.storagePool) + .filter(v=>!cache.rxJournalSuffix.test(v)) + }); + const cursorState = function(cursor, reset){ + const o = (cursor instanceof capi.sqlite3_vtab_cursor) + ? cursor + : VT.xCursor.get(cursor); + if( reset || !o.vTabState ){ + o.vTabState = Object.assign(Object.create(ProtoCursor),{ + rowid: 0, + names: Object.keys(cache.storagePool) + .filter(v=>!cache.rxJournalSuffix.test(v)) + }); + } + return o.vTabState; + }; + + const dbg = 1 ? 
()=>{} : (...args)=>debug("vtab",...args); + + const theModule = function f(){ + return f.mod ??= new sqlite3.capi.sqlite3_module().setupModule({ + catchExceptions: true, + methods: { + xConnect: function(pDb, pAux, argc, argv, ppVtab, pzErr){ + dbg("xConnect"); + try{ + const xcol = []; + Object.keys(cols).forEach((k)=>{ + xcol.push(k+" "+cols[k].type); + }); + const rc = capi.sqlite3_declare_vtab( + pDb, "CREATE TABLE ignored("+xcol.join(',')+")" + ); + if(0===rc){ + const t = VT.xVtab.create(ppVtab); + util.assert( + (t === VT.xVtab.get(wasm.peekPtr(ppVtab))), + "output pointer check failed" + ); + } + return rc; + }catch(e){ + return VT.xError('xConnect', e, capi.SQLITE_ERROR); + } + }, + xCreate: wasm.ptr.null, // eponymous only + //xCreate: true, // copy xConnect, i.e. also eponymous only + xDisconnect: function(pVtab){ + dbg("xDisconnect",...arguments); + VT.xVtab.dispose(pVtab); + return 0; + }, + xOpen: function(pVtab, ppCursor){ + dbg("xOpen",...arguments); + VT.xCursor.create(ppCursor); + return 0; + }, + xClose: function(pCursor){ + dbg("xClose",...arguments); + const c = VT.xCursor.unget(pCursor); + delete c.vTabState; + c.dispose(); + return 0; + }, + xNext: function(pCursor){ + dbg("xNext",...arguments); + const c = VT.xCursor.get(pCursor); + ++cursorState(c).rowid; + return 0; + }, + xColumn: function(pCursor, pCtx, iCol){ + dbg("xColumn",...arguments); + //const c = VT.xCursor.get(pCursor); + const st = cursorState(pCursor); + const store = st.row(); + util.assert(store, "Unexpected xColumn call"); + switch(iCol){ + case cols.rowid.colId: + capi.sqlite3_result_int(pCtx, st.rowid); + break; + case cols.name.colId: + capi.sqlite3_result_text(pCtx, store.jzClass, -1, capi.SQLITE_TRANSIENT); + break; + case cols.nRef.colId: + capi.sqlite3_result_int(pCtx, store.refc); + break; + case cols.nOpen.colId: + capi.sqlite3_result_int(pCtx, store.files.length); + break; + case cols.isTransient.colId: + capi.sqlite3_result_int(pCtx, !!store.deleteAtRefc0); + 
break; + case cols.dbSize.colId: + capi.sqlite3_result_int(pCtx, storageGetDbSize(store)); + break; + default: + capi.sqlite3_result_error(pCtx, "Invalid column id: "+iCol); + return capi.SQLITE_RANGE; + } + return 0; + }, + xRowid: function(pCursor, ppRowid64){ + dbg("xRowid",...arguments); + const st = cursorState(pCursor); + VT.xRowid(ppRowid64, st.rowid); + return 0; + }, + xEof: function(pCursor){ + const st = cursorState(pCursor); + dbg("xEof?="+(!st.row()),...arguments); + return !st.row(); + }, + xFilter: function(pCursor, idxNum, idxCStr, + argc, argv/* [sqlite3_value* ...] */){ + dbg("xFilter",...arguments); + const st = cursorState(pCursor, true); + return 0; + }, + xBestIndex: function(pVtab, pIdxInfo){ + dbg("xBestIndex",...arguments); + //const t = VT.xVtab.get(pVtab); + const pii = new capi.sqlite3_index_info(pIdxInfo); + pii.$estimatedRows = cache.storagePool.size; + pii.$estimatedCost = 1.0; + pii.dispose(); + return 0; + } + } + })/*setupModule*/; + }/*theModule()*/; + + sqlite3.kvvfs.create_module = function(pDb, name="sqlite_kvvfs"){ + return capi.sqlite3_create_module(pDb, name, theModule(), + wasm.ptr.null); + }; + + }/* virtual table */ + +//#if 0 + /** + The idea here is a simpler wrapper for listening to kvvfs + changes. Clients would override its onXyz() event methods + instead of providing callbacks for sqlite3.kvvfs.listen(), the + main (only?) benefit of which is that this class would do the + sorting-out and validation of event state before calling the + overloaded callbacks. 
+ */ + kvvfs.Listener = class KvvfsListener { + #store; + #listener; + + constructor(opt){ + this.#listenTo(opt); + } + + #event(ev){ + switch(ev.type){ + case 'open': this.onOpen(ev.data); break; + case 'close': this.onClose(ev.data); break; + case 'sync': this.onSync(ev.data); break; + case 'delete': + switch(ev.data){ + case 'jrnl': break; + default:{ + const n = +ev.data; + util.assert( n>0, "Expecting positive db page number" ); + this.onPageChange(n, null); + break; + } + } + break; + case 'write':{ + const key = ev.data[0], val = ev.data[1]; + switch( key ){ + case 'jrnl': break; + case 'sz':{ + const sz = +val; + util.assert( sz>0, "Expecting a db page number" ); + this.onSizeChange(sz); + break; + } + default: + T.assert( +key>0, "Expecting a positive db page number" ); + this.onPageChange(+key, val); + break; + } + break; + } + } + } + + #listenTo(opt){ + if(this.#listener){ + sqlite3_js_kvvfs_unlisten(this.#listener); + this.#listener = undefined; + } + const eventHandler = async function(ev){this.event(ev)}.bind(this); + const li = Object.assign( + { /* Defaults */ + reserve: false, + includeJournal: false, + decodePages: false, + storage: null + }, + (/*client options*/opt||{}), + {/*hard-coded options*/ + events: Object.assign(Object.create(null),{ + 'open': eventHandler, + 'close': eventHandler, + 'write': eventHandler, + 'delete': eventHandler, + 'sync': eventHandler + }) + } + ); + sqlite3_js_kvvfs_listen(li); + this.#listener = li; + } + + async onSizeChange(sz){} + async onPageChange(pgNo,content/*null for delete*/){} + async onSync(mode/*true=xSync, false=xFileControl*/){} + async onOpen(count){} + async onClose(count){} + }/*KvvfsListener*/; +//#/if nope + +})/*globalThis.sqlite3ApiBootstrap.initializers*/; +//#savepoint rollback +//#/if not omit-kvvfs diff --git a/ext/wasm/api/sqlite3-vfs-opfs-sahpool.c-pp.js b/ext/wasm/api/sqlite3-vfs-opfs-sahpool.c-pp.js index 69be338b0c..2990fb1470 100644 --- a/ext/wasm/api/sqlite3-vfs-opfs-sahpool.c-pp.js 
+++ b/ext/wasm/api/sqlite3-vfs-opfs-sahpool.c-pp.js @@ -55,6 +55,10 @@ */ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ 'use strict'; + if( sqlite3.config.disable?.vfs?.['opfs-sahpool'] ){ + return; + } + const toss = sqlite3.util.toss; const toss3 = sqlite3.util.toss3; const initPromises = Object.create(null) /* cache of (name:result) of VFS init results */; @@ -358,7 +362,7 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ try{ const [cMsg, n] = wasm.scopedAllocCString(e.message, true); wasm.cstrncpy(pOut, cMsg, nOut); - if(n > nOut) wasm.poke8(pOut + nOut - 1, 0); + if(n > nOut) wasm.poke8(wasm.ptr.add(pOut,nOut,-1), 0); }catch(e){ return capi.SQLITE_NOMEM; }finally{ @@ -410,7 +414,7 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ }/*vfsMethods*/; /** - Creates and initializes an sqlite3_vfs instance for an + Creates, initializes, and returns an sqlite3_vfs instance for an OpfsSAHPool. The argument is the VFS's name (JS string). Throws if the VFS name is already registered or if something @@ -1157,8 +1161,9 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ described at the end of these docs. This function accepts an options object to configure certain - parts but it is only acknowledged for the very first call and - ignored for all subsequent calls. + parts but it is only acknowledged for the very first call for + each distinct name and ignored for all subsequent calls with that + same name. The options, in alphabetical order: @@ -1224,7 +1229,14 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ - Paths given to it _must_ be absolute. Relative paths will not be properly recognized. This is arguably a bug but correcting it requires some hoop-jumping in routines which have no business - doing such tricks. + doing such tricks. 
(2026-01-19 (2.5 years later): the specifics + are lost to history, but this was a side effect of xOpen() + receiving an immutable C-string filename, to which no implicit + "/" can be prefixed without causing a discrepancy between what + the user provided and what the VFS stores. It's conceivable that + that quirk could be glossed over in xFullPathname(), but + regressions when doing so cannot be ruled out, so there are no + current plans to change this behavior.) - It is possible to install multiple instances under different names, each sandboxed from one another inside their own private @@ -1459,4 +1471,4 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ The OPFS SAH Pool VFS parts are elided from builds targeting node.js. */ -//#endif target:node +//#/if target:node diff --git a/ext/wasm/api/sqlite3-vfs-opfs-wl.c-pp.js b/ext/wasm/api/sqlite3-vfs-opfs-wl.c-pp.js new file mode 100644 index 0000000000..a3baee4269 --- /dev/null +++ b/ext/wasm/api/sqlite3-vfs-opfs-wl.c-pp.js @@ -0,0 +1,128 @@ +//#if not target:node +/* + 2026-02-20 + + The author disclaims copyright to this source code. In place of a + legal notice, here is a blessing: + + * May you do good and not evil. + * May you find forgiveness for yourself and forgive others. + * May you share freely, never taking more than you give. + + *********************************************************************** + + This file is a reimplementation of the "opfs" VFS (as distinct from + "opfs-sahpool") which uses WebLocks for locking instead of a bespoke + Atomics.wait()/notify() protocol. This file holds the "synchronous + half" of the VFS, whereas it shares the "asynchronous half" with the + "opfs" VFS. + + Testing has failed to show any genuine functional difference between + these VFSes other than "opfs-wl" being able to dole out xLock() + requests in a strictly FIFO manner by virtue of WebLocks being + globally managed by the browser. 
This tends to lead to, but does not + guaranty, fairer distribution of locks. Differences are unlikely to + be noticed except, perhaps, under very high contention. + + This file is intended to be appended to the main sqlite3 JS + deliverable somewhere after opfs-common-shared.c-pp.js. +*/ +'use strict'; +globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ + if( !sqlite3.opfs || sqlite3.config.disable?.vfs?.['opfs-wl'] ){ + return; + } + const util = sqlite3.util, + toss = sqlite3.util.toss; + const opfsUtil = sqlite3.opfs; + const vfsName = 'opfs-wl'; +/** + installOpfsWlVfs() returns a Promise which, on success, installs an + sqlite3_vfs named "opfs-wl", suitable for use with all sqlite3 APIs + which accept a VFS. It is intended to be called via + sqlite3ApiBootstrap.initializers or an equivalent mechanism. + + This VFS is essentially identical to the "opfs" VFS but uses + WebLocks for its xLock() and xUnlock() implementations. + + Quirks specific to this VFS: + + - The (officially undocumented) 'opfs-wl-disable' URL + argument will disable OPFS, making this function a no-op. + + Aside from locking differences in the VFSes, this function + otherwise behaves the same as + sqlite3-vfs-opfs.c-pp.js:installOpfsVfs(). +*/ +const installOpfsWlVfs = async function(options){ + options = opfsUtil.initOptions(vfsName,options); + if( !options ) return sqlite3; + const capi = sqlite3.capi, + state = opfsUtil.createVfsState(), + opfsVfs = state.vfs, + metrics = opfsVfs.metrics.counters, + mTimeStart = opfsVfs.mTimeStart, + mTimeEnd = opfsVfs.mTimeEnd, + opRun = opfsVfs.opRun, + debug = (...args)=>sqlite3.config.debug(vfsName+":",...args), + warn = (...args)=>sqlite3.config.warn(vfsName+":",...args), + __openFiles = opfsVfs.__openFiles; + + //debug("state",JSON.stringify(options)); + /* + At this point, createVfsState() has populated: + + - state: the configuration object we share with the async proxy. 
+ + - opfsVfs: an sqlite3_vfs instance with lots of JS state attached + to it. + + with any code common to both the "opfs" and "opfs-wl" VFSes. Now + comes the VFS-dependent work... + */ + return opfsVfs.bindVfs(util.nu({ + xLock: function(pFile,lockType){ + mTimeStart('xLock'); + //debug("xLock()..."); + const f = __openFiles[pFile]; + const rc = opRun('xLock', pFile, lockType); + if( !rc ) f.lockType = lockType; + mTimeEnd(); + return rc; + }, + xUnlock: function(pFile,lockType){ + mTimeStart('xUnlock'); + const f = __openFiles[pFile]; + const rc = opRun('xUnlock', pFile, lockType); + if( !rc ) f.lockType = lockType; + mTimeEnd(); + return rc; + } + }), function(sqlite3, vfs){ + /* Post-VFS-registration initialization... */ + if(sqlite3.oo1){ + const OpfsWlDb = function(...args){ + const opt = sqlite3.oo1.DB.dbCtorHelper.normalizeArgs(...args); + opt.vfs = vfs.$zName; + sqlite3.oo1.DB.dbCtorHelper.call(this, opt); + }; + OpfsWlDb.prototype = Object.create(sqlite3.oo1.DB.prototype); + sqlite3.oo1.OpfsWlDb = OpfsWlDb; + OpfsWlDb.importDb = opfsUtil.importDb; + /* The "opfs" VFS variant adds a + oo1.DB.dbCtorHelper.setVfsPostOpenCallback() callback to set + a high busy_timeout. That was a design mis-decision and is + inconsistent with sqlite3_open() and friends, but is retained + against the risk of introducing regressions if it's removed. + This variant does not repeat that mistake. 
+ */ + } + })/*bindVfs()*/; +}/*installOpfsWlVfs()*/; +globalThis.sqlite3ApiBootstrap.initializersAsync.push(async (sqlite3)=>{ + return installOpfsWlVfs().catch((e)=>{ + sqlite3.config.warn("Ignoring inability to install the",vfsName,"sqlite3_vfs:",e); + }); +}); +}/*sqlite3ApiBootstrap.initializers.push()*/); +//#/if target:node diff --git a/ext/wasm/api/sqlite3-vfs-opfs.c-pp.js b/ext/wasm/api/sqlite3-vfs-opfs.c-pp.js index 2b636460dd..8edfab4dab 100644 --- a/ext/wasm/api/sqlite3-vfs-opfs.c-pp.js +++ b/ext/wasm/api/sqlite3-vfs-opfs.c-pp.js @@ -16,11 +16,16 @@ asynchronous Origin-Private FileSystem (OPFS) APIs using a second Worker, implemented in sqlite3-opfs-async-proxy.js. This file is intended to be appended to the main sqlite3 JS deliverable somewhere - after sqlite3-api-oo1.js and before sqlite3-api-cleanup.js. + after sqlite3-api-oo1.js. */ 'use strict'; globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ -/** + if( !sqlite3.opfs || sqlite3.config.disable?.vfs?.opfs ){ + return; + } + const util = sqlite3.util, + opfsUtil = sqlite3.opfs || sqlite3.util.toss("Missing sqlite3.opfs"); + /** installOpfsVfs() returns a Promise which, on success, installs an sqlite3_vfs named "opfs", suitable for use with all sqlite3 APIs which accept a VFS. It is intended to be called via @@ -58,7 +63,8 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ The argument may optionally be a plain object with the following configuration options: - - proxyUri: name of the async proxy JS file. + - proxyUri: name of the async proxy JS file or a synchronous function + which, when called, returns such a name. - verbose (=2): an integer 0-3. 0 disables all logging, 1 enables logging of errors. 2 enables logging of warnings and errors. 3 @@ -70,1391 +76,105 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ Promise resolves. This is only intended for testing and development of the VFS, not client-side use. 
+ Additionally, the (officially undocumented) 'opfs-disable' URL + argument will disable OPFS, making this function a no-op. + On success, the Promise resolves to the top-most sqlite3 namespace - object and that object gets a new object installed in its - `opfs` property, containing several OPFS-specific utilities. + object. Success does not necessarily mean that it installs the VFS, + as there are legitimate non-error reasons for OPFS not to be + available. */ -const installOpfsVfs = function callee(options){ - if(!globalThis.SharedArrayBuffer - || !globalThis.Atomics){ - return Promise.reject( - new Error("Cannot install OPFS: Missing SharedArrayBuffer and/or Atomics. "+ - "The server must emit the COOP/COEP response headers to enable those. "+ - "See https://sqlite.org/wasm/doc/trunk/persistence.md#coop-coep") - ); - }else if('undefined'===typeof WorkerGlobalScope){ - return Promise.reject( - new Error("The OPFS sqlite3_vfs cannot run in the main thread "+ - "because it requires Atomics.wait().") - ); - }else if(!globalThis.FileSystemHandle || - !globalThis.FileSystemDirectoryHandle || - !globalThis.FileSystemFileHandle || - !globalThis.FileSystemFileHandle.prototype.createSyncAccessHandle || - !navigator?.storage?.getDirectory){ - return Promise.reject( - new Error("Missing required OPFS APIs.") - ); - } - if(!options || 'object'!==typeof options){ - options = Object.create(null); - } - const urlParams = new URL(globalThis.location.href).searchParams; - if(urlParams.has('opfs-disable')){ - //sqlite3.config.warn('Explicitly not installing "opfs" VFS due to opfs-disable flag.'); - return Promise.resolve(sqlite3); - } - if(undefined===options.verbose){ - options.verbose = urlParams.has('opfs-verbose') - ? 
(+urlParams.get('opfs-verbose') || 2) : 1; - } - if(undefined===options.sanityChecks){ - options.sanityChecks = urlParams.has('opfs-sanity-check'); - } - if(undefined===options.proxyUri){ - options.proxyUri = callee.defaultProxyUri; - } - - //sqlite3.config.warn("OPFS options =",options,globalThis.location); - - if('function' === typeof options.proxyUri){ - options.proxyUri = options.proxyUri(); - } - const thePromise = new Promise(function(promiseResolve_, promiseReject_){ - const loggers = [ - sqlite3.config.error, - sqlite3.config.warn, - sqlite3.config.log - ]; - const logImpl = (level,...args)=>{ - if(options.verbose>level) loggers[level]("OPFS syncer:",...args); - }; - const log = (...args)=>logImpl(2, ...args); - const warn = (...args)=>logImpl(1, ...args); - const error = (...args)=>logImpl(0, ...args); - const toss = sqlite3.util.toss; - const capi = sqlite3.capi; - const util = sqlite3.util; - const wasm = sqlite3.wasm; - const sqlite3_vfs = capi.sqlite3_vfs; - const sqlite3_file = capi.sqlite3_file; - const sqlite3_io_methods = capi.sqlite3_io_methods; - /** - Generic utilities for working with OPFS. This will get filled out - by the Promise setup and, on success, installed as sqlite3.opfs. - - ACHTUNG: do not rely on these APIs in client code. They are - experimental and subject to change or removal as the - OPFS-specific sqlite3_vfs evolves. - */ - const opfsUtil = Object.create(null); - - /** - Returns true if _this_ thread has access to the OPFS APIs. - */ - const thisThreadHasOPFS = ()=>{ - return globalThis.FileSystemHandle && - globalThis.FileSystemDirectoryHandle && - globalThis.FileSystemFileHandle && - globalThis.FileSystemFileHandle.prototype.createSyncAccessHandle && - navigator?.storage?.getDirectory; - }; - - /** - Not part of the public API. Solely for internal/development - use. 
- */ - opfsUtil.metrics = { - dump: function(){ - let k, n = 0, t = 0, w = 0; - for(k in state.opIds){ - const m = metrics[k]; - n += m.count; - t += m.time; - w += m.wait; - m.avgTime = (m.count && m.time) ? (m.time / m.count) : 0; - m.avgWait = (m.count && m.wait) ? (m.wait / m.count) : 0; - } - sqlite3.config.log(globalThis.location.href, - "metrics for",globalThis.location.href,":",metrics, - "\nTotal of",n,"op(s) for",t, - "ms (incl. "+w+" ms of waiting on the async side)"); - sqlite3.config.log("Serialization metrics:",metrics.s11n); - W.postMessage({type:'opfs-async-metrics'}); - }, - reset: function(){ - let k; - const r = (m)=>(m.count = m.time = m.wait = 0); - for(k in state.opIds){ - r(metrics[k] = Object.create(null)); - } - let s = metrics.s11n = Object.create(null); - s = s.serialize = Object.create(null); - s.count = s.time = 0; - s = metrics.s11n.deserialize = Object.create(null); - s.count = s.time = 0; - } - }/*metrics*/; - const opfsIoMethods = new sqlite3_io_methods(); - const opfsVfs = new sqlite3_vfs() - .addOnDispose( ()=>opfsIoMethods.dispose()); - let promiseWasRejected = undefined; - const promiseReject = (err)=>{ - promiseWasRejected = true; - opfsVfs.dispose(); - return promiseReject_(err); - }; - const promiseResolve = ()=>{ - promiseWasRejected = false; - return promiseResolve_(sqlite3); - }; - const W = -//#if target:es6-bundler-friendly - new Worker(new URL("sqlite3-opfs-async-proxy.js", import.meta.url)); -//#elif target:es6-module - new Worker(new URL(options.proxyUri, import.meta.url)); -//#else - new Worker(options.proxyUri); -//#endif - setTimeout(()=>{ - /* At attempt to work around a browser-specific quirk in which - the Worker load is failing in such a way that we neither - resolve nor reject it. This workaround gives that resolve/reject - a time limit and rejects if that timer expires. 
Discussion: - https://sqlite.org/forum/forumpost/a708c98dcb3ef */ - if(undefined===promiseWasRejected){ - promiseReject( - new Error("Timeout while waiting for OPFS async proxy worker.") - ); - } - }, 4000); - W._originalOnError = W.onerror /* will be restored later */; - W.onerror = function(err){ - // The error object doesn't contain any useful info when the - // failure is, e.g., that the remote script is 404. - error("Error initializing OPFS asyncer:",err); - promiseReject(new Error("Loading OPFS async Worker failed for unknown reasons.")); - }; - const pDVfs = capi.sqlite3_vfs_find(null)/*pointer to default VFS*/; - const dVfs = pDVfs - ? new sqlite3_vfs(pDVfs) - : null /* dVfs will be null when sqlite3 is built with - SQLITE_OS_OTHER. */; - opfsIoMethods.$iVersion = 1; - opfsVfs.$iVersion = 2/*yes, two*/; - opfsVfs.$szOsFile = capi.sqlite3_file.structInfo.sizeof; - opfsVfs.$mxPathname = 1024/* sure, why not? The OPFS name length limit - is undocumented/unspecified. */; - opfsVfs.$zName = wasm.allocCString("opfs"); - // All C-side memory of opfsVfs is zeroed out, but just to be explicit: - opfsVfs.$xDlOpen = opfsVfs.$xDlError = opfsVfs.$xDlSym = opfsVfs.$xDlClose = null; - opfsVfs.addOnDispose( - '$zName', opfsVfs.$zName, - 'cleanup default VFS wrapper', ()=>(dVfs ? dVfs.dispose() : null) - ); - /** - Pedantic sidebar about opfsVfs.ondispose: the entries in that array - are items to clean up when opfsVfs.dispose() is called, but in this - environment it will never be called. The VFS instance simply - hangs around until the WASM module instance is cleaned up. We - "could" _hypothetically_ clean it up by "importing" an - sqlite3_os_end() impl into the wasm build, but the shutdown order - of the wasm engine and the JS one are undefined so there is no - guaranty that the opfsVfs instance would be available in one - environment or the other when sqlite3_os_end() is called (_if_ it - gets called at all in a wasm build, which is undefined). 
- */ - /** - State which we send to the async-api Worker or share with it. - This object must initially contain only cloneable or sharable - objects. After the worker's "inited" message arrives, other types - of data may be added to it. - - For purposes of Atomics.wait() and Atomics.notify(), we use a - SharedArrayBuffer with one slot reserved for each of the API - proxy's methods. The sync side of the API uses Atomics.wait() - on the corresponding slot and the async side uses - Atomics.notify() on that slot. - - The approach of using a single SAB to serialize comms for all - instances might(?) lead to deadlock situations in multi-db - cases. We should probably have one SAB here with a single slot - for locking a per-file initialization step and then allocate a - separate SAB like the above one for each file. That will - require a bit of acrobatics but should be feasible. The most - problematic part is that xOpen() would have to use - postMessage() to communicate its SharedArrayBuffer, and mixing - that approach with Atomics.wait/notify() gets a bit messy. - */ - const state = Object.create(null); - state.verbose = options.verbose; - state.littleEndian = (()=>{ - const buffer = new ArrayBuffer(2); - new DataView(buffer).setInt16(0, 256, true /* ==>littleEndian */); - // Int16Array uses the platform's endianness. - return new Int16Array(buffer)[0] === 256; - })(); - /** - asyncIdleWaitTime is how long (ms) to wait, in the async proxy, - for each Atomics.wait() when waiting on inbound VFS API calls. - We need to wake up periodically to give the thread a chance to - do other things. If this is too high (e.g. 500ms) then even two - workers/tabs can easily run into locking errors. Some multiple - of this value is also used for determining how long to wait on - lock contention to free up. - */ - state.asyncIdleWaitTime = 150; - - /** - Whether the async counterpart should log exceptions to - the serialization channel. 
That produces a great deal of - noise for seemingly innocuous things like xAccess() checks - for missing files, so this option may have one of 3 values: - - 0 = no exception logging. - - 1 = only log exceptions for "significant" ops like xOpen(), - xRead(), and xWrite(). - - 2 = log all exceptions. - */ - state.asyncS11nExceptions = 1; - /* Size of file I/O buffer block. 64k = max sqlite3 page size, and - xRead/xWrite() will never deal in blocks larger than that. */ - state.fileBufferSize = 1024 * 64; - state.sabS11nOffset = state.fileBufferSize; - /** - The size of the block in our SAB for serializing arguments and - result values. Needs to be large enough to hold serialized - values of any of the proxied APIs. Filenames are the largest - part but are limited to opfsVfs.$mxPathname bytes. We also - store exceptions there, so it needs to be long enough to hold - a reasonably long exception string. - */ - state.sabS11nSize = opfsVfs.$mxPathname * 2; - /** - The SAB used for all data I/O between the synchronous and - async halves (file i/o and arg/result s11n). - */ - state.sabIO = new SharedArrayBuffer( - state.fileBufferSize/* file i/o block */ - + state.sabS11nSize/* argument/result serialization block */ - ); - state.opIds = Object.create(null); - const metrics = Object.create(null); - { - /* Indexes for use in our SharedArrayBuffer... */ - let i = 0; - /* SAB slot used to communicate which operation is desired - between both workers. This worker writes to it and the other - listens for changes. */ - state.opIds.whichOp = i++; - /* Slot for storing return values. This worker listens to that - slot and the other worker writes to it. */ - state.opIds.rc = i++; - /* Each function gets an ID which this worker writes to - the whichOp slot. The async-api worker uses Atomic.wait() - on the whichOp slot to figure out which operation to run - next. 
*/ - state.opIds.xAccess = i++; - state.opIds.xClose = i++; - state.opIds.xDelete = i++; - state.opIds.xDeleteNoWait = i++; - state.opIds.xFileSize = i++; - state.opIds.xLock = i++; - state.opIds.xOpen = i++; - state.opIds.xRead = i++; - state.opIds.xSleep = i++; - state.opIds.xSync = i++; - state.opIds.xTruncate = i++; - state.opIds.xUnlock = i++; - state.opIds.xWrite = i++; - state.opIds.mkdir = i++; - state.opIds['opfs-async-metrics'] = i++; - state.opIds['opfs-async-shutdown'] = i++; - /* The retry slot is used by the async part for wait-and-retry - semantics. Though we could hypothetically use the xSleep slot - for that, doing so might lead to undesired side effects. */ - state.opIds.retry = i++; - state.sabOP = new SharedArrayBuffer( - i * 4/* ==sizeof int32, noting that Atomics.wait() and friends - can only function on Int32Array views of an SAB. */); - opfsUtil.metrics.reset(); - } - /** - SQLITE_xxx constants to export to the async worker - counterpart... - */ - state.sq3Codes = Object.create(null); - [ - 'SQLITE_ACCESS_EXISTS', - 'SQLITE_ACCESS_READWRITE', - 'SQLITE_BUSY', - 'SQLITE_CANTOPEN', - 'SQLITE_ERROR', - 'SQLITE_IOERR', - 'SQLITE_IOERR_ACCESS', - 'SQLITE_IOERR_CLOSE', - 'SQLITE_IOERR_DELETE', - 'SQLITE_IOERR_FSYNC', - 'SQLITE_IOERR_LOCK', - 'SQLITE_IOERR_READ', - 'SQLITE_IOERR_SHORT_READ', - 'SQLITE_IOERR_TRUNCATE', - 'SQLITE_IOERR_UNLOCK', - 'SQLITE_IOERR_WRITE', - 'SQLITE_LOCK_EXCLUSIVE', - 'SQLITE_LOCK_NONE', - 'SQLITE_LOCK_PENDING', - 'SQLITE_LOCK_RESERVED', - 'SQLITE_LOCK_SHARED', - 'SQLITE_LOCKED', - 'SQLITE_MISUSE', - 'SQLITE_NOTFOUND', - 'SQLITE_OPEN_CREATE', - 'SQLITE_OPEN_DELETEONCLOSE', - 'SQLITE_OPEN_MAIN_DB', - 'SQLITE_OPEN_READONLY' - ].forEach((k)=>{ - if(undefined === (state.sq3Codes[k] = capi[k])){ - toss("Maintenance required: not found:",k); - } - }); - state.opfsFlags = Object.assign(Object.create(null),{ - /** - Flag for use with xOpen(). URI flag "opfs-unlock-asap=1" - enables this. See defaultUnlockAsap, below. 
- */ - OPFS_UNLOCK_ASAP: 0x01, - /** - Flag for use with xOpen(). URI flag "delete-before-open=1" - tells the VFS to delete the db file before attempting to open - it. This can be used, e.g., to replace a db which has been - corrupted (without forcing us to expose a delete/unlink() - function in the public API). - - Failure to unlink the file is ignored but may lead to - downstream errors. An unlink can fail if, e.g., another tab - has the handle open. - - It goes without saying that deleting a file out from under another - instance results in Undefined Behavior. - */ - OPFS_UNLINK_BEFORE_OPEN: 0x02, - /** - If true, any async routine which implicitly acquires a sync - access handle (i.e. an OPFS lock) will release that lock at - the end of the call which acquires it. If false, such - "autolocks" are not released until the VFS is idle for some - brief amount of time. - - The benefit of enabling this is much higher concurrency. The - down-side is much-reduced performance (as much as a 4x decrease - in speedtest1). - */ - defaultUnlockAsap: false - }); - - /** - Runs the given operation (by name) in the async worker - counterpart, waits for its response, and returns the result - which the async worker writes to SAB[state.opIds.rc]. The - 2nd and subsequent arguments must be the arguments for the - async op. 
- */ - const opRun = (op,...args)=>{ - const opNdx = state.opIds[op] || toss("Invalid op ID:",op); - state.s11n.serialize(...args); - Atomics.store(state.sabOPView, state.opIds.rc, -1); - Atomics.store(state.sabOPView, state.opIds.whichOp, opNdx); - Atomics.notify(state.sabOPView, state.opIds.whichOp) - /* async thread will take over here */; - const t = performance.now(); - while('not-equal'!==Atomics.wait(state.sabOPView, state.opIds.rc, -1)){ - /* - The reason for this loop is buried in the details of a long - discussion at: - - https://github.com/sqlite/sqlite-wasm/issues/12 - - Summary: in at least one browser flavor, under high loads, - the wait()/notify() pairings can get out of sync. Calling - wait() here until it returns 'not-equal' gets them back in - sync. - */ - } - /* When the above wait() call returns 'not-equal', the async - half will have completed the operation and reported its results - in the state.opIds.rc slot of the SAB. */ - const rc = Atomics.load(state.sabOPView, state.opIds.rc); - metrics[op].wait += performance.now() - t; - if(rc && state.asyncS11nExceptions){ - const err = state.s11n.deserialize(); - if(err) error(op+"() async error:",...err); +const installOpfsVfs = async function(options){ + options = opfsUtil.initOptions('opfs',options); + if( !options ) return sqlite3; + const capi = sqlite3.capi, + state = opfsUtil.createVfsState(), + opfsVfs = state.vfs, + metrics = opfsVfs.metrics.counters, + mTimeStart = opfsVfs.mTimeStart, + mTimeEnd = opfsVfs.mTimeEnd, + opRun = opfsVfs.opRun, + debug = (...args)=>sqlite3.config.debug("opfs:",...args), + warn = (...args)=>sqlite3.config.warn("opfs:",...args), + __openFiles = opfsVfs.__openFiles; + + //debug("options:",JSON.stringify(options)); + /* + At this point, createVfsState() has populated: + + - state: the configuration object we share with the async proxy. + + - opfsVfs: an sqlite3_vfs instance with lots of JS state attached + to it. 
+ + with any code common to both the "opfs" and "opfs-wl" VFSes. Now + comes the VFS-dependent work... + */ + return opfsVfs.bindVfs(util.nu({ + xLock: function(pFile,lockType){ + mTimeStart('xLock'); + ++metrics.xLock.count; + const f = __openFiles[pFile]; + let rc = 0; + /* All OPFS locks are exclusive locks. If xLock() has + previously succeeded, do nothing except record the lock + type. If no lock is active, have the async counterpart + lock the file. */ + if( f.lockType ) { + f.lockType = lockType; + }else{ + rc = opRun('xLock', pFile, lockType); + if( 0===rc ) f.lockType = lockType; } + mTimeEnd(); return rc; - }; - - /** - Not part of the public API. Only for test/development use. - */ - opfsUtil.debug = { - asyncShutdown: ()=>{ - warn("Shutting down OPFS async listener. The OPFS VFS will no longer work."); - opRun('opfs-async-shutdown'); - }, - asyncRestart: ()=>{ - warn("Attempting to restart OPFS VFS async listener. Might work, might not."); - W.postMessage({type: 'opfs-async-restart'}); - } - }; - - const initS11n = ()=>{ - /** - !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - ACHTUNG: this code is 100% duplicated in the other half of - this proxy! The documentation is maintained in the - "synchronous half". - !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - This proxy de/serializes cross-thread function arguments and - output-pointer values via the state.sabIO SharedArrayBuffer, - using the region defined by (state.sabS11nOffset, - state.sabS11nOffset + state.sabS11nSize]. Only one dataset is - recorded at a time. - - This is not a general-purpose format. It only supports the - range of operations, and data sizes, needed by the - sqlite3_vfs and sqlite3_io_methods operations. Serialized - data are transient and this serialization algorithm may - change at any time. 
- - The data format can be succinctly summarized as: - - Nt...Td...D - - Where: - - - N = number of entries (1 byte) - - - t = type ID of first argument (1 byte) - - - ...T = type IDs of the 2nd and subsequent arguments (1 byte - each). - - - d = raw bytes of first argument (per-type size). - - - ...D = raw bytes of the 2nd and subsequent arguments (per-type - size). - - All types except strings have fixed sizes. Strings are stored - using their TextEncoder/TextDecoder representations. It would - arguably make more sense to store them as Int16Arrays of - their JS character values, but how best/fastest to get that - in and out of string form is an open point. Initial - experimentation with that approach did not gain us any speed. - - Historical note: this impl was initially about 1% this size by - using using JSON.stringify/parse(), but using fit-to-purpose - serialization saves considerable runtime. - */ - if(state.s11n) return state.s11n; - const textDecoder = new TextDecoder(), - textEncoder = new TextEncoder('utf-8'), - viewU8 = new Uint8Array(state.sabIO, state.sabS11nOffset, state.sabS11nSize), - viewDV = new DataView(state.sabIO, state.sabS11nOffset, state.sabS11nSize); - state.s11n = Object.create(null); - /* Only arguments and return values of these types may be - serialized. This covers the whole range of types needed by the - sqlite3_vfs API. 
*/ - const TypeIds = Object.create(null); - TypeIds.number = { id: 1, size: 8, getter: 'getFloat64', setter: 'setFloat64' }; - TypeIds.bigint = { id: 2, size: 8, getter: 'getBigInt64', setter: 'setBigInt64' }; - TypeIds.boolean = { id: 3, size: 4, getter: 'getInt32', setter: 'setInt32' }; - TypeIds.string = { id: 4 }; - - const getTypeId = (v)=>( - TypeIds[typeof v] - || toss("Maintenance required: this value type cannot be serialized.",v) - ); - const getTypeIdById = (tid)=>{ - switch(tid){ - case TypeIds.number.id: return TypeIds.number; - case TypeIds.bigint.id: return TypeIds.bigint; - case TypeIds.boolean.id: return TypeIds.boolean; - case TypeIds.string.id: return TypeIds.string; - default: toss("Invalid type ID:",tid); - } - }; - - /** - Returns an array of the deserialized state stored by the most - recent serialize() operation (from from this thread or the - counterpart thread), or null if the serialization buffer is - empty. If passed a truthy argument, the serialization buffer - is cleared after deserialization. - */ - state.s11n.deserialize = function(clear=false){ - ++metrics.s11n.deserialize.count; - const t = performance.now(); - const argc = viewU8[0]; - const rc = argc ? [] : null; - if(argc){ - const typeIds = []; - let offset = 1, i, n, v; - for(i = 0; i < argc; ++i, ++offset){ - typeIds.push(getTypeIdById(viewU8[offset])); - } - for(i = 0; i < argc; ++i){ - const t = typeIds[i]; - if(t.getter){ - v = viewDV[t.getter](offset, state.littleEndian); - offset += t.size; - }else{/*String*/ - n = viewDV.getInt32(offset, state.littleEndian); - offset += 4; - v = textDecoder.decode(viewU8.slice(offset, offset+n)); - offset += n; - } - rc.push(v); - } - } - if(clear) viewU8[0] = 0; - //log("deserialize:",argc, rc); - metrics.s11n.deserialize.time += performance.now() - t; - return rc; - }; - - /** - Serializes all arguments to the shared buffer for consumption - by the counterpart thread. 
- - This routine is only intended for serializing OPFS VFS - arguments and (in at least one special case) result values, - and the buffer is sized to be able to comfortably handle - those. - - If passed no arguments then it zeroes out the serialization - state. - */ - state.s11n.serialize = function(...args){ - const t = performance.now(); - ++metrics.s11n.serialize.count; - if(args.length){ - //log("serialize():",args); - const typeIds = []; - let i = 0, offset = 1; - viewU8[0] = args.length & 0xff /* header = # of args */; - for(; i < args.length; ++i, ++offset){ - /* Write the TypeIds.id value into the next args.length - bytes. */ - typeIds.push(getTypeId(args[i])); - viewU8[offset] = typeIds[i].id; - } - for(i = 0; i < args.length; ++i) { - /* Deserialize the following bytes based on their - corresponding TypeIds.id from the header. */ - const t = typeIds[i]; - if(t.setter){ - viewDV[t.setter](offset, args[i], state.littleEndian); - offset += t.size; - }else{/*String*/ - const s = textEncoder.encode(args[i]); - viewDV.setInt32(offset, s.byteLength, state.littleEndian); - offset += 4; - viewU8.set(s, offset); - offset += s.byteLength; - } - } - //log("serialize() result:",viewU8.slice(0,offset)); - }else{ - viewU8[0] = 0; - } - metrics.s11n.serialize.time += performance.now() - t; - }; - return state.s11n; - }/*initS11n()*/; - - /** - Generates a random ASCII string len characters long, intended for - use as a temporary file name. - */ - const randomFilename = function f(len=16){ - if(!f._chars){ - f._chars = "abcdefghijklmnopqrstuvwxyz"+ - "ABCDEFGHIJKLMNOPQRSTUVWXYZ"+ - "012346789"; - f._n = f._chars.length; - } - const a = []; - let i = 0; - for( ; i < len; ++i){ - const ndx = Math.random() * (f._n * 64) % f._n | 0; - a[i] = f._chars[ndx]; - } - return a.join(""); - /* - An alternative impl. 
with an unpredictable length - but much simpler: - - Math.floor(Math.random() * Number.MAX_SAFE_INTEGER).toString(36) - */ - }; - - /** - Map of sqlite3_file pointers to objects constructed by xOpen(). - */ - const __openFiles = Object.create(null); - - const opTimer = Object.create(null); - opTimer.op = undefined; - opTimer.start = undefined; - const mTimeStart = (op)=>{ - opTimer.start = performance.now(); - opTimer.op = op; - ++metrics[op].count; - }; - const mTimeEnd = ()=>( - metrics[opTimer.op].time += performance.now() - opTimer.start - ); - - /** - Impls for the sqlite3_io_methods methods. Maintenance reminder: - members are in alphabetical order to simplify finding them. - */ - const ioSyncWrappers = { - xCheckReservedLock: function(pFile,pOut){ - /** - As of late 2022, only a single lock can be held on an OPFS - file. We have no way of checking whether any _other_ db - connection has a lock except by trying to obtain and (on - success) release a sync-handle for it, but doing so would - involve an inherent race condition. For the time being, - pending a better solution, we simply report whether the - given pFile is open. 
- - Update 2024-06-12: based on forum discussions, this - function now always sets pOut to 0 (false): - - https://sqlite.org/forum/forumpost/a2f573b00cda1372 - */ - wasm.poke(pOut, 0, 'i32'); - return 0; - }, - xClose: function(pFile){ - mTimeStart('xClose'); - let rc = 0; - const f = __openFiles[pFile]; - if(f){ - delete __openFiles[pFile]; - rc = opRun('xClose', pFile); - if(f.sq3File) f.sq3File.dispose(); - } - mTimeEnd(); - return rc; - }, - xDeviceCharacteristics: function(pFile){ - return capi.SQLITE_IOCAP_UNDELETABLE_WHEN_OPEN; - }, - xFileControl: function(pFile, opId, pArg){ - /*mTimeStart('xFileControl'); - mTimeEnd();*/ - return capi.SQLITE_NOTFOUND; - }, - xFileSize: function(pFile,pSz64){ - mTimeStart('xFileSize'); - let rc = opRun('xFileSize', pFile); - if(0==rc){ - try { - const sz = state.s11n.deserialize()[0]; - wasm.poke(pSz64, sz, 'i64'); - }catch(e){ - error("Unexpected error reading xFileSize() result:",e); - rc = state.sq3Codes.SQLITE_IOERR; - } - } - mTimeEnd(); - return rc; - }, - xLock: function(pFile,lockType){ - mTimeStart('xLock'); - const f = __openFiles[pFile]; - let rc = 0; - /* All OPFS locks are exclusive locks. If xLock() has - previously succeeded, do nothing except record the lock - type. If no lock is active, have the async counterpart - lock the file. */ - if( !f.lockType ) { - rc = opRun('xLock', pFile, lockType); - if( 0===rc ) f.lockType = lockType; - }else{ - f.lockType = lockType; - } - mTimeEnd(); - return rc; - }, - xRead: function(pFile,pDest,n,offset64){ - mTimeStart('xRead'); - const f = __openFiles[pFile]; - let rc; - try { - rc = opRun('xRead',pFile, n, Number(offset64)); - if(0===rc || capi.SQLITE_IOERR_SHORT_READ===rc){ - /** - Results get written to the SharedArrayBuffer f.sabView. - Because the heap is _not_ a SharedArrayBuffer, we have - to copy the results. TypedArray.set() seems to be the - fastest way to copy this. 
*/ - wasm.heap8u().set(f.sabView.subarray(0, n), Number(pDest)); - } - }catch(e){ - error("xRead(",arguments,") failed:",e,f); - rc = capi.SQLITE_IOERR_READ; - } - mTimeEnd(); - return rc; - }, - xSync: function(pFile,flags){ - mTimeStart('xSync'); - ++metrics.xSync.count; - const rc = opRun('xSync', pFile, flags); - mTimeEnd(); - return rc; - }, - xTruncate: function(pFile,sz64){ - mTimeStart('xTruncate'); - const rc = opRun('xTruncate', pFile, Number(sz64)); - mTimeEnd(); - return rc; - }, - xUnlock: function(pFile,lockType){ - mTimeStart('xUnlock'); - const f = __openFiles[pFile]; - let rc = 0; - if( capi.SQLITE_LOCK_NONE === lockType + }, + xUnlock: function(pFile,lockType){ + mTimeStart('xUnlock'); + ++metrics.xUnlock.count; + const f = __openFiles[pFile]; + let rc = 0; + if( capi.SQLITE_LOCK_NONE === lockType && f.lockType ){ - rc = opRun('xUnlock', pFile, lockType); - } - if( 0===rc ) f.lockType = lockType; - mTimeEnd(); - return rc; - }, - xWrite: function(pFile,pSrc,n,offset64){ - mTimeStart('xWrite'); - const f = __openFiles[pFile]; - let rc; - try { - f.sabView.set(wasm.heap8u().subarray( - Number(pSrc), Number(pSrc) + n - )); - rc = opRun('xWrite', pFile, n, Number(offset64)); - }catch(e){ - error("xWrite(",arguments,") failed:",e,f); - rc = capi.SQLITE_IOERR_WRITE; - } - mTimeEnd(); - return rc; + rc = opRun('xUnlock', pFile, lockType); } - }/*ioSyncWrappers*/; - - /** - Impls for the sqlite3_vfs methods. Maintenance reminder: members - are in alphabetical order to simplify finding them. - */ - const vfsSyncWrappers = { - xAccess: function(pVfs,zName,flags,pOut){ - mTimeStart('xAccess'); - const rc = opRun('xAccess', wasm.cstrToJs(zName)); - wasm.poke( pOut, (rc ? 
0 : 1), 'i32' ); - mTimeEnd(); - return 0; - }, - xCurrentTime: function(pVfs,pOut){ - /* If it turns out that we need to adjust for timezone, see: - https://stackoverflow.com/a/11760121/1458521 */ - wasm.poke(pOut, 2440587.5 + (new Date().getTime()/86400000), - 'double'); - return 0; - }, - xCurrentTimeInt64: function(pVfs,pOut){ - wasm.poke(pOut, (2440587.5 * 86400000) + new Date().getTime(), - 'i64'); - return 0; - }, - xDelete: function(pVfs, zName, doSyncDir){ - mTimeStart('xDelete'); - const rc = opRun('xDelete', wasm.cstrToJs(zName), doSyncDir, false); - mTimeEnd(); - return rc; - }, - xFullPathname: function(pVfs,zName,nOut,pOut){ - /* Until/unless we have some notion of "current dir" - in OPFS, simply copy zName to pOut... */ - const i = wasm.cstrncpy(pOut, zName, nOut); - return i!!v) : p; - }; - - /** - Takes the absolute path to a filesystem element. Returns an - array of [handleOfContainingDir, filename]. If the 2nd argument - is truthy then each directory element leading to the file is - created along the way. Throws if any creation or resolution - fails. - */ - opfsUtil.getDirForFilename = async function f(absFilename, createDirs = false){ - const path = opfsUtil.getResolvedPath(absFilename, true); - const filename = path.pop(); - let dh = opfsUtil.rootDirectory; - for(const dirName of path){ - if(dirName){ - dh = await dh.getDirectoryHandle(dirName, {create: !!createDirs}); - } - } - return [dh, filename]; - }; - - /** - Creates the given directory name, recursively, in - the OPFS filesystem. Returns true if it succeeds or the - directory already exists, else false. 
- */ - opfsUtil.mkdir = async function(absDirName){ - try { - await opfsUtil.getDirForFilename(absDirName+"/filepart", true); - return true; - }catch(e){ - //sqlite3.config.warn("mkdir(",absDirName,") failed:",e); - return false; - } - }; - /** - Checks whether the given OPFS filesystem entry exists, - returning true if it does, false if it doesn't or if an - exception is intercepted while trying to make the - determination. - */ - opfsUtil.entryExists = async function(fsEntryName){ - try { - const [dh, fn] = await opfsUtil.getDirForFilename(fsEntryName); - await dh.getFileHandle(fn); - return true; - }catch(e){ - return false; - } - }; - - /** - Generates a random ASCII string, intended for use as a - temporary file name. Its argument is the length of the string, - defaulting to 16. - */ - opfsUtil.randomFilename = randomFilename; - - /** - Returns a promise which resolves to an object which represents - all files and directories in the OPFS tree. The top-most object - has two properties: `dirs` is an array of directory entries - (described below) and `files` is a list of file names for all - files in that directory. - - Traversal starts at sqlite3.opfs.rootDirectory. - - Each `dirs` entry is an object in this form: - - ``` - { name: directoryName, - dirs: [...subdirs], - files: [...file names] - } - ``` - - The `files` and `subdirs` entries are always set but may be - empty arrays. - - The returned object has the same structure but its `name` is - an empty string. All returned objects are created with - Object.create(null), so have no prototype. - - Design note: the entries do not contain more information, - e.g. file sizes, because getting such info is not only - expensive but is subject to locking-related errors. 
- */ - opfsUtil.treeList = async function(){ - const doDir = async function callee(dirHandle,tgt){ - tgt.name = dirHandle.name; - tgt.dirs = []; - tgt.files = []; - for await (const handle of dirHandle.values()){ - if('directory' === handle.kind){ - const subDir = Object.create(null); - tgt.dirs.push(subDir); - await callee(handle, subDir); - }else{ - tgt.files.push(handle.name); - } - } - }; - const root = Object.create(null); - await doDir(opfsUtil.rootDirectory, root); - return root; - }; - - /** - Irrevocably deletes _all_ files in the current origin's OPFS. - Obviously, this must be used with great caution. It may throw - an exception if removal of anything fails (e.g. a file is - locked), but the precise conditions under which the underlying - APIs will throw are not documented (so we cannot tell you what - they are). - */ - opfsUtil.rmfr = async function(){ - const dir = opfsUtil.rootDirectory, opt = {recurse: true}; - for await (const handle of dir.values()){ - dir.removeEntry(handle.name, opt); - } - }; - - /** - Deletes the given OPFS filesystem entry. As this environment - has no notion of "current directory", the given name must be an - absolute path. If the 2nd argument is truthy, deletion is - recursive (use with caution!). - - The returned Promise resolves to true if the deletion was - successful, else false (but...). The OPFS API reports the - reason for the failure only in human-readable form, not - exceptions which can be type-checked to determine the - failure. Because of that... - - If the final argument is truthy then this function will - propagate any exception on error, rather than returning false. 
- */ - opfsUtil.unlink = async function(fsEntryName, recursive = false, - throwOnError = false){ - try { - const [hDir, filenamePart] = - await opfsUtil.getDirForFilename(fsEntryName, false); - await hDir.removeEntry(filenamePart, {recursive}); - return true; - }catch(e){ - if(throwOnError){ - throw new Error("unlink(",arguments[0],") failed: "+e.message,{ - cause: e - }); - } - return false; - } - }; - - /** - Traverses the OPFS filesystem, calling a callback for each - entry. The argument may be either a callback function or an - options object with any of the following properties: - - - `callback`: function which gets called for each filesystem - entry. It gets passed 3 arguments: 1) the - FileSystemFileHandle or FileSystemDirectoryHandle of each - entry (noting that both are instanceof FileSystemHandle). 2) - the FileSystemDirectoryHandle of the parent directory. 3) the - current depth level, with 0 being at the top of the tree - relative to the starting directory. If the callback returns a - literal false, as opposed to any other falsy value, traversal - stops without an error. Any exceptions it throws are - propagated. Results are undefined if the callback manipulate - the filesystem (e.g. removing or adding entries) because the - how OPFS iterators behave in the face of such changes is - undocumented. - - - `recursive` [bool=true]: specifies whether to recurse into - subdirectories or not. Whether recursion is depth-first or - breadth-first is unspecified! - - - `directory` [FileSystemDirectoryEntry=sqlite3.opfs.rootDirectory] - specifies the starting directory. - - If this function is passed a function, it is assumed to be the - callback. - - Returns a promise because it has to (by virtue of being async) - but that promise has no specific meaning: the traversal it - performs is synchronous. The promise must be used to catch any - exceptions propagated by the callback, however. 
- */ - opfsUtil.traverse = async function(opt){ - const defaultOpt = { - recursive: true, - directory: opfsUtil.rootDirectory - }; - if('function'===typeof opt){ - opt = {callback:opt}; - } - opt = Object.assign(defaultOpt, opt||{}); - const doDir = async function callee(dirHandle, depth){ - for await (const handle of dirHandle.values()){ - if(false === opt.callback(handle, dirHandle, depth)) return false; - else if(opt.recursive && 'directory' === handle.kind){ - if(false === await callee(handle, depth + 1)) break; - } - } - }; - doDir(opt.directory, 0); - }; - - /** - impl of importDb() when it's given a function as its second - argument. - */ - const importDbChunked = async function(filename, callback){ - const [hDir, fnamePart] = await opfsUtil.getDirForFilename(filename, true); - const hFile = await hDir.getFileHandle(fnamePart, {create:true}); - let sah = await hFile.createSyncAccessHandle(); - let nWrote = 0, chunk, checkedHeader = false, err = false; - try{ - sah.truncate(0); - while( undefined !== (chunk = await callback()) ){ - if(chunk instanceof ArrayBuffer) chunk = new Uint8Array(chunk); - if( !checkedHeader && 0===nWrote && chunk.byteLength>=15 ){ - util.affirmDbHeader(chunk); - checkedHeader = true; - } - sah.write(chunk, {at: nWrote}); - nWrote += chunk.byteLength; - } - if( nWrote < 512 || 0!==nWrote % 512 ){ - toss("Input size",nWrote,"is not correct for an SQLite database."); - } - if( !checkedHeader ){ - const header = new Uint8Array(20); - sah.read( header, {at: 0} ); - util.affirmDbHeader( header ); - } - sah.write(new Uint8Array([1,1]), {at: 18}/*force db out of WAL mode*/); - return nWrote; - }catch(e){ - await sah.close(); - sah = undefined; - await hDir.removeEntry( fnamePart ).catch(()=>{}); - throw e; - }finally { - if( sah ) await sah.close(); - } - }; - - /** - Asynchronously imports the given bytes (a byte array or - ArrayBuffer) into the given database file. - - Results are undefined if the given db name refers to an opened - db. 
- - If passed a function for its second argument, its behaviour - changes: imports its data in chunks fed to it by the given - callback function. It calls the callback (which may be async) - repeatedly, expecting either a Uint8Array or ArrayBuffer (to - denote new input) or undefined (to denote EOF). For so long as - the callback continues to return non-undefined, it will append - incoming data to the given VFS-hosted database file. When - called this way, the resolved value of the returned Promise is - the number of bytes written to the target file. - - It very specifically requires the input to be an SQLite3 - database and throws if that's not the case. It does so in - order to prevent this function from taking on a larger scope - than it is specifically intended to. i.e. we do not want it to - become a convenience for importing arbitrary files into OPFS. - - This routine rewrites the database header bytes in the output - file (not the input array) to force disabling of WAL mode. - - On error this throws and the state of the input file is - undefined (it depends on where the exception was triggered). - - On success, resolves to the number of bytes written. 
- */ - opfsUtil.importDb = async function(filename, bytes){ - if( bytes instanceof Function ){ - return importDbChunked(filename, bytes); - } - if(bytes instanceof ArrayBuffer) bytes = new Uint8Array(bytes); - util.affirmIsDb(bytes); - const n = bytes.byteLength; - const [hDir, fnamePart] = await opfsUtil.getDirForFilename(filename, true); - let sah, err, nWrote = 0; - try { - const hFile = await hDir.getFileHandle(fnamePart, {create:true}); - sah = await hFile.createSyncAccessHandle(); - sah.truncate(0); - nWrote = sah.write(bytes, {at: 0}); - if(nWrote != n){ - toss("Expected to write "+n+" bytes but wrote "+nWrote+"."); - } - sah.write(new Uint8Array([1,1]), {at: 18}) /* force db out of WAL mode */; - return nWrote; - }catch(e){ - if( sah ){ await sah.close(); sah = undefined; } - await hDir.removeEntry( fnamePart ).catch(()=>{}); - throw e; - }finally{ - if( sah ) await sah.close(); - } - }; - + }), function(sqlite3, vfs){ + /* Post-VFS-registration initialization... */ if(sqlite3.oo1){ const OpfsDb = function(...args){ const opt = sqlite3.oo1.DB.dbCtorHelper.normalizeArgs(...args); - opt.vfs = opfsVfs.$zName; + opt.vfs = vfs.$zName; sqlite3.oo1.DB.dbCtorHelper.call(this, opt); }; OpfsDb.prototype = Object.create(sqlite3.oo1.DB.prototype); sqlite3.oo1.OpfsDb = OpfsDb; OpfsDb.importDb = opfsUtil.importDb; - sqlite3.oo1.DB.dbCtorHelper.setVfsPostOpenCallback( - opfsVfs.pointer, - function(oo1Db, sqlite3){ - /* Set a relatively high default busy-timeout handler to - help OPFS dbs deal with multi-tab/multi-worker - contention. 
*/ - sqlite3.capi.sqlite3_busy_timeout(oo1Db, 10000); - } - ); - }/*extend sqlite3.oo1*/ - - const sanityCheck = function(){ - const scope = wasm.scopedAllocPush(); - const sq3File = new sqlite3_file(); - try{ - const fid = sq3File.pointer; - const openFlags = capi.SQLITE_OPEN_CREATE - | capi.SQLITE_OPEN_READWRITE - //| capi.SQLITE_OPEN_DELETEONCLOSE - | capi.SQLITE_OPEN_MAIN_DB; - const pOut = wasm.scopedAlloc(8); - const dbFile = "/sanity/check/file"+randomFilename(8); - const zDbFile = wasm.scopedAllocCString(dbFile); - let rc; - state.s11n.serialize("This is ä string."); - rc = state.s11n.deserialize(); - log("deserialize() says:",rc); - if("This is ä string."!==rc[0]) toss("String d13n error."); - vfsSyncWrappers.xAccess(opfsVfs.pointer, zDbFile, 0, pOut); - rc = wasm.peek(pOut,'i32'); - log("xAccess(",dbFile,") exists ?=",rc); - rc = vfsSyncWrappers.xOpen(opfsVfs.pointer, zDbFile, - fid, openFlags, pOut); - log("open rc =",rc,"state.sabOPView[xOpen] =", - state.sabOPView[state.opIds.xOpen]); - if(0!==rc){ - error("open failed with code",rc); - return; - } - vfsSyncWrappers.xAccess(opfsVfs.pointer, zDbFile, 0, pOut); - rc = wasm.peek(pOut,'i32'); - if(!rc) toss("xAccess() failed to detect file."); - rc = ioSyncWrappers.xSync(sq3File.pointer, 0); - if(rc) toss('sync failed w/ rc',rc); - rc = ioSyncWrappers.xTruncate(sq3File.pointer, 1024); - if(rc) toss('truncate failed w/ rc',rc); - wasm.poke(pOut,0,'i64'); - rc = ioSyncWrappers.xFileSize(sq3File.pointer, pOut); - if(rc) toss('xFileSize failed w/ rc',rc); - log("xFileSize says:",wasm.peek(pOut, 'i64')); - rc = ioSyncWrappers.xWrite(sq3File.pointer, zDbFile, 10, 1); - if(rc) toss("xWrite() failed!"); - const readBuf = wasm.scopedAlloc(16); - rc = ioSyncWrappers.xRead(sq3File.pointer, readBuf, 6, 2); - wasm.poke(readBuf+6,0); - let jRead = wasm.cstrToJs(readBuf); - log("xRead() got:",jRead); - if("sanity"!==jRead) toss("Unexpected xRead() value."); - if(vfsSyncWrappers.xSleep){ - log("xSleep()ing before 
close()ing..."); - vfsSyncWrappers.xSleep(opfsVfs.pointer,2000); - log("waking up from xSleep()"); - } - rc = ioSyncWrappers.xClose(fid); - log("xClose rc =",rc,"sabOPView =",state.sabOPView); - log("Deleting file:",dbFile); - vfsSyncWrappers.xDelete(opfsVfs.pointer, zDbFile, 0x1234); - vfsSyncWrappers.xAccess(opfsVfs.pointer, zDbFile, 0, pOut); - rc = wasm.peek(pOut,'i32'); - if(rc) toss("Expecting 0 from xAccess(",dbFile,") after xDelete()."); - warn("End of OPFS sanity checks."); - }finally{ - sq3File.dispose(); - wasm.scopedAllocPop(scope); - } - }/*sanityCheck()*/; - - W.onmessage = function({data}){ - //log("Worker.onmessage:",data); - switch(data.type){ - case 'opfs-unavailable': - /* Async proxy has determined that OPFS is unavailable. There's - nothing more for us to do here. */ - promiseReject(new Error(data.payload.join(' '))); - break; - case 'opfs-async-loaded': - /* Arrives as soon as the asyc proxy finishes loading. - Pass our config and shared state on to the async - worker. */ - W.postMessage({type: 'opfs-async-init',args: state}); - break; - case 'opfs-async-inited': { - /* Indicates that the async partner has received the 'init' - and has finished initializing, so the real work can - begin... 
*/ - if(true===promiseWasRejected){ - break /* promise was already rejected via timer */; - } - try { - sqlite3.vfs.installVfs({ - io: {struct: opfsIoMethods, methods: ioSyncWrappers}, - vfs: {struct: opfsVfs, methods: vfsSyncWrappers} - }); - state.sabOPView = new Int32Array(state.sabOP); - state.sabFileBufView = new Uint8Array(state.sabIO, 0, state.fileBufferSize); - state.sabS11nView = new Uint8Array(state.sabIO, state.sabS11nOffset, state.sabS11nSize); - initS11n(); - if(options.sanityChecks){ - warn("Running sanity checks because of opfs-sanity-check URL arg..."); - sanityCheck(); - } - if(thisThreadHasOPFS()){ - navigator.storage.getDirectory().then((d)=>{ - W.onerror = W._originalOnError; - delete W._originalOnError; - sqlite3.opfs = opfsUtil; - opfsUtil.rootDirectory = d; - log("End of OPFS sqlite3_vfs setup.", opfsVfs); - promiseResolve(); - }).catch(promiseReject); - }else{ - promiseResolve(); - } - }catch(e){ - error(e); - promiseReject(e); - } - break; + if( true ){ + /* 2026-03-06: this was a design mis-decision and is + inconsistent with sqlite3_open() and friends, but is + retained against the risk of introducing regressions if + it's removed. */ + sqlite3.oo1.DB.dbCtorHelper.setVfsPostOpenCallback( + opfsVfs.pointer, + function(oo1Db, sqlite3){ + /* Set a relatively high default busy-timeout handler to + help OPFS dbs deal with multi-tab/multi-worker + contention. 
*/ + sqlite3.capi.sqlite3_busy_timeout(oo1Db, 10000); } - default: { - const errMsg = ( - "Unexpected message from the OPFS async worker: " + - JSON.stringify(data) - ); - error(errMsg); - promiseReject(new Error(errMsg)); - break; - } - }/*switch(data.type)*/ - }/*W.onmessage()*/; - })/*thePromise*/; - return thePromise; + ); + } + }/*extend sqlite3.oo1*/ + })/*bindVfs()*/; }/*installOpfsVfs()*/; -installOpfsVfs.defaultProxyUri = - "sqlite3-opfs-async-proxy.js"; globalThis.sqlite3ApiBootstrap.initializersAsync.push(async (sqlite3)=>{ - try{ - let proxyJs = installOpfsVfs.defaultProxyUri; - if(sqlite3.scriptInfo.sqlite3Dir){ - installOpfsVfs.defaultProxyUri = - sqlite3.scriptInfo.sqlite3Dir + proxyJs; - //sqlite3.config.warn("installOpfsVfs.defaultProxyUri =",installOpfsVfs.defaultProxyUri); - } - return installOpfsVfs().catch((e)=>{ - sqlite3.config.warn("Ignoring inability to install OPFS sqlite3_vfs:",e.message); - }); - }catch(e){ - sqlite3.config.error("installOpfsVfs() exception:",e); - return Promise.reject(e); - } + return installOpfsVfs().catch((e)=>{ + sqlite3.config.warn("Ignoring inability to install 'opfs' sqlite3_vfs:",e); + }) }); }/*sqlite3ApiBootstrap.initializers.push()*/); -//#else -/* The OPFS VFS parts are elided from builds targeting node.js. */ -//#endif target:node +//#/if target:node diff --git a/ext/wasm/api/sqlite3-vtab-helper.c-pp.js b/ext/wasm/api/sqlite3-vtab-helper.c-pp.js index 4c2338fc5a..80f4bfac23 100644 --- a/ext/wasm/api/sqlite3-vtab-helper.c-pp.js +++ b/ext/wasm/api/sqlite3-vtab-helper.c-pp.js @@ -172,10 +172,7 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ Works like unget() plus it calls dispose() on the StructType object. 
      */
-      dispose: (pCObj)=>{
-        const o = __xWrap(pCObj,true);
-        if(o) o.dispose();
-      }
+      dispose: (pCObj)=>__xWrap(pCObj,true)?.dispose?.()
     });
   };
 
diff --git a/ext/wasm/api/sqlite3-wasm.c b/ext/wasm/api/sqlite3-wasm.c
index 4d5e9b2962..0c5f4f8ea5 100644
--- a/ext/wasm/api/sqlite3-wasm.c
+++ b/ext/wasm/api/sqlite3-wasm.c
@@ -93,6 +93,18 @@
 #undef SQLITE_ENABLE_API_ARMOR
 #define SQLITE_ENABLE_API_ARMOR 1
 
+/**********************************************************************/
+/* SQLITE_EXPERIMENTAL_PRAGMA_20251114 */
+/*
+** See:
+** https://sqlite.org/src/info/e2b3f1a9480a9be3
+** https://github.com/rhashimoto/wa-sqlite/discussions/301
+**
+** It is enabled here for the sake of VFS experimentors.
+*/
+#undef SQLITE_EXPERIMENTAL_PRAGMA_20251114
+#define SQLITE_EXPERIMENTAL_PRAGMA_20251114
+
 /**********************************************************************/
 /* SQLITE_O... */
 #undef SQLITE_OMIT_DEPRECATED
@@ -136,8 +148,8 @@
 /*
 ** If SQLITE_WASM_BARE_BONES is defined, undefine most of the ENABLE
 ** macros. This will, when using the canonical makefile, also elide
-** any C functions from the WASM exports which are listed in
-** ./EXPORT_FUNCTIONS.sqlite3-extras.
+** any C functions from the WASM exports: see
+** ./EXPORTED_FUNCTIONS.c-pp.
 */
 #ifdef SQLITE_WASM_BARE_BONES
 # undef SQLITE_ENABLE_COLUMN_METADATA
@@ -217,15 +229,18 @@
 ** not by client code, so an argument can be made for reducing their
 ** visibility by not including them in any build-time export lists.
 **
-** 2022-09-11: it's not yet _proven_ that this approach works in
-** non-Emscripten builds. If not, such builds will need to export
-** those using the --export=... wasm-ld flag (or equivalent). As of
-** this writing we are tied to Emscripten for various reasons
-** and cannot test the library with other build environments.
+** 2025-12-01: for use in non-Emscripten builds, we need a more
+** invasive macro which explicitly names the export:
+** SQLITE_WASM_EXPORT2.
*/ #define SQLITE_WASM_EXPORT __attribute__((used,visibility("default"))) -// See also: -//__attribute__((export_name("theExportedName"), used, visibility("default"))) +#define SQLITE_WASM_EXPORT_NAMED(X) __attribute__((export_name(#X),used,visibility("default"))) +#define SQLITE_WASM_EXPORT2(RETTYPE,NAME,SIG) SQLITE_WASM_EXPORT_NAMED(NAME) RETTYPE NAME SIG + +#if 1 +/** Increase the kvvfs key size limit from 32. */ +#define KVRECORD_KEY_SZ 128 +#endif /* ** Which sqlite3.c we're using needs to be configurable to enable @@ -249,45 +264,6 @@ #undef INC__STRINGIFY #undef SQLITE_C -#if 0 -/* -** An EXPERIMENT in implementing a stack-based allocator analog to -** Emscripten's stackSave(), stackAlloc(), stackRestore(). -** Unfortunately, this cannot work together with Emscripten because -** Emscripten defines its own native one and we'd stomp on each -** other's memory. Other than that complication, basic tests show it -** to work just fine. -** -** Another option is to malloc() a chunk of our own and call that our -** "stack". 
-*/ -SQLITE_WASM_EXPORT void * sqlite3__wasm_stack_end(void){ - extern void __heap_base - /* see https://stackoverflow.com/questions/10038964 */; - return &__heap_base; -} -SQLITE_WASM_EXPORT void * sqlite3__wasm_stack_begin(void){ - extern void __data_end; - return &__data_end; -} -static void * pWasmStackPtr = 0; -SQLITE_WASM_EXPORT void * sqlite3__wasm_stack_ptr(void){ - if(!pWasmStackPtr) pWasmStackPtr = sqlite3__wasm_stack_end(); - return pWasmStackPtr; -} -SQLITE_WASM_EXPORT void sqlite3__wasm_stack_restore(void * p){ - pWasmStackPtr = p; -} -SQLITE_WASM_EXPORT void * sqlite3__wasm_stack_alloc(int n){ - if(n<=0) return 0; - n = (n + 7) & ~7 /* align to 8-byte boundary */; - unsigned char * const p = (unsigned char *)sqlite3__wasm_stack_ptr(); - unsigned const char * const b = (unsigned const char *)sqlite3__wasm_stack_begin(); - if(b + n >= p || b + n < b/*overflow*/) return 0; - return pWasmStackPtr = p - n; -} -#endif /* stack allocator experiment */ - /* ** State for the "pseudo-stack" allocator implemented in ** sqlite3__wasm_pstack_xyz(). In order to avoid colliding with @@ -326,7 +302,7 @@ SQLITE_WASM_EXPORT void * sqlite3__wasm_pstack_ptr(void){ */ SQLITE_WASM_EXPORT void sqlite3__wasm_pstack_restore(unsigned char * p){ assert(p>=PStack.pBegin && p<=PStack.pEnd && p>=PStack.pPos); - assert(0==((unsigned long long)p & 0x7)); + assert(0==((unsigned long long)p & 0x7) /* 8-byte aligned */); if(p>=PStack.pBegin && p<=PStack.pEnd /*&& p>=PStack.pPos*/){ PStack.pPos = p; } @@ -336,10 +312,10 @@ SQLITE_WASM_EXPORT void sqlite3__wasm_pstack_restore(unsigned char * p){ ** the memory on success, 0 on error (including a negative n value). n ** is always adjusted to be a multiple of 8 and returned memory is ** always zeroed out before returning (because this keeps the client -** JS code from having to do so, and most uses of the pstack will -** call for doing so). +** JS code from having to do so, and most uses of the pstack call for +** doing so). 
*/ -SQLITE_WASM_EXPORT void * sqlite3__wasm_pstack_alloc(int n){ +SQLITE_WASM_EXPORT2(void *,sqlite3__wasm_pstack_alloc,(int n)){ if( n<=0 ) return 0; n = (n + 7) & ~7 /* align to 8-byte boundary */; if( PStack.pBegin + n > PStack.pPos /*not enough space left*/ @@ -351,7 +327,7 @@ SQLITE_WASM_EXPORT void * sqlite3__wasm_pstack_alloc(int n){ ** Return the number of bytes left which can be ** sqlite3__wasm_pstack_alloc()'d. */ -SQLITE_WASM_EXPORT int sqlite3__wasm_pstack_remaining(void){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_pstack_remaining,(void)){ assert(PStack.pPos >= PStack.pBegin); assert(PStack.pPos <= PStack.pEnd); return (int)(PStack.pPos - PStack.pBegin); @@ -362,7 +338,7 @@ SQLITE_WASM_EXPORT int sqlite3__wasm_pstack_remaining(void){ ** any space which is currently allocated. This value is a ** compile-time constant. */ -SQLITE_WASM_EXPORT int sqlite3__wasm_pstack_quota(void){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_pstack_quota,(void)){ return (int)(PStack.pEnd - PStack.pBegin); } @@ -375,8 +351,7 @@ struct WasmTestStruct { void (*xFunc)(void*); }; typedef struct WasmTestStruct WasmTestStruct; -SQLITE_WASM_EXPORT -void sqlite3__wasm_test_struct(WasmTestStruct * s){ +SQLITE_WASM_EXPORT2(void,sqlite3__wasm_test_struct,(WasmTestStruct * s)){ if(s){ if( 0 ){ /* Do not be alarmed by the small (and odd) pointer values. @@ -416,8 +391,7 @@ void sqlite3__wasm_test_struct(WasmTestStruct * s){ ** fails to compile with "tables may not be 64-bit" but does not tell ** us where it's happening. */ -SQLITE_WASM_EXPORT -const char * sqlite3__wasm_enum_json(void){ +SQLITE_WASM_EXPORT2(const char *,sqlite3__wasm_enum_json,(void)){ static char aBuffer[1024 * 20] = {0} /* where the JSON goes. 
2025-09-19: output size=19295, but that can vary slightly from build to build, so a little @@ -597,6 +571,7 @@ const char * sqlite3__wasm_enum_json(void){ DefInt(SQLITE_DBCONFIG_ENABLE_ATTACH_WRITE); DefInt(SQLITE_DBCONFIG_ENABLE_COMMENTS); DefInt(SQLITE_DBCONFIG_MAX); + DefInt(SQLITE_DBCONFIG_FP_DIGITS); } _DefGroup; DefGroup(dbStatus){ @@ -620,6 +595,7 @@ const char * sqlite3__wasm_enum_json(void){ DefGroup(encodings) { /* Noting that the wasm binding only aims to support UTF-8. */ DefInt(SQLITE_UTF8); + DefInt(SQLITE_UTF8_ZT); DefInt(SQLITE_UTF16LE); DefInt(SQLITE_UTF16BE); DefInt(SQLITE_UTF16); @@ -723,6 +699,8 @@ const char * sqlite3__wasm_enum_json(void){ DefInt(SQLITE_MAX_TRIGGER_DEPTH); DefInt(SQLITE_LIMIT_WORKER_THREADS); DefInt(SQLITE_MAX_WORKER_THREADS); + DefInt(SQLITE_LIMIT_PARSER_DEPTH); + DefInt(SQLITE_MAX_PARSER_DEPTH); } _DefGroup; DefGroup(openFlags) { @@ -756,6 +734,7 @@ const char * sqlite3__wasm_enum_json(void){ DefInt(SQLITE_PREPARE_PERSISTENT); DefInt(SQLITE_PREPARE_NORMALIZE); DefInt(SQLITE_PREPARE_NO_VTAB); + DefInt(SQLITE_PREPARE_FROM_DDL); } _DefGroup; DefGroup(resultCodes) { @@ -1007,13 +986,15 @@ const char * sqlite3__wasm_enum_json(void){ /** ^^^ indirection needed to expand CurrentStruct */ #define StructBinder StructBinder_(CurrentStruct) #define _StructBinder CloseBrace(2) -#define M(MEMBER,SIG) \ - outf("%s\"%s\": " \ - "{\"offset\":%d,\"sizeof\": %d,\"signature\":\"%s\"}", \ - (n++ ? ", " : ""), #MEMBER, \ - (int)offsetof(CurrentStruct,MEMBER), \ - (int)sizeof(((CurrentStruct*)0)->MEMBER), \ - SIG) +#define M3(MEMBER,SIG,READONLY) \ + outf("%s\"%s\": " \ + "{\"offset\":%d,\"sizeof\":%d,\"signature\":\"%s\"%s}", \ + (n++ ? ", " : ""), #MEMBER, \ + (int)offsetof(CurrentStruct,MEMBER), \ + (int)sizeof(((CurrentStruct*)0)->MEMBER), \ + SIG, (READONLY ? 
",\"readOnly\":true" : "")) +#define M(MEMBER,SIG) M3(MEMBER,SIG,0) +#define MRO(MEMBER,SIG) M3(MEMBER,SIG,1) nStruct = 0; out(", \"structs\": ["); { @@ -1076,11 +1057,30 @@ const char * sqlite3__wasm_enum_json(void){ #undef CurrentStruct #define CurrentStruct sqlite3_kvvfs_methods + /* From os_kv.c */ + StructBinder { + M(xRcrdRead, "i(sspi)"); + M(xRcrdWrite, "i(sss)"); + M(xRcrdDelete, "i(ss)"); + MRO(nKeySize, "i"); + MRO(nBufferSize, "i"); + M(pVfs, "p"); + M(pIoDb, "p"); + M(pIoJrnl, "p"); + } _StructBinder; +#undef CurrentStruct + +#define CurrentStruct KVVfsFile + /* From os_kv.c */ StructBinder { - M(xRead, "i(sspi)"); - M(xWrite, "i(sss)"); - M(xDelete, "i(ss)"); - M(nKeySize, "i"); + M(base, "p")/*sqlite3_file base*/; + M(zClass, "s"); + M(isJournal, "i"); + M(nJrnl, "i")/*actually unsigned!*/; + M(aJrnl, "p"); + M(szPage, "i"); + M(szDb, "j"); + M(aData, "p"); } _StructBinder; #undef CurrentStruct @@ -1137,7 +1137,13 @@ const char * sqlite3__wasm_enum_json(void){ ** sqlite3_index_info, we have to uplift those into constructs we ** can access by type name. These structs _must_ match their ** in-sqlite3_index_info counterparts byte for byte. - */ + ** + ** 2025-11-21: this uplifing is no longer necessary, as Jaccwabyt + ** can now handle nested structs, but "it ain't broke" so there's + ** no pressing need to rewire this. Also, it's conceivable that + ** rewiring it might break downstream vtab impls, so it shouldn't + ** be rewired. + */ typedef struct { int iColumn; unsigned char op; @@ -1232,6 +1238,8 @@ const char * sqlite3__wasm_enum_json(void){ #undef StructBinder_ #undef StructBinder__ #undef M +#undef MRO +#undef M3 #undef _StructBinder #undef CloseBrace #undef out @@ -1249,8 +1257,7 @@ const char * sqlite3__wasm_enum_json(void){ ** method, SQLITE_MISUSE is returned, else the result of the xDelete() ** call is returned. 
*/ -SQLITE_WASM_EXPORT -int sqlite3__wasm_vfs_unlink(sqlite3_vfs *pVfs, const char *zName){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_vfs_unlink,(sqlite3_vfs *pVfs, const char *zName)){ int rc = SQLITE_MISUSE /* ??? */; if( 0==pVfs && 0!=zName ) pVfs = sqlite3_vfs_find(0); if( zName && pVfs && pVfs->xDelete ){ @@ -1267,8 +1274,7 @@ int sqlite3__wasm_vfs_unlink(sqlite3_vfs *pVfs, const char *zName){ ** defaulting to "main" if zDbName is 0. Returns 0 if no db with the ** given name is open. */ -SQLITE_WASM_EXPORT -sqlite3_vfs * sqlite3__wasm_db_vfs(sqlite3 *pDb, const char *zDbName){ +SQLITE_WASM_EXPORT2(sqlite3_vfs *,sqlite3__wasm_db_vfs,(sqlite3 *pDb, const char *zDbName)){ sqlite3_vfs * pVfs = 0; sqlite3_file_control(pDb, zDbName ? zDbName : "main", SQLITE_FCNTL_VFS_POINTER, &pVfs); @@ -1290,8 +1296,7 @@ sqlite3_vfs * sqlite3__wasm_db_vfs(sqlite3 *pDb, const char *zDbName){ ** Returns 0 on success, an SQLITE_xxx code on error. Returns ** SQLITE_MISUSE if pDb is NULL. */ -SQLITE_WASM_EXPORT -int sqlite3__wasm_db_reset(sqlite3 *pDb){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_db_reset,(sqlite3 *pDb)){ int rc = SQLITE_MISUSE; if( pDb ){ sqlite3_table_column_metadata(pDb, "main", 0, 0, 0, 0, 0, 0, 0); @@ -1372,10 +1377,10 @@ int sqlite3__wasm_db_export_chunked( sqlite3* pDb, ** If `*pOut` is not NULL, the caller is responsible for passing it to ** sqlite3_free() to free it. */ -SQLITE_WASM_EXPORT -int sqlite3__wasm_db_serialize( sqlite3 *pDb, const char *zSchema, - unsigned char **pOut, - sqlite3_int64 *nOut, unsigned int mFlags ){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_db_serialize, + (sqlite3 *pDb, const char *zSchema, + unsigned char **pOut, + sqlite3_int64 *nOut, unsigned int mFlags)){ unsigned char * z; if( !pDb || !pOut ) return SQLITE_MISUSE; if( nOut ) *nOut = 0; @@ -1436,11 +1441,9 @@ int sqlite3__wasm_db_serialize( sqlite3 *pDb, const char *zSchema, ** portability, so that the API can still work in builds where BigInt ** support is disabled or unavailable. 
*/ -SQLITE_WASM_EXPORT -int sqlite3__wasm_vfs_create_file( sqlite3_vfs *pVfs, - const char *zFilename, - const unsigned char * pData, - int nData ){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_vfs_create_file, + (sqlite3_vfs *pVfs, const char *zFilename, + const unsigned char * pData, int nData)){ int rc; sqlite3_file *pFile = 0; sqlite3_io_methods const *pIo; @@ -1526,10 +1529,9 @@ int sqlite3__wasm_vfs_create_file( sqlite3_vfs *pVfs, ** zFilename, appends pData bytes to it, and returns 0 on success or ** SQLITE_IOERR on error. */ -SQLITE_WASM_EXPORT -int sqlite3__wasm_posix_create_file( const char *zFilename, - const unsigned char * pData, - int nData ){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_posix_create_file, + (const char *zFilename, const unsigned char * pData, + int nData)){ int rc; FILE * pFile = 0; int fileExisted = 0; @@ -1549,22 +1551,30 @@ int sqlite3__wasm_posix_create_file( const char *zFilename, ** This function is NOT part of the sqlite3 public API. It is strictly ** for use by the sqlite project's own JS/WASM bindings. ** -** Allocates sqlite3KvvfsMethods.nKeySize bytes from -** sqlite3__wasm_pstack_alloc() and returns 0 if that allocation fails, -** else it passes that string to kvstorageMakeKey() and returns a -** NUL-terminated pointer to that string. It is up to the caller to -** use sqlite3__wasm_pstack_restore() to free the returned pointer. +** This returns either a pointer to a static buffer or zKeyIn directly +** (if zClass is NULL or empty). 
*/ -SQLITE_WASM_EXPORT -char * sqlite3__wasm_kvvfsMakeKeyOnPstack(const char *zClass, - const char *zKeyIn){ +SQLITE_WASM_EXPORT2(const char *,sqlite3__wasm_kvvfsMakeKey, + (const char *zClass, const char *zKeyIn)){ + static char buf[SQLITE_KVOS_SZ+1] = {0}; assert(sqlite3KvvfsMethods.nKeySize>24); - char *zKeyOut = - (char *)sqlite3__wasm_pstack_alloc(sqlite3KvvfsMethods.nKeySize); - if(zKeyOut){ - kvstorageMakeKey(zClass, zKeyIn, zKeyOut); + if( zClass && *zClass ){ + kvrecordMakeKey(zClass, zKeyIn, buf); + return buf; + }else{ +#if 1 + /* We can return zKeyIn here only because the JS API takes special + ** care with its lifetime.*/ + return zKeyIn; +#else + /* It would be nice to be able to return zKeyIn directly here, but + ** it may have been allocated as part of the automated JS-to-WASM + ** conversions, in which case it will be freed before reaching the + ** caller. */ + sqlite3_snprintf(KVRECORD_KEY_SZ, buf, "%s", zKeyIn); + return buf; +#endif } - return zKeyOut; } /* @@ -1574,8 +1584,7 @@ char * sqlite3__wasm_kvvfsMakeKeyOnPstack(const char *zClass, ** Returns the pointer to the singleton object which holds the kvvfs ** I/O methods and associated state. */ -SQLITE_WASM_EXPORT -sqlite3_kvvfs_methods * sqlite3__wasm_kvvfs_methods(void){ +SQLITE_WASM_EXPORT2(sqlite3_kvvfs_methods *,sqlite3__wasm_kvvfs_methods,(void)){ return &sqlite3KvvfsMethods; } @@ -1590,8 +1599,8 @@ sqlite3_kvvfs_methods * sqlite3__wasm_kvvfs_methods(void){ ** sqlite3_vtab_config(), or SQLITE_MISUSE if the 2nd arg is not a ** valid value. */ -SQLITE_WASM_EXPORT -int sqlite3__wasm_vtab_config(sqlite3 *pDb, int op, int arg){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_vtab_config, + (sqlite3 *pDb, int op, int arg)){ switch(op){ case SQLITE_VTAB_DIRECTONLY: case SQLITE_VTAB_INNOCUOUS: @@ -1611,8 +1620,8 @@ int sqlite3__wasm_vtab_config(sqlite3 *pDb, int op, int arg){ ** Wrapper for the variants of sqlite3_db_config() which take ** (int,int*) variadic args. 
*/ -SQLITE_WASM_EXPORT -int sqlite3__wasm_db_config_ip(sqlite3 *pDb, int op, int arg1, int* pArg2){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_db_config_ip, + (sqlite3 *pDb, int op, int arg1, int* pArg2)){ switch(op){ case SQLITE_DBCONFIG_ENABLE_FKEY: case SQLITE_DBCONFIG_ENABLE_TRIGGER: @@ -1635,6 +1644,7 @@ int sqlite3__wasm_db_config_ip(sqlite3 *pDb, int op, int arg1, int* pArg2){ case SQLITE_DBCONFIG_ENABLE_ATTACH_CREATE: case SQLITE_DBCONFIG_ENABLE_ATTACH_WRITE: case SQLITE_DBCONFIG_ENABLE_COMMENTS: + case SQLITE_DBCONFIG_FP_DIGITS: return sqlite3_db_config(pDb, op, arg1, pArg2); default: return SQLITE_MISUSE; } @@ -1647,8 +1657,9 @@ int sqlite3__wasm_db_config_ip(sqlite3 *pDb, int op, int arg1, int* pArg2){ ** Wrapper for the variants of sqlite3_db_config() which take ** (void*,int,int) variadic args. */ -SQLITE_WASM_EXPORT -int sqlite3__wasm_db_config_pii(sqlite3 *pDb, int op, void * pArg1, int arg2, int arg3){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_db_config_pii, + (sqlite3 *pDb, int op, void * pArg1, int arg2, + int arg3)){ switch(op){ case SQLITE_DBCONFIG_LOOKASIDE: return sqlite3_db_config(pDb, op, pArg1, arg2, arg3); @@ -1663,8 +1674,8 @@ int sqlite3__wasm_db_config_pii(sqlite3 *pDb, int op, void * pArg1, int arg2, in ** Wrapper for the variants of sqlite3_db_config() which take ** (const char *) variadic args. */ -SQLITE_WASM_EXPORT -int sqlite3__wasm_db_config_s(sqlite3 *pDb, int op, const char *zArg){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_db_config_s,(sqlite3 *pDb, int op, + const char *zArg)){ switch(op){ case SQLITE_DBCONFIG_MAINDBNAME: return sqlite3_db_config(pDb, op, zArg); @@ -1680,8 +1691,7 @@ int sqlite3__wasm_db_config_s(sqlite3 *pDb, int op, const char *zArg){ ** Binding for combinations of sqlite3_config() arguments which take ** a single integer argument. 
*/ -SQLITE_WASM_EXPORT -int sqlite3__wasm_config_i(int op, int arg){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_config_i,(int op, int arg)){ return sqlite3_config(op, arg); } @@ -1692,8 +1702,7 @@ int sqlite3__wasm_config_i(int op, int arg){ ** Binding for combinations of sqlite3_config() arguments which take ** two int arguments. */ -SQLITE_WASM_EXPORT -int sqlite3__wasm_config_ii(int op, int arg1, int arg2){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_config_ii,(int op, int arg1, int arg2)){ return sqlite3_config(op, arg1, arg2); } @@ -1704,8 +1713,7 @@ int sqlite3__wasm_config_ii(int op, int arg1, int arg2){ ** Binding for combinations of sqlite3_config() arguments which take ** a single i64 argument. */ -SQLITE_WASM_EXPORT -int sqlite3__wasm_config_j(int op, sqlite3_int64 arg){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_config_j,(int op, sqlite3_int64 arg)){ return sqlite3_config(op, arg); } @@ -1717,8 +1725,7 @@ int sqlite3__wasm_config_j(int op, sqlite3_int64 arg){ ** sqlite3_mprintf()'s %Q modifier (if addQuotes is true) or %q (if ** addQuotes is 0). Returns NULL if z is NULL or on OOM. */ -SQLITE_WASM_EXPORT -char * sqlite3__wasm_qfmt_token(char *z, int addQuotes){ +SQLITE_WASM_EXPORT2(char *,sqlite3__wasm_qfmt_token,(char *z, int addQuotes)){ char * rc = 0; if( z ){ rc = addQuotes @@ -1728,6 +1735,21 @@ char * sqlite3__wasm_qfmt_token(char *z, int addQuotes){ return rc; } +/* +** This function is NOT part of the sqlite3 public API. It is strictly +** for use by the sqlite project's own JS/WASM bindings. +** +** A WASM wrapper for the interal os_kv.c:kvvfsDecode() for internal +** use by the kvvfs v2 API. 
+*/ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_kvvfs_decode,(const char *a, char *aOut, int nOut)){ + return kvvfsDecode(a, aOut, nOut); +} +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_kvvfs_encode,(const char *a, int nA, char *aOut)){ + return kvvfsEncode(a, nA, aOut); +} + + #if defined(__EMSCRIPTEN__) && defined(SQLITE_ENABLE_WASMFS) #include #include @@ -1753,8 +1775,7 @@ char * sqlite3__wasm_qfmt_token(char *z, int addQuotes){ ** the virtual FS fails. In builds compiled without SQLITE_ENABLE_WASMFS ** defined, SQLITE_NOTFOUND is returned without side effects. */ -SQLITE_WASM_EXPORT -int sqlite3__wasm_init_wasmfs(const char *zMountPoint){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_init_wasmfs,(const char *zMountPoint)){ static backend_t pOpfs = 0; if( !zMountPoint || !*zMountPoint ) zMountPoint = "/opfs"; if( !pOpfs ){ @@ -1773,8 +1794,7 @@ int sqlite3__wasm_init_wasmfs(const char *zMountPoint){ return pOpfs ? 0 : SQLITE_NOMEM; } #else -SQLITE_WASM_EXPORT -int sqlite3__wasm_init_wasmfs(const char *zUnused){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_init_wasmfs,(const char *zUnused)){ //emscripten_console_warn("WASMFS OPFS is not compiled in."); (void)zUnused; return SQLITE_NOTFOUND; @@ -1783,52 +1803,43 @@ int sqlite3__wasm_init_wasmfs(const char *zUnused){ #if SQLITE_WASM_ENABLE_C_TESTS -SQLITE_WASM_EXPORT -int sqlite3__wasm_test_intptr(int * p){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_test_intptr,(int * p)){ return *p = *p * 2; } -SQLITE_WASM_EXPORT -void * sqlite3__wasm_test_voidptr(void * p){ +SQLITE_WASM_EXPORT2(void *,sqlite3__wasm_test_voidptr,(void * p)){ return p; } -SQLITE_WASM_EXPORT -int64_t sqlite3__wasm_test_int64_max(void){ +SQLITE_WASM_EXPORT2(int64_t,sqlite3__wasm_test_int64_max,(void)){ return (int64_t)0x7fffffffffffffff; } -SQLITE_WASM_EXPORT -int64_t sqlite3__wasm_test_int64_min(void){ +SQLITE_WASM_EXPORT2(int64_t,sqlite3__wasm_test_int64_min,(void)){ return ~sqlite3__wasm_test_int64_max(); } -SQLITE_WASM_EXPORT -int64_t 
sqlite3__wasm_test_int64_times2(int64_t x){ +SQLITE_WASM_EXPORT2(int64_t,sqlite3__wasm_test_int64_times2,(int64_t x)){ return x * 2; } -SQLITE_WASM_EXPORT -void sqlite3__wasm_test_int64_minmax(int64_t * min, int64_t *max){ +SQLITE_WASM_EXPORT2(void,sqlite3__wasm_test_int64_minmax,(int64_t * min, int64_t *max)){ *max = sqlite3__wasm_test_int64_max(); *min = sqlite3__wasm_test_int64_min(); /*printf("minmax: min=%lld, max=%lld\n", *min, *max);*/ } -SQLITE_WASM_EXPORT -int64_t sqlite3__wasm_test_int64ptr(int64_t * p){ +SQLITE_WASM_EXPORT2(int64_t,sqlite3__wasm_test_int64ptr,(int64_t * p)){ /*printf("sqlite3__wasm_test_int64ptr( @%lld = 0x%llx )\n", (int64_t)p, *p);*/ return *p = *p * 2; } -SQLITE_WASM_EXPORT -void sqlite3__wasm_test_stack_overflow(int recurse){ +SQLITE_WASM_EXPORT2(void,sqlite3__wasm_test_stack_overflow,(int recurse)){ if(recurse) sqlite3__wasm_test_stack_overflow(recurse); } /* For testing the 'string:dealloc' whwasmutil.xWrap() conversion. */ -SQLITE_WASM_EXPORT -char * sqlite3__wasm_test_str_hello(int fail){ +SQLITE_WASM_EXPORT2(char *,sqlite3__wasm_test_str_hello,(int fail)){ char * s = fail ? 0 : (char *)sqlite3_malloc(6); if(s){ memcpy(s, "hello", 5); @@ -1942,8 +1953,8 @@ static int sqlite3__wasm_SQLTester_strnotglob(const char *zGlob, const char *z){ return *z==0; } -SQLITE_WASM_EXPORT -int sqlite3__wasm_SQLTester_strglob(const char *zGlob, const char *z){ +SQLITE_WASM_EXPORT2(int,sqlite3__wasm_SQLTester_strglob, + (const char *zGlob, const char *z)){ return !sqlite3__wasm_SQLTester_strnotglob(zGlob, z); } diff --git a/ext/wasm/api/sqlite3-worker1-promiser.c-pp.js b/ext/wasm/api/sqlite3-worker1-promiser.c-pp.js index 1a09bf9a6a..bcbf3fa9f8 100644 --- a/ext/wasm/api/sqlite3-worker1-promiser.c-pp.js +++ b/ext/wasm/api/sqlite3-worker1-promiser.c-pp.js @@ -19,10 +19,12 @@ slightly simpler client-side interface than the slightly-lower-level Worker API does. 
- This script necessarily exposes one global symbol, but clients may - freely `delete` that symbol after calling it. + In non-ESM builds this file necessarily exposes one global symbol, + but clients may freely `delete` that symbol after calling it. */ +//#if not defined target:es6-module 'use strict'; +//#/if /** Configures an sqlite3 Worker API #1 Worker such that it can be manipulated via a Promise-based interface and returns a factory @@ -109,10 +111,12 @@ the callback is called one time for each row of the result set, passed the same worker message format as the worker API emits: - {type:typeString, + { + type:typeString, row:VALUE, rowNumber:1-based-#, - columnNames: array} + columnNames: array + } Where `typeString` is an internally-synthesized message type string used temporarily for worker message dispatching. It can be ignored @@ -123,10 +127,9 @@ callback. At the end of the result set, the same event is fired with - (row=undefined, rowNumber=null) to indicate that - the end of the result set has been reached. Note that the rows - arrive via worker-posted messages, with all the implications - of that. + (row=undefined, rowNumber=null) to indicate that the end of the + result set has been reached. The rows arrive via worker-posted + messages, with all the implications of that. 
Notable shortcomings: @@ -257,7 +260,9 @@ globalThis.sqlite3Worker1Promiser.defaultConfig = { type: 'module' }); //#elif target:es6-module - return new Worker(new URL("sqlite3-worker1.js", import.meta.url)); + return new Worker(new URL("sqlite3-worker1.mjs", import.meta.url),{ + type: 'module' + }); //#else let theJs = "sqlite3-worker1.js"; if(this.currentScript){ @@ -273,15 +278,15 @@ globalThis.sqlite3Worker1Promiser.defaultConfig = { } } return new Worker(theJs + globalThis.location.search); -//#endif +//#/if } //#if not target:es6-module .bind({ currentScript: globalThis?.document?.currentScript }) -//#endif +//#/if , - onerror: (...args)=>console.error('worker1 promiser error',...args) + onerror: (...args)=>console.error('sqlite3Worker1Promiser():',...args) }/*defaultConfig*/; /** @@ -343,7 +348,8 @@ globalThis.sqlite3Worker1Promiser.v2.defaultConfig = incompatibility. */ export default sqlite3Worker1Promiser.v2; -//#endif /* target:es6-module */ +delete globalThis.sqlite3Worker1Promiser; +//#/if /* target:es6-module */ //#else /* Built with the omit-oo1 flag. */ -//#endif if not omit-oo1 +//#/if if not omit-oo1 diff --git a/ext/wasm/api/sqlite3-worker1.c-pp.js b/ext/wasm/api/sqlite3-worker1.c-pp.js index 036c4c6ea3..046243baa5 100644 --- a/ext/wasm/api/sqlite3-worker1.c-pp.js +++ b/ext/wasm/api/sqlite3-worker1.c-pp.js @@ -33,9 +33,9 @@ directory from which `sqlite3.js` will be loaded. 
*/ //#if target:es6-bundler-friendly -import {default as sqlite3InitModule} from './sqlite3-bundler-friendly.mjs'; +import sqlite3InitModule from './sqlite3-bundler-friendly.mjs'; //#elif target:es6-module - return new Worker(new URL("sqlite3.js", import.meta.url)); +import sqlite3InitModule from './sqlite3.mjs'; //#else "use strict"; { @@ -49,8 +49,8 @@ import {default as sqlite3InitModule} from './sqlite3-bundler-friendly.mjs'; //console.warn("worker1 theJs =",theJs); importScripts(theJs); } -//#endif +//#/if sqlite3InitModule().then(sqlite3 => sqlite3.initWorker1API()); //#else /* Built with the omit-oo1 flag. */ -//#endif if not omit-oo1 +//#/if if not omit-oo1 diff --git a/ext/wasm/c-pp-lite.c b/ext/wasm/c-pp-lite.c deleted file mode 100644 index 2120c457dd..0000000000 --- a/ext/wasm/c-pp-lite.c +++ /dev/null @@ -1,2767 +0,0 @@ -/* -** 2022-11-12: -** -** The author disclaims copyright to this source code. In place of -** a legal notice, here is a blessing: -** -** * May you do good and not evil. -** * May you find forgiveness for yourself and forgive others. -** * May you share freely, never taking more than you give. -** -************************************************************************ -** -** The C-minus Preprocessor: a truly minimal C-like preprocessor. -** Why? Because C preprocessors _can_ process non-C code but generally make -** quite a mess of it. The purpose of this application is an extremely -** minimal preprocessor with only the most basic functionality of a C -** preprocessor, namely. -** -** The supported preprocessor directives are documented in the -** README.md hosted with this file. -** -** Any mention of "#" in the docs, e.g. "#if", is symbolic. The -** directive delimiter is configurable and defaults to "##". Define -** CMPP_DEFAULT_DELIM to a string when compiling to define the default -** at build-time. 
-** -** This preprocessor has only minimal support for replacement of tokens -** which live in the "content" blocks of inputs (that is, the pieces -** which are not prepocessor lines). -** -** See this file's README.md for details. -** -** Design note: this code makes use of sqlite3. Though not _strictly_ -** needed in order to implement it, this tool was specifically created -** for use with the sqlite3 project's own JavaScript code, so there's -** no reason not to make use of it to do some of the heavy lifting. It -** does not require any cutting-edge sqlite3 features and should be -** usable with any version which supports `WITHOUT ROWID`. -** -** Author(s): -** -** - Stephan Beal -** -** Canonical homes: -** -** - https://fossil.wanderinghorse.net/r/c-pp -** - https://sqlite.org/src/file/ext/wasm/c-pp.c -** -** With the former hosting this app's SCM and the latter being the -** single known deployment of c-pp.c, where much of its development -** happens. -*/ - -#include -#include -#include -#include -#include -#include -#include - -#include "sqlite3.h" - -#if defined(_WIN32) || defined(WIN32) -# include -# include -# ifndef access -# define access(f,m) _access((f),(m)) -# endif -#else -# include -#endif - -#ifndef CMPP_DEFAULT_DELIM -#define CMPP_DEFAULT_DELIM "##" -#endif - -#ifndef CMPP_ATSIGN -#define CMPP_ATSIGN (unsigned char)'@' -#endif - -#if 1 -# define CMPP_NORETURN __attribute__((noreturn)) -#else -# define CMPP_NORETURN -#endif - -/* Fatally exits the app with the given printf-style message. */ -static CMPP_NORETURN void fatalv__base(char const *zFile, int line, - char const *zFmt, va_list); -static CMPP_NORETURN void fatal__base(char const *zFile, int line, - char const *zFmt, ...); -#define fatalv(...) fatalv__base(__FILE__,__LINE__,__VA_ARGS__) -#define fatal(...) fatal__base(__FILE__,__LINE__,__VA_ARGS__) - -/** Proxy for free(), for symmetry with cmpp_realloc(). 
*/ -static void cmpp_free(void *p); -/** A realloc() proxy which dies fatally on allocation error. */ -static void * cmpp_realloc(void * p, unsigned n); -#if 0 -/** A malloc() proxy which dies fatally on allocation error. */ -static void * cmpp_malloc(unsigned n); -#endif - -static void check__oom2(void const *p, char const *zFile, int line){ - if(!p) fatal("Alloc failed at %s:%d", zFile, line); -} -#define check__oom(P) check__oom2((P), __FILE__, __LINE__) - -/* -** If p is stdin or stderr then this is a no-op, else it is a -** proxy for fclose(). This is a no-op if p is NULL. -*/ -static void FILE_close(FILE *p); -/* -** Works like fopen() but accepts the special name "-" to mean either -** stdin (if zMode indicates a real-only mode) or stdout. Fails -** fatally on error. -*/ -static FILE * FILE_open(char const *zName, const char * zMode); -/* -** Reads the entire contents of the given file, allocating it in a -** buffer which gets assigned to `*pOut`. `*nOut` gets assigned the -** length of the output buffer. Fails fatally on error. -*/ -static void FILE_slurp(FILE *pFile, unsigned char **pOut, - unsigned * nOut); - -/* -** Intended to be passed an sqlite3 result code. If it's a non-0 value -** other than SQLITE_ROW or SQLITE_DONE then it emits a fatal error -** message which contains both the given string and the -** sqlite3_errmsg() from the application's database instance. -*/ -static void db_affirm_rc(int rc, const char * zMsg); - -/* -** Proxy for sqlite3_str_finish() which fails fatally if that -** routine returns NULL. -*/ -static char * db_str_finish(sqlite3_str *s, int * n); -/* -** Proxy for sqlite3_str_new() which fails fatally if that -** routine returns NULL. -*/ -static sqlite3_str * db_str_new(void); - -/* -** Proxy for sqlite3_step() which fails fatally if the result -** is anything other than SQLITE_ROW or SQLITE_DONE. -*/ -static int db_step(sqlite3_stmt *pStmt); -/* -** Proxy for sqlite3_bind_int() which fails fatally on error. 
-*/ -static void db_bind_int(sqlite3_stmt *pStmt, int col, int val); -/* -** Proxy for sqlite3_bind_null() which fails fatally on error. -*/ -static void db_bind_null(sqlite3_stmt *pStmt, int col); -/* -** Proxy for sqlite3_bind_text() which fails fatally on error. -*/ -static void db_bind_text(sqlite3_stmt *pStmt, int col, const char * zStr); -/* -** Proxy for sqlite3_bind_text() which fails fatally on error. -*/ -static void db_bind_textn(sqlite3_stmt *pStmt, int col, const char * zStr, int len); -#if 0 -/* -** Proxy for sqlite3_bind_text() which fails fatally on error. It uses -** sqlite3_str_vappendf() so supports all of its formatting options. -*/ -static void db_bind_textv(sqlite3_stmt *pStmt, int col, const char * zFmt, ...); -#endif -/* -** Proxy for sqlite3_free(), to be passed any memory which is allocated -** by sqlite3_malloc(). -*/ -static void db_free(void *m); - -/* -** Returns true if the first nKey bytes of zKey are a legal string. If -** it returns false and zErrPos is not null, *zErrPos is set to the -** position of the illegal character. If nKey is negative, strlen() is -** used to calculate it. -*/ -static int cmpp_is_legal_key(char const *zKey, int nKey, char const **zErrPos); - -/* -** Fails fatally if !cmpp_is_legal_key(zKey). -*/ -static void cmpp_affirm_legal_key(char const *zKey, int nKey); - -/* -** Adds the given `#define` macro name to the list of macros, ignoring -** any duplicates. Fails fatally on error. -** -** If zVal is NULL then zKey may contain an '=', from which the value -** will be extracted. If zVal is not NULL then zKey may _not_ contain -** an '='. -*/ -static void db_define_add(const char * zKey, char const *zVal); - -/* -** Returns true if the given key is already in the `#define` list, -** else false. Fails fatally on db error. -** -** nName is the length of the key part of zName (which might have -** a following =y part. If it's negative, strlen() is used to -** calculate it. 
-*/ -static int db_define_has(const char * zName, int nName); - -/* -** Returns true if the given key is already in the `#define` list, and -** it has a truthy value (is not empty and not equal to '0'), else -** false. Fails fatally on db error. -** -** nName is the length of zName, or <0 to use strlen() to figure -** it out. -*/ -static int db_define_get_bool(const char * zName, int nName); - -/* -** Searches for a define where (k GLOB zName). If one is found, a copy -** of it is assigned to *zVal (the caller must eventually db_free() -** it)), *nVal (if nVal is not NULL) is assigned its strlen, and -** returns non-0. If no match is found, 0 is returned and neither -** *zVal nor *nVal are modified. If more than one result matches, a -** fatal error is triggered. -** -** It is legal for *zVal to be NULL (and *nVal to be 0) if it returns -** non-0. That just means that the key was defined with no value part. -*/ -static int db_define_get(const char * zName, int nName, char **zVal, unsigned int *nVal); - -/* -** Removes the given `#define` macro name from the list of -** macros. Fails fatally on error. -*/ -static void db_define_rm(const char * zKey); -/* -** Adds the given filename to the list of being-`#include`d files, -** using the given source file name and line number of error reporting -** purposes. If recursion is later detected. -*/ -static void db_including_add(const char * zKey, const char * zSrc, int srcLine); -/* -** Adds the given dir to the list of includes. They are checked in the -** order they are added. -*/ -static void db_include_dir_add(const char * zKey); -/* -** Returns a resolved path of PREFIX+'/'+zKey, where PREFIX is one of -** the `#include` dirs (db_include_dir_add()). If no file match is -** found, NULL is returned. Memory must eventually be passed to -** db_free() to free it. -*/ -static char * db_include_search(const char * zKey); -/* -** Removes the given key from the `#include` list. 
-*/ -static void db_include_rm(const char * zKey); -/* -** A proxy for sqlite3_prepare() which fails fatally on error. -*/ -static void db_prepare(sqlite3_stmt **pStmt, const char * zSql, ...); - -/* -** Opens the given file and processes its contents as c-pp, sending -** all output to the global c-pp output channel. Fails fatally on -** error. -*/ -static void cmpp_process_file(const char * zName); - -/* -** Operator policy for cmpp_kvp_parse(). -*/ -enum cmpp_key_op_e { - /* Fail if the key contains an operator. */ - cmpp_key_op_none, - /* Accept only '='. */ - cmpp_key_op_eq1 -}; -typedef enum cmpp_key_op_e cmpp_key_op_e; - -/* -** Operators and operator policies for use with X=Y-format keys. -*/ -#define cmpp_kvp_op_map(E) \ - E(none,"") \ - E(eq1,"=") \ - E(eq2,"==") \ - E(lt,"<") \ - E(le,"<=") \ - E(gt,">") \ - E(ge,">=") - -enum cmpp_kvp_op_e { -#define E(N,S) cmpp_kvp_op_ ## N, - cmpp_kvp_op_map(E) -#undef E -}; -typedef enum cmpp_kvp_op_e cmpp_kvp_op_e; - -/* -** A snippet from a string. -*/ -struct cmpp_snippet { - char const *z; - unsigned int n; -}; -typedef struct cmpp_snippet cmpp_snippet; -#define cmpp_snippet_empty_m {0,0} - -/* -** Result type for cmpp_kvp_parse(). -*/ -struct cmpp_kvp { - cmpp_snippet k; - cmpp_snippet v; - cmpp_kvp_op_e op; -}; - -typedef struct cmpp_kvp cmpp_kvp; -#define cmpp_kvp_empty_m \ - {cmpp_snippet_empty_m,cmpp_snippet_empty_m,cmpp_kvp_op_none} -static const cmpp_kvp cmpp_kvp_empty = cmpp_kvp_empty_m; - -/* -** Parses X or X=Y into p. Fails fatally on error. -** -** If nKey is negative then strlen() is used to calculate it. -** -** The third argument specifies whether/how to permit/treat the '=' -** part of X=Y. -*/ -static void cmpp_kvp_parse(cmpp_kvp * p, - char const *zKey, int nKey, - cmpp_kvp_op_e opPolicy); - -/* -** Wrapper around a FILE handle. -*/ -typedef struct FileWrapper FileWrapper; -struct FileWrapper { - /* File's name. */ - char const *zName; - /* FILE handle. 
*/ - FILE * pFile; - /* Where FileWrapper_slurp() stores the file's contents. */ - unsigned char * zContent; - /* Size of this->zContent, as set by FileWrapper_slurp(). */ - unsigned nContent; - /* See Global::pFiles. */ - FileWrapper * pTail; -}; -#define FileWrapper_empty_m {0,0,0,0,0} -static const FileWrapper FileWrapper_empty = FileWrapper_empty_m; - -/* -** Proxy for FILE_close() and frees all memory owned by p. A no-op if -** p is already closed. -*/ -static void FileWrapper_close(FileWrapper * p); -/* Proxy for FILE_open(). Closes p first if it's currently opened. */ -static void FileWrapper_open(FileWrapper * p, const char * zName, const char *zMode); -/* Proxy for FILE_slurp(). */ -static void FileWrapper_slurp(FileWrapper * p); -/* -** If p->zContent ends in \n or \r\n, that part is replaced with 0 and -** p->nContent is adjusted. Returns true if it chomps, else false. -*/ -int FileWrapper_chomp(FileWrapper * p); - -/* -** Outputs a printf()-formatted message to stderr. -*/ -static void g_stderr(char const *zFmt, ...); -/* -** Outputs a printf()-formatted message to stderr. -*/ -static void g_stderrv(char const *zFmt, va_list); -#define g_debug(lvl,pfexpr) \ - if(lvl<=g.flags.doDebug) g_stderr("%s @ %s():%d: ",g.zArgv0,__func__,__LINE__); \ - if(lvl<=g.flags.doDebug) g_stderr pfexpr - -#define g_warn(zFmt,...) g_stderr("%s:%d %s() " zFmt "\n", __FILE__, __LINE__, __func__, __VA_ARGS__) -#define g_warn0(zMsg) g_stderr("%s:%d %s() %s\n", __FILE__, __LINE__, __func__, zMsg) - -void cmpp_free(void *p){ - sqlite3_free(p); -} - -void * cmpp_realloc(void * p, unsigned n){ - void * const rc = sqlite3_realloc(p, n); - if(!rc) fatal("realloc(P,%u) failed", n); - return rc; -} - -#if 0 -void * cmpp_malloc(unsigned n){ - void * const rc = sqlite3_alloc(n); - if(!rc) fatal("malloc(%u) failed", n); - return rc; -} -#endif - -FILE * FILE_open(char const *zName, const char * zMode){ - FILE * p; - if('-'==zName[0] && 0==zName[1]){ - p = strstr(zMode,"w") ? 
stdout : stdin; - }else{ - p = fopen(zName, zMode); - if(!p) fatal("Cannot open file [%s] with mode [%s]", zName, zMode); - } - return p; -} - -void FILE_close(FILE *p){ - if(p && p!=stdout && p!=stderr){ - fclose(p); - } -} - -void FILE_slurp(FILE *pFile, unsigned char **pOut, - unsigned * nOut){ - unsigned char zBuf[1024 * 8]; - unsigned char * pDest = 0; - unsigned nAlloc = 0; - unsigned nOff = 0; - /* Note that this needs to be able to work on non-seekable streams, - ** thus we read in chunks instead of doing a single alloc and - ** filling it in one go. */ - while( !feof(pFile) ){ - size_t const n = fread(zBuf, 1, sizeof(zBuf), pFile); - if(n>0){ - if(nAlloc < nOff + n + 1){ - nAlloc = nOff + n + 1; - pDest = cmpp_realloc(pDest, nAlloc); - } - memcpy(pDest + nOff, zBuf, n); - nOff += n; - } - } - if(pDest) pDest[nOff] = 0; - *pOut = pDest; - *nOut = nOff; -} - -void FileWrapper_close(FileWrapper * p){ - if(p->pFile) FILE_close(p->pFile); - if(p->zContent) cmpp_free(p->zContent); - *p = FileWrapper_empty; -} - -void FileWrapper_open(FileWrapper * p, const char * zName, - const char * zMode){ - FileWrapper_close(p); - p->pFile = FILE_open(zName, zMode); - p->zName = zName; -} - -void FileWrapper_slurp(FileWrapper * p){ - assert(!p->zContent); - assert(p->pFile); - FILE_slurp(p->pFile, &p->zContent, &p->nContent); -} - -int FileWrapper_chomp(FileWrapper * p){ - if( p->nContent && '\n'==p->zContent[p->nContent-1] ){ - p->zContent[--p->nContent] = 0; - if( p->nContent && '\r'==p->zContent[p->nContent-1] ){ - p->zContent[--p->nContent] = 0; - } - return 1; - } - return 0; -} - -enum CmppParseState { -TS_Start = 1, -TS_If, -TS_IfPassed, -TS_Else, -TS_Error -}; -typedef enum CmppParseState CmppParseState; -enum CmppTokenType { - -#define CmppToken_map(E) \ - E(Invalid,0) \ - E(Assert,"assert") \ - E(AtPolicy,"@policy") \ - E(Comment,"//") \ - E(Define,"define") \ - E(Elif,"elif") \ - E(Else,"else") \ - E(Endif,"endif") \ - E(Error,"error") \ - E(If,"if") \ - 
E(Include,"include") \ - E(Line,0) \ - E(Opaque,0) \ - E(Pragma,"pragma") \ - E(Savepoint,"savepoint") \ - E(Stderr,"stderr") \ - E(Undef,"undef") - -#define E(N,TOK) TT_ ## N, - CmppToken_map(E) -#undef E -}; -typedef enum CmppTokenType CmppTokenType; - -/* -** Map of directive (formerly keyword) names and their token types. -*/ -static const struct { -#define E(N,TOK) struct cmpp_snippet N; - CmppToken_map(E) -#undef E -} DStrings = { -#define E(N,TOK) .N = {TOK,sizeof(TOK)-1}, - CmppToken_map(E) -#undef E -}; - -//static -char const * TT_cstr(int tt){ - switch(tt){ -#define E(N,TOK) case TT_ ## N: return DStrings.N.z; - CmppToken_map(E) -#undef E - } - return NULL; -} - -struct CmppToken { - CmppTokenType ttype; - /* Line number of this token in the source file. */ - unsigned lineNo; - /* Start of the token. */ - unsigned char const * zBegin; - /* One-past-the-end byte of the token. */ - unsigned char const * zEnd; -}; -typedef struct CmppToken CmppToken; -#define CmppToken_empty_m {TT_Invalid,0,0,0} -static const CmppToken CmppToken_empty = CmppToken_empty_m; - -/* -** CmppLevel represents one "level" of tokenization, starting at the -** top of the main input, incrementing once for each level of `#if`, -** and decrementing for each `#endif`. -** pushes a level. -*/ -typedef struct CmppLevel CmppLevel; -struct CmppLevel { - unsigned short flags; - /* - ** Used for controlling which parts of an if/elif/...endif chain - ** should get output. - */ - unsigned short skipLevel; - /* The token which started this level (an 'if' or 'include'). */ - CmppToken token; - CmppParseState pstate; -}; -#define CmppLevel_empty_m {0U,0U,CmppToken_empty_m,TS_Start} -static const CmppLevel CmppLevel_empty = CmppLevel_empty_m; -enum CmppLevel_Flags { -/* Max depth of nested `#if` constructs in a single tokenizer. */ -CmppLevel_Max = 10, -/* Max number of keyword arguments. 
*/ -CmppArgs_Max = 15, -/* Directive line buffer size */ -CmppArgs_BufSize = 1024, -/* Flag indicating that output for a CmpLevel should be elided. */ -CmppLevel_F_ELIDE = 0x01, -/* -** Mask of CmppLevel::flags which are inherited when CmppLevel_push() -** is used. -*/ -CmppLevel_F_INHERIT_MASK = CmppLevel_F_ELIDE -}; - -typedef struct CmppTokenizer CmppTokenizer; -typedef struct CmppKeyword CmppKeyword; -typedef void (*cmpp_keyword_f)(CmppKeyword const * pKw, CmppTokenizer * t); -struct CmppKeyword { - const char *zName; - unsigned nName; - int bTokenize; - CmppTokenType ttype; - cmpp_keyword_f xCall; -}; - -static CmppKeyword const * CmppKeyword_search(const char *zName); -static void cmpp_process_keyword(CmppTokenizer * const t); - -/* -** Tokenizer for c-pp input files. -*/ -struct CmppTokenizer { - const char * zName; /* Input (file) name for error reporting */ - unsigned const char * zBegin; /* start of input */ - unsigned const char * zEnd; /* one-after-the-end of input */ - unsigned const char * zPos; /* current position */ - unsigned int lineNo; /* line # of current pos */ - unsigned nSavepoint; - CmppParseState pstate; - CmppToken token; /* current token result */ - struct { - unsigned ndx; - CmppLevel stack[CmppLevel_Max]; - } level; - /* Args for use in cmpp_keyword_f() impls. 
*/ - struct { - CmppKeyword const * pKw; - int argc; - const unsigned char * argv[CmppArgs_Max]; - unsigned char lineBuf[CmppArgs_BufSize]; - } args; -}; -#define CT_level(t) (t)->level.stack[(t)->level.ndx] -#define CT_pstate(t) CT_level(t).pstate -#define CT_skipLevel(t) CT_level(t).skipLevel -#define CLvl_skip(lvl) ((lvl)->skipLevel || ((lvl)->flags & CmppLevel_F_ELIDE)) -#define CT_skip(t) CLvl_skip(&CT_level(t)) -#define CmppTokenizer_empty_m { \ - .zName=0, .zBegin=0, .zEnd=0, \ - .zPos=0, \ - .lineNo=1U, \ - .pstate = TS_Start, \ - .token = CmppToken_empty_m, \ - .level = {0U,{CmppLevel_empty_m}}, \ - .args = {0,0,{0},{0}} \ - } -static const CmppTokenizer CmppTokenizer_empty = CmppTokenizer_empty_m; - -static void CmppTokenizer_cleanup(CmppTokenizer * const t); - -static void cmpp_t_out(CmppTokenizer * t, void const *z, unsigned int n); -/*static void cmpp_t_outf(CmppTokenizer * t, char const *zFmt, ...);*/ - -/* -** Pushes a new level into the given tokenizer. Fails fatally if -** it's too deep. -*/ -static void CmppLevel_push(CmppTokenizer * const t); -/* -** Pops a level from the tokenizer. Fails fatally if the top -** level is popped. -*/ -static void CmppLevel_pop(CmppTokenizer * const t); -/* -** Returns the current level object. -*/ -static CmppLevel * CmppLevel_get(CmppTokenizer * const t); - -/* -** Policies for how to handle undefined @tokens@ when performing -** content filtering. -*/ -enum AtPolicy { - AT_invalid = -1, - /** Turn off @foo@ parsing. */ - AT_OFF = 0, - /** Retain undefined @foo@ - emit it as-is. */ - AT_RETAIN, - /** Elide undefined @foo@. */ - AT_ELIDE, - /** Error for undefined @foo@. 
*/ - AT_ERROR, - AT_DEFAULT = AT_ERROR -}; -typedef enum AtPolicy AtPolicy; - -static AtPolicy AtPolicy_fromStr(char const *z, int bEnforce){ - if( 0==strcmp(z, "retain") ) return AT_RETAIN; - if( 0==strcmp(z, "elide") ) return AT_ELIDE; - if( 0==strcmp(z, "error") ) return AT_ERROR; - if( 0==strcmp(z, "off") ) return AT_OFF; - if( bEnforce ){ - fatal("Invalid @ policy value: %s. " - "Try one of retain|elide|error|off.", z); - } - return AT_invalid; -} - -/* -** Global app state singleton. -*/ -static struct Global { - /* main()'s argv[0]. */ - const char * zArgv0; - /* App's db instance. */ - sqlite3 * db; - /* Current tokenizer (for error reporting purposes). */ - CmppTokenizer const * tok; - /* - ** We use a linked-list of these to keep track of our opened - ** files so that we can clean then up via atexit() in the case of - ** fatal error (to please valgrind). - */ - FileWrapper * pFiles; - /* Output channel. */ - FileWrapper out; - struct { - /* - ** Bytes of the keyword delimiter/prefix. Owned - ** elsewhere. - */ - const char * z; - /* Byte length of this->zDelim. */ - unsigned short n; - /* - ** The @token@ delimiter. - ** - ** Potential TODO is replace this with a pair of opener/closer - ** strings, e.g. "{{" and "}}". 
- */ - const unsigned char chAt; - } delim; - struct { -#define CMPP_SAVEPOINT_NAME "_cmpp_" -#define GStmt_map(E) \ - E(defIns,"INSERT OR REPLACE INTO def(k,v) VALUES(?,?)") \ - E(defDel,"DELETE FROM def WHERE k GLOB ?") \ - E(defHas,"SELECT 1 FROM def WHERE k GLOB ?") \ - E(defGet,"SELECT k,v FROM def WHERE k GLOB ?") \ - E(defGetBool, \ - "SELECT 1 FROM def WHERE k = ?1" \ - " AND v IS NOT NULL" \ - " AND '0'!=v AND ''!=v") \ - E(defSelAll,"SELECT k,v FROM def ORDER BY k") \ - E(inclIns,"INSERT OR FAIL INTO incl(file,srcFile," \ - "srcLine) VALUES(?,?,?)") \ - E(inclDel,"DELETE FROM incl WHERE file=?") \ - E(inclHas,"SELECT 1 FROM incl WHERE file=?") \ - E(inclPathAdd,"INSERT OR FAIL INTO " \ - "inclpath(seq,dir) VALUES(?,?)") \ - E(inclSearch, \ - "SELECT ?1 fn WHERE fileExists(fn) " \ - "UNION ALL SELECT * FROM (" \ - "SELECT replace(dir||'/'||?1, '//','/') AS fn " \ - "FROM inclpath WHERE fileExists(fn) ORDER BY seq"\ - ")") \ - E(spBegin,"SAVEPOINT " CMPP_SAVEPOINT_NAME) \ - E(spRollback,"ROLLBACK TO SAVEPOINT " \ - CMPP_SAVEPOINT_NAME) \ - E(spRelease,"RELEASE SAVEPOINT " CMPP_SAVEPOINT_NAME) - -#define E(N,S) sqlite3_stmt * N; - GStmt_map(E) -#undef E - } stmt; - struct { - FILE * pFile; - int expandSql; - } sqlTrace; - struct { - AtPolicy atPolicy; - /* If true, enables certain debugging output. */ - char doDebug; - /* If true, chomp() files read via -Fx=file. */ - char chompF; - } flags; -} g = { - .zArgv0 = "?", - .db = 0, - .tok = 0, - .pFiles = 0, - .out = FileWrapper_empty_m, - .delim = { - .z = CMPP_DEFAULT_DELIM, - .n = (unsigned short) sizeof(CMPP_DEFAULT_DELIM)-1, - .chAt = '@' - }, - .stmt = { - .defIns = 0, - .defDel = 0, - .defHas = 0, - .defGet = 0, - .defGetBool = 0, - .inclIns = 0, - .inclDel = 0, - .inclHas = 0, - .inclPathAdd = 0, - .inclSearch = 0 - }, - .sqlTrace = { - .pFile = 0, - .expandSql = 0 - }, - .flags = { - .atPolicy = AT_OFF, - .doDebug = 0, - .chompF = 0 - } -}; - -/** Distinct IDs for each g.stmt member. 
*/ -enum GStmt_e { - GStmt_none = 0, -#define E(N,S) GStmt_ ## N, - GStmt_map(E) -#undef E -}; - -/* -** Returns the g.stmt.X corresponding to `which`, initializing it if -** needed. It does not return NULL - it fails fatally on error. -*/ -static sqlite3_stmt * g_stmt(enum GStmt_e which){ - sqlite3_stmt ** q = 0; - char const * zSql = 0; - switch(which){ - case GStmt_none: - fatal("GStmt_none is not a valid statement handle"); - return NULL; -#define E(N,S) case GStmt_ ## N: zSql = S; q = &g.stmt.N; break; - GStmt_map(E) -#undef E - } - assert( q ); - assert( zSql && *zSql ); - if( !*q ){ - db_prepare(q, "%s", zSql); - assert( *q ); - } - return *q; -} -static void g_stmt_reset(sqlite3_stmt * const q){ - sqlite3_clear_bindings(q); - sqlite3_reset(q); -} - -#if 0 -/* -** Outputs a printf()-formatted message to c-pp's global output -** channel. -*/ -static void g_outf(char const *zFmt, ...); -void g_outf(char const *zFmt, ...){ - va_list va; - va_start(va, zFmt); - vfprintf(g.out.pFile, zFmt, va); - va_end(va); -} -#endif - -/* Outputs n bytes from z to c-pp's global output channel. */ -static void g_out(void const *z, unsigned int n); -void g_out(void const *z, unsigned int n){ - if(g.out.pFile && 1!=fwrite(z, n, 1, g.out.pFile)){ - int const err = errno; - fatal("fwrite() output failed with errno #%d", err); - } -} - -void g_stderrv(char const *zFmt, va_list va){ - if( g.out.pFile==stdout ){ - fflush(g.out.pFile); - } - vfprintf(stderr, zFmt, va); -} - -void g_stderr(char const *zFmt, ...){ - va_list va; - va_start(va, zFmt); - g_stderrv(zFmt, va); - va_end(va); -} - -/* -** Emits n bytes of z if CT_skip(t) is false. 
-*/ -void cmpp_t_out(CmppTokenizer * t, void const *z, unsigned int n){ - g_debug(3,("CT_skipLevel() ?= %d\n",CT_skipLevel(t))); - g_debug(3,("CT_skip() ?= %d\n",CT_skip(t))); - if(!CT_skip(t)) g_out(z, n); -} - -void CmppLevel_push(CmppTokenizer * const t){ - CmppLevel * pPrev; - CmppLevel * p; - if(t->level.ndx+1 == (unsigned)CmppLevel_Max){ - fatal("%sif nesting level is too deep. Max=%d\n", - g.delim.z, CmppLevel_Max); - } - pPrev = &CT_level(t); - g_debug(3,("push from tokenizer level=%u flags=%04x\n", - t->level.ndx, pPrev->flags)); - p = &t->level.stack[++t->level.ndx]; - *p = CmppLevel_empty; - p->token = t->token; - p->flags = (CmppLevel_F_INHERIT_MASK & pPrev->flags); - if(CLvl_skip(pPrev)) p->flags |= CmppLevel_F_ELIDE; - g_debug(3,("push to tokenizer level=%u flags=%04x\n", - t->level.ndx, p->flags)); -} - -void CmppLevel_pop(CmppTokenizer * const t){ - if(!t->level.ndx){ - fatal("Internal error: CmppLevel_pop() at the top of the stack"); - } - g_debug(3,("pop from tokenizer level=%u, flags=%04x skipLevel?=%d\n", - t->level.ndx, - t->level.stack[t->level.ndx].flags, CT_skipLevel(t))); - g_debug(3,("CT_skipLevel() ?= %d\n",CT_skipLevel(t))); - g_debug(3,("CT_skip() ?= %d\n",CT_skip(t))); - t->level.stack[t->level.ndx--] = CmppLevel_empty; - g_debug(3,("pop to tokenizer level=%u, flags=%04x\n", t->level.ndx, - t->level.stack[t->level.ndx].flags)); - g_debug(3,("CT_skipLevel() ?= %d\n",CT_skipLevel(t))); - g_debug(3,("CT_skip() ?= %d\n",CT_skip(t))); -} - -CmppLevel * CmppLevel_get(CmppTokenizer * const t){ - return &t->level.stack[t->level.ndx]; -} - - -void db_affirm_rc(int rc, const char * zMsg){ - switch(rc){ - case 0: - case SQLITE_DONE: - case SQLITE_ROW: - break; - default: - assert( g.db ); - fatal("Db error #%d %s: %s", rc, zMsg, - sqlite3_errmsg(g.db)); - } -} - -int db_step(sqlite3_stmt *pStmt){ - int const rc = sqlite3_step(pStmt); - switch( rc ){ - case SQLITE_ROW: - case SQLITE_DONE: - break; - default: - db_affirm_rc(rc, "from db_step()"); - 
} - return rc; -} - -static sqlite3_str * db_str_new(void){ - sqlite3_str * rc = sqlite3_str_new(g.db); - if(!rc) fatal("Alloc failed for sqlite3_str_new()"); - return rc; -} - -static char * db_str_finish(sqlite3_str *s, int * n){ - int const rc = sqlite3_str_errcode(s); - if(rc) fatal("Error #%d from sqlite3_str_errcode()", rc); - if(n) *n = sqlite3_str_length(s); - char * z = sqlite3_str_finish(s); - if(!z) fatal("Alloc failed for sqlite3_str_new()"); - return z; -} - -void db_prepare(sqlite3_stmt **pStmt, const char * zSql, ...){ - int rc; - sqlite3_str * str = db_str_new(); - char * z = 0; - int n = 0; - va_list va; - if(!str) fatal("sqlite3_str_new() failed"); - va_start(va, zSql); - sqlite3_str_vappendf(str, zSql, va); - va_end(va); - rc = sqlite3_str_errcode(str); - if(rc) fatal("sqlite3_str_errcode() = %d", rc); - z = db_str_finish(str, &n); - rc = sqlite3_prepare_v2(g.db, z, n, pStmt, 0); - if(rc) fatal("Error #%d (%s) preparing: %s", - rc, sqlite3_errmsg(g.db), z); - sqlite3_free(z); -} - -void db_bind_int(sqlite3_stmt *pStmt, int col, int val){ - db_affirm_rc(sqlite3_bind_int(pStmt, col, val), - "from db_bind_int()"); -} - -void db_bind_null(sqlite3_stmt *pStmt, int col){ - db_affirm_rc(sqlite3_bind_null(pStmt, col), - "from db_bind_null()"); -} - -void db_bind_textn(sqlite3_stmt *pStmt, int col, - const char * zStr, int n){ - db_affirm_rc( - (zStr && n) - ? 
sqlite3_bind_text(pStmt, col, zStr, n, SQLITE_TRANSIENT) - : sqlite3_bind_null(pStmt, col), - "from db_bind_textn()" - ); -} - -void db_bind_text(sqlite3_stmt *pStmt, int col, - const char * zStr){ - db_bind_textn(pStmt, col, zStr, -1); -} - -#if 0 -void db_bind_textv(sqlite3_stmt *pStmt, int col, - const char * zFmt, ...){ - int rc; - sqlite3_str * str = db_str_new(); - int n = 0; - char * z; - va_list va; - va_start(va,zFmt); - sqlite3_str_vappendf(str, zFmt, va); - va_end(va); - z = db_str_finish(str, &n); - rc = sqlite3_bind_text(pStmt, col, z, n, sqlite3_free); - db_affirm_rc(rc,"from db_bind_textv()"); -} -#endif - -void db_free(void *m){ - sqlite3_free(m); -} - -void db_define_add(const char * zKey, char const *zVal){ - cmpp_kvp kvp = cmpp_kvp_empty; - cmpp_kvp_parse(&kvp, zKey, -1, - zVal - ? cmpp_key_op_none - : cmpp_key_op_eq1 - ); - if( kvp.v.z ){ - if( zVal ){ - assert(!"cannot happen - cmpp_key_op_none will prevent it"); - fatal("Cannot assign two values to [%.*s] [%.*s] [%s]", - kvp.k.n, kvp.k.z, kvp.v.n, kvp.v.z, zVal); - } - }else{ - kvp.v.z = zVal; - kvp.v.n = zVal ? (int)strlen(zVal) : 0; - } - sqlite3_stmt * const q = g_stmt(GStmt_defIns); - //g_stderr("zKey=%s\nzVal=%s\nzEq=%s\n", zKey, zVal, zEq); - db_bind_textn(q, 1, kvp.k.z, kvp.k.n); - if( kvp.v.z ){ - if( kvp.v.n ){ - db_bind_textn(q, 2, kvp.v.z, (int)kvp.v.n); - }else{ - db_bind_null(q, 2); - } - }else{ - db_bind_int(q, 2, 1); - } - db_step(q); - g_debug(2,("define: %s%s%s\n", - zKey, - zVal ? " with value " : "", - zVal ? 
zVal : "")); - sqlite3_reset(q); -} - -static void db_define_add_file(const char * zKey){ - cmpp_kvp kvp = cmpp_kvp_empty; - cmpp_kvp_parse(&kvp, zKey, -1, cmpp_kvp_op_eq1); - if( !kvp.v.z || !kvp.v.n ){ - fatal("Invalid filename: %s", zKey); - } - sqlite3_stmt * q = 0; - FileWrapper fw = FileWrapper_empty; - FileWrapper_open(&fw, kvp.v.z, "r"); - FileWrapper_slurp(&fw); - q = g_stmt(GStmt_defIns); - //g_stderr("zKey=%s\nzVal=%s\nzEq=%s\n", zKey, zVal, zEq); - db_bind_textn(q, 1, kvp.k.z, (int)kvp.k.n); - if( g.flags.chompF ){ - FileWrapper_chomp(&fw); - } - if( fw.nContent ){ - db_affirm_rc( - sqlite3_bind_text(q, 2, - (char const *)fw.zContent, - (int)fw.nContent, sqlite3_free), - "binding file content"); - fw.zContent = 0 /* transfered ownership */; - fw.nContent = 0; - }else{ - db_affirm_rc( sqlite3_bind_null(q, 2), - "binding empty file content"); - } - FileWrapper_close(&fw); - db_step(q); - g_stmt_reset(q); - g_debug(2,("define: %s%s%s\n", - kvp.k.z, - kvp.v.z ? " with value " : "", - kvp.v.z ? kvp.v.z : "")); -} - -#define ustr_c(X) ((unsigned char const *)X) - -static inline unsigned int cmpp_strlen(char const *z, int n){ - return n<0 ? 
(int)strlen(z) : (unsigned)n; -} - - -int db_define_has(const char * zName, int nName){ - int rc; - sqlite3_stmt * const q = g_stmt(GStmt_defHas); - nName = cmpp_strlen(zName, nName); - db_bind_textn(q, 1, zName, nName); - rc = db_step(q); - if(SQLITE_ROW == rc){ - rc = 1; - }else{ - assert(SQLITE_DONE==rc); - rc = 0; - } - g_debug(1,("defined [%s] ?= %d\n",zName, rc)); - g_stmt_reset(q); - return rc; -} - -int db_define_get_bool(const char * zName, int nName){ - sqlite3_stmt * const q = g_stmt(GStmt_defGetBool); - int rc = 0; - nName = cmpp_strlen(zName, nName); - db_bind_textn(q, 1, zName, nName); - rc = db_step(q); - if(SQLITE_ROW == rc){ - if( SQLITE_ROW==sqlite3_step(q) ){ - fatal("Key is ambiguous: %s", zName); - } - rc = 1; - }else{ - assert(SQLITE_DONE==rc); - rc = 0; - } - g_stmt_reset(q); - return rc; -} - -int db_define_get(const char * zName, int nName, - char **zVal, unsigned int *nVal){ - sqlite3_stmt * q = g_stmt(GStmt_defGet); - nName = cmpp_strlen(zName, nName); - db_bind_textn(q, 1, zName, nName); - int n = 0; - int rc = db_step(q); - if(SQLITE_ROW == rc){ - const unsigned char * z = sqlite3_column_text(q, 1); - n = sqlite3_column_bytes(q,1); - if( nVal ) *nVal = (unsigned)n; - *zVal = sqlite3_mprintf("%.*s", n, z); - if( n && z ) check__oom(*zVal); - if( SQLITE_ROW==sqlite3_step(q) ){ - db_free(*zVal); - *zVal = 0; - fatal("Key is ambiguous: %.*s\n", - nName, zName); - } - rc = 1; - }else{ - assert(SQLITE_DONE==rc); - rc = 0; - } - g_debug(1,("define [%.*s] ?= %d %.*s\n", - nName, zName, rc, - *zVal ? n : 0, - *zVal ? 
*zVal : "")); - g_stmt_reset(q); - return rc; -} - -void db_define_rm(const char * zKey){ - int rc; - int n = 0; - sqlite3_stmt * const q = g_stmt(GStmt_defDel); - db_bind_text(q, 1, zKey); - rc = db_step(q); - if(SQLITE_DONE != rc){ - db_affirm_rc(rc, "Stepping DELETE on def"); - } - g_debug(2,("undefine: %.*s\n",n, zKey)); - g_stmt_reset(q); -} - -void db_including_add(const char * zKey, const char * zSrc, int srcLine){ - int rc; - sqlite3_stmt * const q = g_stmt(GStmt_inclIns); - db_bind_text(q, 1, zKey); - db_bind_text(q, 2, zSrc); - db_bind_int(q, 3, srcLine); - rc = db_step(q); - if(SQLITE_DONE != rc){ - db_affirm_rc(rc, "Stepping INSERT on incl"); - } - g_debug(2,("is-including-file add [%s] from [%s]:%d\n", zKey, zSrc, srcLine)); - g_stmt_reset(q); -} - -void db_include_rm(const char * zKey){ - int rc; - sqlite3_stmt * const q = g_stmt(GStmt_inclDel); - db_bind_text(q, 1, zKey); - rc = db_step(q); - if(SQLITE_DONE != rc){ - db_affirm_rc(rc, "Stepping DELETE on incl"); - } - g_debug(2,("inclpath rm [%s]\n", zKey)); - g_stmt_reset(q); -} - -char * db_include_search(const char * zKey){ - char * zName = 0; - sqlite3_stmt * const q = g_stmt(GStmt_inclSearch); - db_bind_text(q, 1, zKey); - if(SQLITE_ROW==db_step(q)){ - const unsigned char * z = sqlite3_column_text(q, 0); - zName = z ? sqlite3_mprintf("%s", z) : 0; - if(!zName) fatal("Alloc failed"); - } - g_stmt_reset(q); - return zName; -} - -static int db_including_has(const char * zName){ - int rc; - sqlite3_stmt * const q = g_stmt(GStmt_inclHas); - db_bind_text(q, 1, zName); - rc = db_step(q); - if(SQLITE_ROW == rc){ - rc = 1; - }else{ - assert(SQLITE_DONE==rc); - rc = 0; - } - g_debug(2,("inclpath has [%s] = %d\n",zName, rc)); - g_stmt_reset(q); - return rc; -} - -#if 0 -/* -** Fails fatally if the `#include` list contains the given key. 
-*/ -static void db_including_check(const char * zKey); -void db_including_check(const char * zName){ - if(db_including_has(zName)){ - fatal("Recursive include detected: %s\n", zName); - } -} -#endif - -void db_include_dir_add(const char * zDir){ - static int seq = 0; - int rc; - sqlite3_stmt * const q = g_stmt(GStmt_inclPathAdd); - db_bind_int(q, 1, ++seq); - db_bind_text(q, 2, zDir); - rc = db_step(q); - if(SQLITE_DONE != rc){ - db_affirm_rc(rc, "Stepping INSERT on inclpath"); - } - g_debug(2,("inclpath add #%d: %s\n",seq, zDir)); - g_stmt_reset(q); -} - -void g_FileWrapper_link(FileWrapper *fp){ - assert(!fp->pTail); - fp->pTail = g.pFiles; - g.pFiles = fp; -} - -void g_FileWrapper_close(FileWrapper *fp){ - assert(fp); - assert(fp->pTail || g.pFiles==fp); - g.pFiles = fp->pTail; - fp->pTail = 0; - FileWrapper_close(fp); -} - -static void g_cleanup(int bCloseFileChain){ - if( g.db ){ -#define E(N,S) sqlite3_finalize(g.stmt.N); g.stmt.N = 0; - GStmt_map(E) -#undef E - } - if( bCloseFileChain ){ - FileWrapper * fpNext = 0; - for( FileWrapper * fp=g.pFiles; fp; fp=fpNext ){ - fpNext = fp->pTail; - fp->pTail = 0; - FileWrapper_close(fp); - } - } - FileWrapper_close(&g.out); - if(g.db){ - sqlite3_close(g.db); - g.db = 0; - } -} - -static void cmpp_atexit(void){ - g_cleanup(1); -} - -int cmpp_is_legal_key(char const *zKey, int nKey, char const **zAt){ - char const * z = zKey; - nKey = cmpp_strlen(zKey, nKey); - if( !nKey ){ - if( zAt ) *zAt = z; - return 0; - } - char const * const zEnd = z ? z + nKey : NULL; - for( ; z < zEnd; ++z ){ - switch( (0x80 & *z) ? 
0 : *z ){ - case 0: - case '_': - continue; - case '-': - case '.': - case '/': - case ':': - case '=': - case '0': case '1': case '2': case '3': case '4': - case '5': case '6': case '7': case '8': case '9': - if( z==zKey ) break; - continue; - default: - if( isalpha((int)*z) ) continue; - } - if( zAt ) *zAt = z; - return 0; - } - assert( z==zEnd ); - return 1; -} - -void cmpp_affirm_legal_key(char const *zKey, int nKey){ - char const *zAt = 0; - nKey = cmpp_strlen(zKey, nKey); - if( !cmpp_is_legal_key(zKey, nKey, &zAt) ){ - assert( zAt ); - fatal("Illegal character 0x%02x in key [%.*s]\n", - (int)*zAt, nKey, zKey); - } -} - -/* -** sqlite3 UDF which returns true if its argument refers to an -** accessible file, else false. -*/ -static void udf_file_exists( - sqlite3_context *context, - int argc, - sqlite3_value **argv -){ - const char *zName; - (void)(argc); /* Unused parameter */ - zName = (const char*)sqlite3_value_text(argv[0]); - if( zName==0 ) return; - sqlite3_result_int(context, 0==access(zName, 0)); -} - -/** - ** This sqlite3_trace_v2() callback outputs tracing info using - ** g.sqlTrace.pFile. -*/ -static int cmpp__db_sq3TraceV2(unsigned t,void*c,void*p,void*x){ - static unsigned int counter = 0; - switch(t){ - case SQLITE_TRACE_STMT:{ - FILE * const fp = g.sqlTrace.pFile; - if( fp ){ - char const * const zSql = (char const *)x; - char * const zExp = g.sqlTrace.expandSql - ? sqlite3_expanded_sql((sqlite3_stmt*)p) - : 0; - fprintf(fp, "SQL TRACE #%u: %s\n", - ++counter, zExp ? zExp : zSql); - sqlite3_free(zExp); - } - break; - } - } - return 0; -} - -/* Initialize g.db, failing fatally on error. 
*/ -static void cmpp_initdb(void){ - int rc; - char * zErr = 0; - const char * zSchema = - "CREATE TABLE def(" - /* ^^^ defines */ - "k TEXT PRIMARY KEY NOT NULL," - "v TEXT DEFAULT NULL" - ") WITHOUT ROWID;" - "CREATE TABLE incl(" - /* ^^^ files currently being included */ - "file TEXT PRIMARY KEY NOT NULL," - "srcFile TEXT DEFAULT NULL," - "srcLine INTEGER DEFAULT 0" - ") WITHOUT ROWID;" - "CREATE TABLE inclpath(" - /* ^^^ include path */ - "seq INTEGER UNIQUE ON CONFLICT IGNORE, " - "dir TEXT PRIMARY KEY NOT NULL ON CONFLICT IGNORE" - ");" - "BEGIN;" - ; - assert(0==g.db); - if(g.db) return; - rc = sqlite3_open_v2(":memory:", &g.db, SQLITE_OPEN_READWRITE, 0); - if(rc) fatal("Error opening :memory: db."); - sqlite3_trace_v2(g.db, SQLITE_TRACE_STMT, cmpp__db_sq3TraceV2, 0); - rc = sqlite3_exec(g.db, zSchema, 0, 0, &zErr); - if(rc) fatal("Error initializing database: %s", zErr); - rc = sqlite3_create_function(g.db, "fileExists", 1, - SQLITE_UTF8|SQLITE_DIRECTONLY, 0, - udf_file_exists, 0, 0); - db_affirm_rc(rc, "UDF registration failed."); -} - -/* -** For position zPos, which must be in the half-open range -** [zBegin,zEnd), returns g.delim.n if it is at the start of a line and -** starts with g.delim.z, else returns 0. -*/ -//static -unsigned short cmpp_is_delim(unsigned char const *zBegin, - unsigned char const *zEnd, - unsigned char const *zPos){ - assert(zEnd>zBegin); - assert(zPos<zEnd); - assert(zPos>=zBegin); - if(zPos>zBegin && - ('\n'!=*(zPos - 1) - || ((unsigned)(zEnd - zPos) <= g.delim.n))){ - return 0; - }else if(0==memcmp(zPos, g.delim.z, g.delim.n)){ - return g.delim.n; - }else{ - return 0; - } -} - -static void cmpp_t_out_expand(CmppTokenizer * const t, - unsigned char const * zFrom, - unsigned int n); - -static inline int cmpp__isspace(int ch){ - return ' '==ch || '\t'==ch; -} - -static inline unsigned cmpp__strlenu(unsigned char const *z, int n){ - return n<0 ? 
(unsigned)strlen((char const *)z) : (unsigned)n; -} - -static inline void cmpp__skip_space_c( unsigned char const **p, - unsigned char const *zEnd ){ - unsigned char const * z = *p; - while( zzPos,t->zEnd) for a derective delimiter. Emits any - non-delimiter output found along the way. - - This updtes t->zPos and t->lineNo as it goes. - - If a delimiter is found, it updates t->token and returns 0. - On no match returns 0. -*/ -static -int CmppTokenizer__delim_search(CmppTokenizer * const t){ - if(!t->zPos) t->zPos = t->zBegin; - if( t->zPos>=t->zEnd ){ - return 0; - } - assert( (t->zPos==t->zBegin || t->zPos[-1]=='\n') - && "Else we've mismanaged something."); - char const * const zD = g.delim.z; - unsigned short const nD = g.delim.n; - unsigned char const * const zEnd = t->zEnd; - unsigned char const * zLeft = t->zPos; - unsigned char const * z = zLeft; - - assert( 0==*zEnd && "Else we'll misinteract with strcspn()" ); - if( *zEnd ){ - fatal("Input must be NUL-terminated."); - return 0; - } -#define tflush \ - if(z>zEnd) z=zEnd; \ - if( z>zLeft ) { \ - cmpp_t_out_expand(t, zLeft, (unsigned)(z-zLeft)); \ - } zLeft = z - while(z < zEnd){ - size_t nNlTotal = 0; - unsigned char const * zNl; - size_t nNl2 = strcspn((char const *)z, "\n"); - zNl = (z + nNl2 >= zEnd ? zEnd : z + nNl2); - if( nNl2 >= CmppArgs_BufSize /* too long */ - //|| '\n'!=(char)*zNl /* end of input */ - /* ^^^ we have to accept a missing trailing EOL for the - sake of -e scripts. */ - ){ - /* we'd like to error out here, but only if we know we're - reading reading a directive line. */ - ++t->lineNo; - z = zNl + 1; - tflush; - continue; - } - nNlTotal += nNl2; - assert( '\n'==*zNl || !*zNl ); - assert( '\n'==*zNl || zNl==zEnd ); - //g_stderr("input: zNl=%d z=<<<%.*s>>>", (int)*zNl, (zNl-z), z); - unsigned char const * const zBOL = z; - cmpp__skip_space_c(&z, zNl); - if( z+nD < zNl && 0==memcmp(z, zD, nD) ){ - /* Found a directive delimiter. 
*/ - if( zBOL!=z ){ - /* Do not emit space from the same line which preceeds a - delimiter */ - zLeft = z; - } - while( zNl>z && zNllineNo; - ++zNl; - nNl2 = strcspn((char const *)zNl, "\n"); - if( !nNl2 ) break; - nNlTotal += nNl2; - zNl += nNl2; - } - assert( zNl<=zEnd && "Else our input was not NUL-terminated"); - if( nNlTotal >= CmppArgs_BufSize ){ - fatal("Directive line is too long (%u)", - (unsigned)(zNl-z)); - break; - } - tflush; - t->token.zBegin = z + nD; - t->token.zEnd = zNl; - cmpp__skip_space_c(&t->token.zBegin, t->token.zEnd); - t->token.ttype = TT_Line; - t->token.lineNo = t->lineNo++; - t->zPos = t->token.zEnd + 1; - if( 0 ){ - g_stderr("token=<<%.*s>>", (t->token.zEnd - t->token.zBegin), - t->token.zBegin); - } - return 1; - } - z = zNl+1; - ++t->lineNo; - tflush; - //g_stderr("line #%d no match\n",(int)t->lineNo); - } - tflush; - t->zPos = z; - return 0; -#undef tflush -} - -void cmpp_kvp_parse(cmpp_kvp * p, char const *zKey, int nKey, - cmpp_kvp_op_e opPolicy){ - char chEq = 0; - char opLen = 0; - *p = cmpp_kvp_empty; - p->k.z = zKey; - p->k.n = cmpp_strlen(zKey, nKey); - switch( opPolicy ){ - case cmpp_kvp_op_none: break; - case cmpp_kvp_op_eq1: - chEq = '='; - opLen = 1; - break; - default: - assert(!"don't use these yet"); - /* todo: ==, !=, <=, <, >, >= */ - chEq = '='; - opLen = 1; - break; - } - assert( chEq ); - p->op = cmpp_kvp_op_none; - const char * const zEnd = p->k.z + p->k.n; - for(const char * zPos = p->k.z ; *zPos && zPosop = cmpp_kvp_op_eq1; - p->k.n = (unsigned)(zPos - zKey); - zPos += opLen; - assert( zPos <= zEnd ); - p->v.z = zPos; - p->v.n = (unsigned)(zEnd - zPos); - break; - } - } - cmpp_affirm_legal_key(p->k.z, p->k.n); -} - -static void cmpp_t_out_expand(CmppTokenizer * const t, - unsigned char const * zFrom, - unsigned int n){ - unsigned char const *zLeft = zFrom; - unsigned char const * const zEnd = zFrom + n; - unsigned char const *z = AT_OFF==g.flags.atPolicy ? 
zEnd : zLeft; - unsigned char const chEol = (unsigned char)'\n'; - int state = 0 /* 0==looking for opening @ - ** 1==looking for closing @ */; - if( 0 ){ - g_warn("zLeft=%d %c", (int)*zLeft, *zLeft); - } -#define tflush \ - if(z>zEnd) z=zEnd; \ - if(zLefttoken; - - assert(t->zBegin); - assert(t->zEnd > t->zBegin); - if(!t->zPos) t->zPos = t->zBegin; - t->args.pKw = 0; - t->args.argc = 0; - *tok = CmppToken_empty; - if( !CmppTokenizer__delim_search(t) ){ - return 0; - } - /* Split t->token into arguments for the line's keyword */ - int i, argc = 0, prevChar = 0; - const unsigned tokLen = (unsigned)(tok->zEnd - tok->zBegin); - unsigned char * zKwd; - unsigned char * zEsc; - unsigned char * zz; - - assert(TT_Line==tok->ttype); - g_debug(2,("token @ line %u len=%u [[[%.*s]]]\n", - tok->lineNo, tokLen, tokLen, tok->zBegin)); - zKwd = &t->args.lineBuf[0]; - memcpy(zKwd, tok->zBegin, tokLen); - memset(zKwd + tokLen, 0, sizeof(t->args.lineBuf) - tokLen); - for( zEsc = 0, zz = zKwd; *zz; ++zz ){ - /* Convert backslash-escaped newlines to whitespace */ - switch((int)*zz){ - case (int)'\\': - if(zEsc) zEsc = 0; - else zEsc = zz; - break; - case (int)'\n': - assert(zEsc && "Should not have an unescaped newline?"); - if(zEsc==zz-1){ - *zEsc = (unsigned char)' '; - /* FIXME?: memmove() lnBuf content one byte to the left here - ** to collapse backslash and newline into a single - ** byte. Also consider collapsing all leading space on the - ** next line. (Much later: or just collapse the output as we go, - ** effectively shrinking the line.) 
*/ - } - zEsc = 0; - *zz = (unsigned char)' '; - break; - default: - zEsc = 0; - break; - } - } - t->args.argv[argc++] = zKwd; - for( zz = zKwd; *zz; ++zz ){ - if(isspace(*zz)){ - *zz = 0; - break; - } - } - t->args.pKw = CmppKeyword_search((char const *)zKwd); - if(!t->args.pKw){ - fatal("Unknown keyword '%s' at line %u\n", (char const *)zKwd, - tok->lineNo); - } - for( ++zz ; *zz && isspace(*zz); ++zz ){} - if(t->args.pKw->bTokenize){ - for( ; *zz; prevChar = *zz, ++zz ){ - /* Split string into word-shaped tokens. - ** TODO ?= quoted strings, for the sake of the - ** #error keyword. */ - if(isspace(*zz)){ - assert(zz!=zKwd && "Leading space was stripped earlier."); - *zz = 0; - }else{ - if(argc == (int)CmppArgs_Max){ - fatal("Too many arguments @ line %u: %.*s", - tok->lineNo, tokLen, tok->zBegin); - }else if(zz>zKwd && !prevChar){ - t->args.argv[argc++] = zz; - } - } - } - }else{ - /* Treat rest of line as one token */ - if(*zz) t->args.argv[argc++] = zz; - } - tok->ttype = t->args.pKw->ttype; - if(g.flags.doDebug>1){ - for(i = 0; i < argc; ++i){ - g_debug(0,("line %u arg #%d=%s\n", - tok->lineNo, i, - (char const *)t->args.argv[i])); - } - } - t->args.argc = argc; - return 1; -} - -/* Internal error reporting helper for cmpp_keyword_f() impls. */ -static CMPP_NORETURN void cmpp_kwd__err_(char const *zFile, int line, - CmppKeyword const * pKw, - CmppTokenizer const *t, - char const *zFmt, ...){ - va_list va; - g_stderr("%s @ %s line %u:", - pKw->zName, t->zName, t->token.lineNo); - va_start(va, zFmt); - g.tok = 0 /* stop fatalv__base() from duplicating the file info */; - fatalv__base(zFile, line, zFmt, va); - /* not reached */ - va_end(va); -} -#define cmpp_kwd__err(...) cmpp_kwd__err_(__FILE__,__LINE__, __VA_ARGS__) -#define cmpp_t__err(T,...) cmpp_kwd__err_(__FILE__,__LINE__, (T)->args.pKw, (T), __VA_ARGS__) - -/* No-op cmpp_keyword_f() impl. 
*/ -static void cmpp_kwd_noop(CmppKeyword const * pKw, CmppTokenizer *t){ - (void)pKw; - (void)t; -} - -/* #error impl. */ -static void cmpp_kwd_error(CmppKeyword const * pKw, CmppTokenizer *t){ - if(CT_skip(t)) return; - else{ - assert(t->args.argc < 3); - const char *zBegin = t->args.argc>1 - ? (const char *)t->args.argv[1] : 0; - cmpp_t__err(t, "%s", zBegin ? zBegin : "(no additional info)"); - } -} - -/* Impl. for #define, #undef */ -static void cmpp_kwd_define(CmppKeyword const * pKw, CmppTokenizer *t){ - if(CT_skip(t)) return; - if(t->args.argc<2){ - cmpp_kwd__err(pKw, t, "Expecting one or more arguments"); - }else{ - int i = 1; - for( ; i < t->args.argc; ++i){ - char const * const zArg = (char const *)t->args.argv[i]; - cmpp_affirm_legal_key(zArg, -1); - if( TT_Define==pKw->ttype ){ - db_define_add( zArg, NULL ); - }else{ - db_define_rm( zArg ); - } - } - } -} - -static int cmpp_val_matches(char const *zGlob, char const *zRhs){ - return 0==sqlite3_strglob(zGlob, zRhs); -} - -typedef int (*cmpp_vcmp_f)(char const *zLhs, char const *zRhs); - -/* -** Accepts a key in the form X or X=Y. In the former case, it uses -** db_define_get_bool(kvp->k) to determine its truthiness, else it -** compares the kvp->v part to kvp->k's defined value to determine -** truthiness. -** -** Unless... -** -** If bCheckDefined is true is true then (A) it returns true if the -** value is defined and (B) fails fatally if given an X=Y-format key. -** -** Returns true if zKey evals to true, else false. 
-*/ -//static -int cmpp_kvp_truth(CmppKeyword const * const pKw, - CmppTokenizer const * const t, - cmpp_kvp const * const kvp, - int bCheckDefined){ - int buul = 0; - if( kvp->v.z ){ - if( bCheckDefined ){ - cmpp_kwd__err(pKw, t, "Value part is not legal for " - "is-defined checks: %.s", - kvp->k.n, kvp->k.z); - } - char * zVal = 0; - unsigned int nVal = 0; - buul = db_define_get(kvp->k.z, (int)kvp->k.n, &zVal, &nVal); - //g_debug(0,("checking key[%.*s]=%.*s\n", (zEq-zKey), zKey, nVal, zVal)); - if( kvp->v.n && nVal ){ - /* FIXME? do this with a query */ - /*g_debug(0,("if get-define [%.*s]=[%.*s] zValPart=%s\n", - (zEq-zKey), zKey, - nVal, zVal, zValPart));*/ - buul = cmpp_val_matches(kvp->v.z, zVal); - //g_debug(0,("buul=%d\n", buul)); - }else{ - assert( 0==kvp->v.n || 0==nVal ); - buul = kvp->v.n == nVal; - } - db_free(zVal); - }else{ - if( bCheckDefined ){ - buul = db_define_has(kvp->k.z, kvp->k.n); - }else{ - buul = db_define_get_bool(kvp->k.z, kvp->k.n); - } - } - return buul; -} - -#if 0 -/* -** A thin proxy for cmpp_kvp_truth(). -*/ -static int cmpp_key_truth(CmppKeyword const * pKw, - CmppTokenizer const * t, - char const *zKey, int bCheckDefined){ - cmpp_kvp kvp = cmpp_kvp_empty; - cmpp_kvp_parse(&kvp, zKey, -1, cmpp_kvp_op_eq1); - return cmpp_kvp_truth(pKw, t, &kvp, bCheckDefined); -} -#endif - -//static -cmpp_kvp_op_e cmpp_t_is_op(CmppTokenizer const * t, int arg){ - if( t->args.argc > arg ){ - char const * const z = (char const *)t->args.argv[arg]; -#define E(N,S) if( strcmp(S,z) ) return cmpp_kvp_op_ ## N; else - cmpp_kvp_op_map(E) -#undef E - if(0) {} - } - return cmpp_kvp_op_none; -} - -/* -** A single part of an #if-type expression. They are parsed from -** CmppTokenizer::args in this form: -** -** not* defined{0,1} key[=[value]] -*/ -struct CmppExprDef { - /* The key part of the input. */ - cmpp_kvp kvp; - struct { - int ndx; - int next; - } arg; - CmppTokenizer const * tizer; - /* Set to 0 or 1 depending how many "not" are parsed. 
*/ - unsigned char bNegated; - /* Set to 1 if "defined" is parsed. */ - unsigned char bCheckDefined; -}; -typedef struct CmppExprDef CmppExprDef; -#define CmppExprDef_empty_m {cmpp_kvp_empty_m,{0,0},0,0,0} -static const CmppExprDef CmppExprDef_empty = CmppExprDef_empty_m; - -/* -** Evaluate cep to true or false and return that value: -** -** If cep->bCheckDefined, return the result of db_define_has(). -** -** Else if cep->kvp.v.z is not NULL then fetch the define's value -** and return the result of cmpp_val_matches(cep->kvp.v.z,thatValue). -** -** Else return the result of db_define_get_bool(). -** -** The returned result accounts for cep->bNegated. -*/ -static int CmppExprDef_eval(CmppExprDef const * cep){ - int buul = 0; - - if( cep->bCheckDefined ){ - assert( !cep->kvp.v.n ); - buul = db_define_has(cep->kvp.k.z, (int)cep->kvp.k.n); - }else if( cep->kvp.v.z ){ - unsigned nVal = 0; - char * zVal = 0; - buul = db_define_get(cep->kvp.k.z, cep->kvp.k.n, &zVal, &nVal); - if( nVal ){ - buul = cmpp_val_matches(cep->kvp.v.z, zVal); - } - db_free(zVal); - }else{ - buul = db_define_get_bool(cep->kvp.k.z, cep->kvp.k.n); - } - return cep->bNegated ? !buul : buul; -} - -/* -** Expects t->args, starting at t->args.argv[startArg], to parse to -** one CmmpExprDef. It clears cep and repopulates it with info about -** the parse. Fails fatally on a parse error. -** -** Returns true if it reads one, false if it doesn't, and fails fatally -** if what it tries to parse is not empty but is not a CmppExprDef. -** -** Specifically, it parses: -** -** not+ defined? 
Word[=value] -** -*/ -static int CmppExprDef_read_one(CmppKeyword const * pKw, - CmppTokenizer const * t, - int startArg, CmppExprDef * cep){ - char const *zKey = 0; - *cep = CmppExprDef_empty; - cep->arg.ndx = startArg; - assert( t->args.pKw ); - assert( t->args.pKw==pKw ); - cep->tizer = t; - for(int i = startArg; !zKey && iargs.argc; ++i ){ - char const * z = (char const *)t->args.argv[i]; - if( 0==strcmp(z, "not") ){ - cep->bNegated = !cep->bNegated; - }else if( 0==strcmp(z,"defined") ){ - if( cep->bCheckDefined ){ - cmpp_kwd__err(pKw, t, - "Cannot use 'defined' more than once"); - } - cep->bCheckDefined = 1; - }else{ - assert( !zKey ); - cmpp_kvp_parse(&cep->kvp, z, -1, cmpp_kvp_op_eq1); - if( cep->bCheckDefined && cep->kvp.v.z ){ - cmpp_kwd__err(pKw, t, "Cannot use X=Y keys with 'defined'"); - cep->arg.next = ++i; - } - return 1; - } - } - return 0; -} - -/* -** Evals pStart and then proceeds to process any remaining arguments -** in t->args as RHS expressions. Returns the result of the expression -** as a bool. -** -** Specifically, it parses: -** -** and|or CmppExprDef -** -** Where CmppExprDef is the result of CmppExprDef_read_one(). 
-*/ -static int CmppExprDef_parse_cond(CmppKeyword const *pKw, - CmppTokenizer *t, - CmppExprDef const * pStart){ - enum { Op_none = 0, Op_And, Op_Or }; - int lhs = CmppExprDef_eval(pStart); - int op = Op_none; - int i = pStart->arg.next; - for( ; i < t->args.argc; ++i ){ - CmppExprDef eNext = CmppExprDef_empty; - char const *z = (char const *)t->args.argv[i]; - if( 0==strcmp("and",z) ){ - if( Op_none!=op ) goto multiple_ops; - op = Op_And; - continue; - }else if( 0==strcmp("or",z) ){ - if( Op_none!=op ) goto multiple_ops; - op = Op_Or; - continue; - }else if( !CmppExprDef_read_one(pKw, t, i, &eNext) ){ - if( Op_none!=op ){ - cmpp_t__err(t, "Stray operator: %s",z); - } - } - assert( eNext.kvp.k.z ); - int const rhs = CmppExprDef_eval(&eNext); - switch( op ){ - case Op_none: break; - case Op_And: lhs = lhs && rhs; break; - case Op_Or: lhs = lhs || rhs; break; - default: - assert(!"cannot happen"); - fatal("this cannot happen"); - } - op = Op_none; - } - if( Op_none!=op ){ - cmpp_t__err(t, "Extra operator at end of expression"); - }else if( i < t->args.argc ){ - assert(!"cannot happen"); - cmpp_kwd__err(t->args.pKw, t, "Unhandled extra arguments"); - }else{ - return lhs; - } - assert(!"not reached"); -multiple_ops: - cmpp_t__err(t,"Cannot have multiple operators"); - return 0 /* not reached */; -} - -/* Impl. for #if, #elif, #assert. 
*/ -static void cmpp_kwd_if(CmppKeyword const * pKw, CmppTokenizer *t){ - CmppParseState tmpState = TS_Start; - CmppExprDef cep = CmppExprDef_empty; - //int buul = 0; - assert( TT_If==pKw->ttype - || TT_Elif==pKw->ttype - || TT_Assert==pKw->ttype); - if(t->args.argc<2){ - cmpp_kwd__err(pKw, t, "Expecting an argument"); - } - CmppExprDef_read_one(pKw, t, 1, &cep); - if( !cep.kvp.k.z ){ - cmpp_kwd__err(pKw, t, "Missing key argument"); - } - /*g_debug(0,("%s %s level %u pstate=%d bNot=%d bCheckDefined=%d\n", - pKw->zName, zKey, t->level.ndx, (int)CT_pstate(t), - bNot, bCheckDefined));*/ - switch(pKw->ttype){ - case TT_Assert: - break; - case TT_Elif: - switch(CT_pstate(t)){ - case TS_If: break; - case TS_IfPassed: CT_level(t).flags |= CmppLevel_F_ELIDE; return; - default: - cmpp_kwd__err(pKw, t, "'%s' used out of context", - pKw->zName); - } - break; - case TT_If: - CmppLevel_push(t); - break; - default: - assert(!"cannot happen"); - cmpp_kwd__err(pKw, t, "Unexpected keyword token type"); - break; - } - if( CmppExprDef_parse_cond( pKw, t, &cep ) ){ - CT_pstate(t) = tmpState = TS_IfPassed; - CT_skipLevel(t) = 0; - }else{ - if( TT_Assert==pKw->ttype ){ - cmpp_kwd__err(pKw, t, "Assertion failed: %s", - /* fixme: emit the whole line. We don't have it - handy in a readily-printable form. 
*/ - cep.kvp.k.z); - } - CT_pstate(t) = TS_If /* also for TT_Elif */; - CT_skipLevel(t) = 1; - g_debug(3,("setting CT_skipLevel = 1 @ level %d\n", t->level.ndx)); - } - if( TT_If==pKw->ttype ){ - unsigned const lvlIf = t->level.ndx; - CmppToken const lvlToken = CT_level(t).token; - while(cmpp_next_keyword_line(t)){ - cmpp_process_keyword(t); - if(lvlIf > t->level.ndx){ - assert(TT_Endif == t->token.ttype); - break; - } -#if 0 - if(TS_IfPassed==tmpState){ - tmpState = TS_Start; - t->level.stack[lvlIf].flags |= CmppLevel_F_ELIDE; - g_debug(1,("Setting ELIDE for TS_IfPassed @ lv %d (lvlIf=%d)\n", t->level.ndx, lvlIf)); - } -#endif - } - if(lvlIf <= t->level.ndx){ - cmpp_kwd__err(pKw, t, - "Input ended inside an unterminated %sif " - "opened at [%s] line %u", - g.delim.z, t->zName, lvlToken.lineNo); - } - } -} - -/* Impl. for #else. */ -static void cmpp_kwd_else(CmppKeyword const * pKw, CmppTokenizer *t){ - if(t->args.argc>1){ - cmpp_kwd__err(pKw, t, "Expecting no arguments"); - } - switch(CT_pstate(t)){ - case TS_IfPassed: CT_skipLevel(t) = 1; break; - case TS_If: CT_skipLevel(t) = 0; break; - default: - cmpp_kwd__err(pKw, t, "'%s' with no matching 'if'", - pKw->zName); - } - /*g_debug(0,("else flags=0x%02x skipLevel=%u\n", - CT_level(t).flags, CT_level(t).skipLevel));*/ - CT_pstate(t) = TS_Else; -} - -/* Impl. for #endif. */ -static void cmpp_kwd_endif(CmppKeyword const * pKw, CmppTokenizer *t){ - /* Maintenance reminder: we ignore all arguments after the endif - ** to allow for constructs like: - ** - ** #endif // foo - ** - ** in a manner which does not require a specific comment style */ - switch(CT_pstate(t)){ - case TS_Else: - case TS_If: - case TS_IfPassed: - break; - default: - cmpp_kwd__err(pKw, t, "'%s' with no matching 'if'", - pKw->zName); - } - CmppLevel_pop(t); -} - -/* Impl. for #include. 
*/ -static void cmpp_kwd_include(CmppKeyword const * pKw, CmppTokenizer *t){ - char const * zFile; - char * zResolved; - if(CT_skip(t)) return; - else if(t->args.argc!=2){ - cmpp_kwd__err(pKw, t, "Expecting exactly 1 filename argument"); - } - zFile = (const char *)t->args.argv[1]; - if(db_including_has(zFile)){ - /* Note that different spellings of the same filename - ** will elude this check, but that seems okay, as different - ** spellings means that we're not re-running the exact same - ** invocation. We might want some other form of multi-include - ** protection, rather than this, however. There may well be - ** sensible uses for recursion. */ - cmpp_t__err(t, "Recursive include of file: %s", zFile); - } - zResolved = db_include_search(zFile); - if(zResolved){ - db_including_add(zFile, t->zName, t->token.lineNo); - cmpp_process_file(zResolved); - db_include_rm(zFile); - db_free(zResolved); - }else{ - cmpp_t__err(t, "file not found: %s", zFile); - } -} - - -static void cmpp_dump_defines( FILE * fp, int bIndent ){ - sqlite3_stmt * const q = g_stmt(GStmt_defSelAll); - while( SQLITE_ROW==sqlite3_step(q) ){ - unsigned char const * zK = sqlite3_column_text(q, 0); - unsigned char const * zV = sqlite3_column_text(q, 1); - int const nK = sqlite3_column_bytes(q, 0); - int const nV = sqlite3_column_bytes(q, 1); - fprintf(fp, "%s%.*s = %.*s\n", - bIndent ? "\t" : "", nK, zK, nV, zV); - } - g_stmt_reset(q); -} - -/* Impl. for #pragma. 
*/ -static void cmpp_kwd_pragma(CmppKeyword const * pKw, CmppTokenizer *t){ - const char * zArg; - if(CT_skip(t)) return; - else if(t->args.argc<2){ - cmpp_kwd__err(pKw, t, "Expecting an argument"); - } - zArg = (const char *)t->args.argv[1]; -#define M(X) 0==strcmp(zArg,X) - if(M("defines")){ - cmpp_dump_defines(stderr, 1); - } - else if(M("chomp-F")){ - g.flags.chompF = 1; - }else if(M("no-chomp-F")){ - g.flags.chompF = 0; - } -#if 0 - /* now done by cmpp_kwd_at_policy() */ - else if(M("@")){ - if(t->args.argc>2){ - g.flags.atPolicy = - AtPolicy_fromStr((char const *)t->args.argv[2], 1); - }else{ - g.flags.atPolicy = AT_DEFAULT; - } - }else if(M("no-@")){ - g.flags.atPolicy = AT_OFF; - } -#endif - else{ - cmpp_kwd__err(pKw, t, "Unknown pragma: %s", zArg); - } -#undef M -} - -static void db_step_reset(sqlite3_stmt * const q, char const * zErrTip){ - db_affirm_rc(sqlite3_step(q), zErrTip); - g_stmt_reset(q); -} - -static void cmpp_sp_begin(CmppTokenizer * const t){ - db_step_reset(g_stmt(GStmt_spBegin), "Starting savepoint"); - ++t->nSavepoint; -} - -static void cmpp_sp_rollback(CmppTokenizer * const t){ - if( !t->nSavepoint ){ - cmpp_t__err(t, "Cannot roll back: no active savepoint"); - } - db_step_reset(g_stmt(GStmt_spRollback), - "Rolling back savepoint"); - db_step_reset(g_stmt(GStmt_spRelease), - "Releasing rolled-back savepoint"); - --t->nSavepoint; -} - -static void cmpp_sp_commit(CmppTokenizer * const t){ - if( !t->nSavepoint ){ - cmpp_t__err(t, "Cannot commit: no active savepoint"); - } - db_step_reset(g_stmt(GStmt_spRelease), "Rolling back savepoint"); - --t->nSavepoint; -} - -void CmppTokenizer_cleanup(CmppTokenizer * const t){ - while( t->nSavepoint ){ - cmpp_sp_rollback(t); - } -} - -/* Impl. for #savepoint. 
*/ -static void cmpp_kwd_savepoint(CmppKeyword const * pKw, CmppTokenizer *t){ - const char * zArg; - if(CT_skip(t)) return; - else if(t->args.argc!=2){ - cmpp_kwd__err(pKw, t, "Expecting one argument"); - } - zArg = (const char *)t->args.argv[1]; -#define M(X) 0==strcmp(zArg,X) - if(M("begin")){ - cmpp_sp_begin(t); - }else if(M("rollback")){ - cmpp_sp_rollback(t); - }else if(M("commit")){ - cmpp_sp_commit(t); - }else{ - cmpp_kwd__err(pKw, t, "Unknown savepoint option: %s", zArg); - } -#undef SP_NAME -#undef M -} - -/* #stder impl. */ -static void cmpp_kwd_stderr(CmppKeyword const * pKw, CmppTokenizer *t){ - if(CT_skip(t)) return; - else{ - const char *zBegin = t->args.argc>1 - ? (const char *)t->args.argv[1] : 0; - if(zBegin){ - g_stderr("%s:%u: %s\n", t->zName, t->token.lineNo, zBegin); - }else{ - g_stderr("%s:%u: (no %.*s%s argument)\n", - t->zName, t->token.lineNo, - g.delim.n, g.delim.z, pKw->zName); - } - } -} - -/* Impl. for the @ policy. */ -static void cmpp_kwd_at_policy(CmppKeyword const * pKw, CmppTokenizer *t){ - if(CT_skip(t)) return; - else if(t->args.argc<2){ - g.flags.atPolicy = AT_DEFAULT; - }else{ - g.flags.atPolicy = AtPolicy_fromStr((char const*)t->args.argv[1], 1); - } -} - - -#if 0 -/* Impl. for dummy placeholder. 
*/ -static void cmpp_kwd_todo(CmppKeyword const * pKw, CmppTokenizer *t){ - (void)t; - g_debug(0,("TODO: keyword handler for %s\n", pKw->zName)); -} -#endif - -CmppKeyword aKeywords[] = { -/* Keep these sorted by zName */ -#define S(NAME) DStrings.NAME.z, DStrings.NAME.n - {S(Comment), 0, TT_Comment, cmpp_kwd_noop}, - {S(AtPolicy), 1, TT_AtPolicy, cmpp_kwd_at_policy}, - {S(Assert),1, TT_Assert, cmpp_kwd_if}, - {S(Define), 1, TT_Define, cmpp_kwd_define}, - {S(Elif), 1, TT_Elif, cmpp_kwd_if}, - {S(Else), 1, TT_Else, cmpp_kwd_else}, - {S(Endif), 0, TT_Endif, cmpp_kwd_endif}, - {S(Error), 0, TT_Error, cmpp_kwd_error}, - {S(If), 1, TT_If, cmpp_kwd_if}, - {S(Include), 0, TT_Include, cmpp_kwd_include}, - {S(Pragma), 1, TT_Pragma, cmpp_kwd_pragma}, - {S(Savepoint), 1, TT_Savepoint, cmpp_kwd_savepoint}, - {S(Stderr), 0, TT_Stderr, cmpp_kwd_stderr}, - {S(Undef), 1, TT_Undef, cmpp_kwd_define}, -#undef S - {0,0,TT_Invalid, 0} -}; - -static int cmpp_CmppKeyword(const void *p1, const void *p2){ - char const * zName = (const char *)p1; - CmppKeyword const * kw = (CmppKeyword const *)p2; - return strcmp(zName, kw->zName); -} - -CmppKeyword const * CmppKeyword_search(const char *zName){ - return (CmppKeyword const *)bsearch(zName, &aKeywords[0], - sizeof(aKeywords)/sizeof(aKeywords[0]) - 1, - sizeof(aKeywords[0]), - cmpp_CmppKeyword); -} - -void cmpp_process_keyword(CmppTokenizer * const t){ - assert(t->args.pKw); - assert(t->args.argc); - t->args.pKw->xCall(t->args.pKw, t); - t->args.pKw = 0; - t->args.argc = 0; -} - -void cmpp_process_string(const char * zName, - unsigned char const * zIn, - int nIn){ - nIn = cmpp__strlenu(zIn, nIn); - if( !nIn ) return; - CmppTokenizer const * const oldTok = g.tok; - CmppTokenizer ct = CmppTokenizer_empty; - ct.zName = zName; - ct.zBegin = zIn; - ct.zEnd = zIn + nIn; - while(cmpp_next_keyword_line(&ct)){ - cmpp_process_keyword(&ct); - } - if(0!=ct.level.ndx){ - CmppLevel const * const lv = CmppLevel_get(&ct); - fatal("Input ended inside an 
unterminated nested construct " - "opened at [%s] line %u", zName, lv->token.lineNo); - } - CmppTokenizer_cleanup(&ct); - g.tok = oldTok; -} - -void cmpp_process_file(const char * zName){ - FileWrapper fw = FileWrapper_empty; - FileWrapper_open(&fw, zName, "r"); - g_FileWrapper_link(&fw); - FileWrapper_slurp(&fw); - g_debug(1,("Read %u byte(s) from [%s]\n", fw.nContent, fw.zName)); - if( fw.zContent ){ - cmpp_process_string(zName, fw.zContent, fw.nContent); - } - g_FileWrapper_close(&fw); -} - - -void fatalv__base(char const *zFile, int line, - char const *zFmt, va_list va){ - FILE * const fp = stderr; - fflush(stdout); - fputc('\n', fp); - if( g.flags.doDebug ){ - fprintf(fp, "%s: ", g.zArgv0); - if( zFile ){ - fprintf(fp, "%s:%d ",zFile, line); - } - } - if( g.tok ){ - fprintf(fp,"@%s:%d: ", - (g.tok->zName && 0==strcmp("-",g.tok->zName)) - ? "" - : g.tok->zName, - g.tok->lineNo); - } - if(zFmt && *zFmt){ - vfprintf(fp, zFmt, va); - } - fputc('\n', fp); - fflush(fp); - exit(1); -} - -void fatal__base(char const *zFile, int line, - char const *zFmt, ...){ - va_list va; - va_start(va, zFmt); - fatalv__base(zFile, line, zFmt, va); - va_end(va); -} - -#undef CT_level -#undef CT_pstate -#undef CT_skipLevel -#undef CT_skip -#undef CLvl_skip - -static void usage(int isErr){ - FILE * const fOut = isErr ? 
stderr : stdout; - fprintf(fOut, "Usage: %s [flags] [infile...]\n", g.zArgv0); - fprintf(fOut, - "Flags and filenames may be in any order and " - "they are processed in that order.\n" - "\nFlags:\n"); -#define GAP " " -#define arg(F,D) fprintf(fOut,"\n %s\n" GAP "%s\n",F, D) - arg("-o|--outfile FILE","Send output to FILE (default=- (stdout)).\n" - GAP "Because arguments are processed in order, this should\n" - GAP "normally be given before -f."); - arg("-f|--file FILE","Process FILE (default=- (stdin)).\n" - GAP "All non-flag arguments are assumed to be the input files."); - arg("-DXYZ[=value]","Define XYZ to the given value (default=1)."); - arg("-UXYZ","Undefine all defines matching glob XYZ."); - arg("-IXYZ","Add dir XYZ to the " CMPP_DEFAULT_DELIM "include path."); - arg("-FXYZ=filename", - "Define XYZ to the raw contents of the given file.\n" - GAP "The file is not processed as by " CMPP_DEFAULT_DELIM"include\n" - GAP "Maybe it should be. Or maybe we need a new flag for that."); - arg("-d|--delimiter VALUE", "Set keyword delimiter to VALUE " - "(default=" CMPP_DEFAULT_DELIM ")."); - arg("--@policy retain|elide|error|off", - "Specifies how to handle @tokens@ (default=off).\n" - GAP "off = do not look for @tokens@\n" - GAP "retain = parse @tokens@ and retain any undefined ones\n" - GAP "elide = parse @tokens@ and elide any undefined ones\n" - GAP "error = parse @tokens@ and error out for any undefined ones" - ); - arg("-@", "Equivalent to --@policy=error."); - arg("-no-@", "Equivalent to --@policy=off (the default)."); - arg("--sql-trace", "Send a trace of all SQL to stderr."); - arg("--sql-trace-x", - "Like --sql-trace but expand all bound values in the SQL."); - arg("--no-sql-trace", "Disable SQL tracing (default)."); - arg("--chomp-F", "One trailing newline is trimmed from files " - "read via -FXYZ=filename."); - arg("--no-chomp-F", "Disable --chomp-F (default)."); -#undef arg -#undef GAP - fputs("\nFlags which require a value accept either " - "--flag=value 
or --flag value.\n\n",fOut); -} - -/* -** Expects that *ndx points to the current argv entry and that it is a -** flag which expects a value. This function checks for --flag=val and -** (--flag val) forms. If a value is found then *ndx is adjusted (if -** needed) to point to the next argument after the value and *zVal is -** pointed to the value. If no value is found then it fails fatally. -*/ -static void get_flag_val(int argc, char const * const * argv, int * ndx, - char const **zVal){ - char const * zEq = strchr(argv[*ndx], '='); - if( zEq ){ - *zVal = zEq+1; - return; - } - if(*ndx+1>=argc){ - fatal("Missing value for flag '%s'", argv[*ndx]); - } - *zVal = argv[++*ndx]; -} - -static int arg_is_flag( char const *zFlag, char const *zArg, - char const **zValIfEqX ){ - *zValIfEqX = 0; - if( 0==strcmp(zFlag, zArg) ) return 1; - char const * z = strchr(zArg,'='); - if( z && z>zArg ){ - /* compare the part before the '=' */ - if( 0==strncmp(zFlag, zArg, z-zArg) ){ - if( !zFlag[z-zArg] ){ - *zValIfEqX = z+1; - return 1; - } - /* Else it was a prefix match. */ - } - } - return 0; -} - -int main(int argc, char const * const * argv){ - int rc = 0; - int inclCount = 0; - int nFile = 0; - int ndxTrace = 0; - int expandMode = 0; - char const * zVal = 0; -#define ARGVAL if( !zVal ) get_flag_val(argc, argv, &i, &zVal) -#define M(X) arg_is_flag(X, zArg, &zVal) -#define ISFLAG(X) else if(M(X)) -#define ISFLAG2(X,Y) else if(M(X) || M(Y)) -#define NOVAL if( zVal ) fatal("Unexpected value for %s", zArg) -#define g_out_open \ - if(!g.out.pFile) FileWrapper_open(&g.out, "-", "w"); \ - if(!inclCount){ db_include_dir_add("."); ++inclCount; } (void)0 - - g.zArgv0 = argv[0]; -#define DOIT if(doIt) - for(int doIt = 0; doIt<2; ++doIt){ - /** - Loop through the flags twice. The first time we just validate - and look for --help/-?. The second time we process the flags. 
- This approach allows us to easily chain multiple files and - flags: - - ./c-pp -Dfoo -o foo x.y -Ufoo -Dbar -o bar x.y - */ - DOIT{ - atexit(cmpp_atexit); - if( 1==ndxTrace ){ - /* Ensure that we start with tracing in the early stage if - --sql-trace is the first arg, in order to log schema - setup. */ - g.sqlTrace.pFile = stderr; - g.sqlTrace.expandSql = expandMode; - } - cmpp_initdb(); - } - for(int i = 1; i < argc; ++i){ - int negate = 0; - char const * zArg = argv[i]; - //g_stderr("i=%d zArg=%s\n", i, zArg); - zVal = 0; - while('-'==*zArg) ++zArg; - if(zArg==argv[i]/*not a flag*/){ - zVal = zArg; - goto do_infile; - } - if( 0==strncmp(zArg,"no-",3) ){ - zArg += 3; - negate = 1; - } - ISFLAG2("?","help"){ - NOVAL; - usage(0); - goto end; - }else if('D'==*zArg){ - ++zArg; - if(!*zArg) fatal("Missing key for -D"); - DOIT { - db_define_add(zArg, 0); - } - }else if('F'==*zArg){ - ++zArg; - if(!*zArg) fatal("Missing key for -F"); - DOIT { - db_define_add_file(zArg); - } - }else if('U'==*zArg){ - ++zArg; - if(!*zArg) fatal("Missing key for -U"); - DOIT { - db_define_rm(zArg); - } - }else if('I'==*zArg){ - ++zArg; - if(!*zArg) fatal("Missing directory for -I"); - DOIT { - db_include_dir_add(zArg); - ++inclCount; - } - } - ISFLAG2("o","outfile"){ - ARGVAL; - DOIT { - FileWrapper_open(&g.out, zVal, "w"); - } - } - ISFLAG2("f","file"){ - ARGVAL; - do_infile: - DOIT { - ++nFile; - g_out_open; - cmpp_process_file(zVal); - } - } - ISFLAG("e"){ - ARGVAL; - DOIT { - ++nFile; - g_out_open; - cmpp_process_string("-e script", ustr_c(zVal), -1); - } - } - ISFLAG("@"){ - NOVAL; - DOIT { - assert( AT_DEFAULT!=AT_OFF ); - g.flags.atPolicy = negate ? AT_OFF : AT_DEFAULT; - } - } - ISFLAG("@policy"){ - AtPolicy aup; - ARGVAL; - aup = AtPolicy_fromStr(zVal, 1); - DOIT { - g.flags.atPolicy = aup; - } - } - ISFLAG("debug"){ - NOVAL; - g.flags.doDebug += negate ? -1 : 1; - } - ISFLAG("sql-trace"){ - NOVAL; - /* Needs to be set before the start of the second pass, when - the db is inited. 
*/ - g.sqlTrace.expandSql = 0; - DOIT { - g.sqlTrace.pFile = negate ? (FILE*)0 : stderr; - }else if( !ndxTrace && !negate ){ - ndxTrace = i; - expandMode = 0; - } - } - ISFLAG("sql-trace-x"){ - NOVAL; - g.sqlTrace.expandSql = 1; - DOIT { - g.sqlTrace.pFile = negate ? (FILE*)0 : stderr; - }else if( !ndxTrace && !negate ){ - ndxTrace = i; - expandMode = 1; - } - } - ISFLAG("chomp-F"){ - NOVAL; - DOIT g.flags.chompF = !negate; - } - ISFLAG2("d","delimiter"){ - ARGVAL; - if( !doIt ){ - g.delim.z = zVal; - g.delim.n = (unsigned short)strlen(zVal); - if(!g.delim.n) fatal("Keyword delimiter may not be empty."); - } - } - ISFLAG2("dd", "dump-defines"){ - DOIT { - FILE * const fp = stderr; - fprintf(fp, "All %sdefine entries:\n", g.delim.z); - cmpp_dump_defines(fp, 1); - } - } - else{ - fatal("Unhandled flag: %s", argv[i]); - } - } - DOIT { - if(!nFile){ - if(!g.out.zName) g.out.zName = "-"; - if(!inclCount){ - db_include_dir_add("."); - ++inclCount; - } - FileWrapper_open(&g.out, g.out.zName, "w"); - cmpp_process_file("-"); - } - } - } - end: - g_cleanup(1); - return rc ? EXIT_FAILURE : EXIT_SUCCESS; -} diff --git a/ext/wasm/common/SqliteTestUtil.js b/ext/wasm/common/SqliteTestUtil.js index 2c17824c53..a817b79f85 100644 --- a/ext/wasm/common/SqliteTestUtil.js +++ b/ext/wasm/common/SqliteTestUtil.js @@ -44,7 +44,8 @@ /** abort() if expr is false. If expr is a function, it is called and its result is evaluated. */ - assert: function f(expr, msg){ + assert: function f(expr, ...msg){ + msg = msg?.join?.(' '); if(!f._){ f._ = ('undefined'===typeof abort ? 
(msg)=>{throw new Error(msg)} diff --git a/ext/wasm/common/whwasmutil.js b/ext/wasm/common/whwasmutil.js index 1c678f31f6..bca05a1eef 100644 --- a/ext/wasm/common/whwasmutil.js +++ b/ext/wasm/common/whwasmutil.js @@ -16,27 +16,35 @@ More specifically: - https://fossil.wanderinghorse.net/r/jaccwabyt/file/common/whwasmutil.js + https://fossil.wanderinghorse.net/r/jaccwabyt/dir/wasmutil and SQLite: https://sqlite.org This file is kept in sync between both of those trees. + + This build was generated using: + + ./c-pp -o js/whwasmutil.js -@policy=error wasmutil/whwasmutil.c-pp.js + + by libcmpp 2.x 2fc4afc31f6505c27b9c34988973a2bd9b157d559247cdd26868ae75632c3a5e @ 2025-11-16 23:03:27.352 UTC */ /** - The primary goal of this function is to replace, where possible, - Emscripten-generated glue code with equivalent utility code which - can be used in arbitrary WASM environments built with toolchains - other than Emscripten. To that end, it populates the given object - with various WASM-specific APIs. These APIs work with both 32- and - 64-bit WASM builds. + The primary goal of this function is to provide JS/WASM utility + code similar to some of that provided by Emscripten-generated + builds, the difference being that this one can be used in arbitrary + WASM environments built with toolchains other than Emscripten. To + that end, it populates the given object with various WASM-specific + APIs. These APIs work with both 32- and 64-bit WASM builds. Forewarning: this API explicitly targets only browser environments. If a given non-browser environment has the capabilities needed for a given feature (e.g. TextEncoder), great, but it does not go out of its way to account for them and does not provide compatibility - crutches for them. + crutches for them. That said: no specific incompatibilities with, + e.g., node.js are known (whereas it is known that some folks + use this with node.js). Intended usage: @@ -217,7 +225,9 @@ newly-created (or config-provided) target. 
The current approach seemed better at the time. */ -globalThis.WhWasmUtilInstaller = function(target){ +'use strict'; +globalThis.WhWasmUtilInstaller = +function WhWasmUtilInstaller(target){ 'use strict'; if(undefined===target.bigIntEnabled){ target.bigIntEnabled = !!globalThis['BigInt64Array']; @@ -227,6 +237,14 @@ globalThis.WhWasmUtilInstaller = function(target){ all args with a space between each. */ const toss = (...args)=>{throw new Error(args.join(' '))}; + if( !target.pointerSize && !target.pointerIR + && target.alloc && target.dealloc ){ + /* Try to determine the pointer size by allocating. */ + const ptr = target.alloc(1); + target.pointerSize = ('bigint'===typeof ptr ? 8 : 4); + target.dealloc(ptr); + } + /** As of 2025-09-21, this library works with 64-bit WASM modules built with Emscripten's -sMEMORY64=1. @@ -659,12 +677,14 @@ globalThis.WhWasmUtilInstaller = function(target){ const ft = target.functionTable(); const oldLen = __asPtrType(ft.length); let ptr; - while(cache.freeFuncIndexes.length){ - ptr = cache.freeFuncIndexes.pop(); - if(ft.get(ptr)){ /* Table was modified via a different API */ + while( (ptr = cache.freeFuncIndexes.pop()) ){ + if(ft.get(ptr)){ + /* freeFuncIndexes's entry is stale. Table was modified via a + different API */ ptr = null; continue; }else{ + /* This index is free. We'll re-use it. */ break; } } @@ -755,10 +775,10 @@ globalThis.WhWasmUtilInstaller = function(target){ has no side effects and returns undefined. */ target.uninstallFunction = function(ptr){ - if(!ptr && 0!==ptr) return undefined; - const fi = cache.freeFuncIndexes; + if(!ptr && __NullPtr!==ptr) return undefined; + const ft = target.functionTable(); - fi.push(ptr); + cache.freeFuncIndexes.push(ptr); const rc = ft.get(ptr); ft.set(ptr, null); return rc; @@ -996,12 +1016,12 @@ globalThis.WhWasmUtilInstaller = function(target){ target.heap8u(). 
*/ target.cstrlen = function(ptr){ - if(!ptr || !target.isPtr(ptr)) return null; + if(!ptr || !target.isPtr/*64*/(ptr)) return null; ptr = Number(ptr) /*tag:64bit*/; const h = heapWrappers().HEAP8U; let pos = ptr; for( ; h[pos] !== 0; ++pos ){} - return Number(pos - ptr); + return pos - ptr; }; /** Internal helper to use in operations which need to distinguish @@ -2452,7 +2472,8 @@ globalThis.WhWasmUtilInstaller = function(target){ - If `wasmUtilTarget.alloc` is not set and `instance.exports.malloc` is, it installs `wasmUtilTarget.alloc()` and `wasmUtilTarget.dealloc()` - wrappers for the exports `malloc` and `free` functions. + wrappers for the exports' `malloc` and `free` functions + if exports.malloc exists. It returns a function which, when called, initiates loading of the module and returns a Promise. When that Promise resolves, it calls @@ -2475,7 +2496,9 @@ globalThis.WhWasmUtilInstaller = function(target){ Error handling is up to the caller, who may attach a `catch()` call to the promise. 
*/ -globalThis.WhWasmUtilInstaller.yawl = function(config){ +globalThis.WhWasmUtilInstaller +.yawl = function yawl(config){ + 'use strict'; const wfetch = ()=>fetch(config.uri, {credentials: 'same-origin'}); const wui = this; const finalThen = function(arg){ @@ -2500,7 +2523,7 @@ globalThis.WhWasmUtilInstaller.yawl = function(config){ tgt.alloc = function(n){ return exports.malloc(n) || toss("Allocation of",n,"bytes failed."); }; - tgt.dealloc = function(m){exports.free(m)}; + tgt.dealloc = function(m){m && exports.free(m)}; } wui(tgt); } @@ -2519,4 +2542,6 @@ globalThis.WhWasmUtilInstaller.yawl = function(config){ .then(finalThen) ; return loadWasm; -}.bind(globalThis.WhWasmUtilInstaller)/*yawl()*/; +}.bind( +globalThis.WhWasmUtilInstaller +)/*yawl()*/; diff --git a/ext/wasm/demo-jsstorage.js b/ext/wasm/demo-jsstorage.js index 587aa9cc58..e3ab5a9e53 100644 --- a/ext/wasm/demo-jsstorage.js +++ b/ext/wasm/demo-jsstorage.js @@ -16,7 +16,7 @@ */ 'use strict'; (function(){ - const T = self.SqliteTestUtil; + const T = globalThis.SqliteTestUtil; const toss = function(...args){throw new Error(args.join(' '))}; const debug = console.debug.bind(console); const eOutput = document.querySelector('#test-output'); @@ -40,7 +40,7 @@ const error = function(...args){ logHtml('error',...args); }; - + const runTests = function(sqlite3){ const capi = sqlite3.capi, oo = sqlite3.oo1, @@ -51,7 +51,7 @@ error("This build is not kvvfs-capable."); return; } - + const dbStorage = 0 ? 'session' : 'local'; const theStore = 's'===dbStorage[0] ? 
sessionStorage : localStorage; const db = new oo.JsStorageDb( dbStorage ); @@ -108,7 +108,7 @@ } }; - sqlite3InitModule(self.sqlite3TestModule).then((sqlite3)=>{ + sqlite3InitModule(globalThis.sqlite3TestModule).then((sqlite3)=>{ runTests(sqlite3); }); })(); diff --git a/ext/wasm/demo-worker1-promiser.c-pp.html b/ext/wasm/demo-worker1-promiser.c-pp.html index e0b487bdf3..c54b46aadb 100644 --- a/ext/wasm/demo-worker1-promiser.c-pp.html +++ b/ext/wasm/demo-worker1-promiser.c-pp.html @@ -6,11 +6,11 @@ -//#if target=es6-module - worker-promise (via ESM) tests +//#if target:es6-module + Worker1-promiser (ESM) tests //#else - worker-promise tests -//#endif + Worker1-promiser tests +//#/if
    worker-promise tests
    @@ -32,11 +32,11 @@
    
    -//#if target=es6-module +//#if target:es6-module //#else -//#endif +//#/if diff --git a/ext/wasm/demo-worker1-promiser.c-pp.js b/ext/wasm/demo-worker1-promiser.c-pp.js index c129e21281..1521edfc17 100644 --- a/ext/wasm/demo-worker1-promiser.c-pp.js +++ b/ext/wasm/demo-worker1-promiser.c-pp.js @@ -19,7 +19,7 @@ import {default as promiserFactory} from "./jswasm/sqlite3-worker1-promiser.mjs" "use strict"; const promiserFactory = globalThis.sqlite3Worker1Promiser.v2; delete globalThis.sqlite3Worker1Promiser; -//#endif +//#/if (async function(){ const T = globalThis.SqliteTestUtil; const eOutput = document.querySelector('#test-output'); @@ -53,7 +53,7 @@ delete globalThis.sqlite3Worker1Promiser; before workerPromise is set. */ console.warn("This is the v2 interface - you don't need an onready() function."); }, -//#endif +//#/if debug: 1 ? undefined : (...args)=>console.debug('worker debug',...args), onunhandled: function(ev){ error("Unhandled worker message:",ev.data); diff --git a/ext/wasm/demo-worker1.js b/ext/wasm/demo-worker1.js index 1a05cc7ac2..348741bf85 100644 --- a/ext/wasm/demo-worker1.js +++ b/ext/wasm/demo-worker1.js @@ -18,7 +18,7 @@ */ 'use strict'; (function(){ - const T = self.SqliteTestUtil; + const T = globalThis.SqliteTestUtil; const SW = new Worker("jswasm/sqlite3-worker1.js"); const DbState = { id: undefined @@ -323,7 +323,7 @@ switch(ev.result){ case 'worker1-ready': log("Message:",ev); - self.sqlite3TestModule.setStatus(null); + globalThis.sqlite3TestModule.setStatus(null); runTests(); return; default: @@ -344,5 +344,5 @@ }; log("Init complete, but async init bits may still be running."); log("Installing Worker into global scope SW for dev purposes."); - self.SW = SW; + globalThis.SW = SW; })(); diff --git a/ext/wasm/fiddle/fiddle-worker.js b/ext/wasm/fiddle/fiddle-worker.js index a5f3e25b72..4b1ea2c538 100644 --- a/ext/wasm/fiddle/fiddle-worker.js +++ b/ext/wasm/fiddle/fiddle-worker.js @@ -175,10 +175,6 @@ "features (e.g. 
upload) do not yet work with OPFS."); } stdout('\nEnter ".help" for usage hints.'); - this.exec([ // initialization commands... - '.nullvalue NULL', - '.headers on' - ].join('\n')); return true; }, /** diff --git a/ext/wasm/fiddle/index.html b/ext/wasm/fiddle/index.c-pp.html similarity index 95% rename from ext/wasm/fiddle/index.html rename to ext/wasm/fiddle/index.c-pp.html index 378cb39027..7fe9a12a77 100644 --- a/ext/wasm/fiddle/index.html +++ b/ext/wasm/fiddle/index.c-pp.html @@ -5,20 +5,29 @@ SQLite3 Fiddle +//#if jqterm - + + +//#/if -[sqlite3]: https://sqlite.org -[emscripten]: https://emscripten.org -[sgb]: https://wanderinghorse.net/home/stephan/ [appendix-g]: #appendix-g -[StructBinderFactory]: #api-binderfactory -[StructCtors]: #api-structctor -[StructType]: #api-structtype +[BigInt64Array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt64Array +[c-pp]: https://fossil.wanderinghorse.net/r/c-pp +[Emscripten]: https://emscripten.org +[jaccwabyt.js]: /file/jaccwabyt/jaccwabyt.c-pp.js +[MDN]: https://developer.mozilla.org/docs/Web/API +[sgb]: https://wanderinghorse.net/home/stephan/ +[sqlite3]: https://sqlite.org [StructBinder]: #api-structbinder +[StructBinderFactory]: #api-binderfactory +[StructCtor]: #api-structctor [StructInstance]: #api-structinstance -[^export-func]: In Emscripten, add its name, prefixed with `_`, to the - project's `EXPORT_FUNCTIONS` list. -[BigInt64Array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt64Array +[StructType]: #api-structtype [TextDecoder]: https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder [TextEncoder]: https://developer.mozilla.org/en-US/docs/Web/API/TextEncoder -[MDN]: https://developer.mozilla.org/docs/Web/API +[WASI-SDK]: https://github.com/WebAssembly/wasi-sdk +[whwasmutil.js]: /file/wasmutil/whwasmutil.c-pp.js + +[^export-func]: In Emscripten, add its name, prefixed with `_`, to the + project's `EXPORT_FUNCTIONS` list. 
diff --git a/ext/wasm/libcmpp.c b/ext/wasm/libcmpp.c new file mode 100644 index 0000000000..717f56ea93 --- /dev/null +++ b/ext/wasm/libcmpp.c @@ -0,0 +1,16877 @@ +/** + This C file contains both the header and source file for c-pp, + a.k.a. libcmpp. +*/ +#if !defined(NET_WANDERINGHORSE_LIBCMPP_C_INCLUDED) +#define NET_WANDERINGHORSE_LIBCMPP_C_INCLUDED +#if !defined(_POSIX_C_SOURCE) +# define _POSIX_C_SOURCE 200809L /* for fdopen() in stdio.h */ +#endif +#define CMPP_AMALGAMATION +#if !defined(NET_WANDERINGHORSE_LIBCMPP_H_INCLUDED) +/** + This is the auto-generated "amalgamation build" of libcmpp. It was amalgamated + using: + + ./c-pp -I. -I./src -Dsrcdir=./src -Dsed=/usr/bin/sed -o libcmpp.h ./tool/libcmpp.c-pp.h -o libcmpp.c ./tool/libcmpp.c-pp.c + + with libcmpp 2.0.x c02f3e3e2d3f3573a9a33c1474c2e52fc48e52c70730404a90d0ae51517e7d37 @ 2026-03-08 14:50:35.123 UTC +*/ +#define CMPP_PACKAGE_NAME "libcmpp" +#define CMPP_LIB_VERSION "2.0.x" +#define CMPP_LIB_VERSION_HASH "c02f3e3e2d3f3573a9a33c1474c2e52fc48e52c70730404a90d0ae51517e7d37" +#define CMPP_LIB_VERSION_TIMESTAMP "2026-03-08 14:50:35.123 UTC" +#define CMPP_LIB_CONFIG_TIMESTAMP "2026-03-08 15:32 GMT" +#define CMPP_VERSION CMPP_LIB_VERSION " " CMPP_LIB_VERSION_HASH " @ " CMPP_LIB_VERSION_TIMESTAMP +#define CMPP_PLATFORM_EXT_DLL ".so" +#define CMPP_MODULE_PATH ".:/usr/local/lib/cmpp" + +#if !defined(NET_WANDERINGHORSE_CMPP_H_INCLUDED_) +#define NET_WANDERINGHORSE_CMPP_H_INCLUDED_ +/* +** 2022-11-12: +** +** The author disclaims copyright to this source code. In place of +** a legal notice, here is a blessing: +** +** * May you do good and not evil. +** * May you find forgiveness for yourself and forgive others. +** * May you share freely, never taking more than you give. +** +************************************************************************ +** +** The C-minus Preprocessor: C-like preprocessor. Why? Because C +** preprocessors _can_ process non-C code but generally make quite a +** mess of it. 
The purpose of this library is a customizable +** preprocessor suitable for use with arbitrary UTF-8-encoded text. +** +** The supported preprocessor directives are documented in the +** README.md hosted with this file (or see the link below). +** +** Any mention of "#" in the docs, e.g. "#if", is symbolic. The +** directive delimiter is configurable and defaults to "##". Define +** CMPP_DEFAULT_DELIM to a string when compiling to define the default +** at build-time. +** +** This API is presented as a library but was evolved from a +** monolithic app. Thus its library interface is likely still missing +** some pieces needed to make it more readily usable as a library. +** +** Author(s): +** +** - Stephan Beal +** +** Canonical homes: +** +** - https://fossil.wanderinghorse.net/r/c-pp +** - https://sqlite.org/src/file/ext/wasm/c-pp-lite.c +** +** With the former hosting this app's SCM and the latter being the +** original deployment of c-pp.c, from which this library +** evolved. SQLite uses a "lite" version of c-pp, whereas _this_ copy +** is its much-heavier-weight fork. +*/ + +#if defined(CMPP_HAVE_AUTOCONFIG_H) +#include "libcmpp-autoconfig.h" +#endif +#if defined(HAVE_AUTOCONFIG_H) +#include "autoconfig.h" +#endif +#if defined(HAVE_CONFIG_H) +#include "config.h" +#endif + +#ifdef _WIN32 +# if defined(BUILD_libcmpp_static) || defined(CMPP_AMALGAMATION_BUILD) +# define CMPP_EXPORT extern +# elif defined(BUILD_libcmpp) +# define CMPP_EXPORT extern __declspec(dllexport) +# else +# define CMPP_EXPORT extern __declspec(dllimport) +# endif +#else +# define CMPP_EXPORT extern +#endif + +/** + cmpp_FILE is a portability hack for WASM builds, where we want to + elide the (FILE*)-using pieces to avoid having a dependency on + Emscripten's POSIX I/O proxies. In all non-WASM builds it is + guaranteed to be an alias for FILE. On WASM builds it is guaranteed + to be an alias for void and the cmpp APIs which use it become + inoperative in WASM builds. 
+ + That said: the code does not yet support completely compiling out + (FILE*) dependencies, and may not be able to because canonical + sqlite3 (upon which it is based) depends heavily on file + descriptors and slightly on FILE handles. +*/ +#if defined(__EMSCRIPTEN__) || defined(__wasm__) || defined(__wasi__) + typedef void cmpp_FILE; +# define CMPP_PLATFORM_IS_WASM 1 +#else + #include + typedef FILE cmpp_FILE; +# define CMPP_PLATFORM_IS_WASM 0 +#endif + +#include +#include /* PRIu32 and friends */ +#include + +#ifdef __cplusplus +extern "C" { +#endif + +/** + 32-bit flag bitmask type. This typedef exists primarily to improve + legibility of function signatures and member structs by conveying + their intent for use as flags instead of result codes or lengths. +*/ +typedef uint32_t cmpp_flag32_t; +//typedef uint16_t cmpp_flag16_t; + +/** + An X-macro which invokes its argument (a macro name) to expand to + all possible values of cmpp_rc_e entries. The macro name passed to + it is invoked once for each entry and passed 3 arguments: the enum + entry's full name (CMPP_RC_...), its integer value, and a help-text + string. +*/ +#define cmpp_rc_e_map(E) \ + E(CMPP_RC_OK, 0, \ + "The quintessential not-an-error value.") \ + E(CMPP_RC_ERROR, 100, \ + "Generic/unknown error.") \ + E(CMPP_RC_NYI, 101, \ + "A placeholder return value for not yet implemented functions.") \ + E(CMPP_RC_OOM, 102, \ + "Out of memory. Indicates that a resource allocation " \ + "request failed.") \ + E(CMPP_RC_MISUSE, 103, \ + "API misuse (invalid args)") \ + E(CMPP_RC_RANGE, 104, \ + "A range was violated.") \ + E(CMPP_RC_ACCESS, 105, \ + "Access to or locking of a resource was denied " \ + "by some security mechanism or other.") \ + E(CMPP_RC_IO, 106, \ + "Indicates an I/O error. 
Whether it was reading or " \ + "writing is context-dependent.") \ + E(CMPP_RC_NOT_FOUND, 107, \ + "Requested resource not found.") \ + E(CMPP_RC_ALREADY_EXISTS, 108, \ + "Indicates that a to-be-created resource already exists.") \ + E(CMPP_RC_CORRUPT, 109, \ + "Data consistency problem.") \ + E(CMPP_RC_SYNTAX, 110, \ + "Some sort of syntax error.") \ + E(CMPP_RC_NOOP, 111, \ + "Special sentinel value for some APIs.") \ + E(CMPP_RC_UNSUPPORTED, 112, \ + "An unsupported operation was requested.") \ + E(CMPP_RC_DB, 113, \ + "Indicates db-level error (e.g. statement prep failed). In such " \ + "cases, the error state of the related db handle (cmpp_db) " \ + "will be updated to contain more information directly from the " \ + "db driver.") \ + E(CMPP_RC_NOT_DEFINED, 114, \ + "Failed to expand an undefined value.") \ + E(CMPP_RC_ASSERT, 116, "An #assert failed.") \ + E(CMPP_RC_TYPE, 118, \ + "Indicates that some data type or logical type is incorrect.") \ + E(CMPP_RC_CANNOT_HAPPEN, 140, \ + "This is intended only for internal use, to " \ + "report conditions which \"cannot possibly happen\".") \ + E(CMPP_RC_HELP, 141, \ + "--help was used in the arguments to cmpp_process_argv()") \ + E(CMPP_RC_NO_DIRECTIVE, 142, \ + "A special case of CMPP_RC_NOT_FOUND needed to disambiguate.") \ + E(CMPP_RC_end,200, \ + "Must be the final entry in the enum. Used for creating client-side " \ + "result codes which are guaranteed to live outside of this one's " \ + "range.") + +/** + Most functions in this library which return an int return result + codes from the cmpp_rc_e enum. None of these entries are + guaranteed to have a specific value across library versions except + for CMPP_RC_OK, which is guaranteed to always be 0 (and the API + guarantees that no other code shall have a value of zero). + + The only reason the numbers are hard-coded to these values is to + simplify debugging during development. 
Clients may use + cmpp_rc_cstr() to get some human-readable (or programmer-readable) + form for any given value in this enum. +*/ +enum cmpp_rc_e { +#define E(N,V,H) N = V, + cmpp_rc_e_map(E) +#undef E +}; +typedef enum cmpp_rc_e cmpp_rc_e; + +/** + Returns the string form of the given cmpp_rc_e value or NULL if + it's not a member of that enum. +*/ +char const * cmpp_rc_cstr(int rc); + +/** + CMPP_BITNESS specifies whether the library should use 32- or 64-bit + integer types for its size/length measurements. It's difficult to + envision use cases for a preprocessor which would require counters + or ranges larger than 32 bits provide for, so the default is 32 + bits. Builds created with different CMPP_BITNESS values are + not binary-compatible. +*/ +#define CMPP_BITNESS 32 +#if 32==CMPP_BITNESS +/** + Unsigned integer type for string/stream lengths. 32 bits is + sufficient for all but the weirdest of inputs and outputs. +*/ +typedef uint32_t cmpp_size_t; + +/** + A signed integer type indicating the maximum length of strings or + byte ranges in a stream. It is most frequently used in API + signatures where a negative value means "if it's negative then use + strlen() to count it". +*/ +typedef int32_t cmpp_ssize_t; + +/** + The printf-format-compatible format letter (or group of letters) + appropriate for use with cmpp_size_t. Contrary to popular usage, + size_t cannot be portably used with printf(), without careful + casting, because it has neither a fixed size nor a standardized + printf/scanf format specifier (like the stdint.h types do). +*/ +#define CMPP_SIZE_T_PFMT PRIu32 +#elif 64==CMPP_BITNESS +typedef uint64_t cmpp_size_t; +typedef int64_t cmpp_ssize_t; +#define CMPP_SIZE_T_PFMT PRIu64 +#else +#error "Invalid CMPP_BITNESS value. Expecting 32 or 64." +#endif + +/** + Generic interface for streaming in data. 
Implementations must read + (at most) *n bytes from their input, copy it to dest, assign *n to + the number of bytes actually read, return 0 on success, and return + a non-0 cmpp_rc_e value on error (e.g. CMPP_RC_IO). + + When called, *n is the max length to read. On return, *n must be + set to the amount actually read. Implementations may need to + internally distinguish a short read due to EOF from a short read + due to an I/O error, e.g. using feof() and/or ferror(). A short + read for EOF is not an error but a short read for input failure is. + This library invariably treats a short read as EOF. + + The state parameter is the implementation-specified input + file/buffer/whatever channel. +*/ +typedef int (*cmpp_input_f)(void * state, void * dest, cmpp_size_t * n); + +/** + Generic interface for streaming out data. Implementations must + write n bytes from src to their destination channel and return 0 on + success, or a value from the cmpp_rc_e enum on error + (e.g. CMPP_RC_IO). The state parameter is the + implementation-specified output channel. + + It is implementation-defined whether an n of 0 is legal. This + library refrains from passing 0 to these functions. + + In the context of cmpp, the library makes no guarantees that output + will always end at a character boundary. It may split any given + multibyte character across the end of one call to this function and + the start of the next. If that is a concern for implementors of these + functions (e.g. because they're appending the output to a UI + widget), they may need to buffer all of the output before applying + it (see cmpp_b), or otherwise account for partial characters. + + That said: the core library, by an accident of design, will always + emit data at character boundaries, assuming that its input is + well-formed UTF-8 text (which cmpp does not validate to be the + case). Custom cmpp_dx_f() implementations are not strictly + required to do so but, because of how cmpp is used, almost + certainly will. 
But relying on that is ill-advised. +*/ +typedef int (*cmpp_output_f)(void * state, void const * src, + cmpp_size_t n); + +/** + Generic interface for flushing arbitrary output streams. Must + return 0 on success, a non-0 cmpp_rc_e value on error. When in + doubt, return CMPP_RC_IO on error. The interpretation of the state + parameter is implementation-specific. +*/ +typedef int (*cmpp_flush_f)(void * state); + +typedef struct cmpp_pimpl cmpp_pimpl; +typedef struct cmpp_api_thunk cmpp_api_thunk; +typedef struct cmpp_outputer cmpp_outputer; + +/** + The library's primary class. Each one of these represents a + separate preprocessor instance. + + See also: cmpp_dx (the class which client-side extensions interact + with the most). +*/ +struct cmpp { + + /** + API thunk object to support use via loadable modules. Client code + does not normally need to access this member, but it's exposed + here to give loadable modules more flexibility in how they use + the thunk. + + This pointer is _always_ the same singleton object. The library + never exposes a cmpp object with a NULL api member. + */ + cmpp_api_thunk const * const api; + + /** + Private internal state. + */ + cmpp_pimpl * const pimpl; +}; +typedef struct cmpp cmpp; + +/** + Flags for use with cmpp_ctor_cfg::flags. +*/ +enum cmpp_ctor_e { + /* Sentinel value. */ + cmpp_ctor_F_none = 0, + /* Disables #include. */ + cmpp_ctor_F_NO_INCLUDE = 0x01, + /* Disables #pipe. */ + cmpp_ctor_F_NO_PIPE = 0x02, + /* Disables #attach, #detach, and #query. */ + cmpp_ctor_F_NO_DB = 0x04, + /* Disables #module. */ + cmpp_ctor_F_NO_MODULE = 0x08, + /** + Disable all built-in directives which may work with the filesystem + or invoke external processes. Client-defined directives with the + cmpp_d_F_NOT_IN_SAFEMODE flag are also disabled. Directives + disabled via the cmpp_ctor_F_NO_... 
flags (or equivalent library + built-time options) do not get registered, so will trigger + "unknown directive" errors rather than safe-mode violation errors. + */ + cmpp_ctor_F_SAFEMODE = 0x10, +}; + +/** + A configuration object for cmpp_ctor(). This type may be extended + as new construction-time customization opportunities are + discovered. +*/ +struct cmpp_ctor_cfg { + /** + Bitmask from the cmpp_ctor_e enum. + */ + cmpp_flag32_t flags; + /** + If not NULL then this must name either an existing SQLite3 db + file or the name of one which can be created on demand. If NULL + then an in-memory or temporary database is used (which one is + unspecified). The library copies these bytes, so they need not be + valid after a call to cmpp_ctor(). + */ + char const * dbFile; +}; +typedef struct cmpp_ctor_cfg cmpp_ctor_cfg; + +/** + Assigns *pp to a new cmpp or NULL on OOM. Any non-NULL return + value must eventually be passed to cmpp_dtor() to free it. + + The cfg argument, if not NULL, holds config info for the new + instance. If NULL, an instance with unspecified defaults is + used. These configuration pieces may not be modified after the + instance is created. + + It returns 0 if *pp is ready to use and non-0 if either allocation + fails (in which case *pp will be set to 0) or initialization of *pp + failed (in which case cmpp_err_get() can be used to determine why + it failed). In either case, the caller must eventually pass *pp to + cmpp_dtor() to free it. + + If the library is built with the symbol CMPP_CTOR_INSTANCE_INIT + defined, it must refer to a function with this signature: + + int CMPP_CTOR_INSTANCE_INIT(cmpp *); + + The library calls this before returning and arranges to call it + lazily if pp gets reset. The intent is that the init function + installs custom directives using cmpp_d_register(). That + initialization, on error, is expected to set its argument's error + state with cmpp_err_set(). 
+*/ +CMPP_EXPORT int cmpp_ctor(cmpp **pp, cmpp_ctor_cfg const * cfg); + +/** + If pp is not NULL, it is passed to cmpp_reset() and then freed. +*/ +CMPP_EXPORT void cmpp_dtor(cmpp *pp); + + +/** + realloc(3)-compatible allocator used by the library. + + This API very specifically uses sqlite3_realloc() as its basis. +*/ +CMPP_EXPORT void * cmpp_mrealloc(void * p, size_t n); + +/** + malloc(3)-compatible allocator used by the library. + + This API very specifically uses sqlite3_malloc() as its basis. +*/ +CMPP_EXPORT void * cmpp_malloc(size_t n); + +/** + free(3)-compatible deallocator. It can also be used as a destructor + for cmpp_d_register() _if_ the memory in question is allocated by + cmpp_malloc(), cmpp_mrealloc(), or the sqlite3_malloc() family of + APIs. + + This is not called cmpp_free() to try to avoid any confusion with + cmpp_dtor(). +*/ +CMPP_EXPORT void cmpp_mfree(void *); + +/** + If m is NULL then pp's persistent error code is set to CMPP_RC_OOM, + else this is a no-op. Returns pp's error code. + + To simplify certain uses, pp may be NULL, in which case this + function returns CMPP_RC_OOM if m is NULL and 0 if it's not. +*/ +CMPP_EXPORT int cmpp_check_oom(cmpp * const pp, void const * const m ); + +/** + Re-initializes all state of pp. This saves some memory for reuse + but resets it all to default states. This closes the database and + will also reset any autoloader, policies, or delimiter + configurations to their compile-time defaults. It retains only a + small amount of state, like any configuration which was passed to + cmpp_ctor(). + + After calling this, pp is in a cleanly-initialized state and may be + re-used with the cmpp API. Its database will not be initialized + until an API which needs it is called, so pp can be used with + functions which may otherwise be prohibited after the db is + opened. (Do we still have any?) + + As of this writing, this is the only way to reliably recover a cmpp + instance from any significant errors. 
Errors may do things like + leave savepoints out of balance, and this cleanup step resets all + of that state. However, it also loses state like the autoloader. + + TODO?: we need(?) a partial-clear operation which keeps some of the + instance's state, most notably custom directives, the db handle, + and any cached prepared statements. See cmpp_err_set() for the + distinction between recoverable and non-recoverable errors. +*/ +CMPP_EXPORT void cmpp_reset(cmpp *pp); + +#if 0 +Not yet; +/** + If called before pp has initialized its database, this sets the + file name used for that database. If called afterwards, pp's error + state is updated and CMPP_RC_MISUSE is returned. If called while + pp has error state set, that code is returned without side-effects. + + This does not open the database. It is opened on demand when + processing starts. + + On success it returns 0 and this function makes a copy of zName. + + As a special case, zName may be NULL to use the default name, but + there is little reason to do so unless one changes their mind after + setting it to non-NULL. + */ +CMPP_EXPORT int cmpp_db_name_set(cmpp *pp, const char * zName); +#endif + +/** + Returns true if the bytes in the range [zName, zName+n) comprise a + legal name for a directive or a define. + + It disallows any control characters, spaces, and most punctuation, + but allows alphanumeric (but must not start with a number) as well + as any of: -./:_ (but it may not start with '-'). Any characters + with a high bit set are assumed to be UTF-8 and are permitted as + well. + + The name's length is limited, rather arbitrarily, to 64 bytes. + + If the key is not legal then false is returned and if zErrPos is + not NULL then *zErrPos is set to the position in zName of the first + offending character. If validation fails because n is too long then + *zErrPos (if zErrPos is not NULL) will be set to 0. 
+ + Design note: this takes unsigned characters because it most + commonly takes input from cmpp_args::z strings. +*/ +CMPP_EXPORT bool cmpp_is_legal_key(unsigned char const *zName, + cmpp_size_t n, + unsigned char const **zErrPos); + +/** + Adds the given `#define` macro name to the list of macros, overwriting + any previous value. + + zKey must be NUL-terminated and legal as a key. The rules are the + same as for cmpp_is_legal_key() except that a '=' is also permitted + if it's not at the start of the string because... + + If zVal is NULL then zKey may contain an '=', from which the value + will be extracted. If zVal is not NULL then zKey may _not_ contain + an '='. + + The ability for zKey to contain a key=val was initially to + facilitate input from the CLI (e.g. -Dfoo=bar) because cmpp was + initially a CLI app (as opposed to a library). It's considered a + "legacy" feature, not recommended for most purposes, but it _is_ + convenient for that particular purpose. + + Returns 0 on success and updates pp's error state on error. + + See: cmpp_define_v2() + See: cmpp_undef() +*/ +CMPP_EXPORT int cmpp_define_legacy(cmpp *pp, const char * zKey, + char const *zVal); + +/** + Works like cmpp_define_legacy() except that it does not examine zKey to + see if it contains an '='. +*/ +CMPP_EXPORT int cmpp_define_v2(cmpp *pp, const char * zKey, char const *zVal); + +/** + Removes the given `#define` macro name from the list of + macros. zKey is, in this case, treated as a GLOB pattern, and all + matching defines are deleted. + + If nRemoved is not NULL then, on success, it is set to the number + of entries removed by this call. + + Returns 0 on success and updates pp's error state on error. It is not + an error if no value was undefined. + + This does _not_ affect defines made using cmpp_define_shadow(). 
+*/ +CMPP_EXPORT int cmpp_undef(cmpp *pp, const char * zKey, + unsigned int *nRemoved); + +/** + This works similarly to cmpp_define_v2() except that: + + - It does not permit its zKey argument to contain the value + part like that function does. + + - The new define "shadows", rather than overwrites, an existing + define with the same name. + + All APIs which look up define keys will get the value of the shadow + define. The shadow can be uninstalled with cmpp_define_unshadow(), + effectively restoring its previous value (if any). That function + should be called one time for each call to this one, passing the + same key to each call. A given key may be shadowed any number of + times by this routine. Each one saves the internal ID of the shadow + into *pId (and pId must not be NULL). That value must be passed to + cmpp_define_unshadow() to ensure that the "shadow stack" stays + balanced in the face of certain error-handling paths. + + cmpp_undef() will _not_ undefine an entry added through this + interface. + + Returns pp's persistent error code (0 on success). + + Design note: this function was added to support adding a define + named __FILE__ to input scripts which works like it does in a C + preprocessor. Alas, supporting __LINE__ would be much more costly, + as it would have to be updated in the db from several places, so + its cost would outweigh its meager benefits. +*/ +CMPP_EXPORT int cmpp_define_shadow(cmpp *pp, char const *zKey, + char const *zVal, + int64_t * pId); + +/** + Removes the most recent shadow define matching the zKey and id values + which were previously passed to cmpp_define_shadow(). It is not + an error if no match is found, in which case this function has no + visible side-effects. + + Unlike cmpp_undef(), zKey is matched precisely, not against a glob. + + In order to keep the "shadow stack" properly balanced, this will + delete any shadow entries for the given key which have the same id + or a newer one (i.e. 
they were left over from a missed call to + cmpp_define_unshadow()). + + Returns pp's persistent error code (0 on success). +*/ +CMPP_EXPORT int cmpp_define_unshadow(cmpp *pp, char const *zKey, + int64_t id); + +/** + Adds the given dir to the list of includes. They are checked in the + order they are added. +*/ +CMPP_EXPORT int cmpp_include_dir_add(cmpp *pp, const char * zKey); + +/** + Sets pp's default output channel. If pp already has a channel, it + is closed[^1]. + + The second argument, if not NULL, is _bitwise copied_, which has + implications for the ownership of out->state (see below). If it is + NULL, cmpp_outputer_empty is copied in its place, which makes + further output a no-op. + + The third argument is a symbolic name for the channel (perhaps its + file name). It is used in debugging and error messages. cmpp does + _not_ copy it, so its bytes must outlive the cmpp instance. (In + practice, the byte names come from main()'s argv or scope-local + strings in the same scope as the cmpp instance.) This argument + should only be NULL if the second argument is. + + cmpp_reset(), or opening another channel, will end up calling + out->cleanup() (if it's not NULL) and passing it a pointer to a + _different_ cmpp_outputer object, but with the _same_ + cmpp_outputer::state pointer, which may invalidate out->state. + + To keep cmpp from doing that, make a copy of the output object, set + the cleanup member of that copy to NULL, then pass that copy to this + function. It is then up to the client to call out->cleanup(out) when + the time is right. + + For example: + + ``` + cmpp_outputer my = cmpp_outputer_FILE; + my.state = cmpp_fopen("/some/file", "wb"); + cmpp_outputer tmp = my; + tmp.cleanup = NULL; + cmpp_outputer_set( pp, &tmp, "my file"); + ... + my.cleanup(&my); // will cmpp_fclose(my.state) + ``` + + Potential TODO: internally store the output channel as a pointer. + It's not clear whether that would resolve the above grief or + compound it. 
+ + [^1]: depending on the output channel, it might not _actually_ be + closed, but pp is disassociated from it, in any case. +*/ +CMPP_EXPORT +void cmpp_outputer_set(cmpp *pp, cmpp_outputer const *out, char const *zName); + +/** + Treats the range [zIn,zIn+nIn) as a complete cmpp input and processes + it appropriately. zName is the name of the input for purposes of + error messages. If nIn is negative, strlen() is used to calculate + it. + + This is a no-op if pp has any error state set. It returns pp's + persistent error code. +*/ +CMPP_EXPORT int cmpp_process_string(cmpp *pp, const char * zName, + unsigned char const * zIn, + cmpp_ssize_t nIn); + +/** + A thin proxy for cmpp_process_string() which reads its input from + the given file. Returns 0 on success, else returns pp's persistent + error code. +*/ +CMPP_EXPORT int cmpp_process_file(cmpp *pp, const char * zName); + +/** + A thin proxy for cmpp_process_string() which reads its input from + the given input source, consuming it all before passing it + on. Returns 0 on success, else returns pp's persistent error code. +*/ +CMPP_EXPORT int cmpp_process_stream(cmpp *pp, const char * zName, + cmpp_input_f src, void * srcState); + +/** + Process the given main()-style arguments list. When calling from + main(), be sure to pass it main()'s (argc-1, argv+1) to skip argv[0] + (the binary's name). + + Each argument is expected to be one of the following: + + 1) One of --help or -?: causes this function to return CMPP_RC_HELP + without emitting any output. + + 2) -DX or -DX=Y: sets define X to 1 (if no "=" is used and no Y given) or to + Y. + + 3) -UX: unsets all defines matching glob X. + + 4) -FX=Y: works like -DX=Y but treats Y as a filename and sets X to + the contents of that file. + + 5) -IX: adds X to the "include path". If _no_ include path is + provided then cmpp assumes a path of ".", but if _any_ paths are + provided then it does not assume that "." is in the path. 
+ + 6) --chomp-F: specifies whether subsequent -F flags should "chomp" + one trailing newline from their input files. + + 7) --delimiter|-d=X sets the directive delimiter to X. Its default + is a compile-time constant. + + 8) --output|-o=filename: sets the output channel to the given file. + A value of "-" means stdout. If no output channel is opened when + this is called, and files are to be processed, stdout is + assumed. (That's a historical artifact from earlier evolutions.) + To override that behavior use cmpp_outputer_set(). + + 9) --file|-f=filename: sets the input channel to the given file. + A value of "-" means stdin. + + 10) -e=SCRIPT Treat SCRIPT as a complete c-pp input and process it. + Because it's difficult to pack multiple lines of text into this, + it's really of use for testing #expr and #assert. + + 11) --@policy=X sets the @token@ parsing policy. X must be + one of (retain, elide, error, off) and defaults to off. + + 12) -@: shorthand for --@policy=error. + + 13) --sql-trace: enables tracing of all SQL statements to stderr. + This is useful for seeing how a script interacts with the + database. Use --no-sql-trace to disable it. + + 14) --sql-trace-x: like --sql-trace but replaces bound parameter + placeholders with their SQL values. Use --no-sql-trace to disable + it. + + 15) --dump-defines: emit all defines to stdout. They should + arguably go to stderr but that interferes with automated testing. + + Any argument which does not match one of the above flags, and does + not start with a "-", is treated as if it were passed to the --file + flag. + + Flags may start with either 1 or 2 dashes - they are equivalent. + + Flags which take a value may either be in the form X=Y or X Y, i.e. + may be a single argv entry or a pair of them. + + It performs two passes on the arguments: the first is for validation + checking for --help/-?. No processing of the input(s) and output(s) + happens unless the first pass completes. 
Similarly, no validation of + whether any provided filenames are actually readable is performed + until the second pass. + + Arguments are processed in the order they are given. Thus the following + have completely different meanings: + + 1) -f foo.in -Dfoo + 2) -Dfoo -f foo.in + + The former will process foo.in before defining foo. + + This behavior makes it possible to process multiple input files in + a single go: + + --output foo.out foo.in -Dfoo foo.in -Ufoo --output bar.out -Dbar foo.in + + It returns 0 on success. cmpp_err_get() can be used to fetch any + error message. +*/ +CMPP_EXPORT int cmpp_process_argv(cmpp *pp, int argc, char const * const * argv); + +/** + Intended to be called if cmpp_process_argv() returns CMPP_RC_HELP. + It emits --help-style text to the given output stream. + As the first argument pass it either argv[0] or NULL. The second + should normally be stdout or stderr. + + Reminder to self: this could take a (cmpp_output_f,void*) pair + instead, and should do so for the sake of WASI builds, but its impl + currently relies heavily on fprintf() formatting. +*/ +CMPP_EXPORT void cmpp_process_argv_usage(char const *zAppName, + cmpp_FILE *os); + +/** + Returns pp's current error number (from the cmpp_rc_e enum) and + sets *zMsg (if zMsg is not NULL) to the error string. The bytes are + owned by pp and may be invalidated by any functions which take pp + as an argument. + + See cmpp_err_set() for more information. + +*/ +CMPP_EXPORT int cmpp_err_get(cmpp *pp, char const **zMsg); + +/** + Sets or clears (if 0==rc) pp's persistent error state. zFmt may be + NULL or a format string compatible with sqlite3_mprintf(). + + To simplify certain uses, this is a no-op if pp is NULL, returning + rc without other side effects. + + Returns rc with one exception: if allocation of a copy of the error + string fails then CMPP_RC_OOM will be returned (and pp will be + updated appropriately). 
+ + If pp is currently processing a script, the resulting error string + will be prefixed with the name of the current input script and the + line number of the directive which triggered the error. + + It is legal for zFmt to be NULL or an empty string, in which case a + default, vague error message is used (without requiring allocation + of a new string). + + Recoverable vs. unrecoverable errors: + + Most cmpp APIs become no-ops if their cmpp object has error state + set, treating any error as unrecoverable. That approach simplifies + writing code for it by allowing multiple calls to be chained + without concern for whether the previous one succeeded. + + ACHTUNG: simply clearing the error state by passing 0 as the 2nd + argument to this function is _not_ enough to recover from certain + errors. e.g. an error in the middle of a script may leave db + savepoints imbalanced. The only way to _fully_ recover from any + significant failures is to use cmpp_reset(), which resets all of + pp's state. + + APIs which may set the error state but are recoverable by simply + clearing that state will document that. Errors from APIs which do + not claim to be recoverable in error cases must be treated as + unrecoverable. + + See cmpp_err_get() for more information. + + FIXME: we need a different variant for WASM builds, where variadics + aren't a usable thing. + + Potential TODO: change the error-reporting interface to support + distinguishing between recoverable and non-recoverable errors. "The + problem" is that no current uses need that - they simply quit and + free up the cmpp instance on error. Maybe that's the way it + _should_ be. +*/ +CMPP_EXPORT int cmpp_err_set(cmpp *pp, int rc, char const *zFmt, ...); + +/** + A variant of cmpp_err_set() which is not variadic, as a consolation + for WASM builds. zMsg may be NULL. The given string, if not NULL, + is copied. +*/ +CMPP_EXPORT int cmpp_err_set1(cmpp *pp, int rc, char const *zMsg); + +#if 0 +/** + Clears any error state in pp. 
Most cmpp APIs become no-ops if their + cmpp instance has its error flag set. + + See cmpp_err_get() for important details about doing this. +*/ +//CMPP_EXPORT void cmpp_err_clear(cmpp *pp); + +/** + This works like a combination of cmpp_err_get() and + cmpp_err_clear(), in that it clears pp's error state by transferring + ownership of it to the caller. If pp has any error state, *zMsg is + set to the error string and the error code is returned, else 0 is + returned and *zMsg is set to 0. + + The string returned via *zMsg must eventually be passed to + cmpp_mfree() to free it. + + This function is provided simply as an optimization to avoid + having to copy the error string in some cases. + + ACHTUNG: see the ACHTUNG in cmpp_err_clear(). +*/ +CMPP_EXPORT int cmpp_err_take(cmpp *pp, char **zMsg); +#endif + +/** + Returns pp's current error code, which will be 0 if it currently + has no error state. + + To simplify certain uses, this is a no-op if pp is NULL, returning + 0. +*/ +CMPP_EXPORT int cmpp_err_has(cmpp const * pp); + +/** + Returns true if pp was initialized in "safe mode". That is: if the + cmpp_ctor_F_SAFEMODE flag was passed to cmpp_ctor(). + + To simplify certain uses, this is a no-op if pp is NULL, returning + false. +*/ +CMPP_EXPORT bool cmpp_is_safemode(cmpp const * pp); + +/** + Starts a new SAVEPOINT in the database. Returns non-0, and updates + pp's persistent error state, on failure. + + If this returns 0, the caller is obligated to later call either + cmpp_sp_commit() or cmpp_sp_rollback(). +*/ +CMPP_EXPORT int cmpp_sp_begin(cmpp *pp); + +/** + Commits the most recently-opened savepoint UNLESS pp's error state + is set, in which case this behaves like cmpp_sp_rollback(). + Returns 0 on success. + + A call to cmpp_sp_begin() which returns 0 obligates the caller to + call either cmpp_sp_rollback() or cmpp_sp_commit(). It is illegal + for either to be called in any other context. 
+*/ +CMPP_EXPORT int cmpp_sp_commit(cmpp *pp); + +/** + Rolls back the most recently-opened savepoint. Returns 0 on + success. + + A call to cmpp_sp_begin() which returns 0 obligates the caller to + call either cmpp_sp_rollback() or cmpp_sp_commit(). It is illegal + for either to be called in any other context. +*/ +CMPP_EXPORT int cmpp_sp_rollback(cmpp *pp); + +/** + A cmpp_output_f() impl which requires state to be a (FILE*), which + this function passes on to fwrite(). Returns 0 on + success, CMPP_RC_IO on error. + + If state is NULL then stdout is used. +*/ +CMPP_EXPORT int cmpp_output_f_FILE(void * state, void const * src, cmpp_size_t n); + +/** + A cmpp_output_f() impl which requires state to be a ([const] int*) + referring to a writable file descriptor, which this function + dereferences and passes to write(2). +*/ +CMPP_EXPORT int cmpp_output_f_fd(void * state, void const * src, cmpp_size_t n); + +/** + A cmpp_input_f() implementation which requires that state be + a readable (FILE*) handle, which it passes to fread(3). +*/ +CMPP_EXPORT int cmpp_input_f_FILE(void * state, void * dest, cmpp_size_t * n); + +/** + A cmpp_input_f() implementation which requires that state be a + readable file descriptor, in the form of an ([const] int*), which + this function passes to read(2). +*/ +CMPP_EXPORT int cmpp_input_f_fd(void * state, void * dest, cmpp_size_t * n); + +/** + A cmpp_flush_f() impl which expects pFile to be a (FILE*) opened + for writing, which this function passes on to + fflush(). If fflush() returns 0, so does this function, else it + returns non-0. +*/ +CMPP_EXPORT int cmpp_flush_f_FILE(void * pFile); + +/** + A generic streaming routine which copies data from a + cmpp_input_f() to a cmpp_output_f(). + + Reads all data from inF(inState,...) in chunks of an unspecified + size and passes them on to outF(outState,...). It reads until inF() + returns fewer bytes than requested or returns non-0. 
Returns the + result of the last call to outF() or (if reading fails) inF(). + Results are undefined if either of inState or outState arguments + are NULL and their callbacks require non-NULL. (This function + cannot know whether a NULL state argument is legal for the given + callbacks.) + + Here is an example which basically does the same thing as the + cat(1) command on Unix systems: + + ``` + cmpp_stream(cmpp_input_f_FILE, stdin, cmpp_output_f_FILE, stdout); + ``` + + Or copy a FILE to a string buffer: + + ``` + cmpp_b os = cmpp_b_empty; + FILE * f = cmpp_fopen(...); + rc = cmpp_stream(cmpp_input_f_FILE, f, cmpp_output_f_b, &os); + // On error os might be partially populated. + // Eventually clean up the buffer: + cmpp_b_clear(&os); + ``` +*/ +CMPP_EXPORT int cmpp_stream(cmpp_input_f inF, void * inState, + cmpp_output_f outF, void * outState); + +/** + Reads the entire contents of the given input stream, allocating it + in a buffer. On success, returns 0, assigns *pOut to the buffer, + and *nOut to the number of bytes read (which will be fewer than are + allocated). It guarantees that on success it NUL-terminates the + buffer at one byte after the returned size, with one exception: if + the input has no content, both *pOut and *nOut will be set to 0. + + On error it returns whatever code xIn() returns. +*/ +CMPP_EXPORT int cmpp_slurp(cmpp_input_f xIn, void *stateIn, + unsigned char **pOut, cmpp_size_t * nOut); + +/** + _Almost_ equivalent to fopen(3) but: + + - If name=="-", it returns one of stdin or stdout, depending on the + mode string: stdout is returned if 'w' or '+' appear, otherwise + stdin. + + If it returns NULL, the global errno "should" contain a description + of the problem unless the problem was argument validation. + + If at all possible, use cmpp_fclose() (as opposed to fclose()) to + close these handles, as it has logic to skip closing the three + standard streams. 
+*/ +CMPP_EXPORT cmpp_FILE * cmpp_fopen(char const * name, char const *mode); + +/** + Passes f to fclose(3) unless f is NULL or one of the C-standard + handles (stdin, stdout, stderr), in which cases it does nothing at + all. +*/ +CMPP_EXPORT void cmpp_fclose(cmpp_FILE * f); + +/** + A cleanup callback interface for use with cmpp_outputer::cleanup(). + Implementations must handle self->state appropriately for its type, + and clear self->state if appropriate, but must not free the self + object. It is implementation-specified whether self->state and/or + self->name are set to NULL by this function. Whether they should be + often depends on how they're used. +*/ +typedef void (*cmpp_outputer_cleanup_f)(cmpp_outputer *self); + +/** + An interface which encapsulates data for managing a streaming + output destination, primarily intended for use with cmpp_stream() + but also used internally by cmpp for directing output to a buffer. +*/ +struct cmpp_outputer { + /** + An optional descriptive name for the channel. The bytes + are owned elsewhere and are typically static or similarly + long-lived. + */ + char const * name; + + /** + Output channel. + */ + cmpp_output_f out; + + /** + flush() implementation. This may be NULL for most uses of this + class. Cases which specifically require it must document that + requirement so. + */ + cmpp_flush_f flush; + + /** + Optional: if not NULL, it must behave appropriately for its state + type, cleaning up any memory it owns. + */ + cmpp_outputer_cleanup_f cleanup; + + /** + State to be used when calling this->out() and this->flush(), + namely: this->out(this->state, ... ) and + this->flush(this->state). + + Whether or not any given instance of this class owns the memory + pointed to by this member must be documented for their cleanup() + method. + + Because cmpp_outputer instances frequently need to be stashed and + unstashed via bitwise copying, it is illegal to replace this + pointer after its initial assignment. 
The object it points to may + be mutated freely, but this pointer must stay stable for the life + of this object. + */ + void * state; +}; + +/** + Empty-initialized cmpp_outputer instance, intended for + const-copy initialization. +*/ +#define cmpp_outputer_empty_m \ + {.name=NULL, .out = NULL,.flush = NULL, .cleanup = NULL, .state =NULL} + +/** + Empty-initialized cmpp_outputer instance, intended for + non-const-copy initialization. These copies can, for purposes of + cmpp's output API, be used as-is to have cmpp process its inputs + but generate no output. +*/ +CMPP_EXPORT const cmpp_outputer cmpp_outputer_empty; + +/** + If o->out is not NULL, the result of o->out(o->state,p,n) is + returned, else 0 is returned. +*/ +CMPP_EXPORT int cmpp_outputer_out(cmpp_outputer *o, void const *p, cmpp_size_t n); + +/** + If o->flush is not NULL, the result of o->flush(o->state) is + returned, else 0 is returned. +*/ +CMPP_EXPORT int cmpp_outputer_flush(cmpp_outputer *o); + +/** + If o->cleanup is not NULL, it is called, otherwise this is a no-op. +*/ +CMPP_EXPORT void cmpp_outputer_cleanup(cmpp_outputer *o); + +/** + A cmpp_outputer initializer which uses cmpp_flush_f_FILE(), + cmpp_output_f_FILE(), and cmpp_outputer_cleanup_f_FILE() for its + implementation. After copying this, the state member must be + pointed to an opened-for-writing (FILE*). +*/ +CMPP_EXPORT const cmpp_outputer cmpp_outputer_FILE; + +/** + The cmpp_outputer_cleanup_f() impl used by cmpp_outputer_FILE. If + self->state is not NULL then it is passed to fclose() (_unless_ it + is stdin, stdout, or stderr) and set to NULL. self->name is + also set to NULL. +*/ +CMPP_EXPORT void cmpp_outputer_cleanup_f_FILE(cmpp_outputer *self); + +/** + Sets pp's current directive delimiter to a copy of the + NUL-terminated zDelim. The delimiter is the sequence which starts + a line and distinguishes cmpp directives from other input, in the + same way that C preprocessors use '#' as a delimiter. 
+ + If zDelim is NULL then the default delimiter is used. The default + delimiter can be set when compiling the library by defining + CMPP_DEFAULT_DELIM to a quoted string value. + + zDelim is assumed to be in UTF-8 encoding. If any bytes in the + range (0,32) are found, CMPP_RC_MISUSE is returned and pp's + persistent error state is set. + + The delimiter must be short and syntactically unambiguous for the + intended inputs. It has a rather arbitrary maximum length of 12, + but it's difficult to envision it being remotely human-friendly + with a delimiter longer than 3 bytes. It's conceivable, but + seemingly far-fetched, that longer delimiters might be interesting + in some machine-generated cases, e.g. using a random sequence as + the delimiter. + + Returns 0 on success. Returns non-0 if called when the delimiter + stack is empty, if it cannot copy the string or zDelim is deemed + unsuitable for use as a delimiter. Calling this when the stack is + empty represents a serious API misuse (indicating that + cmpp_delimiter_pop() was used out of scope) and will trigger an + assert() in debug builds. Except for that last case, errors from + this function are recoverable (see cmpp_err_set()). +*/ +CMPP_EXPORT int cmpp_delimiter_set(cmpp *pp, char const *zDelim); + +/** + Fetches pp's current delimiter string, assigning it to *zDelim. + The string is owned by pp and will be invalidated by any call to + cmpp_delimiter_set() or the #delimiter script directive. + + If, by some odd usage constellation, this is called after an + allocation of the delimiter stack has failed, this will set *zDelim + to the compile-time-default delimiter. That "cannot happen" in normal use because such a failure + would have been reacted to and this would not be called. +*/ +CMPP_EXPORT void cmpp_delimiter_get(cmpp const *pp, char const **zDelim); + +/** + Pushes zDelim as the current directive delimiter. Returns 0 on + success and non-zero on error (invalid zDelim value or allocation + error). 
If this returns 0 then the caller is obligated to + eventually call cmpp_delimiter_pop() one time. If it returns non-0 + then they must _not_ call that function. + */ +CMPP_EXPORT int cmpp_delimiter_push(cmpp *pp, char const *zDelim); + +/** + Must be called one time for each successful call to + cmpp_delimiter_push(). It restores the directive delimiter to the + value it had when cmpp_delimiter_push() was last called. + + Returns pp's current error code, and will set it to non-0 if called + when no cmpp_delimiter_push() is active. Popping an empty stack + represents a serious API misuse and may fail an assert() in debug + builds. +*/ +CMPP_EXPORT int cmpp_delimiter_pop(cmpp *pp); + +/** + If z[*n] ends on a \n or \r\n pair, it/they are stripped, + *z is NUL-terminated there, and *n is adjusted downwards + by 1 or 2. Returns true if it chomped, else false. +*/ +CMPP_EXPORT bool cmpp_chomp(unsigned char * z, cmpp_size_t * n); + +/** + A basic memory buffer class. This is primarily used with + cmpp_outputer_b to capture arbitrary output for later use. + It's also used for incrementally creating dynamic strings. + + TODO: add the heuristic that an nAlloc of 0 with a non-NULL z + refers to externally-owned memory. This would change the + buffer-write APIs to automatically copy it before making any + changes. We have code for this in the trees this class derives + from, it just needs to be ported over. It would allow us to avoid + allocating in some cases where we need a buffer but it will always + (or commonly) be a copy of a static string, like a single space. +*/ +struct cmpp_b { + /** + This buffer's memory, owned by this object. This library exclusively + uses sqlite3_realloc() and friends for memory management. + + If this pointer is taken away from this object then it must + eventually be passed to cmpp_mfree(). 
+ */ + unsigned char * z; + /** + Number of bytes of this->z which are in use, not counting any + automatic NUL terminator which this class's APIs may add. + */ + cmpp_size_t n; + /** + Number of bytes allocated in this->z. + + Potential TODO: use a value of zero here, with a non-zero + this->n, to mean that this->z is owned elsewhere. This would + cause cmpp_b_append() to copy its original source before + appending. Similarly, cmpp_b_clear() would necessarily _not_ + free this->z. We've used that heuristic in a predecessor of this + class in another tree to good effect for years, but it's not + certain that we'd get the same level of utility out of that + capability as we do in that other project. + */ + cmpp_size_t nAlloc; + + /** + cmpp_b APIs which may fail will set this. Similarly, most + of the cmpp_b APIs become no-ops if this is non-0. + */ + int errCode; +}; + +typedef struct cmpp_b cmpp_b; + +/** + An empty-initialized cmpp_b struct for use in const-copy + initialization. +*/ +#define cmpp_b_empty_m {.z=0,.n=0,.nAlloc=0,.errCode=0} + +/** + An empty-initialized cmpp_b struct for use in non-const-copy + initialization. +*/ +extern const cmpp_b cmpp_b_empty; + +/** + Frees s->z and zeroes out s but does not free s. +*/ +CMPP_EXPORT void cmpp_b_clear(cmpp_b *s); + +/** + If s has content, s->n is set to 0 and s->z is NUL-terminated + at its first byte, else this is a no-op. s->errCode is + set to 0. Returns s. + */ +CMPP_EXPORT cmpp_b * cmpp_b_reuse(cmpp_b *s); + +/** + Swaps all contents of the given buffers, including their persistent + error code. +*/ +CMPP_EXPORT void cmpp_b_swap(cmpp_b * l, cmpp_b * r); + +/** + If s->errCode is 0 and s->nAlloc is less than n, s->z is + reallocated to have at least n bytes, else this is a no-op. Returns + 0 on success, CMPP_RC_OOM on error. +*/ +CMPP_EXPORT int cmpp_b_reserve(cmpp_b *s, cmpp_size_t n); + +/** + Works just like cmpp_b_reserve() but on allocation error it + updates pp's error state. 
+*/ +CMPP_EXPORT int cmpp_b_reserve3(cmpp * pp, cmpp_b * os, cmpp_size_t n); + +/** + Appends n bytes from src to os, reallocating os as necessary. + Returns 0 on success, CMPP_RC_OOM on allocation error. + + Errors from this function, and the other cmpp_b_append...() + variants, are recoverable (see cmpp_err_set()). +*/ +CMPP_EXPORT int cmpp_b_append(cmpp_b * os, void const *src, + cmpp_size_t n); + +/** + Works just like cmpp_b_append() but on allocation error it + updates pp's error state. +*/ +CMPP_EXPORT int cmpp_b_append4(cmpp * pp, + cmpp_b * os, + void const * src, + cmpp_size_t n); + +/** + Appends ch to the end of os->z, expanding as necessary, and + NUL-terminates os. Returns os->errCode and is a no-op if that is + non-0 when this is called. This is slightly more efficient than + passing length-1 strings to cmpp_b_append() _if_ os's memory + is pre-allocated with cmpp_b_reserve(), otherwise it may be + less efficient because it may need to allocate frequently if used + repeatedly. +*/ +CMPP_EXPORT int cmpp_b_append_ch(cmpp_b * os, char ch); + +/** + Appends a decimal string representation of d to os. Returns + os->errCode and is a no-op if that is non-0 when this is called. +*/ +CMPP_EXPORT int cmpp_b_append_i32(cmpp_b * os, int32_t d); + +/** int64_t counterpart of cmpp_b_append_i32(). */ +CMPP_EXPORT int cmpp_b_append_i64(cmpp_b * os, int64_t d); + +/** + A thin wrapper around cmpp_chomp() which chomps b->z. +*/ +CMPP_EXPORT bool cmpp_b_chomp(cmpp_b * b); + +/** + A cmpp_output_f() impl which requires that its first argument be a + (cmpp_b*) or be NULL. If buffer is not NULL then it appends n bytes + of src to buffer, reallocating as needed. Returns CMPP_RC_OOM on + reallocation error. On success it always NUL-terminates buffer->z. + A NULL buffer is treated as success but has no side effects. + + Example usage: + + ``` + cmpp_b os = cmpp_b_empty; + int rc = cmpp_stream(cmpp_input_f_FILE, stdin, + cmpp_output_f_b, &os); + ... 
+ cmpp_b_clear(&os); + ``` +*/ +CMPP_EXPORT int cmpp_output_f_b(void * buffer, void const * src, + cmpp_size_t n); + +/** + A cmpp_outputer_cleanup_f() implementation which requires that + self->state be either NULL or a cmpp_b pointer. This function + passes it to cmpp_b_clear(). It does _not_ set self->state or + self->name to NULL. +*/ +CMPP_EXPORT void cmpp_outputer_cleanup_f_b(cmpp_outputer *self); + +/** + A cmpp_outputer prototype which can be copied to use a dynamic + string buffer as an output source. Its state member must be set (by + the client) to a cmpp_b instance. Its out() method is + cmpp_output_f_b(). Its cleanup() method is + cmpp_outputer_cleanup_f_b(). It has no flush() method. +*/ +extern const cmpp_outputer cmpp_outputer_b; + +/** + Returns a string containing version information in an unspecified + format. +*/ +CMPP_EXPORT char const * cmpp_version(void); + +/** + Type IDs for directive lines and argument-parsing tokens. + + This is largely a historical artifact and work is underway + to factor this back out of the public API. +*/ +enum cmpp_tt { + +/** + X-macro which defines token types. It invokes E(X,Y) for each + entry, where X is the base name part of the token type and Y is the + token name as it appears in input scripts (if any, else it's 0). + + Maintenance reminder: their ordering in this map is insignificant + except that None must be first and must have the value 0. + + Some of the more significant ones are: + + - Word: an unquoted word-like token. + + - String: a quoted string. + + - StringAt: an @"..." string. + + - GroupParen, GroupBrace, GroupSquiggly: (), [], and {} + + - All which start with D_ are directives. D_Line is a transitional + state between "unparsed" and another D_... value. 
+*/ +#define cmpp_tt_map(E) \ + E(None, 0) \ + E(RawLine, 0) \ + E(Unknown, 0) \ + E(Word, 0) \ + E(Noop, 0) \ + E(Int, 0) \ + E(Null, 0) \ + E(String, 0) \ + E(StringAt, 0) \ + E(GroupParen, 0) \ + E(GroupBrace, 0) \ + E(GroupSquiggly,0) \ + E(OpEq, "=") \ + E(OpNeq, "!=") \ + E(OpLt, "<") \ + E(OpLe, "<=") \ + E(OpGt, ">") \ + E(OpGe, ">=") \ + E(ArrowR, "->") \ + E(ArrowL, "<-") \ + E(Plus, "+") \ + E(Minus, "-") \ + E(ShiftR, ">>") \ + E(ShiftL, "<<") \ + E(ShiftL3, "<<<") \ + E(OpNot, "not") \ + E(OpAnd, "and") \ + E(OpOr, "or") \ + E(OpDefined, "defined") \ + E(OpGlob, "glob") \ + E(OpNotGlob, "not glob") \ + E(AnyType, 0) \ + E(Eof, 0) + +#define E(N,TOK) cmpp_TT_ ## N, + cmpp_tt_map(E) +#undef E + /** Used by cmpp_d_register() to assign new IDs. */ + cmpp_TT__last +}; +typedef enum cmpp_tt cmpp_tt; + +/** + For all of the cmpp_tt enum entries, returns a string form of the + enum entry name, e.g. "cmpp_TT_D_If". Returns NULL for any other + values +*/ +CMPP_EXPORT char const * cmpp_tt_cstr(int tt); + +/** + Policies for how to handle undefined @tokens@ when performing + content filtering. +*/ +enum cmpp_atpol_e { + /** Sentinel value. */ + cmpp_atpol_invalid = -1, + /** Turn off @token@ parsing. */ + cmpp_atpol_OFF = 0, + /** Retain undefined @token@ - emit it as-is. */ + cmpp_atpol_RETAIN, + /** Elide undefined @token@. */ + cmpp_atpol_ELIDE, + /** Error for undefined @token@. */ + cmpp_atpol_ERROR, + /** A sentinel value for use with cmpp_dx_out_expand(). */ + cmpp_atpol_CURRENT, + /** + This isn't _really_ the default. It's the default for the + --@policy CLI flag and #@pragma when it's given no value. + */ + cmpp_atpol_DEFAULT_FOR_FLAG = cmpp_atpol_ERROR, + /** + The compile-time default for all cmpp instances. + */ + cmpp_atpol_DEFAULT = cmpp_atpol_OFF + +}; +typedef enum cmpp_atpol_e cmpp_atpol_e; + +/** + Policies describing how cmpp should react to attempts to use + undefined keys. +*/ +enum cmpp_unpol_e { + /* Sentinel. 
*/ + cmpp_unpol_invalid, + /** Treat undefined keys as NULL/falsy. This is the default. */ + cmpp_unpol_NULL, + /** Trigger an error for undefined keys. This should probably be the + default. */ + cmpp_unpol_ERROR, + /** + The compile-time default for all cmpp instances. + */ + cmpp_unpol_DEFAULT = cmpp_unpol_NULL +}; +typedef enum cmpp_unpol_e cmpp_unpol_e; + +typedef struct cmpp_arg cmpp_arg; +/** + A single argument for a directive. When a cmpp_d::flags has + cmpp_d_F_ARGS_V2 set then the part of the input immediately + following the directive (and on the same line) is parsed into a + cmpp_args, a container for these. +*/ +struct cmpp_arg { + /** Token type. */ + cmpp_tt ttype; + /** + The arg's string value, shorn of any opening/closing quotes or () + or {} or []. The args-parsing process guarantees to NUL-terminate + this. The bytes are typically owned by a cmpp_args object, but + clients may direct them wherever they need to, so long as the + bytes are valid longer than this object is. + */ + unsigned char const * z; + /** + The arg's effective length, in bytes, after opening/closing chars + are stripped. That is, its string form is the range [z,z+n). + */ + unsigned n; + /** + The next argument in the list. It is owned by whatever code set + it up (typically cmpp_args_parse()). + */ + cmpp_arg const * next; +}; + +/** + Empty-initialized cmpp_arg instance, intended for + const-copy initialization. +*/ +#define cmpp_arg_empty_m {cmpp_TT_None,0,0,0} + +/** + Empty-initialized cmpp_arg instance, intended for + non-const-copy initialization. +*/ +extern const cmpp_arg cmpp_arg_empty; + +typedef struct cmpp_dx cmpp_dx; +typedef struct cmpp_dx_pimpl cmpp_dx_pimpl; +typedef struct cmpp_d cmpp_d; + +/** + Flags for use with cmpp_d::flags. +*/ +enum cmpp_d_e { + /** Sentinel value. */ + cmpp_d_F_none = 0, + /** + cmpp_dx_next() will not parse the directive's arguments. 
Instead, + it makes cmpp_dx::arg0 encapsulate the whole line of the + directive (sans the directive's name) as a single argument. The + only transformation which is performed is the removal of + backslashes from backslash-escaped newlines. It is up to the + directive's callback to handle (or not) the arguments. + */ + cmpp_d_F_ARGS_RAW = 0x01, + + /** + cmpp_dx_next() will parse the directive's arguments. + cmpp_dx::arg0 will point to the first argument in the list, or + NULL if there are no arguments. + + If both cmpp_d_F_ARGS_LIST and cmpp_d_F_ARGS_RAW are specified, + cmpp_d_F_ARGS_LIST will win. + */ + cmpp_d_F_ARGS_LIST = 0x02, + + /** + Indicates that the directive should not be available if the cmpp + instance is configured with any of the cmpp_ctor_F_SAFEMODE flags. + All directives which do any of the following are obligated to + set this flag: + + - Filesystem or network access. + - Invoking external processes. + + Or anything else which might be deemed "security-relevant". + + When registering a directive which has both opener and closer + implementations, it is sufficient to set this only on the opener. + + The library enforces this flag in the following places: + + - Registration of a directive with this flag will fail if + cmpp_is_safemode() is true for that cmpp instance. + + - cmpp_dx_process() will refuse to invoke a directive with this + flag when cmpp_is_safemode() is true. + */ + cmpp_d_F_NOT_IN_SAFEMODE = 0x04, + + /** + Call-only directives are only usable in [directive ...] "call" + contexts. They are not permitted to have a closing directive. + */ + cmpp_d_F_CALL_ONLY = 0x08, + /** + Indicates that the directive is incapable of working in a [call] + context and an error should be triggered if it is. _Most_ + directives which have a closing directive should have this + flag. The exceptions are directives which only conditionally use + a closing directive, like #query.
+ */ + cmpp_d_F_NO_CALL = 0x10, + + /** + Mask of the client-usable range for this enum. Values outside of + this mask are reserved for internal use and will be stripped from + registrations made with cmpp_d_register(). + */ + cmpp_d_F_MASK = 0x0000ffff + +}; + +/** + Callback type for cmpp_d::impl::callback(). cmpp directives are all + implemented as functions with this signature. Implementations are + called only by cmpp_dx_process() (and only after cmpp_dx_next() has + found a preprocessor line), passed the current context object. + These callbacks are only ever passed directives which were + specifically registered with them (see cmpp_d_register()). + + The first rule of callback is: to report errors (all of which end + processing of the current input) call cmpp_dx_err_set(), passing it + the callback's only argument, then clean up any local resources, + then return. The library will recognize the error and propagate it. + + dx's memory is only valid for the duration of this call. It must + not be held on to longer than that. dx->args.arg0 has slightly different + lifetime: if this callback does _not_ call back in to + cmpp_dx_next() then dx->args.arg0 and its neighbors will survive until + this call is completed. Calling cmpp_dx_next(), or any API which + invokes it, invalidates dx->args.arg0's memory. Thus directives which + call into that must _copy_ any data they need from their own + arguments before doing so, as their arguments list will be + invalidated. +*/ +typedef void (*cmpp_dx_f)(cmpp_dx * dx); + +/** + A typedef for generic deallocation routines. +*/ +typedef void (*cmpp_finalizer_f)(void *); + +/** + State specific to concrete cmpp_d implementations. + + TODO: move this, except for the state pointer, out of cmpp_d + so that directives cannot invoke these callbacks directly. Getting + that to work requires moving the builtin directives into the + dynamic directives list. +*/ +struct cmpp_d_impl { + /** + Callback func. 
If any API other than cmpp_dx_process() invokes + this, behavior is undefined. + */ + cmpp_dx_f callback; + + /** + For custom directives with a non-NULL this->state, this will be + called, and passed that object, when the directive is cleaned + up. For directives with both an opening and a closing tag, this + destructor is only attached to the opening tag. + + If any API other than cmpp's internal cleanup routines invokes + this, behavior is undefined. + */ + cmpp_finalizer_f dtor; + + /** + State for the directive's callback. It is accessible in + cmpp_dx_f() impls via theDx->d->impl.state. For custom + directives with both an opening and closing directive, this + same state object gets assigned to both. + */ + void * state; +}; +typedef struct cmpp_d_impl cmpp_d_impl; +#define cmpp_d_impl_empty_m {0,0,0} + +/** + Each c-pp "directive" is modeled by one of these. +*/ +struct cmpp_d { + + struct { + /** + The directive's name, as it must appear after the directive + delimiter. Its bytes are assumed to be static or otherwise + outlive this object. + */ + const char *z; + /** Byte length of this->z. We record this to speed up searches. */ + unsigned n; + } name; + + /** + Bitmask of flags from cmpp_d_e plus possibly internal flags. + */ + cmpp_flag32_t flags; + + /** + The directive which acts as this directive's closing element, + or 0 if it has none. + */ + cmpp_d const * closer; + + /** + State specific to concrete implementations. + */ + cmpp_d_impl impl; +}; + +/** + Each instance of the cmpp_dx class (a.k.a. "directive context") + manages a single input source. It's responsible for the + tokenization of all input, locating directives, and processing + ("running") directives. The directive-specific work happens in + cmpp_dx_f() implementations, and this class internally manages the + setup, input traversal, and teardown. + + These objects only exist while cmpp is actively processing + input.
Client code interacts with them only through cmpp_dx_f() + implementations which the library invokes. + + The process of filtering input to look for directives is to call + cmpp_dx_next() until it indicates either an error or that a + directive was found. In the latter case, the cmpp_dx object is + populated with info about the current directive. cmpp_dx_process() + will run that directive, but cmpp_dx_f() implementations sometimes + need to make decisions based on the located directive before doing + so (and sometimes they need to skip running it). + + If cmpp_dx_next() finds no directive, the end of the input has been + reached and there is no further output to generate. + + Content encountered before a directive is found is passed on to the + output stream via cmpp_dx_out_raw() or cmpp_dx_out_expand(). +*/ +struct cmpp_dx { + /** + The cmpp object which owns this context. + */ + cmpp * const pp; + + /** + The directive on whose behalf this context is active. + */ + cmpp_d const *d; + + /** + Name of the input for error reporting. Typically an input script + file name, but it need not refer to a file. + */ + unsigned const char * const sourceName; + + /** + State related to arguments passed to the current directive. + + It is important to keep in mind that the memory for the members + of this sub-struct may be modified or reallocated + (i.e. invalidated) by any APIs which call in to cmpp_dx_next(). + cmpp_dx_f() implementations must take care not to use any of this + memory after calling into that function, cmpp_dx_consume(), or + similar. If needed, it must be copied (e.g. using + cmpp_args_clone() to create a local copy of the parsed + arguments). + */ + struct { + /** + Starting byte of unparsed arguments. This is for cmpp_dx_f() + implementations which need custom argument parsing. + */ + unsigned const char * z; + + /** + The byte length of z. + */ + cmpp_size_t nz; + + /** + The parsed arg count for the this->arg0 list.
+ */ + unsigned argc; + + /** + The first parsed arg or NULL. How this is set up is affected by + cmpp_d::flags. + + This is specifically _NOT_ defined as a sequential array and + using pointer math to traverse it invokes undefined behavior. + + To traverse the list: + + for( cmpp_arg const *a = dx->args.arg0; a; a=a->next ){ + ... + } + */ + cmpp_arg const * arg0; + } args; + + /** + Private impl details. + */ + cmpp_dx_pimpl * const pimpl; +}; + +/** + Thin proxy for cmpp_err_set(), replacing only the first argument. +*/ +CMPP_EXPORT int cmpp_dx_err_set(cmpp_dx *dx, int rc, + char const *zFmt, ...); + + +/** + Returns true if dx's current call into the API is the result + of a function call, else false. Any APIs which recurse into + input processing will reset this to false, so it needs to be + evaluated before doing any such work. + + Design note: this flag is actually tied to dx's arguments, which + get reset by APIs which consume from the input stream. +*/ +CMPP_EXPORT bool cmpp_dx_is_call(cmpp_dx * const dx); + +/** + Returns true if dx->pp has error state, else false. If this + function returns true, cmpp_dx_f() implementations are required to + stop working, clean up any local resources, and return. Continuing + to use dx when it's in an error state may exacerbate the problem. +*/ +#define cmpp_dx_err_check(DX) (DX)->pp->api->err_has((DX)->pp) + +/** + Scans dx to the next directive line, emitting all input before that + which is _not_ a directive line to dx->pp's output channel unless + it's elided due to being inside a block which elides its content + (e.g. #if). + + Returns 0 if no errors were triggered, else a cmpp_rc_e code. This + is a no-op if dx->pp has persistent error state set, and that error + code is returned. + + If it returns 0 then it sets *pGotOne to true if a directive was + found and false if not (in which case the end of the input has + been reached and further calls to this function for the same input + source will be no-ops). 
If it sets *pGotOne to true then it also + sets up dx's state for use with cmpp_dx_process(), which should + (normally) then be called. + + ACHTUNG: calling this resets any argument-handling-related state of + dx. That is important for cmpp_dx_f() implementations, which _must + not_ hold copies of any pointers from dx->args.arg0 or dx->args.z + beyond a call to this function. Any state they need must be + evaluated, potentially copied, before calling this function. +*/ +CMPP_EXPORT int cmpp_dx_next(cmpp_dx * dx, bool * pGotOne); + +/** + This is only legal to call immediately after a successful call to + cmpp_dx_next(). It requires that cmpp_dx_next() has just located + the next directive. This function runs that directive. Returns 0 + on success and all that. + + Design note: directive-search and directive-process are two + distinctly separate steps because directives which have both + open/closing tags frequently discard the closing directive without + running it (it exists to tell the directive how far to read). Those + closing directives exist independently, though, and will trigger + errors when encountered outside of the context of their opening + directive tag (e.g. an "#/if" without an "#if"). +*/ +CMPP_EXPORT int cmpp_dx_process(cmpp_dx * dx); + +/** + A bitmask of flags for use with cmpp_dx_consume(). +*/ +enum cmpp_dx_consume_e { + /** + Tells cmpp_dx_consume() to process any directives it encounters + which are not in the specified set of closing directives. Its + default is to fail if another directive is seen. + */ + cmpp_dx_consume_F_PROCESS_OTHER_D = 0x01, + /** + Tells cmpp_dx_consume() that non-directive content encountered + before the designated closing directive(s) must use an at-policy + of cmpp_atpol_OFF. That is: the output target of that function will + get the raw, unfiltered content. This is for cases where the + consumer will later re-emit that content, delaying @token@ + parsing until a later step (e.g. #query does this).
+ + This may misinteract in unpredictable ways when used with + cmpp_dx_consume_F_PROCESS_OTHER_D. Please report them as bugs. + */ + cmpp_dx_consume_F_RAW = 0x02 +}; + +/** + A helper for cmpp_dx_f() implementations which read in their + blocked-off content instead of passing it through the normal output + channel, e.g. `#define x <<` stores that content in a define named + "x". + + This function runs a cmpp_dx_next() loop which does the following: + + If the given output channel is not NULL then it first replaces the + output channel with the given one, such that all output which would + normally be produced will be sent there until this function + returns, at which point the output channel is restored. If the + given channel is NULL then output is not captured - it instead goes + to dx's current output channel. + + dClosers must be a list of legal closing tags nClosers entries + long. Typically this is the single closing directive/tag of the + current directive, available to the opening directive's cmpp_dx_f() + impl via dx->d->closer. Some special cases require multiple + candidates, however. + + The flags argument may be 0 or a bitmask of values from the + cmpp_dx_consume_e enum. + + If flags does not have the cmpp_dx_consume_F_PROCESS_OTHER_D bit set + then this function requires that the next directive in the input be + one specified by dClosers. If the next directive is not one of + those, it will fail with code CMPP_RC_SYNTAX. + + If flags has the cmpp_dx_consume_F_PROCESS_OTHER_D bit set then it + will continue to search for and process directives until one of + the dClosers directives is found. Calling into other directives will + invalidate certain state that a cmpp_dx_f() has access to - see + below for details. If none of dClosers is found before EOF, a + CMPP_RC_SYNTAX error is triggered. + + Once one of dClosers is found, this function returns with dx->d + referring to that directive.
In practice, the caller should + _not_ call cmpp_dx_process() at that point - the closing directive + is typically a no-op placeholder which exists only to mark the end + of the block. If the closer has work to do, however, the caller of + this function should call cmpp_dx_process() at that point. + + On success it returns 0, the input stream will have been consumed + between the directive dx and its closing tag, and dx->d will point + to the new directive. If os is not NULL then os will have been + sent any content. + + On error, processing of the directive must end immediately, + returning from the cmpp_dx_f() impl after cleaning up any local + resources. + + ACHTUNG: since this invokes cmpp_dx_next(), it invalidates + dx->args.arg0. Its dx->d is also replaced but the previous value + remains valid until the cmpp instance is cleaned up. + + Example from the context of a cmpp_dx_f() implementation: + + ``` + // "dx" is the cmpp_dx arg to this function + cmpp_outputer oss = cmpp_outputer_b; + cmpp_b os = cmpp_b_empty; + oss.state = &os; + // One candidate closer: pass its address and a count of 1. + if( 0==cmpp_dx_consume(dx, &oss, &dx->d->closer, 1, 0) ){ + cmpp_b_chomp( &os ); + ... maybe modify the buffer or decorate the output in some way... + cmpp_dx_out_raw(dx, os.z, os.n); + } + cmpp_b_clear(&os); + ``` + + Design issue: this API does not currently have a way to handle + directives which have multiple potential waypoints/endpoints, in + the way that an #if may optionally have an #elif or #else before + the #/if. Such processing has to be done in the directive's + impl. +*/ +CMPP_EXPORT int cmpp_dx_consume(cmpp_dx * dx, cmpp_outputer * os, + cmpp_d const *const * dClosers, + unsigned nClosers, + cmpp_flag32_t flags); + +/** + Equivalent to cmpp_dx_consume(), capturing to the given buffer + instead of a cmpp_outputer object.
+*/ +CMPP_EXPORT int cmpp_dx_consume_b(cmpp_dx * dx, cmpp_b * b, + cmpp_d const * const * dClosers, + unsigned nClosers, + cmpp_flag32_t flags); + +/** + If arg is not NULL, cleans up any resources owned by + arg but does not free arg. + + As of this writing, they own none and some code still requires + that. That is Olde Thynking, though. +*/ +CMPP_EXPORT void cmpp_arg_cleanup(cmpp_arg *arg); + +/** + If arg is not NULL, resets arg to be re-used. arg must have + initially been cleanly initialized by copying cmpp_arg_empty (or + equivalent, i.e. zeroing it out). +*/ +CMPP_EXPORT void cmpp_arg_reuse(cmpp_arg *arg); + +/** + This is the core argument-parsing function used by the library's + provided directives. It is available in the public API as a + convenience for custom cmpp_dx_f() implementations, but custom + implementations are not required to make use of it. + + Populates a cmpp_arg object by parsing the next token from its + input source. + + Expects *pzIn to point to the start of input for parsing arguments + and zInEnd to be the logical EOF of that range. This function + populates pOut with the info of the parse. Returns 0 on success, + non-0 (and updates pp's error state) on error. + + Output (the parsed token) is written to *pzOut. zOutEnd must be the + logical EOF of *pzOut. *pzOut needs to be, at most, + (zInEnd-*pzIn)+1 bytes long. This function range checks the output + and will not write to or past zOutEnd, but hitting that limit + triggers a CMPP_RC_RANGE error. + + On success, *pzIn will be set to 1 byte after the last one parsed + for pOut and *pzOut will be set to one byte after the final output + (NUL-terminated). pOut->z will point to the start of *pzOut and + pOut->n will be set to the byte-length of pOut->z. + + When the end of the input is reached, this function returns 0 + and sets pOut->ttype to cmpp_TT_Eof. + + General tokenization rules: + + Tokens come in the following flavors: + + - Quoted strings: single- or double-quoted.
cmpp_arg::ttype: + cmpp_TT_String. + + - "At-strings": @"..." and @'...'. cmpp_arg::ttype value: + cmpp_TT_StringAt. + + - Decimal integers with an optional sign. cmpp_arg::ttype value: + cmpp_TT_Int. + + - Groups: (...), {...}, and [...]. cmpp_arg::ttype values: + cmpp_TT_GroupParen, cmpp_TT_GroupSquiggly, and + cmpp_TT_GroupBrace. These types do not automatically get parsed + recursively. To recurse into one of these, pass cmpp_arg_parse() + the grouping argument's bytes as the input range. + + - Word: anything which doesn't look like one of these above. Token + type IDs: cmpp_TT_Word. These are most often interpreted as + #define keys but cmpp_dx_f() implementations sometimes treat + them as literal values. + + - A small subset of words and operator-like tokens, e.g. '=' and + '!=', get a very specific ttype, e.g. cmpp_TT_OpNeq, but these + can generally be treated as strings. + + - Outside of strings and groups, spaces, tabs, carriage-returns, + and newlines are skipped. + + These are explained in more detail in the user's manual + (a.k.a. README.md). + + There are many other token types, mostly used internally. + + This function supports _no_ backslash escape sequences in + tokens. All backslashes, with the obligatory exception of those + which make up backslash-escaped newlines in the input stream, are + retained as-is in all token types. That means, for example, that + strings may not contain their own quote character. + + As an example of where this function is useful: cmpp_dx_f() + implementations which need to selectively parse a subset of the + directive's arguments can use this. As input, their dx argument's + args.z and args.nz members delineate the current directive line's + arguments. See c-pp.c:cmpp_dx_f_pipe() for an example. 
+*/ +CMPP_EXPORT int cmpp_arg_parse(cmpp_dx * dx, + cmpp_arg *pOut, + unsigned char const **pzIn, + unsigned char const *zInEnd, + unsigned char ** pzOut, + unsigned char const * zOutEnd); + +/** + True if (cmpp_arg const *)ARG's contents match the string literal + STR, else false. +*/ +#define cmpp_arg_equals(ARG,STR) \ + (sizeof(STR)-1==(ARG)->n && 0==memcmp(STR,(ARG)->z,sizeof(STR)-1)) + +/** + True if (cmpp_arg const *)ARG's contents match the string literal + STR or ("-" STR), else false. The intent is that "-flag" be passed + here to tolerantly accept either "-flag" or "--flag". +*/ +#define cmpp_arg_isflag(ARG,STR) \ + (cmpp_arg_equals(ARG,STR) || cmpp_arg_equals(ARG, "-" STR)) + +/** + Creates a copy of arg->z. If allocation fails then pp's persistent + error code is set to CMPP_RC_OOM. If pp's error code is not 0 when + this is called then this is a no-op and returns NULL. In other + words, if this function returns NULL, pp's error state was either + already set when this was called or it was set because allocation + failed. + + Ownership of the returned memory is transferred to the caller, who + must eventually free it using cmpp_mfree(). +*/ +CMPP_EXPORT char * cmpp_arg_strdup(cmpp *pp, cmpp_arg const *arg); + +/** + Flag bitmasks for use with cmpp_arg_to_b(). With my apologies + for the long names (but consistency calls for them). +*/ +enum cmpp_arg_to_b_e { + /** + Specifies that the argument's string value should be used as-is, + rather than expanding it (if the arg's ttype would normally cause + it to be expanded). + */ + cmpp_arg_to_b_F_FORCE_STRING = 0x01, + + /** + Tells cmpp_arg_to_b() to not expand arguments with type + cmpp_TT_Word, which it normally treats as define keys. It instead + treats these as strings. + */ + cmpp_arg_to_b_F_NO_DEFINES = 0x02, + + /** + If set, arguments with a ttype of cmpp_TT_GroupBrace will be + "called" by passing them to cmpp_call_arg().
The space-trimmed + result of the call becomes the output of the cmpp_arg_to_b() + call. + + FIXME: make this opt-out instead of opt-in. We end up _almost_ + always wanting this. + */ + cmpp_arg_to_b_F_BRACE_CALL = 0x04, + + /** + Explicitly disable [call] expansion even if + cmpp_arg_to_b_F_BRACE_CALL is set in the flags. + */ + cmpp_arg_to_b_F_NO_BRACE_CALL = 0x08 + + /** + TODO? cmpp_arg_to_b_F_UNESCAPE + */ +}; + +/** + Appends some form of arg to the given buffer. + + arg->ttype values of cmpp_TT_Word (define keys) and + cmpp_TT_StringAt cause the value to be expanded appropriately (the + latter according to dx->pp's current at-policy). Others get emitted + as-is. + + The flags argument influences the expansion decisions, as documented + in the cmpp_arg_to_b_e enum. + + Returns 0 on success and all that. + + See: cmpp_atpol_get(), cmpp_atpol_set() + + Reminder to self: though this function may, via script-side + function call resolution, recurse into the library, any such + recursion gets its own cmpp_dx instance. In this context that's + significant because it means this call won't invalidate arg's + memory like cmpp_dx_consume() or cmpp_dx_next() can (depending on + where args came from - typically it's owned by dx but + cmpp_args_clone() exists solely to work around such potential + invalidation). +*/ +CMPP_EXPORT int cmpp_arg_to_b(cmpp_dx * dx, cmpp_arg const *arg, + cmpp_b * os, cmpp_flag32_t flags); + +/** + Flags for use with cmpp_call_str() and friends. +*/ +enum cmpp_call_e { + /** Do not trim a newline from the result. */ + cmpp_call_F_NO_TRIM = 0x01, + /** Trim all leading and trailing space and newlines + from the result. 
*/ + cmpp_call_F_TRIM_ALL = 0x02 +}; + +/** + This assumes that arg->z holds a "callable" directive + string in the form: + + directiveName ...args + + This function composes a new cmpp input source from that line + (prefixed with dx's current directive prefix if it's not already + got one), processes it with cmpp_process_string(), redirecting the + output to dest (which gets appended to, so be sure to + cmpp_b_reuse() it if needed before calling this). + + To simplify common expected usage, by default the output is trimmed + of a single newline. The flags argument, 0 or a bitmask of values + from the cmpp_call_e enum, can be used to modify that behavior. + + This is the basis of "function calls" in cmpp. + + Returns 0 on success. +*/ +int cmpp_call_str(cmpp *dx, + unsigned char const * z, + cmpp_ssize_t n, + cmpp_b * dest, + cmpp_flag32_t flags); + +/** + Convert an errno value to a cmpp_rc_e approximation, defaulting to + dflt if no known match is found. This is intended for use by + cmpp_dx_f implementations which use errno-using APIs. +*/ +CMPP_EXPORT int cmpp_errno_rc(int errNo, int dflt); + +/** + Configuration object for use with cmpp_d_register(). +*/ +struct cmpp_d_reg { + /** + The name of the directive as it will be used in + input scripts, e.g. "mydirective". It will be copied by + cmpp_d_register(). + */ + char const *name; + /** + A combination of bits from the cmpp_d_e enum. + + These flags are currently applied only to this->opener. + this->closer, because of how it's typically used, assumes + */ + struct { + /** + Callback for the directive's opening tag. + */ + cmpp_dx_f f; + /** + Flags from cmpp_d_e. Typically one of cmpp_d_F_ARGS_LIST or + cmpp_d_F_ARGS_RAW. + */ + cmpp_flag32_t flags; + } opener; + struct { + /** + Callback for the directive's closing tag, if any. + + This is only relevant for directives which have both an open and + a closing tag (even if that closing tag is only needed in some + contexts, e.g. 
"#define X <<" (with a closer) vs "#define X Y" + (without)). See cmpp_dx_f_dangling_closer() for a default + implementation which triggers an error if it's seen in the input + and not consumed by its counterpart opening directive. That + implementation has proven useful for #define, #pipe, and friends. + + Design notes: it's as yet unclear how to model, in the public + interface, directives which have a whole family of cooperating + directives, namely #if/#elif/#else. + */ + cmpp_dx_f f; + /** + Flags from cmpp_d_e. For closers this can typically be + left at 0. + */ + cmpp_flag32_t flags; + } closer; + /** + If not NULL then it is assigned to the directive's opener part + and will be called by the library in either of the following + cases: + + - When the custom directive is cleaned up. + + - If cmpp_d_register() fails (returns non-0), regardless of how + it fails. + + It is passed this->state. + */ + cmpp_finalizer_f dtor; + /** + Implementation state for the callbacks. + */ + void * state; +}; +typedef struct cmpp_d_reg cmpp_d_reg; +/** + Empty-initialized cmpp_d_reg instance, intended for const-copy + initialization. +*/ +#define cmpp_d_reg_empty_m {0,{0,0},{0,0},0,0} +/** + Empty-initialized instance, intended for non-const-copy + initialization. +*/ +//extern const cmpp_d_reg cmpp_d_reg_empty; + +/** + Registers a new directive, or a pair of opening/closing directives, + with pp. + + The semantics of r's members are documented in the cmpp_d_reg + class. r->name and r->opener.f are required. The remainder may be + 0/NULL. Its members are copied - r need not live longer than this + call. + + When the new directive is seen in a script, r->opener.f() will be + called. If the closing directive (if any) is seen in a script, + r->closer.f() is called. In both cases, the callback + implementation can get access to the r->state object via + cmdd_dx::d::impl::state (a.k.a dx->d->impl.state). 
+ + If r->closer.f is not NULL then the closing directive will be named + "/${zName}". (Design note: it is thought that forcing a common + end-directive syntax will lead to fewer issues than allowing + free-form closing tag names, e.g. fewer chances of a name collision + or not quite remembering the spelling of a given closing tag + (#endef vs #enddefine vs #/define).) + + Returns 0 on success and updates pp's error state on error. Similarly, + this is a no-op if pp has an error code when this is called, in which + case it returns that result code without other side-effects. + + On success, if pOut is not NULL then it is set to the directive + pointer, memory owned by pp until it cleans up its directives. This + is the only place in the API a non-const pointer to a directive can + be found, and it is provided only for very specific use-cases where + a directive needs to be manipulated (carefully) after + registration[^post-reg-manipulation]. If this function also + registered a closing directive, it is available as (*pOut)->closer. + pOut should normally be NULL. + + Failure modes include: + + - Returns CMPP_RC_RANGE if zName is not legal for use as a + directive name. See cmpp_is_legal_key(). + + - Returns CMPP_RC_OOM on an allocation error. + + Errors from this function are recoverable (see cmpp_err_set()). A + failed registration, even one which doesn't fail until the + registration of the closing element, will leave pp in a + well-defined state (with neither of r's directives being + registered). + + [^post-reg-manipulation]: The one known use case is the #if family + of directives, all of which use the same #/if closing + directive. The public registration API does not account for sharing + of closers that way, and whether it _should_ is still TBD. The + workaround, for this case, is to get the directives as they're + registered and point the cmpp_d::closer of each of #if, #elif, and + #else to #/if.
+*/ +CMPP_EXPORT int cmpp_d_register(cmpp * pp, cmpp_d_reg const * r, + cmpp_d **pOut); + +/** + A cmpp_dx_f() impl which is intended to be used as a callback for + directive closing tags for directives in which the opening tag's + implementation consumes the input up to the closing tag. This impl + triggers an error if called, indicating that the directive closing + was seen in the input without its accompanying directive opening. +*/ +CMPP_EXPORT void cmpp_dx_f_dangling_closer(cmpp_dx *dx); + +/** + Writes the first n bytes of z to dx->pp's current output channel + without performing any @token@ parsing. + + Returns dx->pp's persistent error code (0 on success) and sets that + code to non-0 on error. This is a no-op if dx->pp has a non-0 error + state, returning that code. + + See: cmpp_dx_out_expand() +*/ +CMPP_EXPORT int cmpp_dx_out_raw(cmpp_dx * dx, void const *z, + cmpp_size_t n); + +/** + Sends [zFrom,zFrom+n) to pOut, performing @token@ expansion if the + given policy says to (else it passes the content through as-is, as + per cmpp_dx_out_raw()). A policy of cmpp_atpol_CURRENT uses dx->pp's + current policy. A policy of cmpp_atpol_OFF behaves exactly like + cmpp_dx_out_raw(). + + Returns dx->pp's persistent error code (0 on success) and sets that + code to non-0 on error. This is a no-op if dx->pp has a non-0 error + state, returning that code. + + If pOut is NULL then dx->pp's default channel is used, with the + caveat that atPolicy's only legal value in that case is + cmpp_atpol_CURRENT. (The internals do not allow the at-policy to be + overridden for that particular output channel, to avoid accidental + filtering when it's not enabled. They do not impose that + restriction for other output channels, which are frequently used + for filtering intermediary results.) + + See: cmpp_dx_out_raw() + + Notes regarding how this is used internally: + + - This function currently specifically does nothing when invoked in + skip-mode[^1]. 
Hypothetically it cannot ever be called in skip-mode + except when evaluating #elif expressions (previous #if/#elifs + having failed and put us in skip-mode), where it's expanding + expression operands. That part currently (as of 2025-10-21) uses + dx->pp's current policy, and it's not clear whether that is + sufficient or whether we need to force it to expand (and which + policy to use when doing so). We could possibly get away with + always using cmpp_atpol_ERROR for purposes of evaluating at-string + expression operands. + + [^1]: Skip-mode is the internal mechanism which keeps directives + from running, and content from being emitted, within a falsy branch + of an #if/#elif block. Only flow-control directives are ever run + when skip-mode is active, and client-provided directives cannot + easily provide flow-control support. Ergo, much of this paragraph + is not relevant for client-level code, but it is for this library's + own use of this function. +*/ +CMPP_EXPORT int cmpp_dx_out_expand(cmpp_dx const * dx, + cmpp_outputer * pOut, + unsigned char const * zFrom, + cmpp_size_t n, + cmpp_atpol_e policy); + +/** + This creates a formatted string using sqlite3_mprintf() and emits it + using cmpp_dx_out_raw(). Returns CMPP_RC_OOM if allocation of the + string fails, else it returns whatever cmpp_dx_out_raw() returns. + + This is a no-op if dx->pp is in an error state, returning + that code. +*/ +CMPP_EXPORT int cmpp_dx_outf(cmpp_dx *dx, char const *zFmt, ...); + +/** + Convenience form of cmpp_delimiter_get() which returns the + delimiter which was active at the time when the currently-running + cmpp_dx_f() was called. This memory may be invalidated by any calls + into cmpp_dx_process() or cmpp_delimiter_set(), so a copy of this + pointer must not be retained past such a point. + + This function is primarily intended for use in generating debug and + error messages. + + If the delimiter stack is empty, this function returns NULL. 
+*/ +CMPP_EXPORT char const * cmpp_dx_delim(cmpp_dx const *dx); + +/** + Borrows a buffer from pp's buffer recycling pool, allocating one if + needed. It returns NULL only on allocation error, in which case it + updates pp's error state. + + This transfers ownership of the buffer to the caller, who is + obligated to eventually do ONE of the following: + + - Pass it to cmpp_b_return() with the same dx argument. + + - Pass it to cmpp_b_clear() then cmpp_mfree(). + + The purpose of this function is a memory reuse optimization. Most + directives, and many internals, need to use buffers for something + or other and this gives them a way to reuse buffers. + + Potential TODO: How this pool optimizes (or not) buffer allotment + is an internal detail. Maybe add an argument which provides a hint + about the buffer usage. e.g. argument-conversion buffers are + normally small but block content buffers can be arbitrarily large. +*/ +CMPP_EXPORT cmpp_b * cmpp_b_borrow(cmpp *dx); + +/** + Returns a buffer borrowed from cmpp_b_borrow(), transferring + ownership back to pp. Passing a non-NULL b which was not returned + by cmpp_b_borrow() invokes undefined behavior (possibly delayed + until the list is cleaned up). To simplify usage, b may be NULL. + + After calling this, b must be considered "freed" - it must not be + used again. This function is free (as it were) to immediately free + the object's memory instead of recycling it. +*/ +CMPP_EXPORT void cmpp_b_return(cmpp *dx, cmpp_b *b); + +/** + If NUL-terminated z matches one of the strings listed below, its + corresponding cmpp_atpol_e entry is returned, else + cmpp_atpol_invalid is returned. + + If pp is not NULL then (A) this also sets its current at-policy and + (B) it recognizes an additional string (see below). In this case, + if z is not a valid string then pp's persistent error state is set. + + Its accepted values each correspond to a like-named policy value: + + - "off" (the default): no processing of `@` is performed. 
+ + - "error": fail if an undefined `X` is referenced in @token@ + parsing. + + - "retain": emit any unresolved `@X@` tokens as-is to the output + stream. i.e. `@X@` renders as `@X@`. + + - "elide": omit unresolved `@X@` from the output, as if their values + were empty. i.e. `@X@` renders as an empty string, i.e. is not + emitted at all. + + - "current": if pp!=NULL then it returns the current policy, else + this string resolves to cmpp_atpol_invalid. +*/ +CMPP_EXPORT cmpp_atpol_e cmpp_atpol_from_str(cmpp * pp, char const *z); + +/** + Returns pp's current at-token policy. +*/ +CMPP_EXPORT cmpp_atpol_e cmpp_atpol_get(cmpp const * const pp); + +/** + Sets pp's current at-token policy. Returns 0 if pol is valid, else + it updates pp's error state and returns CMPP_RC_RANGE. This is a + no-op if pp has error state, returning that code instead. + + The policy cmpp_atpol_CURRENT is a no-op, permitted to simplify + certain client-side usage. +*/ +CMPP_EXPORT int cmpp_atpol_set(cmpp * const pp, cmpp_atpol_e pol); + +/** + Pushes pol as the current at-policy. Returns 0 on success and + non-zero on error (bad pol value or allocation error). If this + returns 0 then the caller is obligated to eventually call + cmpp_atpol_pop() one time. If it returns non-0 then they _must not_ + call that function. +*/ +CMPP_EXPORT int cmpp_atpol_push(cmpp *pp, cmpp_atpol_e pol); + +/** + Must be called one time for each successful call to + cmpp_atpol_push(). It restores the at-policy to the value it + had when cmpp_atpol_push() was last called. + + If called when no cmpp_atpol_push() is active then debug builds + will fail an assert(), else pp's error state is updated if it has + none already. +*/ +CMPP_EXPORT void cmpp_atpol_pop(cmpp *pp); + +/** + The cmpp_unpol_e counterpart of cmpp_atpol_from_str(). It + behaves identically, just for a different policy group with + different names. + + Its accepted values are: "null" and "error". 
The value "current" is + only legal if pp!=NULL, else it resolves to cmpp_unpol_invalid. +*/ +CMPP_EXPORT cmpp_unpol_e cmpp_unpol_from_str(cmpp * pp, char const *z); + +/** + Returns pp's current policy regarding use of undefined define keys. +*/ +CMPP_EXPORT cmpp_unpol_e cmpp_unpol_get(cmpp const * const pp); + +/** + Sets pp's current policy regarding use of undefined define keys. + Returns 0 if pol is valid, else it updates pp's error state and + returns CMPP_RC_RANGE. +*/ +CMPP_EXPORT int cmpp_unpol_set(cmpp * const pp, cmpp_unpol_e pol); + +/** + The undefined-policy counterpart of cmpp_atpol_push(). +*/ +CMPP_EXPORT int cmpp_unpol_push(cmpp *pp, cmpp_unpol_e pol); + +/** + The undefined-policy counterpart of cmpp_atpol_pop(). +*/ +CMPP_EXPORT void cmpp_unpol_pop(cmpp *pp); + +/** + The at-token counterpart of cmpp_delimiter_get(). This sets *zOpen + (if zOpen is not NULL) to the opening delimiter and *zClose (if + zClose is not NULL) to the closing delimiter. The memory is owned + by pp and may be invalidated by any calls to cmpp_atdelim_set(), + cmpp_atdelim_push(), or any APIs which consume input. Each string + is NUL-terminated and must be copied by the caller if they need + these strings past a point where they might be invalidated. + + If called when the delimiter stack is empty, debug builds will + fail an assert() and non-debug builds will behave as if the stack + contains the compile-time default delimiters. +*/ +CMPP_EXPORT void cmpp_atdelim_get(cmpp const * pp, + char const **zOpen, + char const **zClose); +/** + The `@token@`-delimiter counterpart of cmpp_delimiter_set(). + + This sets the delimiter for `@token@` content to the given opening + and closing strings (which the library makes a copy of). If zOpen + is NULL then the compile-time default is assumed. If zClose is NULL + then zOpen is assumed. + + Returns 0 on success. 
Returns non-0 if called when the delimiter + stack is empty, if it cannot copy the strings, or if either string + is deemed unsuitable for use as a delimiter. + + In debug builds this will trigger an assert if no `@token@` + delimiter has been set, but pp starts with one level in place, so + it is safe to call without having made an explicit + cmpp_atdelim_push() unless cmpp_atdelim_pop() has been misused. +*/ +CMPP_EXPORT int cmpp_atdelim_set(cmpp * pp, + char const *zOpen, + char const *zClose); + +/** + The `@token@`-delimiter counterpart of cmpp_delimiter_push(). + + See cmpp_atdelim_set() for the semantics of the arguments. +*/ +CMPP_EXPORT int cmpp_atdelim_push(cmpp *pp, + char const *zOpen, + char const *zClose); + +/** + The @token@-delimiter counterpart of cmpp_delimiter_pop(). +*/ +CMPP_EXPORT int cmpp_atdelim_pop(cmpp *pp); + +/** + Searches the given path (zPath), split on the given path separator + (pathSep), for the given file (zBaseName), optionally with the + given file extension (zExt). + + If zBaseName or zBaseName+zExt are found as-is, without any + search path prefix, that will be the result, else the result + is either zBaseName or zBaseName+zExt prefixed by one of the + search directories. + + On success, returns a new string, transferring ownership to the + caller (who must eventually pass it to cmpp_mfree() to deallocate). + + If no match is found, or on error, returns NULL. On a genuine + error, pp's error state is updated and the error is unlikely to be + recoverable (see cmpp_err_set()). + + This function is a no-op if called when pp's error state is set, + returning NULL. + + Results are undefined (in the sense of whether it will work or not, + as opposed to whether it will crash or not) if pathSep is a control + character. + + Design note: this is implemented as a Common Table Expression + query. 
+*/ +CMPP_EXPORT char * cmpp_path_search(cmpp *pp, + char const *zPath, + char pathSep, + char const *zBaseName, + char const *zExt); + +/** + Scans [*zPos,zEnd) for the next chSep character. Sets *zPos to one + after the last consumed byte, so its result includes the separator + character unless EOF is hit before then. If pCounter is not NULL + then it does ++*pCounter when finding chSep. + + Returns true if any input is consumed, else false (EOF). When it + returns false, *zPos will have the same value it had when this was + called. If it returns true, *zPos will be greater than it was + before this call and <= zEnd. + + Usage: + + ``` + unsigned char const * zBegin = ...; + unsigned char const * const zEnd = zBegin + strlen((char const *)zBegin); + unsigned char const * zEol = zBegin; + cmpp_size_t nLn = 0; + while( cmpp_next_chunk(&zEol, zEnd, '\n', &nLn) ){ + ... + } + ``` +*/ +CMPP_EXPORT +bool cmpp_next_chunk(unsigned char const **zPos, + unsigned char const *zEnd, + unsigned char chSep, + cmpp_size_t *pCounter); + +/** + Flags and constants related to the cmpp_args type. +*/ +enum cmpp_args_e { + /** + cmpp_args_parse() flag which tells cmpp_args_parse() not to + dive into (...) group tokens. It instead leaves them to be parsed + (or not) by downstream code. The only reason to parse them in + advance is to catch syntax errors sooner rather than later. + */ + cmpp_args_F_NO_PARENS = 0x01 +}; + +/** + An internal detail of cmpp_args. +*/ +typedef struct cmpp_args_pimpl cmpp_args_pimpl; + +/** + A container for parsing a line's worth of cmpp_arg + objects. + + Instances MUST be cleanly initialized by bitwise-copying either + cmpp_args_empty or (depending on the context) cmpp_args_empty_m. + + Instances MUST eventually be passed to cmpp_args_cleanup(). + + Design notes: this class is provided to the public API as a + convenience, not as a core/required component. 
It offers one of + many possible solutions for dealing with argument lists and is not + the End All/Be All of solutions. I didn't _really_ want to expose + this class in the public API at all but I also want client-side + directives to have the _option_ to do some of the things + currently builtin directives can do which are (as of this writing) + unavailable in the public API, e.g. evaluate expressions (in that + limited form which this library supports). A stepping stone to + doing so is making this class public. +*/ +struct cmpp_args { + /** + Number of parsed args. In the context of a cmpp_dx_f(), argument + lists do not include their directive's name as an argument. + */ + unsigned argc; + + /** + The list of args. This is very specifically NOT an array (or at + least not one which client code can rely on to behave + sensibly). Some internal APIs adjust a cmpp_args's arg list, + re-linking the entries via cmpp_arg::next and making array-style + traversal a foot-gun. + + To loop over them: + + for( cmpp_arg const * arg = args->arg0; arg; arg = arg->next ){...} + + This really ought to be const but it currently cannot be for + internal reasons. Client code really should not modify these + objects, though. Doing so invokes undefined behavior. + + For directives with the cmpp_d_F_ARGS_RAW flag, this member will, + after a successful call to cmpp_dx_next(), point to a single + argument which holds the directive's entire argument string, + stripped of leading spaces. + */ + cmpp_arg * arg0; + + /** + Internal implementation details. This is initialized via + cmpp_args_parse() and freed via cmpp_args_cleanup(). + */ + cmpp_args_pimpl * pimpl; +}; +typedef struct cmpp_args cmpp_args; + +/** + Empty-initialized cmpp_args instance, intended for const-copy + initialization. +*/ +#define cmpp_args_empty_m { \ + .argc = 0, \ + .arg0 = 0, \ + .pimpl = 0 \ +} + +/** + Empty-initialized instance, intended for non-const-copy + initialization. 
+*/ +extern const cmpp_args cmpp_args_empty; + +/** + Parses the range [zInBegin,zInBegin+nIn) into a list of cmpp_arg + objects by iteratively processing that range with cmpp_arg_parse(). + If nIn is negative, strlen() is used to calculate it. + + Requires that pOut be a cleanly-initialized instance (via + bitwise-copying cmpp_args_empty) or that it have been successfully + used with this function before. Behavior is undefined if pOut was + not properly initialized. + + The flags argument is an optional bitmask of flags from the + cmpp_args_e enum. + + On success it populates pOut, noting that an empty list is valid. + The memory pointed to by the arguments made available via + pOut->arg0 is all owned by pOut and will be invalidated by either a + subsequent call to this function (the memory will be overwritten or + reallocated) or cmpp_args_cleanup() (the memory will be freed). + + On error, returns non-0 and updates dx->pp's error state with info + about the problem. +*/ +CMPP_EXPORT int cmpp_args_parse(cmpp_dx * dx, + cmpp_args * pOut, + unsigned char const * zInBegin, + cmpp_ssize_t nIn, + cmpp_flag32_t flags); + +/** + Frees any resources owned by its argument but does not free the + argument (which is typically stack-allocated). After calling this, + the object may again be used with cmpp_args_parse() (in which case + it eventually needs to be passed to this again). + + This is a harmless no-op if `a` is already cleaned up but `a` must + not be NULL. +*/ +CMPP_EXPORT void cmpp_args_cleanup(cmpp_args *a); + +/** + A wrapper around cmpp_args_parse() which uses dx->args.z as an + input source. This is sometimes convenient in cmpp_dx_f() + implementations which use cmpp_dx_next(), or similar, to read and + process custom directives, as doing so invalidates dx->arg's + memory. + + On success, returns 0 and populates args. On error, returns non-0 + and sets dx->pp's error state. 
+ + cmpp_dx_args_clone() does essentially the same thing, but is more + efficient when dx->args.arg0 is already parsed. +*/ +CMPP_EXPORT int cmpp_dx_args_parse(cmpp_dx *dx, cmpp_args *args); + +/** + Populates pOut, replacing any current content, with a copy of each + arg in dx->args.arg0 (traversing arg0->next). + + *pOut MUST be cleanly initialized via copying cmpp_args_empty or it + must have previously been used with either cmpp_args_parse() (which + has the same initialization requirement) or this function. Otherwise + this function has undefined results. + + On success, pOut->argc and pOut->arg0 will refer to pOut's copy + of the arguments. + + Copying of arguments is necessary in cmpp_dx_f() implementations + which need to hold on to arguments for use _after_ calling + cmpp_dx_next() or any API which calls that (which most directives + don't do). See that function for why. +*/ +CMPP_EXPORT int cmpp_dx_args_clone(cmpp_dx * dx, cmpp_args *pOut); + +/** Flags for cmpp_popen(). */ +enum cmpp_popen_e { + /** + Use execl[p](CMD, CMD,0) instead of + execl[p]("/bin/sh","-c",CMD,0). + */ + cmpp_popen_F_DIRECT = 0x01, + /** Use execlp() or execvp() instead of execl() or execv(). */ + cmpp_popen_F_PATH = 0x02 +}; + +/** + Result state for cmpp_popen() and friends. +*/ +struct cmpp_popen_t { + /** + The child process ID. + */ + int childPid; + /** + The child process's stdout. + */ + int fdFromChild; + /** + If not NULL, cmpp_popen() will set *fpToChild to a FILE handle + mapped to the child process's stdin. If it is NULL, the child + process's stdin will be closed instead. + */ + cmpp_FILE **fpToChild; +}; +typedef struct cmpp_popen_t cmpp_popen_t; +/** + Empty-initialized cmpp_popen_t instance, intended for const-copy + initialization. +*/ +#define cmpp_popen_t_empty_m {-1,-1,0} +/** + Empty-initialized instance, intended for non-const-copy + initialization. 
+*/ +extern const cmpp_popen_t cmpp_popen_t_empty; + +/** + Uses fork()/exec() to run a command in a separate process and open + a two-way stream to it. It is provided in this API to facilitate + the creation of custom directives which shell out to external + processes. + + zCmd must contain the NUL-terminated command to run and any flags + for that command, e.g. "myapp --flag --other-flag". It is passed as + the 4th argument to: + + execl("/bin/sh", "/bin/sh", "-c", zCmd, NULL) + + The po object MUST be cleanly initialized before calling this by + bitwise copying cmpp_popen_t_empty or (depending on the context) + cmpp_popen_t_empty_m. + + Flags: + + - cmpp_popen_F_DIRECT: zCmd is passed to execl(zCmd, zCmd, NULL) + instead of being run via /bin/sh -c. That can only work if zCmd is + a single command without arguments. + + - cmpp_popen_F_PATH: tells it to use execlp() or execvp(), which + performs path lookup of its initial argument. Again, that can + only work if zCmd is a single command without arguments. + + On success: + + - po->childPid will be set to the PID of the child process. + + - po->fdFromChild is set to the child's stdout file + descriptor. read(2) from it to read from the child. + + - If po->fpToChild is not NULL then *po->fpToChild is set to a + buffered output handle to the child's stdin. fwrite(3) to it to + send the child stuff. Be sure to fflush(3) and/or fclose(3) it to + keep it from hanging forever. If po->fpToChild is NULL then the + stdin of the child is closed. (Why buffered instead of unbuffered? + My attempts at getting unbuffered child stdin to work have all + failed when write() is called on it.) + + On success, the caller is obligated to pass po to cmpp_pclose(). + The caller may pass po to cmpp_pclose() on error, if that's easier + for them, provided that the po argument was cleanly initialized + before passing it to this function. 
+ + If the caller fclose(3)s *po->fpToChild then they must set it to + NULL so that cmpp_pclose() knows not to close it. + + On error: you know the drill. This function is a no-op if pp has + error state when it's called, and the current error code is + returned instead. + + This function is only available on non-WASM Unix-like environments. + On others it will always trigger a CMPP_RC_UNSUPPORTED error. + + Bugs: because the command is run via /bin/sh -c ... we cannot tell + if it's actually found. All we can tell is that /bin/sh ran. + + Also: this doesn't capture stderr, so commands should redirect + stderr to stdout. Adding the child's stderr handle to cmpp_popen_t is + a potential TODO without a current use case. + + See: cmpp_pclose() + See: cmpp_popenv() +*/ +CMPP_EXPORT int cmpp_popen(cmpp *pp, unsigned char const *zCmd, + cmpp_flag32_t flags, cmpp_popen_t *po); + +/** + Works like cmpp_popen() except that: + + - It takes its arguments in the form of a main()-style array of + strings because it uses execv() instead of execl(). The + cmpp_popen_F_PATH flag causes it to use execvp(). + + - It does not honor the cmpp_popen_F_DIRECT flag because all + arguments have to be passed in via the arguments array. + + As per execv()'s requirements: azCmd _MUST_ end with a NULL entry. +*/ +CMPP_EXPORT int cmpp_popenv(cmpp *pp, char * const * azCmd, + cmpp_flag32_t flags, cmpp_popen_t *po); + +/** + Closes handles returned by cmpp_popen() and zeroes out po. If the + caller fclose()d *po->fpToChild then they need to set it to NULL so + that this function does not double-close it. + + Returns the result code of the child process. + + After calling this, po may again be used as an argument to + cmpp_popen(). +*/ +CMPP_EXPORT int cmpp_pclose(cmpp_popen_t *po); + +/** + A cmpp_popenv() proxy which builds up an execv()-style array of + arguments from the given args. 
It has a hard, and mostly arbitrary, + upper limit on the number of args it can take in order to avoid + extra allocation. +*/ +CMPP_EXPORT int cmpp_popen_args(cmpp_dx *dx, cmpp_args const * args, + cmpp_popen_t *p); + + +/** + Callback type for use with cmpp_kav_each(). + + cmpp_kav_each() calls this one time per key/value in such a list, + passing it the relevant key/value strings and lengths, plus the + opaque state pointer which is passed to cmpp_kav_each(). + + Must return 0 on success or update (or propagate) dx->pp's error + state on error. +*/ +typedef int cmpp_kav_each_f( + cmpp_dx *dx, + unsigned char const *zKey, cmpp_size_t nKey, + unsigned char const *zVal, cmpp_size_t nVal, + void* callbackState +); + +/** + Flag bitmask for use with cmpp_kav_each() and cmpp_str_each(). +*/ +enum cmpp_kav_each_e { + /** + The key argument should be expanded using cmpp_arg_to_b() + with a 0 flags value. This flag should normally not be used. + */ + cmpp_kav_each_F_EXPAND_KEY = 0x01, + /** + The value argument should be expanded using cmpp_arg_to_b() + with a 0 flags value. This flag should normally be used. + */ + cmpp_kav_each_F_EXPAND_VAL = 0x02, + /** + Treat (...) value tokens (ttype=cmpp_TT_GroupParen) as integer + expressions. Keys are never treated this way. Without this flag, + the token expands to the ... part of (...). + */ + cmpp_kav_each_F_PARENS_EXPR = 0x04, + /** + Indicates that an empty input list is an error. If this flag is + not set and the list is empty, the callback will not be called + and no error will be triggered. + */ + cmpp_kav_each_F_NOT_EMPTY = 0x08, + /** + Indicates that the list does not have the '->' part(s). That is, + the list needs to be in pairs of KEY VAL rather than triples of + KEY -> VALUE. + */ + cmpp_kav_each_F_NO_ARROW = 0x10, + + /** + If set, keys get the cmpp_arg_to_b_F_BRACE_CALL flag added + to them. This implies cmpp_kav_each_F_EXPAND_KEY. 
+ */ + cmpp_kav_each_F_CALL_KEY = 0x20, + /** Value counterpart of cmpp_kav_each_F_CALL_KEY. */ + cmpp_kav_each_F_CALL_VAL = 0x40, + /** Both cmpp_kav_each_F_CALL_KEY and cmpp_kav_each_F_CALL_VAL. */ + cmpp_kav_each_F_CALL = 0x60, + + //TODO: append to defines which already exist + cmpp_kav_each_F_APPEND = 0, + cmpp_kav_each_F_APPEND_SPACE = 0, + cmpp_kav_each_F_APPEND_NL = 0 +}; + +/** + A helper for cmpp_dx_f() implementations in processing directive + arguments which are lists in this form: + + { a -> b c -> d ... } + + ("kav" is short for "key arrow value".) + + The range [zBegin,zBegin+nIn) contains the raw list (not including + any containing braces, parentheses, quotes, or the like). If nIn is + negative, strlen() is used to calculate it. + + The range is parsed using cmpp_args_parse(). + + For each key/arrow/value triplet in that list, callback() is passed + the stringified form of the key and the value, plus the + callbackState pointer. + + The flags argument controls whether the keys and values get + expanded or not. (Typically the keys should not be expanded but the + values should.) + + Returns 0 on success. If the callback returns non-0, it is expected + to have updated dx's error state. callback() will never be called + when dx's error state is non-0. + + Error results include: + + - CMPP_RC_RANGE: the list is empty or does not contain the correct + number of entries (groups of 3, or 2 if flags has + cmpp_kav_each_F_NO_ARROW). + + - CMPP_RC_OOM: allocation error. + + - Any value returned by cmpp_args_parse(). + + - Any number of errors can be triggered during expansion of + keys and values. +*/ +CMPP_EXPORT int cmpp_kav_each( + cmpp_dx *dx, + unsigned char const *zBegin, + cmpp_ssize_t nIn, + cmpp_kav_each_f callback, void *callbackState, + cmpp_flag32_t flags +); + +/** + This works like cmpp_kav_each() except that it treats each token in + the list as a single entry. 
+ + When the callback is triggered, the "key" part will be the raw + token and the "value" part will be the expanded form of that + value. Its flags may contain most of the cmpp_kav_each_F_... flags, + with the exception that cmpp_kav_each_F_EXPAND_KEY has no effect + here. If cmpp_kav_each_F_EXPAND_VAL is not in the flags then the + callback receives the same string for both the key and value. +*/ +CMPP_EXPORT int cmpp_str_each( + cmpp_dx *dx, + unsigned char const *zBegin, + cmpp_ssize_t nIn, + cmpp_kav_each_f callback, void *callbackState, + cmpp_flag32_t flags +); + +/** + An interface for clients to provide directives to the library + on-demand. + + This is called when pp has encountered a directive name it does not + know. It is passed the cmpp object, the name of the directive, and + the opaque state pointer which was passed to cmpp_d_autoloader_set(). + + Implementations should compare dname to any directives they know + about. If they find no match they must return CMPP_RC_NO_DIRECTIVE + _without_ using cmpp_err_set() to make the error persistent. + + If they find a match, they must use cmpp_d_register() to register + it and (on success) return 0. The library will then look again in + the registered directive list for the directive before giving up. + + If they find a match but registration fails then the result of that + failure must be returned. + + For implementation-specific errors, e.g. trying to load a directive + from a DLL but the loading of the DLL fails, implementations are + expected to use cmpp_err_set() to report the error and to return + that result code after performing any necessary cleanup. + + It is legal for an implementation to register multiple directives + in a single invocation (in particular a pair of opening/closing + directives), as well as to register directives other than the one + requested (if necessary). Regardless of which one(s) it registers, + it must return 0 only if it registers one named dname. 
+*/ +typedef int (*cmpp_d_autoloader_f)(cmpp *pp, char const *dname, void *state); + +/** + A c-pp directive "autoloader". See cmpp_d_autoloader_set() + and cmpp_d_autoloader_take(). +*/ +struct cmpp_d_autoloader { + /** The autoloader callback. */ + cmpp_d_autoloader_f f; + /** + Finalizer for this->state. After calling this, if there's any + chance that this object might be later used, then it is important + that this->state be set to 0 (which this finalizer cannot + do). "Best practice" is to bitwise copy cmpp_d_autoloader_empty + over any instances immediately after calling dtor(). + */ + cmpp_finalizer_f dtor; + /** + Implementation-specific state, to be passed as the final argument + to this->f and this->dtor. + */ + void * state; +}; +typedef struct cmpp_d_autoloader cmpp_d_autoloader; +/** + Empty-initialized cmpp_d_autoloader instance, intended for + const-copy initialization. +*/ +#define cmpp_d_autoloader_empty_m {.f=0,.dtor=0,.state=0} +/** + Empty-initialized cmpp_d_autoloader instance, intended for + non-const-copy initialization. +*/ +extern const cmpp_d_autoloader cmpp_d_autoloader_empty; + +/** + Sets pp's "directive autoloader". Each cmpp instance has but a + single autoloader but this API is provided so that several + instances may be chained from client-side code. + + This function will call the existing autoloader's destructor (if + any), invalidating any pointers to its state object. + + If pNew is not NULL then pp's autoloader is set to a bitwise copy + of *pNew, otherwise it is zeroed out. This transfers ownership of + pNew->state to pp. + + See cmpp_d_autoloader_f()'s docs for how pNew must behave. + + This function has no error conditions but downstream results are + undefined if pNew and an existing autoloader refer to the same + dtor/state values (a gateway to double-frees). 
+*/ +CMPP_EXPORT void cmpp_d_autoloader_set(cmpp *pp, cmpp_d_autoloader const * pNew); + +/** + Moves pp's current autoloader state into pOld, transferring + ownership of it to the caller. + + This obligates the caller to eventually either pass that same + pointer to cmpp_d_autoloader_set() (to transfer ownership back to + pp) or to call pOld->dtor() (if it's not NULL), passing it + pOld->state (even if pOld->state is NULL). In either case, all + contents of pOld are semantically invalidated and perhaps freed. + + This would normally be a prelude to cmpp_d_autoloader_set() to + install a custom, perhaps chained, autoloader. +*/ +CMPP_EXPORT void cmpp_d_autoloader_take(cmpp *pp, cmpp_d_autoloader * pOld); + +/** + True only for ' ' and '\t'. +*/ +CMPP_EXPORT bool cmpp_isspace(int ch); + +/** + Reassigns *p to the address of the first non-space character at or + after the initial *p value. It stops looking if it reaches zEnd. + + If `*p` does not point to memory before zEnd, or is not a part of + the same logical string, results are undefined. + + + Achtung: do not pass this the address of a cmpp_b::z, + or similar, as that will effectively corrupt the buffer's + memory. To trim a whole buffer, use something like: + + ``` + cmpp_b ob = cmpp_b_empty; + ... populate ob...; + // get the trimmed range: + unsigned char const *zB = ob.z; + unsigned char const *zE = zB + n; + cmpp_skip_snl(&zB, zE); + assert( zB<=zE ); + cmpp_skip_snl_trailing(zB, &zE); + assert( zE>=zB ); + printf("trimmed range: [%.*s]\n", (int)(zE-zB), zB); + ``` + + Those assert()s are not error handling - they're demonstrating + invariants of the calls made before them. +*/ +CMPP_EXPORT void cmpp_skip_space( unsigned char const **p, + unsigned char const *zEnd ); + +/** + Works just like cmpp_skip_space() but it also + skips newlines. + + FIXME (2026-02-21): it does not recognize CRNL pairs as + atomic newlines. 
+*/ +CMPP_EXPORT void cmpp_skip_snl( unsigned char const **p, + unsigned char const *zEnd ); + +/** + "Trims" trailing cmpp_isspace() characters from the range [zBegin, + *p). *p must initially point to one byte after the end of zBegin + (i.e. its NUL byte or virtual EOF). Upon return *p will be modified + leftwards (if at all) until a non-space is found or *p==zBegin. + */ +CMPP_EXPORT void cmpp_skip_space_trailing( unsigned char const *zBegin, + unsigned char const **p ); + +/** + Works just like cmpp_skip_space_trailing() but + trims the same characters which cmpp_skip_snl() skips. + + FIXME (2026-02-21): it does not recognize CRNL pairs as + atomic newlines. +*/ +CMPP_EXPORT void cmpp_skip_snl_trailing( unsigned char const *zBegin, + unsigned char const **p ); + + +/** + Generic array-of-T list memory-reservation routine. + + *list is the input array-of-T. nDesired is the number of entries to + reserve (list entry count, not byte length). *nAlloc is the number + of entries allocated in the list. sizeOfEntry is the sizeof(T) for + each entry in *list. T may be either a value type or a pointer + type and sizeOfEntry must match, i.e. it must be sizeof(T*) for a + list-of-pointers and sizeof(T) for a list-of-objects. + + If pp is not NULL then this function updates pp's error state on + error, else it simply returns CMPP_RC_OOM on error. If pp is not + NULL then this function is a no-op if called when pp's error state + is set, returning that code without other side-effects. + + If nDesired > *nAlloc then *list is reallocated to contain at least + nDesired entries, else this function returns without side effects. + + On success *list is re-assigned to the reallocated list memory, all + newly-(re)allocated memory is zeroed out, and *nAlloc is updated to + the new allocation size of *list (the number of list entries, not + the number of bytes). + + On failure neither *list nor *nAlloc are modified. + + Returns 0 on success or CMPP_RC_OOM on error. 
Errors generated by + this routine are, at least in principle, recoverable (see + cmpp_err_set()), though that simply means that the pp object is + left in a well-defined state, not that the app can necessarily + otherwise recover from an OOM. + + This seemingly-out-of-API-scope routine is in the public API as a + convenience for client-level cmpp_dx_f() implementations[^1]. This API + internally has an acute need for basic list management and non-core + extensions inherit that as well. + + [^1]: this project's own directives are written as if they were + client-side whenever feasible. Some require cmpp-internal state to + do their jobs, though. +*/ +CMPP_EXPORT int cmpp_array_reserve(cmpp *pp, void **list, cmpp_size_t nDesired, + cmpp_size_t * nAlloc, unsigned sizeOfEntry); + + +/** + The current cmpp_api_thunk::apiVersion value. + See cmpp_api_thunk_map. +*/ +#define cmpp_api_thunk_version 20260206 + +/** + A helper for use with cmpp_api_thunk. + + V() defines the API version number. It invokes + V(NAME,TYPE,VERSION) once. NAME is the member name for the + cmpp_api_thunk struct. TYPE is an integer type. VERSION is the + cmpp_api_thunk object version. This is initially 0 and will + eventually be given a number which increments when new members + are appended. This is to enable DLLs to check whether their + cmpp_api_thunk object has the methods they're looking for. + + Then it invokes F(NAME,RETTYPE,PARAMS) and O(NAME,TYPE) + once for each cmpp_api_thunk member in an unspecified order, and + A(VERSION) an arbitrary number of times. + + F() is for functions. O() is for objects, which are exposed here as + pointers to those objects so that we don't copy them. A() is + injected at each point where a new API version was introduced, and + that number (an integer) is its only argument. A()'s definition + can normally be empty. + + In all cases, NAME is the public API symbol name minus the "cmpp_" + prefix. RETTYPE is the function return type or object type. 
PARAMS + is the function's parameter list, wrapped in (...). For O(), TYPE is the + const-qualified type of the object referred to by + NAME. cmpp_api_thunk necessarily exposes those as pointers, but + that pointer is not part of the TYPE argument. + + See cmpp_api_thunk for details. + + In order to help DLLs to not inadvertently use invalid areas of the + API object by referencing members which the loading c-pp version + does not have, this list must only ever be modified by appending to + it. That enables DLLs to check their compile-time + cmpp_api_thunk_version against the dx->pp->api->apiVersion. If + the runtime version is older than (less than) their compile-time + version, the DLL must not access any methods added after + dx->pp->api->apiVersion. +*/ +#define cmpp_api_thunk_map(A,V,F,O) \ + A(0) \ + V(apiVersion,unsigned,cmpp_api_thunk_version) \ + F(mrealloc,void *,(void * p, size_t n)) \ + F(malloc,void *,(size_t n)) \ + F(mfree,void,(void *)) \ + F(ctor,int,(cmpp **pp, cmpp_ctor_cfg const *)) \ + F(dtor,void,(cmpp *pp)) \ + F(reset,void,(cmpp *pp)) \ + F(check_oom,int,(cmpp * const pp, void const * m)) \ + F(is_legal_key,bool,(unsigned char const *, cmpp_size_t n, \ + unsigned char const **)) \ + F(define_legacy,int,(cmpp *, const char *,char const *)) \ + F(define_v2,int,(cmpp *, const char *, char const *)) \ + F(undef,int,(cmpp *, const char *, unsigned int *)) \ + F(define_shadow,int,(cmpp *, char const *, char const *, \ + int64_t *)) \ + F(define_unshadow,int,(cmpp *, char const *, int64_t)) \ + F(process_string,int,(cmpp *, const char *, \ + unsigned char const *, cmpp_ssize_t)) \ + F(process_file,int,(cmpp *, const char *)) \ + F(process_stream,int,(cmpp *, const char *, \ + cmpp_input_f, void *)) \ + F(process_argv,int,(cmpp *, int, char const * const *)) \ + F(err_get,int,(cmpp *, char const **)) \ + F(err_set,int,(cmpp *, int, char const *, ...)) \ + F(err_set1,int,(cmpp *, int, char const *)) \ + F(err_has,int,(cmpp const *)) \ + 
F(is_safemode,bool,(cmpp const *)) \ + F(sp_begin,int,(cmpp *)) \ + F(sp_commit,int,(cmpp *)) \ + F(sp_rollback,int,(cmpp *)) \ + F(output_f_FILE,int,(void *, void const *, cmpp_size_t)) \ + F(output_f_fd,int,(void *, void const *, cmpp_size_t)) \ + F(input_f_FILE,int,(void *, void *, cmpp_size_t *)) \ + F(input_f_fd,int,(void *, void *, cmpp_size_t *)) \ + F(flush_f_FILE,int,(void *)) \ + F(stream,int,(cmpp_input_f, void *, \ + cmpp_output_f, void *)) \ + F(slurp,int,(cmpp_input_f, void *, \ + unsigned char **, cmpp_size_t *)) \ + F(fopen,cmpp_FILE *,(char const *, char const *)) \ + F(fclose,void,(cmpp_FILE * )) \ + F(outputer_out,int,(cmpp_outputer *, void const *, cmpp_size_t)) \ + F(outputer_flush,int,(cmpp_outputer *)) \ + F(outputer_cleanup,void,(cmpp_outputer *)) \ + F(outputer_cleanup_f_FILE,void,(cmpp_outputer *)) \ + F(delimiter_set,int,(cmpp *, char const *)) \ + F(delimiter_get,void,(cmpp const *, char const **)) \ + F(chomp,bool,(unsigned char *, cmpp_size_t *)) \ + F(b_clear,void,(cmpp_b *)) \ + F(b_reuse,cmpp_b *,(cmpp_b *)) \ + F(b_swap,void,(cmpp_b *, cmpp_b *)) \ + F(b_reserve,int,(cmpp_b *, cmpp_size_t)) \ + F(b_reserve3,int,(cmpp *, cmpp_b *,cmpp_size_t)) \ + F(b_append,int,(cmpp_b *, void const *,cmpp_size_t)) \ + F(b_append4,int,(cmpp *,cmpp_b *,void const *, \ + cmpp_size_t)) \ + F(b_append_ch, int,(cmpp_b *, char)) \ + F(b_append_i32,int,(cmpp_b *, int32_t)) \ + F(b_append_i64,int,(cmpp_b *, int64_t)) \ + F(b_chomp,bool,(cmpp_b *)) \ + F(output_f_b,int,(void *, void const *,cmpp_size_t)) \ + F(outputer_cleanup_f_b,void,(cmpp_outputer *self)) \ + F(version,char const *,(void)) \ + F(tt_cstr,char const *,(int tt)) \ + F(dx_err_set,int,(cmpp_dx *dx, int rc, char const *zFmt, ...)) \ + F(dx_next,int,(cmpp_dx * dx, bool * pGotOne)) \ + F(dx_process,int,(cmpp_dx * dx)) \ + F(dx_consume,int,(cmpp_dx *, cmpp_outputer *, \ + cmpp_d const *const *, unsigned, cmpp_flag32_t)) \ + F(dx_consume_b,int,(cmpp_dx *, cmpp_b *, cmpp_d const * const *, \ + 
unsigned, cmpp_flag32_t)) \ + F(arg_parse,int,(cmpp_dx * dx, cmpp_arg *, \ + unsigned char const **, unsigned char const *, \ + unsigned char ** , unsigned char const * )) \ + F(arg_strdup,char *,(cmpp *pp, cmpp_arg const *arg)) \ + F(arg_to_b,int,(cmpp_dx * dx, cmpp_arg const *arg, \ + cmpp_b * os, cmpp_flag32_t flags)) \ + F(errno_rc,int,(int errNo, int dflt)) \ + F(d_register,int,(cmpp * pp, cmpp_d_reg const * r, cmpp_d **pOut)) \ + F(dx_f_dangling_closer,void,(cmpp_dx *dx)) \ + F(dx_out_raw,int,(cmpp_dx * dx, void const *z, cmpp_size_t n)) \ + F(dx_out_expand,int,(cmpp_dx const * dx, cmpp_outputer * pOut, \ + unsigned char const * zFrom, cmpp_size_t n, \ + cmpp_atpol_e policy)) \ + F(dx_outf,int,(cmpp_dx *dx, char const *zFmt, ...)) \ + F(dx_delim,char const *,(cmpp_dx const *dx)) \ + F(atpol_from_str,cmpp_atpol_e,(cmpp * pp, char const *z)) \ + F(atpol_get,cmpp_atpol_e,(cmpp const * const pp)) \ + F(atpol_set,int,(cmpp * const pp, cmpp_atpol_e pol)) \ + F(atpol_push,int,(cmpp * pp, cmpp_atpol_e pol)) \ + F(atpol_pop,void,(cmpp * pp)) \ + F(unpol_from_str,cmpp_unpol_e,(cmpp * pp,char const *z)) \ + F(unpol_get,cmpp_unpol_e,(cmpp const * const pp)) \ + F(unpol_set,int,(cmpp * const pp, cmpp_unpol_e pol)) \ + F(unpol_push,int,(cmpp * pp, cmpp_unpol_e pol)) \ + F(unpol_pop,void,(cmpp * pp)) \ + F(path_search,char *,(cmpp *pp, char const *zPath, char pathSep, \ + char const *zBaseName, char const *zExt)) \ + F(args_parse,int,(cmpp_dx * dx, cmpp_args * pOut, \ + unsigned char const * zInBegin, \ + cmpp_ssize_t nIn, cmpp_flag32_t flags)) \ + F(args_cleanup,void,(cmpp_args *a)) \ + F(dx_args_clone,int,(cmpp_dx * dx, cmpp_args *pOut)) \ + F(popen,int,(cmpp *, unsigned char const *, cmpp_flag32_t, \ + cmpp_popen_t *)) \ + F(popenv,int,(cmpp *pp, char * const * azCmd, cmpp_flag32_t flags, \ + cmpp_popen_t *po)) \ + F(pclose,int,(cmpp_popen_t *po)) \ + F(popen_args,int,(cmpp_dx *, cmpp_args const *, cmpp_popen_t *)) \ + F(kav_each,int, (cmpp_dx *,unsigned char const *, 
cmpp_ssize_t, \ + cmpp_kav_each_f, void *, cmpp_flag32_t)) \ + F(d_autoloader_set,void,(cmpp *pp, cmpp_d_autoloader const * pNew)) \ + F(d_autoloader_take,void,(cmpp *pp, cmpp_d_autoloader * pOld)) \ + F(isspace,bool,(int ch)) \ + F(skip_space,void,(unsigned char const **, unsigned char const *)) \ + F(skip_snl,void,(unsigned char const **, unsigned char const *)) \ + F(skip_space_trailing,void,(unsigned char const *zBegin, \ + unsigned char const **p)) \ + F(skip_snl_trailing,void,(unsigned char const *zBegin, \ + unsigned char const **p)) \ + F(array_reserve,int,(cmpp *pp, void **list, cmpp_size_t nDesired, \ + cmpp_size_t * nAlloc, unsigned sizeOfEntry)) \ + F(module_load,int,(cmpp *, char const *,char const *)) \ + F(module_dir_add,int,(cmpp *, const char *)) \ + O(outputer_FILE,cmpp_outputer const) \ + O(outputer_b,cmpp_outputer const) \ + O(outputer_empty,cmpp_outputer const) \ + O(b_empty,cmpp_b const) \ + A(20251116) \ + F(next_chunk,bool,(unsigned char const **,unsigned char const *, \ + unsigned char,cmpp_size_t*)) \ + A(20251118) \ + F(atdelim_get,void,(cmpp const *,char const **,char const **)) \ + F(atdelim_set,int,(cmpp *,char const *,char const *)) \ + F(atdelim_push,int,(cmpp *,char const *,char const *)) \ + F(atdelim_pop,int,(cmpp *)) \ + A(20251224) \ + F(dx_pos_save,void,(cmpp_dx const *, cmpp_dx_pos *)) \ + F(dx_pos_restore,void,(cmpp_dx *, cmpp_dx_pos const *)) \ + A(20260130) \ + F(dx_is_call,bool,(cmpp_dx * const)) \ + A(20260206) \ + F(b_borrow,cmpp_b *,(cmpp *dx)) \ + F(b_return,void,(cmpp *dx, cmpp_b*)) \ + A(1+cmpp_api_thunk_version) + + +/** + Callback signature for cmpp module import routines. + + This is called by the library after first encountering this + module (typically after looking for it in a DLL, but static + instances are supported). + + The primary intended purpose of this interface is for + implementations to call cmpp_d_register() (any number of times). It + is also legal to use APIs which set or query defines. 
This + interface is not intended to interact with pp's I/O in any way + (that's the job of the directives which these functions + register). Violating that will invoke undefined results, perhaps + stepping on the toes of any being-processed directive which + triggered the dynamic load of this directive. + + Errors in module initialization must be reported via cmpp_err_set() + and that code must be returned. + + Implementations must typically call cmpp_api_init(pp) as their + first operation. + + See the files named d-*.c in libcmpp's source tree for examples. +*/ +typedef int (*cmpp_module_init_f)(cmpp * pp); + +/** + Holds information for mapping a cmpp_module_init_f to a name. + Its purpose is to get installed by the CMPP_MODULE_xxx family of + macros and referenced later via a module-loading mechanism. +*/ +struct cmpp_module{ + /** + Symbolic name of the module. + */ + char const * name; + + /** + The initialization routine for the module. + */ + cmpp_module_init_f init; +}; + +/** Convenience typedef. */ +typedef struct cmpp_module cmpp_module; + +/** @def CMPP_MODULE_DECL + + Declares an extern (cmpp_module*) symbol called + cmpp_module__#\#CNAME. + + Use CMPP_MODULE_IMPL2() or CMPP_MODULE_IMPL3() to create the + matching implementation code. + + This macro should be used in the C or H file for a loadable module. + It may be combined in a file with a single CMPP_MODULE_IMPL_SOLO() + declaration with the same name, such that the module can be loaded + both with and without the explicit symbol name. +*/ +#define CMPP_MODULE_DECL(CNAME) \ + extern const cmpp_module * cmpp_module__##CNAME + +/** @def CMPP_MODULE_IMPL3 + + Intended to be used to implement module declarations. If a module + has both C and H files, CMPP_MODULE_DECL(CNAME) should be used in the + H file and CMPP_MODULE_IMPL2() should be used in the C file. If the + DLL has only a C file (or no public H file), CMPP_MODULE_DECL is + unnecessary. 
+ + If the module's human-use name is a legal C identifier, + CMPP_MODULE_IMPL2() is slightly easier to use than this macro. + + Implements a static cmpp_module object named + cmpp_module__#\#CNAME#\#_impl and a non-static + (cmpp_module*) named cmpp_module__#\#CNAME which points to + cmpp_module__#\#CNAME#\#_impl. (The latter symbol may optionally be + declared in a header file via CMPP_MODULE_DECL.) NAME is used as + the cmpp_module::name value. + + INIT_F must be a cmpp_module_init_f() function pointer. That function + is called when cmpp_module_load() loads the module. + + This macro may be combined in a file with a single + CMPP_MODULE_IMPL_SOLO() declaration using the same CNAME value, + such that the module can be loaded both with and without the + explicit symbol name. + + Example usage, in a module's header file, if any: + + ``` + CMPP_MODULE_DECL(mymodule); + ``` + + (The declaration is not strictly necessary - it is more of a matter + of documentation.) + + And in the C file: + + ``` + CMPP_MODULE_IMPL3(mymodule,"mymodule",mymodule_install); + // OR: + CMPP_MODULE_IMPL2(mymodule,mymodule_install); + ``` + + If it will be the only module in the target DLL, one can also add + this: + + ``` + CMPP_MODULE_IMPL_SOLO("mymodule",mymodule_install); + // _OR_ (ever so slightly different): + CMPP_MODULE_STANDALONE_IMPL2("mymodule",mymodule_install); + ``` + + Which simplifies client-side module loading by allowing them to + leave out the module name when loading, but that approach only + works if modules are compiled one per DLL (as opposed to being + packaged together in one DLL). + + @see CMPP_MODULE_DECL + @see CMPP_MODULE_IMPL_SOLO +*/ +#define CMPP_MODULE_IMPL3(CNAME,NAME,INIT_F) \ + static const cmpp_module \ + cmpp_module__##CNAME##_impl = { NAME, INIT_F }; \ + const cmpp_module * \ + cmpp_module__##CNAME = &cmpp_module__##CNAME##_impl + +/** @def CMPP_MODULE_IMPL2 + + A simpler form of CMPP_MODULE_IMPL3() for cases where a module name + is a legal C symbol name. 
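
  As an illustration only, the following self-contained sketch mirrors
  the expansion pattern of the two macros above. Every demo_ and
  DEMO_ name, as well as hello_init, is hypothetical and not part of
  libcmpp. It shows how the two-argument form stringifies CNAME to
  produce the module's name:

```c
#include <assert.h>
#include <string.h>

// Hypothetical stand-ins for cmpp_module and cmpp_module_init_f.
typedef int (*demo_module_init_f)(void *pp);
typedef struct demo_module {
  char const *name;
  demo_module_init_f init;
} demo_module;

// Same shape as CMPP_MODULE_IMPL3(): a static object plus an
// externally-visible pointer to it.
#define DEMO_MODULE_IMPL3(CNAME,NAME,INIT_F)      \
  static const demo_module                        \
  demo_module__##CNAME##_impl = { NAME, INIT_F }; \
  const demo_module *                             \
  demo_module__##CNAME = &demo_module__##CNAME##_impl

// Same shape as CMPP_MODULE_IMPL2(): the name is the stringified CNAME.
#define DEMO_MODULE_IMPL2(CNAME,INIT_F) \
  DEMO_MODULE_IMPL3(CNAME,#CNAME,INIT_F)

// Hypothetical init routine. A real one would register directives.
static int hello_init(void *pp){
  (void)pp;
  return 0;
}

DEMO_MODULE_IMPL2(hello, hello_init);

// Returns 0 if the generated symbol carries the expected name and
// its init routine succeeds, else non-0.
int demo_check_hello(void){
  assert( demo_module__hello->init != 0 );
  if( 0!=strcmp(demo_module__hello->name, "hello") ) return 1;
  return demo_module__hello->init(0);
}
```

  The non-static pointer is the only symbol a loader needs to resolve,
  which is why the object it points to can stay file-local.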
+*/ +#define CMPP_MODULE_IMPL2(CNAME,INIT_F) \ + CMPP_MODULE_IMPL3(CNAME,#CNAME,INIT_F) + +/** @def CMPP_MODULE_IMPL_SOLO + + Implements a static cmpp_module symbol called + cmpp_module1_impl and a non-static (cmpp_module*) named + cmpp_module1 which points to cmpp_module1_impl. + + INIT_F must be a cmpp_module_init_f. + + This macro must only be used in the C file for a loadable module + when that module is to be the only one in the resulting DLL. Do not + use it when packaging multiple modules into one DLL: use + CMPP_MODULE_IMPL for those cases (CMPP_MODULE_IMPL can also be used + together with this macro). + + @see CMPP_MODULE_IMPL + @see CMPP_MODULE_DECL + @see CMPP_MODULE_STANDALONE_IMPL2 +*/ +#define CMPP_MODULE_IMPL_SOLO(NAME,INIT_F) \ + static const cmpp_module \ + cmpp_module1_impl = { NAME, INIT_F }; \ + const cmpp_module * cmpp_module1 = &cmpp_module1_impl +/** @def CMPP_MODULE_STANDALONE_IMPL2 + + CMPP_MODULE_STANDALONE_IMPL2() works like CMPP_MODULE_IMPL_SOLO() + but is only fully expanded if the preprocessor variable + CMPP_MODULE_STANDALONE is defined (to any value). If + CMPP_MODULE_STANDALONE is not defined, this macro expands to a + dummy placeholder which does nothing (but has to expand to + something to avoid leaving a trailing semicolon in the C code, + which upsets the compiler (the other alternative would be to not + require a semicolon after the macro call, but that upsets emacs' + sense of indentation (and keeping emacs happy is more important + than keeping compilers happy (all of these parens are _not_ a + reference to emacs lisp, by the way)))). + + This macro may be used in the same source file as + CMPP_MODULE_IMPL. + + The intention is that DLLs prefer this option over + CMPP_MODULE_IMPL_SOLO, to allow that the DLLs can be built as + standalone DLLs, multi-plugin DLLs, and compiled directly into a + project (in which case the code linking it in needs to resolve and + call the cmpp_module entry for each built-in module). 
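
  A minimal sketch of that compiled-in case, assuming hypothetical
  demo_ names throughout (none of them are libcmpp APIs): the
  application collects its module entries into a table and calls each
  init routine itself, stopping at the first failure.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-ins for cmpp_module and cmpp_module_init_f.
typedef int (*demo_module_init_f)(void *pp);
typedef struct demo_module {
  char const *name;
  demo_module_init_f init;
} demo_module;

// Counts how many init routines have run, for demonstration only.
static int demo_init_count = 0;
static int init_a(void *pp){ (void)pp; ++demo_init_count; return 0; }
static int init_b(void *pp){ (void)pp; ++demo_init_count; return 0; }

static const demo_module demo_mod_a = { "a", init_a };
static const demo_module demo_mod_b = { "b", init_b };

// Table of the modules linked directly into the application.
static const demo_module * const demo_builtins[] = {
  &demo_mod_a, &demo_mod_b
};

// Calls init() for each built-in module. Returns 0 on success or
// the first non-0 init() result.
int demo_init_builtins(void *pp){
  size_t i;
  for( i = 0; i < sizeof(demo_builtins)/sizeof(demo_builtins[0]); ++i ){
    int rc;
    assert( demo_builtins[i]->init != 0 );
    rc = demo_builtins[i]->init(pp);
    if( rc ) return rc;
  }
  return 0;
}
```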
+ + @see CMPP_MODULE_IMPL_SOLO + @see CMPP_MODULE_REGISTER +*/ +#if defined(CMPP_MODULE_STANDALONE) +# define CMPP_MODULE_STANDALONE_IMPL2(NAME,INIT_F) \ + CMPP_MODULE_IMPL_SOLO(NAME,INIT_F) +//arguably too much magic in one place: +//# if !defined(CMPP_API_THUNK) +//# define CMPP_API_THUNK +//# endif +#else +# define CMPP_MODULE_STANDALONE_IMPL2(NAME,INIT_F) \ + extern void cmpp_module__dummy_does_not_exist__(void) +#endif + +/** @def CMPP_MODULE_REGISTER3 + + Performs all the necessary setup for registering a loadable module, + including declaration and definition. NAME is the stringified name + of the module. This is normally called immediately after defining + the plugin's init func (which is passed as the 3rd argument to this + macro). + + See CMPP_MODULE_IMPL3() and CMPP_MODULE_STANDALONE_IMPL2() for + the fine details. +*/ +#define CMPP_MODULE_REGISTER3(CNAME,NAME,INIT_F) \ + CMPP_MODULE_IMPL3(CNAME,NAME,INIT_F); \ + CMPP_MODULE_STANDALONE_IMPL2(NAME,INIT_F) + +/** + Slight convenience form of CMPP_MODULE_REGISTER3() which assumes a + registration function name of cmpp_module__${CNAME}_register(). +*/ +#define CMPP_MODULE_REGISTER2(CNAME,NAME) \ + CMPP_MODULE_REGISTER3(CNAME,NAME,cmpp_module__ ## CNAME ## _register) + +/** + Slight convenience form of CMPP_MODULE_REGISTER2() for cases when + CNAME and NAME are the same. +*/ +#define CMPP_MODULE_REGISTER1(CNAME) \ + CMPP_MODULE_REGISTER3(CNAME,#CNAME,cmpp_module__ ## CNAME ## _register) + +/** + This looks for a DLL file named fname. If found, it is dlopen()ed + (or equivalent) and searched for a symbol named symName. If found, + it is assumed to be a cmpp_module instance and its init() method is + invoked. + + If fname is NULL then the module is looked up in the + currently-running program. + + If symName is NULL then the name "cmpp_module1" is assumed, which + is the name used by CMPP_MODULE_IMPL_SOLO() and friends (for use + when a module is the only one in its DLL). 
+ + If no match is found, or there's a problem loading the DLL or + resolving the name, non-0 is returned. Similarly, if the init() + method fails, non-0 is returned. + + The file name is searched using the cmpp_module_dir_add() path, and + if fname is an exact match, or an exact match when the system's + conventional DLL file extension is appended to it, that is used + rather than any potential match from the search path. + + On error, pp's error state will contain more information. It's + indeterminate which errors from this API are recoverable. + + This function is a no-op if called when pp's error state is set, + returning that code. + + If built without module-loading support then this will always + fail with CMPP_RC_UNSUPPORTED. +*/ +CMPP_EXPORT int cmpp_module_load(cmpp * pp, char const * fname, + char const * symName); + +/** + Adds the directory or directories listed in zDirs to the search + path used by cmpp_module_load(). The entries are expected to be + either colon- or semicolon-delimited, depending on the platform the + library was built for. + + If zDirs is NULL and pp's library path is empty then it looks for + the environment variable CMPP_MODULE_PATH. If that is set, it is + used in place of zDirs, otherwise the library's compile-time + default is used (as set by the CMPP_MODULE_PATH compile-time value, + which defaults to ".:$prefix/lib/cmpp" in the canonical builds). + This should only be done once per cmpp instance, as the path will + otherwise be extended each time. (The current list structure does + not make it easy to recognize duplicates.) + + Returns 0 on success or if zDirs is empty. Returns CMPP_RC_OOM on + allocation error (ostensibly recoverable - see cmpp_err_set()). + + This is a no-op if called when pp has error state, returning that + code without other side-effects. + + If modules are not enabled then this function is a no-op and always + returns CMPP_RC_UNSUPPORTED _without_ setting pp's error state (as + it's not an error, per se). 
That can typically be ignored as a + non-error. +*/ +CMPP_EXPORT int cmpp_module_dir_add(cmpp *pp, const char * zDirs); + + +/** + State for a cmpp_dx_pimpl which we need in order to snapshot the + parse position for purposes of restoring it later. This is + basically to support that #query can contain other #query + directives, but this same capability is required by any directives + which want to both process directives in their content block and + loop over the content block. +*/ +struct cmpp_dx_pos { + /** Current parse pos. */ + unsigned char const *z; + /** Current line number. */ + cmpp_size_t lineNo; +}; +typedef struct cmpp_dx_pos cmpp_dx_pos; +#define cmpp_dx_pos_empty_m {.z=0,.lineNo=0U}//,.dline=CmppDLine_empty_m} + +/** + Stores dx's current input position into pos. pos gets completely + initialized by this routine - it need not (in contrast to many + other functions in this library) be cleanly initialized by the + caller first. +*/ +CMPP_EXPORT void cmpp_dx_pos_save(cmpp_dx const * dx, cmpp_dx_pos *pos); + +/** + Restores dx's input position from pos. Results are undefined if pos + is not populated with the result of having passed the same dx/pos + pointer combination to cmpp_dx_pos_save(). +*/ +CMPP_EXPORT void cmpp_dx_pos_restore(cmpp_dx * dx, cmpp_dx_pos const * pos); + +/** + A "thunk" for use with loadable modules, encapsulating all of the + functions from the public cmpp API into an object. This allows + loadable modules to call into the cmpp API if the binary which + loads them is not built in such a way that it exports libcmpp's + symbols to the DLL. (On Linux systems, that means if it's not + linked with -rdynamic.) + + For every public cmpp function, this struct has a member with the + same signature and name, minus the "cmpp_" name prefix. Thus + cmpp_foo(...) is accessible as api->foo(...). + + Object-type exports, e.g. cmpp_b_empty, are exposed here as + pointers instead of objects. 
The CMPP_API_THUNK-installed API + wrapper macros account for that. + + There is only one instance of this class and it gets passed into + cmpp_module_init_f() methods. It is also assigned to the + cmpp_dx::api member of cmpp_dx instances which get passed to + cmpp_dx_f() implementations. + + Loadable modules "should" use this interface to access the API, + rather than the global symbols. If they don't then the module may, + depending on how the loading application was linked, throw + unresolved-symbol errors when loading. +*/ +struct cmpp_api_thunk { +#define A(VER) +#define V(N,T,VER) T N; +#define F(N,T,P) T (*N)P; +#define O(N,T) T * const N; + cmpp_api_thunk_map(A,V,F,O) +#undef F +#undef O +#undef V +#undef A +}; + +/** + For loadable modules to be able to portably access the cmpp API, + without requiring that their loading binary be linked with + -rdynamic, we need a "thunk". The API exposes cmpp_api_thunk + for that purpose. The following macros set up the thunk for + a given compilation unit. They are intended to only be used + by loadable modules, not generic client code. + + Before including this header, define CMPP_API_THUNK with no value + and/or define CMPP_API_THUNK_NAME to a C symbol name. The latter + macro implies the former and defines the name of the static symbol + to be the local cmpp_api_thunk instance, defaulting to cmppApi. + + The first line of a module's registration function should then be: + + cmpp_api_init(pp); + + where pp is the name of the sole argument to the registration + callback. After that is done, the cmpp_...() APIs may be used via + the macros defined below, all of which route through the thunk + object. 
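
  To illustrate the mechanism (not the real API), the following
  self-contained sketch builds a two-function thunk from an X-macro
  map and routes calls through a file-local pointer. All demo_ names,
  demoApi, and the version number 2 are hypothetical:

```c
#include <assert.h>

// X-macro map: A() marks a version boundary, V() declares the
// version member, F() declares a function member.
#define demo_api_map(A,V,F) \
  A(0)                      \
  V(apiVersion,unsigned,2)  \
  F(add,int,(int,int))      \
  A(2)                      \
  F(mul,int,(int,int))

// Expand the map into struct members, as cmpp_api_thunk does.
typedef struct demo_api {
#define A(VER)
#define V(N,T,VER) T N;
#define F(N,T,P) T (*N)P;
  demo_api_map(A,V,F)
#undef F
#undef V
#undef A
} demo_api;

static int demo_add_impl(int a, int b){ return a + b; }
static int demo_mul_impl(int a, int b){ return a * b; }

// The host application's single thunk instance.
static const demo_api demo_api_instance = {
  2, demo_add_impl, demo_mul_impl
};

// Module side: a file-local pointer plus wrapper macros which
// proxy each call through it.
static demo_api const * demoApi = 0;
#define demo_api_init(API) demoApi = (API)
#define demo_add demoApi->add
#define demo_mul demoApi->mul

// Uses mul only if the runtime thunk is new enough to provide it.
int demo_module_compute(demo_api const *api){
  demo_api_init(api);
  assert( demoApi != 0 );
  if( demoApi->apiVersion >= 2 ) return demo_mul(demo_add(1, 2), 4);
  return demo_add(1, 2);
}
```

  Because members are only ever appended to the map, a module built
  against an older map still finds every member it knows about at the
  same offset in a newer thunk.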
+*/ +#if defined(CMPP_API_THUNK) || defined(CMPP_API_THUNK_NAME) +# if !defined(CMPP_API_THUNK) +# define CMPP_API_THUNK +# endif +# if !defined(CMPP_API_THUNK_NAME) +# define CMPP_API_THUNK_NAME cmppApi +# endif +# if !defined(CMPP_API_THUNK__defined) +# define CMPP_API_THUNK__defined +static cmpp_api_thunk const * CMPP_API_THUNK_NAME = 0; +# endif +/** + cmpp_api_init() must be invoked from the module's registration + function, passed the only argument to that function. It sets the + global symbol CMPP_API_THUNK_NAME to its argument. From that point + on, the thunk's API is accessible via cmpp_foo macros which proxy + theThunk->foo. + + It is safe to call this from, e.g. a cmpp_dx_f() implementation, as + it will always have the same pointer, so long as it is not passed + NULL, which would make the next cmpp_...() call segfault. +*/ +# if !defined(CMPP_API_THUNK__assigned) +# define CMPP_API_THUNK__assigned +# define cmpp_api_init(PP) CMPP_API_THUNK_NAME = (PP)->api +# else +# define cmpp_api_init(PP) (void)(PP)/*CMPP_API_THUNK_NAME*/ +# endif +/* What follows is generated code from c-pp's (#pragma api-thunk). */ +/* Thunk APIs which follow are available as of version 0... 
*/ +#define cmpp_mrealloc CMPP_API_THUNK_NAME->mrealloc +#define cmpp_malloc CMPP_API_THUNK_NAME->malloc +#define cmpp_mfree CMPP_API_THUNK_NAME->mfree +#define cmpp_ctor CMPP_API_THUNK_NAME->ctor +#define cmpp_dtor CMPP_API_THUNK_NAME->dtor +#define cmpp_reset CMPP_API_THUNK_NAME->reset +#define cmpp_check_oom CMPP_API_THUNK_NAME->check_oom +#define cmpp_is_legal_key CMPP_API_THUNK_NAME->is_legal_key +#define cmpp_define_legacy CMPP_API_THUNK_NAME->define_legacy +#define cmpp_define_v2 CMPP_API_THUNK_NAME->define_v2 +#define cmpp_undef CMPP_API_THUNK_NAME->undef +#define cmpp_define_shadow CMPP_API_THUNK_NAME->define_shadow +#define cmpp_define_unshadow CMPP_API_THUNK_NAME->define_unshadow +#define cmpp_process_string CMPP_API_THUNK_NAME->process_string +#define cmpp_process_file CMPP_API_THUNK_NAME->process_file +#define cmpp_process_stream CMPP_API_THUNK_NAME->process_stream +#define cmpp_process_argv CMPP_API_THUNK_NAME->process_argv +#define cmpp_err_get CMPP_API_THUNK_NAME->err_get +#define cmpp_err_set CMPP_API_THUNK_NAME->err_set +#define cmpp_err_set1 CMPP_API_THUNK_NAME->err_set1 +#define cmpp_err_has CMPP_API_THUNK_NAME->err_has +#define cmpp_is_safemode CMPP_API_THUNK_NAME->is_safemode +#define cmpp_sp_begin CMPP_API_THUNK_NAME->sp_begin +#define cmpp_sp_commit CMPP_API_THUNK_NAME->sp_commit +#define cmpp_sp_rollback CMPP_API_THUNK_NAME->sp_rollback +#define cmpp_output_f_FILE CMPP_API_THUNK_NAME->output_f_FILE +#define cmpp_output_f_fd CMPP_API_THUNK_NAME->output_f_fd +#define cmpp_input_f_FILE CMPP_API_THUNK_NAME->input_f_FILE +#define cmpp_input_f_fd CMPP_API_THUNK_NAME->input_f_fd +#define cmpp_flush_f_FILE CMPP_API_THUNK_NAME->flush_f_FILE +#define cmpp_stream CMPP_API_THUNK_NAME->stream +#define cmpp_slurp CMPP_API_THUNK_NAME->slurp +#define cmpp_fopen CMPP_API_THUNK_NAME->fopen +#define cmpp_fclose CMPP_API_THUNK_NAME->fclose +#define cmpp_outputer_out CMPP_API_THUNK_NAME->outputer_out +#define cmpp_outputer_flush 
CMPP_API_THUNK_NAME->outputer_flush +#define cmpp_outputer_cleanup CMPP_API_THUNK_NAME->outputer_cleanup +#define cmpp_outputer_cleanup_f_FILE CMPP_API_THUNK_NAME->outputer_cleanup_f_FILE +#define cmpp_delimiter_set CMPP_API_THUNK_NAME->delimiter_set +#define cmpp_delimiter_get CMPP_API_THUNK_NAME->delimiter_get +#define cmpp_chomp CMPP_API_THUNK_NAME->chomp +#define cmpp_b_clear CMPP_API_THUNK_NAME->b_clear +#define cmpp_b_reuse CMPP_API_THUNK_NAME->b_reuse +#define cmpp_b_swap CMPP_API_THUNK_NAME->b_swap +#define cmpp_b_reserve CMPP_API_THUNK_NAME->b_reserve +#define cmpp_b_reserve3 CMPP_API_THUNK_NAME->b_reserve3 +#define cmpp_b_append CMPP_API_THUNK_NAME->b_append +#define cmpp_b_append4 CMPP_API_THUNK_NAME->b_append4 +#define cmpp_b_append_ch CMPP_API_THUNK_NAME->b_append_ch +#define cmpp_b_append_i32 CMPP_API_THUNK_NAME->b_append_i32 +#define cmpp_b_append_i64 CMPP_API_THUNK_NAME->b_append_i64 +#define cmpp_b_chomp CMPP_API_THUNK_NAME->b_chomp +#define cmpp_output_f_b CMPP_API_THUNK_NAME->output_f_b +#define cmpp_outputer_cleanup_f_b CMPP_API_THUNK_NAME->outputer_cleanup_f_b +#define cmpp_version CMPP_API_THUNK_NAME->version +#define cmpp_tt_cstr CMPP_API_THUNK_NAME->tt_cstr +#define cmpp_dx_err_set CMPP_API_THUNK_NAME->dx_err_set +#define cmpp_dx_next CMPP_API_THUNK_NAME->dx_next +#define cmpp_dx_process CMPP_API_THUNK_NAME->dx_process +#define cmpp_dx_consume CMPP_API_THUNK_NAME->dx_consume +#define cmpp_dx_consume_b CMPP_API_THUNK_NAME->dx_consume_b +#define cmpp_arg_parse CMPP_API_THUNK_NAME->arg_parse +#define cmpp_arg_strdup CMPP_API_THUNK_NAME->arg_strdup +#define cmpp_arg_to_b CMPP_API_THUNK_NAME->arg_to_b +#define cmpp_errno_rc CMPP_API_THUNK_NAME->errno_rc +#define cmpp_d_register CMPP_API_THUNK_NAME->d_register +#define cmpp_dx_f_dangling_closer CMPP_API_THUNK_NAME->dx_f_dangling_closer +#define cmpp_dx_out_raw CMPP_API_THUNK_NAME->dx_out_raw +#define cmpp_dx_out_expand CMPP_API_THUNK_NAME->dx_out_expand +#define cmpp_dx_outf 
CMPP_API_THUNK_NAME->dx_outf +#define cmpp_dx_delim CMPP_API_THUNK_NAME->dx_delim +#define cmpp_atpol_from_str CMPP_API_THUNK_NAME->atpol_from_str +#define cmpp_atpol_get CMPP_API_THUNK_NAME->atpol_get +#define cmpp_atpol_set CMPP_API_THUNK_NAME->atpol_set +#define cmpp_atpol_push CMPP_API_THUNK_NAME->atpol_push +#define cmpp_atpol_pop CMPP_API_THUNK_NAME->atpol_pop +#define cmpp_unpol_from_str CMPP_API_THUNK_NAME->unpol_from_str +#define cmpp_unpol_get CMPP_API_THUNK_NAME->unpol_get +#define cmpp_unpol_set CMPP_API_THUNK_NAME->unpol_set +#define cmpp_unpol_push CMPP_API_THUNK_NAME->unpol_push +#define cmpp_unpol_pop CMPP_API_THUNK_NAME->unpol_pop +#define cmpp_path_search CMPP_API_THUNK_NAME->path_search +#define cmpp_args_parse CMPP_API_THUNK_NAME->args_parse +#define cmpp_args_cleanup CMPP_API_THUNK_NAME->args_cleanup +#define cmpp_dx_args_clone CMPP_API_THUNK_NAME->dx_args_clone +#define cmpp_popen CMPP_API_THUNK_NAME->popen +#define cmpp_popenv CMPP_API_THUNK_NAME->popenv +#define cmpp_pclose CMPP_API_THUNK_NAME->pclose +#define cmpp_popen_args CMPP_API_THUNK_NAME->popen_args +#define cmpp_kav_each CMPP_API_THUNK_NAME->kav_each +#define cmpp_d_autoloader_set CMPP_API_THUNK_NAME->d_autoloader_set +#define cmpp_d_autoloader_take CMPP_API_THUNK_NAME->d_autoloader_take +#define cmpp_isspace CMPP_API_THUNK_NAME->isspace +#define cmpp_isnl CMPP_API_THUNK_NAME->isnl +#define cmpp_issnl CMPP_API_THUNK_NAME->issnl +#define cmpp_skip_space CMPP_API_THUNK_NAME->skip_space +#define cmpp_skip_snl CMPP_API_THUNK_NAME->skip_snl +#define cmpp_skip_space_trailing CMPP_API_THUNK_NAME->skip_space_trailing +#define cmpp_skip_snl_trailing CMPP_API_THUNK_NAME->skip_snl_trailing +#define cmpp_array_reserve CMPP_API_THUNK_NAME->array_reserve +#define cmpp_module_load CMPP_API_THUNK_NAME->module_load +#define cmpp_module_dir_add CMPP_API_THUNK_NAME->module_dir_add +#define cmpp_outputer_FILE (*CMPP_API_THUNK_NAME->outputer_FILE) +#define cmpp_outputer_b 
(*CMPP_API_THUNK_NAME->outputer_b) +#define cmpp_outputer_empty (*CMPP_API_THUNK_NAME->outputer_empty) +#define cmpp_b_empty (*CMPP_API_THUNK_NAME->b_empty) +/* Thunk APIs which follow are available as of version 20251116... */ +#define cmpp_next_chunk CMPP_API_THUNK_NAME->next_chunk +/* Thunk APIs which follow are available as of version 20251118... */ +#define cmpp_atdelim_get CMPP_API_THUNK_NAME->atdelim_get +#define cmpp_atdelim_set CMPP_API_THUNK_NAME->atdelim_set +#define cmpp_atdelim_push CMPP_API_THUNK_NAME->atdelim_push +#define cmpp_atdelim_pop CMPP_API_THUNK_NAME->atdelim_pop +/* Thunk APIs which follow are available as of version 20251224... */ +#define cmpp_dx_pos_save CMPP_API_THUNK_NAME->dx_pos_save +#define cmpp_dx_pos_restore CMPP_API_THUNK_NAME->dx_pos_restore +/* Thunk APIs which follow are available as of version 20260130... */ +#define cmpp_dx_is_call CMPP_API_THUNK_NAME->dx_is_call +/* Thunk APIs which follow are available as of version 20260206... */ +#define cmpp_b_borrow CMPP_API_THUNK_NAME->b_borrow +#define cmpp_b_return CMPP_API_THUNK_NAME->b_return + + +#else /* not CMPP_API_THUNK */ +/** + cmpp_api_init() is a no-op when not including a file-local API + thunk. +*/ +# define cmpp_api_init(PP) (void)0 +#endif /* CMPP_API_THUNK */ + +#ifdef __cplusplus +} /* extern "C" */ +#endif +#endif /* include guard */ +#endif /* NET_WANDERINGHORSE_LIBCMPP_H_INCLUDED */ +#if !defined(NET_WANDERINGHORSE_CMPP_INTERNAL_H_INCLUDED) +#define NET_WANDERINGHORSE_CMPP_INTERNAL_H_INCLUDED +/** + This file houses declarations and macros for the private/internal + libcmpp APIs. 
+*/ +#include "sqlite3.h" + +#include +#include +#include +#include +#include +#include + +/* write() and friends */ +#if defined(_WIN32) || defined(WIN32) +# include +# include +# ifndef access +# define access(f,m) _access((f),(m)) +# endif +#else +# include +# include +#endif + +#ifndef CMPP_DEFAULT_DELIM +#define CMPP_DEFAULT_DELIM "##" +#endif + +#ifndef CMPP_ATSIGN +#define CMPP_ATSIGN (unsigned char)'@' +#endif + +#ifndef CMPP_MODULE_PATH +#define CMPP_MODULE_PATH "." +#endif + +#if defined(NDEBUG) +#define cmpp__staticAssert(NAME,COND) (void)1 +#else +#define cmpp__staticAssert(NAME,COND) \ + static const char staticAssert_ ## NAME[ (COND) ? 1 : -1 ] = {0}; \ + (void)staticAssert_ ## NAME +#endif + +#if defined(CMPP_OMIT_ALL_UNSAFE) +#undef CMPP_OMIT_D_PIPE +#define CMPP_OMIT_D_PIPE +#undef CMPP_OMIT_D_DB +#define CMPP_OMIT_D_DB +#undef CMPP_OMIT_D_INCLUDE +#define CMPP_OMIT_D_INCLUDE +#undef CMPP_OMIT_D_MODULE +#define CMPP_OMIT_D_MODULE +#endif + +#if !defined(CMPP_VERSION) +#error "expecting CMPP_VERSION to have been set up" +#endif + +#define CMPP__DB_MAIN_NAME "cmpp" + +#if defined(CMPP_AMALGAMATION) +#define CMPP_PRIVATE static +#else +#define CMPP_PRIVATE +#endif + +#if CMPP_PLATFORM_IS_WASM +# define CMPP_PLATFORM_IS_WINDOWS 0 +# define CMPP_PLATFORM_IS_UNIX 0 +# define CMPP_PLATFORM_PLATFORM "wasm" +# define CMPP_PATH_SEPARATOR ':' +# define CMPP__EXPORT_NAMED(X) __attribute__((export_name(#X),used,visibility("default"))) +// See also: +//__attribute__((export_name("theExportedName"), used, visibility("default"))) +# define CMPP_OMIT_FILE_IO /* potential todo but with a large footprint */ +# if !defined(CMPP_PLATFORM_EXT_DLL) +# define CMPP_PLATFORM_EXT_DLL "" +# endif +#else +//# define CMPP_WASM_EXPORT +# define CMPP__EXPORT_NAMED(X) +# if defined(_WIN32) || defined(WIN32) +# define CMPP_PLATFORM_IS_WINDOWS 1 +# define CMPP_PLATFORM_IS_UNIX 0 +# define CMPP_PLATFORM_PLATFORM "windows" +# define CMPP_PATH_SEPARATOR ';' +//# include +# elif 
defined(__MINGW32__) || defined(__MINGW64__) +# define CMPP_PLATFORM_IS_WINDOWS 1 +# define CMPP_PLATFORM_IS_UNIX 0 +# define CMPP_PLATFORM_PLATFORM "windows" +# define CMPP_PATH_SEPARATOR ':' /*?*/ +# elif defined(__CYGWIN__) +# define CMPP_PLATFORM_IS_WINDOWS 0 +# define CMPP_PLATFORM_IS_UNIX 1 +# define CMPP_PLATFORM_PLATFORM "unix" +# define CMPP_PATH_SEPARATOR ':' +# else +# define CMPP_PLATFORM_IS_WINDOWS 0 +# define CMPP_PLATFORM_IS_UNIX 1 +# define CMPP_PLATFORM_PLATFORM "unix" +# define CMPP_PATH_SEPARATOR ':' +# endif +#endif + +#define CMPP__EXPORT(RETTYPE,NAME) CMPP__EXPORT_NAMED(NAME) RETTYPE NAME + +#if !defined(CMPP_PLATFORM_EXT_DLL) +# error "Expecting CMPP_PLATFORM_EXT_DLL to have been set by the auto-configured bits" +# define CMPP_PLATFORM_EXT_DLL "???" +#endif + +#if 1 +# define CMPP_NORETURN __attribute__((noreturn)) +#else +# define CMPP_NORETURN +#endif + +/** @def CMPP_HAVE_DLOPEN + + If set to true, use dlopen() and friends. Requires + linking to -ldl on some platforms. + + Only one of CMPP_HAVE_DLOPEN and CMPP_HAVE_LTDLOPEN may be + true. +*/ +/** @def CMPP_HAVE_LTDLOPEN + + If set to true, use lt_dlopen() and friends. Requires + linking to -lltdl on most platforms. + + Only one of CMPP_HAVE_DLOPEN and CMPP_HAVE_LTDLOPEN may be + true. 
+*/ +#if !defined(CMPP_HAVE_DLOPEN) +# if defined(HAVE_DLOPEN) +# define CMPP_HAVE_DLOPEN HAVE_DLOPEN +# else +# define CMPP_HAVE_DLOPEN 0 +# endif +#endif + +#if !defined(CMPP_HAVE_LTDLOPEN) +# if defined(HAVE_LTDLOPEN) +# define CMPP_HAVE_LTDLOPEN HAVE_LTDLOPEN +# else +# define CMPP_HAVE_LTDLOPEN 0 +# endif +#endif + +#if !defined(CMPP_ENABLE_DLLS) +# define CMPP_ENABLE_DLLS (CMPP_HAVE_LTDLOPEN || CMPP_HAVE_DLOPEN) +#endif +#if CMPP_ENABLE_DLLS && !defined(CMPP_OMIT_D_MODULE) +# define CMPP_D_MODULE 1 +#else +# define CMPP_D_MODULE 0 +#endif + +/** + Many years of practice have taught that it is literally impossible + to safely close DLLs because simply opening one may trigger + arbitrary code (at least for C++ DLLs) which "might" be used by the + application. e.g. some classloaders use DLL initialization to inject + new classes into the application without the app having to do + anything more than open the DLL. (That's precisely what the cmpp + port of this code is doing except that we don't call it classloading + here.) + + So cmpp does not close DLLs. Except (...sigh...) to please valgrind. + + When CMPP_CLOSE_DLLS is true then this API will keep track of DLL + handles so that they can be closed, and offers the ability for + higher-level clients to close them (all at once, not individually). +*/ +#if !defined(CMPP_CLOSE_DLLS) +#define CMPP_CLOSE_DLLS 1 +#endif + +/** Proxy for cmpp_malloc() which (A) is a no-op if ppCode + and (B) sets pp->err on OOM. +*/ +CMPP_PRIVATE void * cmpp__malloc(cmpp* pp, cmpp_size_t n); + +/** + Internal-use-only flags for use with cmpp_d::flags. + + These supplement the ones from the public API's cmpp_d_e. +*/ +enum cmpp_d_ext_e { + /** + Mask of flag bits from this enum and cmpp_d_e which are for + internal use only and are disallowed in client-side directives. 
+ */ + cmpp_d_F_MASK_INTERNAL = ~cmpp_d_F_MASK, + + /** + If true, and if cmpp_d_F_ARGS_LIST is set, then cmpp_args_parse() + will pass its results to cmpp_args__not_simplify(). Only + directives which eval cmpp_arg expressions need this, and the + library does not expose the pieces for evaluating such + expressions. As such, this flag is for internal use only. This + only has an effect if cmpp_d_F_ARGS_LIST is also used. + */ + cmpp_d_F_NOT_SIMPLIFY = 0x10000, + + /** + Most directives are inert when they are seen in the "falsy" part + of an if/else. The callbacks for such directives are skipped, as + opposed to requiring each directive's callback to check whether + they should be skipping. This flag indicates that a directive + must always be run, even when skipping content (e.g. inside of an + #if 0 block). Only flow-control directives may have the + FLOW_CONTROL bit set. The library API does not expose enough of + its internals for client-defined directives to make flow-control + decisions. + + i really want to get rid of this flag but it seems to be a + necessary evil. + */ + cmpp_d_F_FLOW_CONTROL = 0x20000 +}; + +/** + A single directive line from an input stream. +*/ +struct CmppDLine { + /** Line number in the source input. */ + cmpp_size_t lineNo; + /** Start of the line within its source input. */ + unsigned char const * zBegin; + /** One-past-the-end byte of the line. A virtual EOF. It will only + actually be NUL-terminated if it is the last line of the input + and that input has no trailing newline. */ + unsigned char const * zEnd; +}; +typedef struct CmppDLine CmppDLine; +#define CmppDLine_empty_m {0U,0,0} + +/** + A snippet from a string. +*/ +struct CmppSnippet { + /* Start of the range. */ + unsigned char const *z; + /* Number of bytes. 
*/ + unsigned int n; +}; +typedef struct CmppSnippet CmppSnippet; +#define CmppSnippet_empty_m {0,0} + +/** + CmppLvl represents one "level" of parsing, pushing one level + for each `#if` and popping one for each `#/if`. + + These pieces are ONLY for use with flow-control directives. It's + not proven that they can be of any use to more than a single + flow-control directive. e.g. if we had a hypothetical #foreach, we + may need to extend this. +*/ +struct CmppLvl { +#if 0 + /** + The directive on whose behalf this level was opened. + */ + cmpp_d const * d; + /** + Opaque directive-specific immutable state. It's provided as a way + for a directive to see whether the top of the stack is correct + after it processes inner directives. + */ + void const * state; +#endif + /** + Bitmask of CmppLvl_F_... + */ + cmpp_flag32_t flags; + /** + The directive line number which started this level. This is used for + reporting the starting lines of erroneously unclosed block + constructs. + */ + cmpp_size_t lineNo; +}; +typedef struct CmppLvl CmppLvl; +#define CmppLvl_empty_m {/*.d=0, .state=0,*/ .flags=0U, .lineNo=0U} + +/** + Declares struct T as a container for a list-of-MT. MT may be + pointer-qualified. cmpp__ListType_impl() with the same arguments + implements T_reserve() for basic list allocation. Cleanup, alas, is + MT-dependent. +*/ +#define cmpp__ListType_decl(T,MT) \ + struct T { \ + MT * list; \ + cmpp_size_t n; \ + cmpp_size_t nAlloc; \ + }; \ + typedef struct T T; \ + int T ## _reserve(cmpp *pp, T *li, cmpp_size_t min) +#define CMPP__MAX(X,Y) ((X)>=(Y) ? (X) : (Y)) +#define cmpp__ListType_impl(T,MT) \ + int T ## _reserve(cmpp *pp,struct T *li, cmpp_size_t min) { \ + return cmpp_array_reserve(pp, (void**)&li->list, min, \ + &li->nAlloc, sizeof(MT)); \ + } + +#define cmpp__LIST_T_empty_m {.list=0,.n=0,.nAlloc=0} + +/** + A dynamically-allocated list of CmppLvl objects. 
+*/ +cmpp__ListType_decl(CmppLvlList,CmppLvl*); +#define CmppLvlList_empty_m cmpp__LIST_T_empty_m +CMPP_PRIVATE CmppLvl * CmppLvl_push(cmpp_dx *dx); +CMPP_PRIVATE CmppLvl * CmppLvl_get(cmpp_dx const *dx); +CMPP_PRIVATE void CmppLvl_pop(cmpp_dx *dx, CmppLvl *lvl); +CMPP_PRIVATE void CmppLvl_elide(CmppLvl *lvl, bool on); +CMPP_PRIVATE bool CmppLvl_is_eliding(CmppLvl const *lvl); +CMPP_PRIVATE bool cmpp_dx_is_eliding(cmpp_dx const *dx); + +/** + A dynamically-allocated list of cmpp_b objects. +*/ +cmpp__ListType_decl(cmpp_b_list,cmpp_b*); +#define cmpp_b_list_empty_m cmpp__LIST_T_empty_m +extern const cmpp_b_list cmpp_b_list_empty; +CMPP_PRIVATE void cmpp_b_list_cleanup(cmpp_b_list *li); +//CMPP_PRIVATE cmpp_b * cmpp_b_list_push(cmpp_b_list *li); +//CMPP_PRIVATE void cmpp_b_list_reuse(cmpp_b_list *li); +/** + cmpp_b_list sorting policies. NULL entries must + always sort last. +*/ +enum cmpp_b_list_e { + cmpp_b_list_UNSORTED, + /* Smallest first. */ + cmpp_b_list_ASC, + /* Largest first. */ + cmpp_b_list_DESC +}; + +/** + A dynamically-allocated list of cmpp_arg objects. Used by + cmpp_args. +*/ +cmpp__ListType_decl(CmppArgList,cmpp_arg); +#define CmppArgList_empty_m cmpp__LIST_T_empty_m + +/** Allocate a new arg, owned by li, and return it (cleanly zeroed + out). Returns NULL and updates pp->err on error. */ +CMPP_PRIVATE cmpp_arg * CmppArgList_append(cmpp *pp, CmppArgList *li); + +/** + The internal part of the cmpp_args interface. +*/ +struct cmpp_args_pimpl { + /** + We need(?) a (cmpp*) here for finalization/recycling purposes. + */ + cmpp *pp; + bool isCall; + /** + Next entry in the free-list. + */ + cmpp_args_pimpl * nextFree; + /** Version 3 of the real args memory. */ + CmppArgList argli; + /** + cmpp_args_parse() copies each argument's bytes into here, + each one NUL-terminated. 
+ */ + cmpp_b argOut; +}; +#define cmpp_args_pimpl_empty_m { \ + .pp = 0, \ + .isCall = false, \ + .nextFree = 0, \ + .argli = CmppArgList_empty_m, \ + .argOut = cmpp_b_empty_m \ +} +extern const cmpp_args_pimpl cmpp_args_pimpl_empty; +void cmpp_args_pimpl_cleanup(cmpp_args_pimpl *p); + +/** + The internal part of the cmpp_dx interface. +*/ +struct cmpp_dx_pimpl { + /** Start of input. */ + unsigned const char * zBegin; + /** One-after-the-end of input. */ + unsigned const char * zEnd; + /** + Current input position. Generally speaking, only + cmpp_dx_delim_search() should update this, but it turns out that + the ability to rewind the input is necessary for looping + constructs, like #query, when they want to be able to include + other directives in their bodies. + */ + cmpp_dx_pos pos; + /** + Current input line. + */ + CmppDLine dline; + /** Number of active #savepoints. */ + unsigned nSavepoint; + /** Current directive's args. */ + cmpp_args args; + /** + A stack of state used by #if and friends to inform the innards + that they must not generate output. This is largely historical + and could have been done differently had this code started as a + library instead of a monolithic app. + + TODO is to figure out how best to move this state completely into + the #if handler, rather than fiddle with this all throughout the + processing. We could maybe move this stack into CmppIfState? + */ + CmppLvlList dxLvl; + struct { + /** + A copy of this->d's input line which gets translated + slightly from its native form for further processing. + */ + cmpp_b line; + /** + Holds the semi-raw input line, stripped only of backslash-escaped + newlines and leading spaces. This is primarily for debug output + but also for custom arg parsing for some directives. + */ + cmpp_b argsRaw; + } buf; + + /** + Record IDs for/from cmpp_[un]define_shadow(). + */ + struct { + /** ID for __FILE__. */ + int64_t sidFile; + /** Rowid for #include path entry. 
*/ + int64_t ridInclPath; + } shadow; + + struct { + /** + Set when we're searching for directives so that we know whether + cmpp_out_expand() should count newlines. + */ + unsigned short countLines; + /** + True if the next directive is the start of a [call]. + */ + bool nextIsCall; + } flags; +}; +/** + Initializes or resets a. Returns non-0 on OOM. +*/ +CMPP_PRIVATE int cmpp_args__init(cmpp *pp, cmpp_args *a); + +/** + If a has state then it's recycled for reuse, else this zeroes out a + except for a->pimpl, which is retained (but may be NULL). +*/ +CMPP_PRIVATE void cmpp_args_reuse(cmpp_args *a); + +#define cmpp_dx_pimpl_empty_m { \ + .zBegin=0, .zEnd=0, \ + .pos=cmpp_dx_pos_empty_m, \ + .dline=CmppDLine_empty_m, \ + .nSavepoint=0, \ + .args = cmpp_args_empty_m, \ + .dxLvl = CmppLvlList_empty_m, \ + .buf = { \ + cmpp_b_empty_m, \ + cmpp_b_empty_m \ + }, \ + .shadow = { \ + .sidFile = 0, \ + .ridInclPath = 0 \ + }, \ + .flags = { \ + .countLines = 0, \ + .nextIsCall = false \ + } \ +} + +/** + A level of indirection for CmppDList in order to be able to + manage ownership of their name (string) lifetimes. +*/ +struct CmppDList_entry { + /** this->d.name.z points to this, which is owned by the CmppDList + which manages this object. */ + char * zName; + /* Potential TODO: move d->id into here. That doesn't eliminate our + dependency on it, though. */ + cmpp_d d; + //cmpp_d_reg reg; +}; +typedef struct CmppDList_entry CmppDList_entry; +#define CmppDList_entry_empty_m {0,cmpp_d_empty_m/*,cmpp_d_reg_empty_m*/} + +/** + A dynamically-allocated list of CmppDList_entry pointers. +*/ +cmpp__ListType_decl(CmppDList,CmppDList_entry*); +#define CmppDList_empty_m cmpp__LIST_T_empty_m + +/** + State for keeping track of DLL handles, a.k.a. shared-object + handles, a.k.a. "soh". + + Instances of this must be either cleanly initialized by bitwise + copying CmppSohList_empty, memset() (or equivalent) them to 0, or + allocating them with CmppSohList_new(). 
+*/ +cmpp__ListType_decl(CmppSohList,void*); +#define CmppSohList_empty_m cmpp__LIST_T_empty_m + +/** + Closes all handles which have been CmppSohList_append()ed to soli + and frees any memory it owns, but does not free soli (which might + be stack-allocated or part of another struct). + + Special case: if built without DLL-closing support then this + is a no-op. +*/ +CMPP_PRIVATE void CmppSohList_close(CmppSohList *soli); + +/** + Operators and operator policies for use with X=Y-format keys. This + is legacy stuff, actually, but some of the #define management still + needs it. +*/ +#define CmppKvp_op_map(E) \ + E(none,"") \ + E(eq1,"=") + +enum CmppKvp_op_e { +#define E(N,S) CmppKvp_op_ ## N, + CmppKvp_op_map(E) +#undef E +}; +typedef enum CmppKvp_op_e CmppKvp_op_e; + +/** + Result type for CmppKvp_parse(). +*/ +struct CmppKvp { + /* Key part of the kvp. */ + CmppSnippet k; + /* Value part of the kvp. Might be empty. */ + CmppSnippet v; + /* Operator part of kvp, if any. */ + CmppKvp_op_e op; +}; +typedef struct CmppKvp CmppKvp; +extern const CmppKvp CmppKvp_empty; + +/** + Parses X or X=Y into p. Sets pp's error state on error. + + If nKey is negative then strlen() is used to calculate it. + + The third argument specifies whether/how to permit/treat the '=' + part of X=Y. +*/ +CMPP_PRIVATE int CmppKvp_parse(cmpp *pp, CmppKvp * p, + unsigned char const *zKey, + cmpp_ssize_t nKey, + CmppKvp_op_e opPolicy); + + +/** + Stack of POD values. Intended for use with cmpp at-token and + undefined key policies. 
+*/ +#define cmpp__PodList_decl(ST,ET) \ + struct ST { \ + /* current stack index */ \ + cmpp_size_t n; \ + cmpp_size_t na; \ + ET * stack; \ + }; typedef struct ST ST; \ + void ST ## _wipe(ST * s, ET v); \ + int ST ## _push(cmpp *pp, ST * s, ET v); \ + void ST ## _set(ST * s, ET v); \ + void ST ## _finalize(ST * s); \ + void ST ## _pop(ST *s); \ + int ST ## _reserve(cmpp *, ST *, cmpp_size_t min) + +#define cmpp__PodList_impl(ST,ET) \ + void ST ## _wipe(ST * const s, ET v){ \ + if( s->na ) memset(s->stack, (int)v, sizeof(ET)*s->na); \ + s->n = 0; \ + } \ + int ST ## _reserve(cmpp * const pp, ST * const s, \ + cmpp_size_t min){ \ + return cmpp_array_reserve(pp, (void**)&s->stack, min>0 \ + ? min : (s->n \ + ? (s->n==s->na-1 \ + ? s->na*2 : s->n+1) \ + : 8), \ + &s->na, sizeof(ET)); \ + } \ + int ST ## _push(cmpp * const pp, ST * s, ET v){ \ + if( 0== ST ## _reserve(pp, s, 0) ) s->stack[++s->n] = v; \ + return ppCode; \ + } \ + void ST ## _set(ST * s, ET v){ \ + assert(s->n); \ + if( 0== ST ## _reserve(NULL, s, 0) ){ \ + s->stack[s->n] = v; \ + } \ + } \ + ET ST ## _get(ST const * const s){ \ + assert(s->na && s->na >=s->n); \ + return s->stack[s->n]; \ + } \ + void ST ## _pop(ST *s){ \ + assert(s->n); \ + if(s->n) --s->n; \ + } \ + void ST ## _finalize(ST *s){ \ + cmpp_mfree(s->stack); \ + s->stack = NULL; \ + s->n = s->na = 0; \ + } + +cmpp__PodList_decl(PodList__atpol,cmpp_atpol_e); +cmpp__PodList_decl(PodList__unpol,cmpp_unpol_e); + +#define cmpp__epol(PP,WHICH) (PP)->pimpl->policy.WHICH +#define cmpp__policy(PP,WHICH) \ + cmpp__epol(PP,WHICH).stack[cmpp__epol(PP,WHICH).n] + +/** + A "delimiter" object. That is, the "#" referred to in the libcmpp + docs. It's also outfitted for a second delimiter so that it can be + used for the opening/closing delimiters of @tokens@. +*/ +struct cmpp__delim { + /** + Bytes of the directive delimiter/prefix or the @token@ opening + delimiter. Owned elsewhere but often points at this->zOwns. 
+ */ + CmppSnippet open; + /** + Closing @token@ delimiter. This has no meaning for the directive + delimiter. + */ + CmppSnippet close; + /** + Memory, owned by this object, for this->open and this->close. In + the latter case, it's one string with both delimiters encoded in + it. + */ + unsigned char * zOwns; +}; +typedef struct cmpp__delim cmpp__delim; +#define cmpp__delim_empty_m { \ + .open={ \ + .z=(unsigned char*)CMPP_DEFAULT_DELIM, \ + .n=sizeof(CMPP_DEFAULT_DELIM)-1 \ + }, \ + .close=CmppSnippet_empty_m, \ + .zOwns=0 \ +} + +extern const cmpp__delim cmpp__delim_empty; +void cmpp__delim_cleanup(cmpp__delim *d); + +/** + A dynamically-allocated list of cmpp__delim objects. +*/ +cmpp__ListType_decl(cmpp__delim_list,cmpp__delim); +#define cmpp__delim_list_empty_m {0,0,0} +extern const cmpp__delim_list cmpp__delim_list_empty; + +CMPP_PRIVATE cmpp__delim * cmpp__delim_list_push(cmpp *pp, cmpp__delim_list *li); +static inline cmpp__delim * cmpp__delim_list_get(cmpp__delim_list const *li){ + return li->n ? li->list+(li->n-1) : NULL; +} +static inline void cmpp__delim_list_pop(cmpp__delim_list *li){ + assert(li->n); + if( li->n ) cmpp__delim_cleanup(li->list + --li->n); +} +static inline void cmpp__delim_list_reuse(cmpp__delim_list *li){ + while( li->n ) cmpp__delim_cleanup(li->list + --li->n); +} + +/** + An untested experiment: an output buffer proxy. Integrating this + fully would require some surgery, but it might also inspire me to + do the same with input and stream it rather than slurp it all at + once. +*/ +#define CMPP__OBUF 0 + +typedef struct cmpp__obuf cmpp__obuf; +#if CMPP__OBUF +/** + An untested experiment. +*/ +struct cmpp__obuf { + /** Start of the output buffer. */ + unsigned char * begin; + /** One-after-the-end of this->begin. Virtual EOF. */ + unsigned char const * end; + /** Current write position. Must initially be + this->begin. 
*/ + unsigned char * cursor; + /** + True if this object owns this->begin, which must have been + allocated using cmpp_malloc() or cmpp_realloc(). + */ + bool ownsMemory; + /** Propagating result code. */ + int rc; + /** + The output channel to buffer for. Flushing + */ + cmpp_outputer dest; +}; + +#define cmpp__obuf_empty_m { \ + .begin=0, .end=0, .cursor=0, .ownsMemory=false, \ + .rc=0, .dest=cmpp_outputer_empty_m \ +} +extern const cmpp__obuf cmpp__obuf_empty; +extern const cmpp_outputer cmpp_outputer_obuf; +#endif /* CMPP__OBUF */ +/** + The main public-API context type for this library. +*/ +struct cmpp_pimpl { + /* Internal workhorse. */ + struct { + sqlite3 * dbh; + /** + Optional filename. Memory is owned by this object. + */ + char * zName; + } db; + /** + Current directive context. It's const, primarily to help protect + cmpp_dx_f()'s from inadvertent side effects of changes which + lower-level APIs might make to it. Maybe it shouldn't be: if it + were not then we could update dx->zDelim from + cmpp__delimiter_set(). + */ + cmpp_dx const * dx; + /* Output channel. */ + cmpp_outputer out; + /** + Delimiters version 2. + */ + struct { + /** + Directive delimiter. + */ + cmpp__delim_list d; + /** + @token@ delimiters. + */ + cmpp__delim_list at; + } delim; + struct { + +#define CMPP__SEL_V_FROM(N) \ + "(SELECT v FROM " CMPP__DB_MAIN_NAME ".vdef WHERE k=?" #N \ + " ORDER BY source LIMIT 1)" + + /** + One entry for each distinct query used by cmpp: E(X,SQL), where + X is the member's name and SQL is its SQL. 
+ */ +#define CMPP_SAVEPOINT_NAME "_cmpp_" +#define CmppStmt_map(E) \ + E(sdefIns, \ + "INSERT INTO " \ + CMPP__DB_MAIN_NAME ".sdef" \ + "(t,k,v) VALUES(?1,?2,?3) RETURNING id") \ + E(defIns, \ + "INSERT OR REPLACE INTO " \ + CMPP__DB_MAIN_NAME ".def" \ + "(t,k,v) VALUES(?1,?2,?3)") \ + E(defDel, \ + "DELETE FROM " \ + CMPP__DB_MAIN_NAME ".def" \ + " WHERE k GLOB ?1") \ + E(sdefDel, \ + "DELETE FROM " \ + CMPP__DB_MAIN_NAME ".sdef" \ + " WHERE k=?1 AND id>=?2") \ + E(defHas, \ + "SELECT 1 FROM " \ + CMPP__DB_MAIN_NAME ".vdef" \ + " WHERE k = ?1") \ + E(defGet, \ + "SELECT source,t,k,v FROM " \ + CMPP__DB_MAIN_NAME ".vdef" \ + " WHERE k = ?1 ORDER BY source LIMIT 1") \ + E(defGetBool, \ + "SELECT cmpp_truthy(v) FROM " \ + CMPP__DB_MAIN_NAME ".vdef" \ + " WHERE k = ?1" \ + " ORDER BY source LIMIT 1") \ + E(defGetInt, \ + "SELECT CAST(v AS INTEGER)" \ + " FROM " CMPP__DB_MAIN_NAME ".vdef" \ + " WHERE k = ?1" \ + " ORDER BY source LIMIT 1") \ + E(defSelAll, "SELECT t,k,v" \ + " FROM " CMPP__DB_MAIN_NAME ".vdef" \ + " ORDER BY source, k") \ + E(inclIns," INSERT OR FAIL INTO " \ + CMPP__DB_MAIN_NAME ".incl(" \ + " file,srcFile, srcLine" \ + ") VALUES(?,?,?)") \ + E(inclDel, "DELETE FROM " \ + CMPP__DB_MAIN_NAME ".incl WHERE file=?") \ + E(inclHas, "SELECT 1 FROM " \ + CMPP__DB_MAIN_NAME ".incl WHERE file=?") \ + E(inclPathAdd, "INSERT INTO " \ + CMPP__DB_MAIN_NAME ".inclpath(priority,dir) " \ + "VALUES(coalesce(?1,0),?2) " \ + "ON CONFLICT DO NOTHING " \ + "RETURNING rowid /*xlates to 0 on conflict*/") \ + E(inclPathRmId, "DELETE FROM " \ + CMPP__DB_MAIN_NAME ".inclpath WHERE rowid=?1 " \ + "RETURNING rowid") \ + E(inclSearch, \ + "SELECT ?1 fn WHERE cmpp_file_exists(fn) " \ + "UNION ALL SELECT fn FROM (" \ + " SELECT replace(dir||'/'||?1, '//','/') AS fn " \ + " FROM " CMPP__DB_MAIN_NAME ".inclpath" \ + " WHERE cmpp_file_exists(fn) " \ + " ORDER BY priority DESC, rowid LIMIT 1" \ + ")") \ + E(cmpVV, "SELECT cmpp_compare(?1,?2)") \ + E(cmpDV, \ + "SELECT cmpp_compare(" \ 
+ CMPP__SEL_V_FROM(1) ", ?2" \ + ")") \ + E(cmpVD, \ + "SELECT cmpp_compare(" \ + "?1," CMPP__SEL_V_FROM(2) \ + ")") \ + E(cmpDD, \ + "SELECT cmpp_compare(" \ + CMPP__SEL_V_FROM(1) \ + "," \ + CMPP__SEL_V_FROM(2) \ + ")") \ + E(dbAttach, \ + "ATTACH ?1 AS ?2") \ + E(dbDetach, \ + "DETACH ?1") \ + E(spBegin, "SAVEPOINT " CMPP_SAVEPOINT_NAME) \ + E(spRollback, \ + "ROLLBACK TO SAVEPOINT " CMPP_SAVEPOINT_NAME) \ + E(spRelease, \ + "RELEASE SAVEPOINT " CMPP_SAVEPOINT_NAME) \ + E(insTtype, \ + "INSERT INTO " CMPP__DB_MAIN_NAME ".ttype" \ + "(t,n,s) VALUES(?1,?2,?3)") \ + E(selPathSearch, \ + /* sqlite.org/forum/forumpost/840c98a8e87c2207 */ \ + "WITH path(basename, sep, ext, path) AS (\n" \ + " select\n" \ + " ?1 basename,\n" \ + " ?2 sep,\n" \ + " ?3 ext,\n" \ + " ?4 path\n" \ + "),\n" \ + "pathsplit(i, l, c, r) AS (\n" \ + "-- i = sequential ID\n" \ + "-- l = Length remaining\n" \ + "-- c = text remaining\n" \ + "-- r = current unpacked value\n" \ + " SELECT 1,\n" \ + " length(p.path)+length(p.sep),\n" \ + " p.path||p.sep, ''\n" \ + " FROM path p\n" \ + " UNION ALL\n" \ + " SELECT i+1, instr( c, p.sep ) l,\n" \ + " substr( c, instr( c, p.sep ) + 1) c,\n" \ + " trim( substr( c, 1,\n" \ + " instr( c, p.sep) - 1) ) r\n" \ + " FROM pathsplit, path p\n" \ + " WHERE l > 0\n" \ + "),\n" \ + "thefile (f) AS (\n" \ + " select basename f FROM path\n" \ + " union all\n" \ + " select basename||ext\n" \ + " from path where ext is not null\n" \ + ")\n" \ + "select 0 i, replace(f,'//','/') AS fn\n" \ + "from thefile where cmpp_file_exists(fn)\n" \ + "union all\n" \ + "select i, replace(r||'/'||f,'//','/') fn\n" \ + "from pathsplit, thefile\n" \ + "where r<>'' and cmpp_file_exists(fn)\n" \ + "order by i\n" \ + "limit 1;") + + /* trivia: selPathSearch (^^^) was generated using + cmpp's #c-code directive. */ + +#define E(N,S) sqlite3_stmt * N; + CmppStmt_map(E) +#undef E + + } stmt; + + /** Error state. */ + struct { + /** Result code. 
*/ + int code; + /** Error string owned by this object. */ + char * zMsg; + /** Either this->zMsg or an external error string. */ + char const * zMsgC; + } err; + + /** State for SQL tracing. */ + struct { + bool expandSql; + cmpp_size_t counter; + cmpp_outputer out; + } sqlTrace; + + struct { + /** If set properly, cmpp_dtor() will free this + object, else it will not. */ + void const * allocStamp; + /** + How many dirs we believe are in the #include search list. We + only do this for the sake of the historical "if no path was + added, assume '.'" behavior. This really ought to go away. + */ + unsigned nIncludeDir; + /** + The current depth of cmpp_process_string() calls. We do this so + the directory part of #include'd files can get added to the + #include path and be given a higher priority than previous + include path entries in the stack. + */ + int nDxDepth; + /* Number of active #savepoints. */ + unsigned nSavepoint; + /* If >0, enables certain debugging output. */ + char doDebug; + /* If true, chomp() files read via -Fx=file. */ + unsigned char chompF; + /* Flags passed to cmpp_ctor(). */ + cmpp_flag32_t newFlags; + + /** + An ugly hack for getting cmpp_d_register() to get + syntactically-illegal directive names, like "@policy", + to register. + */ + bool isInternalDirectiveReg; + /** + True if the next directive is the start of a [call]. This is + used for: + + 1) To set cmpp_dx::isCall, which is useful in certain + directives. + + 2) So that the cmpp_dx object created for the call can inherit + the line number from its parent context. That's significant + for error reporting. + + 3) So that #2's cmpp_dx object can communicate that flag to + cmpp_dx_next(). + */ + bool nextIsCall; + + /** + True until the cmpp's (sometimes) lazy init has been run. This + is essentially a kludge to work around a wrench cmpp_reset() + throws into cmpp state. 
Maybe we should just remove + cmpp_reset() from the interface, since error recovery in this + context is not really a thing. + */ + bool needsLazyInit; + } flags; + + /** Policies. */ + struct { + /** @token@-parsing policy. */ + PodList__atpol at; + /** Policy towards referencing undefined symbols. */ + PodList__unpol un; + } policy; + + /** + Directive state. + */ + struct { + /** + Runtime-installed directives. + */ + CmppDList list; + /** + Directive autoloader/auto-registerer. + */ + cmpp_d_autoloader autoload; + } d; + + struct { + /** + List of DLL handles opened by cmpp_module_extract(). + */ + CmppSohList sohList; + /** + Search path for DLLs, delimited by this->pathSep. + */ + cmpp_b path; + /** + File extension for DLLs. + */ + char const * soExt; + /** Separator char for this->path. */ + char pathSep; + } mod; + + struct { + /** + Buffer cache. Managed by cmpp_b_borrow() and cmpp_b_return(). + */ + cmpp_b_list buf; + /** How/whether this->buf is sorted. */ + enum cmpp_b_list_e bufSort; + /** + Head of the free-list. + */ + cmpp_args_pimpl * argPimpl; + } recycler; +}; + +/** Distinct IDs, one for each cmpp::stmt member. */ +enum CmppStmt_e { + CmppStmt_none = 0, +#define E(N,S) CmppStmt_ ## N, + CmppStmt_map(E) +#undef E +}; + +static inline cmpp__delim * cmpp__pp_delim(cmpp const *pp){ + return cmpp__delim_list_get(&pp->pimpl->delim.d); +} +static inline char const * cmpp__pp_zdelim(cmpp const *pp){ + cmpp__delim const * const d = cmpp__pp_delim(pp); + return d ? (char const *)d->open.z : NULL; +} +#define cmpp__dx_delim(DX) cmpp__pp_delim(DX->pp) +#define cmpp__dx_zdelim(DX) cmpp__pp_zdelim(DX->pp) + +/** + Emit [z,(char*)z+n) to the given output channel if + (A) pOut->out is not NULL and (B) pp has no error state and (C) + n>0. On error, pp's error state is updated. Returns pp->err.code. + + Skip level is not honored. 
+*/ +CMPP_PRIVATE int cmpp__out2(cmpp *pp, cmpp_outputer *pOut, void const *z, cmpp_size_t n); + +CMPP_PRIVATE void cmpp__err_clear(cmpp *pp); + + +/** + Initialize pp->db.dbh. If it's already open or ppCode!=0 + then ppCode is returned. +*/ +int cmpp__db_init(cmpp *pp); + +/** + Returns the pp->pimpl->stmt.X corresponding to `which`, initializing it if + needed. If it returns NULL then either this was called when pp has + its error state set or this function will set the error state. + + If prepEvenIfErr is true then the ppCode check is bypassed, but it + will still fail if pp->pimpl->db is not opened or if the preparation itself + fails. +*/ +sqlite3_stmt * cmpp__stmt(cmpp * pp, enum CmppStmt_e which, + bool prepEvenIfErr); + +/** + Reminder to self: this must return an SQLITE_... code, not a + CMPP_RC_... code. + + On success it returns 0, SQLITE_ROW, or SQLITE_DONE. On error it + returns another non-0 SQLITE_... code and updates pp->pimpl->err. + + This is a no-op if called when pp has an error set, returning + SQLITE_ERROR. + + If resetIt is true, q is passed to cmpp__stmt_reset(), else the + caller must eventually reset it. +*/ +int cmpp__step(cmpp * const pp, sqlite3_stmt * const q, bool resetIt); + +/** Resets and clears bindings from q (if q is not NULL). */ +void cmpp__stmt_reset(sqlite3_stmt * const q); + +/** + Expects an SQLite result value. If it's SQLITE_OK, SQLITE_ROW, or + SQLITE_DONE, 0 is returned without side-effects, otherwise pp->err + is updated with pp->db's current error state. zMsgSuffix is an + optional suffix for the error message. +*/ +int cmpp__db_rc(cmpp *pp, int dbRc, char const *zMsgSuffix); + +/* Proxy for sqlite3_bind_int64(). */ +int cmpp__bind_int(cmpp *pp, sqlite3_stmt *pStmt, int col, int64_t val); + +/** + Proxy for cmpp__bind_text() which encodes val as a string. 
+ + For queries which compare values, it's important that they all have + the same type, so some values which would naturally be ints need to be + bound as text instead. See #query for one such case. +*/ +int cmpp__bind_int_text(cmpp *pp, sqlite3_stmt *pStmt, int col, int64_t val); + +/* Proxy for sqlite3_bind_null(). */ +int cmpp__bind_null(cmpp *pp, sqlite3_stmt *pStmt, int col); + +/* Proxy for sqlite3_bind_text() which updates pp->err on error. */ +int cmpp__bind_text(cmpp *pp,sqlite3_stmt *pStmt, int col, + unsigned const char * zStr); + +/* Proxy for sqlite3_bind_text() which updates pp->err on error. */ +int cmpp__bind_textn(cmpp *pp,sqlite3_stmt *pStmt, int col, + unsigned const char *zStr, cmpp_ssize_t len); + +/** + Adds zDir to the include path, using the given priority value (use + 0 except for the implicit cwd path which #include should (but does + not yet) set). If pRowid is not NULL then *pRowid gets set to + either 0 (if zDir was already in the path) or the row id of the + newly-inserted record, which can later be used to delete just that + entry. + + If this returns a non-zero value via pRowid, the caller is + obligated to eventually pass *pRowid to cmpp__include_dir_rm_id(), + even if pp is in an error state. + + TODO: normalize zDir (at least remove trailing slashes) before + insertion so that a/b and a/b/ cannot both be inserted. +*/ +int cmpp__include_dir_add(cmpp *pp, const char * zDir, int priority, int64_t * pRowid); + +/** + Deletes the include path entry with the given rowid. This will + make the attempt even if pp is in an error state, but it retains + any existing error rather than overwriting it if this operation + somehow fails. Returns pp's error code. + + It is not an error for the given entry to not exist. +*/ +int cmpp__include_dir_rm_id(cmpp *pp, int64_t pRowid); + + +#if 0 +/** + Proxy for sqlite3_bind_text(). It uses sqlite3_str_vappendf() so + supports all of its formatting options. 
+*/ +int cmpp__bind_textv(cmpp*pp, sqlite3_stmt *pStmt, int col, + const char * zFmt, ...); +#endif + +/** + Proxy for sqlite3_str_finish() which updates pp's error state if s + has error state. Returns s's string on success and NULL on + error. The returned string must eventually be passed to + cmpp_mfree(). It also, it turns out, returns NULL if s is empty, so + callers must check pp->err to see if NULL is an error. + + If n is not NULL then on success it is set to the byte length of + the returned string. +*/ +char * cmpp_str_finish(cmpp *pp, sqlite3_str *s, int * n); + +/** + Searches pp's list of directives. If found, return it else return + NULL. See cmpp__d_search3(). +*/ +cmpp_d const * cmpp__d_search(cmpp *pp, const char *zName); + +/** + Flags for use with the final argument to + cmpp__d_search3(). +*/ +enum cmpp__d_search3_e { + /** Internal delayed-registered directives. */ + cmpp__d_search3_F_DELAYED = 0x01, + /** Directive autoloader. */ + cmpp__d_search3_F_AUTOLOADER = 0x02, + /** Search for a DLL. */ + cmpp__d_search3_F_DLL = 0x04, + /** Options which do not trigger DLL lookup. */ + cmpp__d_search3_F_NO_DLL = 0 + | cmpp__d_search3_F_DELAYED + | cmpp__d_search3_F_AUTOLOADER, + /** All options. */ + cmpp__d_search3_F_ALL = 0 + | cmpp__d_search3_F_DELAYED + | cmpp__d_search3_F_AUTOLOADER + | cmpp__d_search3_F_DLL +}; + +/** + Like cmpp__d_search() but if no match is found then it will search + through its other options and, if found, register it. + + The final argument specifies where to search. cmpp__d_search() + is always checked first. After that, depending on "what", the search + order is: (1) internal delayed-load modules, (2) autoloader, (3) + DLL. + + This may update pp's error state, in which case it will return + NULL. +*/ +cmpp_d const * cmpp__d_search3(cmpp *pp, const char *zName, + cmpp_flag32_t what); + +/** + Sets pp's error state (A) if it's not set already and (B) if + !cmpp_is_legal_key(zKey). 
If permitEqualSign is true then '=' is + legal (to support legacy CLI pieces). Returns ppCode. +*/ +int cmpp__legal_key_check(cmpp *pp, unsigned char const *zKey, + cmpp_ssize_t nKey, + bool permitEqualSign); + +/** + Appends DLL handle soh to soli. Returns 0 on success, CMPP_RC_OOM + on error. If pp is not NULL then its error state is updated as + well. + + Results are undefined if soli was not cleanly initialized (by + copying CmppSohList_empty or using CmppSohList_new()). + + Special case: if built without DLL-closing support, this is a no-op + returning 0. +*/ +int CmppSohList_append(cmpp *pp, CmppSohList *soli, void *soh); + +/** True if arg is of type cmpp_TT_Word and it looks like it + _might_ be a filename or flag argument. Might. */ +bool cmpp__arg_wordIsPathOrFlag(cmpp_arg const * const arg); + +/** + Helper for #query and friends. Binds aVal's value to column bindNdx + of q. + + It expands cmpp_TT_StringAt and cmpp_TT_Word aVal. cmpp_TT_String + and cmpp_TT_Int are bound as strings. A cmpp_TT_GroupParen aVal is + eval'ed as an integer and that int gets bound as a string. + + This function strictly binds everything as strings, even if the + value being bound is of type cmpp_TT_Int or cmpp_TT_GroupParen, so that + comparison queries will work as expected. + + Returns ppCode. +*/ +int cmpp__bind_arg(cmpp_dx * const dx, sqlite3_stmt * q, + int bindNdx, cmpp_arg const * aVal); + +/** + Helper for #query's bind X part, where aGroup is that X. + + A wrapper around cmpp__bind_arg(). Requires aGroup->ttype to be + either cmpp_TT_GroupBrace or cmpp_TT_GroupSquiggly and to have + non-empty content. cmpp_TT_GroupBrace treats it as a list of values + to bind. cmpp_TT_GroupSquiggly expects sets of 3 tokens per stmt + column in one of these forms: + + :bindName -> value + $bindName -> value + + Each LHS refers to a same-named binding in q's SQL, including the + ':' or '$' prefix. 
(SQLite supports an '@' prefix but we don't + allow it here to avoid confusion with cmpp_TT_StringAt tokens.) + + Each bound value is passed to cmpp__bind_arg() for processing. + + On success, each aGroup entry is bound to q. On error q's state is + unspecified. Returns ppCode. + + See cmpp__bind_arg() for notes about the bind data type. +*/ +int cmpp__bind_group(cmpp_dx * const dx, sqlite3_stmt * const q, + cmpp_arg const * const aGroup); + +/** + Returns true if the given key is already in the `#define` list, + else false. Sets pp's error state on db error. + + nName is the length of the key part of zName (which might have + a following =y part). If it's negative, strlen() is used to + calculate it. +*/ +int cmpp_has(cmpp *pp, const char * zName, cmpp_ssize_t nName); + +/** + Returns true if the given key is already in the `#define` list, and + it has a truthy value (is not empty and not equal to '0'), else + false. If zName contains an '=' then only the part preceding that + is used as the key. + + nName is the length of zName, or <0 to use strlen() to figure + it out. + + Updates ppCode on error. +*/ +int cmpp__get_bool(cmpp *pp, unsigned const char * zName, + cmpp_ssize_t nName); + +/** + Fetches the given define. If found, sets *pOut to it, else *pOut is + unmodified. Returns pp->err.code. If bRequire is true and no entry + is found pp->err.code is updated. +*/ +int cmpp__get_int(cmpp *pp, unsigned const char * zName, + cmpp_ssize_t nName, int *pOut); + +/** + Searches for a define where (k GLOB zName). If one is found, a copy + of it is assigned to *zVal (the caller must eventually db_free() + it), *nVal (if nVal is not NULL) is assigned its strlen, and + returns non-0. If no match is found, 0 is returned and neither + *zVal nor *nVal is modified. If more than one result matches, a + fatal error is triggered. + + It is legal for *zVal to be NULL (and *nVal to be 0) if it returns + non-0. That just means that the key was defined with no value part. 
+ + Fixme: return 0 on success and set output *gotOne=0|1. +*/ +int cmpp__get(cmpp *pp, unsigned const char * zName, + cmpp_ssize_t nName, + unsigned char **zVal, unsigned int *nVal); + +/** + Like cmpp__get() but puts its output in os. +*/ +int cmpp__get_b(cmpp *pp, unsigned const char * zName, + cmpp_ssize_t nName, cmpp_b * os, + bool enforceUndefPolicy); + + +/** + Helper for #query and friends. + + It expects that q has just been stepped. For each column in the + row, it sets a define named after the column. If q has row data + then the values come from there. If q has no row then: if + defineIfNoRow is true then it defines each column name to an empty + value else it defines nothing. +*/ +int cmpp__define_from_row(cmpp * const pp, sqlite3_stmt * const q, + bool defineIfNoRow); + +/** Start a new savepoint for dx. */ +int cmpp__dx_sp_begin(cmpp_dx * const dx); +/** Commit and close dx's current savepoint. */ +int cmpp__dx_sp_commit(cmpp_dx * const dx); +/** Roll back and close dx's current savepoint. */ +int cmpp__dx_sp_rollback(cmpp_dx * const dx); + +/** + Appends dx's file/line information to sstr. It returns void + because that's how sqlite3_str_appendf() and friends work. +*/ +void cmpp__dx_append_script_info(cmpp_dx const * dx, + sqlite3_str * sstr); + +/** + If zName matches one of the delayed-load directives, that directive + is registered and 0 is returned. CMPP_RC_NO_DIRECTIVE is returned if + no match is found, but pp's error state is not updated in that + case. If a match is found and registration fails, that result code + will propagate via pp. +*/ +int cmpp__d_delayed_load(cmpp *pp, char const *zName); + +void cmpp__dump_defines(cmpp *pp, cmpp_FILE * fp, int bIndent); + +/** + Like cmpp_tt_cstr(), but if bSymbolName is false then it returns + the higher-level token name, which is NULL for most token types. 
+*/ +char const * cmpp__tt_cstr(int tt, bool bSymbolName); + +/** + Expects **zPos to be one of ('(' '{' '[' '"' '\'') and zEnd to be + the logical EOF for *zPos. + + This looks for a matching closing token, accounting for nesting. On + success, returns 0 and sets *zPos to the closing character. + + On error it updates pp's error state and returns that code. pp may + be NULL. + + If pNl is not NULL then *pNl is incremented for each '\n' character + seen while looking for the closing character. +*/ +int cmpp__find_closing2(cmpp *pp, + unsigned char const **zPos, + unsigned char const *zEnd, + cmpp_size_t *pNl); + +#define cmpp__find_closing(PP,Z0,Z1) \ + cmpp__find_closing2(PP, Z0, Z1, NULL) + +static inline cmpp_size_t cmpp__strlen(char const *z, cmpp_ssize_t n){ + return n<0 ? (cmpp_size_t)strlen(z) : (cmpp_size_t)n; +} +static inline cmpp_size_t cmpp__strlenu(unsigned char const *z, cmpp_ssize_t n){ + return n<0 ? (cmpp_size_t)strlen((char const *)z) : (cmpp_size_t)n; +} + +/** + If ppCode is not set and pol resolves to cmpp_atpol_OFF then this + updates ppCode with a message about the lack of support for + at-strings. If cmpp_atpol_CURRENT==pol then pp's current policy is + checked. Returns ppCode. +*/ +int cmpp__StringAtIsOk(cmpp * const pp, cmpp_atpol_e pol); + +/** + "define"s zKey to zVal, recording the value type as tType. +*/ +int cmpp__define2(cmpp *pp, + unsigned char const * zKey, + cmpp_ssize_t nKey, + unsigned char const *zVal, + cmpp_ssize_t nVal, + cmpp_tt tType); + +/** + Evals pArgs's arguments as an integer expression. On success, sets + *pResult to the value. + + Returns ppCode. +*/ +int cmpp__args_evalToInt(cmpp_dx * dx, cmpp_args const *pArgs, + int * pResult); + +/** Passes the contents of arg through to cmpp__args_evalToInt(). 
*/ +int cmpp__arg_evalSubToInt(cmpp_dx *dx, cmpp_arg const *arg, + int * pResult); + +/** + Evaluates arg as an integer/bool, placing the result in *pResult + and setting *pNext to the first argument to arg's right which this + routine did not consume. Returns non-0 on error. +*/ +int cmpp__arg_toBool(cmpp_dx * dx, cmpp_arg const *arg, + int * pResult, cmpp_arg const **pNext); + +/** + If thisTtype==cmpp_TT_AnyType or thisTtype==arg->ttype and arg->z + looks like it might contain an at-string then os is re-used to hold + the @token@-expanded version of arg's content. os is unconditionally + passed to cmpp_b_reuse() before it begins work. + + It uses the given atPolicy to determine whether or not the content + is expanded, as per cmpp_dx_out_expand(). + + Returns 0 on success. If it expands content then *pExp is set to + os->z, else *pExp is set to arg->z. If nExp is not NULL then *nExp + gets set to the length of *pExp (either os->n or arg->n). + + Returns ppCode. + + Much later: what does this give us that cmpp_arg_to_b() + doesn't? Oh - that one calls into this one. i.e. this one is + lower-level. 
+*/ +int cmpp__arg_expand_ats(cmpp_dx const * const dx, + cmpp_b * os, + cmpp_atpol_e atPolicy, + cmpp_arg const * const arg, + cmpp_tt thisTtype, + unsigned char const **pExp, + cmpp_size_t * nExp); + +typedef struct cmpp_argOp cmpp_argOp; +typedef void (*cmpp_argOp_f)(cmpp_dx *dx, + cmpp_argOp const *op, + cmpp_arg const *vLhs, + cmpp_arg const **pvRhs, + int *pResult); +struct cmpp_argOp { + int ttype; + /* 1 or 2 */ + unsigned char arity; + /* 0=none/left, 1=right (unary ops only) */ + signed char assoc; + cmpp_argOp_f xCall; +}; + +cmpp_argOp const * cmpp_argOp_for_tt(cmpp_tt tt); + + +bool cmpp__is_int(unsigned char const *z, unsigned n, + int *pOut); +bool cmpp__is_int64(unsigned char const *z, unsigned n, int64_t *pOut); + +char const * cmpp__atpol_name(cmpp *pp, cmpp_atpol_e p); +char const * cmpp__unpol_name(cmpp *pp, cmpp_unpol_e p); + +/** + Unceremoniously bitwise-replaces pp's output channel with oNew. It + does _not_ clean up the previous channel, on the assumption that + the caller is taking any necessary measures. + + Apropos necessary measures for cleanup: if oPrev is not NULL, + *oPrev is set to a bitwise copy of the previous channel. + + Intended usage: + + ``` + cmpp_outputer oMine = cmpp_outputer_b; + cmpp_b bMine = cmpp_b_empty; + cmpp_outputer oOld = {0}; + oMine.state = &bMine; + cmpp__outputer_swap(pp, &oMine, &oOld); + ...do some work then ALWAYS do... + cmpp__outputer_swap(pp, &oOld, &oMine); + ``` + + Because this involves bitwise copying, care must be taken with + stream state, e.g. bMine.z (above) can be reallocated, so we have + to be sure to swap it back before using bMine again. +*/ +void cmpp__outputer_swap(cmpp *pp, cmpp_outputer const *oNew, + cmpp_outputer *oPrev); + +/** + Init code which is usually run as part of the ctor but may have to + be run later, after cmpp_reset(). We can't run it from cmpp_reset() + because that could leave post-reset in an error state, which is + icky. This call is a no-op after the first. 
+*/ +int cmpp__lazy_init(cmpp *pp); + +CMPP_NORETURN void cmpp__fatalv_base(char const *zFile, int line, + char const *zFmt, va_list); +#define cmpp__fatalv(...) cmpp__fatalv_base(__FILE__,__LINE__,__VA_ARGS__) +CMPP_NORETURN void cmpp__fatal_base(char const *zFile, int line, + char const *zFmt, ...); +#define cmpp__fatal(...) cmpp__fatal_base(__FILE__,__LINE__,__VA_ARGS__) + +/** + Outputs a printf()-formatted message to stderr. +*/ +void g_stderr(char const *zFmt, ...); +#define g_warn(zFmt,...) g_stderr("%s:%d %s() " zFmt "\n", __FILE__, __LINE__, __func__, __VA_ARGS__) +#define g_warn0(zMsg) g_stderr("%s:%d %s() %s\n", __FILE__, __LINE__, __func__, zMsg) +#if 0 +#define g_debug(PP,lvl,pfexpr) (void)0 +#else +#define g_debug(PP,lvl,pfexpr) \ + if(lvl<=(PP)->pimpl->flags.doDebug) { \ + if( (PP)->pimpl->dx ){ \ + g_stderr("%s:%" CMPP_SIZE_T_PFMT ": ", \ + (PP)->pimpl->dx->sourceName, \ + (PP)->pimpl->dx->pimpl->dline.lineNo); \ + } \ + g_stderr("%s():%d: ", \ + __func__,__LINE__); \ + g_stderr pfexpr; \ + } (void)0 +#endif + +/** Returns true if zFile is readable, else false. */ +bool cmpp__file_is_readable(char const *zFile); + +#define ustr_c(X) ((unsigned char const *)X) +#define ustr_nc(X) ((unsigned char *)X) +#define ppCode pp->pimpl->err.code +#define dxppCode dx->ppCode +#define cmpp__pi(PP) cmpp_pimpl * const pi = PP->pimpl +#define cmpp__dx_pi(DX) cmpp_dx_pimpl * const dpi = DX->pimpl +#define serr(...) cmpp_err_set(pp, CMPP_RC_SYNTAX, __VA_ARGS__) +#define dxserr(...) cmpp_err_set(dx->pp, CMPP_RC_SYNTAX, __VA_ARGS__) +#define cmpp__li_reserve1_size(li,nInitial) \ + (li->n ? (li->n==li->nAlloc ? li->nAlloc * 2 : li->n+1) : nInitial) + +#define MARKER(pfexp) \ + do{ printf("MARKER: %s:%d:%s():\t",__FILE__,__LINE__,__func__); \ + printf pfexp; \ + } while(0) + +#endif /* include guard */ +/* +** 2022-11-12: +** +** The author disclaims copyright to this source code. In place of +** a legal notice, here is a blessing: +** +** * May you do good and not evil. 
+** * May you find forgiveness for yourself and forgive others. +** * May you share freely, never taking more than you give. +** +************************************************************************ +** This file houses the core of libcmpp (it used to house all of it, +** but the library grew beyond the confines of a single file). +** +** See the accompanying README.md and/or c-pp.h for more +** details. +*/ +#include "sqlite3.h" + +char const * cmpp_version(void){ return CMPP_VERSION; } + +const cmpp__delim cmpp__delim_empty = cmpp__delim_empty_m; +const cmpp__delim_list cmpp__delim_list_empty = cmpp__delim_list_empty_m; +const cmpp_outputer cmpp_outputer_empty = cmpp_outputer_empty_m; +const cmpp_outputer cmpp_outputer_FILE = { + .out = cmpp_output_f_FILE, + .flush = cmpp_flush_f_FILE, + .cleanup = cmpp_outputer_cleanup_f_FILE, + .state = NULL +}; +const cmpp_b_list cmpp_b_list_empty = + cmpp_b_list_empty_m; +const cmpp_outputer cmpp_outputer_b = { + .out = cmpp_output_f_b, + .flush = 0, + .cleanup = cmpp_outputer_cleanup_f_b, + .state = NULL +}; + +/** + Default delimiters for @tokens@. +*/ +static const cmpp__delim delimAtDefault = { + .open = { .z = ustr_c("@"), .n = 1 }, + .close = { .z = ustr_c("@"), .n = 1 }, + .zOwns = NULL +}; + +static const cmpp_api_thunk cmppApiMethods = { +#define A(V) +#define V(N,T,V) .N = V, +#define F(N,T,P) .N = cmpp_ ##N, +#define O(N,T) .N = &cmpp_ ##N, + cmpp_api_thunk_map(A,V,F,O) +#undef F +#undef O +#undef V +#undef A +}; + +/* Fatally exits the app with the given printf-style message. 
*/ + +CMPP__EXPORT(bool, cmpp_isspace)(int ch){ + return ' '==ch || '\t'==ch; +} + +//CMPP__EXPORT(int, cmpp_isnl)(char const * z, char const *zEnd){} +static inline int cmpp_isnl(unsigned char const * z, unsigned char const *zEnd){ + //assert(z= zBegin ); + unsigned char const * z = *p; + while( z>zBegin && cmpp_isspace(z[-1]) ) --z; + *p = z; +} + +CMPP__EXPORT(void, cmpp_skip_snl_trailing)( unsigned char const *zBegin, + unsigned char const **p ){ + assert( *p >= zBegin ); + unsigned char const * z = *p; + /* FIXME: CRNL. */ + while( z>zBegin && cmpp_issnl(*z) ) --z; + *p = z; +} + +/* Set pp's error state. */ +static int cmpp__errv(cmpp *pp, int rc, char const *zFmt, va_list); +/** + Sets pp's error state. +*/ +CMPP__EXPORT(int, cmpp_err_set)(cmpp *pp, int rc, char const *zFmt, ...); +#define cmpp__err cmpp_err_set +#define cmpp_dx_err cmpp_dx_err_set + +/* Open/close pp's output channel. */ +static int cmpp__out_fopen(cmpp *pp, const char *zName); +static void cmpp__out_close(cmpp *pp); + +#define CmppKvp_empty_m \ + {CmppSnippet_empty_m,CmppSnippet_empty_m,CmppKvp_op_none} +const CmppKvp CmppKvp_empty = CmppKvp_empty_m; + +/* Wrapper around a cmpp_FILE handle. Legacy stuff from when we just + supported cmpp_FILE input. */ +typedef struct FileWrapper FileWrapper; +struct FileWrapper { + /* File's name. */ + char const *zName; + /* cmpp_FILE handle. */ + cmpp_FILE * pFile; + /* Where FileWrapper_slurp() stores the file's contents. */ + unsigned char * zContent; + /* Size of this->zContent, as set by FileWrapper_slurp(). */ + cmpp_size_t nContent; +}; +#define FileWrapper_empty_m {0,0,0,0} +static const FileWrapper FileWrapper_empty = FileWrapper_empty_m; + +/** + Proxy for cmpp_fclose() and frees all memory owned by p. It is not + an error if p is already closed. +*/ +static void FileWrapper_close(FileWrapper * p); + +/** Proxy for cmpp_fopen(). Closes p first if it's currently opened. 
*/ +static int FileWrapper_open(FileWrapper * p, const char * zName, const char *zMode); + +/* Populates p->zContent and p->nContent from p->pFile. */ +//static int FileWrapper_slurp(FileWrapper * p, int bCloseFile ); + +/** + If p->zContent ends in \n or \r\n, that part is replaced with 0 and + p->nContent is adjusted. Returns true if it chomps, else false. +*/ +static bool FileWrapper_chomp(FileWrapper * p); + +/* +** Outputs a printf()-formatted message to stderr. +*/ +static void g_stderrv(char const *zFmt, va_list); + +CMPP__EXPORT(char const *, cmpp_rc_cstr)(int rc){ + switch((cmpp_rc_e)rc){ +#define E(N,V,H) case N: return # N; + cmpp_rc_e_map(E) +#undef E + } + return NULL; +} + +CMPP__EXPORT(void, cmpp_mfree)(void *p){ + /* This MUST be a proxy for sqlite3_free() because allocate memory + exclusively using sqlite3_malloc() and friends. */ + sqlite3_free(p); +} + +CMPP__EXPORT(void *, cmpp_mrealloc)(void * p, size_t n){ + return sqlite3_realloc64(p, n); +} + +CMPP__EXPORT(void *, cmpp_malloc)(size_t n){ +#if 1 + return sqlite3_malloc64(n); +#else + void * p = sqlite3_malloc64(n); + if( p ) memset(p, 0, n); + return p; +#endif +} + +cmpp_FILE * cmpp_fopen(const char *zName, const char *zMode){ + cmpp_FILE *f; + if(zName && ('-'==*zName && !zName[1])){ + f = (strchr(zMode, 'w') || strchr(zMode,'+')) + ? 
stdout + : stdin + ; + }else{ + f = fopen(zName, zMode); + } + return f; +} + +void cmpp_fclose( cmpp_FILE * f ){ + if(f && (stdin!=f) && (stdout!=f) && (stderr!=f)){ + fclose(f); + } +} + +int cmpp_slurp(cmpp_input_f fIn, void *sIn, + unsigned char **pOut, cmpp_size_t * nOut){ + unsigned char zBuf[1024 * 16]; + unsigned char * pDest = 0; + unsigned nAlloc = 0; + unsigned nOff = 0; + int rc = 0; + cmpp_size_t nr = 0; + while( 0==rc ){ + nr = sizeof(zBuf); + if( (rc = fIn(sIn, zBuf, &nr)) ){ + break; + } + if(nr>0){ + if(nAlloc < nOff + nr + 1){ + nAlloc = nOff + nr + 1; + pDest = cmpp_mrealloc(pDest, nAlloc); + } + memcpy(pDest + nOff, zBuf, nr); + nOff += nr; + }else{ + break; + } + } + if( 0==rc ){ + if(pDest) pDest[nOff] = 0; + *pOut = pDest; + *nOut = nOff; + }else{ + cmpp_mfree(pDest); + } + return rc; +} + +void FileWrapper_close(FileWrapper * p){ + if(p->pFile) cmpp_fclose(p->pFile); + if(p->zContent) cmpp_mfree(p->zContent); + *p = FileWrapper_empty; +} + +int FileWrapper_open(FileWrapper * p, const char * zName, + const char * zMode){ + FileWrapper_close(p); + if( (p->pFile = cmpp_fopen(zName, zMode)) ){ + p->zName = zName; + return 0; + }else{ + return cmpp_errno_rc(errno, CMPP_RC_IO); + } +} + +int FileWrapper_slurp(FileWrapper * p, int bCloseFile){ + assert(!p->zContent); + assert(p->pFile); + int const rc = cmpp_slurp(cmpp_input_f_FILE, p->pFile, + &p->zContent, &p->nContent); + if( bCloseFile ){ + cmpp_fclose(p->pFile); + p->pFile = 0; + } + return rc; +} + +CMPP__EXPORT(bool, cmpp_chomp)(unsigned char * z, cmpp_size_t * n){ + if( *n && '\n'==z[*n-1] ){ + z[--*n] = 0; + if( *n && '\r'==z[*n-1] ){ + z[--*n] = 0; + } + return true; + } + return false; +} + +bool FileWrapper_chomp(FileWrapper * p){ + return cmpp_chomp(p->zContent, &p->nContent); +} + + +#if 0 +/** + Returns the number newline characters between the given starting + point and inclusive ending point. Results are undefined if zFrom is + greater than zTo. 
+*/ +static unsigned cmpp__count_lines(unsigned char const * zFrom, + unsigned char const *zTo); + +unsigned cmpp__count_lines(unsigned char const * zFrom, + unsigned char const *zTo){ + unsigned ln = 0; + assert(zFrom && zTo); + assert(zFrom <= zTo); + for(; zFrom < zTo; ++zFrom){ + if((unsigned char)'\n' == *zFrom) ++ln; + } + return ln; +} +#endif + +char const * cmpp__tt_cstr(int tt, bool bSymbolName){ + switch(tt){ +#define E(N,TOK) case cmpp_TT_ ## N: \ + return bSymbolName ? "cmpp_TT_" #N : TOK; + cmpp_tt_map(E) +#undef E + } + return NULL; +} + +char const * cmpp_tt_cstr(int tt){ + return cmpp__tt_cstr(tt, true); +} + +/** Flags and constants related to CmppLvl. */ +enum CmppLvl_e { + /** + Flag indicating that all ostensible output for a CmppLvl should + be elided. This also suppresses non-flow-control directives from + being processed. + */ + CmppLvl_F_ELIDE = 0x01, + /** + Mask of CmppLvl::flags which are inherited when + CmppLvl_push() is used. + */ + CmppLvl_F_INHERIT_MASK = CmppLvl_F_ELIDE +}; + +//static const CmppDLine CmppDLine_empty = CmppDLine_empty_m; + +/** Frees all memory owned by li but does not free li. */ +static void CmppLvlList_cleanup(CmppLvlList *li); + +/** + Allocates a list entry, owned by li, and returns it (cleanly zeroed + out). Returns NULL and updates pp->err on error. It is expected + that the caller will populate the entry's zName using + sqlite3_mprintf() or equivalent. +*/ +static CmppLvl * CmppLvlList_push(cmpp *pp, CmppLvlList *li); + +/** Returns the most-recently-appended element of li back to li's + free-list. It expects to receive that value as a sanity-checking + measure and may fail fatally if that's not upheld. 
*/ +static void CmppLvlList_pop(cmpp *pp, CmppLvlList *li, CmppLvl * lvl); + +static const cmpp_dx_pimpl cmpp_dx_pimpl_empty = + cmpp_dx_pimpl_empty_m; + +#define cmpp_dx_empty_m { \ + .pp=0, \ + .d=0, \ + .sourceName=0, \ + .args={ \ + .z=0, .nz=0, \ + .argc=0, .arg0=0 \ + }, \ + .pimpl = 0 \ +} + +const cmpp_dx cmpp_dx_empty = cmpp_dx_empty_m; +#define cmpp_d_empty_m {{0,0},0,0,cmpp_d_impl_empty_m} +//static const cmpp_d cmpp_d_empty = cmpp_d_empty_m; + +static const CmppDList_entry CmppDList_entry_empty = + CmppDList_entry_empty_m; + +/** Free all memory owned by li but does not free li. */ +static void CmppDList_cleanup(CmppDList *li); +/** + Allocate a list entry, owned by li, and return it (cleanly zeroed + out). Returns NULL and updates pp->err on error. It is expected + that the caller will populate the entry's zName using + sqlite3_mprintf() or equivalent. +*/ +static CmppDList_entry * CmppDList_append(cmpp *pp, CmppDList *li); +/** Returns the most-recently-appended element of li back to li's + free-list. */ +static void CmppDList_unappend(CmppDList *li); +/** Resets li's list for re-use but does not free it. Returns li. */ +//static CmppDList * CmppDList_reuse(CmppDList *li); +static CmppDList_entry * CmppDList_search(CmppDList const * li, + char const *zName); + +/** Reset dx and free any memory it may own. */ +static void cmpp_dx_cleanup(cmpp_dx * const dx); +/** + Reset some of dx's parsing-related state in prep for fetching the + next line. +*/ +static void cmpp_dx__reset(cmpp_dx * const dx); + +/* Returns dx's current directive. 
*/ +static inline cmpp_d const * cmpp_dx_d(cmpp_dx const * const dx){ + return dx->d; +} + +static const cmpp_pimpl cmpp_pimpl_empty = { + .db = { + .dbh = 0, + .zName = 0 + }, + .dx = 0, + .out = cmpp_outputer_empty_m, + .delim = { + .d = cmpp__delim_list_empty_m, + .at = cmpp__delim_list_empty_m + }, + .stmt = { +#define E(N,S) .N = 0, + CmppStmt_map(E) +#undef E + }, + .err = { + .code = 0, + .zMsg = 0, + .zMsgC = 0 + }, + .sqlTrace = { + .expandSql = false, + .counter = 0, + .out = cmpp_outputer_empty_m + }, + .flags = { + .allocStamp = 0, + .nIncludeDir = 0, + .nDxDepth = 0, + .nSavepoint = 0, + .doDebug = 0, + .chompF = 0, + .newFlags = 0, + .isInternalDirectiveReg = false, + .nextIsCall = false, + .needsLazyInit = true + }, + .policy = { + .at = {0,0,0}, + .un = {0,0,0} + }, + .d = { + .list = CmppDList_empty_m, + .autoload = cmpp_d_autoloader_empty_m + }, + .mod = { + .sohList = CmppSohList_empty_m, + .path = cmpp_b_empty_m, + .soExt = CMPP_PLATFORM_EXT_DLL, + /* Yes, '*'. It makes sense in context. */ + .pathSep = '*' + // 0x1e /* "record separator" */ doesn't work. Must be non-ctrl. + }, + .recycler = { + .buf = cmpp_b_list_empty_m, + .bufSort = cmpp_b_list_UNSORTED, + .argPimpl = NULL + } +}; + +#if 0 +static inline int cmpp__out(cmpp *pp, void const *z, cmpp_size_t n){ + return cmpp__out2(pp, &pp->out, z, n); +} +#endif + +/** + Returns an approximate cmpp_tt for the given SQLITE_... value from + sqlite3_column_type() or sqlite3_value_type(). +*/ +static cmpp_tt cmpp__tt_for_sqlite(int sqType); + +/** + Init code which is usually run as part of the ctor but may have to + be run later, after cmpp_reset(). We can't run it from cmpp_reset() + because that could leave post-reset in an error state, which is + icky. 
+*/ +int cmpp__lazy_init(cmpp *pp){ + if( !ppCode && pp->pimpl->flags.needsLazyInit ){ + pp->pimpl->flags.needsLazyInit = false; + cmpp__delim_list * li = &pp->pimpl->delim.d; + if( !li->n ) cmpp_delimiter_push(pp, NULL); + li = &pp->pimpl->delim.at; + if( !li->n ) cmpp_atdelim_push(pp, NULL, NULL); +#if defined(CMPP_CTOR_INSTANCE_INIT) + if( !ppCode ){ + extern int CMPP_CTOR_INSTANCE_INIT(cmpp*); + int const rc = CMPP_CTOR_INSTANCE_INIT(pp); + if( rc && !ppCode ){ + cmpp__err(pp, rc, + "Initialization via CMPP_CTOR_INSTANCE_INIT() failed " + "with code %d/%s.", rc, cmpp_rc_cstr(rc) ); + } + } +#endif + } + return ppCode; +} + +static void cmpp__wipe_policies(cmpp *pp){ + if( 0==ppCode ){ + PodList__atpol_reserve(pp, &cmpp__epol(pp,at), 0); + PodList__unpol_reserve(pp, &cmpp__epol(pp,un), 0); + if( 0==ppCode ){ + PodList__atpol_wipe(&cmpp__epol(pp,at), cmpp_atpol_DEFAULT); + PodList__unpol_wipe(&cmpp__epol(pp,un), cmpp_unpol_DEFAULT); + } + } +} + +CMPP__EXPORT(int, cmpp_ctor)(cmpp **pOut, cmpp_ctor_cfg const * cfg){ + cmpp_pimpl * pi = 0; + cmpp * pp = 0; + void * const mv = cmpp_malloc(sizeof(cmpp) + sizeof(*pi)); + if( mv ){ + if( !cfg ){ + static const cmpp_ctor_cfg dfltCfg = {0}; + cfg = &dfltCfg; + } + cmpp const x = { + .api = &cmppApiMethods, + .pimpl = (cmpp_pimpl*)((unsigned char *)mv + sizeof(cmpp)) + /* ^^^ (T const * const) members */ + }; + memcpy(mv, &x, sizeof(x)) + /* FWIW, i'm convinced that this is a legal way to transfer + these const-pointers-to-const. If not, we'll need to change + those cmpp members from (T const * const) to (T const *). 
*/; + pp = mv; + assert(pp->api == &cmppApiMethods); + assert(pp->pimpl); + pi = pp->pimpl; + *pOut = pp; + *pi = cmpp_pimpl_empty; + assert( pi->flags.needsLazyInit ); + pi->flags.newFlags = cfg->flags; + pi->flags.allocStamp = &cmpp_pimpl_empty; + if( cfg->dbFile ){ + pi->db.zName = sqlite3_mprintf("%s", cfg->dbFile); + cmpp_check_oom(pp, pi->db.zName); + } + if( 0==ppCode ){ + cmpp__wipe_policies(pp); + cmpp__lazy_init(pp); + } + } + return pp ? ppCode : CMPP_RC_OOM; +} + +CMPP__EXPORT(void, cmpp_reset)(cmpp *pp){ + cmpp__pi(pp); + cmpp_outputer_cleanup(&pi->sqlTrace.out); + pi->sqlTrace.out = cmpp_outputer_empty; + if( pi->d.autoload.dtor ){ + pi->d.autoload.dtor(pi->d.autoload.state); + } + pi->d.autoload = cmpp_pimpl_empty.d.autoload; + cmpp_b_clear(&pi->mod.path); + if( pi->stmt.spRelease && pi->stmt.spRollback ){ + /* Cleanly kill all savepoint levels. This is truly superfluous, + as they'll all be rolled back (if the db is persistent) or + nuked (if using a :memory: db) momentarily. However, we'll + eventually need this for a partial-clear operation which leaves + the db and custom directives intact. For now it lives here but + will eventually move to wherever that ends up being. + + 2025-11-16: or not. It's fine here, really. + */ + sqlite3_reset(pi->stmt.spRelease); + while( SQLITE_DONE==sqlite3_step(pi->stmt.spRelease) ){ + sqlite3_reset(pi->stmt.spRollback); + sqlite3_step(pi->stmt.spRollback); + sqlite3_reset(pi->stmt.spRelease); + } + } + cmpp__out_close(pp); + CmppDList_cleanup(&pi->d.list); +#define E(N,S) \ + if(pi->stmt.N) {sqlite3_finalize(pi->stmt.N); pi->stmt.N = 0;} + CmppStmt_map(E) (void)0; +#undef E + if( pi->db.dbh ){ + if( SQLITE_TXN_WRITE==sqlite3_txn_state(pi->db.dbh, NULL) ){ + sqlite3_exec(pi->db.dbh, "COMMIT;", 0, 0, NULL) + /* ignoring error */; + } + sqlite3_close(pi->db.dbh); + pi->db.dbh = 0; + } + cmpp__delim_list_reuse(&pi->delim.d); + cmpp__delim_list_reuse(&pi->delim.at); + //why? 
cmpp_b_list_reuse(&pi->cache.buf); + cmpp__err_clear(pp); + {/* Zero out pi but save some pieces for later, when pp is + cmpp_dtor()'d */ + cmpp_pimpl const tmp = *pi; + *pi = cmpp_pimpl_empty; + pi->db = tmp.db /* restore db.zName */; + pi->recycler = tmp.recycler; + pi->policy = tmp.policy; + pi->delim = tmp.delim; + pi->mod.sohList = tmp.mod.sohList; + cmpp__wipe_policies(pp); + pi->flags.allocStamp = tmp.flags.allocStamp; + pi->flags.newFlags = tmp.flags.newFlags; + pi->flags.needsLazyInit = true; + } +} + +static void cmpp__delim_list_cleanup(cmpp__delim_list *li); + +CMPP__EXPORT(void, cmpp_dtor)(cmpp *pp){ + if( pp ){ + cmpp__pi(pp); + cmpp_reset(pp); + cmpp_mfree(pi->db.zName); + PodList__atpol_finalize(&cmpp__epol(pp,at)); + assert(!cmpp__epol(pp,at).na); + PodList__unpol_finalize(&cmpp__epol(pp,un)); + assert(!cmpp__epol(pp,un).na); + cmpp_b_list_cleanup(&pi->recycler.buf); + cmpp__delim_list_cleanup(&pi->delim.d); + cmpp__delim_list_cleanup(&pi->delim.at); + for( cmpp_args_pimpl * apNext = 0, + * ap = pi->recycler.argPimpl; + ap; ap = apNext ){ + apNext = ap->nextFree; + ap->nextFree = 0; + cmpp_args_pimpl_cleanup(ap); + cmpp_mfree(ap); + } + CmppSohList_close(&pi->mod.sohList); + if( &cmpp_pimpl_empty==pi->flags.allocStamp ){ + pi->flags.allocStamp = 0; + cmpp_mfree(pp); + } + } +} + +CMPP__EXPORT(bool, cmpp_is_safemode)(cmpp const * pp){ + return pp ? 0!=(cmpp_ctor_F_SAFEMODE & pp->pimpl->flags.newFlags) : false; +} + +/** Sets ppCode if m is NULL. Returns ppCode. If pp is NULL, returns CMPP_RC_OOM if m is NULL, else 0. */ +CMPP__EXPORT(int, cmpp_check_oom)(cmpp * const pp, void const * const m ){ + int rc; + if( pp ){ + if( !m ){ + //assert(!"oom"); + cmpp__err(pp, CMPP_RC_OOM, 0); + } + rc = ppCode; + }else{ + rc = m ? 0 : CMPP_RC_OOM; + } + return rc; +} + +//CxMPP_WASM_EXPORT +void *cmpp__malloc(cmpp *pp, cmpp_size_t n){ + void *p = 0; + if( 0==ppCode ){ + p = cmpp_malloc(n); + cmpp_check_oom(pp, p); + } + return p; +} + +/** + If ppCode is 0 then it flushes pp's output channel. 
If that + fails, it sets ppCode. Returns ppCode. +*/ +static int cmpp__flush(cmpp *pp){ + if( !ppCode && pp->pimpl->out.flush ){ + int const rc = pp->pimpl->out.flush(pp->pimpl->out.state); + if( rc && !ppCode ){ + cmpp_err_set(pp, rc, "Flush failed."); + } + } + return ppCode; +} + +void cmpp__out_close(cmpp *pp){ + cmpp__flush(pp)/*ignoring result*/; + cmpp_outputer_cleanup(&pp->pimpl->out); + pp->pimpl->out = cmpp_pimpl_empty.out; +} + +int cmpp__out_fopen(cmpp *pp, const char *zName){ + cmpp__out_close(pp); + if( !ppCode ){ + cmpp_FILE * const f = cmpp_fopen(zName, "wb"); + if( f ){ + ppCode = 0; + pp->pimpl->out = cmpp_outputer_FILE; + pp->pimpl->out.state = f; + pp->pimpl->out.name = zName; + }else{ + ppCode = cmpp__err( + pp, cmpp_errno_rc(errno, CMPP_RC_IO), + "Error opening file %s", zName + ); + } + } + return ppCode; +} + +static int cmpp__FileWrapper_open(cmpp *pp, FileWrapper * fw, + const char * zName, + const char * zMode){ + int const rc = FileWrapper_open(fw, zName, zMode); + if( rc ){ + cmpp__err(pp, rc, "Error %s opening file [%s] " + "with mode [%s]", + cmpp_rc_cstr(rc), zName, zMode); + } + return ppCode; +} + +static int cmpp__FileWrapper_slurp(cmpp* pp, FileWrapper * fw){ + assert( fw->pFile ); + int const rc = FileWrapper_slurp(fw, 1); + if( rc ){ + cmpp__err(pp, rc, "Error %s slurping file %s", + cmpp_rc_cstr(rc), fw->zName); + } + return ppCode; +} + +void g_stderrv(char const *zFmt, va_list va){ + vfprintf(0 ? 
stdout : stderr, zFmt, va); +} + +void g_stderr(char const *zFmt, ...){ + va_list va; + va_start(va, zFmt); + g_stderrv(zFmt, va); + va_end(va); +} + +CMPP__EXPORT(char const *, cmpp_dx_delim)(cmpp_dx const *dx){ + return (char const *)cmpp__dx_zdelim(dx); +} + +int cmpp__out2(cmpp *pp, cmpp_outputer *pOut, + void const *z, cmpp_size_t n){ + assert( pOut ); + if( !ppCode && pOut->out && n ){ + int const rc = pOut->out(pOut->state, z, n); + if( rc ){ + cmpp__err(pp, rc, + "Write of %" CMPP_SIZE_T_PFMT + " bytes to output stream failed.", n); + } + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_dx_out_raw)(cmpp_dx * dx, void const *z, cmpp_size_t n){ + if( dxppCode || cmpp_dx_is_eliding(dx) ) return dxppCode; + return cmpp__out2(dx->pp, &dx->pp->pimpl->out, z, n); +} + +CMPP__EXPORT(int, cmpp_outfv2)(cmpp *pp, cmpp_outputer *out, char const *zFmt, va_list va){ + assert( out ); + if( !ppCode && zFmt && *zFmt && out->out ){ + char * s = sqlite3_vmprintf(zFmt, va); + if( 0==cmpp_check_oom(pp, s) ){ + cmpp__out2(pp, out, s, cmpp__strlen(s, -1)); + } + cmpp_mfree(s); + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_outf2)(cmpp *pp, cmpp_outputer *out, char const *zFmt, ...){ + assert( out ); + if( !ppCode && zFmt && *zFmt && out->out ){ + va_list va; + va_start(va, zFmt); + cmpp_outfv2(pp, out, zFmt, va); + va_end(va); + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_outfv)(cmpp *pp, char const *zFmt, va_list va){ + return cmpp_outfv2(pp, &pp->pimpl->out, zFmt, va); +} + +CMPP__EXPORT(int, cmpp_outf)(cmpp *pp, char const *zFmt, ...){ + if( !ppCode ){ + va_list va; + va_start(va, zFmt); + cmpp_outfv2(pp, &pp->pimpl->out, zFmt, va); + va_end(va); + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_dx_outf)(cmpp_dx *dx, char const *zFmt, ...){ + if( !dxppCode && zFmt && *zFmt && dx->pp->pimpl->out.out ){ + va_list va; + va_start(va, zFmt); + cmpp_outfv(dx->pp, zFmt, va); + va_end(va); + } + return dxppCode; +} + +static int cmpp__affirm_undef_policy(cmpp *pp, + unsigned 
char const *zName, + cmpp_size_t nName){ + if( 0==ppCode + && cmpp_unpol_ERROR==cmpp__policy(pp,un) ){ + cmpp__err(pp, CMPP_RC_NOT_DEFINED, + "Key '%.*s' was not found and the undefined-value " + "policy is 'error'.", + (int)nName, zName); + } + return ppCode; +} + +static int cmpp__out_expand(cmpp * pp, cmpp_outputer * pOut, + unsigned char const * zFrom, + cmpp_size_t n, cmpp_atpol_e atPolicy){ + enum state_e { + /* looking for @token@ opening @ */ + state_opening, + /* looking for @token@ closing @ */ + state_closing + }; + cmpp__pi(pp); + if( ppCode ) return ppCode; + if( cmpp_atpol_CURRENT==atPolicy ) atPolicy = cmpp__policy(pp,at); + assert( cmpp_atpol_invalid!=atPolicy ); + unsigned char const *zLeft = zFrom; + unsigned char const * const zEnd = zFrom + n; + unsigned char const *z = + (cmpp_atpol_OFF==atPolicy || cmpp_atpol_invalid==atPolicy) + ? zEnd + : zLeft; + unsigned char const chEol = (unsigned char)'\n'; + cmpp__delim const * delim = + cmpp__delim_list_get(&pp->pimpl->delim.at); + if( !delim && zout; + } + assert( pi->dx ? !cmpp_dx_is_eliding(pi->dx) : 1 ); + +#define tflush \ + if(z>zEnd) z=zEnd; \ + if(zLeftpimpl->dx ? pp->pimpl->dx->pimpl : NULL; + for( ; zflags.countLines ){ + ++dxp->lineNo; + } +#endif + state = state_opening; + continue; + } + if( state_opening==state ){ + if( z + delim->open.n < zEnd + && 0==memcmp(z, delim->open.z, delim->open.n) ){ + tflush; + z += delim->open.n; + if( 0 ) g_warn("zLeft..z=[%.*s]", (int)(z-zLeft), zLeft); + if( 0 ){ + g_warn("\nzLeft..=[%s]\nz=[%s]", zLeft, z); + } + state = state_closing; +#if 1 + /* Handle call of @[directive ...args]@ + + i'm not a huge fan of this syntax, but that may go away + if we replace the single-char separator with a pair of + opening/closing delimiters. 
+ */ + if( z>", (int)(zEnd-zb), zb); + if( cmpp__find_closing2(pp, &zb, zEnd, &nl) ){ + break; + } + //g_warn("Found: <<%.*s>>", (int)(zb+1-z), z); + if( zb + delim->close.n >= zEnd + || 0!=memcmp(zb+1, delim->close.z, delim->close.n) ){ + serr("Expecting '%s' after closing ']'.", delim->close.z); + break; + } + if( nl && dxp && dxp->flags.countLines ){ + dxp->pos.lineNo +=nl; + } + cmpp_call_str(pp, z+delim->open.n, + (zb - z - delim->open.n), + cmpp_b_reuse(bCall), 0); + if( 0==ppCode ){ + cmpp__out2(pp, pOut, bCall->z, bCall->n); + state = state_opening; + zLeft = z = zb + delim->close.n + 1; + //g_warn("post-@[]@ z=%.*s", (int)(zEnd-z), z); + } + } +#endif + if( z>=zEnd ) break; + goto again /* avoid adjusting z again */; + } + }else{/*we're looking for delim->closer*/ + assert( state_closing==state ); + if( z + delim->close.n <= zEnd + && 0==memcmp(z, delim->close.z, delim->close.n ) ){ + /* process the ... part of @...@ */ + assert( state_closing==state ); + assert( zLeftopen.z, delim->open.n) ); + unsigned char const *zKey = + zLeft + delim->open.n; + cmpp_ssize_t const nKey = z - zLeft - delim->open.n; + if( 0 ) g_warn("nKey=%d zKey=[%.*s]", nKey, nKey, zKey); + assert( nKey>= 0 ); + if( !nKey ){ + serr("Empty key is not permitted in %s...%s.", + delim->open.z, delim->close.z); + break; + } + if( cmpp__get_b(pp, zKey, nKey, cmpp_b_reuse(bVal), true) ){ + if(0){ + g_warn("nVal=%d zVal=[%.*s]", (int)bVal->n, + (int)bVal->n, bVal->z); + } + if( bVal->n ){ + cmpp__out2(pp, pOut, bVal->z, bVal->n); + }else{ + /* Elide it */ + } + zLeft = z + delim->close.n; + assert( zLeft<=zEnd ); + }else if( !ppCode ){ + assert( !bVal->n ); + /* No matching define . 
*/ + switch( atPolicy ){ + case cmpp_atpol_ELIDE: zLeft = z + delim->close.n; break; + case cmpp_atpol_RETAIN: tflush; break; + case cmpp_atpol_ERROR: + cmpp__err(pp, CMPP_RC_NOT_DEFINED, + "Undefined %skey%s: %.*s", + delim->open.z, delim->close.z, nKey, zKey); + break; + case cmpp_atpol_invalid: + case cmpp_atpol_CURRENT: + case cmpp_atpol_OFF: + assert(!"this shouldn't be reachable" ); + cmpp__err(pp, CMPP_RC_ERROR, "Unhandled atPolicy #%d", + atPolicy); + break; + } + }/* process @...@ */ + state = state_opening; + assert( z<=zEnd ); + }/*matched a closer*/ + }/*state_closer==state*/ + assert( z<=zEnd ); + }/*per-line loop*/ + }/*outer loop*/ +#if 0 + if( 0==ppCode && state_closer==state ){ + serr("Opening '%s' found without a closing '%s'.", + delim->open.z, delim->close.z); + } +#endif + tflush; +#undef tflush + cmpp_b_return(pp, bCall); + cmpp_b_return(pp, bVal); + return ppCode; +} + +CMPP__EXPORT(int, cmpp_dx_out_expand)(cmpp_dx const * const dx, + cmpp_outputer * pOut, + unsigned char const * zFrom, + cmpp_size_t n, + cmpp_atpol_e atPolicy){ + if( dxppCode || cmpp_dx_is_eliding(dx) ) return dxppCode; + return cmpp__out_expand(dx->pp, pOut, zFrom, n, atPolicy); +} + +CmppLvl * CmppLvl_get(cmpp_dx const *dx){ + return dx->pimpl->dxLvl.n + ? 
dx->pimpl->dxLvl.list[dx->pimpl->dxLvl.n-1] + : 0; +} + +static const CmppLvl CmppLvl_empty = CmppLvl_empty_m; + +CmppLvl * CmppLvl_push(cmpp_dx *dx){ + CmppLvl * p = 0; + if( !dxppCode ){ + CmppLvl * const pPrev = CmppLvl_get(dx); + p = CmppLvlList_push(dx->pp, &dx->pimpl->dxLvl); + if( p ){ + *p = CmppLvl_empty; + p->lineNo = dx->pimpl->dline.lineNo; + //p->d = dx->d; + if( pPrev ){ + p->flags = (CmppLvl_F_INHERIT_MASK & pPrev->flags); + //if(CLvl_isSkip(pPrev)) p->flags |= CmppLvl_F_ELIDE; + } + } + } + return p; +} + +void CmppLvl_pop(cmpp_dx *dx, CmppLvl * lvl){ + CmppLvlList_pop(dx->pp, &dx->pimpl->dxLvl, lvl); +} + +void CmppLvl_elide(CmppLvl *lvl, bool on){ + if( on ) lvl->flags |= CmppLvl_F_ELIDE; + else lvl->flags &= ~CmppLvl_F_ELIDE; +} + +bool CmppLvl_is_eliding(CmppLvl const *lvl){ + return lvl && !!(lvl->flags & CmppLvl_F_ELIDE); +} + +#if 0 +void cmpp_dx_elide_mode(cmpp_dx *dx, bool on){ + CmppLvl_elide(CmppLvl_get(dx), on); +} +#endif + +bool cmpp_dx_is_eliding(cmpp_dx const *dx){ + return CmppLvl_is_eliding(CmppLvl_get(dx)); +} + + +char * cmpp_str_finish(cmpp *pp, sqlite3_str *s, int * n){ + char * z = 0; + int const rc = sqlite3_str_errcode(s); + cmpp__db_rc(pp, rc, "sqlite3_str_errcode()"); + if(0==rc){ + int const nStr = sqlite3_str_length(s); + if(n) *n = nStr; + z = sqlite3_str_finish(s); + if( !z ){ + assert( 0==nStr && "else rc!=0" ); + } + }else{ + cmpp_mfree( sqlite3_str_finish(s) ); + } + return z; +} + +int cmpp__bind_int(cmpp *pp, sqlite3_stmt *pStmt, int col, int64_t val){ + return ppCode + ? ppCode + : cmpp__db_rc(pp, sqlite3_bind_int64(pStmt, col, val), + "from cmpp__bind_int()"); +} + +int cmpp__bind_int_text(cmpp *pp, sqlite3_stmt *pStmt, int col, + int64_t val){ + unsigned char buf[32]; + snprintf((char *)buf, sizeof(buf), "%" PRIi64, val); + return cmpp__bind_textn(pp, pStmt, col, buf, -1); +} + +int cmpp__bind_null(cmpp *pp, sqlite3_stmt *pStmt, int col){ + return ppCode + ? 
ppCode + : cmpp__db_rc(pp, sqlite3_bind_null(pStmt, col), + "from cmpp__bind_null()"); +} + +static int cmpp__bind_textx(cmpp *pp, sqlite3_stmt *pStmt, int col, + unsigned const char * zStr, cmpp_ssize_t n, + void (*dtor)(void *)){ + if( 0==ppCode ){ + cmpp__db_rc( + pp, (zStr && n) + ? sqlite3_bind_text(pStmt, col, + (char const *)zStr, + (int)n, dtor) + : sqlite3_bind_null(pStmt, col), + sqlite3_sql(pStmt) + ); + } + return ppCode; +} + +int cmpp__bind_textn(cmpp *pp, sqlite3_stmt *pStmt, int col, + unsigned const char * zStr, cmpp_ssize_t n){ + return cmpp__bind_textx(pp, pStmt, col, zStr, (int)n, + SQLITE_TRANSIENT); +} + +int cmpp__bind_text(cmpp *pp, sqlite3_stmt *pStmt, int col, + unsigned const char * zStr){ + return cmpp__bind_textn(pp, pStmt, col, zStr, -1); +} + +#if 0 +int cmpp__bind_textv(cmpp*pp, sqlite3_stmt *pStmt, int col, + const char * zFmt, ...){ + if( 0==p->err.code ){ + int rc; + sqlite3_str * str = sqlite3_str_new(pp->pimpl->db.dbh); + int n = 0; + char * z; + va_list va; + if( !str ) return ppCode; + va_start(va,zFmt); + sqlite3_str_vappendf(str, zFmt, va); + va_end(va); + z = cmpp_str_finish(str, &n); + cmpp__db_rc( + pp, z + ? 
sqlite3_bind_text(pStmt, col, z, n, sqlite3_free) + : sqlite3_bind_null(pStmt, col), + sqlite3_sql(pStmt) + ); + cmpp_mfree(z); + } + return p->err.code; +} +#endif + +void cmpp_outputer_set(cmpp *pp, cmpp_outputer const *out, + char const *zName){ + cmpp__pi(pp); + cmpp_outputer_cleanup(&pi->out); + if( out ) pi->out = *out; + else pi->out = cmpp_outputer_empty; + pi->out.name = zName; +} + +void cmpp__outputer_swap(cmpp *pp, cmpp_outputer const *oNew, + cmpp_outputer *oPrev){ + if( oPrev ){ + *oPrev = pp->pimpl->out; + } + pp->pimpl->out = *oNew; +} + +#if 0 +static void delim__list_dump(cmpp const *pp){ + cmpp__delim_list const *li = &pp->pimpl->delim.d; + if( li->n ){ + g_warn0("delimiter stack:"); + for(cmpp_size_t i = 0; i < li->n; ++i ){ + g_warn("#%d: %s", (int)i, li->list[i].z); + } + } + +} +#endif + +static bool cmpp__valid_delim(cmpp * const pp, + char const *z, + char const *zEnd){ + char const * const zB = z; + for( ; z < zEnd; ++z ){ + if( *z<33 || 127==*z ){ + cmpp_err_set(pp, CMPP_RC_SYNTAX, + "Delimiters may not contain " + "control characters."); + return false; + } + } + if( zB==z ){ + cmpp_err_set(pp, CMPP_RC_SYNTAX, + "Delimiters may not be empty."); + } + return z>zB; +} + +CMPP__EXPORT(int, cmpp_delimiter_set)(cmpp *pp, char const *zDelim){ + if( ppCode ) return ppCode; + unsigned n; + if( zDelim ){ + n = cmpp__strlen(zDelim, -1); + if( !cmpp__valid_delim(pp, zDelim, zDelim+n) ){ + return ppCode; + }else if( n>12 /* arbitrary but seems sensible enough */ ){ + return cmpp__err(pp, CMPP_RC_MISUSE, + "Invalid delimiter (too long): %s", zDelim); + } + } + cmpp__pi(pp); + if( pi->delim.d.n ){ + cmpp__delim * const delim = cmpp__pp_delim(pp); + if( !cmpp_check_oom(pp, delim) ){ + cmpp__delim_cleanup(delim); + if( zDelim ){ + delim->open.n = n; + delim->open.z = delim->zOwns = + (unsigned char*)sqlite3_mprintf("%.*s", n, zDelim); + cmpp_check_oom(pp, delim->zOwns); + }else{ + assert( delim->open.z ); + assert( !delim->zOwns ); + assert( 
delim->open.n==sizeof(CMPP_DEFAULT_DELIM)-1 ); + } + } + }else{ + assert(!"Cannot set delimiter on an empty stack!"); + cmpp_err_set(pp, CMPP_RC_MISUSE, + "Directive delimiter stack is empty."); + } + return ppCode; +} + +CMPP__EXPORT(void, cmpp_delimiter_get)(cmpp const *pp, char const **zDelim){ + cmpp__delim const * d = cmpp__pp_delim(pp); + if( !d ) d = &cmpp__delim_empty; + *zDelim = (char const *)d->open.z; +} + +CMPP__EXPORT(int, cmpp_delimiter_push)(cmpp *pp, char const *zDelim){ + cmpp__delim * const d = + cmpp__delim_list_push(pp, &pp->pimpl->delim.d); + if( d && cmpp_delimiter_set(pp, zDelim) ){ + cmpp__delim_list_pop(&pp->pimpl->delim.d); + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_delimiter_pop)(cmpp *pp){ + cmpp__delim_list * const li = &pp->pimpl->delim.d; + if( li->n ){ + //g_warn("Popping delimiter: %s", cmpp__pp_zdelim(pp)); + cmpp__delim_list_pop(li); + if( 0 && li->n ){ + g_warn("restored delimiter: %s", cmpp__pp_zdelim(pp)); + } + }else if( !ppCode ){ + assert(!"Attempt to pop an empty delimiter stack."); + cmpp_err_set(pp, CMPP_RC_MISUSE, + "Cannot pop an empty delimiter stack."); + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_atdelim_set)(cmpp * const pp, + char const *zOpen, + char const *zClose){ + if( 0==ppCode ){ + cmpp__pi(pp); + cmpp__delim * const d = pi->delim.at.n + ? 
&pi->delim.at.list[pi->delim.at.n-1] + : NULL; + assert( d ); + if( !d ){ + return cmpp__err(pp, CMPP_RC_MISUSE, + "@token@ delimiter stack is currently empty."); + } + if( 0==zOpen ){ + zOpen = (char const *)delimAtDefault.open.z; + zClose = (char const *)delimAtDefault.close.z; + }else if( 0==zClose ){ + zClose = zOpen; + } + cmpp_size_t const nO = cmpp__strlen(zOpen, -1); + cmpp_size_t const nC = cmpp__strlen(zClose, -1); + assert( zOpen && zClose ); + if( !cmpp__valid_delim(pp, zOpen, zOpen+nO) + || !cmpp__valid_delim(pp, zClose, zClose+nC) ){ + return ppCode; + } + cmpp_b b = cmpp_b_empty + /* Don't use cmpp_b_borrow() here because we'll unconditionally + transfer ownership of b.z to d. */; + if( 0==cmpp_b_reserve3(pp, &b, nO + nC + 2) ){ +#ifndef NDEBUG + unsigned char const * const zReallocCheck = b.z; +#endif + /* Copy the open/close tokens to a single string to simplify + management. */ + cmpp_b_append4(pp, &b, zOpen, nO); + cmpp_b_append_ch(&b, '\0'); + cmpp_b_append4(pp, &b, zClose, nC); + assert( zReallocCheck==b.z + && "Else buffer was not properly pre-sized" ); + cmpp__delim_cleanup(d); + d->open.z = b.z; + d->open.n = nO; + d->close.z = d->open.z + nO + 1/*NUL*/; + d->close.n = nC; + d->zOwns = b.z; + b = cmpp_b_empty /* transfer memory ownership */; + } + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_atdelim_push)(cmpp *pp, char const *zOpen, + char const *zClose){ + cmpp__delim * const d = + cmpp__delim_list_push(pp, &pp->pimpl->delim.at); + if( d && cmpp_atdelim_set(pp, zOpen, zClose) ){ + cmpp__delim_list_pop(&pp->pimpl->delim.at); + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_atdelim_pop)(cmpp *pp){ + cmpp__delim_list * const li = &pp->pimpl->delim.at; + if( li->n ){ + //g_warn("Popping delimiter: %s", cmpp__pp_zdelim(pp)); + cmpp__delim_list_pop(li); + }else if( !ppCode ){ + assert(!"Attempt to pop an empty @token@ delim stack."); + cmpp_err_set(pp, CMPP_RC_MISUSE, + "Cannot pop an empty @token@ delimiter stack."); + } + return ppCode; +} 
+ +CMPP__EXPORT(void, cmpp_atdelim_get)(cmpp const * const pp, + char const **zOpen, + char const **zClose){ + cmpp__delim const * d + = cmpp__delim_list_get(&pp->pimpl->delim.at); + assert( d ); + if( !d ) d = &delimAtDefault; + if( zClose ) *zClose = (char const *)d->close.z; + if( zOpen ) *zOpen = (char const *)d->open.z; +} + +#define cmpp__scan_int2(SZ,PFMT,Z,N,TGT) \ + (Npimpl->flags.chompF ){ + FileWrapper_chomp(&fw); + } + if( fw.nContent ){ + cmpp__bind_textx(pp, q, 3, fw.zContent, + (cmpp_ssize_t)fw.nContent, sqlite3_free); + fw.zContent = 0 /* transferred ownership */; + fw.nContent = 0; + }else{ + cmpp__bind_null(pp, q, 2); + } + cmpp__step(pp, q, true); + g_debug(pp,2,("define: %s%s%s\n", + kvp.k.z, + kvp.v.z ? " with value " : "", + kvp.v.z ? (char const *)kvp.v.z : "")); + } + FileWrapper_close(&fw); + return ppCode; +} + +CMPP__EXPORT(int, cmpp_has)(cmpp *pp, const char * zName, cmpp_ssize_t nName){ + int rc = 0; + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_defHas, false); + if( q ){ + nName = cmpp__strlen(zName, nName); + cmpp__bind_textn(pp, q, 1, ustr_c(zName), nName); + if(SQLITE_ROW == cmpp__step(pp, q, true)){ + rc = 1; + }else{ + rc = 0; + } + g_debug(pp,1,("has [%s] ?= %d\n",zName, rc)); + } + return rc; +} + +int cmpp__get_bool(cmpp *pp, unsigned const char *zName, cmpp_ssize_t nName){ + int rc = 0; + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_defGetBool, false); + if( q ){ + nName = cmpp__strlenu(zName, nName); + cmpp__bind_textn(pp, q, 1, zName, nName); + assert(0==ppCode); + if(SQLITE_ROW == cmpp__step(pp, q, false)){ + rc = sqlite3_column_int(q, 0); + }else{ + rc = 0; + cmpp__affirm_undef_policy(pp, zName, nName); + } + cmpp__stmt_reset(q); + } + return rc; +} + +int cmpp__get_int(cmpp *pp, unsigned const char * zName, + cmpp_ssize_t nName, int *pOut ){ + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_defGetInt, false); + if( q ){ + nName = cmpp__strlenu(zName, nName); + cmpp__bind_textn(pp, q, 1, zName, nName); + 
assert(0==ppCode); + if(SQLITE_ROW == cmpp__step(pp, q, false)){ + *pOut = sqlite3_column_int(q,0); + }else{ + cmpp__affirm_undef_policy(pp, zName, nName); + } + cmpp__stmt_reset(q); + } + return ppCode; +} + +int cmpp__get_b(cmpp *pp, unsigned const char * zName, + cmpp_ssize_t nName, cmpp_b * os, bool enforceUndefPolicy){ + int rc = 0; + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_defGet, false); + if( q ){ + nName = cmpp__strlenu(zName, nName); + cmpp__bind_textn(pp, q, 1, zName, nName); + int n = 0; + if(SQLITE_ROW == cmpp__step(pp, q, false)){ + const unsigned char * z = sqlite3_column_text(q, 3); + n = sqlite3_column_bytes(q, 3); + cmpp_b_append4(pp, os, z, (cmpp_size_t)n); + rc = 1; + }else{ + if( enforceUndefPolicy ){ + cmpp__affirm_undef_policy(pp, zName, nName); + } + rc = 0; + } + cmpp__stmt_reset(q); + g_debug(pp,1,("get-define [%.*s] ?= %d %.*s\n", + nName, zName, rc, os->n, os->z)); + } + return rc; +} + +int cmpp__get(cmpp *pp, unsigned const char * zName, + cmpp_ssize_t nName, unsigned char **zVal, + unsigned int *nVal){ + int rc = 0; + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_defGet, false); + if( q ){ + nName = cmpp__strlenu(zName, nName); + cmpp__bind_textn(pp, q, 1, zName, nName); + int n = 0; + if(SQLITE_ROW == cmpp__step(pp, q, false)){ + const unsigned char * z = sqlite3_column_text(q, 3); + n = sqlite3_column_bytes(q, 3); + if( nVal ) *nVal = (unsigned)n; + *zVal = ustr_nc(sqlite3_mprintf("%.*s", n, z)) + /* TODO? Return NULL for the n==0 case? */; + if( n && cmpp_check_oom(pp, *zVal) ){ + assert(!*zVal); + }else{ + rc = 1; + } + }else{ + cmpp__affirm_undef_policy(pp, zName, nName); + rc = 0; + } + cmpp__stmt_reset(q); + g_debug(pp,1,("get-define [%.*s] ?= %d %.*s\n", + nName, zName, rc, + *zVal ? n : 0, + *zVal ? 
(char const *)*zVal : "")); + } + return rc; +} + +CMPP__EXPORT(int, cmpp_undef)(cmpp *pp, const char * zKey, + unsigned int *nRemoved){ + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_defDel, false); + if( q ){ + unsigned int const n = strlen(zKey); + cmpp__bind_textn(pp, q, 1, ustr_c(zKey), (cmpp_ssize_t)n); + cmpp__step(pp, q, true); + if( nRemoved ){ + *nRemoved = (unsigned)sqlite3_changes(pp->pimpl->db.dbh); + } + g_debug(pp,2,("undefine: %.*s\n",n, zKey)); + } + return ppCode; +} + +int cmpp__include_dir_add(cmpp *pp, const char * zDir, int priority, int64_t * pRowid){ + if( pRowid ) *pRowid = 0; + if( !ppCode && zDir && *zDir ){ + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_inclPathAdd, false); + if( q ){ + /* TODO: normalize zDir before insertion so that a/b and a/b/ + are equivalent. The relevant code is in another tree, + awaiting a decision on whether to import it or re-base cmpp + on top of that library (which would, e.g., replace cmpp_b + with that one, which is more mature). + */ + cmpp__bind_int(pp, q, 1, priority); + cmpp__bind_textn(pp, q, 2, ustr_c(zDir), -1); + int const rc = cmpp__step(pp, q, false); + if( SQLITE_ROW==rc ){ + ++pp->pimpl->flags.nIncludeDir; + if( pRowid ){ + *pRowid = sqlite3_column_int64(q, 0); + } + } + cmpp__stmt_reset(q); + /*g_warn("inclpath add: rc=%d rowid=%" PRIi64 " prio=%d %s", + rc, pRowid ? *pRowid : 0, priority, zDir);*/ + g_debug(pp,2,("inclpath add: prio=%d %s\n", priority, zDir)); + } + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_include_dir_add)(cmpp *pp, const char * zDir){ + return cmpp__include_dir_add(pp, zDir, 0, NULL); +} + +int cmpp__include_dir_rm_id(cmpp *pp, int64_t rowid){ + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_inclPathRmId, true); + if( q ){ + /* Hoop-jumping to allow this to work even if pp's in an error + state. 
*/ + int rc = sqlite3_bind_int64(q, 1, rowid); + if( 0==rc ){ + rc = sqlite3_step(q); + if( SQLITE_ROW==rc ){ + --pp->pimpl->flags.nIncludeDir; + rc = 0; + }else if( SQLITE_DONE==rc ){ + rc = 0; + } + } + if( rc && !ppCode ){ + cmpp__db_rc(pp, rc, sqlite3_sql(q)); + } + cmpp__stmt_reset(q); + g_debug(pp,2,("inclpath rm #%"PRIi64 "\n", rowid)); + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_module_dir_add)(cmpp *pp, const char * zDirs){ +#if CMPP_ENABLE_DLLS + if( !ppCode ){ + cmpp_b * const ob = &pp->pimpl->mod.path; + if( !zDirs && !ob->n ){ + zDirs = getenv("CMPP_MODULE_PATH"); + if( !zDirs ){ + zDirs = CMPP_MODULE_PATH; + } + } + if( !zDirs || !*zDirs ) return 0; + char const * z = zDirs; + char const * const zEnd = zDirs + strlen(zDirs); + if( 0==cmpp_b_reserve3(pp, ob, ob->n + (zEnd - z) + 3) ){ + unsigned char * zo = ob->z + ob->n; + unsigned i = 0; + for( ; z < zEnd && !ppCode; ++z ){ + switch( *z ){ + case CMPP_PATH_SEPARATOR: + *zo++ = pp->pimpl->mod.pathSep; + break; + default: + if( 1==++i && ob->n ){ + cmpp_b_append_ch(ob, pp->pimpl->mod.pathSep); + } + *zo++ = *z; + break; + } + } + *zo = 0; + ob->n = (zo - ob->z); + } + } + return ppCode; +#else + return CMPP_RC_UNSUPPORTED; +#endif +} + +CMPP__EXPORT(int, cmpp_db_name_set)(cmpp *pp, const char * zName){ + if( 0==ppCode ){ + cmpp__pi(pp); + if( pi->db.dbh ){ + return cmpp__err(pp, CMPP_RC_MISUSE, + "DB name cannot be set after db initialization."); + } + if( zName ){ + char * const z = sqlite3_mprintf("%s", zName); + if( 0==cmpp_check_oom(pp, z) ){ + cmpp_mfree(pi->db.zName); + pi->db.zName = z; + } + }else{ + cmpp_mfree(pi->db.zName); + pi->db.zName = 0; + } + } + return ppCode; +} + +bool cmpp__is_legal_key(unsigned char const *zName, + cmpp_size_t n, + unsigned char const **zErrPos, + bool equalIsLegal){ + if( !n || n>64/*arbitrary*/ ){ + if( zErrPos ) *zErrPos = 0; + return false; + } + unsigned char const * z = zName; + unsigned char const * const zEnd = zName + n; + for( ; z<zEnd; ++z ){ + if( !( (*z>='a' && *z<='z') + 
|| (*z>='A' && *z<='Z') + || (z>zName && + ('-'==*z + /* This is gonna bite us if we extend the expressions to + support +/-. Expressions currently parse X=Y (no + spaces) as the three tokens X = Y, but we'd need to + require a space between X-Y in expressions because + '-' is a legal symbol character. i've looked at + making '-' illegal but it's just too convenient for + use in define keys. Once one is used to + tcl-style-naming of stuff, it's painful to have to go + back to snake_case. + */ + || (*z>='0' && *z<='9'))) + || (*z>='.' && *z<='/') + || (*z==':') + || (*z=='_') + || (equalIsLegal && z>zName && '='==*z) + || (*z & 0x80) + ) ){ + if( zErrPos ) *zErrPos = z; + return false; + } + } + return true; +} + +bool cmpp_is_legal_key(unsigned char const *zName, + cmpp_size_t n, + unsigned char const **zErrPos){ + return cmpp__is_legal_key(zName, n, zErrPos, false); +} + +int cmpp__legal_key_check(cmpp *pp, unsigned char const *zKey, + cmpp_ssize_t nKey, bool permitEqualSign){ + if( !ppCode ){ + unsigned char const *zAt = 0; + nKey = cmpp__strlenu(zKey, nKey); + if( !cmpp__is_legal_key(zKey, nKey, &zAt, permitEqualSign) ){ + /* zAt is NULL when the key is empty or too long, so guard the dereference. */ + cmpp__err(pp, CMPP_RC_SYNTAX, + "Illegal character 0x%02x in key [%.*s]", + zAt ? (int)*zAt : 0, nKey, zKey); + } + } + return ppCode; +} + +CMPP__EXPORT(bool, cmpp_next_chunk)(unsigned char const **zPos, + unsigned char const *zEnd, + unsigned char chSep, + cmpp_size_t *pCounter){ + assert( zPos ); + assert( *zPos ); + assert( zEnd ); + if( *zPos >= zEnd ) return false; + unsigned char const * z = *zPos; + while( zpimpl state, which "doesn't happen". 
+*/ +//static +bool cmpp__dx_next_line(cmpp_dx * const dx, CmppDLine *ln){ + assert( !dxppCode ); + cmpp_dx_pimpl * const dxp = dx->pimpl; + if(!dxp->pos.z) dxp->pos.z = dxp->zBegin; + assert( dxp->zEnd ); + if( dxp->pos.z>=dxp->zEnd ){ + return false; + } + assert( (dxp->pos.z==dxp->zBegin || dxp->pos.z[-1]=='\n') + && "Else we've mismanaged something."); + cmpp__dx_pi(dx); + ln->lineNo = dpi->pos.lineNo; + ln->zBegin = dpi->pos.z; + ln->zEnd = ln->zBegin; + return cmpp_next_chunk(&ln->zEnd, dpi->zEnd, (unsigned char)'\n', + &dpi->pos.lineNo); +} + +/** + Scans [dx->pos.z,dx->zEnd) for a directive delimiter. Emits any + non-delimiter output found along the way to dx->pp's output + channel. + + This updates dx->pimpl->pos.z and dx->pimpl->pos.lineNo as it goes. + + If a delimiter is found, it sets *gotOne to true and updates + dx->pimpl->dline to point to the remainder of that line. On no match + *gotOne will be false and EOF will have been reached. + + Returns dxppCode. If it returns non-0 then the state of dx's + tokenization pieces is unspecified, i.e. it's illegal to call this + again without a reset. 
+*/ +static int cmpp_dx_delim_search(cmpp_dx * const dx, bool * gotOne){ + if( dxppCode ) return dxppCode; + cmpp_dx_pimpl * const dxp = dx->pimpl; + if(!dxp->pos.z) dxp->pos.z = dxp->zBegin; + if( dxp->pos.z>=dxp->zEnd ){ + *gotOne = false; + return 0; + } + assert( (dxp->pos.z==dxp->zBegin || dxp->pos.z[-1]=='\n') + && "Else we've mismanaged something."); + cmpp__pi(dx->pp); + cmpp__delim const * const delim = cmpp__dx_delim(dx); + if(!delim) { + return cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "The directive delimiter stack is empty."); + } + unsigned char const * const zD = delim->open.z; + unsigned short const nD = delim->open.n; + unsigned char const * const zEnd = dxp->zEnd; + unsigned char const * zLeft = dxp->pos.z; + unsigned char const * z = zLeft; + assert(zD); + assert(nD); +#if 0 + assert( 0==*zEnd && "Else we'll misinteract with strcspn()" ); + if( *zEnd ){ + return cmpp_dx_err(dx, CMPP_RC_RANGE, + "Input must be NUL-terminated."); + } +#endif + ++dxp->flags.countLines; + while( zpos.lineNo; + ++z; + } +#define tflush \ + if( z>zEnd ) z=zEnd; \ + if( z>zLeft && cmpp_dx_out_expand(dx, &pi->out, zLeft, \ + (cmpp_size_t)(z-zLeft), \ + cmpp_atpol_CURRENT) ){ \ + --dxp->flags.countLines; \ + return dxppCode; \ + } zLeft = z + + CmppDLine * const dline = &dxp->dline; + bool atBOL = true /* At the start of a line? Successful calls to + this always end at either BOL or EOF. */; + if( 0 ){ + g_warn("scanning... <<%.*s...>>", + (zEnd-z)>20?20:(zEnd-z), z); + } + while( zpos.lineNo; + z += 1 + ('\r'==*z); + atBOL = true; + } + if( atBOL ){ + break; + } + ++z; + } + if( !atBOL ) break; + } + if( 0 ){ + g_warn("at BOL... <<%.*s...>>", + (zEnd-z) > 20 ? 20 : (zEnd-z), z); + } + + /* We're at BOL. Check for a delimiter with optional leading + spaces. */ + tflush; + cmpp_skip_space(&z, zEnd); + int const skip = cmpp_isnl(z, zEnd); + if( skip ){ + /* Special case: a line comprised solely of whitespace. 
If we + don't catch this here, we won't recognize a delimiter which + starts on the next line. */ + tflush; + z += skip; + ++dxp->pos.lineNo; + continue; + } + if( 0 ){ + g_warn("at BOL... <<%.*s...>>", + (zEnd-z) > 20 ? 20 : (zEnd-z), z); + } + if( z + nD>zEnd ){ + /* Too short for a delimiter. We'll catch the z+nD==zEnd corner + case in a moment. */ + z = zEnd; + break; + } + if( memcmp(z, zD, nD) ){ + /* Not a delimiter. Keep trying. */ + atBOL = false; + ++z; + continue; + } + + /* z now points to a delimiter which sits at the start of a line + (ignoring leading spaces). */ + z += nD /* skip the delimiter */; + cmpp_skip_space(&z, zEnd) /* skip spaces immediately following + the delimiter. */; + if( z>=zEnd || cmpp_isnl(z, zEnd) ){ + dxserr("No directive name found after %s.", zD); + /* We could arguably treat this as no match and pass this line + through as-is but that currently sounds like a pothole. */ + break; + } + /* Set up dx->pimpl->dline to encompass the whole directive line sans + delimiter and leading spaces. */ + dline->zBegin = z + /* dx->pimpl->dline starts at the directive name and extends until the + next EOL/EOF. We don't yet know if it's a legal directive + name - cmpp_dx_next() figures that part out. */; + dline->lineNo = dxp->pos.lineNo; + /* Now find the end of the line or EOF, accounting for + backslash-escaped newlines and _not_ requiring backslashes to + escape newlines inside of {...}, (...), or [...]. We could also + add the double-quotes to this, but let's start without that. */ + bool keepGoing = true; + zLeft = z; + while( keepGoing && zpp, &z, zEnd, &dxp->pos.lineNo) ){ + --dxp->flags.countLines; + return dxppCode; + } + ++z /* group-closing character */; + /* + Sidebar: this only checks top-level groups. It is + possible that an inner group is malformed, e.g.: + + { ( } + + It's also possible that that's perfectly legal for a + specific use case. 
+ + Such cases will, if they're indeed syntax errors, be + recognized as such in the arguments-parsing + steps. Catching them here would require that we + recursively validate all of [zLeft,z) for group + constructs, whereas that traversal happens as a matter of + course in argument parsing. It would also require the + assumption that such constructs are not legal, which is + invalid once we start dealing with free-form input like + #query SQL. + */ + break; + } + case '\n': + assert( z!=dline->zBegin && "Checked up above" ); + if( '\\'==z[-1] + || (z>zLeft+1 && '\r'==z[-1] && '\\'==z[-2]) ){ + /* Backslash-escaped newline. */ + ++z; + }else{ + /* EOL for this directive. */ + keepGoing = false; + } + ++dxp->pos.lineNo; + break; + default: + ++z; + } + } + assert( z==zEnd || '\n'==*z ); + dline->zEnd = z; + dxp->pos.z = dline->zEnd + 1 + /* For the next call to this function, skip the trailing newline + or EOF */; + assert( dline->zBegin < dline->zEnd && "Was checked above" ); + if( 0 ){ + g_warn("line= %u <<%.*s>>", (dline->zEnd-dline->zBegin), + (dline->zEnd-dline->zBegin), dline->zBegin); + } + *gotOne = true; + assert( !dxppCode ); + --dxp->flags.countLines; + return 0; + } + /* No directives found. We're now at EOL or EOF. Flush any pending + LHS content. 
*/ + tflush; + dx->pimpl->pos.z = z; + *gotOne = false; + return dxppCode; +#undef tflush +} + +int CmppKvp_parse(cmpp *pp, CmppKvp * p, unsigned char const *zKey, + cmpp_ssize_t nKey, CmppKvp_op_e opPolicy){ + if(ppCode) return ppCode; + char chEq = 0; + char opLen = 0; + *p = CmppKvp_empty; + p->k.z = zKey; + p->k.n = cmpp__strlenu(zKey, nKey); + switch( opPolicy ){ + case CmppKvp_op_none:// break; + case CmppKvp_op_eq1: + chEq = '='; + opLen = 1; + break; + default: + assert(!"don't use these"); + /* no longer todo: ==, !=, <=, <, >, >= */ + chEq = '='; + opLen = 1; + break; + } + assert( chEq ); + p->op = CmppKvp_op_none; + unsigned const char * const zEnd = p->k.z + p->k.n; + for(unsigned const char * zPos = p->k.z ; *zPos && zPosop = CmppKvp_op_eq1; + p->k.n = (unsigned)(zPos - ustr_c(zKey)); + zPos += opLen; + assert( zPos <= zEnd ); + p->v.z = zPos; + p->v.n = (unsigned)(zEnd - zPos); + } + break; + } + } + cmpp__legal_key_check(pp, p->k.z, p->k.n, false); + return ppCode; +} + +int cmpp_array_reserve(cmpp *pp, void **list, cmpp_size_t nDesired, + cmpp_size_t * nAlloc, unsigned sizeOfEntry){ + int rc = pp ? ppCode : 0; + if( 0==rc && nDesired > *nAlloc ){ + cmpp_size_t const nA = nDesired < 10 ? 10 : nDesired; + void * const p = cmpp_mrealloc(*list, sizeOfEntry * nA); + rc = cmpp_check_oom(pp, p); + if( p ){ + memset((unsigned char *)p + + (sizeOfEntry * *nAlloc), 0, + sizeOfEntry * (nA - *nAlloc)); + *list = p; + *nAlloc = nA; + } + } + return rc; +} + +CmppLvl * CmppLvlList_push(cmpp *pp, CmppLvlList *li){ + CmppLvl * p = 0; + assert( li->list ? 
li->nAlloc : 0==li->nAlloc ); + if( 0==ppCode + && 0==CmppLvlList_reserve(pp, li, + cmpp__li_reserve1_size(li,5)) ){ + p = li->list[li->n]; + if( !p ){ + p = cmpp__malloc(pp, sizeof(*p)); + } + if( p ){ + li->list[li->n++] = p; + *p = CmppLvl_empty; + } + } + return p; +} + +void CmppLvlList_pop(cmpp * const pp, CmppLvlList * const li, + CmppLvl * const lvl){ + assert( li->n ); + if( li->n ){ + if( lvl==li->list[li->n-1] ){ + *lvl = CmppLvl_empty; + cmpp_mfree(lvl); + li->list[--li->n] = 0; + }else{ + if( pp ){ + cmpp_err_set(pp, CMPP_RC_ASSERT, + "Misuse of %s(): not passed the top of the stack. " + "The CmppLvl stack is now out of whack.", + __func__); + }else{ + cmpp__fatal("Misuse of %s(): not passed the top of the stack", + __func__); + } + /* do not free it - CmppLvlList_cleanup() will catch it. */ + } + } +} + +void CmppLvlList_cleanup(CmppLvlList *li){ + const CmppLvlList CmppLvlList_empty = CmppLvlList_empty_m; + while( li->nAlloc ){ + cmpp_mfree( li->list[--li->nAlloc] ); + } + cmpp_mfree(li->list); + *li = CmppLvlList_empty; +} + +static inline void CmppDList_entry_clean(CmppDList_entry * const e){ + if( e->d.impl.dtor ){ + e->d.impl.dtor( e->d.impl.state ); + } + cmpp_mfree(e->zName); + *e = CmppDList_entry_empty; +} + +#if 0 +CmppDList * CmppDList_reuse(CmppDList *li){ + while( li->n ){ + CmppDList_entry_clean( li->list[--li->n] ); + } + return li; +} +#endif + +void CmppDList_cleanup(CmppDList *li){ + static const CmppDList CmppDList_empty = CmppDList_empty_m; + while( li->n ){ + CmppDList_entry_clean( li->list[--li->n] ); + cmpp_mfree( li->list[li->n] ); + li->list[li->n] = 0; + } + cmpp_mfree(li->list); + *li = CmppDList_empty; +} + +void CmppDList_unappend(CmppDList *li){ + assert( li->n ); + if( li->n ){ + CmppDList_entry_clean(li->list[--li->n]); + } +} + + +/** bsearch()/qsort() comparison for (cmpp_d**), sorting by name. 
*/ +static +int CmppDList_entry_cmp_pp(const void *p1, const void *p2){ + CmppDList_entry const * eL = *(CmppDList_entry const * const *)p1; + CmppDList_entry const * eR = *(CmppDList_entry const * const *)p2; + return eL->d.name.n==eR->d.name.n + ? memcmp(eL->d.name.z, eR->d.name.z, eL->d.name.n) + : strcmp((char const *)eL->d.name.z, + (char const *)eR->d.name.z); +} + +static void CmppDList_sort(CmppDList * const li){ + if( li->n>1 ){ + qsort(li->list, li->n, sizeof(CmppDList_entry*), + CmppDList_entry_cmp_pp); + } +} + +CmppDList_entry * CmppDList_append(cmpp *pp, CmppDList *li){ + CmppDList_entry * p = 0; + assert( li->list ? li->nAlloc : 0==li->nAlloc ); + if( 0==ppCode + && 0==cmpp_array_reserve(pp, (void **)&li->list, + cmpp__li_reserve1_size(li, 15), + &li->nAlloc, sizeof(p)) ){ + p = li->list[li->n]; + if( !p ){ + li->list[li->n] = p = cmpp__malloc(pp, sizeof(*p)); + } + if( p ){ + ++li->n; + *p = CmppDList_entry_empty; + } + } + return p; +} + +CmppDList_entry * CmppDList_search(CmppDList const * li, + char const *zName){ + if( li->n > 2 ){ + CmppDList_entry const key = { + .d = { + .name = { + .z = zName, + .n = strlen(zName) + } + } + }; + CmppDList_entry const * pKey = &key; + CmppDList_entry ** pRv + = bsearch(&pKey, li->list, li->n, sizeof(li->list[0]), + CmppDList_entry_cmp_pp); + //g_warn("search in=%s out=%s", zName, (pRv ? (*pRv)->d.name.z : "")); + return pRv ? 
*pRv : 0; + }else{ + cmpp_size_t const nName = cmpp__strlen(zName, -1); + for( cmpp_size_t i = 0; i < li->n; ++i ){ + CmppDList_entry * const e = li->list[i]; + if( nName==e->d.name.n && 0==strcmp(zName, e->d.name.z) ){ + //g_warn("search in=%s out=%s", zName, e->d.name.z); + return e; + } + } + return 0; + } +} + +void cmpp__delim_cleanup(cmpp__delim *d){ + cmpp__delim const dd = cmpp__delim_empty_m; + cmpp_mfree(d->zOwns); + *d = dd; + assert(!d->zOwns); + assert(d->open.z); + assert(0==strcmp((char*)d->open.z, CMPP_DEFAULT_DELIM)); + assert(d->open.n == sizeof(CMPP_DEFAULT_DELIM)-1); +} + +cmpp__delim * cmpp__delim_list_push(cmpp *pp, cmpp__delim_list *li){ + cmpp__delim * p = 0; + assert( li->list ? li->nAlloc : 0==li->nAlloc ); + if( 0==ppCode + && 0==cmpp_array_reserve(pp, (void **)&li->list, + cmpp__li_reserve1_size(li,4), + &li->nAlloc, sizeof(cmpp__delim)) ){ + p = &li->list[li->n++]; + *p = cmpp__delim_empty; + } + return p; +} + +void cmpp__delim_list_cleanup(cmpp__delim_list *li){ + while( li->nAlloc ) cmpp__delim_cleanup(li->list + --li->nAlloc); + cmpp_mfree(li->list); + *li = cmpp__delim_list_empty; +} + +CMPP__EXPORT(int, cmpp_dx_next)(cmpp_dx * const dx, bool * pGotOne){ + if( dxppCode ) return dxppCode; + + CmppDLine * const tok = &dx->pimpl->dline; + if( !dx->pimpl->zBegin ){ + *pGotOne = false; + return 0; + } + assert(dx->pimpl->zEnd); + assert(dx->pimpl->zEnd > dx->pimpl->zBegin); + *pGotOne = false; + cmpp_dx__reset(dx); + bool foundDelim = false; + if( cmpp_dx_delim_search(dx, &foundDelim) || !foundDelim ){ + return dxppCode; + } + if( cmpp_args__init(dx->pp, &dx->pimpl->args) ){ + return dxppCode; + } + cmpp_skip_space( &tok->zBegin, tok->zEnd ); + g_debug(dx->pp,2,("Directive @ line %u: <<%.*s>>\n", + tok->lineNo, + (int)(tok->zEnd-tok->zBegin), tok->zBegin)); + /* Normalize the directive's line and parse arguments */ + const unsigned lineLen = (unsigned)(tok->zEnd - tok->zBegin); + if(!lineLen){ + return cmpp_dx_err(dx, CMPP_RC_SYNTAX, + 
"Line #%u has no directive after %s", + tok->lineNo, cmpp_dx_delim(dx)); + } + unsigned char const * zi = tok->zBegin /* Start of input */; + unsigned char const * ziEnd = tok->zEnd /* Input EOF */; + cmpp_b * const bufLine = + cmpp_b_reuse(&dx->pimpl->buf.line) + /* Slightly-transformed copy of the input. */; + if( cmpp_b_reserve3(dx->pp, bufLine, lineLen+1) ){ + return dxppCode; + } + unsigned char * zo = bufLine->z /* Start of output */; + unsigned char const * const zoEnd = + zo + bufLine->nAlloc /* Output EOF. */; + g_debug(dx->pp,2,("Directive @ line %u len=%u <<%.*s>>\n", + tok->lineNo, lineLen, lineLen, tok->zBegin)); + //memset(bufLine->z, 0, bufLine->nAlloc); +#define out(CH) if(zo==zoEnd) break; (*zo++)=CH + /* + bufLine is now populated with a copy of the whole input line. + Now normalize that buffer a bit before trying to parse it. + */ + unsigned char const * zEsc = 0; + cmpp_dx_pimpl * const pimpl = dx->pimpl; + for( ; zi=zoEnd ){ + return cmpp_dx_err(dx, CMPP_RC_RANGE, + "Ran out of argument-processing space."); + } + *zo = 0; +#undef out + bufLine->n = (cmpp_size_t)(zo - bufLine->z); + if( 0 ) g_warn("bufLine.n=%u line=<<%s>>", bufLine->n, bufLine->z); + /* Line has now been normalized into bufLine->z. 
*/ + for( zo = bufLine->z; zoz; + dx->d = cmpp__d_search3(dx->pp, (char const *)zDirective, + cmpp__d_search3_F_ALL); + if( dxppCode ){ + return dxppCode; + }else if(!dx->d){ + return cmpp_dx_err(dx, CMPP_RC_NOT_FOUND, + "Unknown directive at line %" + CMPP_SIZE_T_PFMT ": %.*s\n", + (unsigned)tok->lineNo, + (int)bufLine->n, bufLine->z); + } + assert( zDirective == bufLine->z ); + const bool isCall + = dx->pimpl->args.pimpl->isCall + = dx->pimpl->flags.nextIsCall; + dx->pimpl->flags.nextIsCall = false; + if( isCall ){ + if( cmpp_d_F_NO_CALL & dx->d->flags ){ + return cmpp_dx_err(dx, CMPP_RC_SYNTAX, + "%s%s cannot be used in a [call] context.", + cmpp_dx_delim(dx), + dx->d->name.z); + } + }else if( cmpp_d_F_CALL_ONLY & dx->d->flags ){ + return cmpp_dx_err(dx, CMPP_RC_TYPE, + "'%s' is a call-only directive, " + "not legal here.", dx->d->name.z); + } + if( bufLine->n > dx->d->name.n ){ + dx->args.z = zDirective + dx->d->name.n + 1; + assert( dx->args.z > bufLine->z ); + assert( dx->args.z <= bufLine->z+bufLine->n ); + dx->args.nz = cmpp__strlenu(dx->args.z, -1); + assert( bufLine->nAlloc > dx->args.nz ); + }else{ + dx->args.z = ustr_c("\0"); + dx->args.nz = 0; + } + if( 0 ){ + g_warn("bufLine.n=%u zArgs offset=%u line=<<%s>>\nzArgs=<<%s>>", + bufLine->n, (dx->args.z - zDirective), bufLine->z, dx->args.z); + } + cmpp_skip_snl(&dx->args.z, dx->args.z + dx->args.nz); + if(0){ + g_warn("zArgs %u = <<%.*s>>", (int)dx->args.nz, + (int)dx->args.nz, dx->args.z); + } + assert( !pimpl->buf.argsRaw.n ); + if( dx->args.nz ){ + if( 0 ){ + g_warn("lineLen=%u zargs len=%u: [%.*s]\n", + (unsigned)lineLen, + (int)dx->args.nz, (int)dx->args.nz, + dx->args.z + ); + } + if( cmpp_b_append4(dx->pp, &pimpl->buf.argsRaw, + dx->args.z, dx->args.nz) ){ + return dxppCode; + } + } + assert( !pimpl->args.arg0 ); + assert( !pimpl->args.argc ); + assert( !pimpl->args.pimpl->argOut.n ); + assert( !pimpl->args.pimpl->argli.n ); + assert( dx->args.z ); + if( //1 || //pleases valgrind. 
Well, it did at one point. + !cmpp_dx_is_eliding(dx) || 0!=(cmpp_d_F_FLOW_CONTROL & dx->d->flags) ){ + if( cmpp_d_F_ARGS_LIST & dx->d->flags ){ + cmpp_dx_args_parse(dx, &pimpl->args); + }else if( cmpp_d_F_ARGS_RAW & dx->d->flags ){ + /* Treat rest of line as one token */ + cmpp_arg * const arg = + CmppArgList_append(dx->pp, &pimpl->args.pimpl->argli); + if( !arg ) return dxppCode; + pimpl->args.arg0 = arg; + pimpl->args.argc = 1; + arg->ttype = cmpp_TT_RawLine; + arg->z = pimpl->buf.argsRaw.z; + arg->n = pimpl->buf.argsRaw.n; + //g_warn("arg->n/z=%u %s", (unsigned)arg->n, arg->z); + } + } + if( 0==dxppCode ){ + dx->args.arg0 = pimpl->args.arg0; + dx->args.argc = pimpl->args.argc; + } + *pGotOne = true; + return dxppCode; +} + +CMPP_EXPORT bool cmpp_dx_is_call(cmpp_dx * const dx){ + return dx->pimpl->args.pimpl->isCall; +} + +CMPP__EXPORT(int, cmpp_d_register)(cmpp * pp, cmpp_d_reg const * r, + cmpp_d ** dOut){ + CmppDList_entry * e1 = 0, * e2 = 0; + bool const isCallOnly = + (cmpp_d_F_CALL_ONLY & r->opener.flags); + if( ppCode ){ + goto end; + } + if( (cmpp_d_F_NOT_IN_SAFEMODE & (r->opener.flags | r->closer.flags)) + && (cmpp_ctor_F_SAFEMODE & pp->pimpl->flags.newFlags) ){ + cmpp__err(pp, CMPP_RC_ACCESS, + "Directive %s%s flag cmpp_d_F_NOT_IN_SAFEMODE is set " + "and the preprocessor is running in safe mode.", + cmpp__pp_zdelim(pp), r->name); + goto end; + } + if( isCallOnly && r->closer.f ){ + cmpp__err(pp, CMPP_RC_MISUSE, + "Call-only directives may not have a closing directive."); + goto end; + } +#if 0 + if( pp->pimpl->dx ){ + cmpp__err(pp, CMPP_RC_MISUSE, + "Directives may not be added while a " + "directive is running." + /* because that might reallocate being-run directives. + 2025-10-25: that's since been resolved but we need a + use case before enabling this. 
+ */); + goto end; + } +#endif + if( !pp->pimpl->flags.isInternalDirectiveReg + && !cmpp_is_legal_key(ustr_c(r->name), + cmpp__strlen(r->name,-1), NULL) ){ + cmpp__err(pp, CMPP_RC_RANGE, + "\"%s\" is not a legal directive name.", r->name); + goto end; + } + if( cmpp__d_search(pp, r->name) ){ + cmpp__err(pp, CMPP_RC_ALREADY_EXISTS, + "Directive name '%s' is already in use.", + r->name); + goto end; + } + e1 = CmppDList_append(pp, &pp->pimpl->d.list); + if( !e1 ) goto end; + e1->d.impl.callback = r->opener.f; + e1->d.impl.state = r->state; + e1->d.impl.dtor = r->dtor; + if( pp->pimpl->flags.isInternalDirectiveReg ){ + e1->d.flags = r->opener.flags; + }else{ + e1->d.flags = r->opener.flags & cmpp_d_F_MASK; + } + e1->zName = sqlite3_mprintf("%s", r->name); + if( 0==cmpp_check_oom(pp, e1->zName) ){ + //e1->reg = *r; e1->reg.zName = e1->zName; + e1->d.name.z = e1->zName; + e1->d.name.n = strlen(e1->zName); + if( r->closer.f + && (e2 = CmppDList_append(pp, &pp->pimpl->d.list)) ){ + e2->d.impl.callback = r->closer.f; + e2->d.impl.state = r->state; + if( pp->pimpl->flags.isInternalDirectiveReg ){ + e2->d.flags = r->closer.flags; + }else{ + e2->d.flags = r->closer.flags & cmpp_d_F_MASK; + } + e1->d.closer = &e2->d; + e2->zName = sqlite3_mprintf("/%s", r->name); + if( 0==cmpp_check_oom(pp, e2->zName) ){ + e2->d.name.z = e2->zName; + e2->d.name.n = e1->d.name.n + 1; + } + } + } + +end: + if( ppCode ){ + if( e2 ) CmppDList_unappend(&pp->pimpl->d.list); + if( e1 ) CmppDList_unappend(&pp->pimpl->d.list); + else if( r->dtor ){ + r->dtor( r->state ); + } + }else{ + CmppDList_sort(&pp->pimpl->d.list); + if( dOut ){ + *dOut = &e1->d; + } + if( 0 ){ + g_warn("Registered: %s%s%s", e1->zName, + e2 ? " and " : "", + e2 ? 
e2->zName : ""); + } + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_dx_consume)(cmpp_dx * const dx, cmpp_outputer * const os, + cmpp_d const * const * const dClosers, + unsigned nClosers, + cmpp_flag32_t flags){ + assert( !dxppCode ); + bool gotOne = false; + cmpp_outputer const oldOut = dx->pp->pimpl->out; + bool const allowOtherDirectives = + (flags & cmpp_dx_consume_F_PROCESS_OTHER_D); + cmpp_d const * const d = cmpp_dx_d(dx); + cmpp_size_t const lineNo = dx->pimpl->dline.lineNo; + bool const pushAt = (cmpp_dx_consume_F_RAW & flags); + if( pushAt && cmpp_atpol_push(dx->pp, cmpp_atpol_OFF) ){ + return dxppCode; + } + if( os ){ + dx->pp->pimpl->out = *os; + } + while( 0==dxppCode + && 0==cmpp_dx_next(dx, &gotOne) + /* ^^^^^^^ resets dx->d, dx->pimpl->args and friends */ ){ + if( !gotOne ){ + dxserr("No closing directive found for " + "%s%s opened on line %" CMPP_SIZE_T_PFMT ".", + cmpp_dx_delim(dx), d->name.z, lineNo); + }else{ + cmpp_d const * const d2 = cmpp_dx_d(dx); + gotOne = false; + for( unsigned i = 0; !gotOne && i < nClosers; ++i ){ + gotOne = d2==dClosers[i]; + } + //g_warn("gotOne=%d d2=%s", gotOne, d2->name.z); + if( gotOne ) break; + else if( !allowOtherDirectives ){ + dxserr("%s%s at line %" CMPP_SIZE_T_PFMT + " may not contain %s%s.", + cmpp_dx_delim(dx), d->name.z, lineNo, + cmpp_dx_delim(dx), d2->name.z); + }else{ + cmpp_dx_process(dx); + } + } + } + if( pushAt ){ + cmpp_atpol_pop(dx->pp); + } + if( os ){ + dx->pp->pimpl->out = oldOut; + } + return dxppCode; +} + +CMPP__EXPORT(int, cmpp_dx_consume_b)(cmpp_dx * const dx, cmpp_b * const b, + cmpp_d const * const * dClosers, + unsigned nClosers, cmpp_flag32_t flags){ + cmpp_outputer oss = cmpp_outputer_b; + oss.state = b; + return cmpp_dx_consume(dx, &oss, dClosers, nClosers, flags); +} + +char const * cmpp__atpol_name(cmpp *pp, cmpp_atpol_e p){ +again: + switch(p){ + case cmpp_atpol_CURRENT:{ + if( pp ){ + assert( p!=cmpp__policy(pp, at) ); + p = cmpp__policy(pp, at); + pp = 0; + goto again; + } 
+ return NULL; + } + case cmpp_atpol_invalid: return NULL; + case cmpp_atpol_OFF: return "off"; + case cmpp_atpol_RETAIN: return "retain"; + case cmpp_atpol_ELIDE: return "elide"; + case cmpp_atpol_ERROR: return "error"; + } + return NULL; +} + +cmpp_atpol_e cmpp_atpol_from_str(cmpp * const pp, char const *z){ + cmpp_atpol_e rv = cmpp_atpol_invalid; + if( 0==strcmp(z, "retain") ) rv = cmpp_atpol_RETAIN; + else if( 0==strcmp(z, "elide") ) rv = cmpp_atpol_ELIDE; + else if( 0==strcmp(z, "error") ) rv = cmpp_atpol_ERROR; + else if( 0==strcmp(z, "off") ) rv = cmpp_atpol_OFF; + if( pp ){ + if( cmpp_atpol_invalid==rv + && 0==strcmp(z, "current") ){ + rv = cmpp__policy(pp,at); + }else if( cmpp_atpol_invalid==rv ){ + cmpp__err(pp, CMPP_RC_RANGE, + "Invalid @ policy value: %s." + " Try one of retain|elide|error|off|current.", z); + }else{ + cmpp__policy(pp,at) = rv; + } + } + return rv; +} + +int cmpp__StringAtIsOk(cmpp * pp, cmpp_atpol_e pol){ + if( 0==ppCode ){ + if( pol==cmpp_atpol_CURRENT ) pol=cmpp__policy(pp,at); + if(cmpp_atpol_OFF==pol ){ + cmpp_err_set(pp, CMPP_RC_UNSUPPORTED, + "@policy is \"off\", so cannot use @\"strings\"."); + } + } + return ppCode; +} + +cmpp__PodList_impl(PodList__atpol,cmpp_atpol_e) +cmpp__PodList_impl(PodList__unpol,cmpp_unpol_e) + +int cmpp_atpol_push(cmpp * pp, cmpp_atpol_e pol){ + if( cmpp_atpol_CURRENT==pol ) pol = cmpp__policy(pp,at); + assert( cmpp_atpol_CURRENT!=pol && "Else internal mismanagement." 
); + if( 0==PodList__atpol_push(pp, &cmpp__epol(pp,at), pol) + && 0!=cmpp_atpol_set(pp, pol)/*for validation*/ ){ + PodList__atpol_pop(&cmpp__epol(pp,at)); + } + return ppCode; +} + +void cmpp_atpol_pop(cmpp * pp){ + assert( cmpp__epol(pp,at).n ); + if( cmpp__epol(pp,at).n ){ + PodList__atpol_pop(&cmpp__epol(pp,at)); + }else if( !ppCode ){ + cmpp_err_set(pp, CMPP_RC_MISUSE, + "%s() called when no cmpp_atpol_push() is active.", + __func__); + } +} + +int cmpp_unpol_push(cmpp * pp, cmpp_unpol_e pol){ + if( 0==PodList__unpol_push(pp, &cmpp__epol(pp,un), pol) + && cmpp_unpol_set(pp, pol)/*for validation*/ ){ + PodList__unpol_pop(&cmpp__epol(pp,un)); + } + return ppCode; +} + +void cmpp_unpol_pop(cmpp * pp){ + assert( cmpp__epol(pp,un).n ); + if( cmpp__epol(pp,un).n ){ + PodList__unpol_pop(&cmpp__epol(pp,un)); + }else if( !ppCode ){ + cmpp_err_set(pp, CMPP_RC_MISUSE, + "%s() called when no cmpp_unpol_push() is active.", + __func__); + } +} + +CMPP__EXPORT(cmpp_atpol_e, cmpp_atpol_get)(cmpp const * const pp){ + return cmpp__epol(pp,at).na + ? 
cmpp__policy(pp,at) : cmpp_atpol_DEFAULT; +} + +CMPP__EXPORT(int, cmpp_atpol_set)(cmpp * const pp, cmpp_atpol_e pol){ + if( 0==ppCode ){ + switch(pol){ + case cmpp_atpol_OFF: + case cmpp_atpol_RETAIN: + case cmpp_atpol_ELIDE: + case cmpp_atpol_ERROR: + assert(cmpp__epol(pp,at).na); + cmpp__policy(pp,at) = pol; + break; + case cmpp_atpol_CURRENT: + break; + default: + cmpp__err(pp, CMPP_RC_RANGE, "Invalid policy value: %d", + (int)pol); + } + } + return ppCode; +} + + +char const * cmpp__unpol_name(cmpp *pp, cmpp_unpol_e p){ + (void)pp; + switch(p){ + case cmpp_unpol_NULL: return "null"; + case cmpp_unpol_ERROR: return "error"; + case cmpp_unpol_invalid: return NULL; + } + return NULL; +} + +cmpp_unpol_e cmpp_unpol_from_str(cmpp * const pp, + char const *z){ + cmpp_unpol_e rv = cmpp_unpol_invalid; + if( 0==strcmp(z, "null") ) rv = cmpp_unpol_NULL; + else if( 0==strcmp(z, "error") ) rv = cmpp_unpol_ERROR; + if( pp ){ + if( cmpp_unpol_invalid==rv + && 0==strcmp(z, "current") ){ + rv = cmpp__policy(pp,un); + }else if( cmpp_unpol_invalid==rv ){ + cmpp__err(pp, CMPP_RC_RANGE, + "Invalid undefined key policy value: %s." + " Try one of null|error.", z); + }else{ + cmpp_unpol_set(pp, rv); + } + } + return rv; +} + +CMPP__EXPORT(cmpp_unpol_e, cmpp_unpol_get)(cmpp const * const pp){ + return cmpp__epol(pp,un).na + ? cmpp__policy(pp,un) : cmpp_unpol_DEFAULT; +} + +CMPP__EXPORT(int, cmpp_unpol_set)(cmpp * const pp, cmpp_unpol_e pol){ + if( 0==ppCode ){ + switch(pol){ + case cmpp_unpol_NULL: + case cmpp_unpol_ERROR: + cmpp__policy(pp,un) = pol; + break; + default: + cmpp__err(pp, CMPP_RC_RANGE, "Invalid policy value: %d", + (int)pol); + } + } + return ppCode; +} + +/** + Reminders to self re. savepoint tracking: + + cmpp_dx tracks per-input-source savepoints. We always want + savepoints which are created via scripts to be limited to that + script. cmpp instances, on the other hand, don't care about that. + + Thus we have two different APIs for starting/ending savepoints. 
+*/ +CMPP__EXPORT(int, cmpp_sp_begin)(cmpp *pp){ + if( 0==ppCode ){ + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_spBegin, true); + assert( q || !"db init would have otherwise failed"); + if( q && SQLITE_DONE==cmpp__step(pp, q, true) ){ + ++pp->pimpl->flags.nSavepoint; + } + } + return ppCode; +} + +int cmpp__dx_sp_begin(cmpp_dx * const dx){ + if( 0==dxppCode && 0==cmpp_sp_begin(dx->pp) ){ + ++dx->pimpl->nSavepoint; + } + return dxppCode; +} + +CMPP__EXPORT(int, cmpp_sp_rollback)(cmpp *const pp){ + /* Remember that rollback must (mostly) ignore the + pending error state. */ + if( !pp->pimpl->flags.nSavepoint ){ + if( 0==ppCode ){ + cmpp__err(pp, CMPP_RC_MISUSE, + "Cannot roll back: no active savepoint"); + } + }else{ + sqlite3_stmt * q = cmpp__stmt(pp, CmppStmt_spRollback, true); + assert( q || !"db init would have otherwise failed"); + if( q && SQLITE_DONE==cmpp__step(pp, q, true) ){ + q = cmpp__stmt(pp, CmppStmt_spRelease, true); + if( q && SQLITE_DONE==cmpp__step(pp, q, true) ){ + --pp->pimpl->flags.nSavepoint; + } + } + } + return ppCode; +} + +int cmpp__dx_sp_rollback(cmpp_dx * const dx){ + /* Remember that rollback must (mostly) ignore the pending error state. 
*/ + if( !dx->pimpl->nSavepoint ){ + if( 0==dxppCode ){ + cmpp_dx_err(dx, CMPP_RC_MISUSE, + "Cannot roll back: no active savepoint"); + } + }else{ + cmpp_sp_rollback(dx->pp); + --dx->pimpl->nSavepoint; + } + return dxppCode; +} + +CMPP__EXPORT(int, cmpp_sp_commit)(cmpp * const pp){ + if( 0==ppCode ){ + if( !pp->pimpl->flags.nSavepoint ){ + cmpp__err(pp, CMPP_RC_MISUSE, + "Cannot commit: no active savepoint"); + }else{ + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_spRelease, true); + assert( q || !"db init would have otherwise failed"); + if( q && SQLITE_DONE==cmpp__step(pp, q, true) ){ + --pp->pimpl->flags.nSavepoint; + } + } + }else{ + cmpp_sp_rollback(pp); + } + return ppCode; +} + +int cmpp__dx_sp_commit(cmpp_dx * const dx){ + if( 0==dxppCode ){ + if( !dx->pimpl->nSavepoint ){ + cmpp_dx_err(dx, CMPP_RC_MISUSE, + "Cannot commit: no active savepoint"); + }else if( 0==cmpp_sp_commit(dx->pp) ){ + --dx->pimpl->nSavepoint; + } + } + return dxppCode; +} + +static void cmpp_dx_pimpl_reuse(cmpp_dx_pimpl *p){ +#if 0 + /* no: we need most of the state to remain + intact. 
*/ + cmpp_dx_pimpl const tmp = *p; + *p = cmpp_dx_pimpl_empty; + p->buf = tmp.buf; + p->args = tmp.args; +#endif + cmpp_b_reuse(&p->buf.line); + cmpp_b_reuse(&p->buf.argsRaw); + cmpp_args_reuse(&p->args); +} + +void cmpp_dx_pimpl_cleanup(cmpp_dx_pimpl *p){ + cmpp_b_clear(&p->buf.line); + cmpp_b_clear(&p->buf.argsRaw); + cmpp_args_cleanup(&p->args); + *p = cmpp_dx_pimpl_empty; +} + +void cmpp_dx__reset(cmpp_dx * const dx){ + dx->args = cmpp_dx_empty.args; + cmpp_dx_pimpl_reuse(dx->pimpl); + dx->d = 0; + //no: dx->sourceName = 0; +} + +void cmpp_dx_cleanup(cmpp_dx * const dx){ + unsigned prev = 0; + CmppLvlList_cleanup(&dx->pimpl->dxLvl); + while( dx->pimpl->nSavepoint && prev!=dx->pimpl->nSavepoint ){ + prev = dx->pimpl->nSavepoint; + cmpp__dx_sp_rollback(dx); + } + cmpp_dx_pimpl_cleanup(dx->pimpl); + memset(dx, 0, sizeof(*dx)); +} + +int cmpp__find_closing2(cmpp *pp, + unsigned char const **zPos, + unsigned char const *zEnd, + cmpp_size_t * pNl){ + unsigned char const * z = *zPos; + unsigned char const opener = *z; + unsigned char closer = 0; + switch(opener){ + case '(': closer = ')'; break; + case '[': closer = ']'; break; + case '{': closer = '}'; break; + case '"': case '\'': closer = opener; break; + default: + return cmpp__err(pp, CMPP_RC_MISUSE, + "Invalid starting char (0x%x) for %s()", + (int)opener, __func__); + } + int count = 1; + for( ++z; z < zEnd; ++z ){ + if( closer == *z && 0==--count ){ + /* Have to check this first for the case of "" and ''. 
*/ + break; + }else if( opener == *z ){ + ++count; + }else if( pNl && '\n'==*z ){ + ++*pNl; + } + } + if( closer!=*z ){ + if( 0 ){ + g_warn("Closer=%dd Full range: <<%.*s>>", (int)*z, + (zEnd - *zPos), *zPos); + } + //assert(!"here"); + cmpp__err(pp, CMPP_RC_SYNTAX, + "Unbalanced %c%c: %.*s", + opener, closer, + (int)(z-*zPos), *zPos); + }else{ + if( 0 ){ + g_warn("group: n=%u <<%.*s>>", (z + 1 - *zPos), (z +1 - *zPos), *zPos); + } + *zPos = z; + } + return ppCode; +} + +cmpp_tt cmpp__tt_for_sqlite(int sqType){ + cmpp_tt rv; + switch( sqType ){ + case SQLITE_INTEGER: rv = cmpp_TT_Int; break; + case SQLITE_NULL: rv = cmpp_TT_Null; break; + default: rv = cmpp_TT_String; break; + } + return rv; +} + +int cmpp__define_from_row(cmpp * const pp, sqlite3_stmt * const q, + bool defineIfNoRow){ + if( 0==ppCode ){ + int const nCol = sqlite3_column_count(q); + assert( sqlite3_data_count(q)>0 || defineIfNoRow); + /* Create a #define for each column */ + bool const hasRow = sqlite3_data_count(q)>0; + for( int i = 0; !ppCode && i < nCol; ++i ){ + char const * const zCol = sqlite3_column_name(q, i); + if( hasRow ){ + unsigned char const * const zVal = sqlite3_column_text(q, i); + int const nVal = sqlite3_column_bytes(q, i); + cmpp_tt const ttype = + cmpp__tt_for_sqlite(sqlite3_column_type(q,i)); + cmpp__define2(pp, ustr_c(zCol), -1, zVal, nVal, ttype); + }else if(defineIfNoRow){ + cmpp__define2(pp, ustr_c(zCol), -1, ustr_c(""), 0, cmpp_TT_Null); + }else{ + break; + } + } + } + return ppCode; +} + +cmpp_d const * cmpp__d_search(cmpp *pp, const char *zName){ + cmpp_d const * d = 0;//cmpp__d_search(zName); + if( !d ){ + CmppDList_entry const * e = + CmppDList_search(&pp->pimpl->d.list, zName); + if( e ) d = &e->d; + } + return d; +} + +cmpp_d const * cmpp__d_search3(cmpp *pp, const char *zName, + cmpp_flag32_t what){ + cmpp_d const * d = cmpp__d_search(pp, zName); + if( !d ){ + CmppDList_entry const * e = 0; + if( cmpp__d_search3_F_DELAYED & what ){ + int rc = 
cmpp__d_delayed_load(pp, zName); + if( 0==rc ){ + e = CmppDList_search(&pp->pimpl->d.list, zName); + }else if( CMPP_RC_NO_DIRECTIVE!=rc ){ + assert( ppCode ); + return NULL; + } + } + if( !e + && (cmpp__d_search3_F_AUTOLOADER & what) + && pp->pimpl->d.autoload.f + && 0==pp->pimpl->d.autoload.f(pp, zName, pp->pimpl->d.autoload.state) ){ + e = CmppDList_search(&pp->pimpl->d.list, zName); + } +#if CMPP_D_MODULE + if( !e + && !ppCode + && (cmpp__d_search3_F_DLL & what) ){ + char * z = sqlite3_mprintf("libcmpp-d-%s", zName); + cmpp_check_oom(pp, z); + int rc = cmpp_module_load(pp, z, NULL); + sqlite3_free(z); + if( rc ){ + if( CMPP_RC_NOT_FOUND==rc ){ + cmpp__err_clear(pp); + } + return NULL; + } + e = CmppDList_search(&pp->pimpl->d.list, zName); + } +#endif + if( e ) d = &e->d; + } + return d; +} + +int cmpp_dx_process(cmpp_dx * const dx){ + if( 0==dxppCode ){ + cmpp_d const * const d = cmpp_dx_d(dx); + assert( d ); + if( !cmpp_dx_is_eliding(dx) || (d->flags & cmpp_d_F_FLOW_CONTROL) ){ + if( (cmpp_d_F_NOT_IN_SAFEMODE & d->flags) + && (cmpp_ctor_F_SAFEMODE & dx->pp->pimpl->flags.newFlags) ){ + cmpp_dx_err(dx, CMPP_RC_ACCESS, + "Directive %s%s is disabled by safe mode.", + cmpp_dx_delim(dx), dx->d->name.z); + }else{ + assert(d->impl.callback); + d->impl.callback(dx); + } + } + } + return dxppCode; +} + + +static void cmpp_dx__setup_include_path(cmpp_dx * dx){ + /* Add the leading dir part of dx->sourceName as the + highest-priority include path. It gets removed + in cmpp_dx__teardown(). */ + assert( dx->sourceName ); + enum { BufSize = 512 * 4 }; + unsigned char buf[BufSize] = {0}; + unsigned char *z = &buf[0]; + cmpp_size_t n = cmpp__strlenu(dx->sourceName, -1); + if( n > (unsigned)BufSize-1 ) return; + memcpy(z, dx->sourceName, n); + buf[n] = 0; + cmpp_ssize_t i = n - 1; + for( ; i > 0; --i ){ + if( '/'==z[i] || '\\'==z[i] ){ + z[i] = 0; + n = i; + break; + } + } + if( n>(cmpp_size_t)i ){ + /* No path separator found. Assuming '.'. 
This is intended to + replace the historical behavior of automatically adding '.' if + no -I flags are used. Potential TODO is getcwd() here instead + of using '.' */ + n = 1; + buf[0] = '.'; + buf[1] = 0; + } + int64_t rowid = 0; + cmpp__include_dir_add(dx->pp, (char const*)buf, + dx->pp->pimpl->flags.nDxDepth, + &rowid); + if( rowid ){ + //g_warn("Adding #include path #%" PRIi64 ": %s", rowid, z); + dx->pimpl->shadow.ridInclPath = rowid; + } +} + +static int cmpp_dx__setup(cmpp *pp, cmpp_dx *dx, + unsigned char const * zIn, + cmpp_ssize_t nIn){ + if( 0==ppCode ){ + assert( dx->sourceName ); + assert( dx->pimpl ); + assert( pp==dx->pp ); + nIn = cmpp__strlenu(zIn, nIn); + if( !nIn ) return 0; + pp->pimpl->dx = dx; + dx->pimpl->zBegin = zIn; + dx->pimpl->zEnd = zIn + nIn; + cmpp_define_shadow(pp, "__FILE__", (char const *)dx->sourceName, + &dx->pimpl->shadow.sidFile); + ++dx->pp->pimpl->flags.nDxDepth; + cmpp_dx__setup_include_path(dx); + } + return ppCode; +} + +static void cmpp_dx__teardown(cmpp_dx *dx){ + if( dx->pimpl->shadow.ridInclPath>0 ){ + cmpp__include_dir_rm_id(dx->pp, dx->pimpl->shadow.ridInclPath); + dx->pimpl->shadow.ridInclPath = 0; + } + if( dx->pimpl->shadow.sidFile ){ + cmpp_define_unshadow(dx->pp, "__FILE__", + dx->pimpl->shadow.sidFile); + } + --dx->pp->pimpl->flags.nDxDepth; + cmpp_dx_cleanup(dx); +} + +CMPP__EXPORT(int, cmpp_process_string)( + cmpp *pp, const char * zName, + unsigned char const * zIn, + cmpp_ssize_t nIn +){ + if( !zName ) zName = ""; + if( 0==cmpp__db_init(pp) ){ + cmpp_dx const * const oldDx = pp->pimpl->dx; + cmpp_dx_pimpl dxp = cmpp_dx_pimpl_empty; + cmpp_dx dx = { + .pp = pp, + .sourceName = ustr_c(zName), + .args = cmpp_dx_empty.args, + .pimpl = &dxp + }; + dxp.flags.nextIsCall = pp->pimpl->flags.nextIsCall; + pp->pimpl->flags.nextIsCall = false; + if( dxp.flags.nextIsCall ){ + assert( pp->pimpl->dx ); + dxp.pos.lineNo = pp->pimpl->dx->pimpl->pos.lineNo; + } + bool gotOne = false; + (void)cmpp__stmt(pp, CmppStmt_sdefIns, 
true); + (void)cmpp__stmt(pp, CmppStmt_inclPathAdd, true); + (void)cmpp__stmt(pp, CmppStmt_inclPathRmId, true); + (void)cmpp__stmt(pp, CmppStmt_sdefDel, true) + /* hack: ensure that those queries are allocated now, as an + error in processing may keep them from being created + later. We might want to rethink the + prepare-no-statements-on-error bits, but will have to go back + and fix routines which currently rely on that. */; + cmpp_dx__setup(pp, &dx, zIn, nIn); + while(0==ppCode + && 0==cmpp_dx_next(&dx, &gotOne) + && gotOne){ + cmpp_dx_process(&dx); + } + if(0==ppCode && 0!=dx.pimpl->dxLvl.n){ + CmppLvl const * const lv = CmppLvl_get(&dx); + cmpp_dx_err(&dx, CMPP_RC_SYNTAX, + "Input ended inside an unterminated nested construct " + "opened at [%s] line %" CMPP_SIZE_T_PFMT ".", zName, + lv ? lv->lineNo : (cmpp_size_t)0); + } + cmpp_dx__teardown(&dx); + pp->pimpl->dx = oldDx; + } + if( !ppCode ){ + cmpp_outputer_flush(&pp->pimpl->out) + /* We're going to ignore a result code just this once. */; + } + return ppCode; +} + +int cmpp_process_file(cmpp *pp, const char * zName){ + if( 0==ppCode ){ + FileWrapper fw = FileWrapper_empty; + if( 0==cmpp__FileWrapper_open(pp, &fw, zName, "rb") + && 0==cmpp__FileWrapper_slurp(pp, &fw) ){ + cmpp_process_string(pp, zName, fw.zContent, fw.nContent); + } + FileWrapper_close(&fw); + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_process_stream)(cmpp *pp, const char * zName, + cmpp_input_f src, void * srcState){ + if( 0==ppCode ){ + cmpp_b * const os = cmpp_b_borrow(pp); + int const rc = os + ? 
cmpp_stream(src, srcState, cmpp_output_f_b, os) + : ppCode; + if( 0==rc ){ + cmpp_process_string(pp, zName, os->z, os->n); + }else{ + cmpp__err(pp, rc, "Error reading from input stream '%s'.", zName); + } + cmpp_b_return(pp, os); + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_call_str)( + cmpp *pp, unsigned char const * z, cmpp_ssize_t n, + cmpp_b * dest, cmpp_flag32_t flags +){ + if( ppCode ) return ppCode; + cmpp_args args = cmpp_args_empty; + cmpp_b * const b = cmpp_b_borrow(pp); + cmpp_b * const bo = cmpp_b_borrow(pp); + cmpp_outputer oB = cmpp_outputer_b; + if( !b || !bo ) return ppCode; + cmpp__pi(pp); + oB.state = bo; + oB.name = pi->out.name;//"[call]"; + n = cmpp__strlenu(z, n); + //g_warn("calling: <<%.*s>>", (int)n, z); + unsigned char const * zEnd = z+n; + cmpp_skip_snl(&z, zEnd); + cmpp_skip_snl_trailing(z, &zEnd); + n = (zEnd-z); + if( !n ){ + cmpp_err_set(pp, CMPP_RC_SYNTAX, + "Empty [call] is not permitted."); + goto end; + } + //g_warn("calling: <<%.*s>>", (int)n, z); + cmpp__delim const * const delim = cmpp__pp_delim(pp); + assert(delim); + if( (cmpp_size_t)n<=delim->open.n + || 0!=memcmp(z, delim->open.z, delim->open.n) ){ + /* If it doesn't start with the current delimiter, + prepend one. */ + cmpp_b_reserve3(pp, b, delim->open.n + n + 2); + cmpp_b_append4(pp, b, delim->open.z, delim->open.n); + } + cmpp_b_append4(pp, b, z, n); + if( !ppCode ){ + cmpp_outputer oOld = cmpp_outputer_empty; + pi->flags.nextIsCall = true + /* Convey (indirectly) that the first cmpp_dx_next() call made + via cmpp_process_string() is a call context. 
*/; + cmpp__outputer_swap(pp, &oB, &oOld); + cmpp_process_string(pp, (char*)b->z, b->z, b->n); + cmpp__outputer_swap(pp, &oOld, &oB); + assert( !pi->flags.nextIsCall || ppCode ); + pi->flags.nextIsCall = false; + } + if( !ppCode ){ + unsigned char const * zz = bo->z; + unsigned char const * zzEnd = bo->z + bo->n; + if( cmpp_call_F_TRIM_ALL & flags ){ + cmpp_skip_snl(&zz, zzEnd); + cmpp_skip_snl_trailing(zz, &zzEnd); + }else if( 0==(cmpp_call_F_NO_TRIM & flags) ){ + cmpp_b_chomp(bo); + zzEnd = bo->z + bo->n; + } + if( (zzEnd-zz) ){ + cmpp_b_append4(pp, dest, zz, (zzEnd-zz)); + } + } +end: + cmpp_b_return(pp, b); + cmpp_b_return(pp, bo); + cmpp_args_cleanup(&args); + return ppCode; +} + +CMPP__EXPORT(int, cmpp_errno_rc)(int errNo, int dflt){ + switch(errNo){ + /* Please expand on this as tests/use cases call for it... */ + case 0: + return 0; + case EINVAL: + return CMPP_RC_MISUSE; + case ENOMEM: + return CMPP_RC_OOM; + case EROFS: + case EACCES: + case EBUSY: + case EPERM: + case EDQUOT: + case EAGAIN: + case ETXTBSY: + return CMPP_RC_ACCESS; + case EISDIR: + case ENOTDIR: + return CMPP_RC_TYPE; + case ENAMETOOLONG: + case ELOOP: + case ERANGE: + return CMPP_RC_RANGE; + case ENOENT: + case ESRCH: + return CMPP_RC_NOT_FOUND; + case EEXIST: + case ENOTEMPTY: + return CMPP_RC_ALREADY_EXISTS; + case EIO: + return CMPP_RC_IO; + default: + return dflt; + } +} + +int cmpp_flush_f_FILE(void * _FILE){ + return fflush(_FILE) ? cmpp_errno_rc(errno, CMPP_RC_IO) : 0; +} + +int cmpp_output_f_FILE( void * state, + void const * src, cmpp_size_t n ){ + return (1 == fwrite(src, n, 1, state ? (cmpp_FILE*)state : stdout)) + ? 0 : CMPP_RC_IO; +} + +int cmpp_output_f_fd( void * state, void const * src, cmpp_size_t n ){ + int const fd = *((int*)state); + ssize_t const wn = write(fd, src, n); + return wn<0 ? 
cmpp_errno_rc(errno, CMPP_RC_IO) : 0; +} + +int cmpp_input_f_FILE( void * state, void * dest, cmpp_size_t * n ){ + cmpp_FILE * f = state; + cmpp_size_t const rn = *n; + *n = (cmpp_size_t)fread(dest, 1, rn, f); + return *n==rn ? 0 : (feof(f) ? 0 : CMPP_RC_IO); +} + +int cmpp_input_f_fd( void * state, void * dest, cmpp_size_t * n ){ + int const fd = *((int*)state); + ssize_t const rn = read(fd, dest, *n); + if( rn<0 ){ + return cmpp_errno_rc(errno, CMPP_RC_IO); + }else{ + *n = (cmpp_size_t)rn; + return 0; + } +} + +void cmpp_outputer_cleanup_f_FILE(cmpp_outputer *self){ + if( self->state ){ + cmpp_fclose( self->state ); + self->name = NULL; + self->state = NULL; + } +} + +CMPP__EXPORT(void, cmpp_outputer_cleanup_f_b)(cmpp_outputer *self){ + if( self->state ) cmpp_b_clear(self->state); +} + +CMPP__EXPORT(int, cmpp_outputer_out)(cmpp_outputer *o, void const *p, cmpp_size_t n){ + return o->out ? o->out(o->state, p, n) : 0; +} + +CMPP__EXPORT(int, cmpp_outputer_flush)(cmpp_outputer *o){ + return o->flush ? 
o->flush(o->state) : 0; +} + +CMPP__EXPORT(void, cmpp_outputer_cleanup)(cmpp_outputer *o){ + if( o->cleanup ){ + o->cleanup( o ); + } +} + +CMPP__EXPORT(int, cmpp_stream)( cmpp_input_f inF, void * inState, + cmpp_output_f outF, void * outState ){ + int rc = 0; + enum { BufSize = 1024 * 4 }; + unsigned char buf[BufSize]; + cmpp_size_t rn = BufSize; + while( 0==rc + && (rn==BufSize) + && (0==(rc=inF(inState, buf, &rn))) ){ + if(rn) rc = outF(outState, buf, rn); + } + return rc; +} + +void cmpp__fatalv_base(char const *zFile, int line, + char const *zFmt, va_list va){ + cmpp_FILE * const fp = stderr; + fflush(stdout); + fprintf(fp, "\n%s:%d: ", zFile, line); + if(zFmt && *zFmt){ + vfprintf(fp, zFmt, va); + fputc('\n', fp); + } + fflush(fp); + exit(1); +} + +void cmpp__fatal_base(char const *zFile, int line, + char const *zFmt, ...){ + va_list va; + va_start(va, zFmt); + cmpp__fatalv_base(zFile, line, zFmt, va); + va_end(va); +} + +CMPP__EXPORT(int, cmpp_err_get)(cmpp *pp, char const **zMsg){ + if( zMsg && ppCode ) *zMsg = pp->pimpl->err.zMsg; + return ppCode; +} + +CMPP__EXPORT(int, cmpp_err_take)(cmpp *pp, char **zMsg){ + int const rc = ppCode; + if( rc ){ + *zMsg = pp->pimpl->err.zMsg; + pp->pimpl->err = cmpp_pimpl_empty.err; + } + return rc; +} + +//CMPP_WASM_EXPORT +void cmpp__err_clear(cmpp *pp){ + cmpp_mfree(pp->pimpl->err.zMsg); + pp->pimpl->err = cmpp_pimpl_empty.err; +} + +CMPP__EXPORT(int, cmpp_err_has)(cmpp const * pp){ + return pp ? pp->pimpl->err.code : 0; +} + +CMPP__EXPORT(void, cmpp_dx_pos_save)(cmpp_dx const * dx, cmpp_dx_pos *pos){ + *pos = dx->pimpl->pos; +} + +CMPP__EXPORT(void, cmpp_dx_pos_restore)(cmpp_dx * dx, cmpp_dx_pos const * pos){ + dx->pimpl->pos = *pos; +} + + +//CMPP_WASM_EXPORT +void cmpp__dx_append_script_info(cmpp_dx const * dx, + sqlite3_str * const sstr){ + sqlite3_str_appendf( + sstr, + "%s%s@ %s line %" CMPP_SIZE_T_PFMT, + dx->d ? dx->d->name.z : "", + dx->d ? 
" " : "", + (dx->sourceName + && 0==strcmp("-", (char const *)dx->sourceName)) + ? "" + : (char const *)dx->sourceName, + dx->pimpl->dline.lineNo + ); +} + +int cmpp__errv(cmpp *pp, int rc, char const *zFmt, va_list va){ + if( pp ){ + cmpp__err_clear(pp); + ppCode = rc; + if( 0==rc ) return rc; + if( CMPP_RC_OOM==rc ){ + oom: + pp->pimpl->err.zMsgC = "An allocation failed."; + return pp->pimpl->err.code = CMPP_RC_OOM; + } + assert( !pp->pimpl->err.zMsg ); + if( pp->pimpl->dx || (zFmt && *zFmt) ){ + sqlite3_str * sstr = 0; + sstr = sqlite3_str_new(pp->pimpl->db.dbh); + if( pp->pimpl->dx ){ + cmpp__dx_append_script_info(pp->pimpl->dx, sstr); + sqlite3_str_append(sstr, ": ", 2); + } + if( zFmt && *zFmt ){ + sqlite3_str_vappendf(sstr, zFmt, va); + }else{ + sqlite3_str_appendf(sstr, "No error info provided."); + } + pp->pimpl->err.zMsgC = + pp->pimpl->err.zMsg = sqlite3_str_finish(sstr); + if( !pp->pimpl->err.zMsg ){ + goto oom; + } + }else{ + pp->pimpl->err.zMsgC = "No error info provided."; + } + rc = ppCode; + } + return rc; +} + +//CMPP_WASM_EXPORT no - variadic +int cmpp_err_set(cmpp *pp, int rc, + char const *zFmt, ...){ + if( pp ){ + va_list va; + va_start(va, zFmt); + rc = cmpp__errv(pp, rc, zFmt, va); + va_end(va); + } + return rc; +} + +const cmpp_d_autoloader cmpp_d_autoloader_empty = + cmpp_d_autoloader_empty_m; + +CMPP__EXPORT(void, cmpp_d_autoloader_set)(cmpp *pp, cmpp_d_autoloader const * pNew){ + if( pp->pimpl->d.autoload.dtor ) pp->pimpl->d.autoload.dtor(pp->pimpl->d.autoload.state); + if( pNew ) pp->pimpl->d.autoload = *pNew; + else pp->pimpl->d.autoload = cmpp_d_autoloader_empty; +} + +CMPP__EXPORT(void, cmpp_d_autoloader_take)(cmpp *pp, cmpp_d_autoloader * pOld){ + *pOld = pp->pimpl->d.autoload; + pp->pimpl->d.autoload = cmpp_d_autoloader_empty; +} + +//CMPP_WASM_EXPORT no - variadic +int cmpp_dx_err_set(cmpp_dx *dx, int rc, + char const *zFmt, ...){ + va_list va; + va_start(va, zFmt); + rc = cmpp__errv(dx->pp, rc, zFmt, va); + va_end(va); + return 
rc; +} + +CMPP__EXPORT(int, cmpp_err_set1)(cmpp *pp, int rc, char const *zMsg){ + return cmpp_err_set(pp, rc, (zMsg && *zMsg) ? "%s" : 0, zMsg); +} + +//no: CMPP_WASM_EXPORT +char * cmpp_path_search(cmpp *pp, + char const *zPath, + char pathSep, + char const *zBaseName, + char const *zExt){ + char * zrc = 0; + if( !ppCode ){ + sqlite3_stmt * const q = + cmpp__stmt(pp, CmppStmt_selPathSearch, false); + if( q ){ + unsigned char sep[2] = {pathSep, 0}; + cmpp__bind_text(pp, q, 1, ustr_c(zBaseName)); + cmpp__bind_text(pp, q, 2, sep); + cmpp__bind_text(pp, q, 3, ustr_c((zExt ? zExt : ""))); + cmpp__bind_text(pp, q, 4, ustr_c((zPath ? zPath: ""))); + int const dbrc = cmpp__step(pp, q, false); + if( SQLITE_ROW==dbrc ){ + unsigned char const * s = sqlite3_column_text(q, 1); + zrc = sqlite3_mprintf("%s", s); + cmpp_check_oom(pp, zrc); + } + cmpp__stmt_reset(q); + } + } + return zrc; +} + +#if CMPP__OBUF +int cmpp__obuf_flush(cmpp__obuf * b){ + if( 0==b->rc && b->cursor > b->begin ){ + if( b->dest.out ){ + b->rc = b->dest.out(b->dest.state, b->begin, + b->cursor-b->begin); + } + b->cursor = b->begin; + } + if( 0==b->rc && b->dest.flush ){ + b->rc = b->dest.flush(b->dest.state); + } + return b->rc; +} + +void cmpp__obuf_cleanup(cmpp__obuf * b){ + if( b ){ + cmpp__obuf_flush(b);/*ignoring result*/; + if( b->ownsMemory ){ + cmpp_mfree(b->begin); + } + *b = cmpp__obuf_empty; + } +} + +int cmpp__obuf_write(cmpp__obuf * b, void const * src, cmpp_size_t n){ + assert( b ); + if( n && !b->rc && b->dest.out ){ + assert( b->end ); + assert( b->cursor ); + assert( b->cursor <= b->end ); + assert( b->end>b->begin ); + if( b->cursor + n >= b->end ){ + if( 0==cmpp_flush_f_obuf(b) ){ + if( b->cursor + n >= b->end ){ + /* Go ahead and write it all */ + b->rc = b->dest.out(b->dest.state, src, n); + }else{ + goto copy_it; + } + } + }else{ + copy_it: + memcpy(b->cursor, src, n); + b->cursor += n; + } + } + return b->rc; +} + +int cmpp_flush_f_obuf(void * b){ + return cmpp__obuf_flush(b); +} + 
+int cmpp_output_f_obuf(void * state, void const * src, cmpp_size_t n){ + return cmpp__obuf_write(state, src, n); +} + +void cmpp_outputer_cleanup_f_obuf(cmpp_outputer * o){ + cmpp__obuf_cleanup(o->state); +} +#endif /* CMPP__OBUF */ + +//cmpp__ListType_impl(cmpp__delim_list,cmpp__delim) +//cmpp__ListType_impl(CmppDList,CmppDList_entry*) +//cmpp__ListType_impl(CmppSohList,void*) +cmpp__ListType_impl(CmppArgList,cmpp_arg) +cmpp__ListType_impl(cmpp_b_list,cmpp_b*) +cmpp__ListType_impl(CmppLvlList,CmppLvl*) + +/** + Expects that *ndx points to the current argv entry and that it is a + flag which expects a value. This function checks for --flag=val and + (--flag val) forms. If a value is found then *ndx is adjusted (if + needed) to point to the next argument after the value and *zVal is + * pointed to the value. If no value is found then it returns false. +*/ +static bool get_flag_val(int argc, + char const * const * argv, int * ndx, + char const **zVal){ + char const * zEq = strchr(argv[*ndx], '='); + if( zEq ){ + *zVal = zEq+1; + return 1; + }else if(*ndx+1>=argc){ + return 0; + }else{ + *zVal = argv[++*ndx]; + return 1; + } +} + +static +bool cmpp__arg_is_flag( char const *zFlag, char const *zArg, + char const **zValIfEqX ); +bool cmpp__arg_is_flag( char const *zFlag, char const *zArg, + char const **zValIfEqX ){ + if( zValIfEqX ) *zValIfEqX = 0; + if( 0==strcmp(zFlag, zArg) ) return true; + char const * z = strchr(zArg,'='); + if( z && z>zArg ){ + /* compare the part before the '=' */ + if( 0==strncmp(zFlag, zArg, z-zArg) ){ + if( !zFlag[z-zArg] ){ + if( zValIfEqX ) *zValIfEqX = z+1; + return true; + } + /* Else it was a prefix match. 
*/ + } + } + return false; +} + +void cmpp__dump_defines(cmpp *pp, cmpp_FILE * fp, int bIndent){ + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_defSelAll, false); + if( q ){ + while( SQLITE_ROW==sqlite3_step(q) ){ + int const tt = sqlite3_column_int(q, 0); + unsigned char const * zK = sqlite3_column_text(q, 1); + unsigned char const * zV = sqlite3_column_text(q, 2); + int const nK = sqlite3_column_bytes(q, 1); + int const nV = sqlite3_column_bytes(q, 2); + char const * zTt = cmpp__tt_cstr(tt, true); + if( tt && zTt ) zTt += 3; + else zTt = "String"; + fprintf(fp, "%s%.*s = [%s] %.*s\n", bIndent ? "\t" : "", + nK, zK, zTt, nV, zV); + } + cmpp__stmt_reset(q); + } +} + +/** + This is what was originally the main() of cmpp v1, back when it was + a monolithic app. It still serves as the driver for main() but is + otherwise unused. +*/ +CMPP__EXPORT(int, cmpp_process_argv)(cmpp *pp, int argc, + char const * const * argv){ + if( ppCode ) return ppCode; + int nFile = 0 /* number of files/-e scripts seen */; + +#define ARGVAL if( !zVal && !get_flag_val(argc, argv, &i, &zVal) ){ \ + cmpp__err(pp, CMPP_RC_MISUSE, "Missing value for flag '%s'", \ + argv[i]); \ + break; \ + } +#define M(X) cmpp__arg_is_flag(X, zArg, &zVal) +#define ISFLAG(X) else if(M(X)) +#define ISFLAG2(X,Y) else if(M(X) || M(Y)) +#define NOVAL if( zVal ){ \ + cmpp__err(pp,CMPP_RC_MISUSE,"Unexpected value for %s", zArg); \ + break; \ + } (void)0 + +#define open_output_if_needed \ + if( !pp->pimpl->out.out && cmpp__out_fopen(pp, "-") ) break + + cmpp__staticAssert(TT_None,0==(int)cmpp_TT_None); + cmpp__staticAssert(Mask1, cmpp_d_F_MASK_INTERNAL & cmpp_d_F_FLOW_CONTROL); + cmpp__staticAssert(Mask2, cmpp_d_F_MASK_INTERNAL & cmpp_d_F_NOT_SIMPLIFY); + cmpp__staticAssert(Mask3, 0==(cmpp_d_F_MASK_INTERNAL & cmpp_d_F_MASK)); + + for(int doIt = 0; doIt<2 && 0==ppCode; ++doIt){ + /** + Loop through the flags twice. The first time we just validate + and look for --help/-?. The second time we process the flags. 
+ This approach allows us to easily chain multiple files and + flags: + + ./c-pp -Dfoo -o foo x.y -Ufoo -Dbar -o bar x.y + + Which, it turns out, is a surprisingly useful way to work. + */ +#define DOIT if(1==doIt) + for(int i = 0; i < argc && 0==ppCode; ++i){ + char const * zVal = 0; + int isNoFlag = 0; + char const * zArg = argv[i]; + //g_stderr("i=%d zArg=%s\n", i, zArg); + zVal = 0; + while('-'==*zArg) ++zArg; + if(zArg==argv[i]/*not a flag*/){ + zVal = zArg; + goto do_infile; + } + //g_warn("zArg=%s", zArg); + if( 0==strncmp(zArg,"no-",3) ){ + zArg += 3; + isNoFlag = 1; + } + if( M("?") || M("help") ){ + NOVAL; + cmpp__err(pp, CMPP_RC_HELP, "%s", argv[i]); + break; + }else if('D'==*zArg){ + ++zArg; + if(!*zArg){ + cmpp__err(pp,CMPP_RC_MISUSE,"Missing key for -D"); + }else DOIT { + cmpp_define_legacy(pp, zArg, 0); + } + }else if('F'==*zArg){ + ++zArg; + if(!*zArg){ + cmpp__err(pp,CMPP_RC_MISUSE,"Missing key for -F"); + }else DOIT { + cmpp__set_file(pp, ustr_c(zArg), -1); + } + } + ISFLAG("e"){ + ARGVAL; + DOIT { + ++nFile; + open_output_if_needed; + cmpp_process_string(pp, "-e script", + (unsigned char const *)zVal, -1); + } + }else if('U'==*zArg){ + ++zArg; + if(!*zArg){ + cmpp__err(pp,CMPP_RC_MISUSE,"Missing key for -U"); + }else DOIT { + cmpp_undef(pp, zArg, NULL); + } + }else if('I'==*zArg){ + ++zArg; + if(!*zArg){ + cmpp__err(pp,CMPP_RC_MISUSE,"Missing directory for -I"); + }else DOIT { + cmpp_include_dir_add(pp, zArg); + } + }else if('L'==*zArg){ + ++zArg; + if(!*zArg){ + cmpp__err(pp,CMPP_RC_MISUSE,"Missing directory for -L"); + }else DOIT { + cmpp_module_dir_add(pp, zArg); + } + } + ISFLAG2("o","outfile"){ + ARGVAL; + DOIT { + cmpp__out_fopen(pp, zVal); + } + } + ISFLAG2("f","file"){ + ARGVAL; + do_infile: + DOIT { + if( !pp->pimpl->mod.path.z ){ + cmpp_module_dir_add(pp, NULL); + } + ++nFile; + if( 0 + && !pp->pimpl->flags.nIncludeDir + && cmpp_include_dir_add(pp, ".") ){ + break; + } + open_output_if_needed; + cmpp_process_file(pp, zVal); + } + } + 
ISFLAG("@"){ + NOVAL; + DOIT { + assert( cmpp_atpol_DEFAULT_FOR_FLAG!=cmpp_atpol_OFF ); + cmpp_atpol_set(pp, isNoFlag + ? cmpp_atpol_OFF + : cmpp_atpol_DEFAULT_FOR_FLAG); + } + } + ISFLAG("@policy"){ + ARGVAL; + cmpp_atpol_from_str(pp, zVal); + } + ISFLAG("debug"){ + NOVAL; + DOIT { + pp->pimpl->flags.doDebug += isNoFlag ? -1 : 1; + } + } + ISFLAG2("u","undefined-policy"){ + ARGVAL; + cmpp_unpol_from_str(pp, zVal); + } + ISFLAG("sql-trace"){ + NOVAL; + /* Needs to be set before the start of the second pass, when + the db is inited. */ + DOIT { + pp->pimpl->sqlTrace.expandSql = false; + do_trace_flag: + cmpp_outputer_cleanup(&pp->pimpl->sqlTrace.out); + if( isNoFlag ){ + pp->pimpl->sqlTrace.out = cmpp_outputer_empty; + }else{ + pp->pimpl->sqlTrace.out = cmpp_outputer_FILE; + pp->pimpl->sqlTrace.out.state = stderr; + } + } + } + ISFLAG("sql-trace-x"){ + NOVAL; + DOIT { + pp->pimpl->sqlTrace.expandSql = true; + goto do_trace_flag; + } + } + ISFLAG("chomp-F"){ + NOVAL; + DOIT pp->pimpl->flags.chompF = !isNoFlag; + } + ISFLAG2("d","delimiter"){ + ARGVAL; + DOIT { + cmpp_delimiter_set(pp, zVal); + } + } + ISFLAG2("dd", "dump-defines"){ + DOIT { + cmpp_FILE * const fp = + /* tcl's exec treats output to stderr as failure. + If we use [exec -ignorestderr] then it instead replaces + stderr's output with its own message, invalidating + test expectations. */ + 1 ? stdout : stderr; + fprintf(fp, "All %sdefine entries:\n", + cmpp__pp_zdelim(pp)); + cmpp__dump_defines(pp, fp, 1); + } + } +#if !defined(CMPP_OMIT_D_DB) + ISFLAG2("db", "db-file"){ + /* Undocumented flag used for testing purposes. 
*/ + ARGVAL; + DOIT { + cmpp_db_name_set(pp, zVal); + } + } +#endif + ISFLAG("version"){ + NOVAL; +#if !defined(CMPP_OMIT_FILE_IO) + fprintf(stdout, "c-pp version %s\nwith SQLite %s %s\n", + cmpp_version(), + sqlite3_libversion(), + sqlite3_sourceid()); +#endif + doIt = 100; + break; + } +#if defined(CMPP_MAIN) && !defined(CMPP_MAIN_SAFEMODE) + ISFLAG("safe-mode"){ + if( i>0 ){ + cmpp_err_set(pp, CMPP_RC_MISUSE, + "--%s, if used, must be the first argument.", + zArg); + break; + } + } +#endif + else{ + cmpp__err(pp,CMPP_RC_MISUSE, + "Unhandled flag: %s", argv[i]); + } + } + DOIT { + if(!nFile){ + /* We got no file arguments, so read from stdin. */ + if(0 + && !pp->pimpl->flags.nIncludeDir + && cmpp_include_dir_add(pp, ".") ){ + break; + } + open_output_if_needed; + cmpp_process_file(pp, "-"); + } + } +#undef DOIT + } + return ppCode; +#undef ARGVAL +#undef M +#undef ISFLAG +#undef ISFLAG2 +#undef NOVAL +#undef open_output_if_needed +} + +void cmpp_process_argv_usage(char const *zAppName, cmpp_FILE *fOut){ +#if defined(CMPP_OMIT_FILE_IO) + (void)zAppName; (void)fOut; +#else + fprintf(fOut, "%s version %s\nwith SQLite %s %s\n", + zAppName ? zAppName : "c-pp", + cmpp_version(), + sqlite3_libversion(), + sqlite3_sourceid()); + fprintf(fOut, "Usage: %s [flags] [infile...]\n", zAppName); + fprintf(fOut, + "Flags and filenames may be in any order and " + "they are processed in that order.\n" + "\nFlags:\n"); +#define GAP " " +#define arg(F,D) fprintf(fOut,"\n %s\n" GAP "%s\n",F, D) +#if defined(CMPP_MAIN) && !defined(CMPP_MAIN_SAFEMODE) + arg("--safe-mode", + "Disables preprocessing directives which use the filesystem " + "or invoke external processes. 
If used, it must be the first " + "argument."); +#endif + + arg("-o|--outfile FILE","Send output to FILE (default=- (stdout)).\n" + GAP "Because arguments are processed in order, this should\n" + GAP "normally be given before -f."); + arg("-f|--file FILE","Process FILE (default=- (stdin)).\n" + GAP "All non-flag arguments are assumed to be the input files."); + arg("-e SCRIPT", + "Treat SCRIPT as a complete c-pp input and process it.\n" + GAP "Doing anything marginally useful with this requires\n" + GAP "using it several times, once per directive. It will not\n" + GAP "work with " CMPP_DEFAULT_DELIM "if but is fine for " + CMPP_DEFAULT_DELIM "expr, " + CMPP_DEFAULT_DELIM "assert, and " + CMPP_DEFAULT_DELIM "define."); + arg("-DXYZ[=value]","Define XYZ to the given value (default=1)."); + arg("-UXYZ","Undefine all defines matching glob XYZ."); + arg("-IXYZ","Add dir XYZ to the " CMPP_DEFAULT_DELIM "include path."); + arg("-LXYZ","Add dir XYZ to the loadable module search path."); + arg("-FXYZ=filename", + "Define XYZ to the raw contents of the given file.\n" + GAP "The file is not processed as by " CMPP_DEFAULT_DELIM"include.\n" + GAP "Maybe it should be. Or maybe we need a new flag for that."); + arg("-d|--delimiter VALUE", "Set directive delimiter to VALUE " + "(default=" CMPP_DEFAULT_DELIM ")."); + arg("--@policy retain|elide|error|off", + "Specifies how to handle @tokens@ (default=off).\n" + GAP "off = do not look for @tokens@\n" + GAP "retain = parse @tokens@ and retain any undefined ones\n" + GAP "elide = parse @tokens@ and elide any undefined ones\n" + GAP "error = parse @tokens@ and error out for any undefined ones" + ); + arg("-u|--undefined-policy NAME", + "Sets the policy for how to handle references to undefined key:\n" + GAP "null = treat them as empty/falsy. This is the default.\n" + GAP "error = trigger an error. This should probably be " + "the default." 
+ ); + arg("-@", "Equivalent to --@policy=error."); + arg("-no-@", "Equivalent to --@policy=off (the default)."); + arg("--sql-trace", "Send a trace of all SQL to stderr."); + arg("--sql-trace-x", + "Like --sql-trace but expand all bound values in the SQL."); + arg("--no-sql-trace", "Disable SQL tracing (default)."); + arg("--chomp-F", "One trailing newline is trimmed from files " + "read via -FXYZ=filename."); + arg("--no-chomp-F", "Disable --chomp-F (default)."); +#undef arg +#undef GAP + fputs("\nFlags which require a value accept either " + "--flag=value or --flag value. " + "The exceptions are that the -D... and -F... flags " + "require their '=' to be part of the flag (because they " + "are parsed elsewhere).\n\n",fOut); +#endif /*CMPP_OMIT_FILE_IO*/ +} + +#if defined(CMPP_MAIN) /* add main() */ +int main(int argc, char const * const * argv){ + int rc = 0; + cmpp * pp = 0; + cmpp_flag32_t newFlags = 0 +#if defined(CMPP_MAIN_SAFEMODE) + | cmpp_ctor_F_SAFEMODE +#endif + ; + cmpp_b bArgs = cmpp_b_empty; + sqlite3_config(SQLITE_CONFIG_URI,1); + { + /* Copy argv to a string so we can #define it. This has proven + helpful in testing, debugging, and output validation. */ + for( int i = 0; i < argc; ++i ){ + if( i ) cmpp_b_append_ch(&bArgs,' '); + cmpp_b_append(&bArgs, argv[i], strlen(argv[i])); + } + if( (rc = bArgs.errCode) ) goto end; + if( argc>1 && cmpp__arg_is_flag("--safe-mode", argv[1], NULL) ){ + newFlags |= cmpp_ctor_F_SAFEMODE; + --argc; + ++argv; + } + } + cmpp_ctor_cfg const cfg = { + .flags = newFlags + }; + rc = cmpp_ctor(&pp, &cfg); + if( rc ) goto end; + /** + Define CMPP_MAIN_INIT to the name of a function with the signature + + int (*)(cmpp*) + + to have it called here. The intent is that custom directives can + be installed this way without having to edit this code. 
+ */ +#if defined(CMPP_MAIN_INIT) + extern int CMPP_MAIN_INIT(cmpp*); + if( 0!=(rc = CMPP_MAIN_INIT(pp)) ){ + g_warn0("Initialization via CMPP_MAIN_INIT() failed"); + goto end; + } +#endif +#if defined(CMPP_MAIN_AUTOLOADER) + { + extern int CMPP_MAIN_AUTOLOADER(cmpp*,char const *,void*); + cmpp_d_autoloader al = cmpp_d_autoloader_empty; + al.f = CMPP_MAIN_AUTOLOADER; + cmpp_d_autoloader_set(pp, &al); + } +#endif + if( cmpp_define_v2(pp, "c-pp::argv", (char*)bArgs.z) ) goto end; + cmpp_b_clear(&bArgs); + rc = cmpp_process_argv(pp, argc-1, argv+1); + switch( rc ){ + case 0: break; + case CMPP_RC_HELP: + rc = 0; + cmpp_process_argv_usage(argv[0], stdout); + break; + default: + break; + } +end: + cmpp_b_clear(&bArgs); + if( pp ){ + char const *zErr = 0; + rc = cmpp_err_get(pp, &zErr); + if( rc && CMPP_RC_HELP!=rc ){ + g_warn("error %s: %s", cmpp_rc_cstr(rc), zErr); + } + cmpp_dtor(pp); + }else if( rc && CMPP_RC_HELP!=rc ){ + g_warn("error #%d/%s", rc, cmpp_rc_cstr(rc)); + } + sqlite3_shutdown(); + return rc ? EXIT_FAILURE : EXIT_SUCCESS; +} +#endif /* CMPP_MAIN */ +/* +** 2022-11-12: +** +** The author disclaims copyright to this source code. In place of +** a legal notice, here is a blessing: +** +** * May you do good and not evil. +** * May you find forgiveness for yourself and forgive others. +** * May you share freely, never taking more than you give. +** +************************************************************************ +** This file houses the cmpp_b-related parts of libcmpp. 
+*/ + +const cmpp_b cmpp_b_empty = cmpp_b_empty_m; + +CMPP__EXPORT(int, cmpp_b_append4)(cmpp * const pp, + cmpp_b * const os, + void const * src, + cmpp_size_t n){ + if( !ppCode && cmpp_b_append(os, src, n) ){ + cmpp_check_oom(pp, 0); + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_b_reserve3)(cmpp * const pp, + cmpp_b * const os, + cmpp_size_t n){ + if( !ppCode && cmpp_b_reserve(os, n) ){ + cmpp_check_oom(pp, 0); + } + return ppCode; +} + + +CMPP__EXPORT(void, cmpp_b_clear)(cmpp_b *s){ + if( s->z ) cmpp_mfree(s->z); + *s = cmpp_b_empty; +} + +CMPP__EXPORT(cmpp_b *, cmpp_b_reuse)(cmpp_b * const s){ + if( s->z ){ +#if 1 + memset(s->z, 0, s->nAlloc) + /* valgrind pushes for this, which is curious because + cmpp_b_reserve[3]() memset()s new space to 0. + + Try the following without this block using one commit after + [5f9c31d1da1d] (that'll be the commit that this comment and #if + block were added): + + ##define foo + ##if not defined a + ##/if + ##query define {select ?1 a} bind [1] + + There's a misuse complaint about a jump depending on + uninitialized memory deep under cmpp__is_int(), in strlen(), on + the "define" argument of the ##query. It does not appear if + the lines above it are removed, which indicates that it's at + least semi-genuine. gcc v13.3.0, if it matters. + */; +#else + s->z[0] = 0; +#endif + s->n = 0; + } + s->errCode = 0; + return s; +} + +CMPP__EXPORT(void, cmpp_b_swap)(cmpp_b * const l, cmpp_b * const r){ + if( l!=r ){ + cmpp_b const x = *l; + *l = *r; + *r = x; + } +} + +CMPP__EXPORT(int, cmpp_b_reserve)(cmpp_b *s, cmpp_size_t n){ + if( 0==s->errCode && s->nAlloc < n ){ + void * const m = cmpp_mrealloc(s->z, s->nAlloc + n); + if( m ){ + memset((unsigned char *)m + s->nAlloc, 0, (n - s->nAlloc)) + /* valgrind convincingly recommends this. 
*/; + s->z = m; + s->nAlloc += n; + }else{ + s->errCode = CMPP_RC_OOM; + } + } + return s->errCode; +} + +CMPP__EXPORT(int, cmpp_b_append)(cmpp_b * os, void const *src, + cmpp_size_t n){ + if(0==os->errCode){ + cmpp_size_t const nNeeded = os->n + n + 1; + if( nNeeded>=os->nAlloc && cmpp_b_reserve(os, nNeeded) ){ + assert( CMPP_RC_OOM==os->errCode ); + return os->errCode; + } + memcpy(os->z + os->n, src, n); + os->n += n; + os->z[os->n] = 0; + if( 0 ) { + g_warn("n=%u z=[%.*s] nUsed=%d", (unsigned)n, (int)n, + (char const*) src, (int)os->n); + } + } + return os->errCode; +} + +CMPP__EXPORT(int, cmpp_b_append_ch)(cmpp_b * os, char ch){ + if( 0==os->errCode + && (os->n+1<os->nAlloc + || 0==cmpp_b_reserve(os, os->n+2)) ){ + os->z[os->n++] = (unsigned char)ch; + os->z[os->n] = 0; + } + return os->errCode; +} + +CMPP__EXPORT(int, cmpp_b_append_i32)(cmpp_b * os, int32_t d){ + if( 0==os->errCode ){ + char buf[16] = {0}; + int const n = snprintf(buf, sizeof(buf), "%" PRIi32, d); + cmpp_b_append(os, buf, (unsigned)n); + } + return os->errCode; +} + +CMPP__EXPORT(int, cmpp_b_append_i64)(cmpp_b * os, int64_t d){ + if( 0==os->errCode ){ + char buf[32] = {0}; + int const n = snprintf(buf, sizeof(buf), "%" PRIi64, d); + cmpp_b_append(os, buf, (unsigned)n); + } + return os->errCode; +} + +CMPP__EXPORT(bool, cmpp_b_chomp)(cmpp_b * b){ + return cmpp_chomp(b->z, &b->n); +} + +CMPP__EXPORT(void, cmpp_b_list_cleanup)(cmpp_b_list *li){ + while( li->nAlloc ){ + cmpp_b * const b = li->list[--li->nAlloc]; + if(b){ + cmpp_b_clear(b); + cmpp_mfree(b); + } + } + cmpp_mfree(li->list); + *li = cmpp_b_list_empty; +} + +CMPP__EXPORT(void, cmpp_b_list_reuse)(cmpp_b_list *li){ + while( li->n ){ + cmpp_b * const b = li->list[li->n--]; + if(b) cmpp_b_reuse(b); + } +} + +static cmpp_b * cmpp_b_list_push(cmpp_b_list *li){ + cmpp_b * p = 0; + assert( li->list ? 
li->nAlloc : 0==li->nAlloc ); + if( !cmpp_b_list_reserve(NULL, li, + cmpp__li_reserve1_size(li, 20)) ){ + p = li->list[li->n]; + if( p ){ + cmpp_b_reuse(p); + }else{ + p = cmpp_malloc(sizeof(*p)); + if( p ){ + li->list[li->n++] = p; + *p = cmpp_b_empty; + } + } + } + return p; +} + +/** + bsearch()/qsort() comparison for (cmpp_b**), sorting by size, + largest first and empty slots last. +*/ +static int cmpp_b__cmp_desc(const void *p1, const void *p2){ + cmpp_b const * const eL = *(cmpp_b const **)p1; + cmpp_b const * const eR = *(cmpp_b const **)p2; + if( eL==eR ) return 0; + else if( !eL ) return 1; + else if (!eR ) return -1; + return (int)(/*largest first*/eL->nAlloc - eR->nAlloc); +} + +/** + bsearch()/qsort() comparison for (cmpp_b**), sorting by size, + smallest first and empty slots last. +*/ +static int cmpp_b__cmp_asc(const void *p1, const void *p2){ + cmpp_b const * const eL = *(cmpp_b const **)p1; + cmpp_b const * const eR = *(cmpp_b const **)p2; + if( eL==eR ) return 0; + else if( !eL ) return 1; + else if (!eR ) return -1; + return (int)(/*smallest first*/eR->nAlloc - eL->nAlloc); +} + +/** + Sort li's buffer list using the given policy. NULL entries always + sort last. This is a no-op if how == cmpp_b_list_UNSORTED or + li->n<2. +*/ +static void cmpp_b_list__sort(cmpp_b_list * const li, + enum cmpp_b_list_e how){ + switch( li->n<2 ? 
cmpp_b_list_UNSORTED : how ){ + case cmpp_b_list_UNSORTED: + break; + case cmpp_b_list_DESC: + qsort(li->list, li->n, sizeof(cmpp_b*), cmpp_b__cmp_desc); + break; + case cmpp_b_list_ASC: + qsort(li->list, li->n, sizeof(cmpp_b*), cmpp_b__cmp_asc); + break; + } +} + +CMPP__EXPORT(cmpp_b *, cmpp_b_borrow)(cmpp *pp){ + cmpp__pi(pp); + cmpp_b_list * const li = &pi->recycler.buf; + cmpp_b * b = 0; + if( cmpp_b_list_UNSORTED==pi->recycler.bufSort ){ + pi->recycler.bufSort = cmpp_b_list_DESC; + cmpp_b_list__sort(li, pi->recycler.bufSort); + assert( cmpp_b_list_UNSORTED!=pi->recycler.bufSort + || pi->recycler.buf.n<2 ); + } + for( cmpp_size_t i = 0; i < li->n; ++i ){ + b = li->list[i]; + if( b ){ + li->list[i] = 0; + assert( !b->n && + "Someone wrote to a buffer after giving it back" ); + if( i < li->n-1 ){ + pi->recycler.bufSort = cmpp_b_list_UNSORTED; + } + return cmpp_b_reuse(b); + } + } + /** + Allocate the list entry now and then remove the buffer from it to + "borrow" it. We allocate now, instead of in cmpp_b_return(), so + that that function has no OOM condition (handling it properly in + higher-level code would be a mess). 
+ */ + b = cmpp_b_list_push(li); + if( 0==cmpp_check_oom(pp, b) ) { + assert( b==li->list[li->n-1] ); + li->list[li->n-1] = 0; + } + return b; +} + +CMPP__EXPORT(void, cmpp_b_return)(cmpp *pp, cmpp_b *b){ + if( !b ) return; + cmpp__pi(pp); + cmpp_b_list * const li = &pi->recycler.buf; + for( cmpp_size_t i = 0; i < li->n; ++i ){ + if( !li->list[i] ){ + li->list[i] = cmpp_b_reuse(b); + pi->recycler.bufSort = cmpp_b_list_UNSORTED; + return; + } + } + assert( !"This shouldn't be possible - no slot in recycler.buf" ); + cmpp_b_clear(b); + cmpp_mfree(b); +} + +CMPP__EXPORT(int, cmpp_output_f_b)( + void * state, void const * src, cmpp_size_t n +){ + if( state ){ + return cmpp_b_append(state, src, n); + } + return 0; +} + +#if CMPP__OBUF +int cmpp__obuf_flush(cmpp__obuf * b); +int cmpp__obuf_write(cmpp__obuf * b, void const * src, cmpp_size_t n); +void cmpp__obuf_cleanup(cmpp__obuf * b); +int cmpp_output_f_obuf(void * state, void const * src, cmpp_size_t n); +int cmpp_flush_f_obuf(void * state); +void cmpp_outputer_cleanup_f_obuf(cmpp_outputer * o); +const cmpp__obuf cmpp__obuf_empty = cmpp__obuf_empty_m; +const cmpp_outputer cmpp_outputer_obuf = { + .out = cmpp_output_f_obuf, + .flush = cmpp_flush_f_obuf, + .cleanup = cmpp_outputer_cleanup_f_obuf +}; +#endif +/* +** 2025-11-07: +** +** The author disclaims copyright to this source code. In place of +** a legal notice, here is a blessing: +** +** * May you do good and not evil. +** * May you find forgiveness for yourself and forgive others. +** * May you share freely, never taking more than you give. +** +************************************************************************ +** This file houses the db-related pieces of libcmpp. +*/ + +/** + A proxy for sqlite3_prepare() which updates pp->pimpl->err on error. 
+*/ +static int cmpp__prepare(cmpp *pp, sqlite3_stmt **pStmt, + const char * zSql, ...){ + /* We need for pp->pimpl->stmt.sp* to work regardless of pending errors so + that we can, when appropriate, create the rollback statements. */ + sqlite3_str * str = sqlite3_str_new(pp->pimpl->db.dbh); + char * z = 0; + int n = 0; + va_list va; + assert( pp->pimpl->db.dbh ); + va_start(va, zSql); + sqlite3_str_vappendf(str, zSql, va); + va_end(va); + z = cmpp_str_finish(pp, str, &n); + if( z ){ + int const rc = sqlite3_prepare_v2(pp->pimpl->db.dbh, z, n, pStmt, 0); + cmpp__db_rc(pp, rc, z); + sqlite3_free(z); + } + return ppCode; +} + +sqlite3_stmt * cmpp__stmt(cmpp * pp, enum CmppStmt_e which, + bool prepEvenIfErr){ + if( !pp->pimpl->db.dbh && cmpp__db_init(pp) ) return NULL; + sqlite3_stmt ** q = 0; + char const * zSql = 0; + switch(which){ + default: + cmpp__fatal("Maintenance required: not a valid CmppStmt ID: %d", which); + return NULL; +#define E(N,S) case CmppStmt_ ## N: zSql = S; q = &pp->pimpl->stmt.N; break; + CmppStmt_map(E) +#undef E + } + assert( q ); + assert( zSql && *zSql ); + if( !*q && (!ppCode || prepEvenIfErr) ){ + cmpp__prepare(pp, q, "%s", zSql); + } + return *q; +} + +void cmpp__stmt_reset(sqlite3_stmt * const q){ + if( q ){ + sqlite3_clear_bindings(q); + sqlite3_reset(q); + } +} + +static inline int cmpp__stmt_is_sp(cmpp const * const pp, + sqlite3_stmt const * const q){ + return q==pp->pimpl->stmt.spBegin + || q==pp->pimpl->stmt.spRelease + || q==pp->pimpl->stmt.spRollback; +} + +int cmpp__step(cmpp * const pp, sqlite3_stmt * const q, bool resetIt){ + int rc = SQLITE_ERROR; + assert( q ); + if( !ppCode || cmpp__stmt_is_sp(pp,q) ){ + rc = sqlite3_step(q); + cmpp__db_rc(pp, rc, sqlite3_sql(q)); + } + if( resetIt /* even if ppCode!=0 */ ) cmpp__stmt_reset(q); + assert( 0!=rc ); + return rc; +} + + +/** + Expects an SQLITE_... result code and returns an approximate match + from cmpp_rc_e. 
It specifically treats SQLITE_ROW and SQLITE_DONE + as non-errors, returning 0 for those. +*/ +static int cmpp__db_errcode(sqlite3 * const db, int sqliteCode); +int cmpp__db_errcode(sqlite3 * const db, int sqliteCode){ + (void)db; + int rc = 0; + switch(sqliteCode & 0xff){ + case SQLITE_ROW: + case SQLITE_DONE: + case SQLITE_OK: rc = 0; break; + case SQLITE_NOMEM: rc = CMPP_RC_OOM; break; + case SQLITE_CORRUPT: rc = CMPP_RC_CORRUPT; break; + case SQLITE_TOOBIG: + case SQLITE_FULL: + case SQLITE_RANGE: rc = CMPP_RC_RANGE; break; + case SQLITE_NOTFOUND: rc = CMPP_RC_NOT_FOUND; break; + case SQLITE_PERM: + case SQLITE_AUTH: + case SQLITE_BUSY: + case SQLITE_LOCKED: + case SQLITE_READONLY: rc = CMPP_RC_ACCESS; break; + case SQLITE_CANTOPEN: + case SQLITE_IOERR: rc = CMPP_RC_IO; break; + case SQLITE_NOLFS: rc = CMPP_RC_UNSUPPORTED; break; + default: + //MARKER(("sqlite3_errcode()=0x%04x\n", rc)); + rc = CMPP_RC_DB; break; + } + return rc; +} + +int cmpp__db_rc(cmpp *pp, int dbRc, char const *zMsg){ + switch(dbRc){ + case 0: + case SQLITE_DONE: + case SQLITE_ROW: + return 0; + default: + return cmpp_err_set( + pp, cmpp__db_errcode(pp->pimpl->db.dbh, dbRc), + "SQLite error #%d: %s%s%s", + dbRc, + pp->pimpl->db.dbh + ? sqlite3_errmsg(pp->pimpl->db.dbh) + : "", + zMsg ? ": " : "", + zMsg ? zMsg : "" + ); + } +} + +/** + The base "define" impl. Requires q to be an INSERT for one of the + define tables and have the (t,k,v) columns set up to bind to ?1, + ?2, and ?3. +*/ +static +int cmpp__define_impl(cmpp * const pp, + sqlite3_stmt * const q, + unsigned char const * zKey, + cmpp_ssize_t nKey, + unsigned char const *zVal, + cmpp_ssize_t nVal, + int tType, + bool resetStmt){ + if( 0==ppCode){ + assert( q ); + nKey = cmpp__strlenu(zKey, nKey); + nVal = cmpp__strlenu(zVal, nVal); + if( 0==cmpp__bind_textn(pp, q, 2, zKey, (int)nKey) + && 0==cmpp__bind_int(pp, q, 1, tType) ){ + //g_stderr("zKey=%s\nzVal=%s\nzEq=%s\n", zKey, zVal, zEq); + /* TODO? 
if tType==cmpp_TT_Blob, bind it as a blob */ + if( zVal ){ + if( nVal ){ + cmpp__bind_textn(pp, q, 3, zVal, (int)nVal); + }else{ + /* Arguable */ + cmpp__bind_null(pp, q, 3); + } + }else{ + cmpp__bind_int(pp, q, 3, 1); + } + cmpp__step(pp, q, resetStmt); + g_debug(pp,2,("define: %s [%s]=[%.*s]\n", + cmpp_tt_cstr(tType), zKey, (int)nVal, zVal)); + } + } + return ppCode; +} + +int cmpp__define2(cmpp *pp, + unsigned char const * zKey, + cmpp_ssize_t nKey, + unsigned char const *zVal, + cmpp_ssize_t nVal, + cmpp_tt tType){ + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_defIns, false); + if( q ){ + cmpp__define_impl(pp, q, zKey, nKey, zVal, nVal, tType, true); + } + return ppCode; +} + +/** + The legacy variant of define() which accepts X=Y in zKey. This + continues to exist because it's convenient for passing args from + main(). +*/ +static int cmpp__define_legacy(cmpp *pp, const char * zKey, char const *zVal, + cmpp_tt ttype ){ + + if(ppCode) return ppCode; + CmppKvp kvp = CmppKvp_empty; + if( CmppKvp_parse(pp, &kvp, ustr_c(zKey), -1, + zVal + ? CmppKvp_op_none + : CmppKvp_op_eq1) ) { + return ppCode; + } + if( kvp.v.z ){ + if( zVal ){ + assert(!"cannot happen - CmppKvp_op_none will prevent it"); + return cmpp_err_set(pp, CMPP_RC_MISUSE, + "Cannot assign two values to [%.*s] [%.*s] [%s]", + kvp.k.n, kvp.k.z, kvp.v.n, kvp.v.z, zVal); + } + }else{ + kvp.v.z = (unsigned char const *)zVal; + kvp.v.n = zVal ? (int)strlen(zVal) : 0; + } + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_defIns, false); + if( !q ) return ppCode; + int64_t intCheck = 0; + switch( ttype ){ + case cmpp_TT_Unknown: + if(kvp.v.n){ + if( cmpp__is_int64(kvp.v.z, kvp.v.n, &intCheck) ){ + ttype = cmpp_TT_Int; + if( '+'==*kvp.v.z ){ + ++kvp.v.z; + --kvp.v.n; + } + }else{ + ttype = cmpp_TT_String; + } + }else if( kvp.v.z ){ + ttype = cmpp_TT_String; + }else{ + ttype = cmpp_TT_Int; + intCheck = 1 /* No value ==> value of 1. 
*/; + } + break; + case cmpp_TT_Int: + if( !cmpp__is_int64(kvp.v.z, kvp.v.n, &intCheck) ){ + ttype = cmpp_TT_String; + } + break; + default: + break; + } + if( 0==cmpp__bind_textn(pp, q, 2, kvp.k.z, kvp.k.n) + && 0==cmpp__bind_int(pp, q, 1, ttype) ){ + //g_stderr("zKey=%s\nzVal=%s\nzEq=%s\n", zKey, zVal, zEq); + switch( ttype ){ + case cmpp_TT_Int: + cmpp__bind_int(pp, q, 3, intCheck); + break; + case cmpp_TT_Null: + cmpp__bind_null(pp, q, 3); + break; + default: + cmpp__bind_textn(pp, q, 3, kvp.v.z, (int)kvp.v.n); + break; + } + cmpp__step(pp, q, true); + g_debug(pp,2,("define: [%.*s]=[%.*s]\n", + kvp.k.n, kvp.k.z, + kvp.v.n, kvp.v.z)); + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_define_legacy)(cmpp *pp, const char * zKey, char const *zVal){ + return cmpp__define_legacy(pp, zKey, zVal, cmpp_TT_Unknown); +} + +CMPP__EXPORT(int, cmpp_define_v2)(cmpp *pp, const char * zKey, char const *zVal){ + return cmpp__define2(pp, ustr_c(zKey), -1, ustr_c(zVal), -1, + cmpp_TT_String); +} + +static +int cmpp__define_shadow(cmpp *pp, unsigned char const *zKey, + cmpp_ssize_t nKey, + unsigned char const *zVal, + cmpp_ssize_t nVal, + int ttype, + int64_t * pId){ + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_sdefIns, false); + if( q ){ + if( 0==cmpp__define_impl(pp, q, zKey, nKey, zVal, nVal, ttype, false) + && pId ){ + *pId = sqlite3_column_int64(q, 0); + assert( *pId ); + } + cmpp__stmt_reset(q); + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_define_shadow)(cmpp *pp, char const *zKey, + char const *zVal, int64_t *pId){ + assert( pId ); + return cmpp__define_shadow(pp, ustr_c(zKey), -1, + ustr_c(zVal), -1, cmpp_TT_String, pId); +} + +static +int cmpp__define_unshadow(cmpp *pp, unsigned char const *zKey, + cmpp_ssize_t nKey, int64_t id){ + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_sdefDel, false); + if( q ){ + cmpp__bind_textn(pp, q, 1, zKey, (int)nKey); + cmpp__bind_int(pp, q, 2, id); + cmpp__step(pp, q, true); + } + return ppCode; +} + +CMPP__EXPORT(int, 
cmpp_define_unshadow)(cmpp *pp, char const *zKey, int64_t id){ + return cmpp__define_unshadow(pp, ustr_c(zKey), -1, id); +} + +/* +** This sqlite3_trace_v2() callback outputs tracing info using +** ((cmpp*)c)->sqlTrace.pFile. +*/ +static int cmpp__db_sq3TraceV2(unsigned dx,void*c,void*p,void*x){ + switch(dx){ + case SQLITE_TRACE_STMT:{ + char const * const zSql = x; + cmpp * const pp = c; + cmpp__pi(pp); + if(pi->sqlTrace.out.out){ + char * const zExp = pi->sqlTrace.expandSql + ? sqlite3_expanded_sql((sqlite3_stmt*)p) + : 0; + sqlite3_str * const s = sqlite3_str_new(pi->db.dbh); + if( pi->dx ){ + cmpp__dx_append_script_info(pi->dx, s); + sqlite3_str_appendchar(s, 1, ':'); + sqlite3_str_appendchar(s, 1, ' '); + } + sqlite3_str_appendall(s, zExp ? zExp : zSql); + sqlite3_str_appendchar(s, 1, '\n'); + int const n = sqlite3_str_length(s); + if( n ){ + char * const z = sqlite3_str_finish(s); + if( z ){ + cmpp__out2(pp, &pi->sqlTrace.out, z, (cmpp_size_t)n); + sqlite3_free(z); + } + } + sqlite3_free(zExp); + } + break; + } + } + return 0; +} + +#include <sys/stat.h> +/* +** sqlite3 UDF which returns true if its argument refers to an +** accessible file, else false. +*/ +static void cmpp__udf_file_exists( + sqlite3_context *context, + int argc, + sqlite3_value **argv +){ + const char *zName; + (void)(argc); /* Unused parameter */ + zName = (const char*)sqlite3_value_text(argv[0]); + if( 0!=zName ){ + struct stat sb; + sqlite3_result_int(context, stat(zName, &sb) + ? 
0 + : S_ISREG(sb.st_mode)); + } +} + +static void cmpp__udf_truthy( + sqlite3_context *context, + int argc, + sqlite3_value **argv +){ + (void)(argc); /* Unused parameter */ + assert(1==argc); + int buul = 0; + sqlite3_value * const sv = argv[0]; + switch( sqlite3_value_type(sv) ){ + case SQLITE_NULL: + break; + case SQLITE_FLOAT: + buul = 0.0!=sqlite3_value_double(sv); + break; + case SQLITE_INTEGER: + buul = 0!=sqlite3_value_int(sv); + break; + case SQLITE_TEXT: + case SQLITE_BLOB:{ + int const n = sqlite3_value_bytes(sv); + if( n>1 ) buul = 1; + else if( 1==n ){ + const char *z = + (const char*)sqlite3_value_text(sv); + buul = z + ? 0!=strcmp(z,"0") + : 0; + } + } + } + sqlite3_result_int(context, buul); +} + +/** + SQLite3 UDF which compares its two arguments using memcmp() + semantics. NULL will compare equal to NULL, but less than anything + else. +*/ +static void cmpp__udf_compare( + sqlite3_context *context, + int argc, + sqlite3_value **argv +){ + (void)(argc); /* Unused parameter */ + assert(2==argc); + sqlite3_value * const v1 = argv[0]; + sqlite3_value * const v2 = argv[1]; + unsigned char const * const z1 = sqlite3_value_text(v1); + unsigned char const * const z2 = sqlite3_value_text(v2); + int const n1 = sqlite3_value_bytes(v1); + int const n2 = sqlite3_value_bytes(v2); + int rv; + if( !z1 ){ + rv = z2 ? -1 : 0; + }else if( !z2 ){ + rv = 1; + }else{ + rv = strncmp((char const *)z1, (char const *)z2, n1>n2 ? 
n1 : n2); + } + if(0) g_stderr("udf_compare (%s,%s) = %d\n", z1, z2, rv); + sqlite3_result_int(context, rv); +} + +int cmpp__db_init(cmpp *pp){ + cmpp__pi(pp); + if( pi->db.dbh || ppCode ) return ppCode; + int rc; + char * zErr = 0; + const char * zDrops = + "BEGIN EXCLUSIVE;" + "DROP TABLE IF EXISTS " CMPP__DB_MAIN_NAME ".def;" + "DROP TABLE IF EXISTS " CMPP__DB_MAIN_NAME ".incl;" + "DROP TABLE IF EXISTS " CMPP__DB_MAIN_NAME ".inclpath;" + "DROP TABLE IF EXISTS " CMPP__DB_MAIN_NAME ".predef;" + "DROP TABLE IF EXISTS " CMPP__DB_MAIN_NAME ".ttype;" + "DROP VIEW IF EXISTS " CMPP__DB_MAIN_NAME ".vdef;" + "COMMIT;" + ; + const char * zSchema = + "BEGIN EXCLUSIVE;" + "CREATE TABLE " CMPP__DB_MAIN_NAME ".def(" + /* ^^^ defines */ + "t INTEGER DEFAULT NULL," + /*^^ type: cmpp_tt or NULL */ + "k TEXT PRIMARY KEY NOT NULL," + "v TEXT DEFAULT NULL" + ") WITHOUT ROWID;" + + "CREATE TABLE " CMPP__DB_MAIN_NAME ".incl(" + /* ^^^ files currently being included */ + "file TEXT PRIMARY KEY NOT NULL," + "srcFile TEXT DEFAULT NULL," + "srcLine INTEGER DEFAULT 0" + ") WITHOUT ROWID;" + + "CREATE TABLE " CMPP__DB_MAIN_NAME ".inclpath(" + /* ^^^ include path. We use (ORDER BY priority DESC, rowid) to + make their priority correct. priority should only be set by the + #include directive for its cwd entry. */ + "priority INTEGER DEFAULT 0," /* higher sorts first */ + "dir TEXT UNIQUE NOT NULL ON CONFLICT IGNORE" + ");" + + "CREATE TABLE " CMPP__DB_MAIN_NAME ".modpath(" + /* ^^^ module path. We use ORDER BY ROWID to make their + priority correct. */ + "dir TEXT PRIMARY KEY NOT NULL ON CONFLICT IGNORE" + ");" + + "CREATE TABLE " CMPP__DB_MAIN_NAME ".predef(" + /* ^^^ pre-defines */ + "t INTEGER DEFAULT NULL," /* a cmpp_tt or NULL */ + "k TEXT PRIMARY KEY NOT NULL," + "v TEXT DEFAULT NULL" + ") WITHOUT ROWID;" + "INSERT INTO " CMPP__DB_MAIN_NAME ".predef (t,k,v)" + " VALUES(NULL,'cmpp::version','" CMPP_VERSION "')" + ";" + + /** + sdefs - "scoped defines" or "shadow defines". 
The problem these + solve is the one of supporting a __FILE__ define in cmpp input + sources, such that it remains valid both before and after an + #include, but has a new name in the scope of an #include. We + can't use savepoints for that because they're a nuclear option + affecting _all_ #defines in the #include'd file, whereas we + normally want #defines to stick around across files. + + See cmpp_define_shadow() and cmpp_define_unshadow(). + */ + "CREATE TABLE " CMPP__DB_MAIN_NAME ".sdef(" + "id INTEGER PRIMARY KEY AUTOINCREMENT," + "t INTEGER DEFAULT NULL," /* a cmpp_tt or NULL */ + "k TEXT NOT NULL," + "v TEXT DEFAULT NULL" + ");" + + /** + vdef is a view consolidating the various #define stores. It's + intended to be used for all general-purpose fetching of defines + and it orders the results such that the library's defines + supersede all others, then scoped keys, then client-level + defines. + + To push a new sdef we simply insert into sdef. Then vdef will + order the newest sdef before any entry from the def table. + */ + "CREATE VIEW " CMPP__DB_MAIN_NAME ".vdef(source,t,k,v) AS" + " SELECT NULL,t,k,v FROM " CMPP__DB_MAIN_NAME ".predef" + /* ------^^^^ sorts before numbers */ + " UNION ALL" + " SELECT -rowid,t,k,v FROM " CMPP__DB_MAIN_NAME ".sdef" + /* ^^^^ sorts newest of matching keys first */ + " UNION ALL" + " SELECT 0,t,k,v FROM " CMPP__DB_MAIN_NAME ".def" + " ORDER BY 1, 3" + ";" + +#if 0 + "CREATE TABLE " CMPP__DB_MAIN_NAME ".ttype(" + /* ^^^ token types */ + "t INTEGER PRIMARY KEY NOT NULL," + /*^^ type: cmpp_tt */ + "n TEXT NOT NULL," + /*^^ cmpp_TT_... name. */ + "s TEXT DEFAULT NULL" + /* Symbolic or directive name, if any. */ + ");" +#endif + + "COMMIT;" + "BEGIN EXCLUSIVE;" + ; + cmpp__err_clear(pp); + int openFlags = SQLITE_OPEN_READWRITE; + if( pi->db.zName ){ + openFlags |= SQLITE_OPEN_CREATE; + } + rc = sqlite3_open_v2( + pi->db.zName ? 
pi->db.zName : ":memory:", + &pi->db.dbh, openFlags, 0); + if(rc){ + cmpp__db_rc(pp, rc, pi->db.zName + ? pi->db.zName + : ":memory:"); + sqlite3_close(pi->db.dbh); + pi->db.dbh = 0; + assert(ppCode); + return rc; + } + sqlite3_busy_timeout(pi->db.dbh, 5000); + sqlite3_db_config(pi->db.dbh, SQLITE_DBCONFIG_MAINDBNAME, + CMPP__DB_MAIN_NAME); + rc = sqlite3_trace_v2(pi->db.dbh, SQLITE_TRACE_STMT, + cmpp__db_sq3TraceV2, pp); + if( cmpp__db_rc(pp, rc, "Installing tracer failed") ){ + goto end; + } + //g_warn("Schema:\n%s\n",zSchema); + struct { + /* SQL UDFs */ + char const * const zName; + void (*xUdf)(sqlite3_context *,int,sqlite3_value **); + int arity; + int flags; + } aFunc[] = { + { + .zName = "cmpp_file_exists", + .xUdf = cmpp__udf_file_exists, + .arity = 1, + .flags = SQLITE_UTF8 | SQLITE_DIRECTONLY + }, + { + .zName = "cmpp_truthy", + .xUdf = cmpp__udf_truthy, + .arity = 1, + .flags = SQLITE_UTF8 | SQLITE_DIRECTONLY | SQLITE_DETERMINISTIC + }, + { + .zName = "cmpp_compare", + .xUdf = cmpp__udf_compare, + .arity = 2, + .flags = SQLITE_UTF8 | SQLITE_DIRECTONLY | SQLITE_DETERMINISTIC + } + }; + assert( 0==rc ); + for( unsigned int i = 0; 0==rc && i < sizeof(aFunc)/sizeof(aFunc[0]); ++i ){ + rc = sqlite3_create_function( + pi->db.dbh, aFunc[i].zName, aFunc[i].arity, + aFunc[i].flags, 0, aFunc[i].xUdf, 0, 0 + ); + } + if( cmpp__db_rc(pp, rc, "UDF registration failed.") ){ + return ppCode; + } + if( pi->db.zName ){ + /* Drop all cmpp tables when using a persistent db so that we are + not beholden to a given structure. TODO: a config flag to + toggle this. 
*/ + rc = sqlite3_exec(pi->db.dbh, zDrops, 0, 0, &zErr); + } + if( !rc ){ + rc = sqlite3_exec(pi->db.dbh, zSchema, 0, 0, &zErr); + } + + if( !rc ){ + extern int sqlite3_series_init(sqlite3 *, char **, const sqlite3_api_routines *); + rc = sqlite3_series_init(pi->db.dbh, &zErr, NULL); + } + + if(rc){ + if( zErr ){ + cmpp_err_set(pp, cmpp__db_errcode(pi->db.dbh, rc), + "SQLite error #%d initializing DB: %s", rc, zErr); + sqlite3_free(zErr); + }else{ + cmpp_err_set(pp, cmpp__db_errcode(pi->db.dbh, rc), + "SQLite error #%d initializing DB", rc); + } + goto end; + } + + while(0){ + /* Insert the ttype mappings. We don't yet make use of this but + only for lack of a use case ;). */ + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_insTtype, false); + if( !q ) goto end; +#define E(N,STR) \ + cmpp__bind_int(pp, q, 1, cmpp_TT_ ## N); \ + cmpp__bind_textn(pp, q, 2, \ + ustr_c("cmpp_TT_" # N), sizeof("cmpp_TT_" # N)-1); \ + if( STR ) cmpp__bind_textn(pp, q, 3, ustr_c(STR), sizeof(STR)-1); \ + else cmpp__bind_null(pp, q, 3); \ + if( SQLITE_DONE!=cmpp__step(pp, q, true) ) return ppCode; + cmpp_tt_map(E) +#undef E + sqlite3_finalize(q); + pi->stmt.insTtype = 0; + break; + } + +end: + if( !ppCode ){ + /* + ** Prepare the savepoint statements now so that we cannot later + ** end up in a situation where delayed preparation of one of them + ** fails (e.g. due to OOM or memory corruption). + */ + cmpp__stmt(pp, CmppStmt_spBegin, false); + cmpp__stmt(pp, CmppStmt_spRelease, false); + cmpp__stmt(pp, CmppStmt_spRollback, false); + cmpp__lazy_init(pp); + } + return ppCode; +} +/* +** 2022-11-12: +** +** The author disclaims copyright to this source code. In place of +** a legal notice, here is a blessing: +** +** * May you do good and not evil. +** * May you find forgiveness for yourself and forgive others. +** * May you share freely, never taking more than you give. 
+** +************************************************************************ +** This file houses the core cmpp_dx_f() implementations of libcmpp. +*/ + +static int cmpp__dx_err_just_once(cmpp_dx *dx, cmpp_arg const *arg){ + return cmpp_dx_err_set(dx, CMPP_RC_MISUSE, "'%s' may only be used once.", + arg->z); +} + +/* No-op cmpp_dx_f() impl. */ +static void cmpp_dx_f_noop(cmpp_dx *dx){ + (void)dx; +} + +/** + cmpp_kav_each_f() impl for use by #define {k->v}. +*/ +static int cmpp_kav_each_f_define__group( + cmpp_dx *dx, + unsigned char const *zKey, cmpp_size_t nKey, + unsigned char const *zVal, cmpp_size_t nVal, + void* callbackState +){ + if( (callbackState==dx) + && cmpp_has(dx->pp, (char const*)zKey, nKey) ){ + return dxppCode; + } + return cmpp__define2(dx->pp, zKey, nKey, zVal, nVal, cmpp_TT_String); +} + +/* #error impl. */ +static void cmpp_dx_f_error(cmpp_dx *dx){ + const char *zBegin = (char const *)dx->args.z; + unsigned n = (unsigned)dx->args.nz; + if( n>2 && (('"' ==*zBegin || '\''==*zBegin) && zBegin[n-1]==*zBegin) ){ + ++zBegin; + n -= 2; + } + if( n ){ + cmpp_dx_err_set(dx, CMPP_RC_ERROR, "%.*s", n, zBegin); + }else{ + cmpp_dx_err_set(dx, CMPP_RC_ERROR, "(no additional info)"); + } +} + +/* Impl. for #define. */ +static void cmpp_dx_f_define(cmpp_dx *dx){ + cmpp_d const * const d = dx->d; + assert(d); + if( !dx->args.arg0 ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting one or more arguments"); + return; + } + cmpp_arg const * aKey = 0; + int argNdx = 0; + int nChomp = 0; + unsigned nHeredoc = 0; + unsigned char acHeredoc[128] = {0} /* TODO: cmpp_args_clone() */; + bool ifNotDefined = false /* true if '?' 
arg */; + cmpp_arg const *aAppend = 0; +#define checkIsDefined(ARG) \ + if(ifNotDefined && (cmpp_has(dx->pp, (char const*)ARG->z, ARG->n) \ + || dxppCode)) break + + for( cmpp_arg const * arg = dx->args.arg0; + 0==dxppCode && arg; + arg = arg->next, ++argNdx ){ + //g_warn("arg=%s", arg->z); + if( 0==argNdx && cmpp_arg_equals(arg, "?") ){ + /* Only set the key if it's not already defined. */ + ifNotDefined = true; + continue; + } + switch( arg->ttype ){ + case cmpp_TT_ShiftL3: + ++nChomp; + /* fall through */ + case cmpp_TT_ShiftL: + if( arg->next || argNdx<1 ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Ill-placed '%s'.", arg->z); + }else if( arg->n >= sizeof(acHeredoc)-1 ){ + cmpp_dx_err_set(dx, CMPP_RC_RANGE, + "Heredoc name is too large."); + }else if( !aKey ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Missing key before %s.", + cmpp__tt_cstr(cmpp_TT_ShiftL, false)); + }else{ + assert( aKey ); + nHeredoc = aKey->n; + memcpy(acHeredoc, aKey->z, aKey->n+1/*NUL*/); + } + break; + case cmpp_TT_OpEq: + if( 1 /*seenEq || argNdx!=1 || !arg->next*/ ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Ill-placed '%s'.", arg->z); + break; + } + continue; + case cmpp_TT_StringAt: + if( cmpp__StringAtIsOk(dx->pp, cmpp_atpol_CURRENT) ){ + break; + } + /* fall through */ + case cmpp_TT_Int: + case cmpp_TT_String: + case cmpp_TT_Word: + if( cmpp_arg_isflag(arg,"-chomp") ){ + ++nChomp; + break; + } + if( cmpp_arg_isflag(arg,"-append") + || cmpp_arg_isflag(arg,"-a") ){ + aAppend = arg->next; + if( !aAppend ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting argument for %s", + arg->z); + } + arg = aAppend; + break; + } + if( aKey ){ + /* This is the second arg - the value */ + checkIsDefined(aKey); + cmpp_b * const os = cmpp_b_borrow(dx->pp); + cmpp_b * const ba = aAppend ? 
cmpp_b_borrow(dx->pp) : 0; + while( os ){ + if( ba ){ + cmpp__get_b(dx->pp, aKey->z, aKey->n, ba, false); + if( dxppCode ) break; + if( 0 ){ + g_warn("key=%s\n", aKey->z); + g_warn("ba=%u %.*s\n", ba->n, ba->n, ba->z); + } + } + if( cmpp_arg_to_b(dx, arg, os, + cmpp_arg_to_b_F_BRACE_CALL) ) break; + cmpp_b * const which = (ba && ba->n) ? ba : os; + if( which==ba && os->n ){ + if( ba->n ) cmpp_b_append4(dx->pp, ba, aAppend->z, aAppend->n); + cmpp_b_append4(dx->pp, ba, os->z, os->n); + } + cmpp__define2(dx->pp, aKey->z, aKey->n, which->z, which->n, + arg->ttype); + if( 0 ){ + g_warn("aKey=%u z=[%.*s]\n", aKey->n, (int)aKey->n, aKey->z); + g_warn("nExp=%u z=[%.*s]\n", which->n, (int)which->n, which->z); + } + break; + } + cmpp_b_return(dx->pp, os); + cmpp_b_return(dx->pp, ba); + aKey = 0; + }else if( cmpp_TT_Word!=arg->ttype ){ + cmpp_dx_err_set(dx, CMPP_RC_TYPE, + "Expecting a define-name token here."); + }else if( arg->next ){ + aKey = arg; + }else{ + /* No value = a value of 1. */ + checkIsDefined(arg); + cmpp__define2(dx->pp, arg->z, arg->n, + ustr_c("1"), 1, cmpp_TT_Int); + } + break; + case cmpp_TT_GroupSquiggly: + assert( !acHeredoc[0] ); + if( (ifNotDefined ? argNdx>1 : argNdx>0) || arg->next ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "{...} must be the only argument.") + /* This is for simplicity's sake. */; + }else{ + cmpp_kav_each(dx, arg->z, arg->n, + cmpp_kav_each_f_define__group, + ifNotDefined ? dx : NULL, + cmpp_kav_each_F_NOT_EMPTY + | cmpp_kav_each_F_CALL_VAL + | cmpp_kav_each_F_PARENS_EXPR + //TODO cmpp_kav_each_F_IF_UNDEF + ); + } + aKey = 0; + break; + case cmpp_TT_GroupParen:{ + if( !aKey ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "(...) 
is not permitted as a key."); + break; + } + checkIsDefined(aKey); + int d = 0; + if( 0==cmpp__arg_evalSubToInt(dx, arg, &d) ){ + char exprBuf[32] = {0}; + cmpp_size_t nVal = + (cmpp_size_t)snprintf(&exprBuf[0], + sizeof(exprBuf), "%d", d); + assert(nVal>0); + cmpp__define2(dx->pp, aKey->z, aKey->n, + ustr_c(&exprBuf[0]), nVal, cmpp_TT_Int); + } + break; + } + case cmpp_TT_GroupBrace:{ + if( !aKey ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "[...] is not permitted as a key."); + break; + } + checkIsDefined(aKey); + cmpp_b * const b = cmpp_b_borrow(dx->pp); + if( b && 0==cmpp_call_str(dx->pp, arg->z, arg->n, b, 0) ){ + cmpp__define2(dx->pp, aKey->z, aKey->n, + b->z, b->n, cmpp_TT_AnyType); + } + cmpp_b_return(dx->pp, b); + break; + } + default: + // TODO: treat (...) as an expression + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Unhandled arg type %s: %s", + cmpp__tt_cstr(arg->ttype, true), arg->z); + break; + } + } + if( 0==nHeredoc && nChomp ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "-chomp can only be used with <<."); + } + if( 0==dxppCode && nHeredoc ){ + // Process (#define KEY <<) + cmpp_b * const os = cmpp_b_borrow(dx->pp); + assert( dx->d->closer ); + if( os && + 0==cmpp_dx_consume_b(dx, os, &dx->d->closer, 1, + cmpp_dx_consume_F_PROCESS_OTHER_D) ){ + while( nChomp-- && cmpp_b_chomp(os) ){} + g_debug(dx->pp,2,("define heredoc: [%s]=[%.*s]\n", + acHeredoc, (int)os->n, os->z)); + if( !ifNotDefined + || !cmpp_has(dx->pp, (char const*)acHeredoc, nHeredoc) ){ + cmpp__define2( + dx->pp, acHeredoc, nHeredoc, os->z, os->n, cmpp_TT_String + ); + } + } + cmpp_b_return(dx->pp, os); + } +#undef checkIsDefined + return; +} + +/* Impl. 
for #undef */ +static void cmpp_dx_f_undef(cmpp_dx *dx){ + if( !dx->args.arg0 ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting one or more arguments"); + return; + } + cmpp_d const * const d = dx->d; + for( cmpp_arg const * arg = dx->args.arg0; + 0==dxppCode && arg; + arg = arg->next ){ + if( 0 ){ + g_stderr(" %s: %s %p n=%d %.*s\n", d->name.z, + cmpp__tt_cstr(arg->ttype, true), arg->z, + (int)arg->n, (int)arg->n, arg->z); + } + if( cmpp_TT_Word==arg->ttype ){ +#if 0 + /* Too strict? */ + if( 0==cmpp__legal_key_check(dx->pp, arg->z, + (cmpp_ssize_t)arg->n, false) ) { + cmpp_undef(dx->pp, (char const *)arg->z); + } +#else + cmpp_undef(dx->pp, (char const *)arg->z, NULL); +#endif + }else{ + cmpp_err_set(dx->pp, CMPP_RC_MISUSE, "Invalid arg for %s: %s", + d->name.z, arg->z); + } + } +} + +/* Impl. for #once. */ +static void cmpp_dx_f_once(cmpp_dx *dx){ + cmpp_d const * const d = dx->d; + assert(d); + assert(d->closer); + if( dx->args.arg0 ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting no arguments"); + return; + } + cmpp_dx_pimpl * const dxp = dx->pimpl; + cmpp_b * const b = cmpp_b_borrow(dx->pp); + if( !b ) return; + cmpp_b_append_ch(b, '#'); + cmpp_b_append4(dx->pp, b, d->name.z, d->name.n); + cmpp_b_append_ch(b, ':'); + cmpp__get_b(dx->pp, ustr_c("__FILE__"), 8, b, true) + /* Wonky return semantics. 
*/; + if( b->errCode + || dxppCode + || cmpp_b_append_ch(b, ':') + || cmpp_b_append_i32(b, (int)dxp->pos.lineNo) ){ + goto end; + } + //g_debug(dx->pp,1,("#once key: %s", b->z)); + int const had = cmpp_has(dx->pp, (char const *)b->z, b->n); + if( dxppCode ) goto end; + else if( had ){ + CmppLvl * const lvl = CmppLvl_push(dx); + if( lvl ){ + CmppLvl_elide(lvl, true); + cmpp_outputer devNull = cmpp_outputer_empty; + cmpp_dx_consume(dx, &devNull, &d->closer, 1, + cmpp_dx_consume_F_PROCESS_OTHER_D); + CmppLvl_pop(dx, lvl); + } + }else if( !cmpp_define_v2(dx->pp, (char const*)b->z, "1") ){ + cmpp_dx_consume(dx, NULL, &d->closer, 1, + cmpp_dx_consume_F_PROCESS_OTHER_D); + } +end: + cmpp_b_return(dx->pp, b); + return; +} + + +/* Impl. for #/define, #/query, #/pipe. */ +CMPP__EXPORT(void, cmpp_dx_f_dangling_closer)(cmpp_dx *dx){ + cmpp_d const * const d = dx->d; + char const * const zD = cmpp_dx_delim(dx); + dxserr("%s%s used without its opening directive.", + zD, d->name.z); +} + +#ifndef CMPP_OMIT_D_INCLUDE +static int cmpp__including_has(cmpp *pp, unsigned const char * zName){ + int rc = 0; + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_inclHas, false); + if( q && 0==cmpp__bind_text(pp, q, 1, zName) ){ + if(SQLITE_ROW == cmpp__step(pp, q, true)){ + rc = 1; + }else{ + rc = 0; + } + g_debug(pp,2,("inclpath has [%s] = %d\n",zName, rc)); + } + return rc; +} + +/** + Returns a resolved path of PREFIX+'/'+zKey, where PREFIX is one of + the `#include` dirs (cmpp_include_dir_add()). If no file match is + found, NULL is returned. The returned memory must eventually be passed to + cmpp_mfree() to free it. 
+*/ +static char * cmpp__include_search(cmpp *pp, unsigned const char * zKey, + int * nVal){ + char * zName = 0; + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_inclSearch, false); + if( nVal ) *nVal = 0; + if( q && 0==cmpp__bind_text(pp, q, 1, zKey) ){ + int const rc = cmpp__step(pp, q, false); + if(SQLITE_ROW==rc){ + const unsigned char * z = sqlite3_column_text(q, 0); + int const n = sqlite3_column_bytes(q,0); + zName = n ? sqlite3_mprintf("%.*s", n, z) : 0; + if( n ) cmpp_check_oom(pp, zName); + if( nVal ) *nVal = n; + } + cmpp__stmt_reset(q); + } + return zName; +} + +/** + Removes zKey from the currently-being-`#include`d + list. +*/ +static int cmpp__include_rm(cmpp *pp, unsigned const char * zKey){ + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_inclDel, false); + if( q ){ + cmpp__bind_text(pp, q, 1, ustr_c(zKey)); + cmpp__step(pp, q, true); + g_debug(pp,2,("incl rm [%s]\n", zKey)); + } + return ppCode; +} + +#if 0 +/* +** Sets pp's error state if the `#include` list contains the given +** key. +*/ +static int cmpp__including_check(cmpp *pp, const char * zKey); +int cmpp__including_check(cmpp *pp, const char * zName){ + if( !ppCode ){ + if(cmpp__including_has(pp, zName)){ + cmpp__err(pp, CMPP_RC_MISUSE, + "Recursive include detected: %s\n", zName); + } + } + return ppCode; +} +#endif + + +/** + Adds the given filename to the list of being-`#include`d files, + recording the given source file name and line number for + error-reporting purposes if recursion is later detected. +*/ +static int cmpp__including_add(cmpp *pp, unsigned const char * zKey, + unsigned const char * zSrc, cmpp_size_t srcLine){ + sqlite3_stmt * const q = cmpp__stmt(pp, CmppStmt_inclIns, false); + if( q ){ + cmpp__bind_text(pp, q, 1, zKey); + cmpp__bind_text(pp, q, 2, zSrc); + cmpp__bind_int(pp, q, 3, srcLine); + cmpp__step(pp, q, true); + g_debug(pp,2,("is-including-file add [%s] from [%s]:%" + CMPP_SIZE_T_PFMT "\n", zKey, zSrc, srcLine)); + } + return ppCode; +} + +/* Impl. 
for #include. */ +static void cmpp_dx_f_include(cmpp_dx *dx){ + char * zResolved = 0; + int nResolved = 0; + cmpp_b * const ob = cmpp_b_borrow(dx->pp); + bool raw = false; + cmpp_args args = cmpp_args_empty; + if( !ob || cmpp_dx_args_clone(dx, &args) ){ + goto end; + } + assert(args.pimpl && args.pimpl->pp==dx->pp); + cmpp_arg const * arg = args.arg0; + for( ; arg; arg = arg->next){ +#define FLAG(X)if( cmpp_arg_isflag(arg, X) ) + FLAG("-raw"){ + raw = true; + continue; + } + break; +#undef FLAG + } + if( !arg ){ + cmpp_dx_err_set(dx, CMPP_RC_SYNTAX, + "Expecting at least one filename argument."); + } + for( ; !dxppCode && arg; arg = arg->next ){ + cmpp_flag32_t a2bf = cmpp_arg_to_b_F_BRACE_CALL; + if( cmpp_TT_Word==arg->ttype && cmpp__arg_wordIsPathOrFlag(arg) ){ + a2bf |= cmpp_arg_to_b_F_NO_DEFINES; + } + if( cmpp_arg_to_b(dx, arg, cmpp_b_reuse(ob), a2bf) ){ + break; + } + //g_stderr("zFile=%s zResolved=%s\n", zFile, zResolved); + if(!raw && cmpp__including_has(dx->pp, ob->z)){ + /* Note that different spellings of the same filename + ** will elude this check, but that seems okay, as different + ** spellings means that we're not re-running the exact same + ** invocation. We might want some other form of multi-include + ** protection, rather than this, however. There may well be + ** sensible uses for recursion. 
*/ + cmpp_dx_err_set(dx, CMPP_RC_RANGE, "Recursive include of file: %s", + ob->z); + break; + } + cmpp_mfree(zResolved); + nResolved = 0; + zResolved = cmpp__include_search(dx->pp, ob->z, &nResolved); + if(!zResolved){ + if( !dxppCode ){ + cmpp_dx_err_set(dx, CMPP_RC_NOT_FOUND, "file not found: %s", ob->z); + } + break; + } + if( raw ){ + if( !dx->pp->pimpl->out.out ) break; + FILE * const fp = cmpp_fopen(zResolved, "r"); + if( fp ){ + int const rc = cmpp_stream(cmpp_input_f_FILE, fp, + dx->pp->pimpl->out.out, + dx->pp->pimpl->out.state); + if( rc ){ + cmpp_dx_err_set(dx, rc, "Unknown error streaming file %s.", + arg->z); + } + cmpp_fclose(fp); + }else{ + cmpp_dx_err_set(dx, cmpp_errno_rc(errno, CMPP_RC_IO), + "Unknown error opening file %s.", arg->z); + } + }else{ + cmpp__including_add(dx->pp, ob->z, ustr_c(dx->sourceName), + dx->pimpl->dline.lineNo); + cmpp_process_file(dx->pp, zResolved); + cmpp__include_rm(dx->pp, ob->z); + } + } +end: + cmpp_mfree(zResolved); + cmpp_args_cleanup(&args); + cmpp_b_return(dx->pp, ob); +} +#endif /* #ifndef CMPP_OMIT_D_INCLUDE */ + +/** + cmpp_dx_f() callback state for cmpp_dx_f_if(): pointers to the + various directives of that family. +*/ +struct CmppIfState { + cmpp_d * dIf; + cmpp_d * dElif; + cmpp_d * dElse; + cmpp_d const * dEndif; +}; +typedef struct CmppIfState CmppIfState; + +/* Version 2 of #if. */ +static void cmpp_dx_f_if(cmpp_dx *dx){ + /* Reminder to self: + + We need to be able to recurse, even in skip mode, for #if nesting + to work. That's not great because it means we are evaluating + stuff we ideally should be skipping over, but it's keeping the + current tests working as-is. We can/do, however, avoid evaluating + expressions and such when recursing via skip mode. If we can + eliminate that here, by keeping track of the #if stack depth, + then we can possibly eliminate the whole CmppLvl_F_ELIDE + flag stuff. 
+ + The more convoluted version 1 #if (which this replaced not hours + ago) kept track of the skip state across a separate directive + function for #if and #/if. That was more complex but did avoid + having to recurse into #if in order to straighten out #elif and + #else. Update: tried a non-recursive variant involving moving + this function's gotTruth into the CmppLvl object() and + managing the CmppLvl stack here, but it just didn't want to + work for me and i was too tired to figure out why. + */ + int gotTruth = 0 /*expr result*/; + CmppIfState const * const cis = dx->d->impl.state; + cmpp_d const * dClosers[] = { + cis->dElif, cis->dElse, cis->dEndif + }; + CmppLvl * lvl = 0; + CmppDLine const dline = dx->pimpl->dline; + cmpp_args args = cmpp_args_empty; + char delim[20] = {0}; +#define skipOn CmppLvl_elide((lvl), true) +#define skipOff CmppLvl_elide((lvl), false) + + assert( dx->d==cis->dIf ); + if( !dx->args.arg0 ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, "Expecting an expression."); + return; + } + snprintf(delim, sizeof(delim), "%s", cmpp_dx_delim(dx)); + delim[sizeof(delim)-1] = 0; + lvl = CmppLvl_push(dx); + if( !lvl ) goto end; + if( cmpp_dx_is_eliding(dx) ){ + gotTruth = 1; + }else if( cmpp__args_evalToInt(dx, &dx->pimpl->args, &gotTruth) ){ + goto end; + }else if( !gotTruth ){ + skipOn; + } + + cmpp_d const * dPrev = dx->d; + cmpp_outputer devNull = cmpp_outputer_empty; + while( !dxppCode ){ + dPrev = dx->d; + bool const isFinal = dPrev==cis->dElse + /* true if expecting an #/if. */; + if( cmpp_dx_consume(dx, + CmppLvl_is_eliding(lvl) ? &devNull : NULL, + isFinal ? &cis->dEndif : dClosers, + isFinal ? 
1 : sizeof(dClosers)/sizeof(dClosers[0]), + cmpp_dx_consume_F_PROCESS_OTHER_D) ){ + break; + } + cmpp_d const * const d2 = dx->d; + if( !d2 ){ + dxserr("Reached end of input in an unterminated %s%s opened " + "at line %" CMPP_SIZE_T_PFMT ".", + delim, cis->dIf->name.z, dline.lineNo); + } + if( d2==cis->dEndif ){ + break; + }else if( isFinal ){ + assert(!"cannot happen - caught by consume()"); + dxserr("Expecting %s%s to close %s%s.", + delim, cis->dEndif->name.z, + delim, dPrev->name.z); + break; + }else if( gotTruth ){ + skipOn; + continue; + }else if( d2==cis->dElif ){ + if( 0==cmpp_dx_args_parse(dx, &args) + && 0==cmpp__args_evalToInt(dx, &args, &gotTruth) ){ + if( gotTruth ) skipOff; + else skipOn; + } + continue; + }else{ + assert( d2==cis->dElse + && "Else (haha!) we cannot have gotten here" ); + skipOff; + continue; + } + assert(!"unreachable"); + } + +#undef skipOff +#undef skipOn +end: + cmpp_args_cleanup(&args); + if( lvl ){ + bool const lvlIsOk = CmppLvl_get(dx)==lvl; + CmppLvl_pop(dx, lvl); + if( !lvlIsOk && !dxppCode ){ + assert(!"i naively believe that this is not possible"); + cmpp_dx_err_set(dx, CMPP_RC_SYNTAX, + "Mis-terminated %s%s opened at line " + "%" CMPP_SIZE_T_PFMT ".", + delim, cis->dIf->name.z, dline.lineNo); + } + } + return; +} + +/* Version 2 of #elif, #else, and #/if. 
*/ +static void cmpp_dx_f_if_dangler(cmpp_dx *dx){ + CmppIfState const * const cis = dx->d->impl.state; + char const *zDelim = cmpp_dx_delim(dx); + cmpp_dx_err_set(dx, CMPP_RC_SYNTAX, + "%s%s with no matching %s%s", + zDelim, dx->d->name.z, + zDelim, cis->dIf->name.z); +} + +static void cmpp__dump_sizeofs(cmpp_dx*dx){ + (void)dx; +#define SO(X) printf("sizeof(" # X ") = %u\n", (unsigned)sizeof(X)) + SO(cmpp); + SO(cmpp_api_thunk); + SO(cmpp_arg); + SO(cmpp_args); + SO(cmpp_args_pimpl); + SO(cmpp_b); + SO(cmpp_d); + SO(cmpp_d_reg); + SO(cmpp__delim); + SO(cmpp__delim_list); + SO(cmpp_dx); + SO(cmpp_dx_pimpl); + SO(cmpp_outputer); + SO(cmpp_pimpl); + SO(((cmpp_pimpl*)0)->stmt); + SO(((cmpp_pimpl*)0)->policy); + SO(CmppArgList); + SO(CmppDLine); + SO(CmppDList); + SO(CmppDList_entry); + SO(CmppLvl); + SO(CmppSnippet); + SO(PodList__atpol); + printf("cmpp_TT__last = %d\n", + cmpp_TT__last); +#undef SO +} + + +/* Impl. for #pragma. */ +static void cmpp_dx_f_pragma(cmpp_dx *dx){ + cmpp_arg const * arg = dx->args.arg0; + if(!arg){ + cmpp_dx_err_set(dx, CMPP_RC_SYNTAX, "Expecting an argument"); + return; + }else if(arg->next){ + cmpp_dx_err_set(dx, CMPP_RC_SYNTAX, "Too many arguments"); + return; + } + const char * const zArg = (char const *)arg->z; +#define M(X) 0==strcmp(zArg,X) + if(M("defines")){ + cmpp__dump_defines(dx->pp, stderr, 1); + }else if(M("sizeof")){ + cmpp__dump_sizeofs(dx); + }else if(M("chomp-F")){ + dx->pp->pimpl->flags.chompF = 1; + }else if(M("no-chomp-F")){ + dx->pp->pimpl->flags.chompF = 0; + }else if(M("api-thunk")){ + /* Generate macros for CMPP_API_THUNK and friends from + cmpp_api_thunk_map. */ + char const * zName = "CMPP_API_THUNK_NAME"; + char buf[256]; +#define out(FMT,...) snprintf(buf, sizeof(buf), FMT,__VA_ARGS__); \ + cmpp_dx_out_raw(dx, buf, strlen(buf)) + if( 0 ){ + out("/* libcmpp API thunk. 
*/\n" + "static cmpp_api_thunk const * %s = 0;\n" + "#define cmpp_api_init(PP) %s = (PP)->api\n", zName, zName); + } +#define A(V) \ + if(V<=cmpp_api_thunk_version) { \ + out("/* Thunk APIs which follow are available as of " \ + "version %d... */\n",V); \ + } +#define V(N,T,V) +#define F(N,T,P) out("#define cmpp_%s %s->%s\n", # N, zName, # N); +#define O(N,T) out("#define cmpp_%s (*%s->%s)\n", # N, zName, # N); +cmpp_api_thunk_map(A,V,F,O) +#undef V +#undef F +#undef O +#undef A +#undef out + }else{ + cmpp_dx_err_set(dx, CMPP_RC_NOT_FOUND, "Unknown pragma: %s", zArg); + } +#undef M +} + +/* Impl. for #savepoint. */ +static void cmpp_dx_f_savepoint(cmpp_dx *dx){ + if(!dx->args.arg0 || dx->args.arg0->next){ + cmpp_dx_err_set(dx, CMPP_RC_SYNTAX, "Expecting one argument"); + }else{ + const char * const zArg = (const char *)dx->args.arg0->z; +#define M(X) else if( 0==strcmp(zArg,X) ) + if( 0 ){} + M("begin"){ + cmpp__dx_sp_begin(dx); + } + M("rollback"){ + cmpp__dx_sp_rollback(dx); + }M("commit"){ + cmpp__dx_sp_commit(dx); + }else{ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Unknown savepoint option: %s", zArg); + } + } +#undef M +} + +/* #stderr impl. */ +static void cmpp_dx_f_stderr(cmpp_dx *dx){ + if(dx->args.z){ + g_stderr("%s:%" CMPP_SIZE_T_PFMT ": %.*s\n", dx->sourceName, + dx->pimpl->dline.lineNo, + (int)dx->args.nz, dx->args.z); + }else{ + cmpp_d const * d = dx->d; + g_stderr("%s:%" CMPP_SIZE_T_PFMT ": (no %s%s argument)\n", + dx->sourceName, dx->pimpl->dline.lineNo, + cmpp_dx_delim(dx), d->name.z); + } +} + +/** + Manages both the @token@ policy and the delimiters. + + #@ ?push? 
policy NAME ?<args.arg0; + if( !arg ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting arguments."); + return; + } + enum ops { op_none, op_set, op_push, op_pop, op_heredoc }; + enum popWhichE { pop_policy = 0x01, pop_delim = 0x02, + pop_both = pop_policy | pop_delim }; + enum ops op = op_none /* what to do */; + int popWhich = 0 /* what to pop */; + bool gotPolicy = false; + bool checkedCallForm = !cmpp_dx_is_call(dx); + cmpp_arg const * argDelimO = 0 /* @token@ opener */; + cmpp_arg const * argDelimC = 0 /* @token@ closer */; + cmpp__pi(dx->pp); + cmpp_atpol_e polNew = cmpp_atpol_get(dx->pp); + for( ; arg; arg = arg ? arg->next : NULL ){ + //g_warn("arg=%s", arg->z); + if( !checkedCallForm ){ + assert( cmpp_dx_is_call(dx) ); + checkedCallForm = true; + if( cmpp_arg_equals(arg, "policy") ){ + char const * z = + cmpp__atpol_name(dx->pp, cmpp__policy(dx->pp,at)); + if( z ){ + cmpp_dx_out_raw(dx, z, strlen(z)); + } + }else if( cmpp_arg_equals(arg, "delimiter") ){ + char const * zO = 0; + char const * zC = 0; + cmpp_atdelim_get(dx->pp, &zO, &zC); + if( zC ){ + cmpp_dx_out_raw(dx, zO, strlen(zO)); + cmpp_dx_out_raw(dx, " ", 1); + cmpp_dx_out_raw(dx, zC, strlen(zC)); + } + goto end; + }else{ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "In call form, '%s' expects one of " + "'policy' or 'delimiter'.", + dx->d->name.z); + } + goto end; + }/* checkedCallForm */ + if( !argDelimC && op_none==op ){ + /* Look for push|pop. 
*/ + if( cmpp_arg_equals(arg, "pop") ){ + arg = arg->next; + if( !arg ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "'pop' expects arguments of 'policy' " + "and/or 'delimiter' and/or 'both'."); + goto end; + } + for( ; arg; arg = arg->next ){ + if( 0==(pop_policy & popWhich) + && cmpp_arg_equals(arg, "policy") ){ + popWhich |= pop_policy; + }else if( 0==(pop_delim & popWhich) + && cmpp_arg_equals(arg, "delimiter") ){ + popWhich |= pop_delim; + }else if( 0==(pop_both & popWhich) + && cmpp_arg_equals(arg, "both") ){ + popWhich |= pop_both; + }else{ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Invalid argument to 'pop': %s", arg->z); + goto end; + } + } + assert( !arg ); + op = op_pop; + break; + }/* pop */ + if( cmpp_arg_equals(arg, "push") ){ + op = op_push; + continue; + } + if( cmpp_arg_equals(arg, "set") ){ + /* 'set' is implied if neither push nor pop is given and we get + a policy name. */ + op = op_set; + continue; + } + /* Fall through */ + }/* !argDelimC && op_none==op */ + if( !gotPolicy && cmpp_arg_equals(arg, "policy") ){ + arg = arg->next; + if( !arg ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "'policy' requires a policy name argument."); + goto end; + } + polNew = cmpp_atpol_from_str(NULL, (char const*)arg->z); + if( cmpp_atpol_invalid==polNew ){ + cmpp_atpol_from_str(dx->pp, (char const*)arg->z) + /* Will set the error state to something informative. */; + goto end; + } + if( op_none==op ) op = op_set; + gotPolicy = true; + continue; + } + if( !argDelimC && cmpp_arg_equals(arg, "delimiter") ){ + assert( !argDelimO && !argDelimC ); + argDelimO = arg->next; + argDelimC = argDelimO ? 
argDelimO->next : NULL; + if( !argDelimC ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "'delimiter' requires two arguments."); + goto end; + } + arg = argDelimC->next; + continue; + } + if( op_pop!=op ){ + if( cmpp_arg_equals(arg,"<<") ){ + if( arg->next ) { + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "'%s' must be the final argument.", + arg->z); + goto end; + } + op = op_heredoc; + break; + } + } + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, "Unhandled argument: %s", arg->z); + return; + }/*arg collection*/ + + assert( !dxppCode ); + assert( cmpp_atpol_invalid!=polNew ); + +#define popcheck(LIST) \ + if(dxppCode) goto end; \ + if(!LIST.n) goto bad_pop + + if( op_pop==op ){ + assert( popWhich>0 && popWhich<=3 ); + if( pop_policy & popWhich ){ + popcheck(pi->policy.at); + cmpp_atpol_pop(dx->pp); + } + if( pop_delim & popWhich ){ + popcheck(pi->delim.at); + cmpp_atdelim_pop(dx->pp); + } + goto end; + } + + assert( op_set==op || op_push==op || op_heredoc==op ); + if( argDelimC ){ + /* Push or set the @token@ delimiters */ + if( 0 ){ + g_warn("%s @delims@: %s %s", (op_set==op) ? "set" : "push", + argDelimO->z, argDelimC->z); + } + if( op_push==op || op_heredoc==op ){ + if( cmpp_atdelim_push(dx->pp, (char const*)argDelimO->z, + (char const*)argDelimC->z) ){ + goto end; + } + argDelimO = 0 /* Re-use argDelimC as a flag in case we need to + roll this back on an error below. */; + }else{ + assert( op_set==op ); + if( cmpp_atdelim_set(dx->pp, (char const*)argDelimO->z, + (char const*)argDelimC->z) ){ + goto end; + } + argDelimO = argDelimC = 0; + } + } + + assert( !dxppCode ); + assert( !argDelimO ); + if( op_heredoc==op ){ + if( cmpp_atpol_push(dx->pp, polNew) ){ + if( argDelimC ){ + popcheck(pi->delim.at); + cmpp_atdelim_pop(dx->pp); + } + }else{ + bool const pushedDelim = NULL!=argDelimC; + assert( dx->d->closer ); + cmpp_dx_consume(dx, NULL, &dx->d->closer, 1, + cmpp_dx_consume_F_PROCESS_OTHER_D) + /* !Invalidates argDelimO and argDelimC! 
*/; + popcheck(pi->policy.at); + cmpp_atpol_pop(dx->pp); + if( pushedDelim ) cmpp_atdelim_pop(dx->pp); + } + }else if( op_push==op ){ + if( cmpp_atpol_push(dx->pp, polNew) && argDelimC ){ + /* Roll back delimiter push */ + cmpp_atdelim_pop(dx->pp); + } + }else{ + assert( op_set==op ); + if( cmpp__policy(dx->pp,at)!=polNew ){ + cmpp_atpol_set(dx->pp, polNew); + } + } +end: + return; +bad_pop: + cmpp_dx_err_set(dx, CMPP_RC_RANGE, + "Cannot pop an empty stack."); +#undef popcheck +} + + +static void cmpp_dx_f_expr(cmpp_dx *dx){ + int rv = 0; + assert( dx->args.z ); + if( 0 ){ + g_stderr("%s() argc=%d arg0 [%.*s]\n", __func__, dx->args.argc, + dx->args.arg0->n, dx->args.arg0->z); + g_stderr("%s() dx->args.z [%.*s]\n", __func__, + (int)dx->args.nz, dx->args.z); + } + if( !dx->args.argc ){ + dxserr("An empty expression is not permitted."); + return; + } +#if 0 + for( cmpp_arg const * a = dx->args.arg0; a; a = a->next ){ + g_stderr("got type=%s n=%u z=%.*s\n", + cmpp__tt_cstr(a->ttype, true), + (unsigned)a->n, (int)a->n, a->z); + } +#endif + if( 0==cmpp__args_evalToInt(dx, &dx->pimpl->args, &rv) ){ + if( 'a'==dx->d->name.z[0] ){ + if( !rv ){ + cmpp_dx_err_set(dx, CMPP_RC_ASSERT, "Assertion failed: %s", + dx->pimpl->buf.argsRaw.z); + } + }else{ + char buf[60]; + snprintf(buf, sizeof(buf), "%d\n", rv); + cmpp_dx_out_raw(dx, buf, strlen(buf)); + } + } +} + +static void cmpp_dx_f_undef_policy(cmpp_dx *dx){ + cmpp_unpol_e up = cmpp_unpol_invalid; + int nSeen = 0; + cmpp_arg const * arg = dx->args.arg0; + enum ops { op_set, op_push, op_pop }; + enum ops op = op_set; + if( !dx->args.argc ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting one of: error, null"); + return; + } +again: + ++nSeen; + if( cmpp_arg_equals(arg,"error") ) up = cmpp_unpol_ERROR; + else if( cmpp_arg_equals(arg,"null") ) up = cmpp_unpol_NULL; + else if( 1==nSeen ){ + if( cmpp_arg_equals(arg, "push") ){ + if( !arg->next ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting argument to 'push'."); + return; 
+ } + op = op_push; + arg = arg->next; + goto again; + }else if( cmpp_arg_equals(arg, "pop") ){ + if( arg->next ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Extra argument after 'pop': %s", + arg->next->z); + return; + } + op = op_pop; + } + } + if( op_pop!=op && cmpp_unpol_invalid==up ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Unhandled undefined-policy '%s'." + " Try one of: error, null", + arg->z); + }else if( op_set==op ){ + cmpp_unpol_set(dx->pp, up); + }else if( op_push==op ){ + cmpp_unpol_push(dx->pp, up); + }else{ + assert( op_pop==op ); + if( !cmpp__epol(dx->pp,un).n ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "No %s%s push is active.", + cmpp_dx_delim(dx), dx->d->name.z); + }else{ + cmpp_unpol_pop(dx->pp); + } + } +} + +#ifndef CMPP_OMIT_D_DB +/* Impl. for #attach. */ +static void cmpp_dx_f_attach(cmpp_dx *dx){ + if( 3!=dx->args.argc ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "%s expects: STRING as NAME", dx->d->name.z); + return; + } + cmpp_arg const * pNext = 0; + cmpp_b osDbFile = cmpp_b_empty; + cmpp_b osSchema = cmpp_b_empty; + for( cmpp_arg const * arg = dx->args.arg0; + 0==dxppCode && arg; + arg = pNext ){ + pNext = arg->next; + if( !osDbFile.n ){ + if( 0==cmpp_arg_to_b(dx, arg, &osDbFile, + cmpp_arg_to_b_F_BRACE_CALL) + && !osDbFile.n ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Empty db file name is not permitted. " + "If '%s' is intended as a value, " + "it should be quoted.", arg->z); + break; + } + assert( pNext ); + if( !pNext || !cmpp_arg_equals(pNext, "as") ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting 'as' after db file name."); + break; + } + pNext = pNext->next; + }else if( !osSchema.n ){ + if( 0==cmpp_arg_to_b(dx, arg, &osSchema, + cmpp_arg_to_b_F_BRACE_CALL) + && !osSchema.n ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Empty db schema name is not permitted. " 
+ "If '%s' is intended as a value, " + "it should be quoted.", + arg->z); + break; + } + } + } + if( dxppCode ) goto end; + if( !osSchema.n ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, "Missing schema name."); + goto end; + } + sqlite3_stmt * const q = + cmpp__stmt(dx->pp, CmppStmt_dbAttach, false); + if( q ){ + cmpp__bind_textn(dx->pp, q, 1, osDbFile.z, osDbFile.n); + cmpp__bind_textn(dx->pp, q, 2, osSchema.z, osSchema.n); + cmpp__step(dx->pp, q, true); + } +end: + cmpp_b_clear(&osDbFile); + cmpp_b_clear(&osSchema); +} + +/* Impl. for #detach. */ +static void cmpp_dx_f_detach(cmpp_dx *dx){ + cmpp_d const * d = dx->d; + if( 1!=dx->args.argc ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "%s expects: NAME", d->name.z); + return; + } + cmpp_arg const * const arg = dx->args.arg0; + cmpp_b os = cmpp_b_empty; + if( cmpp_arg_to_b(dx, arg, &os, cmpp_arg_to_b_F_BRACE_CALL) ){ + goto end; + } + if( !os.n ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Empty db schema name is not permitted."); + goto end; + } + sqlite3_stmt * const q = + cmpp__stmt(dx->pp, CmppStmt_dbDetach, false); + if( q ){ + cmpp__bind_textn(dx->pp, q, 1, os.z, os.n); + cmpp__step(dx->pp, q, true); + } +end: + cmpp_b_clear(&os); +} +#endif /* #ifndef CMPP_OMIT_D_DB */ + +static void cmpp_dx_f_delimiter(cmpp_dx *dx){ + cmpp_arg const * arg = dx->args.arg0; + enum ops { op_none, op_set, op_push, op_pop }; + enum ops op = op_none; + cmpp_arg const * argD = 0; + bool doHeredoc = false; + bool const isCall = cmpp_dx_is_call(dx); + for( ; arg; arg = arg->next ){ + if( op_none==op ){ + /* Look for push|pop. 
*/ + if( cmpp_arg_equals(arg, "push") ){ + op = op_push; + continue; + }else if( cmpp_arg_equals(arg, "pop") ){ + if( arg->next ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "'pop' expects no arguments."); + return; + } + op = op_pop; + break; + } + /* Fall through */ + } + if( !argD ){ + if( op_none==op ) op = op_set; + argD = arg; + continue; + }else if( !doHeredoc && cmpp_arg_equals(arg,"<<") ){ + if( isCall ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "'%s' is not legal in [call] form.", arg->z); + return; + }else if( arg->next ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "'%s' must be the final argument.", arg->z); + return; + } + op = op_push; + doHeredoc = true; + continue; + } + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, "Unhandled arg: %s", arg->z); + return; + } + if( op_pop==op ){ + cmpp_delimiter_pop(dx->pp); + }else if( !argD ){ + if( isCall ){ + cmpp__delim const * const del = cmpp__dx_delim(dx); + if( del ) cmpp_dx_out_raw(dx, del->open.z, del->open.n); + }else{ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, "No delimiter specified."); + } + return; + }else{ + char const * const z = + (0==strcmp("default",(char*)argD->z)) + ? NULL + : (char const*)argD->z; + if( op_push==op ){ + cmpp_delimiter_push(dx->pp, (char const*)argD->z); + }else{ + assert( op_set==op ); + if( doHeredoc ) cmpp_delimiter_push(dx->pp, z); + else cmpp_delimiter_set(dx->pp, z); + } + } + if( !cmpp_dx_err_check(dx) ){ + if( isCall ){ + cmpp__delim const * const del = cmpp__dx_delim(dx); + if( del ) cmpp_dx_out_raw(dx, del->open.z, del->open.n); + }else if( doHeredoc ){ + assert( op_push==op ); + cmpp_dx_consume(dx, NULL, &dx->d->closer, 1, + cmpp_dx_consume_F_PROCESS_OTHER_D); + cmpp_delimiter_pop(dx->pp); + } + } +} + +#ifndef NDEBUG +/* Experimenting grounds. 
*/ +static void cmpp_dx_f_experiment(cmpp_dx *dx){ + void * st = dx->d->impl.state; + (void)st; + g_warn("raw args: %s", dx->pimpl->buf.argsRaw.z); + g_warn("argc=%u", dx->args.argc); + g_warn("isCall=%d\n", cmpp_dx_is_call(dx)); + if( 1 ){ + for( cmpp_arg const * a = dx->args.arg0; a; a = a->next ){ + g_stderr("got type=%s n=%u z=%.*s\n", + cmpp__tt_cstr(a->ttype, true), + (unsigned)a->n, (int)a->n, a->z); + } + } + if( 0 ){ + int rv = 0; + if( 0==cmpp__args_evalToInt(dx, &dx->pimpl->args, &rv) ){ + g_stderr("expr result: %d\n", rv); + } + } + + if( 0 ){ + char const * zIn = "a strspn test @# and @"; + g_stderr("strlen : %u\n", (unsigned)strlen(zIn)); + g_stderr("strspn 1: %u, %u\n", + (unsigned)strspn(zIn, "#@"), + (unsigned)strspn(zIn, "@#")); + g_stderr("strcspn 2: %u, %u\n", + (unsigned)strcspn(zIn, "#@"), + (unsigned)strcspn(zIn, "@#")); + g_stderr("strcspn 3: %u, %u\n", + (unsigned)strcspn(zIn, "a strspn"), + (unsigned)strcspn(zIn, "nope")); + } + + if( 1 ){ + cmpp__dump_sizeofs(dx); + } +} +#endif /* #ifndef NDEBUG */ + +#ifndef CMPP_OMIT_D_DB + +/** + Helper for #query and friends. Expects arg to be an SQL value. If + arg->next is "bind" then this consumes the following two arguments + ("bind" BIND_ARG), where BIND_ARG must be either + cmpp_TT_GroupSquiggly or cmpp_TT_GroupBrace. + + If it returns 0 then: + + - If "bind" was found then *pBind is set to the BIND_ARG argument + and *pNext is set to the one after that. + + - Else *pBind is set to NULL and *pNext is set to + arg->next. + + In either case, *pNext may be set to NULL. +*/ +static +int cmpp__consume_sql_args(cmpp *pp, cmpp_arg const *arg, + cmpp_arg const **pBind, + cmpp_arg const **pNext){ + if( 0==ppCode ){ + *pBind = 0; + cmpp_arg const *pN = arg->next; + if( pN && cmpp_arg_equals(pN, "bind") ){ + pN = pN->next; + if( !pN || ( + cmpp_TT_GroupSquiggly!=pN->ttype + && cmpp_TT_GroupBrace!=pN->ttype + ) ){ + return serr("Expecting {...} or [...] 
after 'bind'."); + } + *pBind = pN; + *pNext = pN->next; + } else { + *pBind = 0; + *pNext = pN; + } + } + return ppCode; +} + +/** + cmpp_kav_each_f() impl used by #query's `bind {...}` argument. +*/ +static int cmpp_kav_each_f_query__bind( + cmpp_dx * const dx, + unsigned char const * const zKey, cmpp_size_t nKey, + unsigned char const * const zVal, cmpp_size_t nVal, + void * const callbackState +){ + /* Expecting: :bindName -> bindValue */ + if( ':'!=zKey[0] && '$'!=zKey[0] /*&& '@'!=zKey[0]*/ ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Bind keys must start with ':' or '$'."); + }else{ + sqlite3_stmt * const q = callbackState; + assert( q ); + int const bindNdx = + sqlite3_bind_parameter_index(q, (char const*)zKey); + if( bindNdx ){ + cmpp__bind_textn(dx->pp, q, bindNdx, zVal, nVal); + }else{ + cmpp_err_set(dx->pp, CMPP_RC_RANGE, "Invalid bind name: %.*s", + (int)nKey, zKey); + } + } + return dxppCode; +} + +int cmpp__bind_group(cmpp_dx * const dx, sqlite3_stmt * const q, + cmpp_arg const * const aGroup){ + if( dxppCode ) return dxppCode; + if( cmpp_TT_GroupSquiggly==aGroup->ttype ){ + return cmpp_kav_each( + dx, aGroup->z, aGroup->n, + cmpp_kav_each_f_query__bind, q, + cmpp_kav_each_F_NOT_EMPTY + | cmpp_kav_each_F_CALL_VAL + | cmpp_kav_each_F_PARENS_EXPR + ); + } + if( cmpp_TT_GroupBrace!=aGroup->ttype ){ + return cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting {...} or [...] " + "for SQL binding list."); + } + int bindNdx = 0; + cmpp_args args = cmpp_args_empty; + cmpp_args_parse(dx, &args, aGroup->z, aGroup->n, 0); + if( !args.argc && !dxppCode ){ + cmpp_err_set(dx->pp, CMPP_RC_RANGE, + "Empty SQL bind list is not permitted."); + /* Keep going so we can clean up a partially-parsed args. 
*/ + } + for( cmpp_arg const * aVal = args.arg0; + !dxppCode && aVal; + aVal = aVal->next ){ + ++bindNdx; + if( 0 ){ + g_warn("bind #%d %s <<%s>>", bindNdx, + cmpp__tt_cstr(aVal->ttype, true), aVal->z); + } + cmpp__bind_arg(dx, q, bindNdx, aVal); + } + cmpp_args_cleanup(&args); + return dxppCode; +} + +/** #query impl */ +static void cmpp_dx_f_query(cmpp_dx *dx){ + //cmpp_d const * d = cmpp_dx_d(dx); + if( !dx->args.arg0 ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting one or more arguments"); + return; + } + cmpp * const pp = dx->pp; + sqlite3_stmt * q = 0; + cmpp_b * const obBody = cmpp_b_borrow(dx->pp); + cmpp_b * const sql = cmpp_b_borrow(dx->pp); + cmpp_outputer obNull = cmpp_outputer_empty; + //cmpp_b obBindArgs = cmpp_b_empty; + cmpp_args args = cmpp_args_empty + /* We need to copy the args or do some arg-type-specific work to + copy the memory for specific cases. */; + int nChomp = 0; + bool spStarted = false; + bool seenDefine = false; + bool batchMode = false; + cmpp_arg const * pNext = 0; + cmpp_arg const * aBind = 0; + cmpp_d const * const dNoRows = dx->d->impl.state; + cmpp_d const * const dClosers[2] = {dx->d->closer, dNoRows}; + + if( !obBody || !sql ) goto cleanup; + + assert( dNoRows ); + if( cmpp_dx_args_clone(dx, &args) ){ + goto cleanup; + } + //g_warn("args.argc=%d", args.argc); + for( cmpp_arg const * arg = args.arg0; + 0==dxppCode && arg; + arg = pNext ){ + //g_warn("arg=%s <<%s>>", cmpp_tt_cstr(arg->ttype), arg->z); + pNext = arg->next; + if( cmpp_arg_equals(arg, "define") ){ + if( seenDefine ){ + cmpp__dx_err_just_once(dx, arg); + goto cleanup; + } + seenDefine = true; + continue; + } + if( cmpp_arg_equals(arg, "-chomp") ){ + ++nChomp; + continue; + } + if( cmpp_arg_equals(arg, "-batch") ){ + if( batchMode ){ + cmpp__dx_err_just_once(dx, arg); + goto cleanup; + } + batchMode = true; + continue; + } + if( !sql->n ){ + if( cmpp__consume_sql_args(pp, arg, &aBind, &pNext) ){ + goto cleanup; + } + if( cmpp_arg_to_b(dx, arg, sql, 
cmpp_arg_to_b_F_BRACE_CALL) ){ + goto cleanup; + } + //g_warn("SQL: <<%s>>", sql->z); + continue; + } + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, "Unhandled arg: %s", arg->z); + goto cleanup; + } + if( ppCode ) goto cleanup; + if( seenDefine ){ + if( nChomp ){ + serr("-chomp and define may not be used together."); + goto cleanup; + }else if( batchMode ){ + serr("-batch and define may not be used together."); + goto cleanup; + } + } + if( !sql->n ){ + serr("Expecting an SQL-string argument."); + goto cleanup; + } + + if( batchMode ){ + if( aBind ){ + serr("Bindable values may not be used with -batch."); + goto cleanup; + } + char *zErr = 0; + cmpp__pi(dx->pp); + int rc = sqlite3_exec(pi->db.dbh, (char const *)sql->z, 0, 0, &zErr); + rc = cmpp__db_rc(dx->pp, rc, zErr); + sqlite3_free(zErr); + goto cleanup; + } + + if( cmpp__db_rc(pp, sqlite3_prepare_v2( + pp->pimpl->db.dbh, (char const *)sql->z, + (int)sql->n, &q, 0), 0) ){ + goto cleanup; + }else if( !q ){ + cmpp_err_set(pp, CMPP_RC_RANGE, + "Empty SQL is not permitted."); + goto cleanup; + } + //g_warn("SQL via stmt: <<%s>>", sqlite3_sql(q)); + int const nCol = sqlite3_column_count(q); + if( !nCol ){ + cmpp_err_set(pp, CMPP_RC_RANGE, + "SQL does not have any result columns."); + goto cleanup; + } + if( !seenDefine ){ + if( cmpp_sp_begin(pp) ) goto cleanup; + spStarted = true; + } + + if( aBind && cmpp__bind_group(dx, q, aBind) ){ + goto cleanup; + } + + bool gotARow = false; + cmpp_dx_pos dxPosStart; + cmpp_flag32_t const consumeFlags = cmpp_dx_consume_F_PROCESS_OTHER_D; + cmpp_dx_pos_save(dx, &dxPosStart); + int const nChompOrig = nChomp; + while( 0==ppCode ){ + int const dbrc = cmpp__step(pp, q, false); + if( SQLITE_ROW==dbrc ){ + nChomp = nChompOrig; + gotARow = true; + if( cmpp__define_from_row(pp, q, false) ) break; + if( seenDefine ) break; + cmpp_dx_pos_restore(dx, &dxPosStart); + cmpp_b_reuse(obBody); + /* If it weren't for -chomp, we wouldn't need to + buffer this. 
*/ + if( cmpp_dx_consume_b(dx, obBody, dClosers, + sizeof(dClosers)/sizeof(dClosers[0]), + consumeFlags) ){ + goto cleanup; + } + assert( dx->d == dClosers[0] || dx->d == dClosers[1] ); + while( nChomp-- && cmpp_b_chomp(obBody) ){} + if( obBody->n && cmpp_dx_out_raw(dx, obBody->z, obBody->n) ) break; + if( dx->d == dNoRows ){ + if( cmpp_dx_consume(dx, &obNull, dClosers, 1/*one!*/, + consumeFlags) ){ + goto cleanup; + } + assert( dx->d == dClosers[0] ); + /* TODO? chomp? */ + } + continue; + } + if( 0==ppCode && seenDefine ){ + /* If we got here, there was no result row. */ + cmpp__define_from_row(pp, q, true); + } + break; + }/*result row loop*/ + cmpp__stmt_reset(q); + if( ppCode ) goto cleanup; + + while( !seenDefine && !gotARow ){ + /* No result rows. Skip past the body, emitting the #query:no-rows + content, if any. We disable @token processing for that first + step because (A) the output is not going anywhere, so no need + to expand it (noting that expanding may have side effects via + @[call...]@) and (B) the @tokens@ referring to this query's + results will not have been set because there was no row to set + them from, so @expanding@ them would fail. */ + cmpp_atpol_e const atpol = cmpp_atpol_get(dx->pp); + if( cmpp_atpol_set(dx->pp, cmpp_atpol_OFF) ) break; + cmpp_dx_consume(dx, &obNull, dClosers, + sizeof(dClosers)/sizeof(dClosers[0]), + consumeFlags); + cmpp_atpol_set(dx->pp, atpol); + if( dxppCode ) break; + assert( dx->d == dClosers[0] || dx->d == dClosers[1] ); + if( dx->d == dNoRows ){ + if( cmpp_dx_consume(dx, 0, dClosers, 1/*one!*/, + consumeFlags) ){ + break; + } + assert( dx->d == dClosers[0] ); + /* TODO? chomp? */ + } + break; + } + +cleanup: + cmpp_args_cleanup(&args); + cmpp_b_return(dx->pp, obBody); + cmpp_b_return(dx->pp, sql); + sqlite3_finalize(q); + if( spStarted ) cmpp_sp_rollback(pp); +} +#endif /* #ifndef CMPP_OMIT_D_DB */ + +#ifndef CMPP_OMIT_D_PIPE +/** #pipe impl. 
*/ +static void cmpp_dx_f_pipe(cmpp_dx *dx){ + //cmpp_d const * d = cmpp_dx_d(dx); + unsigned char const * zArgs = dx->args.z; + assert( dx->args.arg0->n == dx->args.nz ); + unsigned char const * const zArgsEnd = zArgs + dx->args.nz; + if( zArgs==zArgsEnd ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting a command and arguments to pipe."); + return; + } + cmpp_FILE * fpToChild = 0; + int nChompIn = 0, nChompOut = 0; + cmpp_b * const chout = cmpp_b_borrow(dx->pp); + cmpp_b * const cmd = cmpp_b_borrow(dx->pp); + cmpp_b * const body = cmpp_b_borrow(dx->pp); + cmpp_b * const bArg = cmpp_b_borrow(dx->pp) + /* arg parsing and the initial command name part of the + external command. */; + cmpp_args cmdArgs = cmpp_args_empty; + /* TODOs and FIXMEs: + + We need flags to optionally @token@-parse before and/or after + filtering. + */ + bool seenDD = false /* true if seen "--" or [...] */; + bool doCapture = true /* true if we need a closing /pipe */; + bool argsAsGroup = false /* true if args is [...] */; + bool dumpDebug = false; + cmpp_flag32_t popenFlags = 0; + cmpp_popen_t po = cmpp_popen_t_empty; + if( cmpp_b_reserve3(dx->pp, cmd, zArgsEnd-zArgs + 1) + || cmpp_b_reserve3(dx->pp, bArg, cmd->nAlloc) ){ + goto cleanup; + } + + unsigned char * zOut = bArg->z; + unsigned char const * const zOutEnd = bArg->z + bArg->nAlloc - 1; + while( 0==dxppCode ){ + cmpp_arg arg = cmpp_arg_empty; + zOut = bArg->z; + if( cmpp_arg_parse(dx, &arg, &zArgs, zArgsEnd, + &zOut, zOutEnd) ){ + goto cleanup; + } + if( cmpp_arg_equals(&arg, "--") ){ + zOut = bArg->z; + if( cmpp_arg_parse(dx, &arg, &zArgs, zArgsEnd, + &zOut, zOutEnd) ){ + goto cleanup; + } + if( !arg.n ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting external command name " + "or [...] 
after --."); + goto cleanup; + } + do_arg_list: + seenDD = true; + cmpp_flag32_t a2bFlags = cmpp_arg_to_b_F_BRACE_CALL; + if( cmpp_TT_GroupBrace==arg.ttype ){ + argsAsGroup = true; + a2bFlags |= cmpp_arg_to_b_F_NO_BRACE_CALL; + }else if( cmpp__arg_wordIsPathOrFlag(&arg) ){ + /* If it looks like it is a path, do not + expand it as a word. */ + arg.ttype = cmpp_TT_String; + } + if( cmpp_arg_to_b(dx, &arg, cmd, a2bFlags) + || (!argsAsGroup && cmpp_b_append_ch(cmd, ' ')) ){ + goto cleanup; + } + //g_warn("command: [%s]=>%s", arg.z, cmd->z); + if( cmd->n<2 ){ + cmpp_dx_err_set(dx, CMPP_RC_RANGE, + "Command name '%s' resolves to empty. " + "This is most commonly caused by not " + "quoting it but it can also mean that it " + "is an unknown define key.", arg.z); + goto cleanup; + } + //g_warn("arg=%s", arg.z); + //g_warn("cmd=%s", cmd->z); + break; + } + if( cmpp_TT_GroupBrace==arg.ttype ){ + goto do_arg_list; + } +#define FLAG(X)if( cmpp_arg_isflag(&arg, X) ) + FLAG("-no-input"){ + doCapture = false; + continue; + } + FLAG("-chomp-output"){ + ++nChompOut; + continue; + } + FLAG("-chomp"){ + ++nChompIn; + continue; + } + FLAG("-exec-direct"){ + popenFlags |= cmpp_popen_F_DIRECT; + continue; + } + FLAG("-path"){ + popenFlags |= cmpp_popen_F_PATH; + continue; + } + FLAG("-debug"){ + dumpDebug = true; + continue; + } +#undef FLAG + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Unhandled argument: %s. %s%s requires -- " + "before its external command name.", + arg.z, cmpp_dx_delim(dx), + dx->d->name.z); + goto cleanup; + } + + if( !seenDD ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "%s%s requires a -- before the name of " + "its external app.", + cmpp_dx_delim(dx), dx->d->name.z); + goto cleanup; + } + + //g_warn("zArgs n=%u zArgs=%s", (unsigned)(zArgsEnd-zArgs), zArgs); + /* dx->pimpl->args gets overwritten by cmpp_dx_consume(), so we have to copy + the args. 
*/ + if( argsAsGroup ){ + assert( cmd->z ); + if( cmpp_args_parse(dx, &cmdArgs, cmd->z, cmd->n, 0) ){ + goto cleanup; + } + }else{ + /* zArgs can have newlines in it. We need to strip those out + before passing it on. We elide them entirely, as opposed to + replacing them with a space. */ + cmpp_skip_snl(&zArgs, zArgsEnd); + if( cmpp_b_reserve3(dx->pp, cmd, cmd->n + (zArgsEnd-zArgs) + 1) ){ + goto cleanup; + } + unsigned char * zo = cmd->z + cmd->n; + unsigned char const *zi = zArgs; +#if !defined(NDEBUG) + unsigned char const * zoEnd = cmd->z + cmd->nAlloc; +#endif + for( ; zi zo ); + *zo = 0; + cmd->n = zo - cmd->z; + } + assert( !dxppCode ); + + if( doCapture ){ + assert( dx->d->closer ); + if( cmpp_dx_consume_b(dx, body, &dx->d->closer, 1, + cmpp_dx_consume_F_PROCESS_OTHER_D) ){ + goto cleanup; + } + while( nChompIn-- && cmpp_b_chomp(body) ){} + po.fpToChild = &fpToChild; + } + + if( dumpDebug ){ + g_warn("%s%s -debug: cmd argsAsGroup=%d n=%u z=%s", + cmpp_dx_delim(dx), dx->d->name.z, + (int)argsAsGroup, + (unsigned)cmd->n, cmd->z); + } + if( argsAsGroup ){ + cmpp_popen_args(dx, &cmdArgs, &po); + }else{ + unsigned char const * z = cmd->z; + //cmpp_skip_snl(&z, cmd->z + cmd->n); + cmpp_popen(dx->pp, z, popenFlags, &po); + } + if( dxppCode ) goto cleanup; + int rc = 0; + if( doCapture ){ + /* Bug: if body is too big (no idea how much that is), this will + block while waiting on input from the child. This can easily + happen with #include -raw. */ +#if 0 + /* Failed attempt to work around it. */ + assert( fpToChild ); + enum { BufSize = 128 }; + unsigned char buf[BufSize]; + cmpp_size_t nLeft = body->n; + unsigned char const * z = body->z; + while( nLeft>0 && !dxppCode ){ + cmpp_size_t nWrite = nLeft < BufSize ? 
nLeft : BufSize; + g_warn("writing %u to child...", (unsigned)nWrite); + rc = cmpp_output_f_FILE(fpToChild, z, nWrite); + if( rc ){ + cmpp_dx_err_set(dx, rc, "Error feeding stdin to piped process."); + break; + } + z += nWrite; + nLeft -= nWrite; + fflush(fpToChild); + cmpp_size_t nRead = BufSize; + rc = cmpp_input_f_fd(&po.fdFromChild, &buf[0], &nRead); + if( rc ) goto err_reading; + cmpp_b_append4(dx->pp, &chout, buf, nRead);\ + } + if( !dxppCode ){ + g_warn0("reading from child..."); + rc = cmpp_stream( cmpp_input_f_fd, &po.fdFromChild, + cmpp_output_f_b, chout ); + if( rc ) goto err_reading; + } + g_warn0("I/O done"); +#else + //g_warn("writing %u bytes to child...", (unsigned)body->n); + rc = cmpp_output_f_FILE(fpToChild, body->z, body->n); + if( rc ){ + cmpp_dx_err_set(dx, rc, "Error feeding stdin to piped process."); + goto cleanup; + } + //g_warn("wrote %u bytes to child.", (unsigned)body->n); + fclose(fpToChild); + fpToChild = 0; + if( dxppCode ) goto cleanup; + goto stream_chout; +#endif + }else{ + stream_chout: + //g_warn0("waiting on child..."); + rc = cmpp_stream(cmpp_input_f_fd, &po.fdFromChild, + cmpp_output_f_b, chout); + //g_warn0("I/O done"); + if( rc ){ + //err_reading: + cmpp_dx_err_set(dx, rc, "Error reading stdout from piped process."); + goto cleanup; + } + } + while( nChompOut-- && cmpp_b_chomp(chout) ){} + //g_warn("Read in:\n%.*s", (int)chout->n, chout->z); + cmpp_dx_out_raw(dx, chout->z, chout->n); + +cleanup: + cmpp_args_cleanup(&cmdArgs); + cmpp_b_return(dx->pp, chout); + cmpp_b_return(dx->pp, cmd); + cmpp_b_return(dx->pp, body); + cmpp_b_return(dx->pp, bArg); + cmpp_pclose(&po); +} +#endif /* #ifndef CMPP_OMIT_D_PIPE */ + +/** + #sum ...args + + Emits the sum of its arguments, treating each as an + integer. Non-integer arguments are silently skipped. 
+*/ +static void cmpp_dx_f_sum(cmpp_dx *dx){ + int64_t n = 0, i = 0; + cmpp_b b = cmpp_b_empty; + for( cmpp_arg const * arg = dx->args.arg0; + arg && !cmpp_dx_err_check(dx); arg = arg->next ){ + if( 0==cmpp_arg_to_b(dx, arg, cmpp_b_reuse(&b), + cmpp_arg_to_b_F_BRACE_CALL) + && cmpp__is_int64(b.z, b.n, &i) ){ + n += i; + } + } + cmpp_b_append_i64(cmpp_b_reuse(&b), n); + cmpp_dx_out_raw(dx, b.z, b.n); + cmpp_b_clear(&b); +} + +/** + #arg ?flags? the-arg + + -trim-left + -trim-right + -trim: trim both sides + + It sends its arg to cmpp_arg_to_b() to expand it, optionally + trims the result, and emits that value. + + This directive is not expected to be useful except, perhaps in + testing cmpp itself. Its trim flags, in particular, aren't commonly + useful because #arg is only useful in a function call context and + those unconditionally trim their output. +*/ +static void cmpp_dx_f_arg(cmpp_dx *dx){ + cmpp_flag32_t a2bFlags = cmpp_arg_to_b_F_BRACE_CALL; + bool trimL = false, trimR = false; + cmpp_arg const * arg = dx->args.arg0; + for( ; arg && !cmpp_dx_err_check(dx); arg = arg->next ){ +#define FLAG(X)if( cmpp_arg_isflag(arg, X) ) + FLAG("-raw") { + a2bFlags = cmpp_arg_to_b_F_FORCE_STRING; + continue; + } + FLAG("-trim-left") { trimL=true; continue; } + FLAG("-trim-right") { trimR=true; continue; } + FLAG("-trim") { trimL=trimR=true; continue; } +#undef FLAG + break; + } + if( arg ){ + cmpp_b * const b = cmpp_b_borrow(dx->pp); + if( b && 0==cmpp_arg_to_b(dx, arg, b, a2bFlags) ){ + unsigned char const * zz = b->z; + unsigned char const * zzEnd = b->z + b->n; + if( trimL ) cmpp_skip_snl(&zz, zzEnd); + if( trimR ) cmpp_skip_snl_trailing(zz, &zzEnd); + if( zzEnd-zz ){ + cmpp_dx_out_raw(dx, zz, zzEnd-zz); + } + } + cmpp_b_return(dx->pp, b); + }else if( !cmpp_dx_err_check(dx) ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, "Expecting an argument."); + } +} + +/** + #join ?flags? ...args + + -s SEPARATOR: sets the separator for its RHS arguments. Default=space. 
+ + -nl: append a newline (will be stripped by [call]s!). This is the default + when !cmpp_dx_is_call(dx). + + -nonl: do not append a newline. Default when dx->isCall. +*/ +static void cmpp_dx_f_join(cmpp_dx *dx){ + cmpp_b * const b = cmpp_b_borrow(dx->pp); + cmpp_b * const bSep = cmpp_b_borrow(dx->pp); + cmpp_flag32_t a2bFlags = cmpp_arg_to_b_F_BRACE_CALL; + bool addNl = !cmpp_dx_is_call(dx); + int n = 0; + if( !b || !bSep ) goto end; + if( !dx->args.argc ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "%s%s expects ?flags? ...args", + cmpp_dx_delim(dx), dx->d->name.z); + goto end; + } + cmpp_b_append_ch(bSep, ' '); + cmpp_check_oom(dx->pp, bSep->z); + for( cmpp_arg const * arg = dx->args.arg0; arg + && !b->errCode + && !bSep->errCode + && !cmpp_dx_err_check(dx); + arg = arg->next ){ +#define FLAG(X)if( cmpp_arg_isflag(arg, X) ) + FLAG("-s"){ + if( !arg->next ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Missing SEPARATOR argument to -s."); + break; + } + cmpp_arg_to_b(dx, arg->next, + cmpp_b_reuse(bSep), + cmpp_arg_to_b_F_BRACE_CALL); + arg = arg->next; + continue; + } + //FLAG("-nl"){ addNl=true; continue; } + FLAG("-nonl"){ addNl=false; continue; } +#undef FLAG + if( n++ && cmpp_dx_out_raw(dx, bSep->z, bSep->n) ){ + break; + } + if( cmpp_arg_to_b(dx, arg, cmpp_b_reuse(b), a2bFlags) ){ + break; + } + cmpp_dx_out_raw(dx, b->z, b->n); + } + if( !cmpp_dx_err_check(dx) ){ + if( !n ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting at least one argument."); + }else if( addNl ){ + cmpp_dx_out_raw(dx, "\n", 1); + } + } +end: + cmpp_b_return(dx->pp, b); + cmpp_b_return(dx->pp, bSep); +} + + +/* Impl. 
for #file */ +static void cmpp_dx_f_file(cmpp_dx *dx){ + if( !dx->args.arg0 ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting one or more arguments"); + return; + } + cmpp_d const * const d = dx->d; + enum e_op { + op_none, op_exists, op_join + }; + cmpp_b * const b0 = cmpp_b_borrow(dx->pp); + if( !b0 ) goto end; + enum e_op op = op_none; + cmpp_arg const * opArg = 0; + cmpp_arg const * arg = 0; + for( arg = dx->args.arg0; + 0==dxppCode && arg; + arg = arg->next ){ + if( op_none==op ){ + if( cmpp_arg_equals(arg, "exists") ){ + op = op_exists; + opArg = arg->next; + /* Guard against a missing filename: opArg->next must not be + dereferenced when opArg is NULL. */ + if( !opArg ) goto missing_arg; + arg = opArg->next; + if( arg ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "%s%s exists: too many arguments", + cmpp_dx_delim(dx), d->name.z); + goto end; + } + break; + }else if( cmpp_arg_equals(arg, "join") ){ + op = op_join; + if( !arg->next ) goto missing_arg; + arg = arg->next; + break; + }else{ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Unknown %s%s command: %s", + cmpp_dx_delim(dx), d->name.z, arg->z); + goto end; + } + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, "%s%s unhandled argument: %s", + cmpp_dx_delim(dx), d->name.z, arg->z); + goto end; + } + } + switch( op ){ + case op_none: goto missing_arg; + case op_join: { + int i = 0; + cmpp_flag32_t const bFlags = cmpp_arg_to_b_F_BRACE_CALL; + for( ; arg; arg = arg->next, ++i ){ + if( cmpp_arg_to_b(dx, arg, cmpp_b_reuse(b0), bFlags) + || (i && cmpp_dx_out_raw(dx, "/", 1)) + || (b0->n && cmpp_dx_out_raw(dx, b0->z, b0->n)) ){ + break; + } + } + cmpp_dx_out_raw(dx, "\n", 1); + break; + } + case op_exists: { + assert( opArg ); + bool const b = cmpp__file_is_readable((char const *)opArg->z); + cmpp_dx_out_raw(dx, b ? 
"1\n" : "0\n", 2); + break; + } + } +end: + cmpp_b_return(dx->pp, b0); + return; +missing_arg: + if( arg ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, "%s%s %s: missing argument", + cmpp_dx_delim(dx), d->name.z, arg->z ); + }else{ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, "%s%s: missing subcommand", + cmpp_dx_delim(dx), d->name.z); + } + goto end; +} + + +/** + #cmp LHS RHS + + Expands both arguments and emits the integer result of strcmp() + on them. +*/ +static void cmpp_dx_f_cmp(cmpp_dx *dx){ + cmpp_b * const bL = cmpp_b_borrow(dx->pp); + cmpp_b * const bR = cmpp_b_borrow(dx->pp); + cmpp_flag32_t a2bFlags = cmpp_arg_to_b_F_BRACE_CALL; + if( !bL || !bR ) goto end; + for( cmpp_arg const * arg = dx->args.arg0; arg + && !cmpp_dx_err_check(dx); + arg = arg->next ){ + if( !bL->z ){ + cmpp_arg_to_b(dx, arg, bL, a2bFlags); + continue; + } + if( !bR->z ){ + cmpp_arg_to_b(dx, arg, bR, a2bFlags); + continue; + } + goto usage; + } + + if( cmpp_dx_err_check(dx) ) goto end; + if( !bL->z || !bR->z ){ + usage: + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Usage: LHS RHS"); + goto end; + } + assert( bL->z ); + assert( bR->z ); + char cbuf[20]; + int const cmp = strcmp((char*)bL->z, (char*)bR->z); + int const n = snprintf(cbuf, sizeof(cbuf), "%d", cmp); + assert(n>0); + cmpp_dx_out_raw(dx, cbuf, (cmpp_size_t)n); + +end: + cmpp_b_return(dx->pp, bL); + cmpp_b_return(dx->pp, bR); +} + + +#if 0 +/* Impl. for dummy placeholder. */ +static void cmpp_dx_f_todo(cmpp_dx *dx){ + cmpp_d const * d = cmpp_dx_d(dx); + g_warn("TODO: directive handler for %s", d->name.z); +} +#endif + +/** + If zName matches one of the delayed-load directives, that directive + is registered and 0 is returned. CMPP_RC_NO_DIRECTIVE is returned if + no match is found, but pp's error state is not updated in that + case. If a match is found and registration fails, that result code + will propagate via pp. 
+*/ +int cmpp__d_delayed_load(cmpp *pp, char const *zName){ + if( ppCode ) return ppCode; + int rc = CMPP_RC_NO_DIRECTIVE; + unsigned const nName = strlen(zName); + + pp->pimpl->flags.isInternalDirectiveReg = true; + +#define M(NAME) (nName==sizeof(NAME)-1 && 0==strcmp(zName,NAME)) +#define M_OC(NAME) (M(NAME) || M("/" NAME)) +#define M_IF(NAME) if( M(NAME) ) +#define CF(X) cmpp_d_F_ ## X +#define F_A_RAW CF(ARGS_RAW) +#define F_A_LIST CF(ARGS_LIST) +#define F_EXPR CF(ARGS_LIST) | CF(NOT_SIMPLIFY) +#define F_UNSAFE cmpp_d_F_NOT_IN_SAFEMODE +#define F_NC cmpp_d_F_NO_CALL +#define F_CALL cmpp_d_F_CALL_ONLY +#define DREG0(SYMNAME, NAME, OPENER, OFLAGS, CLOSER, CFLAGS) \ + cmpp_d_reg SYMNAME = { \ + .name = NAME, \ + .opener = { \ + .f = OPENER, \ + .flags = OFLAGS \ + }, \ + .closer = { \ + .f = CLOSER, \ + .flags = CFLAGS \ + }, \ + .dtor = 0, \ + .state = 0 \ + } + +#define DREG(NAME, OPENER, OFLAGS, CLOSER, CFLAGS ) \ + DREG0(const rReg, NAME, OPENER, OFLAGS, CLOSER, CFLAGS ); \ + rc = cmpp_d_register(pp, &rReg, NULL); \ + goto end + + /* The #if family requires some hand-holding... 
*/ + if( M_OC("if") || M("elif") || M("else") ) { + DREG0(rIf, "if", + cmpp_dx_f_if, F_EXPR | F_NC | CF(FLOW_CONTROL), + cmpp_dx_f_if_dangler, 0); + DREG0(rElif, "elif", + cmpp_dx_f_if_dangler, F_NC, + 0, 0); + DREG0(rElse, "else", + cmpp_dx_f_if_dangler, F_NC, + 0, 0); + CmppIfState * const cis = cmpp__malloc(pp, sizeof(*cis)); + if( !cis ) goto end; + memset(cis, 0, sizeof(*cis)); + rIf.state = cis; + rIf.dtor = cmpp_mfree; + if( cmpp_d_register(pp, &rIf, &cis->dIf) + /* rIf must be first to avoid leaking cis on error */ + || cmpp_d_register(pp, &rElif, &cis->dElif) + || cmpp_d_register(pp, &rElse, &cis->dElse) ){ + rc = ppCode; + }else{ + assert( cis->dIf && cis->dElif && cis->dElse ); + assert( !cis->dEndif ); + assert( cis == cis->dIf->impl.state ); + assert( cmpp_mfree==cis->dIf->impl.dtor ); + cis->dElif->impl.state + = cis->dElse->impl.state + = cis; + cis->dElif->closer + = cis->dElse->closer + = cis->dEndif + = cis->dIf->closer; + rc = 0; + } + goto end; + }/* #if and friends */ + + /* Basic core directives... 
*/ +#define M_IF_CORE(N,OPENER,OFLAGS,CLOSER,CFLAGS) \ + if( M_OC(N) ){ \ + DREG(N, OPENER, OFLAGS, CLOSER, CFLAGS); \ + } (void)0 + + M_IF_CORE("@", cmpp_dx_f_at, F_A_LIST, + cmpp_dx_f_dangling_closer, 0); + M_IF_CORE("arg", cmpp_dx_f_arg, F_A_LIST, 0, 0); + M_IF_CORE("assert", cmpp_dx_f_expr, F_EXPR, 0, 0); + M_IF_CORE("cmp", cmpp_dx_f_cmp, F_A_LIST, 0, 0); + M_IF_CORE("define", cmpp_dx_f_define, F_A_LIST, + cmpp_dx_f_dangling_closer, 0); + M_IF_CORE("delimiter", cmpp_dx_f_delimiter, F_A_LIST, + cmpp_dx_f_dangling_closer, 0); + M_IF_CORE("error", cmpp_dx_f_error, F_A_RAW, 0, 0); + M_IF_CORE("expr", cmpp_dx_f_expr, F_EXPR, 0, 0); + M_IF_CORE("join", cmpp_dx_f_join, F_A_LIST, 0, 0); + M_IF_CORE("once", cmpp_dx_f_once, F_A_LIST | F_NC, + cmpp_dx_f_dangling_closer, 0); + M_IF_CORE("pragma", cmpp_dx_f_pragma, F_A_LIST, 0, 0); + M_IF_CORE("savepoint", cmpp_dx_f_savepoint, F_A_LIST, 0, 0); + M_IF_CORE("stderr", cmpp_dx_f_stderr, F_A_RAW, 0, 0); + M_IF_CORE("sum", cmpp_dx_f_sum, F_A_LIST, 0, 0); + M_IF_CORE("undef", cmpp_dx_f_undef, F_A_LIST, 0, 0); + M_IF_CORE("undefined-policy", cmpp_dx_f_undef_policy, F_A_LIST, 0, 0); + M_IF_CORE("//", cmpp_dx_f_noop, F_A_RAW, 0, 0); + M_IF_CORE("file", cmpp_dx_f_file, + F_A_LIST | F_UNSAFE, 0, 0); + +#undef M_IF_CORE + + + /* Directives which can be disabled via build flags or + flags to cmpp_ctor()... 
*/ +#define M_IF_FLAGGED(NAME,FLAG,OPENER,OFLAGS,CLOSER,CFLAGS) \ + M_IF(NAME) { \ + if( 0==(FLAG & pp->pimpl->flags.newFlags) ) { \ + DREG(NAME,OPENER,OFLAGS,CLOSER,CFLAGS); \ + } \ + goto end; \ + } + +#ifndef CMPP_OMIT_D_INCLUDE + M_IF_FLAGGED("include", cmpp_ctor_F_NO_INCLUDE, + cmpp_dx_f_include, F_A_LIST | F_UNSAFE, + 0, 0); +#endif + +#ifndef CMPP_OMIT_D_PIPE + M_IF_FLAGGED("pipe", cmpp_ctor_F_NO_PIPE, + cmpp_dx_f_pipe, F_A_RAW | F_UNSAFE, + cmpp_dx_f_dangling_closer, 0); +#endif + +#ifndef CMPP_OMIT_D_DB + M_IF_FLAGGED("attach", cmpp_ctor_F_NO_DB, + cmpp_dx_f_attach, F_A_LIST | F_UNSAFE, + 0, 0); + M_IF_FLAGGED("detach", cmpp_ctor_F_NO_DB, + cmpp_dx_f_detach, F_A_LIST | F_UNSAFE, + 0, 0); + if( 0==(cmpp_ctor_F_NO_DB & pp->pimpl->flags.newFlags) + && (M_OC("query") || M("query:no-rows")) ){ + DREG0(rQ, "query", cmpp_dx_f_query, F_A_LIST | F_UNSAFE, + cmpp_dx_f_dangling_closer, 0); + cmpp_d * dQ = 0; + rc = cmpp_d_register(pp, &rQ, &dQ); + if( 0==rc ){ + /* + It would be preferable to delay registration of query:no-rows + until we need it, but doing so causes an error when: + + |#if 0 + |#query + |... + |#query:no-rows HERE + |... + |#/query + |#/if + + Because query:no-rows won't have been registered, and unknown + directives are an error even in skip mode. Maybe they + shouldn't be. Maybe we should just skip them in skip mode. + That's only been an issue since doing delayed registration of + directives, so it's not come up until recently (as of + 2025-10-27). i was so hoping to be able to get _rid_ of skip + mode at some point. 
+ */ + cmpp_d * dNoRows = 0; + cmpp_d_reg const rNR = { + .name = "query:no-rows", + .opener = { + .f = cmpp_dx_f_dangling_closer, + .flags = F_NC + } + }; + rc = cmpp_d_register(pp, &rNR, &dNoRows); + if( 0==rc ){ + dNoRows->closer = dQ->closer; + assert( !dQ->impl.state ); + dQ->impl.state = dNoRows; + } + } + goto end; + } +#endif /*CMPP_OMIT_D_DB*/ + +#if CMPP_D_MODULE + extern void cmpp_dx_f_module(cmpp_dx *); + M_IF_FLAGGED("module", cmpp_ctor_F_NO_MODULE, + cmpp_dx_f_module, F_A_LIST | F_UNSAFE, + 0, 0); +#endif + +#undef M_IF_FLAGGED + +#ifndef NDEBUG + M_IF("experiment"){ + DREG("experiment", cmpp_dx_f_experiment, + F_A_LIST | F_UNSAFE, 0, 0); + } +#endif + +end: +#undef DREG +#undef DREG0 +#undef F_EXPR +#undef F_A_RAW +#undef F_A_LIST +#undef F_UNSAFE +#undef F_NC +#undef F_CALL +#undef CF +#undef M +#undef M_OC +#undef M_IF + pp->pimpl->flags.isInternalDirectiveReg = false; + return ppCode ? ppCode : rc; +} +/* +** 2026-02-07: +** +** The author disclaims copyright to this source code. In place of +** a legal notice, here is a blessing: +** +** * May you do good and not evil. +** * May you find forgiveness for yourself and forgive others. +** * May you share freely, never taking more than you give. +** +************************************************************************ +** This file houses filesystem-related APIs for libcmpp. +*/ + +#include + +/** + There are APIs i'd _like_ to have here, but the readily-available + code for them is BSD-licensed, so it can't be pasted in here. Examples: + + - Filename canonicalization. + + - Cross-platform getcwd() (see below). + + - Windows support. This requires, in addition to the different + filesystem APIs, converting strings into something it can use. + + All of that adds up to infrastructure... which already exists + elsewhere but can't be copied here while retaining this project's + license. 
+*/ + +bool cmpp__file_is_readable(char const *zFile){ + return 0==access(zFile, R_OK); +} + +#if 0 +FILE *cmpp__fopen(const char *zName, const char *zMode){ + FILE *f; + if(zName && ('-'==*zName && !zName[1])){ + f = (strchr(zMode, 'w') || strchr(zMode,'+')) + ? stdout + : stdin + ; + }else{ + f = fopen(zName, zMode); + } + return f; +} + +void cmpp__fclose( FILE * f ){ + if(f && (stdin!=f) && (stdout!=f) && (stderr!=f)){ + fclose(f); + } +} +#endif +/* +** 2025-11-07: +** +** The author disclaims copyright to this source code. In place of +** a legal notice, here is a blessing: +** +** * May you do good and not evil. +** * May you find forgiveness for yourself and forgive others. +** * May you share freely, never taking more than you give. +** +************************************************************************ +** This file houses the arguments-handling-related pieces for libcmpp. +*/ + +const cmpp_args_pimpl cmpp_args_pimpl_empty = + cmpp_args_pimpl_empty_m; +const cmpp_args cmpp_args_empty = cmpp_args_empty_m; +const cmpp_arg cmpp_arg_empty = cmpp_arg_empty_m; + +//just in case these ever get dynamic state +void cmpp_arg_cleanup(cmpp_arg *arg){ + if( arg ) *arg = cmpp_arg_empty; +} + +//just in case these ever get dynamic state +void cmpp_arg_reuse(cmpp_arg *arg){ + if( arg ) *arg = cmpp_arg_empty; +} + +/** Resets li's list for re-use but does not free it. Returns li. */ +static CmppArgList * CmppArgList_reuse(CmppArgList *li){ + for(cmpp_size_t n = li->nAlloc; n; ){ + cmpp_arg_reuse( &li->list[--n] ); + assert( !li->list[n].next ); + } + li->n = 0; + return li; +} + +/** Free all memory owned by li but does not free li. */ +void CmppArgList_cleanup(CmppArgList *li){ + const CmppArgList CmppArgList_empty = CmppArgList_empty_m; + while( li->nAlloc ){ + cmpp_arg_cleanup( &li->list[--li->nAlloc] ); + } + cmpp_mfree(li->list); + *li = CmppArgList_empty; +} + +/** Returns the most-recently-appended arg of li back to li's + free-list. 
*/ +static void CmppArgList_unappend(CmppArgList *li){ + assert( li->n ); + if( li->n ){ + cmpp_arg_reuse( &li->list[--li->n] ); + } +} + +cmpp_arg * CmppArgList_append(cmpp *pp, CmppArgList *li){ + cmpp_arg * p = 0; + assert( li->list ? li->nAlloc : 0==li->nAlloc ); + if( 0==ppCode + && 0==CmppArgList_reserve(pp, li, + cmpp__li_reserve1_size(li,10)) ){ + p = &li->list[li->n++]; + cmpp_arg_reuse( p ); + } + return p; +} + +void cmpp_args_pimpl_cleanup(cmpp_args_pimpl *p){ + assert( !p->nextFree ); + cmpp_b_clear(&p->argOut); + CmppArgList_cleanup(&p->argli); + *p = cmpp_args_pimpl_empty; +} + +static void cmpp_args_pimpl_reuse(cmpp_args_pimpl *p){ + assert( !p->nextFree ); + cmpp_b_reuse(&p->argOut); + CmppArgList_reuse(&p->argli); + assert( !p->argOut.n ); + assert( !p->argli.n ); +} + +static void cmpp_args_pimpl_return(cmpp *pp, cmpp_args_pimpl *p){ + if( p ){ + assert( p->pp ); + cmpp__pi(pp); + assert( !p->nextFree ); + cmpp_args_pimpl_reuse(p); + p->nextFree = pi->recycler.argPimpl; + pi->recycler.argPimpl = p; + } +} + +static cmpp_args_pimpl * cmpp_args_pimpl_borrow(cmpp *pp){ + cmpp__pi(pp); + cmpp_args_pimpl * p = 0; + if( pi->recycler.argPimpl ){ + p = pi->recycler.argPimpl; + pi->recycler.argPimpl = p->nextFree; + p->nextFree = 0; + p->pp = pp; + assert( !p->argOut.n && "Buffer was used when not borrowed" ); + }else{ + p = cmpp__malloc(pp, sizeof(*p)); + if( 0==cmpp_check_oom(pp, p) ) { + *p = cmpp_args_pimpl_empty; + p->pp = pp; + } + } + return p; +} + +CMPP__EXPORT(void, cmpp_args_cleanup)(cmpp_args *a){ + if( a ){ + if( a->pimpl ){ + cmpp * const pp = a->pimpl->pp; + assert( pp ); + if( pp ){ + cmpp_args_pimpl_return(pp, a->pimpl); + }else{ + cmpp_args_pimpl_cleanup(a->pimpl); + cmpp_mfree(a->pimpl); + } + } + *a = cmpp_args_empty; + } +} + +CMPP__EXPORT(void, cmpp_args_reuse)(cmpp_args *a){ + cmpp_args_pimpl * const p = a->pimpl; + if( p ) cmpp_args_pimpl_reuse(p); + *a = cmpp_args_empty; + a->pimpl = p; +} + +int cmpp_args__init(cmpp * pp, 
cmpp_args * a){ + if( 0==ppCode ){ + if( a->pimpl ){ + assert( a->pimpl->pp == pp ); + cmpp_args_reuse(a); + assert(! a->pimpl->argOut.n ); + assert( a->pimpl->pp == pp ); + }else{ + a->pimpl = cmpp_args_pimpl_borrow(pp); + assert( !a->pimpl || a->pimpl->pp==pp ); + } + } + return ppCode; +} + +/** + Declare cmpp_argOp_f_NAME(). +*/ +#define cmpp_argOp_decl(NAME) \ + static void cmpp_argOp_f_ ## NAME (cmpp_dx *dx, \ + cmpp_argOp const *op, \ + cmpp_arg const *vLhs, \ + cmpp_arg const **pvRhs, \ + int *pResult) +cmpp_argOp_decl(compare); + +#if 0 +cmpp_argOp_decl(logical1); +cmpp_argOp_decl(logical2); +cmpp_argOp_decl(defined); +#endif + +static const struct { + const cmpp_argOp opAnd; + const cmpp_argOp opOr; + const cmpp_argOp opGlob; + const cmpp_argOp opNotGlob; + const cmpp_argOp opNot; + const cmpp_argOp opDefined; +#define cmpp_argOps_cmp_map(E) E(Eq) E(Neq) E(Lt) E(Le) E(Gt) E(Ge) +#define E(NAME) const cmpp_argOp op ## NAME; + cmpp_argOps_cmp_map(E) +#undef E +} cmpp_argOps = { + .opAnd = { + .ttype = cmpp_TT_OpAnd, + .arity = 2, + .assoc = 0, + .xCall = 0//cmpp_argOp_f_logical2 + }, + .opOr = { + .ttype = cmpp_TT_OpOr, + .arity = 2, + .assoc = 0, + .xCall = 0//cmpp_argOp_f_logical2 + }, + .opGlob = { + .ttype = cmpp_TT_OpGlob, + .arity = 2, + .assoc = 0, + .xCall = 0//cmpp_argOp_f_glob + }, + .opNotGlob = { + .ttype = cmpp_TT_OpNotGlob, + .arity = 2, + .assoc = 0, + .xCall = 0//cmpp_argOp_f_glob + }, + .opNot = { + .ttype = cmpp_TT_OpNot, + .arity = 1, + .assoc = 1, + .xCall = 0//cmpp_argOp_f_logical1 + }, + .opDefined = { + .ttype = cmpp_TT_OpDefined, + .arity = 1, + .assoc = 1, + .xCall = 0//cmpp_argOp_f_defined + }, + /* Comparison ops... 
*/ +#define E(NAME) .op ## NAME = { \ + .ttype = cmpp_TT_Op ## NAME, .arity = 2, .assoc = 0, \ + .xCall = cmpp_argOp_f_compare }, + cmpp_argOps_cmp_map(E) +#undef E +}; + +cmpp_argOp const * cmpp_argOp_for_tt(cmpp_tt tt){ + switch(tt){ + case cmpp_TT_OpAnd: return &cmpp_argOps.opAnd; + case cmpp_TT_OpOr: return &cmpp_argOps.opOr; + case cmpp_TT_OpGlob: return &cmpp_argOps.opGlob; + case cmpp_TT_OpNot: return &cmpp_argOps.opNot; + case cmpp_TT_OpDefined: return &cmpp_argOps.opDefined; +#define E(NAME) case cmpp_TT_Op ## NAME: return &cmpp_argOps.op ## NAME; + cmpp_argOps_cmp_map(E) +#undef E + default: return NULL; + } +} +#define argOp(ARG) cmpp_argOp_for_tt((ARG)->ttype) + +#if 0 +cmpp_argOp_decl(logical1){ + assert( cmpp_TT_OpNot==op->ttype ); + assert( !vRhs ); + assert( vLhs ); + if( 0==cmpp__arg_toBool(dx, vLhs, pResult) ){ + *pResult = !*pResult; + } +} + +cmpp_argOp_decl(logical2){ + assert( vRhs ); + assert( vLhs ); + int vL = 0; + int vR = 0; + if( 0==cmpp__arg_toBool(dx, vLhs, &vL) + && 0==cmpp__arg_toBool(dx, vRhs, &vR) ){ + switch( op->ttype ){ + case cmpp_TT_OpAnd: *pResult = vL && vR; break; + case cmpp_TT_OpOr: *pResult = vL || vR; break; + default: + cmpp__fatal("Cannot happen: illegal op mapping"); + } + } +} + +cmpp_argOp_decl(defined){ + assert( cmpp_TT_OpDefined==op->ttype ); + assert( !vRhs ); + assert( vLhs ); + if( cmpp_TT_Word==vLhs->ttype ){ + *pResult = cmpp_has(pp, (char const *)vLhs->z, vLhs->n); + if( !*pResult && vLhs->n>1 && '#'==vLhs->z[0] ){ + *pResult = !!cmpp__d_search3(pp, vLhs->z+1, + cmpp__d_search3_F_NO_DLL); + } + }else{ + cmpp__err(pp, CMPP_RC_TYPE, "Invalid token type %s for %s", + cmpp__tt_cstr(vLhs->ttype, true), + cmpp__tt_cstr(op->ttype, false)); + } +} +#endif + +#if 0 +static cmpp_argOp const * cmpp_argOp_isCompare(cmpp_tt tt){ + cmpp_argOp const * const p = cmpp_argOp_for_tt(tt); + switch( p ? 
p->ttype : cmpp_TT_None ){ +#define E(NAME) case cmpp_TT_Op ## NAME: return p; + cmpp_argOps_cmp_map(E) +#undef E + return p; + case cmpp_TT_None: + default: + return NULL; + } +} +#endif + +/** + An internal helper for cmpp_argOp_...(). It binds some value of + *paArg to column bindNdx of query q and sets *paArg to the next + argument to be consumed. This function expects that q is set up to + do the right thing when *paArg is a Word-type value (see + cmpp_argOp_f_compare()). +*/ +static void cmpp_argOp__cmp_bind(cmpp_dx * const dx, + sqlite3_stmt * const q, + int bindNdx, + cmpp_arg const ** paArg){ + cmpp_arg const * const arg = *paArg; + assert(arg); + switch( dxppCode ? 0 : arg->ttype ){ + case 0: break; + case cmpp_TT_Word: + /* In this case, q is supposed to be set up to use + CMPP__SEL_V_FROM(bindNdx), i.e. it expects the verbatim word + and performs the expansion to its value in the query. */ + cmpp__bind_textn(dx->pp, q, bindNdx, arg->z, arg->n); + *paArg = arg->next; + break; + case cmpp_TT_StringAt: + case cmpp_TT_String: + case cmpp_TT_Int:{ + cmpp__bind_arg(dx, q, bindNdx, arg); + *paArg = arg->next; + break; + } + case cmpp_TT_OpNot: + case cmpp_TT_OpDefined: + case cmpp_TT_GroupParen:{ + int rv = 0; + if( 0==cmpp__arg_toBool(dx, arg, &rv, paArg) ){ + cmpp__bind_int(dx->pp, q, bindNdx, rv); + } + *paArg = arg->next; + break; + } + /* TODO? cmpp_TT_GroupParen */ + default: + cmpp_dx_err_set(dx, CMPP_RC_TYPE, + "Invalid argument type (%s) for the comparison " + "queries: %s", + cmpp_tt_cstr(arg->ttype), arg->z); + } +} + +/** + Internal helper for cmp_argOp_...(). + + Expects q to be a query with an integer in result column 0. This + steps/resets the query and applies the given comparison operator's + logic to column 0's value, placing the result of the operator in + *pResult. + + If q has no result row, a default value of 0 is assumed. 
+*/ +static void cmpp_argOp__cmp_apply(cmpp * const pp, + cmpp_argOp const * const op, + sqlite3_stmt * const q, + int * const pResult){ + if( 0==ppCode ){ + int rc = cmpp__step(pp, q, false); + assert( SQLITE_ROW==rc || ppCode ); + if( SQLITE_ROW==rc ){ + rc = sqlite3_column_int(q, 0); + }else{ + rc = 0; + } + switch( op->ttype ){ + case 0: break; + case cmpp_TT_OpEq: *pResult = 0==rc; break; + case cmpp_TT_OpNeq: *pResult = 0!=rc; break; + case cmpp_TT_OpLt: *pResult = rc<0; break; + case cmpp_TT_OpLe: *pResult = rc<=0; break; + case cmpp_TT_OpGt: *pResult = rc>0; break; + case cmpp_TT_OpGe: *pResult = rc>=0; break; + default: + cmpp__fatal("Cannot happen: invalid arg mapping"); + } + } + cmpp__stmt_reset(q); +} + +/** + Applies *paRhs as the RHS of an integer binary operator, the LHS of + which is the lhs argument. The result is put in *pResult. On + success *paRhs is set to the next argument for the expression to + parse. +*/ +static void cmpp_argOp_applyTo(cmpp_dx *dx, + cmpp_argOp const * const op, + int lhs, + cmpp_arg const ** paRhs, + int * pResult){ + sqlite3_stmt * q = 0; + cmpp_arg const * aRhs = *paRhs; + assert(aRhs); + q = cmpp_TT_Word==aRhs->ttype + ? cmpp__stmt(dx->pp, CmppStmt_cmpVD, false) + : cmpp__stmt(dx->pp, CmppStmt_cmpVV, false); + if( q ){ + char numbuf[32]; + int const nNum = snprintf(numbuf, sizeof(numbuf), "%d", lhs); + cmpp__bind_textn(dx->pp, q, 1, ustr_c(numbuf), nNum); + cmpp_argOp__cmp_bind(dx, q, 2, paRhs); + cmpp_argOp__cmp_apply(dx->pp, op, q, pResult); + } +} + +cmpp_argOp_decl(compare){ + cmpp_arg const * const vRhs = *pvRhs; + sqlite3_stmt * q = 0; + /* Select which query to use, depending on whether each + of the LHS/RHS are Word tokens. For Word tokens + the corresponding query columns get bound to + a subquery which resolves the word. Non-word + tokens get bound as-is. */ + if( cmpp_TT_Word==vLhs->ttype ){ + q = cmpp_TT_Word==vRhs->ttype + ? 
cmpp__stmt(dx->pp, CmppStmt_cmpDD, false) + : cmpp__stmt(dx->pp, CmppStmt_cmpDV, false); + if(0){ + g_warn("\nvLhs=%s %s\nvRhs=%s %s\n", + cmpp_tt_cstr(vLhs->ttype), vLhs->z, + cmpp_tt_cstr(vRhs->ttype), vRhs->z); + } + }else if( cmpp_TT_Word==vRhs->ttype ){ + q = cmpp__stmt(dx->pp, CmppStmt_cmpVD, false); + }else{ + q = cmpp__stmt(dx->pp, CmppStmt_cmpVV, false); + } + if( q ){ + //cmpp__bind_textn(pp, q, 1, vLhs->z, vLhs->n); + cmpp_argOp__cmp_bind(dx, q, 1, &vLhs); + cmpp_argOp__cmp_bind(dx, q, 2, pvRhs); + cmpp_argOp__cmp_apply(dx->pp, op, q, pResult); + } +} + +#undef cmpp_argOp_decl + +#if 0 +static inline int cmpp_dxt_isBinOp(cmpp_tt tt){ + cmpp_argOp const * const a = cmpp_argOp_for_tt(tt); + return a ? 2==a->arity : 0; +} + +static inline int cmpp_dxt_isUnaryOp(cmpp_tt tt){ + return tt==cmpp_TT_OpNot || cmpp_TT_OpDefined; +} + +static inline int cmpp_dxt_isGroup(cmpp_tt tt){ + return tt==cmpp_TT_GroupParen || tt==cmpp_TT_GroupBrace || cmpp_TT_GroupSquiggly; +} +#endif + +int cmpp__arg_evalSubToInt(cmpp_dx *dx, + cmpp_arg const *arg, + int * pResult){ + cmpp_args sub = cmpp_args_empty; + if( 0==cmpp_args_parse(dx, &sub, arg->z, arg->n, 0) ){ + cmpp__args_evalToInt(dx, &sub, pResult); + } + cmpp_args_cleanup(&sub); + return dxppCode; +} + +int cmpp__args_evalToInt(cmpp_dx * const dx, + cmpp_args const *pArgs, + int * pResult){ + if( dxppCode ) return dxppCode; + + cmpp_arg const * pNext = 0; + cmpp_arg const * pPrev = 0; + int result = *pResult; + cmpp_b osL = cmpp_b_empty; + cmpp_b osR = cmpp_b_empty; + static int level = 0; + ++level; + +#define lout(fmt,...) 
if(0) g_stderr("%.*c" fmt, level*2, ' ', __VA_ARGS__) + + //lout("START %s(): %s\n", __func__, pArgs->pimpl->buf.argsRaw.z); + for( cmpp_arg const *arg = pArgs->arg0; + arg && 0==dxppCode; + pPrev = arg, arg = pNext ){ + pNext = arg->next; + if( cmpp_TT_Noop==arg->ttype ){ + arg = pPrev /* help the following arg to DTRT */; + continue; + } + cmpp_argOp const * const thisOp = argOp(arg); + cmpp_argOp const * const nextOp = pNext ? argOp(pNext) : 0; + if( 0 ){ + lout("arg: %s @%p %s\n", + cmpp__tt_cstr(arg->ttype, true), arg, arg->z); + if(1){ + if( pPrev ) lout(" prev arg: %s %s\n", + cmpp__tt_cstr(pPrev->ttype, true), pPrev->z); + if( pNext ) lout(" next arg: %s %s\n", + cmpp__tt_cstr(pNext->ttype, true), pNext->z); + } + } + if( thisOp ){ /* Basic validation */ + if( !pNext ){ + dxserr("Missing '%s' RHS.", + cmpp__tt_cstr(thisOp->ttype, false)); + break; + }else if( !pPrev && 2==thisOp->arity ){ + dxserr("Missing %s LHS.", + cmpp__tt_cstr(thisOp->ttype, false)); + break; + } + if( nextOp && nextOp->arity>1 ){ + dxserr("Invalid '%s' RHS: %s", arg->z, pNext->z); + break; + } + } + + switch( arg->ttype ){ + + case cmpp_TT_OpNot: + case cmpp_TT_OpDefined: + if( pPrev && !argOp(pPrev) ){ + cmpp_dx_err_set(dx, CMPP_RC_CANNOT_HAPPEN, + "We expected to have consumed '%s' by " + "this point.", + pPrev->z); + }else{ + cmpp__arg_toBool(dx, arg, &result, &pNext); + } + break; + + case cmpp_TT_OpAnd: + case cmpp_TT_OpOr:{ + assert( pNext ); + assert( pPrev ); + /* Reminder to self: we can't add short-circuiting of the RHS + right now because the handling of chained unary ops on the + RHS is handled via cmpp__arg_toBool(). */ + int rv = 0; + if( 0==cmpp__arg_toBool(dx, pNext, &rv, &pNext) ){ + if( cmpp_TT_OpAnd==arg->ttype ) result = result && rv; + else result = result || rv; + } + //g_warn("post-and/or pNext=%s\n", pNext ? 
pNext->z : 0); + break; + } + + case cmpp_TT_OpNotGlob: + case cmpp_TT_OpGlob:{ + assert( pNext ); + assert( pPrev ); + assert( pNext!=arg ); + assert( pPrev!=arg ); + if( cmpp_arg_to_b(dx, pNext, &osL, 0) ){ + break; + } + unsigned char const * const zGlob = osL.z; + if( 0==cmpp_arg_to_b(dx, pPrev, &osR, 0) ){ + if( 0 ){ + g_warn("zGlob=[%s] z=[%s]", zGlob, osR.z); + } + result = 0==sqlite3_strglob((char const *)zGlob, + (char const *)osR.z); + if( cmpp_TT_OpNotGlob==arg->ttype ){ + result = !result; + } + //g_warn("\nzGlob=%s\nz=%s\nresult=%d", zGlob, z, result); + } + pNext = pNext->next; + break; + } + +#define E(NAME) case cmpp_TT_Op ## NAME: + cmpp_argOps_cmp_map(E) { + cmpp_argOp const * const prevOp = pPrev ? argOp(pPrev) : 0; + if( prevOp ){ + /* Chained operators */ + cmpp_argOp_applyTo(dx, thisOp, result, &pNext, &result); + }else{ + assert( pNext ); + assert( pPrev ); + assert( thisOp ); + assert( thisOp->xCall ); + thisOp->xCall(dx, thisOp, pPrev, &pNext, &result); + } + break; + } +#undef E + +#define checkConsecutiveNonOps \ + if( pPrev && !argOp(pPrev) ){ \ + dxserr("Illegal consecutive non-operators: %s %s", \ + pPrev->z, arg->z); \ + break; \ + }(void)0 + + case cmpp_TT_Int: + case cmpp_TT_String: + checkConsecutiveNonOps; + if( !cmpp__is_int(arg->z, arg->n, &result) ){ + /* This is mostly for and/or ops. glob will reach back and + grab arg->z. 
*/ + result = 0; + } + break; + case cmpp_TT_Word: + checkConsecutiveNonOps; + cmpp__get_int(dx->pp, arg->z, arg->n, &result); + break; + case cmpp_TT_GroupParen:{ + checkConsecutiveNonOps; + cmpp_args sub = cmpp_args_empty; + if( 0==cmpp_args_parse(dx, &sub, arg->z, arg->n, 0) ){ + cmpp__args_evalToInt(dx, &sub, &result); + } + cmpp_args_cleanup(&sub); + break; + } + case cmpp_TT_GroupBrace:{ + checkConsecutiveNonOps; + cmpp_b b = cmpp_b_empty; + if( 0==cmpp_call_str(dx->pp, arg->z, arg->n, &b, 0) ){ + cmpp__is_int(b.z, b.n, &result); + } + cmpp_b_clear(&b); + break; + } +#undef checkConsecutiveNonOps + default: + assert( arg->z ); + dxserr("Illegal expression token %s: %s", + cmpp__tt_cstr(arg->ttype, true), arg->z); + }/*switch(arg->ttype)*/ + }/* foreach arg */ + if( 0 ){ + lout("END %s() result=%d\n", __func__, result); + } + --level; + if( !dxppCode ){ + *pResult = result; + } + cmpp_b_clear(&osL); + cmpp_b_clear(&osR); + return dxppCode; +#undef lout +} + +#undef argOp +#undef cmpp_argOp_decl + +static inline cmpp_tt cmpp_dxt_is_group(cmpp_tt ttype){ + switch(ttype){ + case cmpp_TT_GroupParen: + case cmpp_TT_GroupBrace: + case cmpp_TT_GroupSquiggly: + return ttype; + default: + return cmpp_TT_None; + } +} + +int cmpp_args_parse(cmpp_dx * const dx, + cmpp_args * const pArgs, + unsigned char const * const zInBegin, + cmpp_ssize_t nIn, + cmpp_flag32_t flags){ + assert( zInBegin ); + unsigned char const * const zInEnd = + zInBegin + cmpp__strlenu(zInBegin, nIn); + + if( cmpp_args__init(dx->pp, pArgs) ) return dxppCode; + if( 0 ){ + g_warn("whole input = <<%.*s>>", (int)(zInEnd-zInBegin), + zInBegin); + } + unsigned char const * zPos = zInBegin; + cmpp_size_t const nBuffer = + /* Buffer size for our copy of the args. We need to know the + size before we start so that we can have each arg reliably + point back into this without it being reallocated during + parsing. 
*/ + (cmpp_size_t)(zInEnd - zInBegin) + /* Plus we need one final NUL and one NUL byte per argument, but + we don't yet know how many arguments we will have, so let's + estimate... */ + + ((cmpp_size_t)(zInEnd - zInBegin))/3 + + 5/*fudge room*/; + cmpp_b * const buffer = &pArgs->pimpl->argOut; + assert( !buffer->n ); + if( cmpp_b_reserve3(dx->pp, buffer, nBuffer) ){ + return dxppCode; + } + unsigned char * zOut = buffer->z; + unsigned char const * const zOutEnd = zOut + buffer->nAlloc - 1; + cmpp_arg * prevArg = 0; +#if !defined(NDEBUG) + unsigned char const * const zReallocCheck = buffer->z; +#endif + + if(0) g_warn("pre-parsed line: %.*s", (zInEnd - zInBegin), + zInBegin); + pArgs->arg0 = NULL; + pArgs->argc = 0; + for( int i = 0; zPospp, &pArgs->pimpl->argli); + if( !arg ) return dxppCode; + assert( pArgs->pimpl->argli.n ); + if( 0 ) g_warn("zPos=<<%.*s>>", (int)(zInEnd-zPos), zPos); + if( cmpp_arg_parse(dx, arg, &zPos, zInEnd, &zOut, zOutEnd) ){ + if( 0 ) g_warn("zPos=<<%.*s>>", (int)(zInEnd-zPos), zPos); + break; + } + if( 0 ){ + g_warn("#%d zPos=<<%.*s>>", i, (int)(zInEnd-zPos), zPos); + g_warn("#%d arg n=%u z=<<%.*s>> %s", i, (int)arg->n, (int)arg->n, arg->z, arg->z); + } + assert( zPos<=zInEnd ); + if( 0 ){ + g_stderr("ttype=%d %s n=%u z=%.*s\n", arg->ttype, + cmpp__tt_cstr(arg->ttype, true), + (unsigned)arg->n, (int)arg->n, arg->z); + } + if( cmpp_TT_Eof==arg->ttype ){ + CmppArgList_unappend(&pArgs->pimpl->argli); + break; + } + switch( 0==(flags & cmpp_args_F_NO_PARENS) + ? cmpp_dxt_is_group( arg->ttype ) + : 0 ){ + case cmpp_TT_GroupParen:{ + /* Sub-expression. We tokenize it here just to ensure that we + can, so we can fail earlier rather than later. This is why + we need a recycler for the cmpp_args buffer memory. 
*/ + cmpp_args sub = cmpp_args_empty; + cmpp_args_parse(dx, &sub, arg->z, arg->n, flags); + //g_stderr("Parsed sub-expr: %s\n", sub.buffer.z); + cmpp_args_cleanup(&sub); + break; + } + case cmpp_TT_GroupBrace: + case cmpp_TT_GroupSquiggly: + default: break; + } + if( dxppCode ) break; + if( prevArg ){ + assert( !prevArg->next ); + prevArg->next = arg; + } + prevArg = arg; + }/*foreach input char*/ + //g_stderr("rc=%s argc=%d\n", cmpp_rc_cstr(dxppCode), pArgs->args.n); + if( 0==dxppCode ){ + pArgs->argc = pArgs->pimpl->argli.n; + assert( !pArgs->arg0 ); + if( pArgs->argc ) pArgs->arg0 = pArgs->pimpl->argli.list; + if( zOutarg0; a; a = a->next ){ + g_stderr(" got: %s %.*s\n", cmpp__tt_cstr(a->ttype, true), + a->n, a->z); + } + } + } + assert(zReallocCheck==buffer->z + && "Else buffer was reallocated, invalidating argN->z"); + return dxppCode; +} + +CMPP__EXPORT(int, cmpp_args_clone)(cmpp *pp, cmpp_arg const * const a0, + cmpp_args * const dest){ + if( cmpp_args__init(pp, dest) || !a0 ) return ppCode; + cmpp_b * const ob = &dest->pimpl->argOut; + CmppArgList * const argli = &dest->pimpl->argli; + unsigned int i = 0; + cmpp_size_t nReserve = 0 /* arg buffer mem to preallocate */; + + assert( !ob->n ); + assert( !dest->arg0 ); + assert( !dest->argc ); + assert( !argli->n ); + + /* Preallocate ob->z to fit a copy of a0's args. 
*/ + for( cmpp_arg const * a = a0; a; ++i, a = a->next ){ + nReserve += a->n + 1/*NUL byte*/; + } + if( cmpp_b_reserve3(pp, ob, nReserve+1) + || CmppArgList_reserve(pp, argli, i) ){ + goto end; + } + assert( argli->nAlloc>=i ); + i = 0; +#ifndef NDEBUG + unsigned char const * const zReallocCheck = ob->z; +#endif + for( cmpp_arg const * a = a0; a; ++i, a = a->next ){ + cmpp_arg * const aNew = &argli->list[i]; + aNew->n = a->n; + aNew->z = ob->z + ob->n; + aNew->ttype = a->ttype; + if( i ) argli->list[i-1].next = aNew; + assert( !a->z[a->n] && "Expecting a NUL byte there" ); + cmpp_b_append4(pp, ob, a->z, a->n+1/*NUL byte*/); + if( 0 ){ + g_warn("arg#%d=%s <<<%.*s>>> %s", i, cmpp_tt_cstr(a->ttype), + (int)a->n, a->z, a->z); + } + assert( zReallocCheck==ob->z + && "This cannot fail: ob->z was pre-allocated" ); + } + dest->argc = i; + dest->arg0 = i ? &argli->list[0] : 0; +end: + if( ppCode ){ + cmpp_args_reuse(dest); + } + return ppCode; +} + +CMPP__EXPORT(int, cmpp_dx_args_clone)(cmpp_dx * dx, cmpp_args *pOut){ + return cmpp_args_clone(dx->pp, dx->args.arg0, pOut); +} + +char * cmpp_arg_strdup(cmpp *pp, cmpp_arg const *arg){ + char * z = 0; + if( 0==ppCode ){ + z = sqlite3_mprintf("%s",arg->z); + cmpp_check_oom(pp, z); + } + return z; +} + +static cmpp_tt cmpp_tt_forWord(unsigned char const *z, unsigned n, + cmpp_tt dflt){ + static const struct { +#define E(NAME,STR) struct CmppSnippet NAME; + cmpp_tt_map(E) +#undef E + } ttStr = { +#define E(NAME,STR) \ + .NAME = {(unsigned char const *)STR,sizeof(STR)-1}, + cmpp_tt_map(E) +#undef E + }; +#define CASE(NAME) if( 0==memcmp(ttStr.NAME.z, z, n) ) return cmpp_TT_ ## NAME + switch( n ){ + case 1: + CASE(OpEq); + CASE(Plus); + CASE(Minus); + break; + case 2: + CASE(OpOr); + CASE(ShiftL); + //CASE(ShiftR); + //CASE(ArrowL); + CASE(ArrowR); + CASE(OpNeq); + CASE(OpLt); + CASE(OpLe); + CASE(OpGt); + CASE(OpGe); + break; + case 3: + CASE(OpAnd); + CASE(OpNot); + CASE(ShiftL3); + break; + case 4: + CASE(OpGlob); + break; + case 
7: + CASE(OpDefined); + break; +#undef CASE + } +#if 0 + bool b = cmpp__is_int(z, n, NULL); + if( 1|| !b ){ + g_warn("is_int(%s)=%d", z, b); + } + return b ? cmpp_TT_Int : dflt; +#else + return cmpp__is_int(z, n, NULL) ? cmpp_TT_Int : dflt; +#endif +} + +int cmpp_arg_parse(cmpp_dx * const dx, cmpp_arg *pOut, + unsigned char const **pzIn, + unsigned char const *zInEnd, + unsigned char ** pzOut, + unsigned char const * zOutEnd){ + unsigned char const * zi = *pzIn; + unsigned char * zo = *pzOut; + cmpp_tt ttype = cmpp_TT_None; + +#if 0 + // trying to tickle valgrind + for(unsigned char const *x = zi; x < zInEnd; ++x ){ + assert(*x); + } +#endif + cmpp_arg_reuse( pOut ); + cmpp_skip_snl( &zi, zInEnd ); + if( zi>=zInEnd ){ + *pzIn = zi; + pOut->ttype = cmpp_TT_Eof; + return 0; + } +#define out(CH) if(zo>=zOutEnd) goto notEnoughOut; *zo++ = CH +#define eot_break if( cmpp_TT_None!=ttype ){ keepGoing = 0; break; } (void)0 + pOut->z = zo; + bool keepGoing = true; + for( ; keepGoing + && 0==dxppCode + && zi'==zi[1] ){ + ttype = cmpp_TT_ArrowR; + out(*zi++); + out(*zi++); + keepGoing = false; + }else{ + goto do_word; + } + break; + case '=': + eot_break; keepGoing = false; ttype = cmpp_TT_OpEq; out(*zi++); break; +#define opcmp(CH,TT,TTEQ,TTSHIFT,TTARROW) \ + case CH: eot_break; keepGoing = false; ttype = TT; out(*zi++); \ + if( zi',cmpp_TT_OpGt,cmpp_TT_OpGe,cmpp_TT_ShiftR,0) break; + opcmp('<',cmpp_TT_OpLt,cmpp_TT_OpLe,cmpp_TT_ShiftL,cmpp_TT_ArrowL) + if( cmpp_TT_ShiftL==ttype && zi= zInEnd || ('"'!=zi[1] && '\''!=zi[1]) ){ + goto do_word; + } + //if( cmpp__StringAtIsOk(dx->pp) ) break; + ttOverride = cmpp_TT_StringAt; + ++zi /* consume opening '@' */; + //g_stderr("@-string override\n"); + /* fall through */ + case '"': + case '\'': { + /* Parse a string. We do not support backslash-escaping of any + sort here. Strings which themselves must contain quotes + should use the other quote type. 
*/ + keepGoing = false; + if( cmpp_TT_None!=ttype ){ + cmpp_dx_err_set(dx, CMPP_RC_SYNTAX, + "Misplaced quote character near: %.*s", + (int)(zi+1 - *pzIn), *pzIn); + break; + } + unsigned char const * zQuoteAt = zi; + if( cmpp__find_closing(dx->pp, &zQuoteAt, zInEnd) ){ + break; + } + assert( zi+1 <= zQuoteAt ); + assert( *zi == *zQuoteAt ); + if( (zQuoteAt - zi - 2) >= (zOutEnd-zo) ){ + goto notEnoughOut; + } + memcpy(zo, zi+1, zQuoteAt - zi - 1); + //g_warn("string=<<%.*s>>", (zQuoteAt-zi-1), zo); + zo += zQuoteAt - zi - 1; + zi = zQuoteAt + 1/* closing quote */; + ttype = (cmpp_TT_None==ttOverride ? cmpp_TT_String : ttOverride); + break; + } + case '[': + case '{': + case '(': { + /* Slurp these as a single token for later sub-parsing */ + keepGoing = false; + unsigned char const * zAt = zi; + if( cmpp__find_closing(dx->pp, &zi, zInEnd) ) break; + /* Transform the output, eliding the open/close characters and + trimming spaces. We need to keep newlines intact, as the + content may be free-form, intended for other purposes, e.g. + the #pipe or #query directives. */ + ttype = ('('==*zAt + ? cmpp_TT_GroupParen + : ('['==*zAt + ? cmpp_TT_GroupBrace + : cmpp_TT_GroupSquiggly)); + ++zAt /* consume opening brace */; + /* Trim leading and trailing space, but retain tabs and all but + the first and last newline. 
*/ + cmpp_skip_space(&zAt, zi); + if( zAt*pzOut && ' '==zo[-1] ) *--zo = 0; + if( zo>*pzOut && '\n'==zo[-1] ){ + *--zo = 0; + if( zo>*pzOut && '\r'==zo[-1] ){ + *--zo = 0; + } + } + ++zi /* consume the closer */; + break; + } + default: + ; do_word: + out(*zi++); + ttype = cmpp_TT_Word; + break; + } + //g_stderr("kg=%d char=%d %c\n", keepGoing, (int)*zi, *zi); + } + if( dxppCode ){ + /* problem already reported */ + }else if( zo>=zOutEnd-1 ){ + notEnoughOut: + cmpp_dx_err_set(dx, CMPP_RC_RANGE, + "Ran out of output space (%u bytes) while " + "parsing an argument", (unsigned)(zOutEnd-*pzOut)); + }else{ + pOut->n = (zo - *pzOut); + if( cmpp_TT_None==ttype ){ + pOut->ttype = cmpp_TT_Eof; + }else if( cmpp_TT_Word==ttype && pOut->n ){ + pOut->ttype = cmpp_tt_forWord(pOut->z, pOut->n, ttype); + }else{ + pOut->ttype = ttype; + } + *zo++ = 0; + *pzIn = zi; + *pzOut = zo; + switch( pOut->ttype ){ + case cmpp_TT_Int: + if( '+'==*pOut->z ){ /* strip leading + */ + ++pOut->z; + --pOut->n; + } + break; + default: + break; + } + if(0){ + g_stderr("parse1: %s n=%u <<%.*s>>", + cmpp__tt_cstr(pOut->ttype, true), pOut->n, + pOut->n, pOut->z); + } + } +#undef out +#undef eot_break + return dxppCode; +} + +int cmpp__arg_toBool(cmpp_dx * const dx, cmpp_arg const *arg, + int * pResult, cmpp_arg const **pNext){ + switch( dxppCode ? 
0 : arg->ttype ){ + case 0: break; + + case cmpp_TT_Word: + *pNext = arg->next; + *pResult = cmpp__get_bool(dx->pp, arg->z, arg->n); + break; + + case cmpp_TT_Int: + *pNext = arg->next; + cmpp__is_int(arg->z, arg->n, pResult)/*was already validated*/; + break; + + case cmpp_TT_String: + case cmpp_TT_StringAt:{ + unsigned char const * z = 0; + cmpp_size_t n = 0; + cmpp_b os = cmpp_b_empty; + if( 0==cmpp__arg_expand_ats(dx, &os, cmpp_atpol_CURRENT, + arg, cmpp_TT_StringAt, &z, &n) ){ + *pNext = arg->next; + *pResult = n>0 && 0!=memcmp("0\0", z, 2); + } + cmpp_b_clear(&os); + break; + } + + case cmpp_TT_GroupParen:{ + *pNext = arg->next; + cmpp_args sub = cmpp_args_empty; + if( 0==cmpp_args_parse(dx, &sub, arg->z, arg->n, 0) ){ + cmpp__args_evalToInt(dx, &sub, pResult); + } + cmpp_args_cleanup(&sub); + break; + } + + case cmpp_TT_OpDefined: + if( !arg->next ){ + dxserr("Missing '%s' RHS.", arg->z); + }else if( cmpp_TT_Word!=arg->next->ttype ){ + dxserr( "Invalid '%s' RHS: %s", arg->z, arg->next->z); + }else{ + cmpp_arg const * aOperand = arg->next; + *pNext = aOperand->next; + if( aOperand->n>1 + && '#'==aOperand->z[0] + && !!cmpp__d_search3(dx->pp, (char const*)aOperand->z+1, + cmpp__d_search3_F_NO_DLL) ){ + *pResult = 1; + }else{ + *pResult = cmpp_has(dx->pp, (char const *)aOperand->z, + aOperand->n); + } + } + break; + + case cmpp_TT_OpNot:{ + assert( arg->next && "See cmpp_args__not_simplify()"); + assert( cmpp_TT_OpNot!=arg->next->ttype && "See cmpp_args__not_simplify()"); + if( 0==cmpp__arg_toBool(dx, arg->next, pResult, pNext) ){ + *pResult = !*pResult; + } + break; + } + + default: + dxserr("Invalid token type %s for %s(): %s", + cmpp__tt_cstr(arg->ttype, true), __func__, arg->z); + break; + } + return dxppCode; +} + +CMPP__EXPORT(int, cmpp_arg_to_b)(cmpp_dx * const dx, cmpp_arg const *arg, + cmpp_b * ob, cmpp_flag32_t flags){ + /** + Reminder to self: this function specifically does not do any + expression evaluation of its arguments. 
Please avoid the + temptation to make it do so. Unless it proves necessary. Or + useful. Even then, though, consider the implications deeply + before doing so. + */ + switch( dxppCode + ? 0 + : ((cmpp_arg_to_b_F_FORCE_STRING & flags) + ? cmpp_TT_String : arg->ttype) ){ + + case 0: + break; + case cmpp_TT_Word: + if( 0==(flags & cmpp_arg_to_b_F_NO_DEFINES) ){ + cmpp__get_b(dx->pp, arg->z, arg->n, ob, true); + break; + } + goto theDefault; + case cmpp_TT_StringAt:{ + unsigned char const * z = 0; + cmpp_size_t n = 0; + if( 0 ){ + g_warn("ob->z [%.*s] [%s]", (int)ob->n, ob->z, ob->z); + } + if( 0==cmpp__arg_expand_ats(dx, ob, cmpp_atpol_CURRENT, arg, + cmpp_TT_StringAt, &z, &n) + && 0 ){ + g_warn("expanded at [%.*s] [%s]", (int)n, z, z); + g_warn("ob->z [%.*s] [%s]", (int)ob->n, ob->z, ob->z); + } + break; + } + case cmpp_TT_GroupBrace: + if( !(cmpp_arg_to_b_F_NO_BRACE_CALL & flags) + && (cmpp_arg_to_b_F_BRACE_CALL & flags) ){ + cmpp_call_str(dx->pp, arg->z, arg->n, ob, 0); + break; + } + /* fall through */ + default: { + theDefault: ; + cmpp_outputer oss = cmpp_outputer_b; + oss.state = ob;//no: cmpp_b_reuse(ob); Append instead. 
+ cmpp__out2(dx->pp, &oss, arg->z, arg->n); + break; + } + } + return dxppCode; +} + +int cmpp__bind_arg(cmpp_dx * const dx, sqlite3_stmt * const q, + int bindNdx, cmpp_arg const * const arg){ + + if( 0 ){ + g_warn("bind #%d %s <<%.*s>>", bindNdx, + cmpp__tt_cstr(arg->ttype, true), + (int)arg->n, arg->z); + } + switch( arg->ttype ){ + default: + case cmpp_TT_Int: + case cmpp_TT_String: + cmpp__bind_textn(dx->pp, q, bindNdx, arg->z, (int)arg->n); + break; + + case cmpp_TT_Word: + case cmpp_TT_StringAt:{ + cmpp_b os = cmpp_b_empty; + if( 0==cmpp_arg_to_b(dx, arg, &os, 0) ){ + if( 0 ){ + g_warn("bind #%d <<%s>> => <<%.*s>>", + bindNdx, arg->z, (int)os.n, os.z); + } + cmpp__bind_textn(dx->pp, q, bindNdx, os.z, (int)os.n); + } + cmpp_b_clear(&os); + break; + } + + case cmpp_TT_GroupParen:{ + cmpp_args sub = cmpp_args_empty; + int i = 0; + if( 0==cmpp_args_parse(dx, &sub, arg->z, arg->n, 0) + && 0==cmpp__args_evalToInt(dx, &sub, &i) ){ + /* See comment above about cmpp_TT_Int. */ + cmpp__bind_int_text(dx->pp, q, bindNdx, i); + } + cmpp_args_cleanup(&sub); + break; + } + + case cmpp_TT_GroupBrace:{ + cmpp_b b = cmpp_b_empty; + cmpp_call_str(dx->pp, arg->z, arg->n, &b, 0); + cmpp__bind_textn(dx->pp, q, bindNdx, b.z, b.n); + cmpp_b_clear(&b); + break; + } + + } + return dxppCode; +} + +/** + If a is in li->list, return its non-const pointer from li->list + (O(1)), else return NULL. +*/ +static cmpp_arg * CmppArgList_arg_nc(CmppArgList *li, cmpp_arg const * a){ + if( li->nAlloc && a>=li->list && a<(li->list + li->nAlloc) ){ + return li->list + (a - li->list); + } + return NULL; +} + +/** + To be called only by cmpp_dx_args_parse() and only if the current + directive asked for it via cmpp_d::flags cmpp_d_F_NOT_SIMPLIFY. + + Filter chains of "not" operators from pArgs, removing unnecessary + ones. Also collapse "not glob" into a single cmpp_TT_OpNotGlob argument. + Performs some basic validation as well to simplify downstream + operations. 
Returns p->err.code and is a no-op if that's set + before this is called. +*/ +static int cmpp_args__not_simplify(cmpp * const pp, cmpp_args *pArgs){ + cmpp_arg * pPrev = 0; + cmpp_arg * pNext = 0; + CmppArgList * const ali = &pArgs->pimpl->argli; + pArgs->argc = 0; + for( cmpp_arg * arg = ali->n ? &ali->list[0] : NULL; + arg && !ppCode; + pPrev=arg, arg = pNext ){ + pNext = CmppArgList_arg_nc(ali, arg->next); + assert( pNext || !arg->next ); + if( cmpp_TT_OpNot==arg->ttype ){ + if( !pNext ){ + serr("Missing '%s' RHS", arg->z); + break; + } + cmpp_argOp const * const nop = cmpp_argOp_for_tt(pNext->ttype); + if( nop && nop->arity>1 && cmpp_TT_OpGlob!=nop->ttype ){ + serr("Illegal '%s' RHS: binary '%s' operator", + arg->z, pNext->z); + break; + } + int bNeg = 1; + if( '!'==*arg->z ){ + /* odd number of ! == negate */ + bNeg = arg->n & 1; + } + while( pNext && cmpp_TT_OpNot==pNext->ttype ){ + bNeg = !bNeg; + arg->next = pNext = CmppArgList_arg_nc(ali, pNext->next); + } + if( pNext && cmpp_TT_OpGlob==pNext->ttype ){ + /* Transform it to a cmpp_TT_OpNotGlob or cmpp_TT_OpGlob. */ + assert( pNext->z > arg->z + arg->n ); + arg->n = pNext->z + pNext->n - arg->z; + arg->next = pNext->next; + arg->ttype = bNeg + ? cmpp_TT_OpNotGlob + : pNext->ttype; + ++pArgs->argc; + }else if( pPrev ){ + if( bNeg ){ + ++pArgs->argc; + }else{ + /* Snip this node out. */ + pPrev->next = pNext; + } + }else{ + assert( 0==pArgs->argc ); + ++pArgs->argc; + if( !bNeg ){ + arg->ttype = cmpp_TT_Noop; + } + } + /* Potential bug in waiting/fixme: by eliding all nots we are + ** changing the behavior from forced coercion to bool to + ** coercion to whatever the LHS wants. */ + }else{ + ++pArgs->argc; + } + } + pArgs->arg0 = pArgs->argc ? 
&ali->list[0] : NULL; + return ppCode; +} + +CMPP__EXPORT(int, cmpp_dx_args_parse)(cmpp_dx *dx, + cmpp_args *args){ + if( !dxppCode + && 0==cmpp_args_parse(dx, args, dx->args.z, dx->args.nz, + cmpp_args_F_NO_PARENS) + && (cmpp_d_F_NOT_SIMPLIFY & dx->d->flags) ){ + cmpp_args__not_simplify(dx->pp, args); + } + return dxppCode; +} + +/* Helper for cmpp_kav_each() and friends. */ +static +int cmpp__each_parse_args(cmpp_dx *dx, + cmpp_args *args, + unsigned char const *zBegin, + cmpp_ssize_t nz, + cmpp_flag32_t flags){ + if( 0==cmpp_args_parse(dx, args, zBegin, nz, cmpp_args_F_NO_PARENS) ){ + if( !args->argc + && (cmpp_kav_each_F_NOT_EMPTY & flags) ){ + cmpp_err_set(dx->pp, CMPP_RC_RANGE, + "Empty list is not permitted here."); + } + } + return dxppCode; +} + +/* Helper for cmpp_kav_each() and friends. */ +static +int cmpp__each_paren_expr(cmpp_dx *dx, cmpp_arg const * arg, + unsigned char * pOut, size_t nOut){ + cmpp_args sub = cmpp_args_empty; + int rc = cmpp_args_parse(dx, &sub, arg->z, arg->n, 0); + if( 0==rc ){ + int d = 0; + rc = cmpp__args_evalToInt(dx, &sub, &d); + if( 0==rc ){ + snprintf((char *)pOut, nOut, "%d", d); + } + } + cmpp_args_cleanup(&sub); + return rc; +} + +CMPP__EXPORT(int, cmpp_kav_each)(cmpp_dx *dx, + unsigned char const *zBegin, + cmpp_ssize_t nIn, + cmpp_kav_each_f callback, + void *callbackState, + cmpp_flag32_t flags){ + if( dxppCode ) return dxppCode; + /* Reminder to self: we cannot reuse internal buffers here because a + callback could recurse into this or otherwise use APIs which use + those same buffers. 
*/ + cmpp_b bKey = cmpp_b_empty; + cmpp_b bVal = cmpp_b_empty; + bool const reqArrow = 0==(cmpp_kav_each_F_NO_ARROW & flags); + cmpp_args args = cmpp_args_empty; + unsigned char exprBuf[32] = {0}; + cmpp_size_t const nz = cmpp__strlenu(zBegin,nIn); + unsigned char const * const zEnd = zBegin + nz; + cmpp_flag32_t a2bK = 0, a2bV = 0 /*cmpp_arg_to_b() flags*/; + assert( zBegin ); + assert( zEnd ); + assert( zEnd>=zBegin ); + + if( cmpp__each_parse_args(dx, &args, zBegin, nz, flags) ){ + goto cleanup; + }else if( reqArrow && 0!=args.argc%3 ){ + cmpp_err_set(dx->pp, CMPP_RC_RANGE, + "Expecting a list of 3 tokens per entry: " + "KEY -> VALUE"); + }else if( !reqArrow && 0!=args.argc%2 ){ + cmpp_err_set(dx->pp, CMPP_RC_RANGE, + "Expecting a list of 2 tokens per entry: " + "KEY VALUE"); + } + if( cmpp_kav_each_F_CALL_KEY & flags ){ + a2bK |= cmpp_arg_to_b_F_BRACE_CALL; + flags |= cmpp_kav_each_F_EXPAND_KEY; + } + if( cmpp_kav_each_F_CALL_VAL & flags ){ + a2bV |= cmpp_arg_to_b_F_BRACE_CALL; + flags |= cmpp_kav_each_F_EXPAND_VAL; + } + cmpp_arg const * aNext = 0; + for( cmpp_arg const * aKey = args.arg0; + !dxppCode && aKey; + aKey = aNext ){ + aNext = aKey->next; + cmpp_arg const * aVal = aKey->next; + if( !aVal ){ + dxserr("Expecting %s after key '%s'.", + (reqArrow ? "->" : "a value"), + aKey->z); + break; + } + if( reqArrow ){ + if( cmpp_TT_ArrowR!=aVal->ttype ){ + dxserr("Expecting -> after key '%s'.", aKey->z); + break; + } + aVal = aVal->next; + if( !aVal ){ + dxserr("Expecting a value after '%s' ->.", aKey->z); + break; + } + } + //g_warn("\nkey=[%s]\nval=[%s]", aKey->z, aVal->z); + /* Expand the key/value parts if needed... 
*/ + unsigned char const *zKey; + unsigned char const *zVal; + cmpp_size_t nKey, nVal; + if( cmpp_kav_each_F_EXPAND_KEY & flags ){ + if( cmpp_arg_to_b(dx, aKey, cmpp_b_reuse(&bKey), + a2bK) ){ + break; + } + zKey = bKey.z; + nKey = bKey.n; + }else{ + zKey = aKey->z; + nKey = aKey->n; + } + if( cmpp_TT_GroupParen==aVal->ttype + && (cmpp_kav_each_F_PARENS_EXPR & flags) ){ + if( cmpp__each_paren_expr(dx, aVal, &exprBuf[0], + sizeof(exprBuf)-1) ){ + break; + } + zVal = &exprBuf[0]; + nVal = cmpp__strlenu(zVal, -1); + }else if( cmpp_kav_each_F_EXPAND_VAL & flags ){ + if( cmpp_arg_to_b(dx, aVal, cmpp_b_reuse(&bVal), + a2bV) ){ + break; + } + zVal = bVal.z; + nVal = bVal.n; + }else{ + zVal = aVal->z; + nVal = aVal->n; + } + aNext = aVal->next; + if( 0!=callback(dx, zKey, nKey, zVal, nVal, callbackState) ){ + break; + } + } +cleanup: + cmpp_b_clear(&bKey); + cmpp_b_clear(&bVal); + cmpp_args_cleanup(&args); + return dxppCode; +} + +CMPP__EXPORT(int, cmpp_str_each)(cmpp_dx *dx, + unsigned char const *zBegin, + cmpp_ssize_t nIn, + cmpp_kav_each_f callback, void *callbackState, + cmpp_flag32_t flags){ + g_warn0("UNTESTED!"); + if( dxppCode ) return dxppCode; + /* Reminder to self: we cannot reuse internal buffers here because a + callback could recurse into this or otherwise use APIs which use + those same buffers. */ + cmpp_b ob = cmpp_b_empty; + cmpp_args args = cmpp_args_empty; + unsigned char exprBuf[32] = {0}; + cmpp_size_t const nz = cmpp__strlenu(zBegin,nIn); + unsigned char const * const zEnd = zBegin + nz; + assert( zBegin ); + assert( zEnd ); + assert( zEnd>=zBegin ); + + if( cmpp__each_parse_args(dx, &args, zBegin, nz, flags) ){ + goto cleanup; + } + cmpp_arg const * aNext = 0; + for( cmpp_arg const * arg = args.arg0; + !dxppCode && arg; + arg = aNext ){ + aNext = arg->next; + //g_warn("\nkey=[%s]\nval=[%s]", arg->z, aVal->z); + /* Expand the key/value parts if needed... 
*/ + unsigned char const *zVal; + cmpp_size_t nVal; + if( cmpp_TT_GroupParen==arg->ttype + && (cmpp_kav_each_F_PARENS_EXPR & flags) ){ + if( cmpp__each_paren_expr(dx, arg, &exprBuf[0], + sizeof(exprBuf)-1) ){ + break; + } + zVal = &exprBuf[0]; + nVal = cmpp__strlenu(zVal, -1); + }else if( cmpp_kav_each_F_EXPAND_VAL & flags ){ + if( cmpp_arg_to_b(dx, arg, cmpp_b_reuse(&ob), 0) ){ + break; + } + zVal = ob.z; + nVal = ob.n; + }else{ + zVal = arg->z; + nVal = arg->n; + } + if( 0!=callback(dx, arg->z, arg->n, zVal, nVal, callbackState) ){ + break; + } + } +cleanup: + cmpp_b_clear(&ob); + cmpp_args_cleanup(&args); + return dxppCode; +} + +/** + Returns true if z _might_ be a cmpp_TT_StringAt, else false. It may have + false positives but won't have false negatives. + + This is only intended to be used on NUL-terminated strings, not a + pointer into a cmpp input source. +*/ +static bool cmpp__might_be_atstring(unsigned char const *z){ + char const * const x = strchr((char const *)z, '@'); + return x && !!strchr(x+1, '@'); +} + +int cmpp__arg_expand_ats(cmpp_dx const * const dx, + cmpp_b * os, + cmpp_atpol_e atPolicy, + cmpp_arg const * const arg, + cmpp_tt thisTtype, + unsigned char const **pExp, + cmpp_size_t * nExp){ + assert( os ); + cmpp_b_reuse(os); + if( 0==dxppCode + && (cmpp_TT_AnyType==thisTtype || thisTtype==arg->ttype) + && cmpp__might_be_atstring(arg->z) + && 0==cmpp__StringAtIsOk(dx->pp, atPolicy) ){ +#if 0 + if( !os->nAlloc ){ + cmpp_b_reserve3(os, 128); + } +#endif + cmpp_outputer oos = cmpp_outputer_b; + oos.state = os; + assert( !os->n ); + if( !cmpp_dx_out_expand(dx, &oos, arg->z, arg->n, + atPolicy ) ){ + *pExp = os->z; + if( nExp ) *nExp = os->n; + if( 0 ){ + g_warn("os->n=%u os->z=[%.*s]\n", os->n, (int)os->n, + os->z); + } + + } + }else if( !dxppCode ){ + *pExp = arg->z; + if( nExp ) *nExp = arg->n; + } + return dxppCode; +} + +bool cmpp__arg_wordIsPathOrFlag( + cmpp_arg const * const arg +){ + return cmpp_TT_Word==arg->ttype + && 
('-'==(char)arg->z[0] + || strchr((char*)arg->z, '.') + || strchr((char*)arg->z, '-') + || strchr((char*)arg->z, '/') + || strchr((char*)arg->z, '\\')); +} +/* +** 2022-11-12: +** +** The author disclaims copyright to this source code. In place of +** a legal notice, here is a blessing: +** +** * May you do good and not evil. +** * May you find forgiveness for yourself and forgive others. +** * May you share freely, never taking more than you give. +** +************************************************************************ +** This file houses the cmpp_popen() pieces. +*/ +#if !defined(_POSIX_C_SOURCE) +# define _POSIX_C_SOURCE 200809L /* for fdopen() in stdio.h */ +#endif +#include + +const cmpp_popen_t cmpp_popen_t_empty = cmpp_popen_t_empty_m; + +#if CMPP_PLATFORM_IS_UNIX +#include +static int cmpp__err_errno(cmpp *pp, int errNo, char const *zContext){ + return cmpp_err_set(pp, cmpp_errno_rc(errNo, CMPP_RC_ERROR), + "errno #%d: %s", errNo, zContext); +} +#endif + +/** + Uses fork()/exec() to run a command in a separate process and open + a two-way stream to it. + + If azCmd is NULL then zCmd must contain the command to run and + any flags. It is passed as the 4th argument to + execl("/bin/sh", "/bin/sh", "-c", zCmd, NULL). + + If azCmd is not NULL then it must be suitable for use as the 2nd + argument to execv(2). execv(X, azCmd) is used in this case, where + X is (zCmd ? zCmd : azCmd[0]). + + Flags: + + - cmpp_popen_F_DIRECT: if azCmd is NULL and flags has this bit set then + zCmd is instead passed to execl(zCmd, zCmd, NULL). That can only + work if zCmd is a single command without arguments. + cmpp_popen_F_DIRECT has no effect if azCmd is not NULL. + + - cmpp_popen_F_PATH: tells it to use execlp() or execvp(), which + performs path lookup of its initial argument. + + On success: + + - po->fdFromChild is the child's stdout. Read from it to read from + the child. + + - If po->fpToChild is not NULL then *po->fpToChild is set to the + child's stdin. 
Write to it to send the child stuff. Be sure to + flush() and/or close() it to keep it from hanging forever. If + po->fpToChild is NULL then the stdin of the child is closed. + + - po->childPid will be set to the PID of the child process. + + On error: you know the drill. + + After calling this, the caller is obligated to pass po to + cmpp_pclose(). If the caller fcloses() *po->fpToChild then they + must set it to NULL so that passing it to cmpp_pclose() knows not + to close it. + + Bugs: because the command is run via /bin/sh -c ... we cannot tell + if it's actually found. All we can tell is that /bin/sh ran. + + Also: this doesn't capture stderr, so commands should redirect + stderr to stdout. Adding the child's stderr handle to cmpp_popen_t is + a potential TODO without a current use case. +*/ +static +int cmpp__popen_impl(cmpp *pp, unsigned char const *zCmd, + char * const * azCmd, cmpp_flag32_t flags, + cmpp_popen_t *po){ +#if !CMPP_PLATFORM_IS_UNIX + return cmpp__err(pp, CMPP_RC_UNSUPPORTED, + "Piping is not supported in this build."); +#else + if( ppCode ) return ppCode; +#define shut(P,N) close(P[N]) + /** Attribution: this impl is derived from one found in + the Fossil SCM. */ + int pin[2]; + int pout[2]; + + po->fdFromChild = -1; + if( po->fpToChild ) *po->fpToChild = 0; + if( pipe(pin)<0 ){ + return cmpp__err_errno(pp, errno, "pipe(in) failed"); + } + if( pipe(pout)<0 ){ + int const rc = cmpp__err_errno(pp, errno, + "pipe(out) failed"); + shut(pin,0); + shut(pin,1); + return rc; + } + po->childPid = fork(); + if( po->childPid<0 ){ + int const rc = cmpp__err_errno(pp, errno, "fork() failed"); + shut(pin,0); + shut(pin,1); + shut(pout,0); + shut(pout,1); + return rc; + } + signal(SIGPIPE,SIG_IGN); + if( po->childPid==0 ){ + /* The child process. 
*/ + int fd; + close(0); + fd = dup(pout[0]); + if( fd!=0 ) { + cmpp__fatal("Error opening file descriptor 0."); + }; + shut(pout,0); + shut(pout,1); + close(1); + fd = dup(pin[1]); + if(fd!=1) { + cmpp__fatal("Error opening file descriptor 1."); + }; + shut(pin,0); + shut(pin,1); + if( azCmd ){ + if( pp->pimpl->flags.doDebug>1 ){ + for( int i = 0; azCmd[i]; ++i ){ + g_warn("execv arg[%d]=%s", i, azCmd[i]); + } + } + int (*exc)(const char *, char *const []) = + (cmpp_popen_F_PATH & flags) ? execvp : execv; + exc(zCmd ? (char*)zCmd : azCmd[0], azCmd); + cmpp__fatal("execv() failed"); + }else{ + g_debug(pp,2,("zCmd=%s\n", zCmd)); + int (*exc)(const char *, char const *, ...) = + (cmpp_popen_F_PATH & flags) ? execlp : execl; + if( cmpp_popen_F_DIRECT & flags ){ + exc((char*)zCmd, (char*)zCmd, (char*)0); + }else{ + exc("/bin/sh", "/bin/sh", "-c", zCmd, (char*)0); + } + cmpp__fatal("execl() failed"); + } + /* not reached */ + }else{ + /* The parent process. */ + //cmpp_outputer_flush(&pp->pimpl->out.ch); + po->fdFromChild = pin[0]; + shut(pin,1); + shut(pout,0); + if( po->fpToChild ){ + *po->fpToChild = fdopen(pout[1], "w"); + if( !*po->fpToChild ){ + shut(pin,0); + shut(pout,1); + po->fdFromChild = -1; + cmpp__err_errno(pp, errno, + "Error opening child process's stdin " + "FILE handle from its descriptor."); + } + }else{ + shut(pout,1); + } + return ppCode; + } +#undef shut +#endif +} + +int cmpp_popen(cmpp *pp, unsigned char const *zCmd, + cmpp_flag32_t flags, cmpp_popen_t *po){ + return cmpp__popen_impl(pp, zCmd, NULL, flags, po); +} + +int cmpp_popenv(cmpp *pp, char * const * azCmd, + cmpp_flag32_t flags, cmpp_popen_t *po){ + return cmpp__popen_impl(pp, NULL, azCmd, flags, po); +} + +int cmpp_popen_args(cmpp_dx *dx, cmpp_args const * args, + cmpp_popen_t *po){ +#if !CMPP_PLATFORM_IS_UNIX + return cmpp__popen_impl(dx->pp, NULL, NULL, 0, po) /* will fail */; +#else + if( dxppCode ) return dxppCode; + enum { MaxArgs = 128 }; + char * argv[MaxArgs] = {0}; + cmpp_size_t
offsets[MaxArgs] = {0}; + cmpp_b osAll = cmpp_b_empty; + cmpp_b os1 = cmpp_b_empty; + if( args->argc >= MaxArgs ){ + return cmpp_dx_err_set(dx, CMPP_RC_RANGE, + "Too many arguments (%d). Max is %d.", + args->argc, (int)MaxArgs); + } + int i = 0; + for(cmpp_arg const * a = args->arg0; + a; ++i, a = a->next ){ + offsets[i] = osAll.n; + cmpp_flag32_t a2bFlags = cmpp_arg_to_b_F_BRACE_CALL; + if( cmpp__arg_wordIsPathOrFlag(a) ){ + a2bFlags |= cmpp_arg_to_b_F_FORCE_STRING; + } + if( cmpp_arg_to_b(dx, a, cmpp_b_reuse(&os1), a2bFlags) + || cmpp_b_append4(dx->pp, &osAll, os1.z, os1.n+1/*NUL*/) ){ + goto end; + } + assert( osAll.n > offsets[i] ); + if( 0 ){ + g_warn("execv arg[%d] = %s => %s", i, a->z, + osAll.z+offsets[i]); + } + } + argv[i] = 0; + for( --i; i >= 0; --i ){ + argv[i] = (char*)(osAll.z + offsets[i]); + if( 0 ){ + g_warn("execv arg[%d] = %s", i, argv[i]); + } + } +end: + if( 0==dxppCode ){ + cmpp__popen_impl(dx->pp, NULL, argv, 0, po); + } + cmpp_b_clear(&osAll); + cmpp_b_clear(&os1); + return dxppCode; +#endif +} + +int cmpp_pclose(cmpp_popen_t *po){ +#if CMPP_PLATFORM_IS_UNIX + if( po->fdFromChild>=0 ) close(po->fdFromChild); + if( po->fpToChild && *po->fpToChild ) fclose(*po->fpToChild); + int const childPid = po->childPid; + *po = cmpp_popen_t_empty; +#if 1 + int wp, rc = 0; + if( childPid>0 ){ + //kill(childPid, SIGINT); // really needed? + do{ + wp = waitpid(childPid, &rc, WNOHANG); + if( wp>0 ){ + if( WIFEXITED(rc) ){ + rc = WEXITSTATUS(rc); + }else if( WIFSIGNALED(rc) ){ + rc = WTERMSIG(rc); + }else{ + rc = 0/*???*/; + } + } + } while( wp>0 ); + } + return rc; +#elif 0 + while( waitpid(childPid, NULL, WNOHANG)>0 ){} +#else + if( childPid>0 ){ + kill(childPid, SIGINT); // really needed? + waitpid((pid_t)childPid, NULL, WNOHANG); + }else{ + while( waitpid( (pid_t)0, NULL, WNOHANG)>0 ){} + } +#endif +#endif +} +/* +** 2022-11-12: +** +** The author disclaims copyright to this source code. 
In place of +** a legal notice, here is a blessing: +** +** * May you do good and not evil. +** * May you find forgiveness for yourself and forgive others. +** * May you share freely, never taking more than you give. +** +************************************************************************ +** This file houses the module-loading pieces of libcmpp. +*/ + +#if CMPP_ENABLE_DLLS +static const CmppSohList CmppSohList_empty = + CmppSohList_empty_m; +#endif + +#if CMPP_ENABLE_DLLS +/** + If compiled without CMPP_ENABLE_DLLS defined to a true value + then this function always returns CMPP_RC_UNSUPPORTED and updates + the error state of its first argument with information about that + code. + + Its first argument is the controlling cmpp. It can actually be + NULL - it's only used for reporting error details. + + Its second argument is the name of a DLL file. + + Its third argument is the name of a symbol in the given DLL which + resolves to a cmpp_module pointer. This name may be NULL, + in which case a default symbol name of "cmpp_module1" is used + (which is only useful when plugins are built one per DLL). + + The fourth argument is the output pointer to store the + resulting module handle in. + + The fifth argument is an optional list to append the DLL's + native handle to. It may be NULL. + + This function tries to open a DLL named dllFileName using the system's + DLL loader. If none is found, CMPP_RC_NOT_FOUND is returned and the + cmpp's error state is populated with info about the error. If + one is found, it looks for a symbol in the DLL: if symName is not + NULL and is not empty then the symbol "cmpp_module__symName" is + sought, else "cmpp_module1". (e.g. if symName is "foo" then it + searches for a symbol named "cmpp_module__foo".) If no such symbol is + found then CMPP_RC_NOT_FOUND (again) is returned and the + cmpp's error state is populated, else the symbol is assumed to + be a (cmpp_module*) and *mod is assigned to it.
+ + All errors update pp's error state but all are recoverable. + + Returns 0 on success. + + On success: + + - `*mod` is set to the module object. Its ownership is kinda murky: it + lives in memory made available via the module loader. It remains + valid memory until the DLL is closed. The module might also + actually be statically linked with the application, in which case + it will live as long as the app. + + - If soli is not NULL then the native DLL handle is appended to it. + Allocation errors when appending the DLL handle to the target list + are ignored - failure to retain a DLL handle for closing later is + not considered critical (and it would be extraordinarily rare (and + closing them outside of late-/post-main() cleanup is ill-advised, + anyway)). + + @see cmpp_module_load() + @see CMPP_MODULE_DECL + @see CMPP_MODULE_IMPL2 + @see CMPP_MODULE_IMPL3 + @see CMPP_MODULE_IMPL_SOLO + @see CMPP_MODULE_REGISTER2 + @see CMPP_MODULE_REGISTER3 +*/ +static +int cmpp__module_extract(cmpp * pp, + char const * dllFileName, + char const * symName, + cmpp_module const ** mod); +#endif + +#if CMPP_ENABLE_DLLS && !defined(CMPP_OMIT_D_MODULE) +# define CMPP_D_MODULE 1 +#else +# define CMPP_D_MODULE 0 +#endif + +#if CMPP_D_MODULE +/** + The #module directive: + + #module dll ?moduleName? + + Uses cmpp_module_load(dx->pp, dll, moduleName||NULL) to try to load a + directive module.
+*/ +//static +void cmpp_dx_f_module(cmpp_dx *dx) { + cmpp_arg const * aName = 0; + cmpp_b obDll = cmpp_b_empty; + for( cmpp_arg const *arg = dx->args.arg0; + arg; arg = arg->next ){ + //MARKER(("arg %s=%s\n", cmpp_tt_cstr(arg->ttype), arg->z)); + if( cmpp_dx_err_check(dx) ) goto end; + else if( !obDll.z ){ + cmpp_arg_to_b( + dx, arg, &obDll, + 0//cmpp_arg_to_b_F_NO_DEFINES + ); + continue; + }else if( !aName ){ + aName = arg; + continue; + } + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Unhandled argument: %s", arg->z); + goto end; + } + if( !obDll.z ){ + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, + "Expecting a DLL name argument."); + goto end; + } + cmpp_module_load(dx->pp, (char const *)obDll.z, + aName ? (char const *)aName->z : NULL); +end: + cmpp_b_clear(&obDll); + return; +#if 0 + missing_arg: + cmpp_dx_err_set(dx, CMPP_RC_MISUSE, "Expecting an argument after %s.", + arg->z); + return; +#endif +} +#endif /* #module */ + +/** + Module loader pedantic licensing note: Most of cmpp's + module-loading code was copied verbatim from another project[^1], + but was written by the same author who relicenses it in + cmpp. + + [^1]: https://fossil.wanderinghorse.net/r/cwal +*/ +#if CMPP_ENABLE_DLLS +#if CMPP_HAVE_DLOPEN +typedef void * cmpp_soh; +# include /* this actually has a different name on some platforms! */ +#elif CMPP_HAVE_LTDLOPEN +# include +typedef lt_dlhandle cmpp_soh; +#elif CMPP_ENABLE_DLLS +# error "We have no dlopen() impl for this configuration." 
+#endif + +static cmpp_soh cmpp__dlopen(char const * fname, + char const **errMsg){ + static int once = 0; + cmpp_soh soh = 0; + if(!once && ++once){ +#if CMPP_HAVE_DLOPEN + dlopen( 0, RTLD_NOW | RTLD_GLOBAL ); +#elif CMPP_HAVE_LTDLOPEN + lt_dlinit(); + lt_dlopen( 0 ); +#endif + } +#if CMPP_HAVE_DLOPEN + soh = dlopen(fname, RTLD_NOW | RTLD_GLOBAL); +#elif CMPP_HAVE_LTDLOPEN + soh = lt_dlopen(fname); +#endif + if(!soh && errMsg){ +#if CMPP_HAVE_DLOPEN + *errMsg = dlerror(); +#elif CMPP_HAVE_LTDLOPEN + *errMsg = lt_dlerror(); +#endif + } + return soh; +} + +static +cmpp_module const * cmpp__dlsym(cmpp_soh soh, + char const * mname){ + cmpp_module const ** sym = +#if CMPP_HAVE_DLOPEN + dlsym(soh, mname) +#elif CMPP_HAVE_LTDLOPEN + lt_dlsym(soh, mname) +#else + NULL +#endif + ; + return sym ? *sym : NULL; +} + +static void cmpp__dlclose(cmpp_soh soh){ + if( soh ) { +#if CMPP_CLOSE_DLLS + /* MARKER(("Closing loaded module @%p.\n", (void const *)soh)); */ +#if CMPP_HAVE_DLOPEN + dlclose(soh); +#elif CMPP_HAVE_LTDLOPEN + lt_dlclose(soh); +#endif +#endif + } +} +#endif /* CMPP_ENABLE_DLLS */ + +#define CmppSohList_works (CMPP_ENABLE_DLLS && CMPP_CLOSE_DLLS) + +int CmppSohList_append(cmpp *pp, CmppSohList *soli, void *soh){ +#if CmppSohList_works + int const rc = cmpp_array_reserve(pp, (void**)&soli->list, + soli->n + ? (soli->n==soli->nAlloc + ? soli->nAlloc*2 + : soli->n+1) + : 8, + &soli->nAlloc, sizeof(void*)); + if( 0==rc ){ + soli->list[soli->n++] = soh; + } + return rc; +#else + (void)pp; (void)soli; (void)soh; + return 0; +#endif +} + +void CmppSohList_close(CmppSohList *s){ +#if CmppSohList_works + while( s->nAlloc ){ + if( s->list[--s->nAlloc] ){ + //MARKER(("closing soh %p\n", s->list[s->nAlloc])); + cmpp__dlclose(s->list[s->nAlloc]); + s->list[s->nAlloc] = 0; + } + } + cmpp_mfree(s->list); + *s = CmppSohList_empty; +#else + (void)s; +#endif +} + +#if 0 +/** + Passes soli to CmppSohList_close() then frees soli. 
Results are + undefined if soli is not NULL but was not returned from + CmppSohList_new(). + + Special case: if built without DLL-closing support, this is a no-op. +*/ +//static void CmppSohList_free(CmppSohList *soli); +void CmppSohList_free(CmppSohList *s){ + if( s ){ +#if CmppSohList_works + CmppSohList_close(s); + cmpp_mfree(s); +#endif + } +} + +/** + Returns a new, cleanly-initialized CmppSohList or NULL + on allocation error. The returned instance must eventually be + passed to CmppSohList_free(). + + Special case: if built without DLL-closing support, this returns a + no-op singleton instance. +*/ +//static CmppSohList * CmppSohList_new(void); +CmppSohList * CmppSohList_new(void){ +#if CmppSohList_works + CmppSohList * s = cmpp_malloc(sizeof(*s)); + if( s ) *s = CmppSohList_empty; + return s; +#else + static CmppSohList soli = CmppSohList_empty; + return &soli; +#endif +} +#endif + +#undef CmppSohList_works + +#if CMPP_ENABLE_DLLS +/** + Default entry point symbol name for loadable modules. This must + match the symbolic name defined by CMPP_MODULE_IMPL_SOLO(). +*/ +static char const * const cmppModDfltSym = "cmpp_module1"; + +/** + Looks for a symbol in the given DLL handle. If symName is NULL or + empty, the symbol "cmpp_module" is used, else the symbols + ("cmpp_module__" + symName) is used. If it finds one, it casts it to + cmpp_module and returns it. On error it may update pp's + error state with the error information if pp is not NULL. + + Errors: + + - symName is too long. + + - cmpp__dlsym() lookup failure. +*/ +static cmpp_module const * +cmpp__module_fish_out_entry_pt(cmpp * pp, + cmpp_soh soh, + char const * symName){ + enum { MaxLen = 128 }; + char buf[MaxLen] = {0}; + cmpp_size_t const slen = symName ? strlen(symName) : 0; + cmpp_module const * mod = 0; + if(slen > (MaxLen-20)){ + cmpp_err_set(pp, CMPP_RC_RANGE, + "DLL symbol name '%.*s' is too long. 
Max is %d.", + (int)slen, symName, (int)MaxLen-20); + }else{ + if(symName && *symName){ + snprintf(buf, MaxLen,"cmpp_module__%s", symName); + symName = &buf[0]; + }else{ + symName = cmppModDfltSym; + } + mod = cmpp__dlsym(soh, symName); + } + /*MARKER(("%s() [%s] ==> %p\n",__func__, symName, + (void const *)mod));*/ + return mod; +} +#endif/*CMPP_ENABLE_DLLS*/ + +#if CMPP_ENABLE_DLLS +/** + Tries to dlsym() the given cmpp_module symbol from the given + DLL handle. On success, 0 is returned and *mod is assigned to the + memory. On error, non-0 is returned and pp's error state may be + updated. + + Ownership of the returned module ostensibly lies with the first + argument, but that's not entirely true. If CMPP_CLOSE_DLLS is true + then a copy of the module's pointer is stored in the engine for + later closing. The memory itself is owned by the module loader, and + "should" stay valid until the DLL is closed. +*/ +static int cmpp__module_get_sym(cmpp * pp, + cmpp_soh soh, + char const * symName, + cmpp_module const ** mod){ + + cmpp_module const * lm = 0; + int rc = cmpp_err_has(pp); + if( 0==rc ){ + lm = cmpp__module_fish_out_entry_pt(pp, soh, symName); + rc = cmpp_err_has(pp); + } + if(0==rc){ + if(lm){ + *mod = lm; + }else{ + cmpp__dlclose(soh); + rc = cmpp_err_set(pp, CMPP_RC_NOT_FOUND, + "Did not find module entry point symbol '%s'.", + symName ? 
symName : cmppModDfltSym); + } + } + return rc; +} +#endif/*CMPP_ENABLE_DLLS*/ + +#if !CMPP_ENABLE_DLLS +static int cmpp__err_no_dlls(cmpp * const pp){ + return cmpp_err_set(pp, CMPP_RC_UNSUPPORTED, + "No dlopen() equivalent is installed " + "for this build configuration."); +} +#endif + +#if CMPP_ENABLE_DLLS +//no: CMPP_WASM_EXPORT +int cmpp__module_extract(cmpp * pp, + char const * fname, + char const * symName, + cmpp_module const ** mod){ + int rc = cmpp_err_has(pp); + if( rc ) return rc; + else if( cmpp_is_safemode(pp) ){ + return cmpp_err_set(pp, CMPP_RC_ACCESS, + "Cannot use DLLs in safe mode."); + }else{ + cmpp_soh soh; + char const * errMsg = 0; + soh = cmpp__dlopen(fname, &errMsg); + if(soh){ + if( pp ){ + CmppSohList_append(NULL/*alloc error here can be ignored*/, + &pp->pimpl->mod.sohList, soh); + } + cmpp_module const * x = 0; + rc = cmpp__module_get_sym(pp, soh, symName, &x); + if(!rc && mod) *mod = x; + return rc; + }else{ + return errMsg + ? cmpp_err_set(pp, CMPP_RC_ERROR, "DLL open failed: %s", + errMsg) + : cmpp_err_set(pp, CMPP_RC_ERROR, + "DLL open failed for unknown reason."); + } + } +} +#endif + +//no: CMPP_WASM_EXPORT +int cmpp_module_load(cmpp * pp, char const * fname, + char const * symName){ +#if CMPP_ENABLE_DLLS + if( ppCode ){ + /* fall through */ + }else if( cmpp_ctor_F_SAFEMODE & pp->pimpl->flags.newFlags ){ + cmpp_err_set(pp, CMPP_RC_ACCESS, + "%s() is disallowed in safe-mode."); + }else{ + cmpp__pi(pp); + char * zName = 0; + if( fname ){ + zName = cmpp_path_search(pp, (char const *)pi->mod.path.z, + pi->mod.pathSep, fname, + pi->mod.soExt); + if( !zName ){ + return cmpp_err_set(pp, CMPP_RC_NOT_FOUND, + "Did not find [%s] or [%s%s] " + "in search path [%s].", + fname, fname, pi->mod.soExt, + pi->mod.path.z); + } + } + cmpp_module const * mod = 0; + if( 0==cmpp__module_extract(pp, zName, symName, &mod) ){ + assert(mod); + assert(mod->init); + int const rc = mod->init(pp); + if( rc && !ppCode ){ + cmpp_err_set(pp, CMPP_RC_ERROR, + 
"Module %s::init() failed with code #%d/%s " + "without providing additional info.", + symName ? symName : "cmpp_module", + rc, cmpp_rc_cstr(rc)); + } + cmpp_mfree(zName); + } + } + return ppCode; +#else + (void)fname; (void)symName; + return cmpp__err_no_dlls(pp); +#endif +} +/* +** 2015-08-18, 2023-04-28 +** +** The author disclaims copyright to this source code. In place of +** a legal notice, here is a blessing: +** +** May you do good and not evil. +** May you find forgiveness for yourself and forgive others. +** May you share freely, never taking more than you give. +** +************************************************************************* +** +** This file demonstrates how to create a table-valued-function using +** a virtual table. This demo implements the generate_series() function +** which gives the same results as the eponymous function in PostgreSQL, +** within the limitation that its arguments are signed 64-bit integers. +** +** Considering its equivalents to generate_series(start,stop,step): A +** value V[n] sequence is produced for integer n ascending from 0 where +** ( V[n] == start + n * step && sgn(V[n] - stop) * sgn(step) >= 0 ) +** for each produced value (independent of production time ordering.) +** +** All parameters must be either integer or convertable to integer. +** The start parameter is required. +** The stop parameter defaults to (1<<32)-1 (aka 4294967295 or 0xffffffff) +** The step parameter defaults to 1 and 0 is treated as 1. +** +** Examples: +** +** SELECT * FROM generate_series(0,100,5); +** +** The query above returns integers from 0 through 100 counting by steps +** of 5. In other words, 0, 5, 10, 15, ..., 90, 95, 100. There are a total +** of 21 rows. +** +** SELECT * FROM generate_series(0,100); +** +** Integers from 0 through 100 with a step size of 1. 101 rows. +** +** SELECT * FROM generate_series(20) LIMIT 10; +** +** Integers 20 through 29. 10 rows. 
+** +** SELECT * FROM generate_series(0,-100,-5); +** +** Integers 0 -5 -10 ... -100. 21 rows. +** +** SELECT * FROM generate_series(0,-1); +** +** Empty sequence. +** +** HOW IT WORKS +** +** The generate_series "function" is really a virtual table with the +** following schema: +** +** CREATE TABLE generate_series( +** value, +** start HIDDEN, +** stop HIDDEN, +** step HIDDEN +** ); +** +** The virtual table also has a rowid which is an alias for the value. +** +** Function arguments in queries against this virtual table are translated +** into equality constraints against successive hidden columns. In other +** words, the following pairs of queries are equivalent to each other: +** +** SELECT * FROM generate_series(0,100,5); +** SELECT * FROM generate_series WHERE start=0 AND stop=100 AND step=5; +** +** SELECT * FROM generate_series(0,100); +** SELECT * FROM generate_series WHERE start=0 AND stop=100; +** +** SELECT * FROM generate_series(20) LIMIT 10; +** SELECT * FROM generate_series WHERE start=20 LIMIT 10; +** +** The generate_series virtual table implementation leaves the xCreate method +** set to NULL. This means that it is not possible to do a CREATE VIRTUAL +** TABLE command with "generate_series" as the USING argument. Instead, there +** is a single generate_series virtual table that is always available without +** having to be created first. +** +** The xBestIndex method looks for equality constraints against the hidden +** start, stop, and step columns, and if present, it uses those constraints +** to bound the sequence of generated values. If the equality constraints +** are missing, it uses 0 for start, 4294967295 for stop, and 1 for step. +** xBestIndex returns a small cost when both start and stop are available, +** and a very large cost if either start or stop are unavailable. This +** encourages the query planner to order joins such that the bounds of the +** series are well-defined. 
+** +** Update on 2024-08-22: +** xBestIndex now also looks for equality and inequality constraints against +** the value column and uses those constraints as additional bounds against +** the sequence range. Thus, a query like this: +** +** SELECT value FROM generate_series($SA,$EA) +** WHERE value BETWEEN $SB AND $EB; +** +** Is logically the same as: +** +** SELECT value FROM generate_series(max($SA,$SB),min($EA,$EB)); +** +** Constraints on the value column can server as substitutes for constraints +** on the hidden start and stop columns. So, the following two queries +** are equivalent: +** +** SELECT value FROM generate_series($S,$E); +** SELECT value FROM generate_series WHERE value BETWEEN $S and $E; +** +*/ +#if 0 +#include "sqlite3ext.h" +#else +#include "sqlite3.h" +#endif +#include +#include +#include +#include + +#ifndef SQLITE_OMIT_VIRTUALTABLE + +/* series_cursor is a subclass of sqlite3_vtab_cursor which will +** serve as the underlying representation of a cursor that scans +** over rows of the result. +** +** iOBase, iOTerm, and iOStep are the original values of the +** start=, stop=, and step= constraints on the query. These are +** the values reported by the start, stop, and step columns of the +** virtual table. +** +** iBase, iTerm, iStep, and bDescp are the actual values used to generate +** the sequence. These might be different from the iOxxxx values. +** For example in +** +** SELECT value FROM generate_series(1,11,2) +** WHERE value BETWEEN 4 AND 8; +** +** The iOBase is 1, but the iBase is 5. iOTerm is 11 but iTerm is 7. 
+** Another example: +** +** SELECT value FROM generate_series(1,15,3) ORDER BY value DESC; +** +** The cursor initialization for the above query is: +** +** iOBase = 1 iBase = 13 +** iOTerm = 15 iTerm = 1 +** iOStep = 3 iStep = 3 bDesc = 1 +** +** The actual step size is unsigned so that can have a value of +** +9223372036854775808 which is needed for querys like this: +** +** SELECT value +** FROM generate_series(9223372036854775807, +** -9223372036854775808, +** -9223372036854775808) +** ORDER BY value ASC; +** +** The setup for the previous query will be: +** +** iOBase = 9223372036854775807 iBase = -1 +** iOTerm = -9223372036854775808 iTerm = 9223372036854775807 +** iOStep = -9223372036854775808 iStep = 9223372036854775808 bDesc = 0 +*/ +typedef unsigned char u8; +typedef struct series_cursor series_cursor; +struct series_cursor { + sqlite3_vtab_cursor base; /* Base class - must be first */ + sqlite3_int64 iOBase; /* Original starting value ("start") */ + sqlite3_int64 iOTerm; /* Original terminal value ("stop") */ + sqlite3_int64 iOStep; /* Original step value */ + sqlite3_int64 iBase; /* Starting value to actually use */ + sqlite3_int64 iTerm; /* Terminal value to actually use */ + sqlite3_uint64 iStep; /* The step size */ + sqlite3_int64 iValue; /* Current value */ + u8 bDesc; /* iStep is really negative */ + u8 bDone; /* True if stepped past last element */ +}; + +/* +** Computed the difference between two 64-bit signed integers using a +** convoluted computation designed to work around the silly restriction +** against signed integer overflow in C. +*/ +static sqlite3_uint64 span64(sqlite3_int64 a, sqlite3_int64 b){ + assert( a>=b ); + return (*(sqlite3_uint64*)&a) - (*(sqlite3_uint64*)&b); +} + +/* +** Add or substract an unsigned 64-bit integer from a signed 64-bit integer +** and return the new signed 64-bit integer. 
+*/ +static sqlite3_int64 add64(sqlite3_int64 a, sqlite3_uint64 b){ + sqlite3_uint64 x = *(sqlite3_uint64*)&a; + x += b; + return *(sqlite3_int64*)&x; +} +static sqlite3_int64 sub64(sqlite3_int64 a, sqlite3_uint64 b){ + sqlite3_uint64 x = *(sqlite3_uint64*)&a; + x -= b; + return *(sqlite3_int64*)&x; +} + +/* +** The seriesConnect() method is invoked to create a new +** series_vtab that describes the generate_series virtual table. +** +** Think of this routine as the constructor for series_vtab objects. +** +** All this routine needs to do is: +** +** (1) Allocate the series_vtab object and initialize all fields. +** +** (2) Tell SQLite (via the sqlite3_declare_vtab() interface) what the +** result set of queries against generate_series will look like. +*/ +static int seriesConnect( + sqlite3 *db, + void *pUnused, + int argcUnused, const char *const*argvUnused, + sqlite3_vtab **ppVtab, + char **pzErrUnused +){ + sqlite3_vtab *pNew; + int rc; + +/* Column numbers */ +#define SERIES_COLUMN_ROWID (-1) +#define SERIES_COLUMN_VALUE 0 +#define SERIES_COLUMN_START 1 +#define SERIES_COLUMN_STOP 2 +#define SERIES_COLUMN_STEP 3 + + (void)pUnused; + (void)argcUnused; + (void)argvUnused; + (void)pzErrUnused; + rc = sqlite3_declare_vtab(db, + "CREATE TABLE x(value,start hidden,stop hidden,step hidden)"); + if( rc==SQLITE_OK ){ + pNew = *ppVtab = sqlite3_malloc( sizeof(*pNew) ); + if( pNew==0 ) return SQLITE_NOMEM; + memset(pNew, 0, sizeof(*pNew)); + sqlite3_vtab_config(db, SQLITE_VTAB_INNOCUOUS); + } + return rc; +} + +/* +** This method is the destructor for series_cursor objects. +*/ +static int seriesDisconnect(sqlite3_vtab *pVtab){ + sqlite3_free(pVtab); + return SQLITE_OK; +} + +/* +** Constructor for a new series_cursor object. 
+*/ +static int seriesOpen(sqlite3_vtab *pUnused, sqlite3_vtab_cursor **ppCursor){ + series_cursor *pCur; + (void)pUnused; + pCur = sqlite3_malloc( sizeof(*pCur) ); + if( pCur==0 ) return SQLITE_NOMEM; + memset(pCur, 0, sizeof(*pCur)); + *ppCursor = &pCur->base; + return SQLITE_OK; +} + +/* +** Destructor for a series_cursor. +*/ +static int seriesClose(sqlite3_vtab_cursor *cur){ + sqlite3_free(cur); + return SQLITE_OK; +} + + +/* +** Advance a series_cursor to its next row of output. +*/ +static int seriesNext(sqlite3_vtab_cursor *cur){ + series_cursor *pCur = (series_cursor*)cur; + if( pCur->iValue==pCur->iTerm ){ + pCur->bDone = 1; + }else if( pCur->bDesc ){ + pCur->iValue = sub64(pCur->iValue, pCur->iStep); + assert( pCur->iValue>=pCur->iTerm ); + }else{ + pCur->iValue = add64(pCur->iValue, pCur->iStep); + assert( pCur->iValue<=pCur->iTerm ); + } + return SQLITE_OK; +} + +/* +** Return values of columns for the row at which the series_cursor +** is currently pointing. +*/ +static int seriesColumn( + sqlite3_vtab_cursor *cur, /* The cursor */ + sqlite3_context *ctx, /* First argument to sqlite3_result_...() */ + int i /* Which column to return */ +){ + series_cursor *pCur = (series_cursor*)cur; + sqlite3_int64 x = 0; + switch( i ){ + case SERIES_COLUMN_START: x = pCur->iOBase; break; + case SERIES_COLUMN_STOP: x = pCur->iOTerm; break; + case SERIES_COLUMN_STEP: x = pCur->iOStep; break; + default: x = pCur->iValue; break; + } + sqlite3_result_int64(ctx, x); + return SQLITE_OK; +} + +#ifndef LARGEST_UINT64 +#define LARGEST_INT64 ((sqlite3_int64)0x7fffffffffffffffLL) +#define LARGEST_UINT64 ((sqlite3_uint64)0xffffffffffffffffULL) +#define SMALLEST_INT64 ((sqlite3_int64)0x8000000000000000LL) +#endif + +/* +** The rowid is the same as the value. 
+*/ +static int seriesRowid(sqlite3_vtab_cursor *cur, sqlite_int64 *pRowid){ + series_cursor *pCur = (series_cursor*)cur; + *pRowid = pCur->iValue; + return SQLITE_OK; +} + +/* +** Return TRUE if the cursor has been moved off of the last +** row of output. +*/ +static int seriesEof(sqlite3_vtab_cursor *cur){ + series_cursor *pCur = (series_cursor*)cur; + return pCur->bDone; +} + +/* True to cause run-time checking of the start=, stop=, and/or step= +** parameters. The only reason to do this is for testing the +** constraint checking logic for virtual tables in the SQLite core. +*/ +#ifndef SQLITE_SERIES_CONSTRAINT_VERIFY +# define SQLITE_SERIES_CONSTRAINT_VERIFY 0 +#endif + +/* +** Return the number of steps between pCur->iBase and pCur->iTerm if +** the step width is pCur->iStep. +*/ +static sqlite3_uint64 seriesSteps(series_cursor *pCur){ + if( pCur->bDesc ){ + assert( pCur->iBase >= pCur->iTerm ); + return span64(pCur->iBase, pCur->iTerm)/pCur->iStep; + }else{ + assert( pCur->iBase <= pCur->iTerm ); + return span64(pCur->iTerm, pCur->iBase)/pCur->iStep; + } +} + +#if defined(SQLITE_ENABLE_MATH_FUNCTIONS) || defined(_WIN32) +/* +** Case 1 (the most common case): +** The standard math library is available so use ceil() and floor() from there. +*/ +static double seriesCeil(double r){ return ceil(r); } +static double seriesFloor(double r){ return floor(r); } +#elif defined(__GNUC__) && !defined(SQLITE_DISABLE_INTRINSIC) +/* +** Case 2 (2nd most common): Use GCC/Clang builtins +*/ +static double seriesCeil(double r){ return __builtin_ceil(r); } +static double seriesFloor(double r){ return __builtin_floor(r); } +#else +/* +** Case 3 (rarely happens): Use home-grown ceil() and floor() routines. 
+*/ +static double seriesCeil(double r){ + sqlite3_int64 x; + if( r!=r ) return r; + if( r<=(-4503599627370496.0) ) return r; + if( r>=(+4503599627370496.0) ) return r; + x = (sqlite3_int64)r; + if( r==(double)x ) return r; + if( r>(double)x ) x++; + return (double)x; +} +static double seriesFloor(double r){ + sqlite3_int64 x; + if( r!=r ) return r; + if( r<=(-4503599627370496.0) ) return r; + if( r>=(+4503599627370496.0) ) return r; + x = (sqlite3_int64)r; + if( r==(double)x ) return r; + if( r<(double)x ) x--; + return (double)x; +} +#endif + +/* +** This method is called to "rewind" the series_cursor object back +** to the first row of output. This method is always called at least +** once prior to any call to seriesColumn() or seriesRowid() or +** seriesEof(). +** +** The query plan selected by seriesBestIndex is passed in the idxNum +** parameter. (idxStr is not used in this implementation.) idxNum +** is a bitmask showing which constraints are available: +** +** 0x0001: start=VALUE +** 0x0002: stop=VALUE +** 0x0004: step=VALUE +** 0x0008: descending order +** 0x0010: ascending order +** 0x0020: LIMIT VALUE +** 0x0040: OFFSET VALUE +** 0x0080: value=VALUE +** 0x0100: value>=VALUE +** 0x0200: value>VALUE +** 0x1000: value<=VALUE +** 0x2000: value0, the value of the LIMIT */ + sqlite3_int64 iOffset = 0; /* if >0, the value of the OFFSET */ + + (void)idxStrUnused; + + /* If any constraints have a NULL value, then return no rows. + ** See ticket https://sqlite.org/src/info/fac496b61722daf2 + */ + for(i=0; i


4.4 Special Directives

4.4.25 The `%wildcard` directive

4.4.26 The `%realloc` and `%free` directives

4.4.26 The `%realloc`, `%free`, and `%stack_size_limit` directives

5.0 Error Processing

sqlite3 tester #1: Worker thread (@bitness@-bit WASM)

	Quote	Csv
zColumnSep	","	","
zRowSep	"\\n"	"\\r\\n"
zNull	"NULL"	""
eText	QRF_TEXT_Sql	QRF_TEXT_Csv
eBlob	QRF_BLOB_Sql	QRF_BLOB_Text
