unicode results from the CPAN

unicode

DBD-SQLite

view release on metacpan or search on metacpan

1.71_01 2021-12-02
    - Upgraded SQLite to 3.37.0
    - Add a feature to unregister a created function
    - Fix accented characters in POD (GH#90, HaraldJoerg++)

1.70 2021-08-01
    - Switched to a production version

1.69_02 2021-07-30
    - Fix doc to use the correct attribute with sqlite_ (GH#86, eekboek++)
    - Modify the fix to silence the sqlite_unicode warning not to check
      the attribute twice
    - Fix an encoding issue of naive (GH#83, HaraldJoerg++)

1.69_01 2021-07-30
    - Typo (GH#85, grr++)
    - Silenced deprecation warning of sqlite_unicode not to break
      tests of existing applications

1.68 2021-07-22
    - Switched to a production version

1.67_07 2021-06-19
    - Upgraded SQLite to 3.36.0

1.67_06 2021-06-14
    - Experiment with another quadmath patch to see if it works

Changes view on Meta::CPAN


1.67_05 2021-06-13
    - Made DBD_SQLITE_STRING_MODE constants exportable

1.67_04 2021-05-31
    - Upgraded SQLite to 3.35.5
    - Stop setting THREADSAFE=0 if perl has pthread (ie. 5.20+)
      (Bjoern Hoehrmann++, GH#69, #72)
    - Fixed a memory leak in ::VirtualTable
    - Introduced "string_mode" handle attribute (Felipe Gasper++)
      to fix long-standing issues of sqlite_unicode (GH#78, #68)
    - Added a dependency from dbdimp.o to the *.inc files included
      into dbdimp.c (Laurent Dami++, GH#74)
    - Fixed an offset issue of VirtualTable (Laurent Dami++, GH#75)

1.67_03 2021-03-31
    - Upgraded SQLite to 3.35.3
    - Enabled math functions introduced in SQLite 3.35
    - Fix quadmath issues (Tux++, leont++)

1.67_02 2020-12-06

Changes view on Meta::CPAN

1.51_01 2016-02-20
    *** CHANGES THAT MAY POSSIBLY BREAK YOUR OLD APPLICATIONS ***
    - Updated to SQLite 3.11.0.
      As upstream disabled two-arg fts3_tokenizer() for security concern,
      DBD::SQLite also stopped enabling it by default. If you do need
      perl tokenizer, compile/install with SQLITE_ENABLE_FTS3_TOKENIZER
      environmental variable.

    - Applied a doc patch by Salvatore Bonaccorso
    - Enabled (experimental) FTS5
    - Fixed REGEXP function to work under sqlite_unicode correctly
      (AndrÃ¡s Farkas++)

1.50 2016-02-11
    - Switched to a production version.

1.49_08 2016-01-30
    - no significant code changes
    - Resolved RT#111558: Virtual table tests depend on enhanced
      query syntax availability (vlmarek++)
    - Ingore FTS tests if FTS is not available

Changes view on Meta::CPAN

1.43_06 2014-07-22
    - Fixed compile error/warning for older perls (reported by ribasushi)
      (ISHIGAKI)

1.43_05 2014-07-21
    - No significant code changes; removed unnecessary dependencies.

1.43_04 2014-07-21
    *** CHANGES THAT MAY POSSIBLY BREAK YOUR OLD APPLICATIONS ***
    - Resolved #96877: sql statements should be converted to utf8 (DAMI)
      If you set sqlite_unicode to true, SQL statements will be upgraded
      to avoid inconsistency between embedded params and bind params.

    - Resolved #96494: [PATCH] add SYSTEM TABLE to table_info() type
      list (MJP)
    - Supported virtual tables in Perl, and added two sample tables
      (DAMI++)

1.43_03 2014-06-12
    - Updated to SQLite 3.8.5, which should fix query planner's 
      issues in SQLite (ISHIGAKI)

Changes view on Meta::CPAN


1.40 2013-07-28
    - NetBSD also doesn't like the _XOPEN_SOURCE hack (ISHIGAKI)
    - Resolved #86080: PATCH: statistics_info support (DDICK)

1.39 2013-05-31
    - Production release, no changes from 1.38_05

1.38_05 2013-05-31
    - OpenBSD doesn't like the previous _XOPEN_SOURCE hack (ISHIGAKI)
    - Disabled a unicode-related test for older perls (ISHIGAKI)

1.38_04 2013-05-29
    - Tentatively defined _XOPEN_SOURCE under *BSD systems to see
      if it solves a compilation issue for threaded perls (ISHIGAKI)

1.38_03 2013-05-20
    *** NOTICE ON QUERY OPTIMIZER ENHANCEMENT ***
    - As of SQLite 3.7.15, SQLite's query optimizer was enhanced
      and the result order of a SELECT statement without an ORDER
      BY clause may be different from the one of the previous

Changes view on Meta::CPAN

      it actually broke things. This feature probably will be
      enabled by default by the sqlite team in the future, and
      eventually you'll need to cope with it, but right now we
      agreed with some discussion to give you more time to be
      prepared. If you use referential stuff in your schema
      (which sqlite ignores now) should do extensive testing
      to ensure that they will work when you issue "PRAGMA
      foreign_keys = ON". (ISHIGAKI)

    - Updated to SQLite 3.6.20 (DUNCAND)
    - Resolved #50935: there remained old "unicode" attribute usage
      in the pod, spotted by ASHLEY. (ISHIGAKI)

1.26_06 2009-10-28
    *** CHANGES THAT MAY POSSIBLY BREAK YOUR OLD APPLICATIONS ***
    - Removed undocumented (and most probably unused) reset method
      from a statement handle (which was only accessible via func().)
      Simply use "$sth->finish" instead. (ISHIGAKI)
    - Now DBD::SQLite supports foreign key constraints by default.
      Long-ignored foreign keys (typically written for other DB
      engines) will start working. If you don't want this feature,
      issue a pragma to disable foreign keys. (ISHIGAKI)
    - Renamed "unicode" attribute to "sqlite_unicode" for integrity.
      Old "unicode" attribute is still accessible but will be
      deprecated in the near future. (ISHIGAKI)

    - You can see file/line info while tracing even if you compile
      with a non-gcc compiler. (ISHIGAKI)
    - Major code refactoring. (ISHIGAKI)
    - Pod reorganized, and some of the missing bits (including
      pragma) are added. (ISHIGAKI)

1.26_05 2009-10-15
    - Updated to SQLite 3.6.19 (ISHIGAKI)

Changes view on Meta::CPAN

    - Added access to Online Backup functionality. (TJC)
    - Added enable_load_extension pod (ISHIGAKI)
    - Now private methods/functions return true after successful
      calls (#44871) (ISHIGAKI)
    - Removed all of the "croak"s (#44871) (ISHIGAKI)

1.26_01 2009-05-05
    - Added ORDINAL_POSITION support for $dbh->column_info (ADAMK)
    - Applied several fixes from GFUJI to clean up code (#45578)
      (ISHIGAKI)
    - Skipped some of the unicode path tests under cygwin (#45166)
      (JDHEDDEN)
    - Added some explanation and workarounds for a SQL that
      compares a return value of a function with a numeric bind
      value (ISHIGAKI)

1.25 2009-04-23
    - Amalgamation conversion turned out to be quicker than expected.
    - Changing to a production release.  (ADAMK)

1.24_02 2009-04-22
    - Merging various externally-contributed annotations from
      annocpan.org (ADAMK)
    - Created the beginnings of a DBD::SQLite::Cookbook (ADAMK)

1.24_01 2009-04-22
    - Moved getsqlite.pl into util (ADAMK)
    - Switching to the RT queue instead of the RT report page that
      does nothing and just refers you to email (ADAMK)
    - Now DBD::SQLite also uses amalgamated source recommended at sqlite.org (ISHIGAKI)
    - Resolved #45166: better unicode path handling under cygwin (ISHIGAKI)
    - Resolved #45171: test failure on CentOS 4.6 (ISHIGAKI)

1.23 2009-04-19
    - No changes from 1.22_08, just switched to production release (ADAMK)

1.22_08 2009-04-17
    - Completed the migration of all tests and deleted lib.pl (ADAMK)
    - Prevented a double "commit is innefective" warning (ADAMK)
    - Adding a version for the Win32.pm dependency (ADAMK)

1.22_07 2009-04-16
    - Improved non-latin unicode filename support/test
      on Windows (SZABGAB/ISHIGAKI)
    - Removed the table name generator from t/lib.pl,
      getting us closer to removing t/lib.pl entirely (ADAMK)
    - Increased use of Test::NoWarnings (ADAMK)
    - Converted half the remaining lib.pl tests to t::lib::Test (ADAMK)
    - Require Win32.pm on Windows (CHORNY)

1.22_06 2009-04-15
    - Simplifying various miscellaneous code (ADAMK)
    - Adding support for non-latin unicode filenames on Windows (ADAMK)

1.22_05 2009-04-15
    - Hopefully the last dev release before the next production release.
    - Updated to SQLite 3.6.13 (DUNCAND)
    - Setting svn:eol-style to native to prevent EOL issues (ADAMK)
    - Resolved #44861: tweaked Makefile.PL to support older HP-UX (ISHIGAKI)

1.22_04 2009-04-11
    - Adding support parsing attributes out of the DSN (ADAMK)
    - Inserted pTHX_/aTHX_ for better efficiency (suggested in #44884 by TIMB) (ISHIGAKI)

Changes view on Meta::CPAN


1.22_03 2009-04-10
    - Resolved #44876: Patch to fix includes in the SQLITE_LOCATION case by janus (ISHIGAKI)
    - Added PERL_NO_GET_CONTEXT for efficiency (suggested in #44884 by TIMB) (ISHIGAKI)
    - Refactored error handling (suggested in #44884, #44871 by TIMB) (ISHIGAKI)

1.22_02 2009-04-09
    - Added missing documentation bits for 'create_collation'
      and 'progress_handler' (DAMI)
    - Resolved RT#25924 (Arguments to user-defined functions do not
      respect unicode setting) (DAMI)
    - Added comments on the return values on error, and fixed another
      wrong return value in execute (ISHIGAKI)
    - Added SQL_NULLABLE_UNKNOWN; still wonders if the error above
      should be ignored or not (ISHIGAKI)

1.22_01 2009-04-09
    - Resolved #25371: Calls sv_utf8_upgrade on strings going into
       the database to make sure latin-1 strings are not saved as
       Malformed UTF-8 character in the SQLite TEXT column (MIYAGAWA)

Changes view on Meta::CPAN

      - Improved error handling
      (many MANY thanks to Tim for all these patches!)
    - Updated to sqlite 2.8.0

0.24
    - Fixed major crash bug affecting Mac OS X
    - Removed test.pl from distribution
    - Upgraded to sqlite 2.7.6

0.23
    - Fixed unicode tests

0.22
    - Merge with sqlite 2.7.4

0.21
    - Ooops - forgot new opcodes files from MANIFEST

0.20
    - Port to SQLite 2.7.2
    - Fixed bug in not freeing memory if you re-execute a $sth

Changes view on Meta::CPAN

    - Fixed getsqlite.pl
    
0.16
    - Upgraded to SQLite 2.5.0

0.15
    - Upgraded to SQLite 2.4.5

0.14
    - Added NoUTF8Flag option, so that returned strings don't get flagged
      with SvUTF8_on() - needed when you're storing non-unicode in the database

0.13
    - Upgraded to SQLite 2.4.3
    - Added script to download sqlite core library when it's upgraded

0.12
    - Upgraded to SQLite 2.4.2

0.11
    - Upgraded to SQLite 2.4.0, which adds views, subqueries, new builtin

MANIFEST view on Meta::CPAN

t/02_logon.t
t/03_create_table.t
t/04_insert.t
t/05_select.t
t/06_tran.t
t/07_error.t
t/08_busy.t
t/09_create_function.t
t/10_create_aggregate.t
t/11_get_info.t
t/12_unicode.t
t/13_create_collation.t
t/14_progress_handler.t
t/15_ak_dbd.t
t/16_column_info.t
t/17_createdrop.t
t/18_insertfetch.t
t/19_bindparam.t
t/20_blobs.t
t/21_blobtext.t
t/22_listfields.t

MANIFEST view on Meta::CPAN

t/65_db_config.t
t/66_get_autocommit.t
t/cookbook_variance.t
t/lib/SQLiteTest.pm
t/rt_106151_outermost_savepoint.t
t/rt_106950_extra_warnings_with_savepoints.t
t/rt_115465_column_info_with_spaces.t
t/rt_124227_index_regression.t
t/rt_15186_prepcached.t
t/rt_21406_auto_finish.t
t/rt_25371_asymmetric_unicode.t
t/rt_25460_numeric_aggregate.t
t/rt_25924_user_defined_func_unicode.t
t/rt_26775_distinct.t
t/rt_27553_prepared_cache_and_analyze.t
t/rt_29058_group_by.t
t/rt_29629_sqlite_where_length.t
t/rt_31324_full_names.t
t/rt_32889_prepare_cached_reexecute.t
t/rt_36836_duplicate_key.t
t/rt_36838_unique_and_bus_error.t
t/rt_40594_nullable.t
t/rt_48393_debug_panic_with_commit.t
t/rt_50503_fts3.t
t/rt_52573_manual_exclusive_lock.t
t/rt_53235_icu_compatibility.t
t/rt_62370_diconnected_handles_operation.t
t/rt_64177_ping_wipes_out_the_errstr.t
t/rt_67581_bind_params_mismatch.t
t/rt_71311_bind_col_and_unicode.t
t/rt_73159_fts_tokenizer_segfault.t
t/rt_73787_exponential_buffer_overflow.t
t/rt_76395_int_overflow.t
t/rt_77724_primary_key_with_a_whitespace.t
t/rt_78833_utf8_flag_for_column_names.t
t/rt_81536_multi_column_primary_key_info.t
t/rt_88228_sqlite_3_8_0_crash.t
t/rt_96050_db_filename_for_a_closed_database.t
t/rt_96877_unicode_statements.t
t/rt_96878_fts_contentless_table.t
t/rt_97598_crash_on_disconnect_with_virtual_tables.t
t/virtual_table/00_base.t
t/virtual_table/01_destroy.t
t/virtual_table/02_find_function.t
t/virtual_table/10_filecontent.t
t/virtual_table/11_filecontent_fulltext.t
t/virtual_table/20_perldata.t
t/virtual_table/21_perldata_charinfo.t
t/virtual_table/rt_124941.t

README view on Meta::CPAN


  The above will allocate 800M for DB cache; the default is 2M. Your
  sweet spot probably lies somewhere in between.

DRIVER PRIVATE ATTRIBUTES
 Database Handle Attributes
  sqlite_version
      Returns the version of the SQLite library which DBD::SQLite is
      using, e.g., "2.8.0". Can only be read.

  sqlite_unicode
      If set to a true value, DBD::SQLite will turn the UTF-8 flag
      on for all text strings coming out of the database (this
      feature is currently disabled for perl < 5.8.5). For more
      details on the UTF-8 flag see perlunicode. The default is for
      the UTF-8 flag to be turned off.

      Also note that due to some bizarreness in SQLite's type system
      (see <http://www.sqlite.org/datatype3.html>), if you want to
      retain blob-style behavior for some columns under
      "$dbh->{sqlite_unicode} = 1" (say, to store images in the
      database), you have to state so explicitly using the
      3-argument form of "bind_param" in DBI when doing updates:

        use DBI qw(:sql_types);
        $dbh->{sqlite_unicode} = 1;
        my $sth = $dbh->prepare("INSERT INTO mytable (blobcolumn) VALUES (?)");
  
        # Binary_data will be stored as is.
        $sth->bind_param(1, $binary_data, SQL_BLOB);

      Defining the column type as "BLOB" in the DDL is not
      sufficient.

      This attribute was originally named as "unicode", and renamed
      to "sqlite_unicode" for integrity since version 1.26_06. Old
      "unicode" attribute is still accessible but will be deprecated
      in the near future.

  sqlite_allow_multiple_statements
      If you set this to true, "do" method will process multiple
      statements at one go. This may be handy, but with performance
      penalty. See above for details.

  sqlite_use_immediate_transaction
      If you set this to true, DBD::SQLite tries to issue a "begin
      immediate transaction" (instead of "begin transaction") when

README view on Meta::CPAN

        txt1 COLLATE perl,
        txt2 COLLATE perllocale,
        txt3 COLLATE nocase
    )

  or

    SELECT * FROM foo ORDER BY name COLLATE perllocale

 Unicode handling
  If the attribute "$dbh->{sqlite_unicode}" is set, strings coming
  from the database and passed to the collation function will be
  properly tagged with the utf8 flag; but this only works if the
  "sqlite_unicode" attribute is set before the first call to a perl
  collation sequence . The recommended way to activate unicode is to
  set the parameter at connection time :

    my $dbh = DBI->connect(
        "dbi:SQLite:dbname=foo", "", "",
        {
            RaiseError     => 1,
            sqlite_unicode => 1,
        }
    );

 Adding user-defined collations
  The native SQLite API for adding user-defined collations is
  exposed through methods "sqlite_create_collation" and
  "sqlite_collation_needed".

  To avoid calling these functions every time a $dbh handle is
  created, "DBD::SQLite" offers a simpler interface through the

dbdimp.c view on Meta::CPAN

            }
        }

        /* sqlite_string_mode should be detected earlier, to register default functions correctly */

        SV** string_mode_svp = hv_fetchs(hv, "sqlite_string_mode", 0);
        if (string_mode_svp != NULL && SvOK(*string_mode_svp)) {
            string_mode = _extract_sqlite_string_mode_from_sv(aTHX_ *string_mode_svp);

        /* Legacy alternatives to sqlite_string_mode: */
        } else if (hv_exists(hv, "sqlite_unicode", 14)) {
            val = hv_fetch(hv, "sqlite_unicode", 14, 0);
            if ( (val && SvOK(*val)) ? SvIV(*val) : 0 ) {
                string_mode = DBD_SQLITE_STRING_MODE_UNICODE_NAIVE;
            }
        } else if (hv_exists(hv, "unicode", 7)) {
            val = hv_fetch(hv, "unicode", 7, 0);
            if ( (val && SvOK(*val)) ? SvIV(*val) : 0 ) {
                string_mode = DBD_SQLITE_STRING_MODE_UNICODE_NAIVE;
            }
        }
    }
    rc = sqlite_open2(dbname, &(imp_dbh->db), flag, extended);
    if ( rc != SQLITE_OK ) {
        return FALSE; /* -> undef in lib/DBD/SQLite.pm */
    }
    DBIc_IMPSET_on(imp_dbh);

dbdimp.c view on Meta::CPAN

            sqlite_trace(dbh, imp_dbh, 3, form("Unicode support is disabled for this version of perl."));
            string_mode = DBD_SQLITE_STRING_MODE_PV;
        }
#endif

        imp_dbh->string_mode = string_mode;

        return TRUE;
    }

    if (strEQ(key, "sqlite_unicode")) {
        /* it's too early to warn the deprecation of sqlite_unicode as it's widely used */
#if PERL_UNICODE_DOES_NOT_WORK_WELL
        sqlite_trace(dbh, imp_dbh, 3, form("Unicode support is disabled for this version of perl."));
        imp_dbh->string_mode = DBD_SQLITE_STRING_MODE_PV;
#else
        imp_dbh->string_mode = SvTRUE(valuesv) ? DBD_SQLITE_STRING_MODE_UNICODE_NAIVE : DBD_SQLITE_STRING_MODE_PV;
#endif
        return TRUE;
    }

    if (strEQ(key, "unicode")) {
        _warn_deprecated_if_possible(key, "sqlite_string_mode");
#if PERL_UNICODE_DOES_NOT_WORK_WELL
        sqlite_trace(dbh, imp_dbh, 3, form("Unicode support is disabled for this version of perl."));
        imp_dbh->string_mode = DBD_SQLITE_STRING_MODE_PV;
#else
        imp_dbh->string_mode = SvTRUE(valuesv) ? DBD_SQLITE_STRING_MODE_UNICODE_NAIVE : DBD_SQLITE_STRING_MODE_PV;
#endif
        return TRUE;
    }

dbdimp.c view on Meta::CPAN

       return sv_2mortal(newSViv(imp_dbh->extended_result_codes ? 1 : 0));
   }
   if (strEQ(key, "sqlite_prefer_numeric_type")) {
       return sv_2mortal(newSViv(imp_dbh->prefer_numeric_type ? 1 : 0));
   }

   if (strEQ(key, "sqlite_string_mode")) {
       return sv_2mortal(newSVuv(imp_dbh->string_mode));
   }

   if (strEQ(key, "sqlite_unicode") || strEQ(key, "unicode")) {
        _warn_deprecated_if_possible(key, "sqlite_string_mode");
#if PERL_UNICODE_DOES_NOT_WORK_WELL
       sqlite_trace(dbh, imp_dbh, 3, "Unicode support is disabled for this version of perl.");
       return sv_2mortal(newSViv(0));
#else
       return sv_2mortal(newSViv(imp_dbh->string_mode == DBD_SQLITE_STRING_MODE_UNICODE_NAIVE ? 1 : 0));
#endif
   }

    return NULL;

dbdimp.c view on Meta::CPAN

    } else {
        sqlite_set_result(aTHX_ context, POPs, 0 );
    }

    PUTBACK;
    FREETMPS;
    LEAVE;
}

static void
sqlite_db_func_dispatcher_unicode_naive(sqlite3_context *context, int argc, sqlite3_value **value)
{
    sqlite_db_func_dispatcher(DBD_SQLITE_STRING_MODE_UNICODE_NAIVE, context, argc, value);
}

static void
sqlite_db_func_dispatcher_unicode_fallback(sqlite3_context *context, int argc, sqlite3_value **value)
{
    sqlite_db_func_dispatcher(DBD_SQLITE_STRING_MODE_UNICODE_FALLBACK, context, argc, value);
}

static void
sqlite_db_func_dispatcher_unicode_strict(sqlite3_context *context, int argc, sqlite3_value **value)
{
    sqlite_db_func_dispatcher(DBD_SQLITE_STRING_MODE_UNICODE_STRICT, context, argc, value);
}

static void
sqlite_db_func_dispatcher_bytes(sqlite3_context *context, int argc, sqlite3_value **value)
{
    sqlite_db_func_dispatcher(DBD_SQLITE_STRING_MODE_BYTES, context, argc, value);
}

dbdimp.c view on Meta::CPAN

{
    sqlite_db_func_dispatcher(DBD_SQLITE_STRING_MODE_PV, context, argc, value);
}

typedef void (*dispatch_func_t)(sqlite3_context*, int, sqlite3_value**);

static dispatch_func_t _FUNC_DISPATCHER[_DBD_SQLITE_STRING_MODE_COUNT] = {
    sqlite_db_func_dispatcher_pv,
    sqlite_db_func_dispatcher_bytes,
    NULL, NULL,
    sqlite_db_func_dispatcher_unicode_naive,
    sqlite_db_func_dispatcher_unicode_fallback,
    sqlite_db_func_dispatcher_unicode_strict,
};

int
sqlite_db_create_function(pTHX_ SV *dbh, const char *name, int argc, SV *func, int flags)
{
    D_imp_dbh(dbh);
    int rc;
    SV *func_sv;

    if (!DBIc_ACTIVE(imp_dbh)) {

dbdimp.h view on Meta::CPAN

#include "SQLiteXS.h"
#include "sqlite3.h"

#define MY_CXT_KEY "DBD::SQLite::_guts" XS_VERSION

typedef enum {
    DBD_SQLITE_STRING_MODE_PV,
    DBD_SQLITE_STRING_MODE_BYTES,

    /* Leave space here so that we can use DBD_SQLITE_STRING_MODE_UNICODE_ANY
       as a means of checking for any unicode mode. */

    DBD_SQLITE_STRING_MODE_UNICODE_NAIVE = 4,
    DBD_SQLITE_STRING_MODE_UNICODE_FALLBACK,
    DBD_SQLITE_STRING_MODE_UNICODE_STRICT,

    _DBD_SQLITE_STRING_MODE_COUNT,
} dbd_sqlite_string_mode_t;

#define DBD_SQLITE_STRING_MODE_UNICODE_ANY DBD_SQLITE_STRING_MODE_UNICODE_NAIVE

lib/DBD/SQLite.pm view on Meta::CPAN

            }
        }
    }

    if (my $flags = $attr->{sqlite_open_flags}) {
        unless ($flags & (DBD::SQLite::OPEN_READONLY() | DBD::SQLite::OPEN_READWRITE())) {
            $attr->{sqlite_open_flags} |= DBD::SQLite::OPEN_READWRITE() | DBD::SQLite::OPEN_CREATE();
        }
    }

    # To avoid unicode and long file name problems on Windows,
    # convert to the shortname if the file (or parent directory) exists.
    if ( $^O =~ /MSWin32/ and $real ne ':memory:' and $real ne '' and $real !~ /^file:/ and !-f $real ) {
        require File::Basename;
        my ($file, $dir, $suffix) = File::Basename::fileparse($real);
        # We are creating a new file.
        # Does the directory it's in at least exist?
        if ( -d $dir ) {
            require Win32;
            $real = join '', grep { defined } Win32::GetShortPathName($dir), $file, $suffix;
        } else {

lib/DBD/SQLite.pm view on Meta::CPAN

decode, but it can also B<corrupt> B<Perl> B<itself!>

=item * DBD_SQLITE_STRING_MODE_PV (default, but B<DO> B<NOT> B<USE>): Like
DBD_SQLITE_STRING_MODE_BYTES, but when translating Perl strings to SQLite
the Perl string's internal byte buffer is given to SQLite. B<This> B<is>
B<bad>, but it's been the default for many years, and changing that would
break existing applications.

=back

=item C<sqlite_unicode> or C<unicode> (deprecated)

If truthy, equivalent to setting C<sqlite_string_mode> to
DBD_SQLITE_STRING_MODE_UNICODE_NAIVE; if falsy, equivalent to
DBD_SQLITE_STRING_MODE_PV.

Prefer C<sqlite_string_mode> in all new code.

=item sqlite_allow_multiple_statements

If you set this to true, C<do> method will process multiple

lib/DBD/SQLite.pm view on Meta::CPAN

or

  SELECT * FROM foo ORDER BY name COLLATE perllocale

=head2 Unicode handling

Depending on the C<< $dbh->{sqlite_string_mode} >> value, strings coming
from the database and passed to the collation function may be decoded as
UTF-8. This only works, though, if the C<sqlite_string_mode> attribute is
set B<before> the first call to a perl collation sequence. The recommended
way to activate unicode is to set C<sqlite_string_mode> at connection time:

  my $dbh = DBI->connect(
      "dbi:SQLite:dbname=foo", "", "",
      {
          RaiseError         => 1,
          sqlite_string_mode => DBD_SQLITE_STRING_MODE_UNICODE_STRICT,
      }
  );

=head2 Adding user-defined collations

lib/DBD/SQLite/Fulltext_search.pod view on Meta::CPAN

=item icu

The I<icu> tokenizer uses the ICU library to decide how to
identify word characters in different languages; however, this
requires SQLite to be compiled with the C<SQLITE_ENABLE_ICU>
pre-processor symbol defined. So, to use this tokenizer, you need
edit F<Makefile.PL> to add this flag in C<@CC_DEFINE>, and then
recompile C<DBD::SQLite>; of course, the prerequisite is to have
an ICU library available on your system.

=item unicode61

The I<unicode61> tokenizer works very much like "simple" except that it
does full unicode case folding according to rules in Unicode Version
6.1 and it recognizes unicode space and punctuation characters and
uses those to separate tokens. By contrast, the simple tokenizer only
does case folding of ASCII characters and only recognizes ASCII space
and punctuation characters as token separators.

By default, "unicode61" also removes all diacritics from Latin script
characters. This behaviour can be overridden by adding the tokenizer
argument C<"remove_diacritics=0">. For example:

  -- Create tables that remove diacritics from Latin script characters
  -- as part of tokenization.
  CREATE VIRTUAL TABLE txt1 USING fts4(tokenize=unicode61);
  CREATE VIRTUAL TABLE txt2 USING fts4(tokenize=unicode61 "remove_diacritics=1");

  -- Create a table that does not remove diacritics from Latin script
  -- characters as part of tokenization.
  CREATE VIRTUAL TABLE txt3 USING fts4(tokenize=unicode61 "remove_diacritics=0");

Additional options can customize the set of codepoints that unicode61
treats as separator characters or as token characters -- see the
documentation in L<http://www.sqlite.org/fts3.html#unicode61>.

=back

If a more complex tokenizing algorithm is required, for example to
implement stemming, discard punctuation, or to recognize compound words,
use the perl tokenizer to implement your own logic, as explained below.

=head2 Perl tokenizers

=head3 Declaring a perl tokenizer

lib/DBD/SQLite/VirtualTable/PerlData.pm view on Meta::CPAN

        USING perl(path, dev, ino, mode, nlink, uid, gid, rdev, size,
                         atime, mtime, ctime, blksize, blocks,
                   arrayrefs="main::file_stats");

  # search files
  my $sth = $dbh->prepare(<<"");
    SELECT * FROM file_stats 
      WHERE mtime BETWEEN ? AND ?
        AND uid IN (...)

=head2 Hashref example : unicode characters

Given any unicode character, the L<Unicode::UCD/charinfo> function
returns a hashref with various bits of information about that character.
So this can be exploited in a virtual table :

  use Unicode::UCD 'charinfo';
  our $chars = [map {charinfo($_)} 0x300..0x400]; # arbitrary subrange

  # create a temporary virtual table
  $dbh->do(<<"");
    CREATE VIRTUAL TABLE charinfo USING perl(
      code, name, block, script, category,

ppport.h view on Meta::CPAN

parse_arithexpr||5.013008|
parse_barestmt||5.013007|
parse_block||5.013007|
parse_body|||
parse_fullexpr||5.013008|
parse_fullstmt||5.013005|
parse_label||5.013007|
parse_listexpr||5.013008|
parse_stmtseq||5.013006|
parse_termexpr||5.013008|
parse_unicode_opts|||
parser_dup|||
parser_free|||
path_is_absolute|||n
peep|||
pending_Slabs_to_ro|||
perl_alloc_using|||n
perl_alloc|||n
perl_clone_using|||n
perl_clone|||n
perl_construct|||n

ppport.h view on Meta::CPAN

#endif
#ifndef PERL_PV_PRETTY_DUMP
#  define PERL_PV_PRETTY_DUMP            PERL_PV_PRETTY_ELLIPSES|PERL_PV_PRETTY_QUOTE
#endif

#ifndef PERL_PV_PRETTY_REGPROP
#  define PERL_PV_PRETTY_REGPROP         PERL_PV_PRETTY_ELLIPSES|PERL_PV_PRETTY_LTGT|PERL_PV_ESCAPE_RE
#endif

/* Hint: pv_escape
 * Note that unicode functionality is only backported to
 * those perl versions that support it. For older perl
 * versions, the implementation will fall back to bytes.
 */

#ifndef pv_escape
#if defined(NEED_pv_escape)
static char * DPPP_(my_pv_escape)(pTHX_ SV * dsv, char const * const str, const STRLEN count, const STRLEN max, STRLEN * const escaped, const U32 flags);
static
#else
extern char * DPPP_(my_pv_escape)(pTHX_ SV * dsv, char const * const str, const STRLEN count, const STRLEN max, STRLEN * const escaped, const U32 flags);

sqlite3.c view on Meta::CPAN

**
** ^The third argument is the value to bind to the parameter.
** ^If the third parameter to sqlite3_bind_text() or sqlite3_bind_text16()
** or sqlite3_bind_blob() is a NULL pointer then the fourth parameter
** is ignored and the end result is the same as sqlite3_bind_null().
** ^If the third parameter to sqlite3_bind_text() is not NULL, then
** it should be a pointer to well-formed UTF8 text.
** ^If the third parameter to sqlite3_bind_text16() is not NULL, then
** it should be a pointer to well-formed UTF16 text.
** ^If the third parameter to sqlite3_bind_text64() is not NULL, then
** it should be a pointer to a well-formed unicode string that is
** either UTF8 if the sixth parameter is SQLITE_UTF8, or UTF16
** otherwise.
**
** [[byte-order determination rules]] ^The byte-order of
** UTF16 input text is determined by the byte-order mark (BOM, U+FEFF)
** found in first character, which is removed, or in the absence of a BOM
** the byte order is the native byte order of the host
** machine for sqlite3_bind_text16() or the byte order specified in
** the 6th parameter for sqlite3_bind_text64().)^
** ^If UTF16 input text contains invalid unicode
** characters, then SQLite might change those invalid characters
** into the unicode replacement character: U+FFFD.
**
** ^(In those routines that have a fourth argument, its value is the
** number of bytes in the parameter.  To be clear: the value is the
** number of <u>bytes</u> in the value, not the number of characters.)^
** ^If the fourth parameter to sqlite3_bind_text() or sqlite3_bind_text16()
** is negative, then the length of the string is
** the number of bytes up to the first zero terminator.
** If the fourth parameter to sqlite3_bind_blob() is negative, then
** the behavior is undefined.
** If a non-negative fourth parameter is provided to sqlite3_bind_text()

sqlite3.c view on Meta::CPAN

** specified by the interface procedure.  ^So, for example, if
** sqlite3_result_text16le() is invoked with text that begins
** with bytes 0xfe, 0xff (a big-endian byte-order mark) then the
** first two bytes of input are skipped and the remaining input
** is interpreted as UTF16BE text.
**
** ^For UTF16 input text to the sqlite3_result_text16(),
** sqlite3_result_text16be(), sqlite3_result_text16le(), and
** sqlite3_result_text64() routines, if the text contains invalid
** UTF16 characters, the invalid characters might be converted
** into the unicode replacement character, U+FFFD.
**
** ^The sqlite3_result_value() interface sets the result of
** the application-defined function to be a copy of the
** [unprotected sqlite3_value] object specified by the 2nd parameter.  ^The
** sqlite3_result_value() interface makes a copy of the [sqlite3_value]
** so that the [sqlite3_value] specified in the parameter may change or
** be deallocated after sqlite3_result_value() returns without harm.
** ^A [protected sqlite3_value] object may always be used where an
** [unprotected sqlite3_value] object is required, so either
** kind of [sqlite3_value] object can be used with this interface.

sqlite3.c view on Meta::CPAN

**
**  *  MEM_Real                Real stored in Mem.u.r.
**
**  *  MEM_IntReal             Real stored as an integer in Mem.u.i.
**
** If the MEM_Null flag is set, then the value is an SQL NULL value.
** For a pointer type created using sqlite3_bind_pointer() or
** sqlite3_result_pointer() the MEM_Term and MEM_Subtype flags are also set.
**
** If the MEM_Str flag is set then Mem.z points at a string representation.
** Usually this is encoded in the same unicode encoding as the main
** database (see below for exceptions). If the MEM_Term flag is also
** set, then the string is nul terminated. The MEM_Int and MEM_Real
** flags may coexist with the MEM_Str flag.
*/
#define MEM_Undefined 0x0000   /* Value is undefined */
#define MEM_Null      0x0001   /* Value is NULL (or a pointer) */
#define MEM_Str       0x0002   /* Value is a string */
#define MEM_Int       0x0004   /* Value is an integer */
#define MEM_Real      0x0008   /* Value is a real number */
#define MEM_Blob      0x0010   /* Value is a BLOB */

sqlite3.c view on Meta::CPAN

    *zOut++ = (u8)(c&0x00FF);                                       \
  }else{                                                            \
    *zOut++ = (u8)(0x00D8 + (((c-0x10000)>>18)&0x03));              \
    *zOut++ = (u8)(((c>>10)&0x003F) + (((c-0x10000)>>10)&0x00C0));  \
    *zOut++ = (u8)(0x00DC + ((c>>8)&0x03));                         \
    *zOut++ = (u8)(c&0x00FF);                                       \
  }                                                                 \
}

/*
** Translate a single UTF-8 character.  Return the unicode value.
**
** During translation, assume that the byte that zTerm points
** is a 0x00.
**
** Write a pointer to the next unread byte back into *pzNext.
**
** Notes On Invalid UTF-8:
**
**  *  This routine never allows a 7-bit character (0x00 through 0x7f) to
**     be encoded as a multi-byte character.  Any multi-byte character that

sqlite3.c view on Meta::CPAN

**  *  This routine never allows a UTF16 surrogate value to be encoded.
**     If a multi-byte character attempts to encode a value between
**     0xd800 and 0xe000 then it is rendered as 0xfffd.
**
**  *  Bytes in the range of 0x80 through 0xbf which occur as the first
**     byte of a character are interpreted as single-byte characters
**     and rendered as themselves even though they are technically
**     invalid characters.
**
**  *  This routine accepts over-length UTF8 encodings
**     for unicode values 0x80 and greater.  It does not change over-length
**     encodings to 0xfffd as some systems recommend.
*/
#define READ_UTF8(zIn, zTerm, c)                           \
  c = *(zIn++);                                            \
  if( c>=0xc0 ){                                           \
    c = sqlite3Utf8Trans1[c-0xc0];                         \
    while( zIn!=zTerm && (*zIn & 0xc0)==0x80 ){            \
      c = (c<<6) + (0x3f & *(zIn++));                      \
    }                                                      \
    if( c<0x80                                             \

sqlite3.c view on Meta::CPAN

      pMem->z[pMem->n+1] = '\0';
      pMem->flags |= MEM_Term;
      pMem->enc = bom;
    }
  }
  return rc;
}
#endif /* SQLITE_OMIT_UTF16 */

/*
** pZ is a UTF-8 encoded unicode string. If nByte is less than zero,
** return the number of unicode characters in pZ up to (but not including)
** the first 0x00 byte. If nByte is not less than zero, return the
** number of unicode characters in the first nByte of pZ (or up to
** the first 0x00, whichever comes first).
*/
SQLITE_PRIVATE int sqlite3Utf8CharLen(const char *zIn, int nByte){
  int r = 0;
  const u8 *z = (const u8*)zIn;
  const u8 *zTerm;
  if( nByte>=0 ){
    zTerm = &z[nByte];
  }else{
    zTerm = (const u8*)(-1);

sqlite3.c view on Meta::CPAN

    sqlite3VdbeMemRelease(&m);
    m.z = 0;
  }
  assert( (m.flags & MEM_Term)!=0 || db->mallocFailed );
  assert( (m.flags & MEM_Str)!=0 || db->mallocFailed );
  assert( m.z || db->mallocFailed );
  return m.z;
}

/*
** zIn is a UTF-16 encoded unicode string at least nChar characters long.
** Return the number of bytes in the first nChar unicode characters
** in pZ.  nChar must be non-negative.
*/
SQLITE_PRIVATE int sqlite3Utf16ByteLen(const void *zIn, int nChar){
  int c;
  unsigned char const *z = zIn;
  int n = 0;

  if( SQLITE_UTF16NATIVE==SQLITE_UTF16LE ) z++;
  while( n<nChar ){
    c = z[0];

sqlite3.c view on Meta::CPAN

    return 0;
  }
  zTextMbcs = winUnicodeToMbcs(zTmpWide, useAnsi);
  sqlite3_free(zTmpWide);
  return zTextMbcs;
}

/*
** This is a public wrapper for the winUtf8ToUnicode() function.
*/
SQLITE_API LPWSTR sqlite3_win32_utf8_to_unicode(const char *zText){
#ifdef SQLITE_ENABLE_API_ARMOR
  if( !zText ){
    (void)SQLITE_MISUSE_BKPT;
    return 0;
  }
#endif
#ifndef SQLITE_OMIT_AUTOINIT
  if( sqlite3_initialize() ) return 0;
#endif
  return winUtf8ToUnicode(zText);
}

/*
** This is a public wrapper for the winUnicodeToUtf8() function.
*/
SQLITE_API char *sqlite3_win32_unicode_to_utf8(LPCWSTR zWideText){
#ifdef SQLITE_ENABLE_API_ARMOR
  if( !zWideText ){
    (void)SQLITE_MISUSE_BKPT;
    return 0;
  }
#endif
#ifndef SQLITE_OMIT_AUTOINIT
  if( sqlite3_initialize() ) return 0;
#endif
  return winUnicodeToUtf8(zWideText);

sqlite3.c view on Meta::CPAN

** This function is the same as sqlite3_win32_set_directory (below); however,
** it accepts a UTF-16 string.
*/
SQLITE_API int sqlite3_win32_set_directory16(
  unsigned long type, /* Identifier for directory being set or reset */
  const void *zValue  /* New value for directory being set or reset */
){
  int rc;
  char *zUtf8 = 0;
  if( zValue ){
    zUtf8 = sqlite3_win32_unicode_to_utf8(zValue);
    if( zUtf8==0 ) return SQLITE_NOMEM_BKPT;
  }
  rc = sqlite3_win32_set_directory8(type, zUtf8);
  if( zUtf8 ) sqlite3_free(zUtf8);
  return rc;
}

/*
** This function sets the data directory or the temporary directory based on
** the provided arguments.  The type argument must be 1 in order to set the

sqlite3.c view on Meta::CPAN

  sqlite3QuoteValue(&str,argv[0]);
  sqlite3_result_text(context, sqlite3StrAccumFinish(&str), str.nChar,
                      SQLITE_DYNAMIC);
  if( str.accError!=SQLITE_OK ){
    sqlite3_result_null(context);
    sqlite3_result_error_code(context, str.accError);
  }
}

/*
** The unicode() function.  Return the integer unicode code-point value
** for the first character of the input string.
*/
static void unicodeFunc(
  sqlite3_context *context,
  int argc,
  sqlite3_value **argv
){
  const unsigned char *z = sqlite3_value_text(argv[0]);
  (void)argc;
  if( z && z[0] ) sqlite3_result_int(context, sqlite3Utf8Read(&z));
}

/*
** The char() function takes zero or more arguments, each of which is
** an integer.  It constructs a string where each character of the string
** is the unicode character for the corresponding integer argument.
*/
static void charFunc(
  sqlite3_context *context,
  int argc,
  sqlite3_value **argv
){
  unsigned char *z, *zOut;
  int i;
  zOut = z = sqlite3_malloc64( argc*4+1 );
  if( z==0 ){

sqlite3.c view on Meta::CPAN

    FUNCTION(max,                0, 1, 1, 0                ),
    WAGGREGATE(max, 1, 1, 1, minmaxStep, minMaxFinalize, minMaxValue, 0,
                                 SQLITE_FUNC_MINMAX|SQLITE_FUNC_ANYORDER ),
    FUNCTION2(typeof,            1, 0, 0, typeofFunc,  SQLITE_FUNC_TYPEOF),
    FUNCTION2(subtype,           1, 0, 0, subtypeFunc, SQLITE_FUNC_TYPEOF),
    FUNCTION2(length,            1, 0, 0, lengthFunc,  SQLITE_FUNC_LENGTH),
    FUNCTION2(octet_length,      1, 0, 0, bytelengthFunc,SQLITE_FUNC_BYTELEN),
    FUNCTION(instr,              2, 0, 0, instrFunc        ),
    FUNCTION(printf,            -1, 0, 0, printfFunc       ),
    FUNCTION(format,            -1, 0, 0, printfFunc       ),
    FUNCTION(unicode,            1, 0, 0, unicodeFunc      ),
    FUNCTION(char,              -1, 0, 0, charFunc         ),
    FUNCTION(abs,                1, 0, 0, absFunc          ),
#ifdef SQLITE_DEBUG
    FUNCTION(fpdecode,           3, 0, 0, fpdecodeFunc     ),
#endif
#ifndef SQLITE_OMIT_FLOATING_POINT
    FUNCTION(round,              1, 0, 0, roundFunc        ),
    FUNCTION(round,              2, 0, 0, roundFunc        ),
#endif
    FUNCTION(upper,              1, 0, 0, upperFunc        ),

sqlite3.c view on Meta::CPAN

    nBytes = sz;
  }
  sqlite3_mutex_enter(db->mutex);
  zSql8 = sqlite3Utf16to8(db, zSql, nBytes, SQLITE_UTF16NATIVE);
  if( zSql8 ){
    rc = sqlite3LockAndPrepare(db, zSql8, -1, prepFlags, 0, ppStmt, &zTail8);
  }

  if( zTail8 && pzTail ){
    /* If sqlite3_prepare returns a tail pointer, we calculate the
    ** equivalent pointer into the UTF-16 string by counting the unicode
    ** characters between zSql8 and zTail8, and then returning a pointer
    ** the same number of characters into the UTF-16 string.
    */
    int chars_parsed = sqlite3Utf8CharLen(zSql8, (int)(zTail8-zSql8));
    *pzTail = (u8 *)zSql + sqlite3Utf16ByteLen(zSql, chars_parsed);
  }
  sqlite3DbFree(db, zSql8);
  rc = sqlite3ApiExit(db, rc);
  sqlite3_mutex_leave(db->mutex);
  return rc;

sqlite3.c view on Meta::CPAN

#define CC_LP        17    /* '(' */
#define CC_RP        18    /* ')' */
#define CC_SEMI      19    /* ';' */
#define CC_PLUS      20    /* '+' */
#define CC_STAR      21    /* '*' */
#define CC_PERCENT   22    /* '%' */
#define CC_COMMA     23    /* ',' */
#define CC_AND       24    /* '&' */
#define CC_TILDA     25    /* '~' */
#define CC_DOT       26    /* '.' */
#define CC_ID        27    /* unicode characters usable in IDs */
#define CC_ILLEGAL   28    /* Illegal character */
#define CC_NUL       29    /* 0x00 */
#define CC_BOM       30    /* First byte of UTF8 BOM:  0xEF 0xBB 0xBF */

static const unsigned char aiClass[] = {
#ifdef SQLITE_ASCII
/*         x0  x1  x2  x3  x4  x5  x6  x7  x8  x9  xa  xb  xc  xd  xe  xf */
/* 0x */   29, 28, 28, 28, 28, 28, 28, 28, 28,  7,  7, 28,  7,  7, 28, 28,
/* 1x */   28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28,
/* 2x */    7, 15,  8,  5,  4, 22, 24,  8, 17, 18, 21, 20, 23, 11, 26, 16,

sqlite3.c view on Meta::CPAN

    Fts3Table*, Fts3MultiSegReader*, int, const char*, int);
SQLITE_PRIVATE int sqlite3Fts3MsrIncrNext(
    Fts3Table *, Fts3MultiSegReader *, sqlite3_int64 *, char **, int *);
SQLITE_PRIVATE int sqlite3Fts3EvalPhrasePoslist(Fts3Cursor *, Fts3Expr *, int iCol, char **);
SQLITE_PRIVATE int sqlite3Fts3MsrOvfl(Fts3Cursor *, Fts3MultiSegReader *, int *);
SQLITE_PRIVATE int sqlite3Fts3MsrIncrRestart(Fts3MultiSegReader *pCsr);

/* fts3_tokenize_vtab.c */
SQLITE_PRIVATE int sqlite3Fts3InitTok(sqlite3*, Fts3Hash *, void(*xDestroy)(void*));

/* fts3_unicode2.c (functions generated by parsing unicode text files) */
#ifndef SQLITE_DISABLE_FTS3_UNICODE
SQLITE_PRIVATE int sqlite3FtsUnicodeFold(int, int);
SQLITE_PRIVATE int sqlite3FtsUnicodeIsalnum(int);
SQLITE_PRIVATE int sqlite3FtsUnicodeIsdiacritic(int);
#endif

SQLITE_PRIVATE int sqlite3Fts3ExprIterate(Fts3Expr*, int (*x)(Fts3Expr*,int,void*), void*);

SQLITE_PRIVATE int sqlite3Fts3IntegrityCheck(Fts3Table *p, int *pbOk);

sqlite3.c view on Meta::CPAN

    sqlite3Fts3HashInit(&pHash->hash, FTS3_HASH_STRING, 1);
    pHash->nRef = 0;
  }

  /* Load the built-in tokenizers into the hash table */
  if( rc==SQLITE_OK ){
    if( sqlite3Fts3HashInsert(&pHash->hash, "simple", 7, (void *)pSimple)
     || sqlite3Fts3HashInsert(&pHash->hash, "porter", 7, (void *)pPorter)

#ifndef SQLITE_DISABLE_FTS3_UNICODE
     || sqlite3Fts3HashInsert(&pHash->hash, "unicode61", 10, (void *)pUnicode)
#endif
#ifdef SQLITE_ENABLE_ICU
     || (pIcu && sqlite3Fts3HashInsert(&pHash->hash, "icu", 4, (void *)pIcu))
#endif
    ){
      rc = SQLITE_NOMEM;
    }
  }

#ifdef SQLITE_TEST

sqlite3.c view on Meta::CPAN

  }else{
    /* Retrieve matchinfo() data. */
    fts3GetMatchinfo(pContext, pCsr, zFormat);
    sqlite3Fts3SegmentsClose(pTab);
  }
}

#endif

/************** End of fts3_snippet.c ****************************************/
/************** Begin file fts3_unicode.c ************************************/
/*
** 2012 May 24
**
** The author disclaims copyright to this source code.  In place of
** a legal notice, here is a blessing:
**
**    May you do good and not evil.
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.
**
******************************************************************************
**
** Implementation of the "unicode" full-text-search tokenizer.
*/

#ifndef SQLITE_DISABLE_FTS3_UNICODE

/* #include "fts3Int.h" */
#if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_FTS3)

/* #include <assert.h> */
/* #include <stdlib.h> */
/* #include <stdio.h> */

sqlite3.c view on Meta::CPAN

  }else{                                               \
    *zOut++ = 0xF0 + (u8)((c>>18) & 0x07);             \
    *zOut++ = 0x80 + (u8)((c>>12) & 0x3F);             \
    *zOut++ = 0x80 + (u8)((c>>6) & 0x3F);              \
    *zOut++ = 0x80 + (u8)(c & 0x3F);                   \
  }                                                    \
}

#endif /* ifndef SQLITE_AMALGAMATION */

typedef struct unicode_tokenizer unicode_tokenizer;
typedef struct unicode_cursor unicode_cursor;

struct unicode_tokenizer {
  sqlite3_tokenizer base;
  int eRemoveDiacritic;
  int nException;
  int *aiException;
};

struct unicode_cursor {
  sqlite3_tokenizer_cursor base;
  const unsigned char *aInput;    /* Input text being tokenized */
  int nInput;                     /* Size of aInput[] in bytes */
  int iOff;                       /* Current offset within aInput[] */
  int iToken;                     /* Index of next token to be returned */
  char *zToken;                   /* storage for current token */
  int nAlloc;                     /* space allocated at zToken */
};


/*
** Destroy a tokenizer allocated by unicodeCreate().
*/
static int unicodeDestroy(sqlite3_tokenizer *pTokenizer){
  if( pTokenizer ){
    unicode_tokenizer *p = (unicode_tokenizer *)pTokenizer;
    sqlite3_free(p->aiException);
    sqlite3_free(p);
  }
  return SQLITE_OK;
}

/*
** As part of a tokenchars= or separators= option, the CREATE VIRTUAL TABLE
** statement has specified that the tokenizer for this table shall consider
** all characters in string zIn/nIn to be separators (if bAlnum==0) or
** token characters (if bAlnum==1).
**
** For each codepoint in the zIn/nIn string, this function checks if the
** sqlite3FtsUnicodeIsalnum() function already returns the desired result.
** If so, no action is taken. Otherwise, the codepoint is added to the
** unicode_tokenizer.aiException[] array. For the purposes of tokenization,
** the return value of sqlite3FtsUnicodeIsalnum() is inverted for all
** codepoints in the aiException[] array.
**
** If a standalone diacritic mark (one that sqlite3FtsUnicodeIsdiacritic()
** identifies as a diacritic) occurs in the zIn/nIn string it is ignored.
** It is not possible to change the behavior of the tokenizer with respect
** to these codepoints.
*/
static int unicodeAddExceptions(
  unicode_tokenizer *p,           /* Tokenizer to add exceptions to */
  int bAlnum,                     /* Replace Isalnum() return value with this */
  const char *zIn,                /* Array of characters to make exceptions */
  int nIn                         /* Length of z in bytes */
){
  const unsigned char *z = (const unsigned char *)zIn;
  const unsigned char *zTerm = &z[nIn];
  unsigned int iCode;
  int nEntry = 0;

  assert( bAlnum==0 || bAlnum==1 );

sqlite3.c view on Meta::CPAN

    p->aiException = aNew;
    p->nException = nNew;
  }

  return SQLITE_OK;
}

/*
** Return true if the p->aiException[] array contains the value iCode.
*/
static int unicodeIsException(unicode_tokenizer *p, int iCode){
  if( p->nException>0 ){
    int *a = p->aiException;
    int iLo = 0;
    int iHi = p->nException-1;

    while( iHi>=iLo ){
      int iTest = (iHi + iLo) / 2;
      if( iCode==a[iTest] ){
        return 1;
      }else if( iCode>a[iTest] ){

sqlite3.c view on Meta::CPAN

    }
  }

  return 0;
}

/*
** Return true if, for the purposes of tokenization, codepoint iCode is
** considered a token character (not a separator).
*/
static int unicodeIsAlnum(unicode_tokenizer *p, int iCode){
  assert( (sqlite3FtsUnicodeIsalnum(iCode) & 0xFFFFFFFE)==0 );
  return sqlite3FtsUnicodeIsalnum(iCode) ^ unicodeIsException(p, iCode);
}

/*
** Create a new tokenizer instance.
*/
static int unicodeCreate(
  int nArg,                       /* Size of array argv[] */
  const char * const *azArg,      /* Tokenizer creation arguments */
  sqlite3_tokenizer **pp          /* OUT: New tokenizer handle */
){
  unicode_tokenizer *pNew;        /* New tokenizer object */
  int i;
  int rc = SQLITE_OK;

  pNew = (unicode_tokenizer *) sqlite3_malloc(sizeof(unicode_tokenizer));
  if( pNew==NULL ) return SQLITE_NOMEM;
  memset(pNew, 0, sizeof(unicode_tokenizer));
  pNew->eRemoveDiacritic = 1;

  for(i=0; rc==SQLITE_OK && i<nArg; i++){
    const char *z = azArg[i];
    int n = (int)strlen(z);

    if( n==19 && memcmp("remove_diacritics=1", z, 19)==0 ){
      pNew->eRemoveDiacritic = 1;
    }
    else if( n==19 && memcmp("remove_diacritics=0", z, 19)==0 ){
      pNew->eRemoveDiacritic = 0;
    }
    else if( n==19 && memcmp("remove_diacritics=2", z, 19)==0 ){
      pNew->eRemoveDiacritic = 2;
    }
    else if( n>=11 && memcmp("tokenchars=", z, 11)==0 ){
      rc = unicodeAddExceptions(pNew, 1, &z[11], n-11);
    }
    else if( n>=11 && memcmp("separators=", z, 11)==0 ){
      rc = unicodeAddExceptions(pNew, 0, &z[11], n-11);
    }
    else{
      /* Unrecognized argument */
      rc  = SQLITE_ERROR;
    }
  }

  if( rc!=SQLITE_OK ){
    unicodeDestroy((sqlite3_tokenizer *)pNew);
    pNew = 0;
  }
  *pp = (sqlite3_tokenizer *)pNew;
  return rc;
}

/*
** Prepare to begin tokenizing a particular string.  The input
** string to be tokenized is pInput[0..nBytes-1].  A cursor
** used to incrementally tokenize this string is returned in
** *ppCursor.
*/
static int unicodeOpen(
  sqlite3_tokenizer *p,           /* The tokenizer */
  const char *aInput,             /* Input string */
  int nInput,                     /* Size of string aInput in bytes */
  sqlite3_tokenizer_cursor **pp   /* OUT: New cursor object */
){
  unicode_cursor *pCsr;

  pCsr = (unicode_cursor *)sqlite3_malloc(sizeof(unicode_cursor));
  if( pCsr==0 ){
    return SQLITE_NOMEM;
  }
  memset(pCsr, 0, sizeof(unicode_cursor));

  pCsr->aInput = (const unsigned char *)aInput;
  if( aInput==0 ){
    pCsr->nInput = 0;
    pCsr->aInput = (const unsigned char*)"";
  }else if( nInput<0 ){
    pCsr->nInput = (int)strlen(aInput);
  }else{
    pCsr->nInput = nInput;
  }

  *pp = &pCsr->base;
  UNUSED_PARAMETER(p);
  return SQLITE_OK;
}

/*
** Close a tokenization cursor previously opened by a call to
** simpleOpen() above.
*/
static int unicodeClose(sqlite3_tokenizer_cursor *pCursor){
  unicode_cursor *pCsr = (unicode_cursor *) pCursor;
  sqlite3_free(pCsr->zToken);
  sqlite3_free(pCsr);
  return SQLITE_OK;
}

/*
** Extract the next token from a tokenization cursor.  The cursor must
** have been opened by a prior call to simpleOpen().
*/
static int unicodeNext(
  sqlite3_tokenizer_cursor *pC,   /* Cursor returned by simpleOpen */
  const char **paToken,           /* OUT: Token text */
  int *pnToken,                   /* OUT: Number of bytes at *paToken */
  int *piStart,                   /* OUT: Starting offset of token */
  int *piEnd,                     /* OUT: Ending offset of token */
  int *piPos                      /* OUT: Position integer of token */
){
  unicode_cursor *pCsr = (unicode_cursor *)pC;
  unicode_tokenizer *p = ((unicode_tokenizer *)pCsr->base.pTokenizer);
  unsigned int iCode = 0;
  char *zOut;
  const unsigned char *z = &pCsr->aInput[pCsr->iOff];
  const unsigned char *zStart = z;
  const unsigned char *zEnd;
  const unsigned char *zTerm = &pCsr->aInput[pCsr->nInput];

  /* Scan past any delimiter characters before the start of the next token.
  ** Return SQLITE_DONE early if this takes us all the way to the end of
  ** the input.  */
  while( z<zTerm ){
    READ_UTF8(z, zTerm, iCode);
    if( unicodeIsAlnum(p, (int)iCode) ) break;
    zStart = z;
  }
  if( zStart>=zTerm ) return SQLITE_DONE;

  zOut = pCsr->zToken;
  do {
    int iOut;

    /* Grow the output buffer if required. */
    if( (zOut-pCsr->zToken)>=(pCsr->nAlloc-4) ){

sqlite3.c view on Meta::CPAN

    /* Write the folded case of the last character read to the output */
    zEnd = z;
    iOut = sqlite3FtsUnicodeFold((int)iCode, p->eRemoveDiacritic);
    if( iOut ){
      WRITE_UTF8(zOut, iOut);
    }

    /* If the cursor is not at EOF, read the next character */
    if( z>=zTerm ) break;
    READ_UTF8(z, zTerm, iCode);
  }while( unicodeIsAlnum(p, (int)iCode)
       || sqlite3FtsUnicodeIsdiacritic((int)iCode)
  );

  /* Set the output variables and return. */
  pCsr->iOff = (int)(z - pCsr->aInput);
  *paToken = pCsr->zToken;
  *pnToken = (int)(zOut - pCsr->zToken);
  *piStart = (int)(zStart - pCsr->aInput);
  *piEnd = (int)(zEnd - pCsr->aInput);
  *piPos = pCsr->iToken++;
  return SQLITE_OK;
}

/*
** Set *ppModule to a pointer to the sqlite3_tokenizer_module
** structure for the unicode tokenizer.
*/
SQLITE_PRIVATE void sqlite3Fts3UnicodeTokenizer(sqlite3_tokenizer_module const **ppModule){
  static const sqlite3_tokenizer_module module = {
    0,
    unicodeCreate,
    unicodeDestroy,
    unicodeOpen,
    unicodeClose,
    unicodeNext,
    0,
  };
  *ppModule = &module;
}

#endif /* !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_FTS3) */
#endif /* ifndef SQLITE_DISABLE_FTS3_UNICODE */

/************** End of fts3_unicode.c ****************************************/
/************** Begin file fts3_unicode2.c ***********************************/
/*
** 2012-05-25
**
** The author disclaims copyright to this source code.  In place of
** a legal notice, here is a blessing:
**
**    May you do good and not evil.
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.
**

sqlite3.c view on Meta::CPAN

/*
** DO NOT EDIT THIS MACHINE GENERATED FILE.
*/

#ifndef SQLITE_DISABLE_FTS3_UNICODE
#if defined(SQLITE_ENABLE_FTS3) || defined(SQLITE_ENABLE_FTS4)

/* #include <assert.h> */

/*
** Return true if the argument corresponds to a unicode codepoint
** classified as either a letter or a number. Otherwise false.
**
** The results are undefined if the value passed to this function
** is less than zero.
*/
SQLITE_PRIVATE int sqlite3FtsUnicodeIsalnum(int c){
  /* Each unsigned integer in the following array corresponds to a contiguous
  ** range of unicode codepoints that are not either letters or numbers (i.e.
  ** codepoints for which this function should return 0).
  **
  ** The most significant 22 bits in each 32-bit value contain the first
  ** codepoint in the range. The least significant 10 bits are used to store
  ** the size of the range (always at least 1). In other words, the value
  ** ((C<<22) + N) represents a range of N codepoints starting with codepoint
  ** C. It is not possible to represent a range larger than 1023 codepoints
  ** using this format.
  */
  static const unsigned int aEntry[] = {

sqlite3.c view on Meta::CPAN

      iHi = iTest-1;
    }
  }
  assert( key>=aDia[iRes] );
  if( bComplex==0 && (aChar[iRes] & 0x80) ) return c;
  return (c > (aDia[iRes]>>3) + (aDia[iRes]&0x07)) ? c : ((int)aChar[iRes] & 0x7F);
}


/*
** Return true if the argument interpreted as a unicode codepoint
** is a diacritical modifier character.
*/
SQLITE_PRIVATE int sqlite3FtsUnicodeIsdiacritic(int c){
  unsigned int mask0 = 0x08029FDF;
  unsigned int mask1 = 0x000361F8;
  if( c<768 || c>817 ) return 0;
  return (c < 768+32) ?
      (mask0 & ((unsigned int)1 << (c-768))) :
      (mask1 & ((unsigned int)1 << (c-768-32)));
}


/*
** Interpret the argument as a unicode codepoint. If the codepoint
** is an upper case character that has a lower case equivalent,
** return the codepoint corresponding to the lower case version.
** Otherwise, return a copy of the argument.
**
** The results are undefined if the value passed to this function
** is less than zero.
*/
SQLITE_PRIVATE int sqlite3FtsUnicodeFold(int c, int eRemoveDiacritic){
  /* Each entry in the following array defines a rule for folding a range
  ** of codepoints to lower case. The rule applies to a range of nRange

sqlite3.c view on Meta::CPAN

  ** to all nRange codepoints (i.e. all nRange codepoints are upper case and
  ** need to be folded). Or, if it is set, then the rule only applies to
  ** every second codepoint in the range, starting with codepoint C.
  **
  ** The 7 most significant bits in flags are an index into the aiOff[]
  ** array. If a specific codepoint C does require folding, then its lower
  ** case equivalent is ((C + aiOff[flags>>1]) & 0xFFFF).
  **
  ** The contents of this array are generated by parsing the CaseFolding.txt
  ** file distributed as part of the "Unicode Character Database". See
  ** http://www.unicode.org for details.
  */
  static const struct TableEntry {
    unsigned short iCode;
    unsigned char flags;
    unsigned char nRange;
  } aEntry[] = {
    {65, 14, 26},          {181, 64, 1},          {192, 14, 23},
    {216, 14, 7},          {256, 1, 48},          {306, 1, 6},
    {313, 1, 16},          {330, 1, 46},          {376, 116, 1},
    {377, 1, 6},           {383, 104, 1},         {385, 50, 1},

sqlite3.c view on Meta::CPAN


  else if( c>=66560 && c<66600 ){
    ret = c + 40;
  }

  return ret;
}
#endif /* defined(SQLITE_ENABLE_FTS3) || defined(SQLITE_ENABLE_FTS4) */
#endif /* !defined(SQLITE_DISABLE_FTS3_UNICODE) */

/************** End of fts3_unicode2.c ***************************************/
/************** Begin file json.c ********************************************/
/*
** 2015-08-12
**
** The author disclaims copyright to this source code.  In place of
** a legal notice, here is a blessing:
**
**    May you do good and not evil.
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.

sqlite3.c view on Meta::CPAN

          }else if( v<=0x7ff ){
            assert( szEscape>=2 );
            zOut[iOut++] = (char)(0xc0 | (v>>6));
            zOut[iOut++] = 0x80 | (v&0x3f);
          }else if( v<0x10000 ){
            assert( szEscape>=3 );
            zOut[iOut++] = 0xe0 | (v>>12);
            zOut[iOut++] = 0x80 | ((v>>6)&0x3f);
            zOut[iOut++] = 0x80 | (v&0x3f);
          }else if( v==JSON_INVALID_CHAR ){
            /* Silently ignore illegal unicode */
          }else{
            assert( szEscape>=4 );
            zOut[iOut++] = 0xf0 | (v>>18);
            zOut[iOut++] = 0x80 | ((v>>12)&0x3f);
            zOut[iOut++] = 0x80 | ((v>>6)&0x3f);
            zOut[iOut++] = 0x80 | (v&0x3f);
          }
          iIn += szEscape - 1;
        }else{
          zOut[iOut++] = c;

sqlite3.c view on Meta::CPAN

**
**    May you do good and not evil.
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.
**
*************************************************************************
** $Id: icu.c,v 1.7 2007/12/13 21:54:11 drh Exp $
**
** This file implements an integration between the ICU library
** ("International Components for Unicode", an open-source library
** for handling unicode data) and SQLite. The integration uses
** ICU to provide the following to SQLite:
**
**   * An implementation of the SQL regexp() function (and hence REGEXP
**     operator) using the ICU uregex_XX() APIs.
**
**   * Implementations of the SQL scalar upper() and lower() functions
**     for case mapping.
**
**   * Integration of ICU and SQLite collation sequences.
**
**   * An implementation of the LIKE operator that uses ICU to
**     provide case-independent matching.
*/

#if !defined(SQLITE_CORE)                  \
 || defined(SQLITE_ENABLE_ICU)             \
 || defined(SQLITE_ENABLE_ICU_COLLATIONS)

/* Include ICU headers */
#include <unicode/utypes.h>
#include <unicode/uregex.h>
#include <unicode/ustring.h>
#include <unicode/ucol.h>

/* #include <assert.h> */

#ifndef SQLITE_CORE
/*   #include "sqlite3ext.h" */
  SQLITE_EXTENSION_INIT1
#else
/*   #include "sqlite3.h" */
#endif

sqlite3.c view on Meta::CPAN

** This file implements a tokenizer for fts3 based on the ICU library.
*/
/* #include "fts3Int.h" */
#if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_FTS3)
#ifdef SQLITE_ENABLE_ICU

/* #include <assert.h> */
/* #include <string.h> */
/* #include "fts3_tokenizer.h" */

#include <unicode/ubrk.h>
/* #include <unicode/ucol.h> */
/* #include <unicode/ustring.h> */
#include <unicode/utf16.h>

typedef struct IcuTokenizer IcuTokenizer;
typedef struct IcuCursor IcuCursor;

struct IcuTokenizer {
  sqlite3_tokenizer base;
  char *zLocale;
};

struct IcuCursor {

sqlite3.c view on Meta::CPAN

*/

static int sqlite3Fts5VocabInit(Fts5Global*, sqlite3*);

/*
** End of interface to code in fts5_vocab.c.
**************************************************************************/


/**************************************************************************
** Interface to automatically generated code in fts5_unicode2.c.
*/
static int sqlite3Fts5UnicodeIsdiacritic(int c);
static int sqlite3Fts5UnicodeFold(int c, int bRemoveDiacritic);

static int sqlite3Fts5UnicodeCatParse(const char*, u8*);
static int sqlite3Fts5UnicodeCategory(u32 iCode);
static void sqlite3Fts5UnicodeAscii(u8*, u8*);
/*
** End of interface to code in fts5_unicode2.c.
**************************************************************************/

#endif

#define FTS5_OR                               1
#define FTS5_AND                              2
#define FTS5_NOT                              3
#define FTS5_TERM                             4
#define FTS5_COLON                            5
#define FTS5_MINUS                            6

sqlite3.c view on Meta::CPAN



/*
** Return true if character 't' may be part of an FTS5 bareword, or false
** otherwise. Characters that may be part of barewords:
**
**   * All non-ASCII characters,
**   * The 52 upper and lower case ASCII characters, and
**   * The 10 integer ASCII characters.
**   * The underscore character "_" (0x5F).
**   * The unicode "subsitute" character (0x1A).
*/
static int sqlite3Fts5IsBareword(char t){
  u8 aBareword[128] = {
    0, 0, 0, 0, 0, 0, 0, 0,    0, 0, 0, 0, 0, 0, 0, 0,   /* 0x00 .. 0x0F */
    0, 0, 0, 0, 0, 0, 0, 0,    0, 0, 1, 0, 0, 0, 0, 0,   /* 0x10 .. 0x1F */
    0, 0, 0, 0, 0, 0, 0, 0,    0, 0, 0, 0, 0, 0, 0, 0,   /* 0x20 .. 0x2F */
    1, 1, 1, 1, 1, 1, 1, 1,    1, 1, 0, 0, 0, 0, 0, 0,   /* 0x30 .. 0x3F */
    0, 1, 1, 1, 1, 1, 1, 1,    1, 1, 1, 1, 1, 1, 1, 1,   /* 0x40 .. 0x4F */
    1, 1, 1, 1, 1, 1, 1, 1,    1, 1, 1, 0, 0, 0, 0, 1,   /* 0x50 .. 0x5F */
    0, 1, 1, 1, 1, 1, 1, 1,    1, 1, 1, 1, 1, 1, 1, 1,   /* 0x60 .. 0x6F */

sqlite3.c view on Meta::CPAN

  */
  if( rc==SQLITE_OK && pRet->bContentlessDelete && pRet->bColumnsize==0 ){
    *pzErr = sqlite3_mprintf(
        "contentless_delete=1 is incompatible with columnsize=0"
    );
    rc = SQLITE_ERROR;
  }

  /* If a tokenizer= option was successfully parsed, the tokenizer has
  ** already been allocated. Otherwise, allocate an instance of the default
  ** tokenizer (unicode61) now.  */
  if( rc==SQLITE_OK && pRet->pTok==0 ){
    rc = fts5ConfigDefaultTokenizer(pGlobal, pRet);
  }

  /* If no zContent option was specified, fill in the default values. */
  if( rc==SQLITE_OK && pRet->zContent==0 ){
    const char *zTail = 0;
    assert( pRet->eContent==FTS5_CONTENT_NORMAL
         || pRet->eContent==FTS5_CONTENT_NONE
    );

sqlite3.c view on Meta::CPAN

  sqlite3_context *pCtx,          /* Function call context */
  int nArg,                       /* Number of args */
  sqlite3_value **apVal           /* Function arguments */
){
  fts5ExprFunction(pCtx, nArg, apVal, 1);
}

/*
** The implementation of an SQLite user-defined-function that accepts a
** single integer as an argument. If the integer is an alpha-numeric
** unicode code point, 1 is returned. Otherwise 0.
*/
static void fts5ExprIsAlnum(
  sqlite3_context *pCtx,          /* Function call context */
  int nArg,                       /* Number of args */
  sqlite3_value **apVal           /* Function arguments */
){
  int iCode;
  u8 aArr[32];
  if( nArg!=1 ){
    sqlite3_result_error(pCtx,

sqlite3.c view on Meta::CPAN

          return 0;
        }
      }
    }
  }
  return n;
}

/*
** pIn is a UTF-8 encoded string, nIn bytes in size. Return the number of
** unicode characters in the string.
*/
static int fts5IndexCharlen(const char *pIn, int nIn){
  int nChar = 0;
  int i = 0;
  while( i<nIn ){
    if( (unsigned char)pIn[i++]>=0xc0 ){
      while( i<nIn && (pIn[i] & 0xc0)==0x80 ) i++;
    }
    nChar++;
  }

sqlite3.c view on Meta::CPAN

*/


/* #include "fts5Int.h" */

/**************************************************************************
** Start of ascii tokenizer implementation.
*/

/*
** For tokenizers with no "unicode" modifier, the set of token characters
** is the same as the set of ASCII range alphanumeric characters.
*/
static unsigned char aAsciiTokenChar[128] = {
  0, 0, 0, 0, 0, 0, 0, 0,   0, 0, 0, 0, 0, 0, 0, 0,   /* 0x00..0x0F */
  0, 0, 0, 0, 0, 0, 0, 0,   0, 0, 0, 0, 0, 0, 0, 0,   /* 0x10..0x1F */
  0, 0, 0, 0, 0, 0, 0, 0,   0, 0, 0, 0, 0, 0, 0, 0,   /* 0x20..0x2F */
  1, 1, 1, 1, 1, 1, 1, 1,   1, 1, 0, 0, 0, 0, 0, 0,   /* 0x30..0x3F */
  0, 1, 1, 1, 1, 1, 1, 1,   1, 1, 1, 1, 1, 1, 1, 1,   /* 0x40..0x4F */
  1, 1, 1, 1, 1, 1, 1, 1,   1, 1, 1, 0, 0, 0, 0, 0,   /* 0x50..0x5F */
  0, 1, 1, 1, 1, 1, 1, 1,   1, 1, 1, 1, 1, 1, 1, 1,   /* 0x60..0x6F */

sqlite3.c view on Meta::CPAN

    rc = xToken(pCtx, 0, pFold, nByte, is, ie);
    is = ie+1;
  }

  if( pFold!=aFold ) sqlite3_free(pFold);
  if( rc==SQLITE_DONE ) rc = SQLITE_OK;
  return rc;
}

/**************************************************************************
** Start of unicode61 tokenizer implementation.
*/


/*
** The following two macros - READ_UTF8 and WRITE_UTF8 - have been copied
** from the sqlite3 source file utf.c. If this file is compiled as part
** of the amalgamation, they are not required.
*/
#ifndef SQLITE_AMALGAMATION

sqlite3.c view on Meta::CPAN

  unsigned char aTokenChar[128];  /* ASCII range token characters */
  char *aFold;                    /* Buffer to fold text into */
  int nFold;                      /* Size of aFold[] in bytes */
  int eRemoveDiacritic;           /* True if remove_diacritics=1 is set */
  int nException;
  int *aiException;

  unsigned char aCategory[32];    /* True for token char categories */
};

/* Values for eRemoveDiacritic (must match internals of fts5_unicode2.c) */
#define FTS5_REMOVE_DIACRITICS_NONE    0
#define FTS5_REMOVE_DIACRITICS_SIMPLE  1
#define FTS5_REMOVE_DIACRITICS_COMPLEX 2

static int fts5UnicodeAddExceptions(
  Unicode61Tokenizer *p,          /* Tokenizer object */
  const char *z,                  /* Characters to treat as exceptions */
  int bTokenChars                 /* 1 for 'tokenchars', 0 for 'separators' */
){
  int rc = SQLITE_OK;

sqlite3.c view on Meta::CPAN

      }else{
        iHi = iTest-1;
      }
    }
  }

  return 0;
}

/*
** Delete a "unicode61" tokenizer.
*/
static void fts5UnicodeDelete(Fts5Tokenizer *pTok){
  if( pTok ){
    Unicode61Tokenizer *p = (Unicode61Tokenizer*)pTok;
    sqlite3_free(p->aiException);
    sqlite3_free(p->aFold);
    sqlite3_free(p);
  }
  return;
}

static int unicodeSetCategories(Unicode61Tokenizer *p, const char *zCat){
  const char *z = zCat;

  while( *z ){
    while( *z==' ' || *z=='\t' ) z++;
    if( *z && sqlite3Fts5UnicodeCatParse(z, p->aCategory) ){
      return SQLITE_ERROR;
    }
    while( *z!=' ' && *z!='\t' && *z!='\0' ) z++;
  }

  sqlite3Fts5UnicodeAscii(p->aCategory, p->aTokenChar);
  return SQLITE_OK;
}

/*
** Create a "unicode61" tokenizer.
*/
static int fts5UnicodeCreate(
  void *pUnused,
  const char **azArg, int nArg,
  Fts5Tokenizer **ppOut
){
  int rc = SQLITE_OK;             /* Return code */
  Unicode61Tokenizer *p = 0;      /* New tokenizer object */

  UNUSED_PARAM(pUnused);

sqlite3.c view on Meta::CPAN

        rc = SQLITE_NOMEM;
      }

      /* Search for a "categories" argument */
      for(i=0; rc==SQLITE_OK && i<nArg-1; i+=2){
        if( 0==sqlite3_stricmp(azArg[i], "categories") ){
          zCat = azArg[i+1];
        }
      }
      if( rc==SQLITE_OK ){
        rc = unicodeSetCategories(p, zCat);
      }

      for(i=0; rc==SQLITE_OK && i<nArg-1; i+=2){
        const char *zArg = azArg[i+1];
        if( 0==sqlite3_stricmp(azArg[i], "remove_diacritics") ){
          if( (zArg[0]!='0' && zArg[0]!='1' && zArg[0]!='2') || zArg[1] ){
            rc = SQLITE_ERROR;
          }else{
            p->eRemoveDiacritic = (zArg[0] - '0');
            assert( p->eRemoveDiacritic==FTS5_REMOVE_DIACRITICS_NONE

sqlite3.c view on Meta::CPAN

*/
static int fts5PorterCreate(
  void *pCtx,
  const char **azArg, int nArg,
  Fts5Tokenizer **ppOut
){
  fts5_api *pApi = (fts5_api*)pCtx;
  int rc = SQLITE_OK;
  PorterTokenizer *pRet;
  void *pUserdata = 0;
  const char *zBase = "unicode61";

  if( nArg>0 ){
    zBase = azArg[0];
  }

  pRet = (PorterTokenizer*)sqlite3_malloc(sizeof(PorterTokenizer));
  if( pRet ){
    memset(pRet, 0, sizeof(PorterTokenizer));
    rc = pApi->xFindTokenizer(pApi, zBase, &pUserdata, &pRet->tokenizer);
  }else{

sqlite3.c view on Meta::CPAN

}

/*
** Register all built-in tokenizers with FTS5.
*/
static int sqlite3Fts5TokenizerInit(fts5_api *pApi){
  struct BuiltinTokenizer {
    const char *zName;
    fts5_tokenizer x;
  } aBuiltin[] = {
    { "unicode61", {fts5UnicodeCreate, fts5UnicodeDelete, fts5UnicodeTokenize}},
    { "ascii",     {fts5AsciiCreate, fts5AsciiDelete, fts5AsciiTokenize }},
    { "porter",    {fts5PorterCreate, fts5PorterDelete, fts5PorterTokenize }},
    { "trigram",   {fts5TriCreate, fts5TriDelete, fts5TriTokenize}},
  };

  int rc = SQLITE_OK;             /* Return code */
  int i;                          /* To iterate through builtin functions */

  for(i=0; rc==SQLITE_OK && i<ArraySize(aBuiltin); i++){
    rc = pApi->xCreateTokenizer(pApi,

sqlite3.c view on Meta::CPAN

      iHi = iTest-1;
    }
  }
  assert( key>=aDia[iRes] );
  if( bComplex==0 && (aChar[iRes] & 0x80) ) return c;
  return (c > (aDia[iRes]>>3) + (aDia[iRes]&0x07)) ? c : ((int)aChar[iRes] & 0x7F);
}


/*
** Return true if the argument interpreted as a unicode codepoint
** is a diacritical modifier character.
*/
static int sqlite3Fts5UnicodeIsdiacritic(int c){
  unsigned int mask0 = 0x08029FDF;
  unsigned int mask1 = 0x000361F8;
  if( c<768 || c>817 ) return 0;
  return (c < 768+32) ?
      (mask0 & ((unsigned int)1 << (c-768))) :
      (mask1 & ((unsigned int)1 << (c-768-32)));
}


/*
** Interpret the argument as a unicode codepoint. If the codepoint
** is an upper case character that has a lower case equivalent,
** return the codepoint corresponding to the lower case version.
** Otherwise, return a copy of the argument.
**
** The results are undefined if the value passed to this function
** is less than zero.
*/
static int sqlite3Fts5UnicodeFold(int c, int eRemoveDiacritic){
  /* Each entry in the following array defines a rule for folding a range
  ** of codepoints to lower case. The rule applies to a range of nRange

sqlite3.c view on Meta::CPAN

  ** to all nRange codepoints (i.e. all nRange codepoints are upper case and
  ** need to be folded). Or, if it is set, then the rule only applies to
  ** every second codepoint in the range, starting with codepoint C.
  **
  ** The 7 most significant bits in flags are an index into the aiOff[]
  ** array. If a specific codepoint C does require folding, then its lower
  ** case equivalent is ((C + aiOff[flags>>1]) & 0xFFFF).
  **
  ** The contents of this array are generated by parsing the CaseFolding.txt
  ** file distributed as part of the "Unicode Character Database". See
  ** http://www.unicode.org for details.
  */
  static const struct TableEntry {
    unsigned short iCode;
    unsigned char flags;
    unsigned char nRange;
  } aEntry[] = {
    {65, 14, 26},          {181, 64, 1},          {192, 14, 23},
    {216, 14, 7},          {256, 1, 48},          {306, 1, 6},
    {313, 1, 16},          {330, 1, 46},          {376, 116, 1},
    {377, 1, 6},           {383, 104, 1},         {385, 50, 1},

sqlite3.h view on Meta::CPAN

**
** ^The third argument is the value to bind to the parameter.
** ^If the third parameter to sqlite3_bind_text() or sqlite3_bind_text16()
** or sqlite3_bind_blob() is a NULL pointer then the fourth parameter
** is ignored and the end result is the same as sqlite3_bind_null().
** ^If the third parameter to sqlite3_bind_text() is not NULL, then
** it should be a pointer to well-formed UTF8 text.
** ^If the third parameter to sqlite3_bind_text16() is not NULL, then
** it should be a pointer to well-formed UTF16 text.
** ^If the third parameter to sqlite3_bind_text64() is not NULL, then
** it should be a pointer to a well-formed unicode string that is
** either UTF8 if the sixth parameter is SQLITE_UTF8, or UTF16
** otherwise.
**
** [[byte-order determination rules]] ^The byte-order of
** UTF16 input text is determined by the byte-order mark (BOM, U+FEFF)
** found in first character, which is removed, or in the absence of a BOM
** the byte order is the native byte order of the host
** machine for sqlite3_bind_text16() or the byte order specified in
** the 6th parameter for sqlite3_bind_text64().)^
** ^If UTF16 input text contains invalid unicode
** characters, then SQLite might change those invalid characters
** into the unicode replacement character: U+FFFD.
**
** ^(In those routines that have a fourth argument, its value is the
** number of bytes in the parameter.  To be clear: the value is the
** number of <u>bytes</u> in the value, not the number of characters.)^
** ^If the fourth parameter to sqlite3_bind_text() or sqlite3_bind_text16()
** is negative, then the length of the string is
** the number of bytes up to the first zero terminator.
** If the fourth parameter to sqlite3_bind_blob() is negative, then
** the behavior is undefined.
** If a non-negative fourth parameter is provided to sqlite3_bind_text()

sqlite3.h view on Meta::CPAN

** specified by the interface procedure.  ^So, for example, if
** sqlite3_result_text16le() is invoked with text that begins
** with bytes 0xfe, 0xff (a big-endian byte-order mark) then the
** first two bytes of input are skipped and the remaining input
** is interpreted as UTF16BE text.
**
** ^For UTF16 input text to the sqlite3_result_text16(),
** sqlite3_result_text16be(), sqlite3_result_text16le(), and
** sqlite3_result_text64() routines, if the text contains invalid
** UTF16 characters, the invalid characters might be converted
** into the unicode replacement character, U+FFFD.
**
** ^The sqlite3_result_value() interface sets the result of
** the application-defined function to be a copy of the
** [unprotected sqlite3_value] object specified by the 2nd parameter.  ^The
** sqlite3_result_value() interface makes a copy of the [sqlite3_value]
** so that the [sqlite3_value] specified in the parameter may change or
** be deallocated after sqlite3_result_value() returns without harm.
** ^A [protected sqlite3_value] object may always be used where an
** [unprotected sqlite3_value] object is required, so either
** kind of [sqlite3_value] object can be used with this interface.

t/02_logon.t view on Meta::CPAN


use strict;
use warnings;
use lib "t/lib";
use SQLiteTest qw/connect_ok @CALL_FUNCS/;
use Test::More;
use if -d ".git", "Test::FailWarnings";

use DBD::SQLite::Constants ':dbd_sqlite_string_mode';

my $unicode_opt = DBD_SQLITE_STRING_MODE_UNICODE_STRICT;

my $show_diag = 0;
foreach my $call_func (@CALL_FUNCS) {

	# Ordinary connect
	SCOPE: {
		my $dbh = connect_ok();
		ok( $dbh->{sqlite_version}, '->{sqlite_version} ok' );
		is( $dbh->{AutoCommit}, 1, 'AutoCommit is on by default' );
		diag("sqlite_version=$dbh->{sqlite_version}") unless $show_diag++;

t/02_logon.t view on Meta::CPAN

		ok( defined $dbh->$call_func(0, 'busy_timeout') );
		is( $dbh->$call_func('busy_timeout'), 0, 'Set busy_timeout to 0' );
	}

	# Attributes in the connect string
	SKIP: {
		unless ( $] >= 5.008005 ) {
			skip( 'Unicode is not supported before 5.8.5', 2 );
		}
		my $file = 'foo'.$$;
		my $dbh = DBI->connect( "dbi:SQLite:dbname=$file;sqlite_string_mode=$unicode_opt", '', '' );
		isa_ok( $dbh, 'DBI::db' );
		is( $dbh->{sqlite_string_mode}, $unicode_opt, 'Unicode is on' );
		$dbh->disconnect;
		unlink $file;
	}

	# dbname, db, database
	SCOPE: {
		for my $key (qw/database db dbname/) {
			my $file = 'foo'.$$;
			unlink $file if -f $file;
			ok !-f $file, 'database file does not exist';

t/12_unicode.t view on Meta::CPAN

# This is a test for correct handling of the "unicode" database
# handle parameter.

use strict;
use warnings;
use lib "t/lib";
use SQLiteTest;
use Test::More;
use if -d ".git", "Test::FailWarnings";

use DBD::SQLite::Constants ':dbd_sqlite_string_mode';

my $unicode_opt = DBD_SQLITE_STRING_MODE_UNICODE_STRICT;

BEGIN { requires_unicode_support() }

#
#   Include std stuff
#
use Carp;
use DBI qw(:sql_types);

# Unintuitively, still has the effect of loading bytes.pm :-)
no bytes;

t/12_unicode.t view on Meta::CPAN

	ok(
		! is_utf8($textback),
		"Reading text gives binary too (for now)",
	);
	is($bytesback, $bytestring, "No blob corruption");
	is($textback, $bytestring, "Same text, different encoding");
}

# Start over but now activate Unicode support.
SCOPE: {
	my $dbh = connect_ok( dbfile => 'foo', sqlite_string_mode => $unicode_opt );
	is( $dbh->{sqlite_string_mode}, $unicode_opt, 'Unicode is on' );

	($textback, $bytesback) = database_roundtrip($dbh, $utfstring, $bytestring);

	ok(! is_utf8($bytesback), "Reading blob still gives binary");
	ok(is_utf8($textback), "Reading text returns UTF-8");
	ok($bytesback eq $bytestring, "Still no blob corruption");
	ok($textback eq $utfstring, "Same text");

	my $lengths = $dbh->selectall_arrayref(
		"SELECT length(a), length(b) FROM table1"

t/13_create_collation.t view on Meta::CPAN

use warnings;
no if $] >= 5.022, "warnings", "locale";
use lib "t/lib";
use SQLiteTest;
use Test::More;
use if -d ".git", "Test::FailWarnings";
use Encode qw/decode/;
use DBD::SQLite;
use DBD::SQLite::Constants ':dbd_sqlite_string_mode';

my $unicode_opt = DBD_SQLITE_STRING_MODE_UNICODE_STRICT;

BEGIN { requires_unicode_support(); }

BEGIN {
	# Sadly perl for windows (and probably sqlite, too) may hang
	# if the system locale doesn't support european languages.
	# en-us should be a safe default. if it doesn't work, use 'C'.
	if ( $^O eq 'MSWin32') {
		use POSIX 'locale_h';
		setlocale(LC_COLLATE, 'en-us');
	}
}

t/13_create_collation.t view on Meta::CPAN

      "can't override registered collation");
my $tied = tied %DBD::SQLite::COLLATION;
delete $tied->{foo};
$DBD::SQLite::COLLATION{foo} = \&by_num_desc; # override, no longer dies
is($DBD::SQLite::COLLATION{foo}, \&by_num_desc, "overridden collation");

# now really test the collation functions

foreach my $call_func (@CALL_FUNCS) {

  for my $unicode_opt (DBD_SQLITE_STRING_MODE_BYTES, DBD_SQLITE_STRING_MODE_UNICODE_STRICT) {

    # connect
    my $dbh = connect_ok( RaiseError => 1, sqlite_string_mode => $unicode_opt );

    # populate test data
    my @words = qw{
	berger Bergèòe bergèòe Bergere
	HOT hôôe 
	héôéòoclite héôaïòe hêôre héòaut
	HAT hâôer 
	féôu fêôe fèöe ferme
     };
    if ($unicode_opt != DBD_SQLITE_STRING_MODE_BYTES) {
      utf8::upgrade($_) foreach @words;
    }

    $dbh->do( 'CREATE TEMP TABLE collate_test ( txt )' );
    $dbh->do( "INSERT INTO collate_test VALUES ( '$_' )" ) foreach @words;

    # test builtin collation "perl"
    my @sorted    = sort @words;
    my $db_sorted = $dbh->selectcol_arrayref("$sql COLLATE perl");
    is_deeply(\@sorted, $db_sorted, "collate perl (@sorted // @$db_sorted)");

t/33_non_latin_path.t view on Meta::CPAN

use warnings;
use lib "t/lib";
use SQLiteTest;
use Test::More;
use if -d ".git", "Test::FailWarnings";
use File::Temp ();
use File::Spec::Functions ':ALL';

use DBD::SQLite::Constants ':dbd_sqlite_string_mode';

my $unicode_opt = DBD_SQLITE_STRING_MODE_UNICODE_STRICT;

BEGIN { requires_unicode_support() }

my $dir = File::Temp::tempdir( CLEANUP => 1 );
foreach my $subdir ( 'longascii', 'adatbázis', 'name with spaces', '¿¿¿ ¿¿¿¿¿¿') {
	if ($^O eq 'cygwin') {
		next if (($subdir eq 'adatbázis') || ($subdir eq '¿¿¿ ¿¿¿¿¿¿'));
	}
	# rt48048: don't need to "use utf8" nor "require utf8"
	utf8::upgrade($subdir);
	ok(
		mkdir(catdir($dir, $subdir)),

t/33_non_latin_path.t view on Meta::CPAN

			RaiseError => 1,
			PrintError => 0,
		} );
		isa_ok( $dbh, 'DBI::db' );
	};
	is( $@, '', "Could connect to database in $subdir" );
	diag( $@ ) if $@;

	unlink(_path($dbfile))  if -e _path($dbfile);

	# Repeat with the unicode flag on
	my $ufile = $dbfile;
	eval {
		my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile", undef, undef, {
			RaiseError => 1,
			PrintError => 0,
			sqlite_string_mode => $unicode_opt,
		} );
		isa_ok( $dbh, 'DBI::db' );
	};
	is( $@, '', "Could connect to database in $subdir" );
	diag( $@ ) if $@;

	# Reopen the database
	eval {
		my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile", undef, undef, {
			RaiseError => 1,
			PrintError => 0,
			sqlite_string_mode => $unicode_opt,
		} );
		isa_ok( $dbh, 'DBI::db' );
	};
	is( $@, '', "Could connect to database in $subdir" );
	diag( $@ ) if $@;

	unlink(_path($ufile))  if -e _path($ufile);
	
	# when the name of the database file has non-latin characters
	my $dbfilex = catfile($dir, "$subdir.db");

t/37_regexp.t view on Meta::CPAN


my @words = qw{
	berger Bergère bergère Bergere
	HOT hôte 
	hétéroclite hétaïre hêtre héraut
	HAT hâter 
	fétu fête fève ferme
     };
my @regexes = qw(  ^b\\w+ (?i:^b\\w+) );

BEGIN { requires_unicode_support() }
BEGIN {
	# Sadly perl for windows (and probably sqlite, too) may hang
	# if the system locale doesn't support european languages.
	# en-us should be a safe default. if it doesn't work, use 'C'.
	if ( $^O eq 'MSWin32') {
		use POSIX 'locale_h';
		setlocale(LC_COLLATE, 'en-us');
	}
}
use locale;

t/43_fts3.t view on Meta::CPAN

  ['"qui gardait"'        => 1       ],
  ["moutons NOT lait"     => 1       ],
  ["il était"             => 0       ],
  ["(il OR elle) AND un*" => 0, 2    ],
);

my $ix_une_native = index($texts[0], "une");
my $ix_une_utf8   = do {use bytes; utf8::upgrade(my $bergere_utf8 = $texts[0]); index($bergere_utf8, "une");};

BEGIN {
	requires_unicode_support();

	if (!has_fts()) {
		plan skip_all => 'FTS is disabled for this DBD::SQLite';
	}
	if ($DBD::SQLite::sqlite_version_number >= 3011000 and $DBD::SQLite::sqlite_version_number < 3012000 and !has_compile_option('ENABLE_FTS3_TOKENIZER')) {
		plan skip_all => 'FTS3 tokenizer is disabled for this DBD::SQLite';
	}
}

t/62_regexp_multibyte_char_class.t view on Meta::CPAN

use strict;
use warnings;
use lib "t/lib";
use SQLiteTest;
use Test::More;
#use if -d ".git", "Test::FailWarnings"; # see RT#112220

BEGIN { requires_unicode_support() }

use DBD::SQLite::Constants ':dbd_sqlite_string_mode';

# special case for multibyte (non-ASCII) character class,
# which only works correctly under the unicode mode
my @words = ("\x{e3}\x{83}\x{86}\x{e3}\x{82}\x{b9}\x{e3}\x{83}\x{88}", "\x{e3}\x{83}\x{86}\x{e3}\x{83}\x{b3}\x{e3}\x{83}\x{88}"); # ãƒ†ã‚¹ãƒˆ ãƒ†ãƒ³ãƒˆ

my $regex = "\x{e3}\x{83}\x{86}[\x{e3}\x{82}\x{b9}\x{e3}\x{83}\x{b3}]\x{e3}\x{83}\x{88}"; # ãƒ†[ã‚¹ãƒ³]ãƒˆ

foreach my $call_func (@CALL_FUNCS) {

  for my $string_mode (DBD_SQLITE_STRING_MODE_PV, DBD_SQLITE_STRING_MODE_UNICODE_STRICT) {

    # connect
    my $dbh = connect_ok( RaiseError => 1, sqlite_string_mode => $string_mode );

t/lib/SQLiteTest.pm view on Meta::CPAN

# Support code for DBD::SQLite tests

use strict;
use Exporter   ();
use File::Spec ();
use Test::More ();

our @ISA     = 'Exporter';
our @EXPORT  = qw/
    connect_ok dies dbfile @CALL_FUNCS $sqlite_call
    has_sqlite requires_sqlite requires_unicode_support
    allow_warnings has_compile_option has_fts
/;
our @CALL_FUNCS;
our $sqlite_call;

my $parent;
my %dbfiles;

BEGIN {
	# Allow tests to load modules bundled in /inc

t/lib/SQLiteTest.pm view on Meta::CPAN

=cut

sub requires_sqlite {
  my $version = shift;
  unless (has_sqlite($version)) {
    Test::More::plan skip_all => "this test requires SQLite $version and newer";
    exit;
  }
}

=head2 requires_unicode_support

  BEGIN { requires_unicode_support(); }

skips all the tests if Perl does not have sane Unicode support.

=cut

sub requires_unicode_support {
  unless ($] >= 5.008005) {
    Test::More::plan skip_all => "Unicode is not supported before 5.8.5";
    exit;
  }
}

=head2 allow_warnings

  allow_warnings { eval {...} };

t/rt_25371_asymmetric_unicode.t view on Meta::CPAN

use strict;
use warnings;
use lib "t/lib";
use SQLiteTest;
use Test::More;
use if -d ".git", "Test::FailWarnings";

use DBD::SQLite::Constants ':dbd_sqlite_string_mode';

my $unicode_opt = DBD_SQLITE_STRING_MODE_UNICODE_STRICT;

BEGIN { requires_unicode_support(); }

my $dbh = connect_ok( sqlite_string_mode => $unicode_opt );
is( $dbh->{sqlite_string_mode}, $unicode_opt, 'string mode is unicode/strict' );

ok( $dbh->do(<<'END_SQL'), 'CREATE TABLE' );
CREATE TABLE foo (
	bar varchar(255)
)
END_SQL

foreach ( "\0", "A", "\xe9", "\x{20ac}" ) {
	ok( $dbh->do("INSERT INTO foo VALUES ( ? )", {}, $_), 'INSERT' );
	my $foo = $dbh->selectall_arrayref("SELECT bar FROM foo");

t/rt_25924_user_defined_func_unicode.t view on Meta::CPAN

use strict;
use warnings;
use lib "t/lib";
use SQLiteTest;
use Test::More;
use if -d ".git", "Test::FailWarnings";

use DBD::SQLite::Constants ':dbd_sqlite_string_mode';

my $unicode_opt = DBD_SQLITE_STRING_MODE_UNICODE_STRICT;

BEGIN { requires_unicode_support() }

foreach my $call_func (@CALL_FUNCS) {
	my $dbh = connect_ok( sqlite_string_mode => $unicode_opt );
	ok($dbh->$call_func( "perl_uc", 1, \&perl_uc, "create_function" ));

	ok( $dbh->do(<<'END_SQL'), 'CREATE TABLE' );
CREATE TABLE foo (
	bar varchar(255)
)
END_SQL

	my @words = qw{Bergère hôte hétaïre hêtre};
	foreach my $word (@words) {
		# rt48048: don't need to "use utf8" nor "require utf8"
		utf8::upgrade($word);
		ok( $dbh->do("INSERT INTO foo VALUES ( ? )", {}, $word), 'INSERT' );
		my $foo = $dbh->selectall_arrayref("SELECT perl_uc(bar) FROM foo");
		is_deeply( $foo, [ [ perl_uc($word) ] ], 'unicode upcase ok' );
		ok( $dbh->do("DELETE FROM foo"), 'DELETE ok' );
	}
	$dbh->disconnect;
}

sub perl_uc {
	my $string = shift;
	return uc($string);
}

t/rt_71311_bind_col_and_unicode.t view on Meta::CPAN

use strict;
use warnings;
use lib "t/lib";
use SQLiteTest;
use Test::More;
use if -d ".git", "Test::FailWarnings";
use DBI qw/:sql_types/;

use DBD::SQLite::Constants ':dbd_sqlite_string_mode';

BEGIN{ requires_unicode_support(); }

my $dbh = connect_ok(sqlite_string_mode => DBD_SQLITE_STRING_MODE_UNICODE_STRICT);
$dbh->do('create table test1 (id integer, b blob)');

my $blob = "\x{82}\x{A0}";
my $str  = "\x{20ac}";

{
	my $sth = $dbh->prepare('insert into test1 values (?, ?)');

t/rt_71311_bind_col_and_unicode.t view on Meta::CPAN

	my $sth = $dbh->prepare('select * from test1 order by id');

	$sth->execute;

	my $expected = [undef, 1, 0, 0, 1, 1, 1];
	for (1..6) {
		my $row = $sth->fetch;

		ok $row && $row->[0] == $_;
		ok $row && utf8::is_utf8($row->[1]) == $expected->[$_],
			"row $_ is ".($expected->[$_] ? "unicode" : "not unicode");
	}

	$sth->finish;
}

{
	my $sth = $dbh->prepare('select * from test1 order by id');
	$sth->bind_col(1, \my $col1);
	$sth->bind_col(2, \my $col2);
	$sth->execute;

	my $expected = [undef, 1, 0, 0, 1, 1, 1];
	for (1..6) {
		$sth->fetch;

		ok $col1 && $col1 == $_;
		ok $col1 && utf8::is_utf8($col2) == $expected->[$_],
			"row $_ is ".($expected->[$_] ? "unicode" : "not unicode");
	}
	$sth->finish;
}

{
	my $sth = $dbh->prepare('select * from test1 order by id');
	$sth->bind_col(1, \my $col1);
	$sth->bind_col(2, \my $col2, SQL_BLOB);
	$sth->execute;

	my $expected = [undef, 0, 0, 0, 0, 0, 0];
	for (1..6) {
		$sth->fetch;

		ok $col1 && $col1 == $_;
		ok $col2 && utf8::is_utf8($col2) == $expected->[$_],
			"row $_ is ".($expected->[$_] ? "unicode" : "not unicode");
	}
	$sth->finish;
}

{
	my $sth = $dbh->prepare('select * from test1 order by id');
	$sth->bind_col(1, \my $col1);
	$sth->bind_col(2, \my $col2, {TYPE => SQL_BLOB});
	$sth->execute;

	my $expected = [undef, 0, 0, 0, 0, 0, 0];
	for (1..6) {
		$sth->fetch;
		ok $col1 && $col1 == $_;
		ok $col2 && utf8::is_utf8($col2) == $expected->[$_],
			"row $_ is ".($expected->[$_] ? "unicode" : "not unicode");
	}
	$sth->finish;
}

done_testing;

t/rt_78833_utf8_flag_for_column_names.t view on Meta::CPAN

use strict;
use warnings;
use lib "t/lib";
use SQLiteTest;
use Test::More;
use if -d ".git", "Test::FailWarnings";
use Encode;

use DBD::SQLite::Constants ':dbd_sqlite_string_mode';

BEGIN { requires_unicode_support() }

unicode_test("\x{263A}");  # (decoded) smiley character
unicode_test("\x{0100}");  # (decoded) capital A with macron

sub unicode_test {
    my $unicode = shift;

    ok Encode::is_utf8($unicode), "correctly decoded";

    my $unicode_encoded = encode_utf8($unicode);

    { # tests for an environment where everything is encoded

        my $dbh = connect_ok(sqlite_string_mode => DBD_SQLITE_STRING_MODE_BYTES);
        $dbh->do("pragma foreign_keys = on");
        my $unicode_quoted = $dbh->quote_identifier($unicode_encoded);
        $dbh->do("create table $unicode_quoted (id, $unicode_quoted primary key)");
        $dbh->do("create table bar (id, ref references $unicode_quoted ($unicode_encoded))");

        ok $dbh->do("insert into $unicode_quoted values (?, ?)", undef, 1, "text"), "insert successfully";
        ok $dbh->do("insert into $unicode_quoted (id, $unicode_quoted) values (?, ?)", undef, 2, "text2"), "insert with unicode name successfully";

        {
            my $sth = $dbh->prepare("insert into $unicode_quoted (id) values (:$unicode_encoded)");
            $sth->bind_param(":$unicode_encoded", 5);
            $sth->execute;
            my ($id) = $dbh->selectrow_array("select id from $unicode_quoted where id = :$unicode_encoded", undef, 5);
            is $id => 5, "unicode placeholders";
        }

        {
            my $sth = $dbh->prepare("select * from $unicode_quoted where id = ?");
            $sth->execute(1);
            my $row = $sth->fetchrow_hashref;
            is $row->{id} => 1, "got correct row";
            is $row->{$unicode_encoded} => "text", "got correct (encoded) unicode column data";
            ok !exists $row->{$unicode}, "(decoded) unicode column does not exist";
        }

        {
            my $sth = $dbh->prepare("select $unicode_quoted from $unicode_quoted where id = ?");
            $sth->execute(1);
            my $row = $sth->fetchrow_hashref;
            is $row->{$unicode_encoded} => "text", "got correct (encoded) unicode column data";
            ok !exists $row->{$unicode}, "(decoded) unicode column does not exist";
        }

        {
            my $sth = $dbh->prepare("select id from $unicode_quoted where $unicode_quoted = ?");
            $sth->execute("text");
            my ($id) = $sth->fetchrow_array;
            is $id => 1, "got correct id by the (encoded) unicode column value";
        }

        {
            my $sth = $dbh->column_info(undef, undef, $unicode_encoded, $unicode_encoded);
            my $column_info = $sth->fetchrow_hashref;
            is $column_info->{COLUMN_NAME} => $unicode_encoded, "column_info returns the correctly encoded column name";
        }

        {
            my $sth = $dbh->primary_key_info(undef, undef, $unicode_encoded);
            my $primary_key_info = $sth->fetchrow_hashref;
            is $primary_key_info->{COLUMN_NAME} => $unicode_encoded, "primary_key_info returns the correctly encoded primary key name";
        }

        if (has_sqlite('3.6.14')) {
            my $sth = $dbh->foreign_key_info(undef, undef, $unicode_encoded, undef, undef, 'bar');
            my $foreign_key_info = $sth->fetchrow_hashref;
            is $foreign_key_info->{PKCOLUMN_NAME} => $unicode_encoded, "foreign_key_info returns the correctly encoded foreign key name";
        }

        {
            my $sth = $dbh->table_info(undef, undef, $unicode_encoded);
            my $table_info = $sth->fetchrow_hashref;
            is $table_info->{TABLE_NAME} => $unicode_encoded, "table_info returns the correctly encoded table name";
        }
    }

    { # tests for an environment where everything is decoded
        my $dbh = connect_ok(sqlite_string_mode => DBD_SQLITE_STRING_MODE_UNICODE_STRICT);
        $dbh->do("pragma foreign_keys = on");
        my $unicode_quoted = $dbh->quote_identifier($unicode);
        $dbh->do("create table $unicode_quoted (id, $unicode_quoted primary key)");
        $dbh->do("create table bar (id, ref references $unicode_quoted ($unicode_quoted))");

        ok $dbh->do("insert into $unicode_quoted values (?, ?)", undef, 1, "text"), "insert successfully";
        ok $dbh->do("insert into $unicode_quoted (id, $unicode_quoted) values (?, ?)", undef, 2, "text2"), "insert with unicode name successfully";

        {
            my $sth = $dbh->prepare("insert into $unicode_quoted (id) values (:$unicode)");
            $sth->bind_param(":$unicode", 5);
            $sth->execute;
            my ($id) = $dbh->selectrow_array("select id from $unicode_quoted where id = :$unicode", undef, 5);
            is $id => 5, "unicode placeholders";
        }

        {
            my $sth = $dbh->prepare("select * from $unicode_quoted where id = ?");
            $sth->execute(1);
            my $row = $sth->fetchrow_hashref;
            is $row->{id} => 1, "got correct row";
            is $row->{$unicode} => "text", "got correct (decoded) unicode column data";
            ok !exists $row->{$unicode_encoded}, "(encoded) unicode column does not exist";
        }

        {
            my $sth = $dbh->prepare("select $unicode_quoted from $unicode_quoted where id = ?");
            $sth->execute(1);
            my $row = $sth->fetchrow_hashref;
            is $row->{$unicode} => "text", "got correct (decoded) unicode column data";
            ok !exists $row->{$unicode_encoded}, "(encoded) unicode column does not exist";
        }

        {
            my $sth = $dbh->prepare("select id from $unicode_quoted where $unicode_quoted = ?");
            $sth->execute("text2");
            my ($id) = $sth->fetchrow_array;
            is $id => 2, "got correct id by the (decoded) unicode column value";
        }

        {
            my $sth = $dbh->column_info(undef, undef, $unicode, $unicode);
            my $column_info = $sth->fetchrow_hashref;
            is $column_info->{COLUMN_NAME} => $unicode, "column_info returns the correctly decoded column name";
        }

        {
            my $sth = $dbh->primary_key_info(undef, undef, $unicode);
            my $primary_key_info = $sth->fetchrow_hashref;
            is $primary_key_info->{COLUMN_NAME} => $unicode, "primary_key_info returns the correctly decoded primary key name";
        }

        if (has_sqlite('3.6.14')) {
            my $sth = $dbh->foreign_key_info(undef, undef, $unicode, undef, undef, 'bar');
            my $foreign_key_info = $sth->fetchrow_hashref;
            is $foreign_key_info->{PKCOLUMN_NAME} => $unicode, "foreign_key_info returns the correctly decoded foreign key name";
        }

        {
            my $sth = $dbh->table_info(undef, undef, $unicode);
            my $table_info = $sth->fetchrow_hashref;
            is $table_info->{TABLE_NAME} => $unicode, "table_info returns the correctly decoded table name";
        }
    }
}

done_testing;

t/rt_96877_unicode_statements.t view on Meta::CPAN

# According to the sqlite doc, the SQL argument to sqlite3_prepare_v2
# should be in utf8, but DBD::SQLite does not ensure this (even with
# sqlite_unicode => 1). Only bind values are properly converted.

use strict;
use warnings;
use lib "t/lib";
use SQLiteTest;
use Test::More;
use if -d ".git", "Test::FailWarnings";

BEGIN { requires_unicode_support() }

use DBD::SQLite::Constants ':dbd_sqlite_string_mode';

my $dbh = connect_ok( sqlite_string_mode => DBD_SQLITE_STRING_MODE_UNICODE_NAIVE );
is( $dbh->{sqlite_string_mode}, DBD_SQLITE_STRING_MODE_UNICODE_NAIVE, 'Unicode is on' );

ok( $dbh->do(<<'END_SQL'), 'CREATE TABLE' );
CREATE TABLE foo (
	bar varchar(255)
)

t/virtual_table/21_perldata_charinfo.t view on Meta::CPAN

use strict;
use warnings;
# test the example described in 
# L<DBD::SQLite::VirtualTable::PerlData/"Hashref example : unicode characters">

use lib "t/lib";
use SQLiteTest qw/connect_ok $sqlite_call/;
use Test::More;

BEGIN {
  # check for old Perls which did not have Unicode::UCD in core
  unless (eval "use Unicode::UCD 'charinfo'; 1") {
    plan skip_all => "Unicode::UCD does not seem to be installed";
  }

( run in 0.962 second using v1.01-cache-2.11-cpan-f29a10751f0 )