Lexilla Documentation

Last edited 21 April 2021 NH

Introduction

Lexilla is a library containing lexers for use with Scintilla. It can be either a static library that is linked into an application or a shared library that is loaded at runtime.

Lexilla does not interact with the display so there is no need to compile it for a particular GUI toolkit. Therefore there can be a common library shared by applications using different GUI toolkits. In some circumstances there may need to be both 32-bit and 64-bit versions on one system to match different applications.

Different extensions are commonly used for shared libraries: .so on Linux, .dylib on macOS, and .DLL on Windows.
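An application that locates the library by name could build the file name per platform. The following is a minimal sketch, not part of Lexilla: the Platform enumeration and SharedLibraryFileName are hypothetical names, and the "lib" prefix on Linux and macOS is an assumption based on the usual convention for those platforms.

```cpp
#include <string>

// Hypothetical enumeration of the platforms named in the text.
enum class Platform { Linux, MacOS, Windows };

// Build a shared library file name from a base name such as "lexilla".
// Assumption: Linux and macOS use a "lib" prefix, Windows uses none.
std::string SharedLibraryFileName(const std::string &baseName, Platform platform) {
    switch (platform) {
    case Platform::Windows:
        return baseName + ".dll";   // Windows file names are case-insensitive
    case Platform::MacOS:
        return "lib" + baseName + ".dylib";
    case Platform::Linux:
        return "lib" + baseName + ".so";
    }
    return baseName;   // Not reached for valid enumerators
}
```

The platform is passed as an argument here so that the function can be exercised on any system. A real application would more likely select it once with preprocessor checks.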

The Lexilla protocol

A set of functions is defined by Lexilla for use by applications. Libraries that provide these functions can be used as a replacement for Lexilla or to add new lexers beyond those provided by Lexilla.

The Lexilla protocol is a superset of the external lexer protocol and defines these functions that may be exported from a shared library:
int GetLexerCount()
void GetLexerName(unsigned int index, char *name, int buflength)
LexerFactoryFunction GetLexerFactory(unsigned int index)
ILexer5 *CreateLexer(const char *name)
const char *LexerNameFromID(int identifier)
const char *GetLibraryPropertyNames()
void SetLibraryProperty(const char *key, const char *value)
const char *GetNameSpace()

ILexer5 is defined by Scintilla in include/ILexer.h as the interface provided by lexers which is called by Scintilla. Many clients do not actually need to call methods on ILexer5 - they just take the return from CreateLexer and plug it straight into Scintilla so it can be treated as a machine pointer (void *).

LexerFactoryFunction is defined as a function that takes no arguments and returns an ILexer5 *: ILexer5 *(*LexerFactoryFunction)() but this can be ignored by most client code.

The Lexilla protocol is a superset of the earlier external lexer protocol that defined the first 3 functions (GetLexerCount, GetLexerName, GetLexerFactory) so Lexilla can be loaded by applications that support that protocol. GetLexerFactory will rarely be used now as it is easier to call CreateLexer.

CreateLexer is the main call that will create a lexer for a particular language. The returned lexer can then be set as the current lexer in Scintilla by calling SCI_SETILEXER.

LexerNameFromID is an optional function that returns the name for a lexer identifier. LexerNameFromID(SCLEX_CPP) → "cpp". This is a temporary affordance to make it easier to convert applications to using Lexilla. Applications should move to using lexer names instead of IDs. This function is deprecated, showing warnings with some compilers, and will be removed in a future version of Lexilla.

SetLibraryProperty and GetLibraryPropertyNames are optional functions that can be defined if a library requires initialisation before calling other methods. For example, a lexer library that reads language definitions from XML files may require that the directory containing these files be set before a call to CreateLexer.
SetLibraryProperty("definitions.directory", "/usr/share/xeditor/language-definitions")

If a library implements SetLibraryProperty then it may also provide a set of valid property names with GetLibraryPropertyNames that can then be used by the application to define configuration file property names or user interface elements for options dialogs.

GetNameSpace is an optional function that returns a namespace string that can be used to disambiguate lexers with the same name from different providers. If Lexilla and XMLLexers both provide a "cpp" lexer then a request for "cpp" may be satisfied by either, but "xmllexers.cpp" unambiguously refers to the "cpp" lexer from XMLLexers.
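One way an application might act on a qualified name is to match the prefix against the namespace of each loaded library. This is a sketch under stated assumptions, not the method used by any particular application: LoadedLibrary, Resolution and ResolveLexerName are hypothetical names, and '.' is assumed to be the separator because of the "xmllexers.cpp" example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one loaded library: the string returned by its
// GetNameSpace function, or empty if the function is not exported.
struct LoadedLibrary {
    std::string nameSpace;
};

// Result of resolving a requested lexer name.
struct Resolution {
    int libraryIndex;        // -1 means any library may satisfy the request
    std::string lexerName;   // name to pass to CreateLexer
};

// Assumption: '.' separates the namespace from the lexer name.
Resolution ResolveLexerName(const std::vector<LoadedLibrary> &libraries,
                            const std::string &requested) {
    for (std::size_t i = 0; i < libraries.size(); i++) {
        const std::string &ns = libraries[i].nameSpace;
        if (ns.empty())
            continue;
        const std::string prefix = ns + ".";
        // True only when 'requested' starts with "<namespace>."
        if (requested.compare(0, prefix.size(), prefix) == 0) {
            return { static_cast<int>(i), requested.substr(prefix.size()) };
        }
    }
    return { -1, requested };
}
```

With libraries whose namespaces are "lexilla" and "xmllexers", a request for "xmllexers.cpp" selects the second library with lexer name "cpp", while a plain "cpp" is left for any library to satisfy.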

Building Lexilla

Before using Lexilla it must be built or downloaded.

Lexilla requires some headers from Scintilla to build and expects a directory named "scintilla" containing a copy of Scintilla 5+ to be a peer of the Lexilla top level directory conventionally called "lexilla".

To build Lexilla, in the lexilla/src directory, run make (for gcc or clang)
make
or nmake for MSVC
nmake -f lexilla.mak

After building Lexilla, its test suite can be run with make/nmake in the lexilla/test directory. For gcc or clang
make test
or for MSVC
nmake -f testlexers.mak test
Each test case should show "Lexing ..." and errors will display a diagnostic, commonly showing a difference between the actual and expected result:
C:\u\hg\lexilla\test\examples\python\x.py:1: is different

There are also RunTest.sh / RunTest.bat scripts in the scripts directory to build Lexilla and then build and run the tests. These both use gcc/clang, not MSVC.

There are Microsoft Visual C++ and Xcode projects that can be used to build Lexilla. For Visual C++: src/Lexilla.vcxproj. For Xcode: src/Lexilla/Lexilla.xcodeproj. There is also test/TestLexers.vcxproj to build the tests with Visual C++.

Using Lexilla

Definitions for using Lexilla from C and C++ are included in lexilla/include/Lexilla.h. For C++, scintilla/include/ILexer.h should be included before Lexilla.h as the ILexer5 type is used. For C, ILexer.h should not be included as C does not understand it and from C, void* is used instead of ILexer5*.

For many applications the main Lexilla operations are loading the Lexilla library, creating a lexer and using that lexer in Scintilla. Applications need to define the location (or locations) they expect to find Lexilla or libraries that support the Lexilla protocol. They also need to define how they request particular lexers, perhaps with a mapping from file extensions to lexer names.
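The mapping from file extensions to lexer names could be as simple as a lookup table. The sketch below is illustrative only: extensionToLexer and LexerNameForFile are hypothetical names, and the table entries are assumptions about which names the loaded library accepts ("cpp" appears in this document, the rest are examples).

```cpp
#include <map>
#include <string>

// Hypothetical application table mapping file extensions to lexer names.
// Assumption: the loaded library accepts these lexer names.
const std::map<std::string, std::string> extensionToLexer = {
    { "c", "cpp" },
    { "h", "cpp" },
    { "cxx", "cpp" },
    { "py", "python" },
};

// Return the lexer name for a file name, or an empty string when the
// extension is missing or not in the table.
std::string LexerNameForFile(const std::string &fileName) {
    const std::string::size_type dot = fileName.rfind('.');
    if (dot == std::string::npos)
        return "";
    const auto it = extensionToLexer.find(fileName.substr(dot + 1));
    if (it == extensionToLexer.end())
        return "";
    return it->second;
}
```

The returned name would then be passed to CreateLexer. An empty result lets the application fall back to showing the file without lexing.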

From C - CheckLexilla

An example C program for accessing Lexilla is provided in lexilla/examples/CheckLexilla. Build with make and run with make check.

From C++ - LexillaAccess

A C++ module, LexillaAccess.cxx / LexillaAccess.h, is provided in lexilla/access. It can be compiled into the application unchanged when that is sufficient, or the source code can be copied into the application and customized when the application has additional requirements (such as checking code signatures). SciTE uses LexillaAccess.

LexillaAccess supports loading multiple shared libraries implementing the Lexilla protocol at one time.

From Qt

For Qt, use either LexillaAccess from above or Qt's QLibrary class. The following assumes that 'Call' is defined to call Scintilla APIs.
#if defined(_WIN32)
    typedef void *(__stdcall *CreateLexerFn)(const char *name);
#else
    typedef void *(*CreateLexerFn)(const char *name);
#endif
    // Resolve CreateLexer from the Lexilla shared library.
    QFunctionPointer fn = QLibrary::resolve("lexilla", "CreateLexer");
    if (fn) {
        // Create the "cpp" lexer and make it the current lexer in Scintilla.
        void *lexCpp = ((CreateLexerFn)fn)("cpp");
        Call(SCI_SETILEXER, 0, (sptr_t)(void *)lexCpp);
    }

Applications may discover the set of lexers provided by a library by first calling GetLexerCount to find the number of lexers implemented in the library, then calling GetLexerName in a loop with each integer from 0 to GetLexerCount()-1.
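The enumeration loop might look like the following sketch. ListLexers and the two Stub functions are hypothetical: the stubs stand in for functions that a real application would resolve from the shared library. The calling convention decoration seen in the Qt example is omitted, and the 100 byte name buffer is an assumption about the longest lexer name.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Function pointer types matching the signatures listed in the protocol.
using GetLexerCountFn = int (*)();
using GetLexerNameFn = void (*)(unsigned int index, char *name, int buflength);

// Collect the names of all lexers that a library reports.
std::vector<std::string> ListLexers(GetLexerCountFn getCount, GetLexerNameFn getName) {
    std::vector<std::string> names;
    if (!getCount || !getName)
        return names;
    const int count = getCount();
    for (int i = 0; i < count; i++) {
        char name[100] = "";   // Assumed large enough for any lexer name
        getName(static_cast<unsigned int>(i), name, static_cast<int>(sizeof(name)));
        names.emplace_back(name);
    }
    return names;
}

// Stand-in functions so the sketch runs without a real library.
int StubGetLexerCount() { return 2; }

void StubGetLexerName(unsigned int index, char *name, int buflength) {
    static const char *const stubNames[] = { "cpp", "python" };
    if (!name || buflength <= 0)
        return;
    const char *found = (index < 2) ? stubNames[index] : "";
    std::snprintf(name, static_cast<std::size_t>(buflength), "%s", found);
}
```

Calling ListLexers(StubGetLexerCount, StubGetLexerName) yields the two stub names in index order. With a real library the same function would be given the resolved GetLexerCount and GetLexerName pointers.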

Applications may set properties on a library by calling SetLibraryProperty if provided. This may be needed for initialisation so should be done before calling GetLexerCount or CreateLexer. A set of property names may be available from GetLibraryPropertyNames if provided. It returns a string pointer where the string contains a list of property names separated by '\n'. It is up to applications to define how properties are defined and persisted in their user interface and configuration files.
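Splitting the returned string into individual names is straightforward. The helper below is a sketch, with SplitPropertyNames a hypothetical name. Treating a null pointer as "no properties" and skipping empty entries are defensive assumptions, not behaviour required by the protocol.

```cpp
#include <string>
#include <vector>

// Split the '\n'-separated string returned by GetLibraryPropertyNames
// into individual property names. Empty entries are skipped and a null
// pointer gives an empty list.
std::vector<std::string> SplitPropertyNames(const char *propertyNames) {
    std::vector<std::string> result;
    if (!propertyNames)
        return result;
    const std::string all(propertyNames);
    std::string::size_type start = 0;
    while (start <= all.size()) {
        std::string::size_type end = all.find('\n', start);
        if (end == std::string::npos)
            end = all.size();
        if (end > start)
            result.push_back(all.substr(start, end - start));
        start = end + 1;
    }
    return result;
}
```

Each resulting name could then be looked up in the application's configuration and, if a value is present, passed to SetLibraryProperty before any lexers are created.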

Modifying or adding lexers

Lexilla can be modified or a new library created that can be used to replace or augment Lexilla.

Lexer libraries that provide the same functions as Lexilla may provide lexers for use by Scintilla, augmenting or replacing those provided by Lexilla. To allow initialisation of lexer libraries, a SetLibraryProperty(const char *key, const char *value) function may optionally be implemented. For example, a lexer library that uses XML based lexer definitions may be provided with a directory to search for such definitions. Lexer libraries should ignore any properties that they do not understand. The set of properties supported by a lexer library is specified as a '\n' separated list of property names by an optional const char *GetLibraryPropertyNames() function.
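On the library side, the two optional functions could be implemented along these lines. This is a minimal sketch for a hypothetical XML-based lexer library: the definitionsDirectory variable is an invented name, the property name is taken from the earlier example, and the export and calling convention decorations that a real shared library needs on some platforms are omitted.

```cpp
#include <cstring>
#include <string>

namespace {
// Directory searched for language definitions, set by the application.
std::string definitionsDirectory;
}

extern "C" {

// Report the supported property names as a '\n'-separated list.
// Only one property is supported here so no separator is needed.
const char *GetLibraryPropertyNames() {
    return "definitions.directory";
}

// Store recognised properties and ignore everything else.
void SetLibraryProperty(const char *key, const char *value) {
    if (!key || !value)
        return;
    if (std::strcmp(key, "definitions.directory") == 0) {
        definitionsDirectory = value;
    }
    // Any other key is not understood and is silently ignored.
}

}
```

Ignoring unknown keys lets an application send the same set of properties to every loaded library without first checking which library understands which property.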

Lexilla and its contained lexers can be tested with the TestLexers program in lexilla/test. Read lexilla/test/README for information on building and using TestLexers.

An example of a simple lexer housed in a shared library that is compatible with the Lexilla protocol can be found in lexilla/examples/SimpleLexer. It is implemented in C++. Build with make and check by running CheckLexilla against it with make check.