This post is a small knowledge base for working with pinvoke. It explains a little of the background and offers a handy what-(not)-to-do type of guide. Enjoy!
Bool and BOOL
There are two common native boolean types: bool and BOOL. They cause issues with pinvoke because the C++ bool has a size of 1 byte, while the Win32 BOOL is 4 bytes (it is a typedef for the C int).
The managed bool always has a size of 1 byte. By default, however, the marshaller converts the managed bool to the 4-byte native BOOL, so a native 1-byte bool has to be handled explicitly. To marshal a 1-byte native bool as the managed bool, use:
[return: MarshalAs(UnmanagedType.I1)] public static extern bool SomeMethod();
To emphasize that you thought about the bool/BOOL issue and made sure you picked the right one, you can add the following:
[return: MarshalAs(UnmanagedType.Bool)] public static extern bool SomeOtherOperation();
This will make it very clear that the native BOOL is used and, although not required, this is preferred.
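The size difference described above can be checked without calling any native code. The following is only a minimal sketch: the struct names DefaultFlag and ByteFlag are made up for illustration, and Marshal.SizeOf is used to ask the marshaller how many bytes it would write for each field.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical struct: a bool field without MarshalAs uses the default (4-byte BOOL).
[StructLayout(LayoutKind.Sequential)]
struct DefaultFlag
{
    public bool Value;
}

// Hypothetical struct: the same field, marshalled as a 1-byte C++ bool.
[StructLayout(LayoutKind.Sequential)]
struct ByteFlag
{
    [MarshalAs(UnmanagedType.I1)]
    public bool Value;
}

static class BoolSizes
{
    static void Main()
    {
        Console.WriteLine(sizeof(bool));                  // managed size: 1
        Console.WriteLine(Marshal.SizeOf<DefaultFlag>()); // marshalled as BOOL: 4
        Console.WriteLine(Marshal.SizeOf<ByteFlag>());    // marshalled as I1: 1
    }
}
```

The same attribute works on parameters and return values, which is where it usually matters for pinvoke signatures.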
There are two different character sets (ANSI, Unicode) and three settings (Ansi, Unicode, Auto) for the CharSet field of the DllImport attribute. This field controls string marshaling and determines how function names are found in the DLL.
Narrow and Wide Functions
Often APIs export two versions of the same function: one working with narrow (ANSI) string arguments, the other working with wide (Unicode) strings. The Win32 API, for example, exposes two entry points for the MessageBox function:
- MessageBoxA: takes strings in the 1-byte character ANSI format, nowadays almost only needed on Windows 95/98.
- MessageBoxW: takes strings in the 2-byte character Unicode format, used on modern Windows systems.
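As a rough sketch, the two entry points can be bound separately by naming them in the EntryPoint field. The class name NarrowWideImports and the method names MessageBoxNarrow and MessageBoxWide are made up for this example; the declarations only resolve on Windows, where user32.dll exists.

```csharp
using System;
using System.Runtime.InteropServices;

static class NarrowWideImports
{
    // Binds to the narrow export; managed strings are converted to ANSI.
    // ExactSpelling = true tells the runtime to use the entry point name as written.
    [DllImport("user32.dll", EntryPoint = "MessageBoxA",
               CharSet = CharSet.Ansi, ExactSpelling = true)]
    public static extern int MessageBoxNarrow(IntPtr hWnd, string text, string caption, uint type);

    // Binds to the wide export; managed strings are passed as UTF-16.
    [DllImport("user32.dll", EntryPoint = "MessageBoxW",
               CharSet = CharSet.Unicode, ExactSpelling = true)]
    public static extern int MessageBoxWide(IntPtr hWnd, string text, string caption, uint type);
}
```

On Windows, a call such as NarrowWideImports.MessageBoxWide(IntPtr.Zero, "Hello", "Demo", 0) would show a plain OK message box (0 is the MB_OK flag).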
There are three different settings for the character set. Ansi is the default in C#.
- Ansi: During marshaling, managed Unicode strings are converted to ANSI format.
- Unicode: During marshaling, the managed Unicode string is copied as is.
- Auto: Platform invoke determines at runtime which format the target platform requires.
The ExactSpelling field works in conjunction with CharSet. When it is set to false, platform invoke also probes for the suffixed entry point: with Ansi it looks for the exact name first and then for the name with an appended A; with Unicode it looks for the name with an appended W first and then for the exact name. When it is set to true, only the exact name is used.
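The interplay of CharSet and ExactSpelling can be sketched with a single declaration. This is an illustration under assumptions, not a recommended wrapper: the class name CharSetImports and the helper ShowGreeting are hypothetical, and the code only resolves on Windows.

```csharp
using System;
using System.Runtime.InteropServices;

static class CharSetImports
{
    // CharSet.Unicode passes managed strings as UTF-16.
    // ExactSpelling is left at its C# default (false), so the runtime is expected
    // to find the MessageBoxW export even though the method is named MessageBox.
    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    public static extern int MessageBox(IntPtr hWnd, string text, string caption, uint type);

    // Hypothetical helper; nothing calls it automatically.
    public static void ShowGreeting()
    {
        const uint MB_OK = 0;
        // "Grüße" contains characters outside ASCII; the wide version can carry them.
        MessageBox(IntPtr.Zero, "Gr\u00FC\u00DFe", "P/Invoke", MB_OK);
    }
}
```

Setting ExactSpelling = true on this declaration would make the lookup fail, because user32.dll exports MessageBoxA and MessageBoxW but no plain MessageBox.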
When using the Ansi setting you might run into issues if a string (e.g. a file name) contains a character that cannot be represented in the ANSI code page being used. In general you'll want to use Unicode, but Unicode might not work on older systems (Windows 95, 98, ME). Additionally, some functions only accept narrow strings, so you'll have to use Ansi there.
(Modern) Windows systems use a UTF-16 encoding.
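The data loss mentioned above can be reproduced in managed code alone. In this sketch Encoding.ASCII stands in for an ANSI code page that lacks the character in question (an assumption made to keep the example portable; a real ANSI code page covers more characters), and the names EncodingDemo and RoundTrip are made up for illustration.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // Converts a string to a narrow encoding and back, roughly what happens
    // to a managed string on its way through an ANSI entry point.
    public static string RoundTrip(string s, Encoding narrow)
    {
        return narrow.GetString(narrow.GetBytes(s));
    }

    static void Main()
    {
        string name = "Zo\u00EB"; // "Zoë"

        // The character that the narrow encoding cannot represent is replaced.
        Console.WriteLine(RoundTrip(name, Encoding.ASCII));     // prints "Zo?"

        // UTF-16 (Encoding.Unicode) keeps all three characters, 2 bytes each.
        Console.WriteLine(Encoding.Unicode.GetByteCount(name)); // prints 6
    }
}
```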