PlanetSquires Forums

Support Forums => General Board => Topic started by: José Roca on April 17, 2011, 05:27:05 PM

Title: An Unicode reflexion
Post by: José Roca on April 17, 2011, 05:27:05 PM
Now that PB 10 has native unicode support, do you know of any reason, other than having to use WSTRING and WSTRINGZ instead of STRING and ASCIIZ, for keeping two declares, one for the "A" functions and another for the "W" functions? If we use WSTRING/WSTRINGZ, aren't the unicode declares enough? If I'm not wrong, what the API "A" functions do is convert their string parameters to unicode and call the "W" function. As for window messages: "The system does automatic two-way translation (Unicode to ANSI) for window messages. For example, if an ANSI window message is sent to a window that uses the Unicode character set, the system translates that message into a Unicode message before calling the window procedure. The system calls IsWindowUnicode to determine whether to translate the message."

I'm asking just in case I'm missing something. Otherwise, we could remove all the ansi related stuff from the includes, and PBers would only need to get used to WSTRINGs and WSTRINGZs instead of STRINGs and ASCIIZs. This would have some benefits, such as somewhat faster compiles (less stuff to parse) and the removal of a ton of #IF/#ENDIFs and MACROs that can cause nesting overflows in programs that use many includes besides the API ones.
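
To illustrate the idea, here is a minimal sketch of calling a "W" function directly through a unicode-only declare. The declare is written out by hand only for illustration (normally it would come from the include files), and the message text is made up.

```powerbasic
#COMPILE EXE
#DIM ALL

' Hand-written declare for illustration only; the real one lives in the API includes.
DECLARE FUNCTION MessageBoxW LIB "USER32.DLL" ALIAS "MessageBoxW" ( _
   BYVAL hWnd AS DWORD, _
   BYREF lpText AS WSTRINGZ, _
   BYREF lpCaption AS WSTRINGZ, _
   BYVAL uType AS DWORD _
   ) AS LONG

FUNCTION PBMAIN () AS LONG
   LOCAL wszText AS WSTRINGZ * 64
   LOCAL wszCaption AS WSTRINGZ * 64
   wszText = "Called through the W entry point only"
   wszCaption = "Unicode declare"
   ' No "A" declare and no #IF/#ENDIF switch is involved in this call
   CALL MessageBoxW(0, wszText, wszCaption, 0)   ' 0 = MB_OK
END FUNCTION
```

With only the "W" declare present, the only change in the calling code is the variable type: WSTRINGZ where ASCIIZ would have been used.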

What do you think?
Title: Re: An Unicode reflexion
Post by: Jim Dunn on April 17, 2011, 10:06:13 PM
Quote: I think I agree with you... but if ripping out all the ansi stuff would mean that some people might not try the "Jose Headers" or that people might not try FireFly... then I might vote "wait until PB11" ???

After reading/thinking on your other forum, I've decided YES, drop ANSI... if people need ansi, they can use the PowerBASIC headers.  : )
Title: Re: An Unicode reflexion
Post by: José Roca on April 17, 2011, 10:28:21 PM
I think that now is the time. New compilers, new headers and new visual designer.

I want to use libraries for the wrapper functions, but SLLs don't allow #IF/#ENDIF conditional compilation. Therefore, I would have to duplicate the code, and the ansi versions don't offer any advantage, while some have disadvantages. The only change is to use WSTRING/WSTRINGZ instead of STRING/ASCIIZ. Old code must be compiled with old headers, and new tools must not be handicapped by legacy code; otherwise, we can't progress.

Also, conditional compilation for each function with "A" and "W" versions means tons of #IF/#ENDIFs and MACROs, and this can cause nesting problems: it has happened to me with the CSED editor, and I had to solve it by changing the order of the includes.

Title: Re: An Unicode reflexion
Post by: Pat Dooley on April 21, 2011, 01:30:59 PM
At this point it appears to be a lost cause but I'm going to put in my vote for Unicode only headers.
Once I realized that ANSI is still available for the limited times it cannot be avoided (non-Unicode ready DLLs for example) I really don't see any problem.
It is surprising to me that Jose seems to be going ahead with both the wide and ANSI versions after making such a good argument on this and his own forum.
Title: Re: An Unicode reflexion
Post by: Douglas McDonald on April 21, 2011, 06:17:10 PM
I'm not going to pretend I understand all this, but I do have one question. I write a lot of software that controls things like digital multimeters, function generators and other test equipment. Much of this is still done via RS232 or USB. How will all unicode affect this? All of these require ASCII character codes to work. Maybe it won't affect things at all.

Thanks
Doug
Title: Re: An Unicode reflexion
Post by: José Roca on April 21, 2011, 06:41:09 PM
Unlike my compatriot Don Quixote, I don't want to fight against windmills. About 2/3 of the new headers are already unicode only, since they deal with COM. The duplication only remains in the declares for the API functions. The ansi declares are only useful to Windows 95/98 programmers (I only know of two PBers that still use those systems, and they will never use my headers), but because of the pandemic of FUD and lack of understanding, most PBers will use ansi for the time being. Only when absolutely needed will they venture to add a W to STRING, and with fear, uncertainty and doubt.

Anyway, most of my new wrappers will be unicode only, so if the ansi crowd wants to use them with ansi strings, they will have to pass parameters using BYCOPY.
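
A minimal sketch of what that call looks like. ShowText is a hypothetical stand-in for one of the unicode-only wrappers, and the sketch assumes, as the paragraph above implies, that the compiler converts the temporary BYCOPY copy from ansi to unicode.

```powerbasic
#COMPILE EXE
#DIM ALL

' Hypothetical unicode-only wrapper, standing in for one of the new wrappers
FUNCTION ShowText (BYREF wsText AS WSTRING) AS LONG
   MSGBOX wsText
   FUNCTION = LEN(wsText)
END FUNCTION

FUNCTION PBMAIN () AS LONG
   LOCAL sAnsi AS STRING
   sAnsi = "Stored as an ansi string"
   ' BYCOPY passes a temporary copy of the argument; the assumption here is
   ' that this copy is converted to unicode before the wrapper receives it
   CALL ShowText(BYCOPY sAnsi)
END FUNCTION
```

Because the wrapper only ever sees a temporary copy, any change it makes to the parameter is not reflected back in the caller's ansi variable.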

My main concern is that to support both ansi and unicode declares, the headers have to use tons of #IF/ENDIFs, and this can cause nesting overflows with complex applications that use many includes. This problem is what gave me the idea of removing the ansi stuff.
Title: Re: An Unicode reflexion
Post by: José Roca on April 21, 2011, 06:49:10 PM
Quote from: Douglas McDonald on April 21, 2011, 06:17:10 PM
I'm not going to pretend I understand all this, but I do have one question. I write a lot of software that controls things like digital multimeters, function generators and other test equipment. Much of this is still done via RS232 or USB. How will all unicode affect this? All of these require ASCII character codes to work. Maybe it won't affect things at all.

Thanks
Doug

I really don't know how to make it clearer. What I'm talking about is the calls to the Windows API functions, not your data. For your data, use whatever you need or wish. I'm not yet so crazy as to suggest that you use a unicode string to store binary data. All the API functions that require binary data take pointers to byte arrays as parameters, not strings, even if most PBers (including me) use ansi strings as buffers for them instead of true byte arrays. Use STRING and ASCIIZ when needed; they haven't been removed from the compiler (and never will be).

You will also need to use ansi with third party DLLs that don't have unicode support. My point is that, for calling the Windows API functions, using ansi is disadvantageous.
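
As a rough sketch of that split, the program below keeps the user interface text in a WSTRING and the instrument command in an ordinary STRING. The port name, baud rate and command are made-up examples, not taken from any particular device.

```powerbasic
#COMPILE EXE
#DIM ALL

FUNCTION PBMAIN () AS LONG
   LOCAL wsCaption AS WSTRING   ' text for the user interface / Windows API
   LOCAL sCommand AS STRING     ' one byte per character, for the instrument
   LOCAL hComm AS LONG

   wsCaption = "Multimeter"
   sCommand = "*IDN?" & $CRLF   ' example query only

   hComm = FREEFILE
   COMM OPEN "COM1" AS #hComm
   IF ERR THEN
      MSGBOX "Could not open COM1", , wsCaption
      EXIT FUNCTION
   END IF
   COMM SET #hComm, BAUD = 9600
   COMM SEND #hComm, sCommand   ' the bytes go out as stored in the ansi string
   COMM CLOSE #hComm
END FUNCTION
```

Nothing in the serial part of the code changes; only the strings handed to the Windows API become wide.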

As I said, the real problem is the lack of understanding.

Title: Re: An Unicode reflexion
Post by: Douglas McDonald on April 22, 2011, 08:32:44 AM
Thanks, that's what I thought, but I wanted to make sure. I fully admit I don't understand. I only know enough to be dangerous; at least I admit it. I seldom have to dive into lower-level (for lack of a better word) stuff. When I do, you've always had a solution, and I thank you. I need to dive into your web browser control next.

Thank you
Title: Re: An Unicode reflexion
Post by: José Roca on April 22, 2011, 04:36:00 PM
 
When the ansi version is not just a wrapper for the unicode version, it is because of limitations of Windows 95/98. For example, GetOpenFileNameW isn't supported in these OSes, and the ansi version limits the buffer for multiple selection to 32 Kb, which is also the limit for PB's DISPLAY OPEN FILE. With the unicode version, this buffer is unlimited.

I have written a reworked version of OpenFileDialog that assumes a 32 Kb buffer by default, but allows you to set the size of the buffer through the optional parameter dwBufLen. Note that the OpenFileDialog wrapper available in the old PB include files (it has been removed from the new ones) had a buffer of 8192 bytes.


' ========================================================================================
' Open File Dialog
' It allows both the use of "|" as a separator (for compatibility with legacy code) and
' of nulls, used by PB's DISPLAY OPEN FILE statement.
' bstrFilter = "BASIC|*.BAS;*.INC;*.BI|"
' bstrFilter = CHR$("BASIC", 0, "*.BAS;*.INC;*.BI", 0)
' The minimum buffer is %MAX_PATH and the maximum buffer is unlimited.
' Can not be used with Windows versions below Windows 2000.
' ========================================================================================
FUNCTION AfxOpenFileDialog ( _
   BYVAL hwnd AS DWORD _                         ' // Parent window
, BYVAL bstrTitle AS WSTRING _                  ' // Caption
, BYREF bstrFile AS WSTRING _                   ' // Filename
, BYVAL bstrInitialDir AS WSTRING _             ' // Start directory
, BYVAL bstrFilter AS WSTRING _                 ' // Filename filter
, BYVAL bstrDefExt AS WSTRING _                 ' // Default extension
, BYREF dwFlags AS DWORD _                      ' // Flags
, OPTIONAL BYVAL dwBufLen AS DWORD _            ' // Buffer length
) COMMON AS LONG

   LOCAL ix AS LONG
   LOCAL ofn AS OPENFILENAMEW
   LOCAL wszFileTitle AS WSTRINGZ * %MAX_PATH

   ' // Filter is a sequence of null-terminated strings with a final (extra) null terminator
   REPLACE "|" WITH $NUL IN bstrFilter
   bstrFilter += $$NUL

   ' // If the initial directory has not been specified, assume the current directory
   IF LEN(bstrInitialDir) = 0 THEN bstrInitialDir = CURDIR$

   ' // The size of the buffer must be at least %MAX_PATH characters
   IF dwBufLen = 0 THEN dwBufLen = 32768   ' // 32 Kb buffer (enough for at least 126 files)
   IF dwBufLen < 260 THEN dwBufLen = 260
   IF LEN(bstrFile) < dwBufLen THEN bstrFile += SPACE$(dwBufLen - LEN(bstrFile))

   ' // Fill the members of the structure
   ofn.lStructSize      = SIZEOF(ofn)
   ofn.hwndOwner        = hwnd
   ofn.lpstrFilter      = STRPTR(bstrFilter)
   ofn.nFilterIndex     = 1
   ofn.lpstrFile        = STRPTR(bstrFile)
   ofn.nMaxFile         = LEN(bstrFile)
   ofn.lpstrFileTitle   = VARPTR(wszFileTitle)
   ofn.nMaxFileTitle    = %MAX_PATH   ' // Size in characters, not bytes
   ofn.lpstrInitialDir  = STRPTR(bstrInitialDir)
   IF LEN(bstrTitle) THEN
      ofn.lpstrTitle    = STRPTR(bstrTitle)
   END IF
   ofn.Flags            = dwFlags
   IF LEN(bstrDefExt) THEN
      ofn.lpstrDefExt   = STRPTR(bstrDefExt)
   END IF

   FUNCTION = GetOpenFilenameW(ofn)
   ix = INSTR(bstrFile, $NUL & $NUL)
   IF ix THEN
      bstrFile = LEFT$(bstrFile, ix - 1)
   ELSE
      ix = INSTR(bstrFile, $NUL)
      IF ix THEN
         bstrFile = LEFT$(bstrFile, ix - 1)
      ELSE
         bstrFile = ""
      END IF
   END IF

   dwFlags = ofn.Flags

END FUNCTION
' ========================================================================================
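
A short usage sketch for the function above, asking for a larger buffer through dwBufLen. It is not self-contained: it assumes that AfxOpenFileDialog and the API includes that define OPENFILENAMEW and the %OFN_ constants are already part of the program. The title, filter and buffer size are only examples.

```powerbasic
' Assumes AfxOpenFileDialog and the Windows API includes are already included.
FUNCTION PBMAIN () AS LONG
   LOCAL wsFile AS WSTRING
   LOCAL dwFlags AS DWORD

   dwFlags = %OFN_EXPLORER OR %OFN_ALLOWMULTISELECT OR %OFN_FILEMUSTEXIST

   ' Request a 65536 character buffer instead of the default 32768
   IF AfxOpenFileDialog(0, "Select files", wsFile, "", _
         "BASIC|*.BAS;*.INC;*.BI|", "BAS", dwFlags, 65536) THEN
      ' With multiple selection, the result holds the folder followed by the
      ' file names, separated by nulls; show one entry per line
      REPLACE $NUL WITH $CRLF IN wsFile
      MSGBOX wsFile
   END IF
END FUNCTION
```

Since dwFlags is passed by reference, it can be inspected after the call to see the flags that the dialog returned.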


I'm also going to add the following functions, which mimic the ones available in the C runtime but take strings instead of longs.


' ========================================================================================
' Returns -1 if c is a letter (a-z or A-Z) or a digit (0-9).
' ========================================================================================
FUNCTION AfxIsAlnumW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = (c >= "a" AND c <= "z") OR (c >= "A" AND c <= "Z") OR (c >= "0" AND c <= "9")
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a letter (a-z or A-Z) or a digit (0-9).
' ========================================================================================
FUNCTION AfxIsAlnumA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = (c >= "a" AND c <= "z") OR (c >= "A" AND c <= "Z") OR (c >= "0" AND c <= "9")
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a letter (A-Z or a-z).
' ========================================================================================
FUNCTION AfxIsAlphaW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = (c >= "a" AND c <= "z") OR (c >= "A" AND c <= "Z")
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a letter (A-Z or a-z).
' ========================================================================================
FUNCTION AfxIsAlphaA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = (c >= "a" AND c <= "z") OR (c >= "A" AND c <= "Z")
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if the character code of c is in the range 0 to 127 (&H00-&H7F).
' ========================================================================================
FUNCTION AfxIsAsciiW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = ASC(c) >= 0 AND ASC(c) < 128
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if the character code of c is in the range 0 to 127 (&H00-&H7F).
' ========================================================================================
FUNCTION AfxIsAsciiA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = ASC(c) >= 0 AND ASC(c) < 128
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a blank character.
' ========================================================================================
FUNCTION AfxIsBlankW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = c = " " OR c = $TAB
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a blank character.
' ========================================================================================
FUNCTION AfxIsBlankA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = c = " " OR c = $TAB
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a delete character or ordinary control character (&H7F or &H00-&H1F).
' ========================================================================================
FUNCTION AfxIsCntrlW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = ASC(c) = &H7F OR (ASC(c) >= 0 AND ASC(c) <= &H1F)
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a delete character or ordinary control character (&H7F or &H00-&H1F).
' ========================================================================================
FUNCTION AfxIsCntrlA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = ASC(c) = &H7F OR (ASC(c) >= 0 AND ASC(c) <= &H1F)
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a decimal digit (0-9).
' ========================================================================================
FUNCTION AfxIsDigitW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = c >= "0" AND c <= "9"
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a decimal digit (0-9).
' ========================================================================================
FUNCTION AfxIsDigitA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = c >= "0" AND c <= "9"
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a lowercase letter (a-z).
' ========================================================================================
FUNCTION AfxIsLowerW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = c >= "a" AND c <= "z"
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a lowercase letter (a-z).
' ========================================================================================
FUNCTION AfxIsLowerA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = c >= "a" AND c <= "z"
END FUNCTION
' ========================================================================================

' ========================================================================================
' Like AfxIsPrint, returns -1 if c is a printing character (&H21-&H7E), but excluding the
' space character (&H20).
' ========================================================================================
FUNCTION AfxIsGraphW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = ASC(c) > &H20 AND ASC(c) <= &H7E
END FUNCTION
' ========================================================================================

' ========================================================================================
' Like AfxIsPrint, returns -1 if c is a printing character (&H21-&H7E), but excluding the
' space character (&H20).
' ========================================================================================
FUNCTION AfxIsGraphA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = ASC(c) > &H20 AND ASC(c) <= &H7E
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a printing character (&H20-&H7E).
' ========================================================================================
FUNCTION AfxIsPrintW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = ASC(c) >= &H20 AND ASC(c) <= &H7E
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a printing character (&H20-&H7E).
' ========================================================================================
FUNCTION AfxIsPrintA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = ASC(c) >= &H20 AND ASC(c) <= &H7E
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a printable punctuation character.
' ========================================================================================
FUNCTION AfxIsPunctW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = AfxIsGraphW(c) AND NOT AfxIsAlnumW(c)
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a printable punctuation character.
' ========================================================================================
FUNCTION AfxIsPunctA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = AfxIsGraphA(c) AND NOT AfxIsAlnumA(c)
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a space, tab, carriage return, new line, vertical tab, or formfeed
' (&H09-&H0D, &H20).
' ========================================================================================
FUNCTION AfxIsSpaceW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = (ASC(c) >= &H09 AND ASC(c) <= &H0D) OR ASC(c) = &H20
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a space, tab, carriage return, new line, vertical tab, or formfeed
' (&H09-&H0D, &H20).
' ========================================================================================
FUNCTION AfxIsSpaceA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = (ASC(c) >= &H09 AND ASC(c) <= &H0D) OR ASC(c) = &H20
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is an uppercase letter (A-Z).
' ========================================================================================
FUNCTION AfxIsUpperW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = c >= "A" AND c <= "Z"
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is an uppercase letter (A-Z).
' ========================================================================================
FUNCTION AfxIsUpperA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = c >= "A" AND c <= "Z"
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a hexadecimal digit (0-9, A-F, a-f).
' ========================================================================================
FUNCTION AfxIsXDigitW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = (c >= "0" AND c <= "9") OR (c >= "A" AND c <= "F") OR (c >= "a" AND c <= "f")
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a hexadecimal digit (0-9, A-F, a-f).
' ========================================================================================
FUNCTION AfxIsXDigitA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = (c >= "0" AND c <= "9") OR (c >= "A" AND c <= "F") OR (c >= "a" AND c <= "f")
END FUNCTION
' ========================================================================================

' ========================================================================================
' Coerces a character to the ASCII range (0-127) by zeroing any higher-order bits.
' ========================================================================================
FUNCTION AfxToAsciiW (BYVAL c AS WSTRING) AS WSTRING
   FUNCTION = CHR$$(ASC(c) AND &H7F)
END FUNCTION
' ========================================================================================

' ========================================================================================
' Coerces a character to the ASCII range (0-127) by zeroing any higher-order bits.
' ========================================================================================
FUNCTION AfxToAsciiA (BYVAL c AS STRING) AS STRING
   FUNCTION = CHR$(ASC(c) AND &H7F)
END FUNCTION
' ========================================================================================
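
As a usage sketch, the single-character tests can be applied to each character of a longer string. The program below carries its own copy of the hex-digit test so that it compiles on its own; IsHexString is a hypothetical helper, not one of the functions listed above.

```powerbasic
#COMPILE EXE
#DIM ALL

' Local copy of the hex-digit test so the sketch compiles on its own
FUNCTION IsXDigitW (BYVAL c AS WSTRING) AS LONG
   FUNCTION = (c >= "0" AND c <= "9") OR (c >= "A" AND c <= "F") OR (c >= "a" AND c <= "f")
END FUNCTION

' Hypothetical helper: returns -1 if every character of ws is a hexadecimal digit
FUNCTION IsHexString (BYVAL ws AS WSTRING) AS LONG
   LOCAL i AS LONG
   IF LEN(ws) = 0 THEN EXIT FUNCTION   ' empty string: result stays 0
   FOR i = 1 TO LEN(ws)
      ' Test one character at a time
      IF ISFALSE IsXDigitW(MID$(ws, i, 1)) THEN EXIT FUNCTION
   NEXT
   FUNCTION = -1
END FUNCTION

FUNCTION PBMAIN () AS LONG
   MSGBOX "1A2f: " & FORMAT$(IsHexString("1A2f")) & $CRLF & _
          "12G4: " & FORMAT$(IsHexString("12G4"))
END FUNCTION
```

Passing one character at a time matters because the tests use string comparisons, so a multi-character argument would be compared as a whole string and not character by character.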


Also this one:


' ========================================================================================
' Returns -1 if c is a number (0-9), a numeric sign (+-) or a decimal point (.).
' Note: Works both with single characters and strings.
' ========================================================================================
FUNCTION AfxIsNumericW (BYVAL c AS WSTRING) COMMON AS LONG
   FUNCTION = (RETAIN$(c, ANY "+-.0123456789") = c) AND c <> ""
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns -1 if c is a number (0-9), a numeric sign (+-) or a decimal point (.).
' Note: Works both with single characters and strings.
' ========================================================================================
FUNCTION AfxIsNumericA (BYVAL c AS STRING) COMMON AS LONG
   FUNCTION = (RETAIN$(c, ANY "+-.0123456789") = c) AND c <> ""
END FUNCTION
' ========================================================================================
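
One point worth keeping in mind: the test above checks only which characters are present, not their order or count, so a string such as "1-2.3." also passes. The sketch below uses a local copy of the function (renamed IsNumericW so it compiles on its own) to show this.

```powerbasic
#COMPILE EXE
#DIM ALL

' Local copy of the test above so the sketch compiles on its own
FUNCTION IsNumericW (BYVAL c AS WSTRING) AS LONG
   FUNCTION = (RETAIN$(c, ANY "+-.0123456789") = c) AND c <> ""
END FUNCTION

FUNCTION PBMAIN () AS LONG
   LOCAL ws AS WSTRING
   ws = "12.5: " & FORMAT$(IsNumericW("12.5")) & $CRLF       ' only allowed characters
   ws += "1-2.3.: " & FORMAT$(IsNumericW("1-2.3.")) & $CRLF  ' also only allowed characters
   ws += "12a: " & FORMAT$(IsNumericW("12a"))                ' contains a letter
   MSGBOX ws
END FUNCTION
```

If a well-formed number is required, a further check (for example on the position of the sign and the number of decimal points) would be needed on top of this test.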



I have also added to CWindow the method AddYouTubeVideo, which lets you embed a YouTube video in your application by passing the 11 character code of the video, and AddWindowsMediaPlayer, which lets you embed an instance of Windows Media Player in your application by passing the path of the video.

I'm also working on a way to easily add a Virtual Earth map or a Google map.

And many other things, if I find the time to do them.

Title: Re: An Unicode reflexion
Post by: Rolf Brandt on April 23, 2011, 04:05:20 AM
Quote:
I'm also working on a way to easily add a Virtual Earth map or a Google map.

And many other things, if I find the time to do them.
Great! I am looking out for Google map.

Thanks for all these great enhancements, Jose.

rb