Strings in FreeBASIC

Started by Paul Squires, August 25, 2015, 09:11:16 AM

Paul Squires

I am gearing up to start changing all my code from ANSI strings to Unicode WSTRINGs so I want to be 100% sure that I do it right. I know that FB does not have a dynamic wstring data type. I was testing strings being sent to subs/functions via parameters and as Jose has indicated, using ByRef as Const WString seems to correctly accept any type of incoming string (see code below).

When designing my functions that return a string, I am confident(?) that just using "String" as the return type will work correctly. The compiler will coerce/convert the returned string to the type of the variable that receives it (see code below). Best of all, none of these internal string conversions produces any compiler warnings.

Am I correct in going with the premise that:
1) For incoming string parameters, using "ByRef As Const WString" will work in all cases.
2) Using "As String" as the return datatype from a string function will work in all cases.



#Define UNICODE


Function MyFunction( ByRef wst As Const WString ) As String

   ? "InFunction:", wst, Len(wst), Sizeof(wst)
   
   Return wst
End Function


Dim st      As String
Dim zst     As ZString * 100
Dim wst     As WString * 100
Dim wreturn As WString * 100

st  = "This is a string"
zst = "This is a zstr"
wst = "This is a wide string"

wreturn = MyFunction(st)
? "Return:", wreturn, Len(wreturn), Sizeof(wreturn)

wreturn = MyFunction(zst)
? "Return:", wreturn, Len(wreturn), Sizeof(wreturn)

wreturn = MyFunction(wst)
? "Return:", wreturn, Len(wreturn), Sizeof(wreturn)

wreturn = MyFunction("This is a literal string")
? "Return:", wreturn, Len(wreturn), Sizeof(wreturn)

Sleep


Paul Squires
PlanetSquires Software
WinFBE Editor and Visual Designer

José Roca

If you have a WSTRING and want to return a STRING, you can do:

FUNCTION = STR(*wstr)

This is what I'm doing in the function ToAnsi of my BSTR class


' ========================================================================================
' Returns the text of the string converted to ansi
' ========================================================================================
FUNCTION CBStr.ToAnsi () AS STRING
   IF m_bstr THEN FUNCTION = STR(*m_bstr)
END FUNCTION
' ========================================================================================


José Roca

Quote
Str also converts Unicode character strings to ASCII character strings. Used this way it does the opposite of WStr. If an ASCII character string is given, that string is returned unmodified.

It is like using ACODE$ in PowerBASIC.
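
A minimal sketch of the two explicit conversions, assuming a default `#Lang "fb"` program; the variable names are only for illustration:

```freebasic
' Explicit conversions between wide and ANSI strings.
Dim wsz As WString * 32 = "Unicode text"
Dim s As String

s = Str(wsz)          ' wide -> ANSI, comparable to PB's ACODE$
Print s, Len(s)

Dim wsz2 As WString * 32
wsz2 = WStr(s)        ' ANSI -> wide, comparable to PB's UCODE$
Print wsz2, Len(wsz2)
```

Characters that have no ANSI equivalent cannot survive the Str step, so the round trip is only lossless for text that fits the ANSI code page.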

José Roca

#3
Quote
Am I correct in going with the premise that:
1) For incoming string parameters, using "ByRef As Const WString" will work in all cases.
2) Using "As String" as the return datatype from a string function will work in all cases

In my experience the conversions are automatic. This is also what happens with PowerBASIC.

However, you can also use WSTR and STR as the equivalents of PB's UCODE$ and ACODE$.

José Roca

#4
But the purpose of changing it to Unicode is to return a Unicode string. If you're going to return an ANSI string anyway, then you don't need to modify your code.

For example, if your function is declared as:


FUNCTION Foo (BYVAL s AS STRING) AS STRING
   FUNCTION = s
END FUNCTION


or


FUNCTION Foo (BYREF s AS STRING) AS STRING
   FUNCTION = s
END FUNCTION


You can call it as:


DIM wsz AS WSTRING * 20 = "Test string"
Foo wsz


And FB will convert the passed parameter to ansi automatically without warnings.

José Roca

#5
To be clear: It does not make sense to modify your code if you are going to return an ansi string anyway.

However, for calling API functions, we don't need to call the "A" versions, since most of them are only wrappers that convert the passed string parameters to Unicode, call the "W" function, and convert any returned strings back.

My CWindow class can be used both passing FB STRINGs and WSTRINGs, and works both with 32 and 64 bit. Therefore, there is no need for different versions, unless compiled as an object file.

Without native support for OLE strings, the ways to return a Unicode string are:

1) To allocate it with SysAllocString and return the handle.

2) To allocate memory, copy the contents of the string to it and return the pointer.

3) To pass a WSTRING by reference.

Options 1 and 2 have the problem that FB has no native statements to deal with them. You must also free the memory yourself.
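
As a rough illustration of option 2, here is a sketch that copies the text into heap memory and hands the pointer to the caller. The name DupWString is hypothetical and not part of any library; the caller is responsible for calling Deallocate:

```freebasic
' Hypothetical helper (not a library function): returns a heap copy of src.
' The caller owns the returned pointer and must Deallocate it.
Function DupWString (ByRef src As Const WString) As WString Ptr
   ' Room for every character plus the terminating null
   Dim p As WString Ptr = Allocate((Len(src) + 1) * Len(WString))
   If p Then *p = src
   Return p
End Function

Dim pw As WString Ptr = DupWString("Heap-allocated text")
If pw Then
   Print *pw, Len(*pw)
   Deallocate(pw)      ' forgetting this leaks the memory
End If
```

The result must be dereferenced everywhere it is used, which is the lack of native support mentioned above.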

Regarding 3), the way to proceed is like several API functions do:


FUNCTION Foo (BYREF wsz AS WSTRING, BYVAL nSize AS INTEGER) AS INTEGER
   ' IF nSize < wanted size THEN return the wanted size and exit the function
   ' Otherwise, copy the string to wsz
END FUNCTION
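
The outline above could be fleshed out roughly as follows. This is only a sketch: the function name GetGreeting, its text, and the convention of returning 0 on success are assumptions made for the example, not something the API pattern prescribes:

```freebasic
' Hypothetical example of the "caller supplies the buffer" pattern.
Function GetGreeting (ByRef wsz As WString, ByVal nSize As Integer) As Integer
   Dim wszText As WString * 32 = "Hello, Unicode"
   Dim nNeeded As Integer = Len(wszText) + 1   ' characters plus terminating null
   If nSize < nNeeded Then Return nNeeded       ' buffer too small: report wanted size
   wsz = wszText                                ' otherwise copy the string to wsz
   Return 0                                     ' assumption: 0 signals success
End Function

Dim wszBuffer As WString * 64
If GetGreeting(wszBuffer, 64) = 0 Then Print wszBuffer
```

Because the function cannot see the real size of a WSTRING passed by reference, the nSize argument is the only protection against writing past the end of the caller's buffer.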



José Roca

During the beta testing of PB 10, I had to do all the testing of COM and Unicode because nobody else, except Bob, seemed to grasp it.

Paul Squires

Thanks Jose, I thought that maybe the return STRING would automatically be converted to a WSTRING if the data type that the function is assigning the return value to was a WSTRING. See the test code below. Looking at the string in memory verifies that it is not converted to Unicode automatically because I do not see any 00 between the character values.


#Lang "FB"
#Define UNICODE

'
' Create a function that will return a WString. The compiler
' will convert the return STRING data to the appropriate data
' type and assign it to the variable accepting the return value.
'
Function MyFunction() As String
   Dim MyWString As WString * 100 = "WString from function"
   Return WSTR(MyWString)
End Function


Dim st  As String
Dim zst As ZString * 100
Dim wst As WString * 100


' STRING returned from function as STRING
st = MyFunction()
? "Return:", st, Len(st)


' STRING returned from function as ZSTRING
zst = MyFunction()
? "Return:", zst, Len(zst), Sizeof(zst)
' Verify the contents of the ZSTRING
For i As Integer = 0 To Len(zst) - 1
   ? zst[i];
Next   
?

' STRING returned from function as WSTRING
wst = MyFunction()
? "Return:", wst, Len(wst), Sizeof(wst)
' Verify that the wst actually contains unicode characters (it appears that it does not)
For i As Integer = 0 To Len(wst) - 1
   ? wst[i];
Next   
?

Sleep



I am looking at your Option "3) To pass a WSTRING by reference". I guess it implies that there will need to be two different functions for every function that wants to return a string to the caller depending on whether STRING or WSTRING is desired. For example, how would you re-code the following function from your collection of wrapper routines to ensure that both types of strings could be accommodated?


' ========================================================================================
' Gets the text of a window.
' Note: GetWindowText cannot retrieve the text of a control in another application.
' ========================================================================================
Function AfxGetWindowText (ByVal HWnd As Long) As WString
   Local nLen   As Long
   Local wbuffer As WString
   nLen = SendMessageW(HWnd, %WM_GETTEXTLENGTH, 0, 0)
   wbuffer = Space$(nLen + 1)
   nLen = SendMessageW(HWnd, %WM_GETTEXT, nLen + 1, ByVal Strptr(wbuffer))
   Function = Left$(wbuffer, nLen)
End Function




José Roca

Without native support in FB for dynamic Unicode strings, it is not possible to write functions that return a string that can be used in the same way. If only WSTRING (which in reality is a WSTRINGZ) were like the native STRING type, but working with Unicode...

Remember that the lack of native support for Unicode dynamic strings has always been my first objection to this compiler.

Paul Squires

Thanks - I understand it all now. I am wondering if maybe the best route would now be to build all my code just using your new BSTR class? I want to use Unicode in all my code but it would be nice to have easier conversion between ANSI and Unicode.

Dynamic Unicode strings have come up in the FB forum before (not very often, but they have). The conversation always stalls around the many different Unicode encodings, especially when using Linux. In Windows we use 2-byte Unicode so it is pretty standardized. Those developing the compiler would like a dynamic Unicode data type but the work involved given all the different encodings is pretty daunting. Not to say that it will never happen, but it could be a while. The advice I was given when I asked was to write my own class.... and that's what you have done with your AfxBstr.  :)

BTW, can you post your latest code so I can continue to code with it and understand its implications and usage better?

José Roca

#10
See attachment. I have changed the name to CBStr. I like shorter names. I used the prefix Afx with PB because it had no namespaces.

I have added several overloaded operators. I will try to add another Let (=) operator that accepts an OLE string handle. This way, we could write the function as returning an OLE string handle allocated with SysAllocString and attach it to the class easily.

I'm thinking of something like


DIM bs AS CBStr
bs = SomeFunction ()   ' where SomeFunction returns an OLE string handle


Still learning. I just began using FB a few days ago and still don't know all the possibilities of the language.

Code of the test program I'm currently using. Shows the usage of some of the procedures and operators.


#define unicode
#INCLUDE ONCE "windows.bi"
#INCLUDE ONCE "Afx/CBStr.inc"

using Afx.CBStrClass

'DIM p AS CBStr PTR
'p = @bs
'print p->Length
'Print *p->Handle

'DIM bs1 AS CBStr = "1st string"
'DIM bs2 AS CBStr = "2nd string"
'bs1.Append *bs2.Handle
'PRINT *bs1.Handle
' -- Replaced with the += or &= operators
'bs1 &= bs2
'PRINT *bs1.Handle


'DIm bs AS CBStr = "Test string"
'DIM s AS WSTRING * 20 = "Test string"
'IF s = bs THEN PRINT "equal"

DIM bs1 AS CBStr = "Test string"
DIM bs2 AS CBStr = "pepe"
'bs1 = bs2
print *bs1.Handle
'print *bs2.Handle
print UCASE(*bs1.Handle)
print bs1.ToAnsi

'DIM bs1 AS CBStr = "Test string 1"
'DIM bs2 AS CBStr = "Test string 2"
'IF bs1 < bs2 THEN print "less"
'print "pepe"

'bs.MakeUpper
'PRINT *bs.Handle
'bs.MakeLower
'PRINT *bs.Handle
'PRINT WSTR(*bs.Handle)
'print bs.Len
'print ucase(*bs.Handle)
'Print bs.ToAnsi; "..."

print "press esc"
dim as string k
do
k = inkey( )
loop until k <> ""



José Roca

Quote from: TechSupport on August 25, 2015, 12:08:09 PM
Thanks Jose, I thought that maybe the return STRING would automatically be converted to a WSTRING if the data type that the function is assigning the return value to was a WSTRING. See the test code below. Looking at the string in memory verifies that it is not converted to Unicode automatically because I do not see any 00 between the character values.


#Lang "FB"
#Define UNICODE

'
' Create a function that will return a WString. The compiler
' will convert the return STRING data to the appropriate data
' type and assign it to the variable accepting the return value.
'
Function MyFunction() As String
   Dim MyWString As WString * 100 = "WString from function"
   Return WSTR(MyWString)
End Function


> WSTR(MyWString)

As the string is already a WString, WSTR is ignored.

As the return type is a String, the compiler automatically converts the WString to ANSI.

With PB9 we could use a STRING to store a Unicode string, but that was because PB ANSI strings are also OLE strings, whereas FB dynamic strings use an ASCIIZ pointer for storage and therefore can't use embedded nulls.
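
One way to check what ends up in a WSTRING after such an assignment is to dump its raw bytes instead of indexing it, since indexing a WSTRING yields whole characters rather than bytes. The following is only a sketch with a made-up two-letter string; the byte pattern depends on the platform's wide character size:

```freebasic
' Sketch: inspect the storage of a WSTRING that received a STRING return value.
Function MyFunction () As String
   Return "AB"
End Function

Dim wst As WString * 10
wst = MyFunction()          ' STRING result assigned to a WSTRING

' View the same memory one byte at a time
Dim pb As UByte Ptr = CPtr(UByte Ptr, @wst)
For i As Integer = 0 To Len(wst) * Len(WString) - 1
   Print Hex(pb[i], 2); " ";
Next
Print
```

If the assignment widened the text, zero bytes should show up between the character codes, which a loop over wst[i] cannot reveal.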

José Roca

#12
Got it. New attachment.


' ========================================================================================
OPERATOR CBStr.Let (BYREF bstrHandle AS BSTR)
   IF bstrHandle = NULL THEN EXIT OPERATOR
   IF m_bstr THEN SysFreeString(m_bstr)
   m_bstr = bstrHandle
END OPERATOR
' ========================================================================================


Test example:


#define unicode
#INCLUDE ONCE "windows.bi"
#INCLUDE ONCE "Afx/CBStr.inc"

FUNCTION Foo () AS BSTR
   DIM bstrHandle AS BSTR
   bstrHandle = SysAllocString("Test string")
   FUNCTION = bstrHandle
END FUNCTION

using Afx.CBStrClass

DIM bs AS CBSTR
bs = Foo
print *bs.Handle

print "press esc"
dim as string k
do
k = inkey( )
loop until k <> ""


As the OLE string returned by the function is attached to the class, it will be freed when the class is destroyed.
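
To make the ownership idea concrete without the attachment, here is a stripped-down sketch. The type BstrOwner is hypothetical and is not the real CBStr; it only shows a Let operator that attaches a handle and a destructor that frees it. The explicit include of win/ole2.bi is an assumption to make sure the OLE string functions are declared:

```freebasic
#define UNICODE
#include once "windows.bi"
#include once "win/ole2.bi"

' Hypothetical owner type for illustration; this is NOT the real CBStr.
Type BstrOwner
   m_bstr As BSTR
   Declare Destructor ()
   Declare Operator Let (ByRef bstrHandle As BSTR)
End Type

Destructor BstrOwner ()
   ' Free the attached OLE string when the object goes out of scope
   If m_bstr Then SysFreeString(m_bstr)
End Destructor

Operator BstrOwner.Let (ByRef bstrHandle As BSTR)
   If bstrHandle = NULL Then Exit Operator
   If m_bstr Then SysFreeString(m_bstr)
   m_bstr = bstrHandle   ' take ownership of the handle
End Operator

Function Foo () As BSTR
   Return SysAllocString("Test string")
End Function

Scope
   Dim bs As BstrOwner
   bs = Foo()
   Print *bs.m_bstr
End Scope   ' bs is destroyed here, which calls SysFreeString
```

A real class would also need to handle copying, because two objects holding the same handle would both try to free it.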

José Roca

#13
So we can now write similar functions, the "A" version returning a STRING and the Unicode one returning a BSTR.

The good thing about this compiler is that it provides ways to extend it. These "building" features are what I always asked Bob for, instead of wasting time on the obsolete DDT engine.

Paul Squires

#14
So you would envision something like this:

#IfDef UNICODE
   #Define AfxGetWindowText AfxGetWindowTextW
#Else
   #Define AfxGetWindowText AfxGetWindowTextA
#EndIf
                                             
Declare Function AfxGetWindowTextW(ByVal HWnd As HWnd) As CBStr                                             
Declare Function AfxGetWindowTextA(ByVal HWnd As HWnd) As String