PlanetSquires Forums

Support Forums => General Board => Topic started by: Paul Squires on July 09, 2016, 11:45:45 PM

Title: CBSTR StringBuilder Class
Post by: Paul Squires on July 09, 2016, 11:45:45 PM
I worked on the StringBuilder class tonight and have attached it to this post.

The class mirrors the functionality of the StringBuilder object in PowerBasic.

Any problems, changes or additions then please let me know.
Title: Re: CBSTR StringBuilder Class
Post by: Paul Squires on July 10, 2016, 12:39:52 AM
Jose, you can take this code and convert it into the standard format/style that you use for your other Afx routines. Hopefully it will be useful.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 10, 2016, 12:48:49 AM
Should be useful both to use it independently and for functions like the ones that mimic the PB ones and that use concatenations in a loop.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 04:27:19 AM
Well, I have added some constructors and operators. To allow dereferencing the buffer pointer and getting only the part occupied by valid string data, rather than the whole buffer, I have modified some of the functions to mark the end of the string data with a double null.

Now you can even use the intrinsic FB operators with it, e.g.


' We can use FB's intrinsic operators...
DIM sb AS CBStrBld = "   Test string   "
sb = TRIM(sb)
print sb


This little example demonstrates how to use it to write a wrapper function (see function AfxRemove):


'#define _CBSTR_DEBUG_ 1
#include once "CBStr.inc"
using Afx.CBStrClass

' ========================================================================================
' Returns a copy of a string with characters or strings removed.
' If cbMatchStr is not present in cbMainStr, all of cbMainStr is returned intact.
' This function is case-sensitive.
' Usage example:
' DIM cbs AS CBSTR = AfxRemove("[]Hello[]", "[]")
' MessageBoxW 0, cbs, "", MB_OK
' ========================================================================================
PRIVATE FUNCTION AfxRemove (BYREF wszMainStr AS CONST WSTRING, BYREF wszMatchStr AS WSTRING) AS CBSTR
   DIM sb AS CBStrBld = wszMainStr
   DO
      DIM nPos AS LONG = INSTR(sb, wszMatchStr)
      IF nPos = 0 THEN EXIT DO
      sb.DelChars nPos, LEN(wszMatchStr)
   LOOP
   FUNCTION = sb.Str
END FUNCTION
' ========================================================================================

DIM  i AS LONG
DIM cbs AS CBSTR
FOR i = 1 TO 100000  ' Even one million takes little time
   cbs = AfxRemove("[]Hello[]", "[]")
NEXT
PRINT cbs

' We can use FB's intrinsic operators...
DIM sb AS CBStrBld = "   Test string   "
sb = TRIM(sb)
print sb

print "Press any key..."
sleep


Even one million calls take very little time.

It can also be used to pass the resulting string to functions that expect a WSTRING parameter, e.g.


DIM sb AS CBstrBld = CBstrBld("&Закрыть", 1251)
SetWindowText hButton, sb
--or--
SetWindowText hButton, sb.str


Using sb.str you can also pass the resulting string to a COM method that expects a BSTR in parameter!

I have amalgamated it, together with CBSTR, in CBSTR.inc.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 06:38:10 AM
Paul's test modified because I have changed the names from tAdd to Add, etc.


'#define _CBSTR_DEBUG_ 1
#include once "CBStr.inc"
using Afx.CBStrClass

dim sb as CBStrBld
dim cbs as CBSTR = "Paul"

for i as long = 1 to 100
   sb.Add cbs
next

print sb.Str

Print
print "String Length: "; sb.Len
Print "Capacity: "; sb.Capacity

Print
Print "First 5 character values before..."
For i As Long = 1 To 5
   Print sb.Char( i );
Next
Print

Print
Print "Change the first 5 character values..."
sb.Char(1) = 74
sb.Char(2) = 111
sb.Char(3) = 115
sb.Char(4) = 101
sb.Char(5) = 33

Print
Print "First 5 character values after..."
For i As Long = 1 To 5
   Print sb.Char( i );
Next
Print
Print
print sb.Str
print

Print
Print "Now delete the First 5 characters in the buffer..."
Print
print "String Length before: "; sb.Len
sb.DelChars( 1, 5 )
print "String Length after: "; sb.Len
print
print sb.Str; "************"
print

Print
Print "Clear the buffer and add some new text before doing an insert"
sb.Clear
sb.Add "12345678901234567890123456789012345678901234567890"
Print "Now insert 'PlanetSquires' (Len=13) starting at position 5..."
Print
print "String Length before: "; sb.Len
print sb.Str
cbs = "PlanetSquires"
sb.Insert( cbs, 5 )
print
print sb.Str
print "String Length after: "; sb.Len
print

print "Press any key..."
sleep

Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 06:48:55 AM
Modified this line of the CBStrBld.ResizeBuffer function


DIM pNewBuffer AS UBYTE PTR = Allocate(nValue + 1 * SIZEOF(UBYTE) * 2)


to make room for the terminating nulls.
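
As a side note on how that expression is evaluated, here is a minimal sketch (not part of the class; the value of nValue is only illustrative). Multiplication binds tighter than addition, so the size works out to nValue + 2, which is two extra bytes for the double null, assuming nValue is a size in bytes:

```freebasic
' Multiplication binds tighter than addition, so only the "1" is scaled.
DIM nValue AS LONG = 100                    ' hypothetical requested size in bytes
PRINT nValue + 1 * SIZEOF(UBYTE) * 2        ' 102: nValue plus two bytes
PRINT (nValue + 1) * SIZEOF(UBYTE) * 2      ' 202: what parentheses would give instead
```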

Title: Re: CBSTR StringBuilder Class
Post by: James Fuller on July 11, 2016, 08:26:57 AM
Jose,
  Going forward, will all the string functions (remain, shrink, remove...) be builder functions, or will we have standalone routines?

James
Title: Re: CBSTR StringBuilder Class
Post by: Paul Squires on July 11, 2016, 08:37:35 AM
Thanks Jose! Excellent job. Thanks for making the class even more useful!

I have been using the CBSTR class in some new editor code and there have only been a couple of places where the compiler complained. For example, the Dir() function needs the double asterisk to dereference. I believe you have also identified Left() and Right(). Nonetheless, I use CBSTR for situations where I need a dynamic Unicode string; otherwise I simply use WSTRINGs.
Title: Re: CBSTR StringBuilder Class
Post by: James Fuller on July 11, 2016, 08:45:20 AM
Jose,
  I am confused about the relationship between the string builder class and CBStr.
CBStrs are created with SysAllocString, whereas it appears the string builder uses Allocate.

James
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 09:13:30 AM
The DelChars function was not working properly. I apologize.

Please download this new include file.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 09:24:00 AM
Quote from: James Fuller on July 11, 2016, 08:45:20 AM
Jose,
  I am confused about the relationship between the string builder class and CBStr.
CBStrs are created with SysAllocString, whereas it appears the string builder uses Allocate.

James


The relationship comes from this function, which converts the buffer to a CBSTR.


' ========================================================================================
FUNCTION CBStrBld.Str() AS CBSTR
   IF m_BufferLen = 0 THEN EXIT FUNCTION
   IF m_pBuffer = 0 THEN EXIT FUNCTION
   DIM cbOutStr AS CBSTR = SPACE(m_BufferLen \ 2)   ' class will double the size for us
   memmove(*cbOutStr, m_pBuffer, m_BufferLen)
   FUNCTION = cbOutStr
END FUNCTION
' ========================================================================================


In this wrapper


' ========================================================================================
PRIVATE FUNCTION AfxRemove (BYREF wszMainStr AS WSTRING, BYREF wszMatchStr AS WSTRING) AS CBSTR
   DIM sb AS CBStrBld = wszMainStr
   DO
      DIM nPos AS LONG = INSTR(sb, wszMatchStr)
      IF nPos = 0 THEN EXIT DO
      sb.DelChars nPos, LEN(wszMatchStr)
   LOOP
   FUNCTION = sb.Str
END FUNCTION
' ========================================================================================


All the string manipulations are done with the buffer of the CBStrBld class for speed, and in the final instruction (FUNCTION = sb.Str) we convert it to a CBSTR.

Title: Re: CBSTR StringBuilder Class
Post by: James Fuller on July 11, 2016, 09:47:59 AM
Jose,
  I believe this used to work?
James

#include once "windows.bi"
#include once "afx/CBStr.inc"
using Afx.CBStrClass
Function FF_Replace (cbsMain As CBStr,cbsMatch As CBStr,cbsReplace As CBStr) As CBStr
    If Len(cbsMain) = 0 OR Len(cbsMatch) = 0 OR Len(cbsReplace) = 0 Then
        Exit Function
    EndIf
    Dim As CBStr cbsOut = Type(cbsMain)
    Dim As Long i
    Do
        i = Instr(i,cbsOut,cbsMatch)
        If i  > 0 Then
            cbsOut = Left(**cbsOut,i-1) & **cbsReplace & Mid(**cbsOut,i + Len(cbsMatch))
           
            i += Len(cbsReplace)
        EndIf
    Loop Until i = 0
    Function = cbsOut
End Function
'==============================================================================
Dim As CBStr cbs = ("[]Hello[]"),cbs1="[]",cbs2="**",cbs3
cbs3 = FF_Replace(cbs,cbs1,cbs2)
? cbs3
sleep

Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 09:52:05 AM
Quote from: TechSupport on July 11, 2016, 08:37:35 AM
Thanks Jose! Excellent job. Thanks for making the class even more useful!

I have been using the CBSTR class in some new editor code and there have only been a couple of places where the compiler complained. For example, the Dir() function needs the double asterisk to dereference. I believe you have also identified Left() and Right(). Nonetheless, I use CBSTR for situations where I need a dynamic Unicode string; otherwise I simply use WSTRINGs.


I have added to it almost all the functionality of CBSTR. We can consider CBStrBld a dynamic WSTRING. What it lacks is the ability to be passed to a COM method or function that has a BYREF BSTR parameter. With CBSTR we can use @cbs, which passes the address of the underlying BSTR stored in the CBSTR class, but we can't do that with CBStrBld because the underlying type is not a BSTR, although we probably could use it with methods or functions that have a BYREF WSTRING parameter.

Therefore, we have now two new data types, a BSTR and a dynamic WSTRING.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 09:59:49 AM
Quote from: James Fuller on July 11, 2016, 09:47:59 AM
Jose,
  I believe this used to work?
James

#include once "windows.bi"
#include once "afx/CBStr.inc"
using Afx.CBStrClass
Function FF_Replace (cbsMain As CBStr,cbsMatch As CBStr,cbsReplace As CBStr) As CBStr
    If Len(cbsMain) = 0 OR Len(cbsMatch) = 0 OR Len(cbsReplace) = 0 Then
        Exit Function
    EndIf
    Dim As CBStr cbsOut = Type(cbsMain)
    Dim As Long i
    Do
        i = Instr(i,cbsOut,cbsMatch)
        If i  > 0 Then
            cbsOut = Left(**cbsOut,i-1) & **cbsReplace & Mid(**cbsOut,i + Len(cbsMatch))
           
            i += Len(cbsReplace)
        EndIf
    Loop Until i = 0
    Function = cbsOut
End Function
'==============================================================================
Dim As CBStr cbs = ("[]Hello[]"),cbs1="[]",cbs2="**",cbs3
cbs3 = FF_Replace(cbs,cbs1,cbs2)
? cbs3
sleep



No. You must start INSTR with at least 1. Change


Dim As Long i


to


Dim As Long i = 1
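
A minimal sketch of why the start position matters, using a plain STRING instead of CBSTR to keep it self-contained. INSTR returns 0 when the start position is below 1, which the loop above then treats as "no match":

```freebasic
DIM s AS STRING = "[]Hello[]"
PRINT INSTR(0, s, "[]")   ' 0: a start position below 1 never matches
PRINT INSTR(1, s, "[]")   ' 1: first occurrence
PRINT INSTR(2, s, "[]")   ' 8: next occurrence when scanning from position 2
```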

Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 10:24:48 AM
This is a faster version:


' ========================================================================================
' Within a specified string, replace all occurrences of one string with another string.
' Replaces all occurrences of cbMatchStr in cbMainStr with cbReplaceWith
' The replacement can cause cbMainStr to grow or condense in size.
' When a match is found, the scan for the next match begins at the position immediately
' following the prior match.
' This function is case-sensitive.
' Usage example:
' DIM cbs AS CBSTR = AfxReplace("Hello World", "World", "Earth")
' MessageBoxW 0, cbs, "", MB_OK
' ========================================================================================
PRIVATE FUNCTION AfxReplace (BYREF wszMainStr AS WSTRING, BYREF wszMatchStr AS WSTRING, BYREF wszReplaceWith AS CBSTR) AS CBSTR
   DIM sb AS CBStrBld = wszMainStr
   DIM nPos AS LONG = 1
   DO
      nPos = INSTR(nPos, sb, wszMatchStr)
      IF nPos = 0 THEN EXIT DO
      sb = MID(sb, 1, nPos - 1) & wszReplaceWith & MID(sb, nPos + LEN(wszMatchStr))
      nPos += LEN(wszReplaceWith)
   LOOP
   FUNCTION = sb.Str
END FUNCTION
' ========================================================================================

Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 10:34:36 AM
Now that I think about it, CBStrBld is not an adequate name because it is not a BSTR. I chose that name because it was planned as an auxiliary class to CBSTR, but now it has become a new data type.

What about CWStr or CWString?

Which name do you prefer?
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 11:00:35 AM
Quote from: Jose Roca on July 11, 2016, 09:52:05 AM
Quote from: TechSupport on July 11, 2016, 08:37:35 AM
Thanks Jose! Excellent job. Thanks for making the class even more useful!

I have been using the CBSTR class in some new editor code and there have only been a couple of places where the compiler complained. For example, the Dir() function needs the double asterisk to dereference. I believe you have also identified Left() and Right(). Nonetheless, I use CBSTR for situations where I need a dynamic Unicode string; otherwise I simply use WSTRINGs.


I have added to it almost all the functionality of CBSTR. We can consider CBStrBld a dynamic WSTRING. What it lacks is the ability to be passed to a COM method or function that has a BYREF BSTR parameter. With CBSTR we can use @cbs, which passes the address of the underlying BSTR stored in the CBSTR class, but we can't do that with CBStrBld because the underlying type is not a BSTR, although we probably could use it with methods or functions that have a BYREF WSTRING parameter.

Therefore, we have now two new data types, a BSTR and a dynamic WSTRING.


Another difference is that CBSTR can work with strings that contain embedded nulls, and this new data type cannot.
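
The same distinction can be seen with FB's built-in types. This is only an analogy and does not use CBSTR or the new class: a STRING keeps its own length, while a ZSTRING ends at the first null.

```freebasic
' A length-carrying string keeps data after an embedded null;
' a null-terminated one is cut at the first null.
DIM s AS STRING = "ABCDE"
s[2] = 0                  ' overwrite the third character with a null
PRINT LEN(s)              ' 5: the stored length is unchanged

DIM z AS ZSTRING * 16
z = s
PRINT LEN(z)              ' 2: the length is measured up to the first null
```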
Title: Re: CBSTR StringBuilder Class
Post by: James Fuller on July 11, 2016, 11:01:03 AM
Jose,
  I vote for CWStr.
Would the rule of thumb be for non-COM we use CWStr and for COM CBStr?

James
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 11:16:58 AM
Not all COM interfaces use BSTRs. Most low-level ones use WSTRINGs, with the particularity that, in general, they must be freed with CoTaskMemFree. I would use CWSTR when speed is needed, such as in the string functions, and CBSTR in the other cases: because BSTRs carry their length with them, they can be used as a buffer for images and other binary files. Also, when you pass them to a function you don't need to specify their length or how many characters have to be copied, and you don't run the risk of buffer overruns.


Title: Re: CBSTR StringBuilder Class
Post by: James Fuller on July 11, 2016, 11:23:51 AM
Jose,
  You mention that the compiler balks at LEFT and RIGHT for CBStrs and that we must use **cbs.
Can you not overload them as you did LEN?

James
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 11:30:01 AM
They're not overloadable. However, you can use MID, which works, or **.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 11:33:49 AM
Quote from: James Fuller on July 11, 2016, 11:01:03 AM
Jose,
  I vote for CWStr.

James


Well, we have two votes. I think I will use it. It is short and it matches with its CBSTR counterpart.

I also have added the @ operator, so now we can do things like


DIM sb AS CWSTR = 260
GetWindowText(pWindow.hWindow, @sb, 260)
MessageBox 0, sb, "", MB_OK


This is a big advantage over WSTRINGs, which must be dimensioned at compile time and whose size can't be changed at runtime. Wow! Dynamic null-terminated Unicode strings.
Title: Re: CBSTR StringBuilder Class
Post by: James Fuller on July 11, 2016, 11:38:54 AM
Quote from: Jose Roca on July 11, 2016, 11:30:01 AM
They're not overloadable. However, you can use MID, which works, or **.
Yes, I saw that.
How about cLeft and cRight macros that use MID? I just don't like ** :)

James
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 11:45:56 AM
And I don't like macros, but you can write as many as you wish.

If I write a macro called cLeft, I will have to pray that nobody will use it as a variable name, or as another macro... Macros are a can of worms.

You can also request a change in the compiler. It doesn't make much sense that MID works and LEFT and RIGHT don't.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 11:50:39 AM
Done. Name officially changed to CWSTR (or CWStr, for VB lovers).

I also have implemented the @ operator, so CWSTRs can be passed as parameters to functions that expect a WSTRING by reference (or a pointer to a null-terminated Unicode string, for C lovers).
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 01:44:58 PM
I have added another constructor.

Now CWSTR and CBSTR collaborate even better and are very handy for writing wrappers that return Unicode strings. For example:


' ========================================================================================
' Gets the text of a window.
' Note: GetWindowText cannot retrieve the text of a control in another application.
' ========================================================================================
FUNCTION AfxGetWindowText (BYVAL hwnd AS HWND) AS CBSTR
   DIM nLen AS LONG = SendMessageW(hwnd, WM_GETTEXTLENGTH, 0, 0)
   DIM wszText AS CWSTR = SPACE(nLen + 1)
   SendMessageW(hwnd, WM_GETTEXT, nLen + 1, cast(LPARAM, *wszText))
   FUNCTION = wszText.Str
END FUNCTION
' ========================================================================================


That can be called as


DIM wszText AS CBSTR = AfxGetWindowText(hwnd)
AfxMsg wszText


and, with the new constructor, even as


DIM wszText AS CWSTR = AfxGetWindowText(hwnd)
AfxMsg wszText


CWSTR can also be used as the return type of the function, but in both cases you have to return FUNCTION = wszText.Str and not FUNCTION = wszText, or it will GPF.
Title: Re: CBSTR StringBuilder Class
Post by: James Fuller on July 11, 2016, 02:09:17 PM
Jose,
  Check your comments in CBStr.inc especially for the CWStr. A number of references to BSTR when I think you mean CWSTR?

James
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 02:27:02 PM
For example? I don't see anything wrong.
Title: Re: CBSTR StringBuilder Class
Post by: James Fuller on July 11, 2016, 02:38:04 PM
I guess I just don't understand the functionality:

' ========================================================================================
' One * returns the value of the BSTR pointer.
' Two ** returns the adress of the start of the string data.
' Needed because LEFT and RIGHT (cws) fail with an ambiguous call error.
' We have to use **cws (notice the double indirection) with these functions.
' ========================================================================================
OPERATOR * (BYREF cws AS CWstr) AS WSTRING PTR
   OPERATOR = cast(WSTRING PTR, cws.m_pBuffer)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Assigns new text to the BSTR
' ========================================================================================
OPERATOR CWstr.Let (BYREF wszStr AS CONST WSTRING)
   this.Clear
   this.Add(wszStr)
END OPERATOR
'
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 02:43:56 PM
You're correct. These are recent additions that I copied/adapted from CBSTR. I have modified them and a couple of others.
Title: Re: CBSTR StringBuilder Class
Post by: Marc Pons on July 11, 2016, 03:48:42 PM
José,

I think you forgot to add the LEN operator. Without it, LEN gives the SIZEOF of the type, not the length in wide characters.

Here is my proposal:

' ========================================================================================
' The number of characters currently stored in the class is returned as a Long value.
' ========================================================================================

FUNCTION CWstr.Len () AS LONG
   FUNCTION = m_BufferLen \ 2     ' buffer is wide characters (2 bytes each)
END FUNCTION

OPERATOR Len(BYREF cws AS CWSTR) AS LONG
   OPERATOR = cws.len()
END OPERATOR
' ========================================================================================
Title: Re: CBSTR StringBuilder Class
Post by: Paul Squires on July 11, 2016, 04:46:34 PM
Holy crap! I missed almost this entire conversation today! :-)  Been busy at work and am only now checking the forums. It is awesome that my little string builder class has taken on a life of its own and that Jose was able to transform it into its own data type. I will download the new code and try it. IIRC, there is a large default capacity buffer (16K), so you might want to make the default somewhat smaller. I know that it can be overridden, but most times we won't bother doing that, and 16K per string seems a bit overdone.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 10:10:56 PM
I had the feeling that it was too good to be true. Apparently returning a CBSTR works because, by default, Windows caches BSTRs, so the BSTR used in the class is still accessible after it has been freed with SysFreeString and can be copied.

But this stuff of temporary types isn't working as we thought. We (or I) have misunderstood it.

The problem is that when we do FUNCTION = <our type> it first calls the destructor of our type and then calls the constructor of the temporary copy to be returned. Therefore the memory of the type to be returned has already been released and can't be copied to the temporary type returned. This is why it GPFs if we return a CWSTR using FUNCTION = CWStr, and not if we use FUNCTION = CWStr.Str, that creates a CBSTR.

To work, it should call the constructor of the temporary type to be returned before calling the destructor of the type that we intend to return, that is what I thought it was doing.

The documentation says that "The Constructor for the type, if there is one, will be called when the temporary copy is created. And the Destructor for the type, if there is one, will be called immediately after its use.", but what I'm seeing when using FUNCTION = CWStr is that the destructor for CWStr is being called before the constructor for the temporary string to be returned.

This is a show stopper. It is not safe to build a framework that relies on the Windows cache for BSTRs, because it will fail if the cache is disabled.

It doesn't make sense to me to call the destructor of the type to be returned before calling the constructor of the target type. It should be the opposite. Otherwise, we haven't the opportunity to copy the data to the target type.
Title: Re: CBSTR StringBuilder Class
Post by: Paul Squires on July 11, 2016, 10:49:28 PM
Quote from: Jose Roca on July 11, 2016, 10:10:56 PM
It doesn't make sense to me to call the destructor of the type to be returned before calling the constructor of the target type. It should be the opposite. Otherwise, we haven't the opportunity to copy the data to the target type.

That doesn't make sense at all. Maybe try putting in some debug print statements to verify 100% of the order of construction/destruction?
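
A minimal sketch of such a test, assuming nothing about CBStr.inc: the Tracer type and the two functions are hypothetical names made up for illustration. It prints a line from each constructor, the assignment operator and the destructor, so the order can be read directly from the console for both FUNCTION = and RETURN. No expected output is listed, since the order is exactly what is in question.

```freebasic
' Hypothetical tracing type: every special member announces itself.
TYPE Tracer
   DECLARE CONSTRUCTOR ()
   DECLARE CONSTRUCTOR (BYREF rhs AS Tracer)
   DECLARE DESTRUCTOR ()
   DECLARE OPERATOR LET (BYREF rhs AS Tracer)
   id AS LONG
END TYPE

CONSTRUCTOR Tracer ()
   PRINT "default constructor"
END CONSTRUCTOR

CONSTRUCTOR Tracer (BYREF rhs AS Tracer)
   id = rhs.id
   PRINT "copy constructor from id"; rhs.id
END CONSTRUCTOR

DESTRUCTOR Tracer ()
   PRINT "destructor for id"; id
END DESTRUCTOR

OPERATOR Tracer.LET (BYREF rhs AS Tracer)
   id = rhs.id
   PRINT "assignment (LET) from id"; rhs.id
END OPERATOR

' Returns the local through FUNCTION =
FUNCTION ViaFunctionAssign () AS Tracer
   DIM t AS Tracer
   t.id = 1
   FUNCTION = t
END FUNCTION

' Returns the local through RETURN
FUNCTION ViaReturn () AS Tracer
   DIM t AS Tracer
   t.id = 2
   RETURN t
END FUNCTION

PRINT "--- FUNCTION = ---"
DIM a AS Tracer = ViaFunctionAssign()
PRINT "--- RETURN ---"
DIM b AS Tracer = ViaReturn()
```

Comparing the two traces shows whether the local's destructor runs before or after its contents are copied into the result.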
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 11:43:14 PM
This is what I have done, and the destructor of the type to be copied is called before the constructor of the target type.

A simple test:


FUNCTION Foo () AS CWSTR
   DIM wszText AS CWSTR = "Test string"
   FUNCTION = wszText
END FUNCTION


being called as


DIM cws AS CWSTR = Foo


This is the sequence:

When I do DIM wszText AS CWSTR = "Test string"

This constructor is called:


CONSTRUCTOR CWstr (BYREF ansiStr AS STRING = "", BYVAL nCodePage AS LONG = 0)


That calls


PRIVATE FUNCTION CWstr.ResizeBuffer (BYVAL nValue AS LONG) AS LONG
FUNCTION CWstr.Add (BYREF ansiStr AS STRING, BYVAL nCodePage AS LONG = 0) AS LONG
PRIVATE FUNCTION CWstr.AppendBuffer (BYVAL addrMemory AS ANY PTR, BYVAL nNumBytes AS LONG) AS LONG


But when I do


FUNCTION = wszText


The sequence is:


CWSTR Destructor
CONSTRUCTOR CWstr (BYREF cws AS CWSTR)
PRIVATE FUNCTION CWstr.ResizeBuffer (BYVAL nValue AS LONG) AS LONG
FUNCTION CWstr.Add (BYREF cws AS CWSTR) AS LONG
OPERATOR CWstr.CAST () AS ANY PTR
PRIVATE FUNCTION CWstr.AppendBuffer (BYVAL addrMemory AS ANY PTR, BYVAL nNumBytes AS LONG)


Notice that the first thing that it does is to call the CWSTR destructor.

How am I going to copy its contents if the type has been destroyed?

The CBSTR works because the BSTR has been cached by Windows. So even if the type has been destroyed and the BSTR freed, it can still access it. This made me think that it was working.

But if I call the Foo function with DIM cws AS CWSTR = Foo, it GPFs. This is what made me think that something was not working as it should.

With this behavior, it is not possible to return types from a function, unless they are simple types containing scalar values. In this case, FB does a direct copy.

The Foo function works if I change the return type to AS STRING.


FUNCTION Foo () AS STRING
   DIM wszText AS CWSTR = "Test string"
   FUNCTION = wszText
END FUNCTION


being called as


DIM cws AS CWSTR = Foo


Because it copies the CWSTR buffer to the string BEFORE destroying CWSTR.

But the problem is that it can't be used with Unicode, because the string is converted to ANSI automatically.

I think that this behavior is wrong and should be changed; otherwise returning types is useless.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 11, 2016, 11:58:13 PM
But wait, I have found the solution! At least that is what I hope.

If I use RETURN instead of FUNCTION =, it works.


FUNCTION Foo () AS CWSTR
   DIM wszText AS CWSTR = "Test string"
   RETURN wszText
END FUNCTION

Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 12:03:30 AM
Apparently, when using FUNCTION =, the assignment is done after the type has gone out of scope.

But when using RETURN, the assignment is done BEFORE the type goes out of scope.

So the sequence becomes:


CONSTRUCTOR CWstr (BYREF cws AS CWSTR)
PRIVATE FUNCTION CWstr.ResizeBuffer (BYVAL nValue AS LONG) AS LONG
FUNCTION CWstr.Add (BYREF cws AS CWSTR) AS LONG
OPERATOR CWstr.CAST () AS ANY PTR
PRIVATE FUNCTION CWstr.AppendBuffer (BYVAL addrMemory AS ANY PTR, BYVAL nNumBytes AS LONG)
CWSTR Destructor


That is what FUNCTION = should do.

So the solution is to use RETURN instead of FUNCTION.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 12:07:40 AM
PHEW!
Title: Re: CBSTR StringBuilder Class
Post by: Paul Squires on July 12, 2016, 12:16:45 AM
Wow. I would never have thought there would be such a subtle difference! Awesome that you found a working solution!
:)
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 12:22:34 AM
If this is not a bug, I don't know what it is.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 12:48:42 AM
Quote from: TechSupport on July 12, 2016, 12:16:45 AM
Wow. I would never have thought there would be such a subtle difference! Awesome that you found a working solution!
:)


And nobody else. The documentation says that using RETURN is like calling FUNCTION = value : EXIT FUNCTION.

But even if I use


FUNCTION Foo3 () AS CWSTR
   DIM wszText AS CWSTR = "Test string"
   FUNCTION = wszText
   EXIT FUNCTION
END FUNCTION


The destructor of wszText is being called BEFORE the constructor of the temporary type.

Using RETURN, it works as it should.

Guess that these C programmers always use RETURN and nobody has tested FUNCTION with types like the ones that we are using.

Well, I now have to search for every FUNCTION = and replace it with RETURN.

For a moment, I thought that I had to throw away all the work.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 12:53:07 AM
Quote from: Marc Pons on July 11, 2016, 03:48:42 PM
José,

I think you forgot to add the LEN operator. Without it, LEN gives the SIZEOF of the type, not the length in wide characters.

Here is my proposal:

' ========================================================================================
' The number of characters currently stored in the class is returned as a Long value.
' ========================================================================================

FUNCTION CWstr.Len () AS LONG
   FUNCTION = m_BufferLen \ 2     ' buffer is wide characters (2 bytes each)
END FUNCTION

OPERATOR Len(BYREF cws AS CWSTR) AS LONG
   OPERATOR = cws.len()
END OPERATOR
' ========================================================================================


Hi Marc,

Yes, I will do. I modified Paul's code, where it was implemented as tLen, and didn't remember to change it to an operator. Thanks very much for noticing it.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 02:15:42 AM
I have replaced "FUNCTION =" with "RETURN", removed the Len function, added this operator


OPERATOR LEN (BYREF cws AS CWSTR) AS LONG
   CBSTR_DP("CWSTR OPERATOR LEN")
   OPERATOR = .LEN(**cws)
END OPERATOR


and modified


' ========================================================================================
FUNCTION CWstr.Add (BYREF cws AS CWSTR) AS LONG
   CBSTR_DP("***** CWSTR Add 2 - LEN = " & WSTR(LEN(*cast(WSTRING PTR, @cws))))
   ' Incoming string is already in wide format, simply copy it to the buffer.
   DIM AS LONG nLenString = LEN(*cast(WSTRING PTR, @cws))
   IF nLenString = 0 THEN RETURN 0
   ' Copy the string into the buffer and update the length
   this.AppendBuffer(cast(ANY PTR, cws), nLenString * 2)
   RETURN 0
END FUNCTION
' ========================================================================================


Because I modified Paul's code to mark the end of the string with a double null, so that it can be dereferenced through a pointer, since CWSTR is now a null-terminated data type instead of a helper string builder class.

This is Paul's example, modified:


'#define _CBSTR_DEBUG_ 1
#include once "CBStr.inc"
using Afx.CBStrClass

dim sb as CWSTR
dim cbs as CBSTR = "Paul"

for i as long = 1 to 100
   sb.Add cbs
next

print sb.Str

Print
print "String Length: "; LEN(sb)
Print "Capacity: "; sb.Capacity

Print
Print "First 5 character values before..."
For i As Long = 1 To 5
   Print sb.Char( i );
Next
Print

Print
Print "Change the first 5 character values..."
sb.Char(1) = 74
sb.Char(2) = 111
sb.Char(3) = 115
sb.Char(4) = 101
sb.Char(5) = 33

Print
Print "First 5 character values after..."
For i As Long = 1 To 5
   Print sb.Char( i );
Next
Print
Print
print sb.Str
print

Print
Print "Now delete the First 5 characters in the buffer..."
Print
print "String Length before: "; LEN(sb)
sb.DelChars( 1, 5 )
print "String Length after: "; LEN(sb)
print
print sb.Str; "************"
print

Print
Print "Clear the buffer and add some new text before doing an insert"
sb.Clear
sb.Add "12345678901234567890123456789012345678901234567890"
Print "Now insert 'PlanetSquires' (Len=13) starting at position 5..."
Print
print "String Length before: "; LEN(sb)
print sb.Str
cbs = "PlanetSquires"
sb.Insert( cbs, 5 )
print
print sb.Str
print "String Length after: "; LEN (sb)
print

print "Press any key..."
sleep


New file attached. I hope it will work fine and that this will be the end of nasty surprises. All's well that ends well.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 02:47:20 AM
Quote
IIRC, there is a large default capacity buffer (16K) so you might want to make the default somewhat smaller. I know that it can be overloaded but most times we won't bother doing that and 16K per string seems a bit over done.

I forgot your remark. What size do you suggest?

You know that when calling the API functions we have to specify the size most of the time. The advantage of using this type (CWSTR) is that the length can be specified dynamically, not at compile time as with WSTRINGs. As you know, one of the nastiest problems with null-terminated strings is that when we don't know the size in advance we have to allocate a buffer with, e.g., Allocate, CAllocate, etc., instead of using a WSTRING. With CWSTR we can even use a variable to specify the size.
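
For contrast, this is roughly what the manual approach looks like without CWSTR. It is a minimal sketch, not code from the include file; nChars and pBuf are illustrative names. The buffer size is chosen at runtime, but the buffer has to be allocated and freed by hand:

```freebasic
' Manual runtime-sized buffer for a null-terminated wide string.
DIM nChars AS LONG = 260                     ' hypothetical size known only at runtime
DIM pBuf AS WSTRING PTR = CALLOCATE((nChars + 1) * SIZEOF(WSTRING))
IF pBuf <> 0 THEN
   *pBuf = "Hello"                           ' the caller must stay within nChars
   PRINT LEN(*pBuf)                          ' 5
   DEALLOCATE(pBuf)                          ' forgetting this leaks the buffer
END IF
```

With a class that owns the buffer, the allocation and the release move into the constructor and destructor.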

Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 03:41:17 AM
We can use CBSTR or CWSTR in the same way.


' ========================================================================================
PRIVATE FUNCTION AfxGetWindowText (BYVAL hwnd AS HWND) AS CBSTR
   DIM nLen AS LONG = SendMessageW(hwnd, WM_GETTEXTLENGTH, 0, 0)
   DIM wszText AS CBSTR = SPACE(nLen + 1)
   SendMessageW(hwnd, WM_GETTEXT, nLen + 1, cast(LPARAM, *wszText))
   RETURN wszText
END FUNCTION
' ========================================================================================

' ========================================================================================
PRIVATE FUNCTION AfxGetWindowText (BYVAL hwnd AS HWND) AS CWSTR
   DIM nLen AS LONG = SendMessageW(hwnd, WM_GETTEXTLENGTH, 0, 0)
   DIM wszText AS CWSTR = SPACE(nLen + 1)
   SendMessageW(hwnd, WM_GETTEXT, nLen + 1, cast(LPARAM, *wszText))
   RETURN wszText
END FUNCTION
' ========================================================================================


For functions like this one, that don't perform concatenations, I would use CBSTR, and reserve the use of CWSTR for operations that use many concatenations.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 03:49:56 AM
For functions like these, which can potentially perform many concatenations, I would use CWSTR internally for speed and return the result as a CBSTR.


' ========================================================================================
' Returns a copy of a string with characters or strings removed.
' If wszMatchStr is not present in wszMainStr, all of wszMainStr is returned intact.
' This function is case-sensitive.
' Usage example:
' DIM cbs AS CBSTR = AfxRemove("[]Hello[]", "[]")
' MessageBoxW 0, cbs, "", MB_OK
' ========================================================================================
PRIVATE FUNCTION AfxRemove (BYREF wszMainStr AS WSTRING, BYREF wszMatchStr AS WSTRING) AS CBSTR
   DIM cws AS CWSTR = wszMainStr
   DO
      DIM nPos AS LONG = INSTR(cws, wszMatchStr)
      IF nPos = 0 THEN EXIT DO
      cws.DelChars nPos, LEN(wszMatchStr)
   LOOP
   FUNCTION = cws.Str
END FUNCTION
' ========================================================================================

' ========================================================================================
' Returns a copy of a string with characters or strings removed.
' If wszMatchStr is not present in wszMainStr, all of wszMainStr is returned intact.
' wszMatchStr specifies a list of single characters to be searched for individually,
' a match on any one of which will cause that character to be removed from the result.
' This function is case-sensitive.
' Usage example:
' Removing all "b", "a", and "c"
' DIM cbs AS CBSTR = AfxRemoveAny("abacadabra", "bac")
' MessageBoxW 0, cbs, "", MB_OK
' ========================================================================================
PRIVATE FUNCTION AfxRemoveAny (BYREF wszMainStr AS WSTRING, BYREF wszMatchStr AS WSTRING) AS CBSTR
   DIM cws AS CWSTR = wszMainStr
   DIM i AS LONG
   FOR i = 1 TO LEN(wszMatchStr)
      cws = AfxRemove(cws, MID(wszMatchStr, i, 1))
   NEXT
   FUNCTION = cws.Str
END FUNCTION
' ========================================================================================


But, of course, you decide...
Title: Re: CBSTR StringBuilder Class
Post by: James Fuller on July 12, 2016, 08:35:24 AM
Jose,
  Why are you using BSTR allocation for the third CWstr.Insert??
James
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 08:57:44 AM
Because it is required by MultiByteToWideChar.
Title: Re: CBSTR StringBuilder Class
Post by: James Fuller on July 12, 2016, 09:42:52 AM
I really should not post to any forum before I have my second cup of coffee :)

James
Title: Re: CBSTR StringBuilder Class
Post by: James Fuller on July 12, 2016, 10:32:04 AM
Jose,
In my Dlg2Cwin demo I did not change the FbString input "sLine" but used all the new AfxStrxxxx functions successfully for parsing.
Is there any problem using the Fb String type except for an anticipated speed penalty?

James

Title: Re: CBSTR StringBuilder Class
Post by: Paul Squires on July 12, 2016, 11:36:38 AM
Quote from: Jose Roca on July 12, 2016, 02:47:20 AM
Quote
IIRC, there is a large default capacity buffer (16K) so you might want to make the default somewhat smaller. I know that it can be overloaded but most times we won't bother doing that and 16K per string seems a bit overdone.

I forgot your remark. What size do you suggest?

You know that when calling the API functions, most of the time we have to specify the size. The advantage of using this type (CWSTR) is that the length can be specified dynamically, not at compile time as with WSTRINGs. As you know, one of the nastier problems with null-terminated strings is that when we don't know the size in advance we have to allocate a buffer with, e.g., Allocate, CAllocate, etc., instead of using a WSTRING. With CWSTR we can even use a variable to specify the size.


I suggest something more reasonable like 2K or 4K. When you use CWSTR in any of your AfxStr routines then you can probably make a more reasonable judgement of what to set the Capacity of the buffer to. If the programmer is going to use the CWSTR for large concatenations then he should be smart enough to set the initial Capacity to 16K or higher. Granted, I have not done any speed tests on this so a default Capacity of 2K or 4K might be quite acceptable for the majority of cases. I fear that leaving the default as 16K for every CWSTR created seems a bit wasteful.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 12:09:44 PM
> Is there any problem using the Fb String type except for an anticipated speed penalty?

Like creating a black hole that will swallow the universe? Of course not :)
Title: Re: CBSTR StringBuilder Class
Post by: Paul Squires on July 12, 2016, 12:30:17 PM
I have started using CBSTR and the new AfxStr functions. Wow! Now my code is starting to look BASIC again. Yeah! No more pointers and dereferencing. I am loving it so far and I will report any problems should I encounter them.
Title: Re: CBSTR StringBuilder Class
Post by: Paul Squires on July 12, 2016, 12:35:22 PM
I have to hand it to you Jose, what you've created here is nothing short of amazing. Just before the Portugal/France match on Sunday I started writing a new "toy" that I will show off to you guys very soon. I am so impressed by it and how easy development has been using your classes and helper functions. I just tested the application using 144 DPI and it looks amazing...and all I had to do was change one number in my code...no changing system settings or anything. Bravo!
Title: Re: CBSTR StringBuilder Class
Post by: James Fuller on July 12, 2016, 12:48:41 PM
Paul,
  I echo your sentiments!!!
I am also amazed at the compiler's ability to do conversions.
I can pass fb Strings to AfxStrTally with no whining from the compiler and get the correct count!!!

James

Title: Re: CBSTR StringBuilder Class
Post by: Paul Squires on July 12, 2016, 01:11:52 PM
Quote from: James Fuller on July 12, 2016, 12:48:41 PM
I can pass fb Strings to AfxStrTally with no whining from the compiler and get the correct count!!!

I just experienced the same thing with AfxStrParse and AfxStrRemove. Awesome!

One thing I noticed is that Val() complains of ambiguous overload so you need to use ** double indirection.

      ' Parse to get the red,green,blue values
      r = Val( **AfxStrParse(sData, 1) )
      g = Val( **AfxStrParse(sData, 2) )
      b = Val( **AfxStrParse(sData, 3) )

Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 10:18:51 PM
I have overloaded the += and &= operators to allow appending using cws += <some text> and &= <some text> instead of cws.Add <some text>.

Same for CBSTR, that will call this new Append function:


' ========================================================================================
' Appends a string to the BSTR. The string can be a literal or a FB STRING or WSTRING variable.
' ========================================================================================
SUB CBStr.Append (BYREF wszStr AS CONST WSTRING)
   CBSTR_DP("CBSTR Append")
   DIM n1 AS UINT = SysStringLen(m_bstr)
   DIM nLen AS UINT = .LEN(wszStr)
   IF nLen = 0 THEN EXIT SUB
   DIM b AS AFX_BSTR = SysAllocStringLen(NULL, n1 + nLen)
   IF b = NULL THEN EXIT SUB
   memcpy(b, m_bstr, n1 * SIZEOF(WSTRING))
   memcpy(b + n1, @wszStr, nLen * SIZEOF(WSTRING))
   IF m_bstr THEN SysFreeString(m_bstr)
   m_bstr = b
END SUB
' ========================================================================================


I have reduced the buffer to 4 kb, that is the size of the new maximum path length in Windows 10.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 10:26:44 PM
Quote from: TechSupport on July 12, 2016, 12:35:22 PM
I have to hand it to you Jose, what you've created here is nothing short of amazing. Just before the Portugal/France match on Sunday I started writing a new "toy" that I will show off to you guys very soon. I am so impressed by it and how easy development has been using your classes and helper functions. I just tested the application using 144 DPI and it looks amazing...and all I had to do was change one number in my code...no changing system settings or anything. Bravo!


I've improved on the PowerBASIC version, which I haven't worked on since Bob's death. It's not worth doing, since everybody uses DDT these days.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 10:54:29 PM
Quote from: TechSupport on July 12, 2016, 01:11:52 PM
Quote from: James Fuller on July 12, 2016, 12:48:41 PM
I can pass fb Strings to AfxStrTally with no whining from the compiler and get the correct count!!!

I just experienced the same thing with AfxStrParse and AfxStrRemove. Awesome!

One thing I noticed is that Val() complains of ambiguous overload so you need to use ** double indirection.

      ' Parse to get the red,green,blue values
      r = Val( **AfxStrParse(sData, 1) )
      g = Val( **AfxStrParse(sData, 2) )
      b = Val( **AfxStrParse(sData, 3) )



The reason is that the ones that don't work don't generate intermediate temporary strings. Therefore, the cast operator of the class isn't called and they don't know what to do.

This is the operator that returns a BYREF AS WSTRING, which is the kind of parameter that the FB operators expect.


' ========================================================================================
' Returns a pointer to the BSTR
' ========================================================================================
OPERATOR CBStr.CAST () BYREF AS WSTRING
   OPERATOR =  *CAST(WSTRING PTR, m_bstr)
END OPERATOR
' ========================================================================================


A function like MID generates a temporary string that forces the creation of a temporary CBSTR that stores it and casts it. This means that MID is slower than LEFT and RIGHT.

VAL does not generate a temporary string and, therefore, also complains. If we force the creation of a temporary string, it works, e.g.


DIM cbs AS CBSTR = "12345"
print VAL(MID(cbs, 1))


Of course, it is much faster to use **, even if James hates it.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 12, 2016, 11:25:44 PM
Quote
I have to hand it to you Jose, what you've created here is nothing short of amazing. Just before the Portugal/France match on Sunday I started writing a new "toy" that I will show off to you guys very soon. I am so impressed by it and how easy development has been using your classes and helper functions. I just tested the application using 144 DPI and it looks amazing...and all I had to do was change one number in my code...no changing system settings or anything. Bravo!


I have posted a bunch of wrapper functions that deal with paths: http://www.planetsquires.com/protect/forum/index.php?topic=3894.0

I will also add more functions to the existing include files. I have been delaying the ones that return a string.
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 13, 2016, 07:11:12 PM
I have added a constructor and an overloaded operator to allow assigning a CWSTR directly to a CBSTR.

Now we can use cbs = cws instead of cbs = cws.Str

Thanks to Marc for helping me with a detail that I was missing about forward references of types.
Title: Re: CBSTR StringBuilder Class
Post by: Johan Klassen on July 14, 2016, 09:30:13 PM
thank you Jose Roca
is this your latest version ? http://www.planetsquires.com/protect/forum/index.php?topic=3892.msg28913#msg28913
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 14, 2016, 09:48:13 PM
No. I'm still working on it.
Title: Re: CBSTR StringBuilder Class
Post by: Johan Klassen on July 14, 2016, 10:48:28 PM
ok :)
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 15, 2016, 12:14:57 AM
Got it. I was having a problem with returning a CWSTR. It worked without problems if the return type was declared AS CBSTR, but if it was declared AS CWSTR it GPFed, because the buffer pointer of the copy was the same as that of the CWSTR that was about to be destroyed. So I was getting a dangling pointer.

The solution has been to add an overloaded LET operator in which I force the creation of a new buffer in which I copy the string data by calling the ResizeBuffer function.


' ========================================================================================
OPERATOR CWstr.Let (BYREF cws AS CWStr)
   IF m_pBuffer = cws.m_pBuffer THEN EXIT OPERATOR   ' // Ignore cws = cws
   this.Clear
   this.ResizeBuffer(LEN(cws))
   this.Add(cws)
END OPERATOR
' ========================================================================================


The new CWSTR type is so fast that I doubt that you will need to have an ansi version of the string procedures.

Only FB strings with s += "<some text>" are faster, by about a factor of two, probably because they have to copy half as many bytes as CWSTR, which is Unicode. I guess that if FB had a dynamic Unicode string, it would not be any faster.

Speed test on my computer:

FB ansi strings:

100,000 appends -> 26 ms
500,000 appends -> 144 ms
1,000,000 appends -> 294 ms

CWSTR strings:

100,000 appends -> 53 ms
500,000 appends -> 280 ms
1,000,000 appends -> 563 ms
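
A minimal sketch of how the FB ansi figures above could be measured, using only intrinsic FreeBASIC features. This is not the original benchmark: the helper name TimeAppends and the appended literal are assumptions, and the timings will differ from machine to machine.

```freebasic
' Sketch only: times repeated appends to a FB dynamic (ansi) STRING.
' TimeAppends is a hypothetical helper name, not part of CWSTR.inc.
FUNCTION TimeAppends (BYVAL nCount AS LONG) AS DOUBLE
   DIM s AS STRING
   DIM t AS DOUBLE = TIMER          ' TIMER returns seconds as a DOUBLE
   FOR i AS LONG = 1 TO nCount
      s += "<some text>"            ' assumed payload; the original text is not given
   NEXT
   RETURN (TIMER - t) * 1000        ' elapsed time in milliseconds
END FUNCTION

PRINT "100,000 appends ->"; TimeAppends(100000); " ms"
PRINT "500,000 appends ->"; TimeAppends(500000); " ms"
PRINT "1,000,000 appends ->"; TimeAppends(1000000); " ms"
```

Replacing the STRING with a CWSTR (after including the class file) should give the comparable figure for the Unicode type.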
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 15, 2016, 12:34:19 AM
We have started writing a class intended to add speed to CBSTR and have ended with a superfast dynamic null terminated unicode data type that replaces CBSTR for most uses except COM. Not bad!

I'm going to change the name of the include file from CBSTR.inc to CWSTR.inc because WSTR can mean wide string, and both data types are wide strings, whereas BSTR is associated with the strings used by VB and COM and has the bad reputation of being slow.

Title: Re: CBSTR StringBuilder Class
Post by: James Fuller on July 15, 2016, 07:13:07 AM
Quote from: Jose Roca on July 15, 2016, 12:14:57 AM
The new CWSTR type is so fast that I doubt that you will need to have an ansi version of the string procedures.
1,000,000 appends -> 563 ms

It's not about speed. I need them for parsing ansi text files. I do not use unicode for File IO.

James
Title: Re: CBSTR StringBuilder Class
Post by: José Roca on July 15, 2016, 01:24:12 PM
You can pass ansi strings to the unicode functions, whereas if you do the opposite they will be converted to garbage if they don't use the Latin alphabet.
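
A small sketch of the first case, assuming only intrinsic FreeBASIC behaviour: an ansi STRING passed to a procedure that takes a BYREF AS CONST WSTRING parameter is converted by the compiler at the call. CountChar is a hypothetical helper written for this illustration, not one of the AfxStr functions.

```freebasic
' Hypothetical helper, used only to illustrate the implicit ansi -> unicode conversion.
FUNCTION CountChar (BYREF wszMainStr AS CONST WSTRING, BYREF wszChar AS CONST WSTRING) AS LONG
   DIM nCount AS LONG
   FOR i AS LONG = 1 TO LEN(wszMainStr)
      ' Compare each character of the wide string with the character searched for
      IF MID(wszMainStr, i, 1) = wszChar THEN nCount += 1
   NEXT
   RETURN nCount
END FUNCTION

DIM sLine AS STRING = "a,b,c,d"     ' ansi FB string, e.g. a line read from a text file
PRINT CountChar(sLine, ",")         ' sLine is converted to a temporary WSTRING for the call
```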