PlanetSquires Forums

Support Forums => General Board => Topic started by: James Fuller on July 04, 2016, 11:15:36 AM

Title: FF_STRING library
Post by: James Fuller on July 04, 2016, 11:15:36 AM
Attached are Paul's FF_xxxxxx string routines with Private added to each.
I also changed the order of the parameters in FF_Parse and FF_ParseAny so the delimiter is last, allowing for a default of ","

James
Title: Re: FF_STRING library
Post by: Paul Squires on July 05, 2016, 10:27:55 AM
Awesome, thanks James!

I am going to revisit this code again soon because I want to create one library that handles both ANSI and Unicode. Since working on the new editor I am using about 90% Unicode now and I need string handling routines that work well with Unicode also.

I am also thinking that, rather than make each function PRIVATE, maybe I should keep each one PUBLIC, compile each as a separate object, and assemble them into a static library. For larger multiple-module projects this makes a lot of sense, because only one copy of the routine will be included rather than a copy for each independent module that gets compiled. I am still working my way through these types of questions because I have never had to deal with them in the past. With PB we just Include'd everything into the main application .BAS file rather than linking separate standalone modules, as is the FB (and C/C++) practice. I never used PB's SLL libraries either. The new editor will make it pretty easy to create standalone libraries composed of hundreds of individual source files (for granularity purposes). I can see something like this being quite useful for large code bases like Jose's AfxCtl library.
Title: Re: FF_STRING library
Post by: José Roca on July 05, 2016, 02:52:26 PM
What the C programmers do is not always the best way. Many times they are constrained by the tools that they use. I'm tired of downloading C++ demos that are very difficult to follow because the code is split into dozens of files. And batch makefiles for this, batch makefiles for that, environment variables... Crazy!

I once tried PB's SLL system... A lot of work to put every procedure into a separate file and to prepare header files, and having to rebuild everything every time you change a comma... And the final result was that it compiled slightly faster using includes than using SLLs! On my computer, compiling with include files takes more time only the first time; then the files are cached and it compiles very fast.

I'm certainly not going to split the code into hundreds of tiny .bas files. I would prefer not to use it and use SendMessageW.
Title: Re: FF_STRING library
Post by: James Fuller on July 05, 2016, 03:31:31 PM
Quote from: TechSupport on July 05, 2016, 10:27:55 AM
I am going to revisit this code again soon because I want to create one library that handles both ANSI and Unicode. Since working on the new editor I am using about 90% Unicode now and I need string handling routines that work well with Unicode also.

Paul,
  How will you do unicode with no native BSTR?

James
Title: Re: FF_STRING library
Post by: Paul Squires on July 05, 2016, 04:29:21 PM
Jose - right you are, and I'm going to stick with the INCLUDE approach as well. :)

James - Instead of returning strings as the result of a FUNCTION, I would design it such that the string IN and string OUT (and OUT string buffer length) would be passed to the function as parameters. The operation would occur and the result assigned to the OUT parameter (rather than as FUNCTION = strResult). It would be more like a SUB rather than a FUNCTION. I could do that for the UNICODE versions but still have the FUNCTION = strResult for the ANSI version. I would just use overloading to determine which version to call. I hate it that FB does not have a native built-in dynamic WSTRING.
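
A minimal sketch of the IN/OUT design described above, assuming a caller-supplied buffer and an explicit capacity in characters. RemoveInto and its parameter names are hypothetical and are not taken from the FF_ library:

```freebasic
' Sketch only: the result is written to a caller-supplied buffer whose
' capacity (in characters, terminator included) is passed explicitly.
' RemoveInto, pOut and outChars are hypothetical names.
Function RemoveInto(ByRef wszMain As WString, ByRef wszMatch As WString, _
                    ByVal pOut As WString Ptr, ByVal outChars As Long) As Boolean
    If pOut = 0 OrElse outChars < 1 Then Return False
    ' Refuse to copy when the input would not fit with its terminator
    If Len(wszMain) >= outChars Then Return False
    *pOut = wszMain
    If Len(wszMatch) = 0 Then Return True
    Dim As Long i = InStr(*pOut, wszMatch)
    While i > 0
        ' The string only ever shrinks, so it keeps fitting in the buffer
        *pOut = Left(*pOut, i - 1) & Mid(*pOut, i + Len(wszMatch))
        i = InStr(*pOut, wszMatch)
    Wend
    Return True
End Function

Dim As WString * 64 wszOut
If RemoveInto("[]Hello[]", "[]", @wszOut, 64) Then Print wszOut
```

Passing the capacity lets the routine reject input that would overflow the buffer, at the cost of a third and fourth argument on every call.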

Title: Re: FF_STRING library
Post by: James Fuller on July 05, 2016, 06:32:46 PM
Paul,
  We REALLY NEED a native dynamic wide string type. I wonder if dkl can be bribed? :)
Your way is not acceptable to me. I prefer the BCX way with a static circular buffer.

James
Here is the FF_Remove for WStrings

#define unicode
#include Once "windows.bi"
#define FbTmpWStrSize 2048
'==============================================================================
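Function fbTmpWStr(CharCount As Long) As WString Ptr
    ' Circular pool: each call hands out the next slot and frees whatever
    ' buffer that slot held FbTmpWStrSize calls earlier.
    Static As Long StrCnt
    Static As WString Ptr WStrFunc(FbTmpWStrSize)
    ' Advance and wrap the slot index (FbTmpWStrSize must be a power of two)
    StrCnt = (StrCnt + 1) AND (FbTmpWStrSize -1)
    If WStrFunc(StrCnt) Then
        Deallocate WStrFunc(StrCnt)
        WStrFunc(StrCnt) = NULL
    EndIf
    CharCount+=1    ' room for the terminating null
    WStrFunc(StrCnt) = Allocate(CharCount * Len(WString))
    Function = WStrFunc(StrCnt)
End Function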
'==============================================================================
Function FF_Remove(Byval wsMain As WString Ptr,Byval wsMatch As WString Ptr) As WString Ptr
    Dim As Integer i
    If Len(*wsMain) = 0 OR Len(*wsMatch) = 0 Then
        Return NULL
    EndIf
    Dim As WString Ptr wsp = fbTmpWStr(Len(*wsMain))
    *wsp = *wsMain
    Do
        i = Instr(*wsp,*wsMatch)
        If i > 0 Then
            *wsp = Left(*wsp,i-1) & Mid(*wsp,i + Len(*wsMatch))
        EndIf
    Loop Until i = 0   
    Function = wsp
End Function
'==============================================================================
Dim As WString *20 ws1 = "[]Hello[]"
Dim As WString Ptr wsp = FF_Remove(@ws1,"[]")
? *wsp

sleep



Title: Re: FF_STRING library
Post by: José Roca on July 05, 2016, 07:29:07 PM
Maybe one day you will let us know what WStrFunc does.
Title: Re: FF_STRING library
Post by: James Fuller on July 05, 2016, 07:47:41 PM
Jose,
  I thought it a bit self explanatory but ...?
In this case it is a static array of pointers to WStrings so you can return a WSTRING PTR from a function.
The array index increments and rolls over after 2048 in this case.

James
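
To make the rollover concrete, here is a small sketch of the same idea with a deliberately tiny pool. PoolSize and TmpWStr are illustrative names, and the pool size of 4 is an assumption chosen only so the wrap-around happens quickly:

```freebasic
' Sketch only: a circular pool of temporary WSTRING buffers.
' PoolSize and TmpWStr are hypothetical names for illustration.
#define PoolSize 4   ' must be a power of two for the AND mask to work

Function TmpWStr(ByVal charCount As Long) As WString Ptr
    Static As Long slot
    Static As WString Ptr pool(PoolSize - 1)
    slot = (slot + 1) And (PoolSize - 1)       ' wraps within 0..PoolSize-1
    If pool(slot) Then Deallocate pool(slot)   ' free the previous occupant
    pool(slot) = CAllocate((charCount + 1) * SizeOf(WString))
    Function = pool(slot)
End Function

Dim As WString Ptr first = TmpWStr(16)
*first = "first"

Dim As Long i
For i = 1 To PoolSize - 1
    Dim As WString Ptr p = TmpWStr(16)
    *p = "filler"
Next
Print *first   ' still valid: its slot has not been reused yet

' This call lands on the slot that held "first" and frees that buffer,
' so "first" must not be dereferenced from here on.
Dim As WString Ptr again = TmpWStr(16)
*again = "again"
Print *again
```

A returned pointer therefore stays valid only until the pool has handed out as many further buffers as it has slots, which is why the number of results in use at the same time matters.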
Title: Re: FF_STRING library
Post by: José Roca on July 05, 2016, 08:50:42 PM
A sort of string pool. Pray you don't run into one of those users who use strings of several gigabytes.
Title: Re: FF_STRING library
Post by: James Fuller on July 05, 2016, 08:57:37 PM
Jose,
  Yes I know but it's not the size it's the number in use at the same time.
And now for CBStr. This was a bit hairy and I'm not sure it's the best/only way to do it.
This has an option to delete and free the allocations.
James


#define unicode
#include "afx/CBstr.inc"
#define CBStrTmpSize 16
'==============================================================================
Function fbBstrTmp(ByVal DeleteFlag As Long = 0) As CBStr Ptr
    Static As CBStr Ptr CBStr_Tmp(CBStrTmpSize)
    Static As Long CBStr_Tmp_Count
    If DeleteFlag Then
        Dim i As Long
        For i = 1 To CBStr_Tmp_Count
            Delete CBStr_Tmp(i)
            CBStr_Tmp(i) = NULL
        Next
        CBStr_Tmp_Count = 0
    EndIf
    CBStr_Tmp_Count = (CBStr_Tmp_Count + 1) AND (CBStrTmpSize  -1)
    If CBStr_Tmp(CBStr_Tmp_Count) Then
        delete CBStr_Tmp(CBStr_Tmp_Count)
        CBStr_Tmp(CBStr_Tmp_Count) = NULL
    EndIf
    CBStr_Tmp(CBStr_Tmp_Count) = New CBStr   ' allocate a CBStr object, not a pointer-sized block
    Function = CBStr_Tmp(CBStr_Tmp_Count)
End Function
'==============================================================================
Function FF_Remove(Byval cbsMain As CBStr Ptr,Byval cbsMatch As CBStr Ptr) As CBStr Ptr
    Dim As Long i
    If Len(*cbsMain) = 0 OR Len(*cbsMatch) = 0 Then
        Return NULL
    EndIf
    Dim As CBStr Ptr cbs = fbBStrTmp()
    Dim As WString Ptr ws1,ws2
    ws1 = **cbsMain
    ws2 = **cbsMatch
    Do
        i = Instr(*ws1,*ws2)
        If i > 0 Then
            *ws1 = Left(*ws1,i-1) & Mid(*ws1,i + Len(*ws2))
        EndIf
    Loop Until i = 0
    *cbs = *ws1
    Function = cbs
End Function
'==============================================================================
Function FbMain() As Long
    Dim As CBStr Ptr ws = new CBStr("[]Hello[]")
    Dim As CBStr Ptr ws1 = new CBStr("[]")
    Dim As CBStr Ptr wsp = FF_Remove(ws,ws1)
    ? *wsp
    Delete ws
    Delete ws1
   
    sleep

    Function = 0
End Function
End FbMain()
Title: Re: FF_STRING library
Post by: José Roca on July 05, 2016, 11:55:46 PM
A little demo of what happens with FB unicode conversions:


pWindow.AddControl("Button", , IDCANCEL, "&Close", 350, 150, 75, 23)

DIM wsz AS WSTRING * 260 = WSTR("&Закрыть")
pWindow.AddControl("Button", , IDCANCEL, wsz, 350, 200, 75, 23)

DIM cb AS CBSTR = AfxUCode("&Закрыть", 1251)
pWindow.AddControl("Button", , IDCANCEL, cb, 350, 250, 75, 23)


Maybe on a Russian version of Windows WSTR will work, because the local ANSI code page will be Russian, but it doesn't work on a computer with a different local code page.

However, the version that uses AfxUCode("&Закрыть", 1251), should work in all systems.
Title: Re: FF_STRING library
Post by: José Roca on July 06, 2016, 01:12:02 AM
> And now for CBStr. This was a bit hairy and I'm not sure it's the best/only way to do it.


' ========================================================================================
SUB FF_Remove(BYREF cbsMain AS CBSTR, BYREF cbsMatch AS CBSTR, BYREF cbsOut AS CBSTR)
   IF LEN(cbsMain) = 0 OR LEN(cbsMatch) = 0 OR VARPTR(cbsOut) = NULL THEN EXIT SUB
   cbsOut = cbsMain
   DIM i AS LONG
   DO
      i = INSTR(cbsOut, cbsMatch)
      IF i THEN
         cbsOut = LEFT(**cbsOut, i - 1) & MID(**cbsOut, i + LEN(cbsMatch))
      ENDIF
   LOOP UNTIL i = 0
END SUB
' ========================================================================================



DIM cbs1 AS CBSTR = CBSTR("[]Hello[]")
DIM cbs2 AS CBSTR = CBSTR("[]")
DIM cbsOut AS CBSTR
FF_Remove(cbs1, cbs2, cbsOut)
MessageBoxW 0, *cbsOut, "", MB_OK


We can also use


DIM cbsOut AS CBSTR
FF_Remove("[]Hello[]", "[]", cbsOut)
MessageBoxW 0, *cbsOut, "", MB_OK

Title: Re: FF_STRING library
Post by: José Roca on July 06, 2016, 01:34:00 AM
Paul's suggestion:


' ========================================================================================
SUB FF_Remove(BYREF wszMain AS WSTRING, BYREF wszMatch AS WSTRING, BYREF wszOut AS WSTRING)
   IF LEN(wszMain) = 0 OR LEN(wszMatch) = 0 OR VARPTR(wszOut) = NULL THEN EXIT SUB
   wszOut = wszMain
   DIM i AS LONG
   DO
      i = INSTR(wszOut, wszMatch)
      IF i THEN
         wszOut = LEFT(wszOut, i - 1) & MID(wszOut, i + LEN(wszMatch))
      ENDIF
   LOOP UNTIL i = 0
END SUB
' ========================================================================================



DIM wszOut AS WSTRING * 260
FF_Remove("[]Hello[]", "[]", wszOut)
MessageBoxW 0, wszOut, "", MB_OK


The advantage of using the CBSTR version is that we don't need to know the size of the out string in advance, because it uses a dynamic BSTR, or rather an AFX_BSTR, because BSTR has been broken since the latest headers update (it is no longer a pointer to a Unicode string, but a pointer to a Unicode character).

What? Not a single PTR parameter? This can't be FreeBASIC :)

Title: Re: FF_STRING library
Post by: José Roca on July 06, 2016, 03:55:20 AM
Overloading the procedure, we can use CBSTRs or WSTRINGs:


' ========================================================================================
FUNCTION FF_Remove OVERLOAD (BYREF cbsMain AS CBSTR, BYREF cbsMatch AS CBSTR, BYREF cbsOut AS CBSTR) AS BOOLEAN
   IF LEN(cbsMain) = 0 OR LEN(cbsMatch) = 0 OR VARPTR(cbsOut) = NULL THEN EXIT FUNCTION
   cbsOut = cbsMain
   DIM i AS LONG
   DO
      i = INSTR(cbsOut, cbsMatch)
      IF i THEN
         cbsOut = LEFT(**cbsOut, i - 1) & MID(**cbsOut, i + LEN(cbsMatch))
         FUNCTION = TRUE
      ENDIF
   LOOP UNTIL i = 0
END FUNCTION
' ========================================================================================

' ========================================================================================
FUNCTION FF_Remove OVERLOAD (BYREF wszMain AS WSTRING, BYREF wszMatch AS WSTRING, BYREF wszOut AS WSTRING) AS BOOLEAN
   IF LEN(wszMain) = 0 OR LEN(wszMatch) = 0 OR VARPTR(wszOut) = NULL THEN EXIT FUNCTION
   wszOut = wszMain
   DIM i AS LONG
   DO
      i = INSTR(wszOut, wszMatch)
      IF i THEN
         wszOut = LEFT(wszOut, i - 1) & MID(wszOut, i + LEN(wszMatch))
         FUNCTION = TRUE
      ENDIF
   LOOP UNTIL i = 0
END FUNCTION
' ========================================================================================


I think that this is a good solution. If we know in advance the maximum length of the output string, it is more efficient to use a WSTRING because there won't be further allocations/deallocations of memory. If we don't know it, then we can use a CBSTR.
Title: Re: FF_STRING library
Post by: José Roca on July 06, 2016, 04:06:51 AM
What we must not do is to use overloaded versions of the operators + and & (I removed them from the CBSTR class) because they generate temporary BSTRs that aren't freed. Instead, I'm using ** to point to the contents of the BSTR

LEFT(**cbsOut, i - 1) & MID(**cbsOut, i + LEN(cbsMatch))

that generates temporary WSTRINGs that the compiler frees automatically.
Title: Re: FF_STRING library
Post by: James Fuller on July 06, 2016, 08:51:39 AM
Jose,
  Nope! :) Come on Jose you just penned a very enjoyable narrative on the Fb unbasic way to do things :)
I will not give up my "BASIC" way to do things, sorry.
The pool technique has worked for decades with BCX.

How to do this your way?

gsDlgInfo(DI_EXSTYLE) = FF_Remove(FF_Remain(1,FF_Parse(sLine,1),"("),"%")

James




Title: Re: FF_STRING library
Post by: James Fuller on July 06, 2016, 10:12:50 AM
Jose,
  Do you see any problems here? This might be a compromise.
James


#define unicode
#include "afx/CBstr.inc"

FUNCTION FF_Remove OVERLOAD (BYREF cbsMain AS CBSTR, BYREF cbsMatch AS CBSTR, BYREF cbsOut AS CBSTR) AS CBSTR
   IF LEN(cbsMain) = 0 OR LEN(cbsMatch) = 0 OR VARPTR(cbsOut) = NULL THEN EXIT FUNCTION
   cbsOut = cbsMain
   DIM i AS LONG
   DO
      i = INSTR(cbsOut, cbsMatch)
      IF i THEN
         cbsOut = LEFT(**cbsOut, i - 1) & MID(**cbsOut, i + LEN(cbsMatch))
         'FUNCTION = TRUE
      ENDIF
   LOOP UNTIL i = 0
   Function = cbsOut
END FUNCTION
'==============================================================================
Function FbMain() As Long
    Dim As CBStr cbs = "[]Hello[]"
    Dim As CBStr cbs1 = "[]"
    Dim As CBStr cbs2
    cbs2 =  FF_Remove(cbs,cbs1,cbs2)
    ? **cbs2
    sleep
    Function = 0
End Function
End FbMain()

Title: Re: FF_STRING library
Post by: James Fuller on July 06, 2016, 11:36:32 AM
Jose,
  Take a look at Temporary Types in the help file.
James

FUNCTION FF_Remove OVERLOAD (cbsMain AS CBSTR, cbsMatch AS CBSTR) AS CBSTR
   IF LEN(cbsMain) = 0 OR LEN(cbsMatch) = 0  THEN EXIT FUNCTION
   
   Dim As CBStr cbsOut = Type(cbsMain)
   DIM i AS LONG
   DO
      i = INSTR(cbsOut, cbsMatch)
      IF i THEN
         cbsOut = LEFT(**cbsOut, i - 1) & MID(**cbsOut, i + LEN(cbsMatch))
      ENDIF
   LOOP UNTIL i = 0
   Function = cbsOut
END FUNCTION

Title: Re: FF_STRING library
Post by: José Roca on July 06, 2016, 11:45:00 AM
> How to do this your way?

Step by step.
Title: Re: FF_STRING library
Post by: José Roca on July 06, 2016, 11:50:38 AM
> Do you see any problems here? This might be a compromise.

Why reassign to cbs2 its own content? It is like doing x = x.
Title: Re: FF_STRING library
Post by: James Fuller on July 06, 2016, 11:55:48 AM
Jose,
  I only tried this with CBStr. As I said, see Temporary Types in the help file.
I added a ? "CBStr Destructor SysFreeString m_bstr" to CBStr.inc so I could see
the destructors being called: 5 altogether, 2 before the ? **cbs2 and 3 more after leaving the function.

James
   

#define unicode
#include "afx/CBstr.inc"
FUNCTION FF_Remove OVERLOAD (cbsMain AS CBSTR, cbsMatch AS CBSTR) AS CBSTR
   IF LEN(cbsMain) = 0 OR LEN(cbsMatch) = 0  THEN EXIT FUNCTION
   
   'This is the change. See Temporary Types in the help file
   Dim As CBStr cbsOut = Type(cbsMain)
   
   DIM i AS LONG
   DO
      i = INSTR(cbsOut, cbsMatch)
      IF i THEN
         cbsOut = LEFT(**cbsOut, i - 1) & MID(**cbsOut, i + LEN(cbsMatch))
      ENDIF
   LOOP UNTIL i = 0
   Function = cbsOut
END FUNCTION
'==============================================================================
Function FbMain() As Long
    Dim As CBStr cbs = "[]Hello[]"
    Dim As CBStr cbs1 = "[]"
    Dim As CBStr cbs2
    cbs2 =  FF_Remove(cbs,cbs1)
    ? **cbs2
    Function = 0
End Function
FbMain()
Sleep
End

Title: Re: FF_STRING library
Post by: José Roca on July 06, 2016, 12:38:59 PM
Maybe you have found something of value. Will have to test.
Title: Re: FF_STRING library
Post by: José Roca on July 06, 2016, 02:44:30 PM
Seems to work fine. Congratulations!

Can't be used to return a WSTRING, but works with CBSTR.

This is what I have been looking for for so long: a way to return a temporary value that will be freed by the compiler. In the case of a TYPE, it is the destructor of this TYPE that does the cleanup, which is the correct way of doing it.

This opens new possibilities, also for variants.

With CBSTR, a class for variants and the use of abstract methods, I will probably be able to make COM easy to use.

The show stopper has always been the problem of freeing temporary results.

As always, the best solution only comes after all the others have been discarded.
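
The destructor-based cleanup described above can be sketched without CBSTR at all. In this sketch TmpBuf, MakeBuf, Show and the g_live counter are hypothetical names; the counter exists only so that the balance of constructor and destructor calls can be checked:

```freebasic
' Sketch only: a UDT returned by value is cleaned up by its destructor.
' TmpBuf, MakeBuf, Show and g_live are hypothetical; CBSTR is not used here.
Dim Shared As Long g_live   ' number of TmpBuf instances currently alive

Type TmpBuf
    Declare Constructor()
    Declare Constructor(ByRef rhs As TmpBuf)
    Declare Destructor()
    Declare Operator Let(ByRef rhs As TmpBuf)
    As ZString Ptr p
End Type

Constructor TmpBuf()
    p = CAllocate(32)
    g_live += 1
End Constructor

Constructor TmpBuf(ByRef rhs As TmpBuf)
    p = CAllocate(32)
    *p = *rhs.p
    g_live += 1
End Constructor

Destructor TmpBuf()
    Deallocate p        ' the owner frees its own memory
    g_live -= 1
End Destructor

Operator TmpBuf.Let(ByRef rhs As TmpBuf)
    If @This <> @rhs Then *p = *rhs.p
End Operator

Function MakeBuf(ByRef s As String) As TmpBuf
    Dim As TmpBuf result
    *result.p = Left(s, 31)   ' keep within the 32-byte buffer
    Return result
End Function

Sub Show(ByRef b As TmpBuf)
    Print *b.p
End Sub

Sub Demo()
    Show(MakeBuf("temporary"))            ' the temporary is destroyed by the compiler
    Dim As TmpBuf kept = MakeBuf("kept")  ' destroyed when Demo returns
    Show(kept)
End Sub

Demo()
Print "live instances after Demo:"; g_live
```

No caller ever frees anything explicitly: every instance, named or temporary, is released by its own destructor.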
Title: Re: FF_STRING library
Post by: Paul Squires on July 06, 2016, 03:29:30 PM
Isn't it always just such a great day when you see Jose getting excited about programming!  :)

Thanks James, I will raise a glass to toast you for this discovery. Well done indeed.
Title: Re: FF_STRING library
Post by: James Fuller on July 06, 2016, 03:45:48 PM
It was under our noses all along and described quite well in the help file, not like Private.
I was just lucky I stumbled over it.
I really see no reason to use WSTRING's at all now. CBStr should have most/all bases covered shouldn't it?

Onward and Upward Jose :)

James
Title: Re: FF_STRING library
Post by: Paul Squires on July 06, 2016, 05:22:43 PM
I hope so. I haven't been testing all the code in the thread or playing with CBStr like you and Jose have. I have been working on that editor, and it is hard for me to concentrate on too many things. Hopefully it will work with all of the permutations of operators that can act on the string. Also, if it does work well, then we'll need to code a string builder class, because I imagine hundreds or thousands of string concatenations will be slow. I thought that FB's STRING concatenations would be slow, but it turns out they are extremely fast because a STRING carries extra buffer space and only gets resized when that buffer is full, unlike BSTRs, whose size is equal to the length of the string they hold.
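
A rough sketch of what such a string builder could look like, assuming a capacity that doubles whenever it runs out. WBuilder, AddText and the starting capacity of 64 characters are hypothetical choices, not an existing class:

```freebasic
' Sketch only: a wide-string builder that keeps spare capacity so that
' most appends need no reallocation. WBuilder and AddText are hypothetical.
Type WBuilder
    Declare Destructor()
    Declare Sub AddText(ByRef s As WString)
    As WString Ptr buf
    As Long used     ' characters stored, excluding the terminator
    As Long cap      ' characters available, excluding the terminator
    As Long grows    ' number of reallocations, kept for illustration
End Type

Destructor WBuilder()
    If buf Then Deallocate buf
End Destructor

Sub WBuilder.AddText(ByRef s As WString)
    Dim As Long n = Len(s)
    If n = 0 Then Exit Sub
    If buf = 0 OrElse used + n > cap Then
        Dim As Long newCap = IIf(cap = 0, 64, cap)
        While newCap < used + n
            newCap *= 2                   ' double until the text fits
        Wend
        Dim As WString Ptr p = Reallocate(buf, (newCap + 1) * SizeOf(WString))
        If p = 0 Then Exit Sub            ' out of memory: leave builder unchanged
        buf = p
        cap = newCap
        grows += 1
    End If
    *(buf + used) = s                     ' copies the text and its terminator
    used += n
End Sub

Dim As WBuilder sb
Dim As Long k
For k = 1 To 1000
    sb.AddText("ab")
Next
Print sb.used; " characters after"; sb.grows; " reallocations"
```

Because the capacity doubles, the number of reallocations grows with the logarithm of the final length rather than with the number of appends.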
Title: Re: FF_STRING library
Post by: James Fuller on July 06, 2016, 05:34:53 PM
Paul,
  FreeBasic's String handling is the best.
Peter did some comparisons to his new string handling in Bacon.

James
Title: Re: FF_STRING library
Post by: James Fuller on July 06, 2016, 06:33:31 PM
I found another item that might be useful in the help file?
Memory Operators -> Operator Placement New

James

Title: Re: FF_STRING library and Unicode dynamic string
Post by: Marc Pons on July 07, 2016, 08:59:17 AM
Hi! Very happy to see Jose so active in this forum (I was learning C/C++ during the last months).

May I give my contribution to the Unicode story for FreeBASIC?

Some months ago, I tried to create a Unicode dynamic string type (I wanted to do a C exercise),
and I have continued it in FreeBASIC.

Here is the latest evolution: the uStringW type.

It is similar to the FreeBASIC dynamic string: created with a constructor, allocating memory in steps (32 at a time, when needed), freed automatically by the destructor depending on scope, and clearing all remaining allocations when the program exits.
The uStringW type is intended to work in Windows mode (UTF-16 internally) or Linux mode (UTF-32).
For the Windows mode, surrogate pairs have been taken into account.
Some functions to manipulate the uStringW type are also included: mid, instr, left, right, len, reverse, replace...

The attached rar file includes the source file, Dyn_Wstring.bi,
some example files to test,
plus 4 files (various UTF formats) to play with.

Your comments are welcome  ;)
Marc

updated attachment
Title: Re: FF_STRING library and Unicode dynamic string
Post by: James Fuller on July 07, 2016, 09:42:01 AM
Quote from: Marc Pons on July 07, 2016, 08:59:17 AM
the attached rar file includes the source file as : Dyn_Wstring.bi
and some example files to test
plus 4 files ( various utf format ) to play with

Your comments are welcome  ;)
Marc

Afraid not Marc. No Dyn_Wstring.bi in the attachment.

James
Title: Re: FF_STRING library
Post by: Marc Pons on July 07, 2016, 09:59:56 AM
sorry for the mistake

here it is

full this time
Title: Re: FF_STRING library
Post by: José Roca on July 07, 2016, 01:06:34 PM
In fact, if the type is a class with a constructor and destructor, we don't need to use the TYPE syntax, because when you pass it as a parameter or return it as the result of a function, FB does it for you and calls the constructor and later the destructor. What we have to avoid is writing functions that return a handle to a BSTR and passing that handle directly to another function without first storing it in a variable that we will later free. So, instead of BSTR, we will use CBSTR if we want the memory to be freed automatically.

That is, instead of writing code that generates temporary BSTRs, we should write code that generates temporary CBSTRs.

At least this has been useful to clarify the question.
Title: Re: FF_STRING library
Post by: José Roca on July 07, 2016, 01:21:12 PM
Likewise, in the case of variants we can't just use the TYPE syntax to create a temporary one, unless it only holds scalar types, because the compiler will just free the memory used by the structure; for variants that store pointers to BSTRs, object references, etc., we need to call VariantClear. So, we need to wrap a variant in a class, as in the case of BSTRs.
Title: Re: FF_STRING library
Post by: José Roca on July 07, 2016, 03:29:52 PM
Therefore, these methods that I removed from the CBSTR class because they can create memory leaks:


' ========================================================================================
' Concatenates two WSTRINGs and returns a new BSTR.
' ========================================================================================
FUNCTION CBStr.Concat (BYREF wszStr1 AS CONST WSTRING, BYREF wszStr2 AS CONST WSTRING) AS BSTR
   DIM n1 AS INTEGER, n2 AS INTEGER, b AS BSTR
   n1 = .LEN(wszStr1)
   n2 = .LEN(wszStr2)
   b = SysAllocStringLen(NULL, n1+n2)
   IF b = NULL THEN EXIT FUNCTION
   IF n1 THEN memcpy(b, @wszStr1, n1 * SIZEOF(WSTRING))
   IF n2 THEN memcpy(b+n1, @wszStr2, n2 * SIZEOF(WSTRING))
   FUNCTION = b
END FUNCTION
' ========================================================================================

' ========================================================================================
' Concatenates two BSTRs and returns a new BSTR.
' Usage:
' DIM b1 AS CBStr = "Test string 1"
' DIM b2 AS CBStr = " - concatenated string"
' DIM b AS CBStr
' b = b1 & b2
' Print **b
' ========================================================================================
OPERATOR & (BYREF pBStr1 AS CBStr, BYREF pBStr2 AS CBStr) AS BSTR
   OPERATOR = pBStr1.Concat(*pBStr1.Handle, *pBStr2.Handle)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Concatenates a BSTR and a WSTRING and returns a new BSTR.
' Usage:
' DIM b1 AS CBStr = "Test string 1"
' DIM wsz AS WSTRING * 250 = " - concatenated string"
' DIM b AS CBStr
' b = b1 & wsz
' Print **b
' Note: Instead of wsz, we can also pass an string literal or a FB string
' ========================================================================================
OPERATOR & (BYREF pBStr AS CBStr, BYREF wszStr AS WSTRING) AS BSTR
   OPERATOR = pBStr.Concat(*pBStr.Handle, wszStr)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Concatenates a WSTRING and a BSTR and returns a new BSTR.
' Usage:
' DIM wsz AS WSTRING * 250 = "Test string 1"
' DIM b1 AS CBStr = " - concatenated string"
' DIM b AS CBStr
' b = wsz & b1
' Print **b
' Note: Instead of wsz, we can also pass an string literal or a FB string
' ========================================================================================
OPERATOR & (BYREF wszStr AS WSTRING, BYREF pBStr AS CBStr) AS BSTR
   OPERATOR = pBStr.Concat(wszStr, *pBStr.Handle)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Concatenates two BSTRs and returns a new BSTR.
' Usage:
' DIM b1 AS CBStr = "Test string 1"
' DIM b2 AS CBStr = " - concatenated string"
' DIM b AS CBStr
' b = b1 + b2
' Print **b
' ========================================================================================
OPERATOR + (BYREF pBStr1 AS CBStr, BYREF pBStr2 AS CBStr) AS BSTR
   OPERATOR = pBStr1.Concat(*pBStr1.Handle, *pBStr2.Handle)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Concatenates a BSTR and a WSTRING and returns a new BSTR.
' Usage:
' DIM b1 AS CBStr = "Test string 1"
' DIM wsz AS WSTRING * 250 = " - concatenated string"
' DIM b AS CBStr
' b = b1 + wsz
' Print **b
' Note: Instead of wsz, we can also pass an string literal or a FB string
' ========================================================================================
OPERATOR + (BYREF pBStr AS CBStr, BYREF wszStr AS WSTRING) AS BSTR
   OPERATOR = pBStr.Concat(*pBStr.Handle, wszStr)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Concatenates a WSTRING and a BSTR and returns a new BSTR.
' Usage:
' DIM wsz AS WSTRING * 250 = "Test string 1"
' DIM b1 AS CBStr = " - concatenated string"
' DIM b AS CBStr
' b = wsz + b1
' Print **b
' Note: Instead of wsz, we can also pass an string literal or a FB string
' ========================================================================================
OPERATOR + (BYREF wszStr AS WSTRING, BYREF pBStr AS CBStr) AS BSTR
   OPERATOR = pBStr.Concat(wszStr, *pBStr.Handle)
END OPERATOR
' ========================================================================================


will have to be reworked to return a CBSTR instead of a BSTR.

For example, in CBStr.Concat, instead of returning the variable b, that is a BSTR, I will create a temporary instance of CBSTR, attach the BSTR to it and return the CBSTR.
Title: Re: FF_STRING library
Post by: José Roca on July 07, 2016, 03:45:56 PM
Marc,

I appreciate your code. Thanks very much for collaborating.

However, since this project is only for Windows, I think that we can avoid so much complexity. We only need to support UTF-16 little-endian.

Title: Re: FF_STRING library
Post by: Marc Pons on July 08, 2016, 04:18:06 AM
Jose,

You are right, Windows only needs UTF-16 LE... at least to work with the API.
But it is also necessary to convert from/to UTF-8 and UTF-32 (LE or BE) for input/output.

However, UTF-16 is the really complex point; playing with surrogate pairs is a real pain (unless you work only with UCS-2).

The part added to work with Unix/Linux systems is minimal (it uses UTF-32), because there is no surrogate pair story.

And it is quite easy to simplify the code to keep only the Windows version, or both with conditional compilation...


The code I am proposing takes care of variable-length Unicode strings without leaks (without needing BSTR, SysAlloc...).

But, you know, the problem with Unicode (and worse with surrogates) is the string manipulation functions.
Even a function like len is difficult, because you either take care of surrogates or you only have a naive len function that counts UTF-16 code units,
and even solving that you only get the number of Unicode code points, which is not necessarily the number of characters.
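
As an illustration of the len problem, here is a minimal sketch that counts code points in UTF-16 data by treating a valid surrogate pair as one unit. CountCodePoints is a hypothetical helper and is not taken from Dyn_Wstring.bi; it works on raw UShort values so that it behaves the same on Windows and Linux:

```freebasic
' Sketch only: count code points in a run of UTF-16 code units.
' CountCodePoints is a hypothetical helper for illustration.
Function CountCodePoints(ByVal units As Const UShort Ptr, ByVal unitCount As Long) As Long
    Dim As Long i = 0, count = 0
    While i < unitCount
        Dim As UShort u = units[i]
        ' A high surrogate followed by a low surrogate is one code point
        If u >= &hD800 AndAlso u <= &hDBFF AndAlso i + 1 < unitCount Then
            Dim As UShort trail = units[i + 1]
            If trail >= &hDC00 AndAlso trail <= &hDFFF Then i += 1
        End If
        count += 1
        i += 1
    Wend
    Return count
End Function

' "A" followed by U+1F600, which UTF-16 encodes as the pair D83D DE00
Dim As UShort sample(0 To 2) = { &h0041, &hD83D, &hDE00 }
Print CountCodePoints(@sample(0), 3)   ' 3 code units, 2 code points
```

Unpaired surrogates are counted as one code point each here, which is only one of several possible policies, and combining sequences are still not merged into single visible characters.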

The reason I'm interested in posting the code here is to get some comments on how it works in your own environment, what difficulties you find...
You and some others in this forum are much more interested in this subject than the people in the FreeBASIC forum
(probably because Unicode is crucial in professional work, and today FreeBASIC has not reached that state).

I understand that Paul, James... have not yet found a real solution, and that you are going down the BSTR route,

which interests me too, more for the COM perspective.

To finish, I would be very happy if some of the guys here could test it and give feedback.