FF_STRING library

Started by James Fuller, July 04, 2016, 11:15:36 AM


Marc Pons

#30
sorry for the mistake

here it is

full this time

José Roca

#31
In fact, if the type is a class with a constructor and destructor, we don't need to use the TYPE syntax, because when you pass it as a parameter or return it as the result of a function, FB does it for you: it calls the constructor and, later, the destructor. What we have to avoid is writing functions that return a handle to a BSTR and passing that result directly to another function, without first storing it in a variable that we will free later. So, instead of BSTR, we will use CBSTR if we want the memory to be freed automatically.

That is, instead of writing code that generates temporary BSTRs, we should write code that generates temporary CBSTRs.
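To make the lifetime rule above visible, here is a minimal sketch. The Tracer type and the MakeTracer and ShowTracer procedures are hypothetical names invented for this illustration; they are not part of CBSTR or of any library. The type only prints when it is constructed, copied and destroyed, so you can watch what FB does with temporaries that are never stored in a named variable.

```freebasic
' Hypothetical type used only to trace object lifetime; not part of any library.
TYPE Tracer
   id AS LONG
   DECLARE CONSTRUCTOR (BYVAL n AS LONG)
   DECLARE CONSTRUCTOR (BYREF rhs AS Tracer)
   DECLARE DESTRUCTOR ()
END TYPE

CONSTRUCTOR Tracer (BYVAL n AS LONG)
   id = n
   PRINT "constructor "; id
END CONSTRUCTOR

CONSTRUCTOR Tracer (BYREF rhs AS Tracer)
   id = rhs.id
   PRINT "copy constructor "; id
END CONSTRUCTOR

DESTRUCTOR Tracer ()
   PRINT "destructor "; id
END DESTRUCTOR

' Returns a temporary object as the function result.
FUNCTION MakeTracer (BYVAL n AS LONG) AS Tracer
   RETURN Tracer(n)
END FUNCTION

SUB ShowTracer (BYREF t AS Tracer)
   PRINT "using "; t.id
END SUB

' No named variable is involved in either call: the temporaries are
' created and destroyed by the compiler.
ShowTracer(MakeTracer(1))
ShowTracer(Tracer(2))
```

If the destructor released a BSTR instead of printing, the same calls would free the memory without any explicit cleanup, which is the behaviour a plain BSTR return value cannot give.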

At least this has been useful to clarify the question.

José Roca

Likewise, in the case of variants we can't just use the TYPE syntax to create a temporary one, unless it only holds scalar types, because the compiler will just free the memory used by the structure. For variants that store pointers to BSTRs, object references, etc., we need to call VariantClear. So we need to wrap the variant in a class, as in the case of BSTRs.
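A minimal, untested sketch of such a wrapper follows. TinyVariant is a hypothetical name chosen for this illustration, not an existing class, and the sketch assumes the standard FB Windows headers, where VARIANT members such as vt and bstrVal can be accessed directly. VariantInit, VariantCopy, VariantClear and SysAllocString are the real OLE Automation calls.

```freebasic
#include once "windows.bi"
#include once "win/ole2.bi"

' Hypothetical wrapper, only to illustrate the idea; not an existing class.
TYPE TinyVariant
   v AS VARIANT
   DECLARE CONSTRUCTOR ()
   DECLARE CONSTRUCTOR (BYREF wsz AS CONST WSTRING)
   DECLARE CONSTRUCTOR (BYREF rhs AS TinyVariant)
   DECLARE DESTRUCTOR ()
   DECLARE OPERATOR LET (BYREF rhs AS TinyVariant)
END TYPE

CONSTRUCTOR TinyVariant ()
   VariantInit(@v)
END CONSTRUCTOR

' Stores a copy of the string as a BSTR owned by the variant.
CONSTRUCTOR TinyVariant (BYREF wsz AS CONST WSTRING)
   VariantInit(@v)
   v.vt = VT_BSTR
   v.bstrVal = SysAllocString(@wsz)
END CONSTRUCTOR

' Deep copy, so that each instance clears its own contents exactly once.
CONSTRUCTOR TinyVariant (BYREF rhs AS TinyVariant)
   VariantInit(@v)
   VariantCopy(@v, @rhs.v)
END CONSTRUCTOR

' VariantClear releases the BSTR or object reference, not just the structure.
DESTRUCTOR TinyVariant ()
   VariantClear(@v)
END DESTRUCTOR

' VariantCopy clears the destination before copying the source into it.
OPERATOR TinyVariant.LET (BYREF rhs AS TinyVariant)
   IF @THIS <> @rhs THEN VariantCopy(@v, @rhs.v)
END OPERATOR
```

The copy constructor and LET operator matter here: without them FB would copy the structure byte by byte, and two instances would later try to free the same BSTR.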

José Roca

#33
Therefore, these methods, which I removed from the CBSTR class because they can create memory leaks:


' ========================================================================================
' Concatenates two WSTRINGs and returns a new BSTR.
' ========================================================================================
FUNCTION CBStr.Concat (BYREF wszStr1 AS CONST WSTRING, BYREF wszStr2 AS CONST WSTRING) AS BSTR
   DIM n1 AS INTEGER, n2 AS INTEGER, b AS BSTR
   n1 = .LEN(wszStr1)
   n2 = .LEN(wszStr2)
   b = SysAllocStringLen(NULL, n1+n2)
   IF b = NULL THEN EXIT FUNCTION
   IF n1 THEN memcpy(b, @wszStr1, n1 * SIZEOF(WSTRING))
   IF n2 THEN memcpy(b+n1, @wszStr2, n2 * SIZEOF(WSTRING))
   FUNCTION = b
END FUNCTION
' ========================================================================================

' ========================================================================================
' Concatenates two BSTRs and returns a new BSTR.
' Usage:
' DIM b1 AS CBStr = "Test string 1"
' DIM b2 AS CBStr = " - concatenated string"
' DIM b AS CBStr
' b = b1 & b2
' Print **b
' ========================================================================================
OPERATOR & (BYREF pBStr1 AS CBStr, BYREF pBStr2 AS CBStr) AS BSTR
   OPERATOR = pBStr1.Concat(*pBStr1.Handle, *pBStr2.Handle)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Concatenates a BSTR and a WSTRING and returns a new BSTR.
' Usage:
' DIM b1 AS CBStr = "Test string 1"
' DIM wsz AS WSTRING * 250 = " - concatenated string"
' DIM b AS CBStr
' b = b1 & wsz
' Print **b
' Note: Instead of wsz, we can also pass a string literal or an FB string
' ========================================================================================
OPERATOR & (BYREF pBStr AS CBStr, BYREF wszStr AS WSTRING) AS BSTR
   OPERATOR = pBStr.Concat(*pBStr.Handle, wszStr)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Concatenates a WSTRING and a BSTR and returns a new BSTR.
' Usage:
' DIM wsz AS WSTRING * 250 = "Test string 1"
' DIM b1 AS CBStr = " - concatenated string"
' DIM b AS CBStr
' b = wsz & b1
' Print **b
' Note: Instead of wsz, we can also pass a string literal or an FB string
' ========================================================================================
OPERATOR & (BYREF wszStr AS WSTRING, BYREF pBStr AS CBStr) AS BSTR
   OPERATOR = pBStr.Concat(wszStr, *pBStr.Handle)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Concatenates two BSTRs and returns a new BSTR.
' Usage:
' DIM b1 AS CBStr = "Test string 1"
' DIM b2 AS CBStr = " - concatenated string"
' DIM b AS CBStr
' b = b1 + b2
' Print **b
' ========================================================================================
OPERATOR + (BYREF pBStr1 AS CBStr, BYREF pBStr2 AS CBStr) AS BSTR
   OPERATOR = pBStr1.Concat(*pBStr1.Handle, *pBStr2.Handle)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Concatenates a BSTR and a WSTRING and returns a new BSTR.
' Usage:
' DIM b1 AS CBStr = "Test string 1"
' DIM wsz AS WSTRING * 250 = " - concatenated string"
' DIM b AS CBStr
' b = b1 + wsz
' Print **b
' Note: Instead of wsz, we can also pass a string literal or an FB string
' ========================================================================================
OPERATOR + (BYREF pBStr AS CBStr, BYREF wszStr AS WSTRING) AS BSTR
   OPERATOR = pBStr.Concat(*pBStr.Handle, wszStr)
END OPERATOR
' ========================================================================================

' ========================================================================================
' Concatenates a WSTRING and a BSTR and returns a new BSTR.
' Usage:
' DIM wsz AS WSTRING * 250 = "Test string 1"
' DIM b1 AS CBStr = " - concatenated string"
' DIM b AS CBStr
' b = wsz + b1
' Print **b
' Note: Instead of wsz, we can also pass a string literal or an FB string
' ========================================================================================
OPERATOR + (BYREF wszStr AS WSTRING, BYREF pBStr AS CBStr) AS BSTR
   OPERATOR = pBStr.Concat(wszStr, *pBStr.Handle)
END OPERATOR
' ========================================================================================


will have to be reworked to return a CBSTR instead of a BSTR.

For example, in CBStr.Concat, instead of returning the variable b, which is a BSTR, I will create a temporary instance of CBSTR, attach the BSTR to it, and return the CBSTR.
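The rework described above might look roughly like the following untested sketch. It is not the actual CBSTR code: TinyBStr, its Attach method and the free function ConcatW are hypothetical names used only for illustration, and the body of ConcatW simply reuses the copying logic of the removed Concat method.

```freebasic
#include once "windows.bi"
#include once "win/ole2.bi"
#include once "crt/string.bi"   ' memcpy

' Hypothetical minimal owner of a BSTR; a stand-in for CBSTR, not its real code.
TYPE TinyBStr
   m_bstr AS BSTR
   DECLARE CONSTRUCTOR ()
   DECLARE CONSTRUCTOR (BYREF rhs AS TinyBStr)
   DECLARE DESTRUCTOR ()
   DECLARE OPERATOR LET (BYREF rhs AS TinyBStr)
   DECLARE SUB Attach (BYVAL b AS BSTR)
END TYPE

CONSTRUCTOR TinyBStr ()
   m_bstr = NULL
END CONSTRUCTOR

' Deep copy, so that every instance frees its own BSTR exactly once.
CONSTRUCTOR TinyBStr (BYREF rhs AS TinyBStr)
   m_bstr = SysAllocStringLen(rhs.m_bstr, SysStringLen(rhs.m_bstr))
END CONSTRUCTOR

DESTRUCTOR TinyBStr ()
   SysFreeString(m_bstr)   ' SysFreeString accepts NULL
END DESTRUCTOR

OPERATOR TinyBStr.LET (BYREF rhs AS TinyBStr)
   IF @THIS = @rhs THEN EXIT OPERATOR
   SysFreeString(m_bstr)
   m_bstr = SysAllocStringLen(rhs.m_bstr, SysStringLen(rhs.m_bstr))
END OPERATOR

' Takes ownership of an existing BSTR; the caller must not free it afterwards.
SUB TinyBStr.Attach (BYVAL b AS BSTR)
   SysFreeString(m_bstr)
   m_bstr = b
END SUB

' Same copying logic as the removed Concat, but the result is an owning object.
FUNCTION ConcatW (BYREF wszStr1 AS CONST WSTRING, BYREF wszStr2 AS CONST WSTRING) AS TinyBStr
   DIM n1 AS INTEGER = LEN(wszStr1)
   DIM n2 AS INTEGER = LEN(wszStr2)
   DIM cbs AS TinyBStr
   DIM b AS BSTR = SysAllocStringLen(NULL, n1 + n2)
   IF b = NULL THEN RETURN cbs
   IF n1 THEN memcpy(b, @wszStr1, n1 * SIZEOF(WSTRING))
   IF n2 THEN memcpy(b + n1, @wszStr2, n2 * SIZEOF(WSTRING))
   cbs.Attach(b)
   RETURN cbs
END FUNCTION

DIM r AS TinyBStr = ConcatW("Test string 1", " - concatenated string")
PRINT *r.m_bstr
```

Returning the local object costs an extra copy through the copy constructor, but the caller can now pass the result straight to another procedure, because whoever holds the object frees the BSTR.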

José Roca

Marc,

I appreciate your code. Thanks very much for collaborating.

However, since this project is only for Windows, I think we can avoid that much complexity. We only need to support UTF-16 little-endian.


Marc Pons

Jose,

You are right, Windows only needs UTF-16 LE... at least to work with the API.
But it is also necessary to convert from/to UTF-8 and UTF-32 (LE or BE) for input/output.

However, UTF-16 is the really complex part: dealing with surrogate pairs is a real pain (unless you restrict yourself to UCS-2).

The part added to work with Unix/Linux systems is minimal (it uses UTF-32), because there are no surrogate pairs to deal with.

And it is quite easy to simplify the code to keep only the Windows version, or to keep both with conditional compilation...


The code I am proposing takes care of variable-length Unicode strings without leaks (without needing BSTR, SysAlloc...).

But, you know, the problem with Unicode (and worse with surrogates) is the string manipulation functions.
Even a function like len is difficult, because you have to take care of surrogates, or else you only have a naive len function;
and even after solving that, you only have the number of Unicode code points, which is not necessarily the number of characters.
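As a small illustration of the len problem, the sketch below counts code points in a UTF-16 string by treating each high/low surrogate pair as one unit. CodePointCount is a hypothetical helper written for this example, not part of the proposed library, and it assumes 16-bit WSTRING units, as on Windows.

```freebasic
' Hypothetical helper: counts Unicode code points in a UTF-16 string.
' Assumes WSTRING holds 16-bit code units (Windows).
FUNCTION CodePointCount (BYREF ws AS CONST WSTRING) AS LONG
   DIM AS LONG n = LEN(ws), count = 0, i = 0
   DO WHILE i < n
      ' A high surrogate followed by a low surrogate forms one code point.
      IF ws[i] >= &hD800 ANDALSO ws[i] <= &hDBFF ANDALSO i + 1 < n ANDALSO _
         ws[i + 1] >= &hDC00 ANDALSO ws[i + 1] <= &hDFFF THEN
         i += 2
      ELSE
         ' BMP character, or an unpaired surrogate counted on its own.
         i += 1
      END IF
      count += 1
   LOOP
   RETURN count
END FUNCTION

DIM s AS WSTRING * 8
s = "a" & WCHR(&hD83D) & WCHR(&hDE00) & "b"   ' U+1F600 stored as a surrogate pair
PRINT "UTF-16 code units: "; LEN(s)
PRINT "code points:       "; CodePointCount(s)
```

Even this count is not the number of characters a user sees, since a base letter followed by combining marks is several code points but one visible character.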

The reason I'm interested in posting the code here is to get some comments on how it works in your own environment, what difficulties you find...
You and some others in this forum are much more interested in this subject than the people in the FreeBASIC forum
(probably because Unicode is crucial for professional work, and today FreeBASIC has not reached that state).

I understand that, as of today, Paul / James ... still have not found a real solution, and that you are going down the BSTR route.

I'm interested in that too, more for the COM perspective.

To finish, I would be very happy if some of the people here could test it and give feedback.