PlanetSquires Forums

Topic started by: James Fuller on July 08, 2016, 12:15:20 PM

Title: Dyn_WString Discussion
Post by: James Fuller on July 08, 2016, 12:15:20 PM
Marc,
  As Jose has mentioned, we are only interested in Windows development here.
I am willing to do some testing, but I think a Windows-only include might serve as a better foundation.
Were you aware of the Temporary Type before you started development?

James
Title: Re: Dyn_WString Discussion
Post by: Marc Pons on July 08, 2016, 01:00:04 PM
James,
> a windows only include might serve a better foundation

I will make one if needed, but remember that the extra complexity of adding the Linux version is really minimal...
UTF-16 is where the complexity lies.

> Were you aware of the Temporary Type before you started development

Yes, but as I understand it, it is intended more for object-oriented coding.

For uStringW, the job can be done with a type plus constructors, operators and destructors,
which handle implicit assignment from/to string and wstring ptr, and concatenation.
As I wrote in another post, the need for the pseudo linked list is almost nil; it was there only because at the beginning I was not sure about the automatic freeing done by the destructor.

Next week I will probably push a simplified .bi file targeting Windows only and without the pseudo linked list, which will simplify the code.

I will also check with a memory-leak tool...
Title: Re: Dyn_WString Discussion
Post by: José Roca on July 08, 2016, 02:17:02 PM
James,

You don't need a Windows version only for doing tests. Don't burden Marc with unneeded work.

As I have said in the other thread, my main purpose in writing the CBSTR class is to use it in COM programming, and for that I need real BSTRs. If I can use the class for other purposes, like writing wrapper procedures that deal with unicode strings, so much the better.
Title: Re: Dyn_WString Discussion
Post by: James Fuller on July 08, 2016, 02:37:20 PM
Jose,
  Understood and agree.
James
Title: Re: Dyn_WString Discussion
Post by: José Roca on July 08, 2016, 03:50:16 PM
I'm not sure you really understand all the implications, so let's use a little example.

Imagine that I want to call a procedure that returns a BSTR that I want to pass to another function.


FUNCTION Foo () AS BSTR
   FUNCTION = SysAllocString("James")
END FUNCTION


I have to do


DIM bs AS BSTR = Foo
' Pass it to the function
SysFreeString bs


That is, I can't pass Foo directly to the other function without a memory leak. I have to use an intermediate step, first assigning the result to a variable so that I can free it later.

But if I use


FUNCTION Foo () AS CBSTR
   FUNCTION = SysAllocString("James")
END FUNCTION


I can use


MessageBoxW 0, Foo, "", MB_OK


Without having to worry about freeing the memory.

How does it work?

FUNCTION = SysAllocString("James")

SysAllocString("James") creates a new BSTR and returns a handle to it.

FUNCTION = tries to assign this handle to the return type. As the returned type is a CBSTR, it calls the constructor of that class and then the LET operator of that class.

As the BSTR handle has been attached to that class, it will be freed when the class is freed or goes out of scope or you assign another value to it.

If I need to call a procedure that has an out BSTR parameter, instead of declaring a variable as BSTR, pass it to the procedure and then free the BSTR when no longer needed, I can declare the variable as CBSTR and pass the address of the underlying BSTR, e.g. Foo(*cbs), and don't have to worry about freeing the memory.

So, for COM programming, this is a good solution. I will do something similar for variants.
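
The ownership idea described above can be seen in isolation with a toy model. This is a sketch in Python, not FreeBASIC and not José's CBSTR code: `FakeBstrHeap`, `OwnedBstr`, `foo_raw` and `foo_owned` are hypothetical names, and the "heap" is just a dictionary standing in for SysAllocString / SysFreeString so that leaks can be counted.

```python
class FakeBstrHeap:
    """Stand-in for SysAllocString / SysFreeString that tracks live blocks."""

    def __init__(self):
        self._next_handle = 1
        self.live = {}

    def alloc(self, text):
        handle = self._next_handle
        self._next_handle += 1
        self.live[handle] = text
        return handle

    def free(self, handle):
        # Like SysFreeString, freeing a null handle is a no-op.
        if handle:
            del self.live[handle]


class OwnedBstr:
    """Hypothetical owning wrapper: frees its handle exactly once."""

    def __init__(self, heap, handle=0):      # plays the role of the constructor
        self._heap = heap
        self._handle = handle

    def assign(self, handle):                # plays the role of the LET operator
        self.release()                       # the old value is freed first
        self._handle = handle

    def release(self):                       # plays the role of the destructor
        if self._handle:
            self._heap.free(self._handle)
            self._handle = 0

    def text(self):
        return self._heap.live[self._handle]

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.release()


def foo_raw(heap):
    return heap.alloc("James")               # caller must remember to free


def foo_owned(heap):
    return OwnedBstr(heap, heap.alloc("James"))


if __name__ == "__main__":
    heap = FakeBstrHeap()
    foo_raw(heap)                            # result dropped: the block leaks
    print("live after raw call:", len(heap.live))
    with foo_owned(heap) as s:               # wrapper frees on scope exit
        print("inside scope:", s.text())
    print("live after owned call:", len(heap.live))
```

The raw call leaves one block allocated because nobody frees it, while the wrapped result is released when its scope ends. In FreeBASIC the destructor runs automatically, so the explicit `with` block here is only the Python way of marking where the temporary goes out of scope.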
Title: Re: Dyn_WString Discussion
Post by: José Roca on July 08, 2016, 04:07:30 PM
And this


FUNCTION Foo () AS CBSTR
   FUNCTION = SysAllocString("James")
END FUNCTION


is even better than


FUNCTION Foo () AS CBSTR
   FUNCTION = TYPE<CBSTR>(SysAllocString("James"))
END FUNCTION


because in the first case, the constructor is called once, whereas in the second case it is called twice.

So we don't really need to create temporary types with TYPE<CBSTR>, although thanks to the tests I did with it, I discovered that it also works without it.
Title: Re: Dyn_WString Discussion
Post by: Marc Pons on July 10, 2016, 08:12:59 AM
New adapted Windows version, plus some adjustments.

In the attached file you will find the new .bi file plus a sample test to verify the speed of concatenation... for me it is very efficient!


I have been interested in the stringbuilder approach done by Paul, but in my view it is just a workaround.

If people really need an "almost native" unicode dynamic string, it is not really helpful.

If users have to resort to that kind of workaround on a generic type, it is better to use wstring * xx or wstring ptr directly.
For me, the unicode dynamic string has to be general purpose and simple/easy; probably not perfect, but in keeping with the Basic philosophy.

BSTR and VARIANT are needed for COM, and they can probably use a more sophisticated approach, but why generalise the difficulty
with different namespaces, a stringbuilder ...?

Another question (probably for Jose): for BSTR, is it necessary to use SysAllocString? Why not emulate those functions to produce a BSTR-like type allocated/reallocated by malloc? Is a specific memory area needed? If not, a "normal" unicode type could just be converted to a BSTR struct to do the job. Am I dreaming?

As always, your remarks/comments are welcome.



Title: Re: Dyn_WString Discussion
Post by: José Roca on July 10, 2016, 11:48:47 AM
1. With the latest version of the CBSTR class you can work as if it was a native type, using FB native operators and string functions.

2. Concatenations are slow in any language.

3. For normal use, you don't need the string builder class. It is only there to boost performance when you have to do many concatenations.

Quote
Another question (probably for Jose): for BSTR, is it necessary to use SysAllocString? Why not emulate those functions to produce a BSTR-like type allocated/reallocated by malloc? Is a specific memory area needed? If not, a "normal" unicode type could just be converted to a BSTR struct to do the job. Am I dreaming?

Yes, it is necessary. A WSTRING can be used with functions, COM or not, that expect a WSTRING, but not with a function or COM method that expects a BSTR. If you pass a WSTRING, when that function or method tries to retrieve its length with SysStringLen, it will get a wrong value and will fail or crash.
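
Why SysStringLen gets a wrong value can be shown with a byte-level model. A BSTR pointer points at the first character, and the four bytes just before it hold the byte length of the character data; a plain null-terminated wide string has no such prefix. The Python sketch below only simulates that layout in a byte buffer. `make_bstr`, `make_wstring` and `sys_string_len` are hypothetical helper names, and the "garbage" bytes in front of the plain string are an arbitrary choice standing in for whatever happens to be in memory.

```python
import struct


def make_bstr(text):
    """Return (memory, pointer_offset) laid out like a BSTR."""
    data = text.encode("utf-16-le")
    # 4-byte length prefix (in bytes), the characters, then a 2-byte terminator.
    memory = struct.pack("<I", len(data)) + data + b"\x00\x00"
    return memory, 4                      # the pointer refers to the first character


def make_wstring(text, preceding=b"\xde\xad\xbe\xef"):
    """Return (memory, pointer_offset) for a plain null-terminated wide string.

    `preceding` models unrelated bytes that happen to sit before the string.
    """
    data = text.encode("utf-16-le") + b"\x00\x00"
    return preceding + data, len(preceding)


def sys_string_len(memory, pointer_offset):
    """Model of SysStringLen: read the 4 bytes before the pointer."""
    (byte_count,) = struct.unpack_from("<I", memory, pointer_offset - 4)
    return byte_count // 2                # length in characters


if __name__ == "__main__":
    print(sys_string_len(*make_bstr("James")))      # the real length
    print(sys_string_len(*make_wstring("James")))   # whatever the garbage decodes to
```

A hand-built block with the same layout would pass this length check, but any code that later calls SysFreeString or SysReAllocString on it expects memory that came from the system BSTR allocator, so the layout alone is not the whole contract.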

But SysAllocString is not the problem. I will repeat it again: the problem is multiple concatenations.

Have you tried to do this using your class?


dim as uStringW utst = "Paul Squires"

for i as long = 1 to 10000
   utst = utst & " " & WSTR(i) & " " & utst
next


I had to reboot my computer twice.
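
The loop above appends the string to itself on every pass, so its length roughly doubles each time, whatever string type is used. The Python sketch below only does the arithmetic on the lengths and never builds the string. It assumes the counter is formatted as plain digits with no padding (PowerBASIC's STR$ adds a leading space, which would make the numbers slightly larger); `lengths` and `first_pass_exceeding` are hypothetical helper names.

```python
def lengths(iterations, start=12):
    """Character count after each pass of  s = s & " " & str(i) & " " & s."""
    out = []
    length = start                        # len("Paul Squires") == 12
    for i in range(1, iterations + 1):
        # Two copies of the old string, two spaces, and the digits of i.
        length = 2 * length + 2 + len(str(i))
        out.append(length)
    return out


def first_pass_exceeding(limit_chars, start=12):
    """Number of the first pass whose result is longer than limit_chars."""
    length, i = start, 0
    while length <= limit_chars:
        i += 1
        length = 2 * length + 2 + len(str(i))
    return i


if __name__ == "__main__":
    print(lengths(5))
    # 2**30 UTF-16 code units occupy 2 GiB.
    print(first_pass_exceeding(2**30))
    print(len(str(lengths(100)[-1])), "decimal digits in the length after 100 passes")
```

Under these assumptions the string passes 2^30 characters (2 GiB as UTF-16) on pass 27, so the loop cannot get anywhere near 10000 passes, or even 100, with any string implementation.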


Title: Re: Dyn_WString Discussion
Post by: Marc Pons on July 10, 2016, 01:29:54 PM
Jose: nice try  ;)

But I think you are joking...

Try it in PB, just with normal strings:

#COMPILE EXE
#DIM ALL
#INCLUDE ONCE "windows.inc"

FUNCTION PBMAIN () AS LONG

local i as integer
local tst as string
? "start string concat"
tst = "Paul Squires"
for i = 1 to 100
   tst =  tst & " " & STR$(i) & " " & tst
NEXT
? "done as string concat"

END FUNCTION


If you want to demonstrate something (the CBSTR class is a very good solution for COM, I have no doubt),
you have to be fair in your comparison: try your sample using CBSTR. If you had to reboot twice with the uStringW type, be prepared to reboot 20 times!

More seriously, your CBSTR is a "must", but it is not very efficient on multiple concatenations, due to the BSTR type,

as Paul just showed with his little test (that is why I made the stringbuilder class).
You know perfectly well that my attempt with uStringW is faster than the normal dynamic FB string in some concatenations, and comparable in most cases.
Implicit/explicit conversions add extra time, but still not at the level of CBSTR's exponential increase.

Please, you are smart enough not to play that game.

> 1. With the latest version of the CBSTR class you can work as if it was a native type, using FB native operators and string functions.

With uStringW it is the same; all the string operators work.

But that applies to CBSTR or uStringW only for UCS-2 (2-byte wstring).
It does not apply to Unicode characters above 65535, which are stored as surrogate pairs; that is why I added extra functions with a u_ prefix, to extend the string manipulation functions to extended Unicode. Try to reverse a string that contains extended codes, or use mid or instr on it, and you will break the unicode string...
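
The breakage is easy to reproduce outside FreeBASIC. The Python sketch below is only an illustration and not the uStringW code: it reverses a string once per 16-bit code unit, as a UCS-2 routine would, and once per code point. `utf16_units`, `units_to_text`, `reverse_naive` and `reverse_by_code_point` are hypothetical helper names, and U+1F600 is just an arbitrary character above 65535.

```python
import struct


def utf16_units(text):
    """Split text into its 16-bit UTF-16 code units."""
    raw = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


def units_to_text(units):
    """Decode code units; raises UnicodeDecodeError on a broken surrogate pair."""
    raw = struct.pack("<%dH" % len(units), *units)
    return raw.decode("utf-16-le")


def reverse_naive(text):
    """Reverse unit by unit, the way a UCS-2 string routine would."""
    return units_to_text(utf16_units(text)[::-1])


def reverse_by_code_point(text):
    """Reverse whole code points, keeping each surrogate pair together."""
    return text[::-1]                     # Python strings index by code point


if __name__ == "__main__":
    sample = "a\U0001F600b"               # U+1F600 needs two code units
    print([hex(u) for u in utf16_units(sample)])
    print(reverse_by_code_point(sample) == "b\U0001F600a")
    try:
        reverse_naive(sample)
    except UnicodeDecodeError as err:
        print("naive reverse produced invalid UTF-16:", err.reason)
```

The unit-wise reversal puts the low surrogate in front of the high one, which is not valid UTF-16. The same kind of split happens when a mid-style function cuts between the two halves of a pair.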

With uStringW you can directly assign or concatenate with string, wstring ptr, long, double and, obviously, uStringW.
Concatenation operators available: &=, +=, & and + (even for long or double, without explicit conversion).

May I press you on your answer:

Quote

    Another question (probably for Jose): for BSTR, is it necessary to use SysAllocString? Why not emulate those functions to produce a BSTR-like type allocated/reallocated by malloc? Is a specific memory area needed? If not, a "normal" unicode type could just be converted to a BSTR struct to do the job. Am I dreaming?


Yes, it is necessary. A WSTRING can be used with functions, COM or not, that expect a WSTRING, but not with a function or COM method that expects a BSTR. If you pass a WSTRING, when that function or method tries to retrieve its length with SysStringLen, it will get a wrong value and will fail or crash.


I fully understand the need for the BSTR "format" in COM. My question is: is it acceptable to mimic that BSTR "format" without the SysAllocString family of functions, which are not friendly, and to write our own functions/subs that reproduce the expected format?
If the answer is yes, we could avoid using BSTR on the FreeBASIC side, working with a simpler unicode type and converting in/out when COM needs it.
Always with the idea of keeping the difficulty where it is needed and keeping normal usage easy, not raising the level of difficulty when it is not needed (specific namespaces, for example...).
Title: Re: Dyn_WString Discussion
Post by: Marc Pons on July 10, 2016, 01:39:28 PM
A mistake:

"and as Paul was just showing with his little test (that is why I made the stringbuilder class)"

should be: "and as Paul was just showing with his little test (that is why he made the stringbuilder class)"

Sorry Paul (give to Caesar what is Caesar's).
Title: Re: Dyn_WString Discussion
Post by: Johan Klassen on July 10, 2016, 02:07:32 PM
Hello all,
I hope this small tip, which I was unaware of until recently, is useful: you can assign a BSTR to the function and then free the temporary BSTR before the function exits.
Please don't laugh at the code; it's only meant to illustrate freeing the BSTR before function exit. If you already knew this, excuse my intrusion.

Function StringToBSTR( cnv_string As ZString ) As BSTR
    Dim As BSTR sb
    Dim As integer length
    length = MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, @cnv_string, -1, NULL, 0)
    sb=SysAllocStringLen(sb,length)
    MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, @cnv_string, -1, sb, length)
    Function = sb
    SysFreeString(sb)
End Function

Title: Re: Dyn_WString Discussion
Post by: James Fuller on July 10, 2016, 02:19:57 PM
I need a bit of clarification.
What does #define unicode in source actually do?
I assume all api calls will use the "W" version but what about  text fields in a gui?
Are these unicode? If so are they utf16? or utf2?

James

Title: Re: Dyn_WString Discussion
Post by: José Roca on July 10, 2016, 09:27:23 PM
> I assume all api calls will use the "W" version but what about  text fields in a gui?
> Are these unicode? If so are they utf16? or utf2?

If they are created with CreateWindowExW they will support unicode; otherwise, they won't.

Windows uses utf-16.
Title: Re: Dyn_WString Discussion
Post by: José Roca on July 10, 2016, 09:34:13 PM
Quote
I hope this small tip, which I was unaware of until recently, is useful: you can assign a BSTR to the function and then free the temporary BSTR before the function exits.

I wish it were that easy. The function is returning a pointer to a BSTR that no longer exists. Later use of this pointer may cause a crash.
Title: Re: Dyn_WString Discussion
Post by: James Fuller on July 10, 2016, 09:46:01 PM
Quote from: Jose Roca on July 10, 2016, 09:27:23 PM
> I assume all api calls will use the "W" version but what about  text fields in a gui?
> Are these unicode? If so are they utf16? or utf2?

If they are created with CreateWindowExW they will support unicode; otherwise, they won't.

Windows uses utf-16.

And this would be the default if #define unicode is used and the call is just CreateWindowEx?

James
Title: Re: Dyn_WString Discussion
Post by: José Roca on July 10, 2016, 10:07:36 PM
Yes. If you define unicode, the unicode version will be called; otherwise, the ansi version.