Marc,
As Jose has mentioned, we are only interested in Windows development here.
I am willing to do some testing, but I think a Windows-only include might serve as a better foundation.
Were you aware of the Temporary Type before you started development?
James
James,
> a windows only include might serve a better foundation
I will make it if needed, but remember that the extra complexity of adding the Linux version is really minimal...
UTF-16 is where the complexity lies.
> Were you aware of the Temporary Type before you started development
Yes, but as I understand it, it is more intended to be used for object coding.
For uStringW, the job can be done with type + constructors + operators + destructors,
to play with implicit assignment from/to string, wstring ptr, and concat operations.
As I wrote in another post, the need for the pseudo linked list is almost nil; it was there only because at the beginning I was not so sure about the automatic destructor freeing.
I will probably push a simplified .bi file next week, targeting Windows only and without the pseudo linked list; that will simplify the code.
I will also check with a memory leak tool...
James,
You don't need a Windows version only for doing tests. Don't burden Marc with unneeded work.
As I have said in the other thread, my main purpose in writing the CBSTR class is to use it in COM programming, and for that I need real BSTRs. If I can use the class for other purposes, like writing wrapper procedures that deal with unicode strings, so much the better.
Jose,
Understood and agree.
James
Not sure if you really understand all the implications, so let's use a little example.
Imagine that I want to call a procedure that returns a BSTR that I want to pass to another function.
FUNCTION Foo () AS BSTR
FUNCTION = SysAllocString("James")
END FUNCTION
I have to do
DIM bs AS BSTR = Foo
' Pass it to the function
SysFreeString bs
That is, I can't pass Foo directly to the other function without a memory leak. I have to use an intermediate step assigning first the result to a variable to free it later.
But if I use
FUNCTION Foo () AS CBSTR
FUNCTION = SysAllocString("James")
END FUNCTION
I can use
MessageBoxW 0, Foo, "", MB_OK
Without having to worry about freeing the memory.
How does it work?
FUNCTION = SysAllocString("James")
SysAllocString("James") creates a new BSTR and returns a handle to it.
FUNCTION = tries to assign this handle to the return type. As the returned type is a CBSTR, it calls the constructor of that class and then the LET operator of that class.
As the BSTR handle has been attached to that class, it will be freed when the class is freed or goes out of scope or you assign another value to it.
If I need to call a procedure that has an out BSTR parameter, instead of declaring a variable as BSTR, pass it to the procedure and then free the BSTR when no longer needed, I can declare the variable as CBSTR and pass the address of the underlying BSTR, e.g. Foo(*cbs), and don't have to worry about freeing the memory.
So, for COM programming, this is a good solution. I will do something similar for variants.
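The ownership mechanism described above can be sketched with a stripped-down wrapper. This is not the actual CBSTR class, only an illustration: the type name OwnedBstr and its member m_bstr are hypothetical, and the sketch assumes the FreeBASIC Windows headers windows.bi and win/ole2.bi. lstrlenW stands in for any API that expects a wide string, such as MessageBoxW.

```freebasic
#include once "windows.bi"
#include once "win/ole2.bi"

' Hypothetical minimal owner of a BSTR handle (illustration only, not CBSTR).
Type OwnedBstr
   Declare Constructor ()
   Declare Constructor (ByVal h As BSTR)            ' attach: take ownership of h
   Declare Constructor (ByRef other As OwnedBstr)   ' copy: duplicate the BSTR
   Declare Destructor ()
   Declare Operator Let (ByVal h As BSTR)
   Declare Operator Let (ByRef other As OwnedBstr)
   Declare Operator Cast () As WString Ptr
   m_bstr As BSTR
End Type

Constructor OwnedBstr ()
   m_bstr = NULL
End Constructor

Constructor OwnedBstr (ByVal h As BSTR)
   m_bstr = h
End Constructor

Constructor OwnedBstr (ByRef other As OwnedBstr)
   m_bstr = SysAllocStringLen(other.m_bstr, SysStringLen(other.m_bstr))
End Constructor

Destructor OwnedBstr ()
   SysFreeString(m_bstr)      ' SysFreeString accepts NULL
End Destructor

Operator OwnedBstr.Let (ByVal h As BSTR)
   SysFreeString(m_bstr)      ' release the previous value first
   m_bstr = h
End Operator

Operator OwnedBstr.Let (ByRef other As OwnedBstr)
   If @This = @other Then Exit Operator
   SysFreeString(m_bstr)
   m_bstr = SysAllocStringLen(other.m_bstr, SysStringLen(other.m_bstr))
End Operator

Operator OwnedBstr.Cast () As WString Ptr
   Return Cast(WString Ptr, m_bstr)
End Operator

Function Foo () As OwnedBstr
   Function = SysAllocString("James")   ' the handle is attached to the result
End Function

' The temporary returned by Foo is passed directly and destroyed afterwards,
' the same pattern as MessageBoxW 0, Foo, "", MB_OK.
Print lstrlenW(Foo())
```

The copy constructor and the second Let operator duplicate the string, so that two objects never end up freeing the same handle.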
And this
FUNCTION Foo () AS CBSTR
FUNCTION = SysAllocString("James")
END FUNCTION
is even better than
FUNCTION Foo () AS CBSTR
FUNCTION = TYPE<CBSTR>(SysAllocString("James"))
END FUNCTION
because in the first case, the constructor is called once, whereas in the second case it is called twice.
So we don't really need to create temporary types with TYPE<CBSTR>, although thanks to the tests I did using it, I discovered that it also works without it.
New adapted Windows version + some adjustments.
In the attached file you will find the new .bi file + a sample test to verify speed on concat... for me it is very efficient!
I have been interested in the stringbuilder approach done by Paul, but from my point of view it is just a workaround.
If people really need an "almost native" unicode dynamic string, it is not really helpful.
If users have to apply that kind of workaround to a generic type, it is better to use wstring * xx or wstring ptr directly.
For me, the unicode dynamic string has to be general purpose and simple/easy: probably not perfect, but in the BASIC philosophy.
BSTR & VARIANT are needed for COM; they can probably use a more sophisticated approach, but why generalise the difficulty
with different namespaces, stringbuilder...?
Another question (probably for Jose): for BSTR, is it necessary to use SysAllocString? Why not emulate those functions to produce a "BSTR"-like type allocated/reallocated by malloc? Is a specific memory area needed? If not, a "normal" unicode type could just be converted to the BSTR struct to do the job. Am I dreaming?
As always, your remarks/comments are welcome.
1. With the latest version of the CBSTR class you can work as if it was a native type, using FB native operators and string functions.
2. Concatenations are slow in any language.
3. For normal use, you don't need the string builder class. It is only there to boost performance when you have to do many concatenations.
Quote
Another questions ( probably for Jose), for BSTR is it necessary to use the SysAllocString , why not emulate that functions to produce a "BSTR" like type allocated/reallocated by malloc ? Is it a specific memory area needed ? if not a "normal" unicode type could be just converted as BSTR struct to fit the job, am I dreaming ?
Yes, it is necessary. A WSTRING can be used with functions, COM or not, that expect a WSTRING, but not with a function or COM method that expects a BSTR. If you pass a WSTRING, when that function or method will try to retrieve its length with SysStringLen it will get a wrong value and it will fail or crash.
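A small sketch of why the length lookup goes wrong: a BSTR carries a 4-byte length prefix, in bytes, immediately before the character data, and SysStringLen reads that prefix instead of scanning for the terminator. A plain WSTRING buffer has no such prefix, so whatever bytes happen to precede it would be read as the length. In addition, a BSTR handed to COM may be released by the other side with SysFreeString, so it has to come from the matching allocator. The variable names are only illustrative.

```freebasic
#include once "windows.bi"
#include once "win/ole2.bi"

Dim As BSTR bs = SysAllocString("James")

' A BSTR points at the first character; the 4 bytes just before it
' hold the length in bytes, not counting the terminating null.
Dim As ULong prefix = *Cast(ULong Ptr, Cast(UByte Ptr, bs) - 4)

Print "SysStringLen (characters): "; SysStringLen(bs)
Print "Length prefix (bytes):     "; prefix

SysFreeString bs
```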
But SysAllocString is not the problem. I will repeat it again: the problem is multiple concatenations.
Have you tried to do this using your class?
dim as uStringW utst = "Paul Squires"
for i = 1 to 10000
utst = utst & " " & WSTR(i) & " " & utst
NEXT
I had to reboot my computer twice.
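To see why this loop cannot finish with any string type, it helps to track only the length it would produce. Each pass builds two copies of the previous string plus separators, so the length more than doubles every time. The sketch below computes that length without allocating the string; the function name LengthAfter is hypothetical.

```freebasic
' Length, in characters, that the loop
'    s = s & " " & WStr(i) & " " & s
' would produce, computed without building the string.
Function LengthAfter (ByVal startLen As ULongInt, ByVal iterations As Long) As ULongInt
   Dim As ULongInt chars = startLen
   For i As Long = 1 To iterations
      ' two copies of the old string + two spaces + the digits of i
      chars = 2 * chars + 2 + CULngInt(Len(Str(i)))
   Next
   Return chars
End Function

Dim As ULongInt startLen = Len("Paul Squires")
For n As Long = 5 To 30 Step 5
   Print n; " iterations: "; LengthAfter(startLen, n); " characters"
Next
```

By 30 iterations the string would already hold more than ten billion characters, so 10000 iterations are out of reach regardless of how the string class allocates memory.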
Jose: nice try ;)
But I think you are joking...
Try it in PB, just with normal strings:
#COMPILE EXE
#DIM ALL
#INCLUDE ONCE "windows.inc"
FUNCTION PBMAIN () AS LONG
local i as integer
local tst as string
? "start string concat"
tst = "Paul Squires"
for i = 1 to 100
tst = tst & " " & STR$(i) & " " & tst
NEXT
? "done as string concat"
END FUNCTION
If you want to demonstrate something (the CBSTR class is a very good solution for COM, I have no doubt),
you have to be fair enough in your comparison. Try to do your sample using CBSTR: if you had to reboot twice with the uStringW type, be prepared to reboot 20 times!
More seriously, your CBSTR is a "must", but it is not very efficient on multiple concatenations, due to the BSTR type,
and as Paul was just showing with his little test (that's why I made the stringbuilder class).
You know perfectly well that my attempt with uStringW is faster than the normal dynamic FB string in some concatenations, and comparable in most cases.
If you do conversions implicitly/explicitly, that adds extra time, but still not at the level of CBSTR's exponential increase.
Please, you are smart enough not to play that game.
1. With the latest version of the CBSTR class you can work as if it was a native type, using FB native operators and string functions.
With uStringW it is the same; all the string operators work.
But for CBSTR or uStringW that holds only for UCS-2 (2-byte wstring units).
It does not hold for full Unicode, where a character can be a surrogate pair; that is why I added extra functions with the u_ prefix, to extend the string manipulation functions to extended Unicode (> 65535). Try to reverse a string that contains extended codes, or use mid or instr on it, and you will break the Unicode string...
With uStringW you can directly assign or concat with string, wstring ptr, long, double and obviously with uStringW.
Concatenation operators available: &= ; += ; & ; + (even for long or double, without explicit conversion).
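The surrogate issue can be illustrated with a small sketch. It is not taken from the uStringW include; the function name CodePointCount is hypothetical, and the UTF-16 units are kept in a UShort array so that the example does not depend on the platform's wstring width. A code point above 65535 occupies two 16-bit units, so counting units (or reversing or slicing by unit) no longer matches what the user sees as characters.

```freebasic
' Count Unicode code points in a buffer of n UTF-16 code units.
Function CodePointCount (ByVal p As Const UShort Ptr, ByVal n As Long) As Long
   Dim As Long count = 0
   For i As Long = 0 To n - 1
      ' Low (trailing) surrogates &hDC00..&hDFFF continue the previous unit,
      ' so only the other units start a new code point.
      If (p[i] And &hFC00) <> &hDC00 Then count += 1
   Next
   Return count
End Function

' "A" followed by U+1F600, which UTF-16 encodes as the pair D83D DE00.
Dim As UShort units(0 To 2) = {&h0041, &hD83D, &hDE00}
Print CodePointCount(@units(0), 3)   ' 2 code points held in 3 code units
```

A unit-by-unit reverse of this buffer would put DE00 before D83D, which is no longer valid UTF-16.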
May I insist on your answer:
Quote
Another questions ( probably for Jose), for BSTR is it necessary to use the SysAllocString , why not emulate that functions to produce a "BSTR" like type allocated/reallocated by malloc ? Is it a specific memory area needed ? if not a "normal" unicode type could be just converted as BSTR struct to fit the job, am I dreaming ?
Yes, it is necessary. A WSTRING can be used with functions, COM or not, that expect a WSTRING, but not with a function or COM method that expects a BSTR. If you pass a WSTRING, when that function or method will try to retrieve its length with SysStringLen it will get a wrong value and it will fail or crash.
I fully understand the need for the BSTR "format" for COM. My question is: is it acceptable to mimic that BSTR "format" without the SysAllocString family of functions, which are not friendly, and write our own functions/subs to reproduce the expected format?
If the answer is yes, we could avoid using BSTR on the FreeBASIC side, work with a simpler unicode type, and convert in/out when COM needs it.
Always with the idea of leaving the difficulty where it is needed and keeping things easy for normal usage, not increasing the level of difficulty when it is not needed (specific namespaces, for example...).
Mistake:
and as Paul was just showing whith his little test ( that why I made the stringbuilder class)
should be "and as Paul was just showing with his little test (that's why he made the stringbuilder class)"
Sorry Paul (render unto Caesar what is Caesar's).
hello all
I hope this small tip, which I was unaware of until recently, is useful: you can assign a BSTR to the function result and then free the temporary BSTR before the function exits.
Please don't laugh at the code; it's only to illustrate freeing the BSTR before function exit. If you already knew this, then excuse my intrusion.
Function StringToBSTR( cnv_string As ZString ) As BSTR
Dim As BSTR sb
Dim As integer length
length = MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, @cnv_string, -1, NULL, 0)
sb=SysAllocStringLen(sb,length)
MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, @cnv_string, -1, sb, length)
Function = sb
SysFreeString(sb)
End Function
I need a bit of clarification.
What does #define unicode in source actually do?
I assume all api calls will use the "W" version but what about text fields in a gui?
Are these unicode? If so are they utf16? or utf2?
James
> I assume all api calls will use the "W" version but what about text fields in a gui?
> Are these unicode? If so are they utf16? or utf2?
If they are created with CreateWindowExW they will support unicode; otherwise, they won't.
Windows uses utf-16.
Quote
I hope this small tip that I was unawares until recently is useful, the tip is that you can assign a bstr to the function and then free the temporary bstr before function exit.
I wish it was so easy. The function is returning a pointer to a BSTR that no longer exists. Later use of this pointer may cause a crash.
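A safer variant, as a minimal sketch: the function returns the BSTR without freeing it, and the caller releases it once it is no longer needed. The helper name AnsiToBstr is hypothetical; the sketch assumes windows.bi and win/ole2.bi and uses only MultiByteToWideChar, SysAllocStringLen, SysStringLen and SysFreeString.

```freebasic
#include once "windows.bi"
#include once "win/ole2.bi"

' Hypothetical helper: returns a BSTR that the CALLER must release
' with SysFreeString.
Function AnsiToBstr (ByRef ansi As ZString) As BSTR
   ' Required size in wide characters, including the terminating null
   ' (because the input length is given as -1).
   Dim As Long n = MultiByteToWideChar(CP_ACP, 0, @ansi, -1, NULL, 0)
   If n <= 0 Then Return NULL
   ' SysAllocStringLen adds its own terminator, so ask for n - 1 characters.
   Dim As BSTR bs = SysAllocStringLen(NULL, n - 1)
   If bs = NULL Then Return NULL
   MultiByteToWideChar(CP_ACP, 0, @ansi, -1, bs, n)
   Return bs          ' ownership passes to the caller; nothing is freed here
End Function

Dim As BSTR converted = AnsiToBstr("James")
Print SysStringLen(converted)   ' the BSTR is still valid here
SysFreeString converted         ' freed only after its last use
```

This is the intermediate-variable pattern described earlier in the thread; a class such as CBSTR automates the final SysFreeString call.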
Quote from: Jose Roca on July 10, 2016, 09:27:23 PM
> I assume all api calls will use the "W" version but what about text fields in a gui?
> Are these unicode? If so are they utf16? or utf2?
If they are created with CreateWindowExW they will support unicode; otherwise, they won't.
Windows uses utf-16.
And this would be the default if #define unicode is used and the call is just CreateWindowEx?
James
Yes. If you define unicode, the unicode version will be called; otherwise, the ansi version.
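A minimal sketch of that switch, assuming the FreeBASIC Windows headers map the generic API names according to the UNICODE symbol. lstrlen is used here only because it is easy to call; CreateWindowEx and MessageBox are expected to resolve the same way.

```freebasic
#define UNICODE               ' must appear before the include
#include once "windows.bi"

' With UNICODE defined, the generic name lstrlen resolves to lstrlenW,
' which counts wide characters; without the define it would be lstrlenA.
Print lstrlen("Windows")
```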