Listbox - unicode items

Started by Bumblebee, April 07, 2021, 12:10:21 PM


Bumblebee

I may not have noticed this before, but does the listbox support unicode characters?
I'm using the latest version, 2.2.0.

e.g. Élégance is displayed as Élégance
Failed pollinator.

Paul Squires

Hi, yes the Listbox code does support unicode. The following correctly displays your code:


   for i as long = 0 to 5
      frmMain.List1.Items.Add( "Élégance" & i )
   next

Paul Squires
PlanetSquires Software

Bumblebee

Might be a problem with the string variables I'm using.
Do I need to use CWSTR to preserve unicode characters?

Paul Squires

CWSTR will work. WSTRING will work as well.
I doubt that STRING will work reliably.

Bumblebee

I use regular string variables to write the file that contains unicode characters. It seems to work with no issues.
Nor did I specify utf-x encoding when writing the file.

Replacing string with cwstr causes an invalid data type in the input statement.

José Roca

What do you understand by unicode characters? Accented characters like á, é, í, ó, ú aren't unicode.

FB ansi variables can't hold unicode characters like Серге́й Серге́евич Проко́фьев.

Bumblebee

I'm having an issue reading accented characters, as per the example.
According to Notepad, the file I'm parsing is in UTF-8 Unix (LF)

When I change variable type to CWSTR, an error occurs with the Line Input# statement.
It wants a string variable.

José Roca

If the file is utf-8, you have to read it using ansi strings and then convert it to ansi or unicode, since a listbox (or any other Windows control) doesn't understand utf-8.

Bumblebee

WStr() function will not convert strings read with Line Input#
It does work with literal strings.
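#include "Afx\CWStr.inc"
dim a as string
dim b as CWSTR

' literal strings: every conversion displays correctly
a = "Élégance"
b = "Élégance2"
print a
print b
print wstr(a)
b = wstr(a)
print b

' string read from a UTF-8 file: wstr() does not convert it
print "- read utf8 file -"
a = ""
open "test.txt" for input as #1
line input #1, a
close #1
print a
b = a
print b
print wstr(a)
print wstr(b)
sleep
end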

José Roca

> WStr() function will not convert strings read with Line Input#

Of course not. WSTR will convert ASCII to UNICODE, not UTF-8 to UNICODE.

You can either use the Windows API function MultiByteToWideChar or...

DIM cws AS CWSTR = CWSTR(<UTF-8 string>, CP_UTF8)

e.g.:

DIM s AS STRING = "José Roca"   ' My name in UTF-8
DIM cws AS CWSTR = CWSTR(s, CP_UTF8)
print cws
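
For comparison, the MultiByteToWideChar route mentioned above might look roughly like the sketch below. It assumes the stock windows.bi header and builds the UTF-8 bytes with Chr() so that the result does not depend on how the source file is saved; the variable names are only illustrative.

```freebasic
#include once "windows.bi"

' "José Roca" as UTF-8 bytes: é is the two-byte sequence C3 A9
Dim utf8 As String = "Jos" & Chr(&hC3, &hA9) & " Roca"

' First call: ask how many UTF-16 code units the result needs
Dim wideLen As Long = MultiByteToWideChar(CP_UTF8, 0, StrPtr(utf8), Len(utf8), NULL, 0)

' Allocate a zero-filled buffer with room for the terminating null
Dim wide As WString Ptr = CAllocate((wideLen + 1) * SizeOf(WString))

' Second call: perform the UTF-8 to UTF-16 conversion
MultiByteToWideChar(CP_UTF8, 0, StrPtr(utf8), Len(utf8), wide, wideLen)

Print Len(utf8)   ' 10 bytes of UTF-8
Print wideLen     ' 9 UTF-16 code units
Print *wide

Deallocate(wide)
```

The CWSTR constructor shown above does the same job with less bookkeeping, so the raw API call is mainly useful when WinFBX is not available.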

Bumblebee

I don't understand any of this, but it works. Thanks!

José Roca

You should learn the differences between ASCII, ANSI, UTF-8 and UNICODE.
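
One way to see the difference is to look at the bytes each encoding produces for the same character. The sketch below is only an illustration: HexBytes is a hypothetical helper (not part of FreeBASIC or WinFBX), and code page 1252 is assumed as the ANSI code page. ASCII is left out because it has no "é" at all.

```freebasic
#include once "windows.bi"

' Hypothetical helper: encode a wide string into the given code page
' and return the resulting bytes as space-separated hex.
Function HexBytes(ByVal w As WString Ptr, ByVal cp As UINT) As String
   Dim buf As ZString * 16
   ' -1 means "w is null-terminated"; the returned count includes the null
   Dim n As Long = WideCharToMultiByte(cp, 0, w, -1, @buf, SizeOf(buf), NULL, NULL)
   Dim result As String
   For i As Long = 0 To n - 2
      If i > 0 Then result &= " "
      result &= Hex(buf[i], 2)
   Next
   Return result
End Function

Dim e As WString * 2 = WChr(&hE9)   ' "é", one UTF-16 code unit (00E9)

Print "ANSI (code page 1252): "; HexBytes(@e, 1252)      ' E9
Print "UTF-8:                 "; HexBytes(@e, CP_UTF8)   ' C3 A9
```

A control that expects ANSI shows the two UTF-8 bytes C3 A9 as two separate characters, which is where "Ã©" comes from.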

Bumblebee


Dim z as String
~
Line Input #1, z
z = CWSTR(z,CP_UTF8)


You said that accented characters are not unicode, so this works.
When z is written to a text file, the file is ANSI.

What could I do if there were Cyrillic characters in the UTF-8 source file?

José Roca

A UTF-8 file can't contain Cyrillic characters; it has to be UTF-16.

If you need to read files with unicode content, you can't use Line Input. As I said, FB support for unicode is weak. You can use my class CTextStream: https://github.com/JoseRoca/WinFBX/blob/master/docs/File%20Management/CTextStream%20Class.md

Bumblebee

I took the Cyrillic characters you posted and saved them in Notepad. It says the file is UTF-8.
Is this dependent on my language settings?

#include "Afx\Cwstr.inc"
dim a as string
dim b as CWSTR
'cyrillic characters encoded as utf-8
open "test.txt" for input as #1
line input #1,a
close
print a
b = a
print b
print cwstr(a,cp_utf8)
print cwstr(b,cp_utf8)
sleep
end


Output in terminal window:

Серге́й Серге́евич В'роко́фьев
СеÃ'â,¬ÃÂ³ÃÂµÃŒÂÃÂ¹ СеÃ'â,¬ÃÂ³ÃÂµÃŒÂÃÂµÃÂ²ÃÂ¸Ã'‡ ПÃ'â,¬ÃÂ¾ÃÂºÃÂ¾ÃŒÂÃ'„Ã'Å'ев
Серге́й Серге́евич Проко́фьев
Серге́й Серге́евич Проко́фьев

So maybe it would work.
When I was working on my file backup program, CWSTR was able to handle every filename, including those with Korean characters. However, I wasn't writing those names to or reading them from text files. Everything was done within CWSTR arrays.