PlanetSquires Forums

Topic: Read file into STRING - is this okay?

Paul Squires

Read file into STRING - is this okay?
« on: July 19, 2018, 07:47:23 PM »

Hi Jose,

I have a generic function that is supposed to read all bytes from a disk file into memory. I then examine those bytes to determine whether the file is ANSI, UTF-8, or UTF-16. The result will be a string in txtBuffer that is UTF-8 encoded. Do OPEN and GET, used this way to read into a STRING variable, corrupt the bytes with automatic conversions the way writing to a file does? Is there a better way to do this? Maybe calling the Windows API directly with CreateFile?

Code: [Select]
function GetFileToString( byref wszFilename as const wstring, byref txtBuffer as string, byval pDoc as clsDocument ptr) as boolean
   
   ' Load the entire file into a string
   dim as long f = freefile
   If Open( wszFilename for Binary Access Read As #f ) = 0 Then
      If LOF(f) > 0 Then
         txtBuffer = String(LOF(f), 0)
         Get #f, , txtBuffer    '<--- could this be a problem?
      End If
   else
      return true  ' error opening file
   end if
   close #f   

   ' Check for BOM signatures
   if left(txtBuffer, 3) = chr(&HEF, &HBB, &HBF) THEN
      ' UTF8 BOM encoded
      pDoc->FileEncoding = FILE_ENCODING_UTF8_BOM
      txtBuffer = mid(txtBuffer, 4)   ' bypass the BOM
   elseif left(txtBuffer, 2) = chr(&HFF, &HFE) THEN
      ' UTF16 BOM (little endian) encoded
      pDoc->FileEncoding = FILE_ENCODING_UTF16_BOM
      txtBuffer = mid(txtBuffer, 3)   ' bypass the BOM
   else
      pDoc->FileEncoding = FILE_ENCODING_ANSI
   END IF

   select case pDoc->FileEncoding
      case FILE_ENCODING_ANSI
         ' No conversion needed. clsDocument ApplyProperties will *not*
         ' set the editor to UTF8 code.
      case FILE_ENCODING_UTF8_BOM   
         ' No conversion needed. clsDocument ApplyProperties will set
         ' the editor to UTF8 code.
      case FILE_ENCODING_UTF16_BOM
         ' Convert to UTF8 so it can display in the editor
         ' Need to pass a WSTRING pointer to the conversion function.
         txtBuffer = UnicodeToUtf8( cast(WSTRING ptr, strptr(txtBuffer)) )
   END select
     
   function = false
END FUNCTION
Paul Squires
PlanetSquires Software
FireFly Visual Designer, WinFBE Editor
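
One way to check the GET question empirically is a byte round trip. The sketch below is not from the thread. It assumes a scratch file named roundtrip.bin (a hypothetical name) can be created in the current directory. It writes all 256 byte values with PUT, reads them back into a STRING with GET in BINARY mode, and compares the two buffers byte by byte. If they match, no conversion took place on that system.

Code: [Select]
' Round-trip check for BINARY-mode GET into a STRING.
' "roundtrip.bin" is a hypothetical scratch file name.
dim as string src = string(256, 0), dst
for i as long = 0 to 255
   src[i] = i            ' every possible byte value, including 0
next

dim as long f = freefile
if open( "roundtrip.bin" for binary access write as #f ) = 0 then
   put #f, , src         ' writes the string's data bytes
   close #f
end if

f = freefile
if open( "roundtrip.bin" for binary access read as #f ) = 0 then
   if lof(f) > 0 then
      dst = string(lof(f), 0)
      get #f, , dst      ' fills dst up to its current length
   end if
   close #f
end if

' Compare lengths first, then each byte.
dim as boolean same = (len(dst) = len(src))
if same then
   for i as long = 0 to len(src) - 1
      if dst[i] <> src[i] then same = false : exit for
   next
end if

if same then
   print "bytes identical"
else
   print "bytes differ"
end if
kill "roundtrip.bin"

The same pattern can be pointed at a real UTF-16 or UTF-8 file by comparing the GET result against a buffer read through another route.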

Paul Squires

Re: Read file into STRING - is this okay?
« Reply #1 on: July 19, 2018, 08:25:57 PM »

Here is code I adapted from Jose's AfxFileScan routine that appears to correctly load an entire file into a simple STRING variable.

Code: [Select]
   ' Open the file through the wide-character API
   DIM dwFileSize AS DWORD, dwHighSize AS DWORD, dwBytesRead AS DWORD
   DIM hFile AS HANDLE = CreateFileW(@wszFileName, GENERIC_READ, FILE_SHARE_READ, NULL, _
                         OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL)
   IF hFile = INVALID_HANDLE_VALUE THEN return true   ' error opening file
   ' Size the buffer from the low 32 bits of the file size
   dwFileSize = GetFileSize(hFile, @dwHighSize)
   txtBuffer = String(dwFileSize, 0)
   ' Read the raw bytes straight into the string's data area
   DIM bSuccess AS LONG = ReadFile(hFile, strptr(txtBuffer), dwFileSize, @dwBytesRead, NULL)
   CloseHandle(hFile)
   IF bSuccess = FALSE THEN return true   ' error reading file
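
For reference, the fragment above could be wrapped into a standalone helper along the following lines. This is only a sketch under stated assumptions. The name ReadAllBytes is hypothetical and is not part of Afx or WinFBE. It also adds checks that the fragment does not have, which are choices made for this sketch and not taken from AfxFileScan. A failed size query or a file of 4 GB or more is treated as an error, an empty file skips the read, and a short read counts as a failure.

Code: [Select]
#include once "windows.bi"

' ReadAllBytes is a hypothetical name, not part of Afx or WinFBE.
' Returns true on error, matching GetFileToString.
function ReadAllBytes( byref wszFilename as const wstring, byref txtBuffer as string ) as boolean
   txtBuffer = ""
   dim as HANDLE hFile = CreateFileW( @wszFilename, GENERIC_READ, FILE_SHARE_READ, NULL, _
                                      OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL )
   if hFile = INVALID_HANDLE_VALUE then return true

   dim as DWORD dwHighSize, dwBytesRead
   dim as DWORD dwFileSize = GetFileSize( hFile, @dwHighSize )
   ' Assumption: a failed size query or a file of 4 GB or more is an error.
   if dwFileSize = INVALID_FILE_SIZE orelse dwHighSize <> 0 then
      CloseHandle( hFile )
      return true
   end if

   dim as long bSuccess = 1           ' an empty file needs no read
   if dwFileSize > 0 then
      txtBuffer = string( dwFileSize, 0 )
      bSuccess = ReadFile( hFile, strptr(txtBuffer), dwFileSize, @dwBytesRead, NULL )
      ' A short read leaves part of the buffer zero-filled, so count it as a failure.
      if dwBytesRead <> dwFileSize then bSuccess = 0
   end if
   CloseHandle( hFile )

   if bSuccess = 0 then
      txtBuffer = ""
      return true
   end if
   return false
end function

Inside GetFileToString, a call such as "if ReadAllBytes(wszFilename, txtBuffer) then return true" would take the place of the OPEN/GET block, leaving the BOM checks unchanged.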


José Roca

Re: Read file into STRING - is this okay?
« Reply #2 on: July 20, 2018, 02:39:08 AM »

There would be a problem if you were calling UnicodeToUtf8(txtBuffer) directly, but since you're passing a pointer obtained with STRPTR, I don't expect any.

I can't say the same for OPEN(wszFilename). Because OPEN does not support WSTRINGs, wszFilename will be converted to ANSI, so it won't work with Unicode file names. This is why I'm using CreateFileW.
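
The STRPTR point can be illustrated with a small sketch (not from the thread). It assumes Windows, where a WSTRING character is two bytes, and uses made-up sample bytes. Casting STRPTR reinterprets the bytes already in the buffer as UTF-16. Assigning the STRING to a WSTRING asks FreeBASIC to convert it as ANSI text first, which is presumably what would happen if txtBuffer itself were passed to a WSTRING parameter.

Code: [Select]
' Sample bytes (made up): "Hi" in UTF-16 LE followed by a two-byte terminator.
dim as string raw = chr(&h48, &h00, &h69, &h00, &h00, &h00)

' Reinterpret: no bytes change, the buffer is simply read as UTF-16.
dim as wstring ptr pw = cast(wstring ptr, strptr(raw))
print "reinterpreted length: "; len(*pw)

' Convert: the bytes are treated as ANSI characters and widened.
dim as wstring * 16 converted
converted = raw
print "converted length:     "; len(converted)

The explicit two-byte terminator in the sample keeps the WSTRING read inside the buffer. Comparing the two printed lengths shows whether the conversion path altered the data.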