CWindow Release Candidate 31

Started by José Roca, August 06, 2017, 03:21:36 PM

Previous topic - Next topic

José Roca

#165
We can search for more than a word at the same time.


'#CONSOLE ON
#define UNICODE
#INCLUDE ONCE "Afx/CRegExp.inc"
using Afx

DIM pRegExp AS CRegExp
DIM cbsText AS CBSTR = "The contests will be in January, July and November"
DIm cbsPattern AS CBSTR = $"\b(january|february|march|april|may|june|july|" & _
    $"august|september|october|november|december)\b"
PRINT pRegExp.Execute(cbsText, cbsPattern, TRUE)
For i AS LONG = 0 TO pRegExp.MatchesCount - 1
   print pRegExp.MatchValue(i)
NEXT

PRINT
PRINT "Press any key..."
SLEEP


Output:
January
July
November

José Roca

#166

'#CONSOLE ON
#define UNICODE
#INCLUDE ONCE "Afx/CRegExp.inc"
#include once "afx/cwstr.inc"
using Afx

' // Case insensitive, double search (c.t and d.g), whole words
' // Retrieves cat, dog, but not cats, dogs
DIM pRegExp AS CRegExp
DIM cbsText AS CBSTR = "I have a cat and a dog, because I love cats and dogs"
DIM cbsPattern AS CBSTR = $"\bC.T\b|\bD.G\b"
pRegExp.Execute(cbsText, cbsPattern, TRUE)
FOR i AS LONG = 0 TO pRegExp.MatchesCount - 1
   print pRegExp.MatchValue(i)
NEXT

PRINT
PRINT "Press any key..."
SLEEP




'#CONSOLE ON
#define UNICODE
#INCLUDE ONCE "Afx/CRegExp.inc"
#include once "afx/cwstr.inc"
using Afx

' // Case insensitive, double search (c.t and d.g), whole words
' // Retrieves cut, cat, i.e. whole words with three letters that begin
' // with c and end with t.
' // With this constructor we set the pattern, ignore case and global
DIM pRegExp AS CRegExp = CRegExp($"\bC.T\b|\bD.G\b", TRUE, TRUE)
pRegExp.Execute("I have cut a cat tail")
FOR i AS LONG = 0 TO pRegExp.MatchesCount - 1
   print pRegExp.MatchValue(i)
NEXT

PRINT
PRINT "Press any key..."
SLEEP


Paul Squires

In all my years of programming I've only used regular expressions a handful of times. It is so powerful that I regret not using it more often. Your example makes it look so easy.... maybe because it is.
Paul Squires
PlanetSquires Software

José Roca

#168
It is not difficult, but it is verbose when used with FreeBasic. This is why I have wrapped it in a class, to hide the verbosity inside the methods. The only difficulty is to find the appropriate pattern, because if you don't use it often, each time you have to reread the table of metacharacters. But it s always much less work that having to write code to perform an specific task. Now we have string functions to perform the most common tasks and regular expressions. Imagine writing code to extract a word that begins with a letter, followed by 3 numbers. With the CRegExp class you just have to do pRegExp.Extract(cbsText, "[a-z][0-9][0-9][0-9]"), and if you want to do another kind of extraction, you just have to change the pattern. And if you want to do it ignoring case, you just have to pass TRUE in the bIgnoreCase parameter. They can be used to parse, extract and validate data.

I have renamed the function InStr as FindEx and added two overloaded Find functions and two overloaded Extract functions.


' ========================================================================================
' Find function with VBScript regular expressions search patterns.
' Parameters:
' - cbsSourceString = The text to be parsed.
' - cbsPattern = The pattern to match.
' - bIgnoreCase = Ignore case.
' Return value:
' - Returns the position of the match or 0 if not found.
'   The length of the match can be retrieved calling pRegExp.MatchLen.
' Usage example:
'   DIM pRegExp AS CRegExp
'   DIM cbsText AS CBSTR = "blah blah a234 blah blah x345 blah blah"
'   DIM cbsPattern AS CBSTR = "[a-z][0-9][0-9][0-9]"
'   DIM nPos AS LONG = pRegExp.Find(cbsText, cbsPattern)
' ========================================================================================
PRIVATE FUNCTION CRegExp.Find (BYREF cbsSourceString AS CBSTR, BYREF cbsPattern AS CBSTR, BYVAL bIgnoreCase AS BOOLEAN = FALSE) AS LONG
   IF m_pRegExp = NULL THEN this.SetResult(E_POINTER): RETURN 0
   IF m_pMatches THEN AfxSafeRelease(m_pMatches)
   m_pRegExp->put_IgnoreCase(bIgnoreCase)
   m_pRegExp->put_Global(FALSE)
   m_pRegExp->put_Multiline(FALSE)
   m_pRegExp->put_Pattern(cbsPattern)
   this.SetResult(m_pRegExp->Execute(cbsSourceString, cast(Afx_IDispatch PTR PTR, @m_pMatches)))
   RETURN this.MatchPos(0)
END FUNCTION
' ========================================================================================

' ========================================================================================
' Find function with VBScript regular expressions search patterns.
' Parameters:
' - nStart: The position in the string at which the search will begin. The first character
'   starts at position 1.
' - cbsSourceString = The text to be parsed.
' - cbsPattern = The pattern to match.
' - bIgnoreCase = Ignore case.
' Return value:
' - Returns the position of the match or 0 if not found.
'   The length of the match can be retrieved calling pRegExp.MatchLen.
' Usage example:
'   DIM pRegExp AS CRegExp
'   DIM cbsText AS CBSTR = "blah blah a234 blah blah x345 blah blah"
'   DIM cbsPattern AS CBSTR = "[a-z][0-9][0-9][0-9]"
'   DIM nPos AS LONG = pRegExp.Find(15, cbsText, cbsPattern)
' ========================================================================================
PRIVATE FUNCTION CRegExp.Find (BYVAL nStart AS LONG, BYREF cbsSourceString AS CBSTR, BYREF cbsPattern AS CBSTR, BYVAL bIgnoreCase AS BOOLEAN = FALSE) AS LONG
   IF m_pRegExp = NULL THEN this.SetResult(E_POINTER): RETURN 0
   IF m_pMatches THEN AfxSafeRelease(m_pMatches)
   m_pRegExp->put_IgnoreCase(bIgnoreCase)
   m_pRegExp->put_Global(FALSE)
   m_pRegExp->put_Multiline(FALSE)
   m_pRegExp->put_Pattern(cbsPattern)
   DIM pbstr AS BSTR
   IF nStart < 0 OR nStart > SysStringLen(cbsSourceString.m_bstr) THEN RETURN 0
   DIM cbs AS CBSTR = cbsSourceString.MidChars(nStart)
   this.SetResult(m_pRegExp->Execute(cbs, cast(Afx_IDispatch PTR PTR, @m_pMatches)))
   RETURN this.MatchPos(0) + nStart - 1
END FUNCTION
' ========================================================================================

' ========================================================================================
' Global, multiline find function with VBScript regular expressions search patterns.
' Parameters:
' - cbsSourceString = The text to be parsed.
' - cbsPattern = The pattern to match.
' - bIgnoreCase = Ignore case.
' - bGlobal : FALSE = Return only the first match; TRUE = return all matches.
' - bMultiline : TRUE = Match at the start and the end of multiple lines separated by line breaks.
' Return value:
' - Returns a list of comma separated "index, length" value pairs. The pairs are separated
'   by a semicolon.
' Usage Example:
'   DIM pRegExp AS CRegExp
'   DIM cbsText AS CBSTR = "blah blah a234 blah blah x345 blah blah"
'   DIM cbsPattern AS CBSTR = "[a-z][0-9][0-9][0-9]"
'   DIM cbsOut AS CBSTR
'   cbsOut = pRegExp.FindAll(cbsText, cbsPattern)
' ========================================================================================
PRIVATE FUNCTION CRegExp.FindAll (BYREF cbsSourceString AS CBSTR, BYREF cbsPattern AS CBSTR, _
   BYVAL bIgnoreCase AS BOOLEAN = FALSE, BYVAL bGlobal AS BOOLEAN = TRUE, BYVAL bMultiline AS BOOLEAN = TRUE) AS CBSTR

   IF m_pRegExp = NULL THEN this.SetResult(E_POINTER): RETURN ""
   m_pRegExp->put_IgnoreCase(bIgnoreCase)
   m_pRegExp->put_Global(bGlobal)
   m_pRegExp->put_Multiline(bMultiline)
   m_pRegExp->put_Pattern(cbsPattern)

   DIM cbsOut AS CBSTR, pMatches AS Afx_IMatchCollection2 PTR
   this.SetResult(m_pRegExp->Execute(cbsSourceString, cast(Afx_IDispatch PTR PTR, @pMatches)))
   IF pMatches THEN
      DIM nCount AS LONG
      pMatches->get_Count(@nCount)
      FOR i AS LONG = 0 TO nCount - 1
         DIM pMatch AS Afx_IMatch2 PTR
         this.SetResult(pMatches->get_Item(i, cast(Afx_IDispatch PTR PTR, @pMatch)))
         IF pMatch THEN
            DIM nFirstIndex AS LONG
            pMatch->get_FirstIndex(@nFirstIndex)
            DIM nLen AS LONG
            pMatch->get_Length(@nLen)
            IF i < nCount - 1 THEN
               cbsOut += STR(nFirstIndex + 1) & "," & STR(nLen) & ";"
            ELSE
               cbsOut += STR(nFirstIndex + 1) & "," & STR(nLen)
            END IF
            AfxSafeRelease(pMatch)
         END IF
      NEXT
      AfxSafeRelease(pMatches)
   END IF
   RETURN cbsOut

END FUNCTION
' ========================================================================================

' ========================================================================================
' Extracts a substring using VBScript regular expressions search patterns.
' Parameters:
' - cbsSourceString = The text to be parsed.
' - cbsPattern = The pattern to match.
' - bIgnoreCase = Ignore case.
' Return value:
' - Returns the retrieved string on exit or an empty string on failure.
' Usage example:
'   DIM pRegExp AS CRegExp
'   DIM cbsText AS CBSTR = "blah blah a234 blah blah x345 blah blah"
'   DIM cbsPattern AS CBSTR = "[a-z][0-9][0-9][0-9]"
'   DIM cbs AS CBSTR = pRegExp.Extract(cbsText, cbsPattern)
' ========================================================================================
PRIVATE FUNCTION CRegExp.ExtractStr (BYREF cbsSourceString AS CBSTR, BYREF cbsPattern AS CBSTR, BYVAL bIgnoreCase AS BOOLEAN = FALSE) AS CBSTR
   IF m_pRegExp = NULL THEN this.SetResult(E_POINTER): RETURN 0
   IF m_pMatches THEN AfxSafeRelease(m_pMatches)
   m_pRegExp->put_IgnoreCase(bIgnoreCase)
   m_pRegExp->put_Global(FALSE)
   m_pRegExp->put_Multiline(FALSE)
   m_pRegExp->put_Pattern(cbsPattern)
   this.SetResult(m_pRegExp->Execute(cbsSourceString, cast(Afx_IDispatch PTR PTR, @m_pMatches)))
   DIM nPos AS LONG = this.MatchPos(0)
   DIM nLen AS LONG = this.MatchLen(0)
   DIM cbs AS CBSTR = MID(cbsSourceString, nPos, nLen)
   RETURN cbs
END FUNCTION
' ========================================================================================

' ========================================================================================
' Extracts a substring using VBScript regular expressions search patterns.
' Parameters:
' - nStart: The position in the string at which the search will begin. The first character
'   starts at position 1.
' - cbsSourceString = The text to be parsed.
' - cbsPattern = The pattern to match.
' - bIgnoreCase = Ignore case.
' Return value:
' - Returns the position of the match or 0 if not found.
'   The length can be retrieves calling pRegExp.MatchLen.
' Usage example:
'   DIM pRegExp AS CRegExp
'   DIM cbsText AS CBSTR = "blah blah a234 blah blah x345 blah blah"
'   DIM cbsPattern AS CBSTR = "[a-z][0-9][0-9][0-9]"
'   DIM nPos AS LONG = pRegExp.Extract(15, cbsText, cbsPattern)
' ========================================================================================
PRIVATE FUNCTION CRegExp.ExtractStr (BYVAL nStart AS LONG, BYREF cbsSourceString AS CBSTR, BYREF cbsPattern AS CBSTR, BYVAL bIgnoreCase AS BOOLEAN = FALSE) AS CBSTR
   IF m_pRegExp = NULL THEN this.SetResult(E_POINTER): RETURN 0
   IF m_pMatches THEN AfxSafeRelease(m_pMatches)
   m_pRegExp->put_IgnoreCase(bIgnoreCase)
   m_pRegExp->put_Global(FALSE)
   m_pRegExp->put_Multiline(FALSE)
   m_pRegExp->put_Pattern(cbsPattern)
   DIM pbstr AS BSTR
   IF nStart < 0 OR nStart > SysStringLen(cbsSourceString.m_bstr) THEN RETURN 0
   DIM cbs AS CBSTR = cbsSourceString.MidChars(nStart)
   this.SetResult(m_pRegExp->Execute(cbs, cast(Afx_IDispatch PTR PTR, @m_pMatches)))
   DIM nPos AS LONG = this.MatchPos(0)
   DIM nLen AS LONG = this.MatchLen(0)
   RETURN MID(cbsSourceString, nPos + nStart - 1, nLen)
END FUNCTION
' ========================================================================================


Remark: If you google to find patterns, make sure that they are for VbScript, because other implementations can use different metacharacters or have additional features.

José Roca

Some more (the main purpose is to test the class to make changes if needed):


' // Adds an space after the dots that are immediately followed by a word
DIM pRegExp AS CRegExp
DIM cbsText AS CBSTR = "This is a text.With some dots.Between words. This one not."
DIM cbsPattern AS CBSTR = "(\.)(\w)"
DIM cbs AS CBSTR = pRegExp.ReplaceStr(cbsText, cbsPattern, "$1 $2")
print cbs



' // Replaces an occurrence of two consecutive identical words in a string of text
' // with a single occurrence of the same word
DIM pRegExp AS CRegExp
DIM cbsText AS CBSTR = "Is is the cost of of gasoline going up up?."
DIM cbsPattern AS CBSTR = $"\b([a-z]+) \1\b"
DIM cbs AS CBSTR = pRegExp.ReplaceStr(cbsText, cbsPattern, "$1")
print cbs



' // Breaks down a URI down to the protocol (ftp, http, and so on), the domain
' // address, and the page/path
DIM pRegExp AS CRegExp
DIM cbsText AS CBSTR = "http://msdn.microsoft.com:80/scripting/default.htm"
DIM cbsPattern AS CBSTR = $"(\w+):\/\/([^/:]+)(:\d*)?([^# ]*)"
pRegExp.Execute(cbsText, cbsPattern)
FOR i AS LONG = 0 TO pRegExp.SubMatchesCount - 1
   print pRegExp.SubMatchValue(0, i)
NEXT


José Roca

As the main difficulty is to get the right pattern (it is impossible to remember something like "(\w+):\/\/([^/:]+)(:\d*)?([^# ]*)" ), it would be a good idea to collect "recipes" and include them in the documentation.

José Roca

#171
For example,


' // Finding variations on words (John, Jon, Jonathan)
DIM pRegExp AS CRegExp
DIM cbsText AS CBSTR = "Hello, my name is John Doe, what's your name?"
DIM cbsPattern AS CBSTR = $"\bJoh?n(athan)? Doe\b"
pRegExp.Execute(cbsText, cbsPattern)
IF pRegExp.MatchesCount THEN print pRegExp.MatchValue


Recipe taken from the book "Regular Expression Recipes for Windows Developers: A Problem-Solution Approach by Nathan Good

This expression works by finding the common and optional parts of a word and searching based on them. John, Jon, Jonathan are all similar. They start by Jo and have an n in them. The rest is the h in John or the athan ending in Jonathan:

\b a word boundary...
J followed by
o then...
h that is...
? optional, followed by...
n followed by...
(...) a group of characters...
Note: In the example, this group of characters is athan, which will let the
expression match Jonathan. It may or may not appear as a whole part, so that's
why is grouped win parentheses and followed by ?
? that may appear once but isn't required, followed by
<space> a space, followed by...
D then...
o and finally...
e and e, then...
\b another word boundary

José Roca

#172
Another "recipe": Finding similar words.


' // Finding similar words
DIM pRegExp AS CRegExp
DIM cbsText AS CBSTR = "My cat found a dead bat over the mat"
DIM cbsPattern AS CBSTR = $"\b[bcm]at\b"
pRegExp.Execute(cbsText, cbsPattern)
FOR i AS LONG = 0 TO pRegExp.MatchesCount - 1
   print pRegExp.MatchValue(i)
NEXT


\b a word boundary
[bcm] one of b, c, or m, followed by...
a then...
t and finally...
\b a word boundary

As an alternative, we can use the operator "|":


DIM cbsPattern AS CBSTR = $"\b(b|c|m)at\b"


José Roca

#173
Check if a string is numeric


#define UNICODE
#INCLUDE ONCE "Afx/CRegExp.inc"
USING Afx

DIM pRegExp AS CRegExp
DIM cbsText AS CBSTR = "1.2345678901234567e+029"
DIM cbsPattern AS CBSTR = "^[\+\-]?\d*\.?\d+(?:[Ee][\+\-]?\d+)?$"
PRINT pRegExp.Test(cbsText, cbsPattern)


Output:
True

Pattern: "^[\+\-]?\d*\.?\d+(?:[Ee][\+\-]?\d+)?$"

The initial "^" and the final "$" match the start and the end of the string, to ensure the check spans the whole string.
The "[\+\-]?" part is the initial plus or minus sign with the "?" multiplier that allows zero or one instance of it.
The "\d*" is a digit, zero or more times.
"\.?" is the decimal point, zero or one time.
The "\d+" part matches a digit one or more times.
The "(?:[Ee][\+\-]?\d+)?" matches "e+", "e-", "E+" or "E-" followed by one or more digits, with the "?" multiplier that allows zero or one instance of it.

James Fuller

Jose,
  Fantastic stuff as always!
How about a complete package update on the first thread post or a new Candidate (32)

Are you now using  fbc  1.06 for all your testing?

James

José Roca

You can find the latest updated code here:

https://github.com/JoseRoca/WinFBX

I plan to use GitHub to keep the code always updated.

> Are you now using  fbc  1.06 for all your testing?

Yes, mainly to catch the use of suffixes. We were so used to suffixes with PB, that it is too easy to use HEX$ instead of HEX, SPACE$ instead of SPACE, etc.

José Roca

#176
I have added some overloaded methods to allow compound syntax.

Now, besides


DIM cbsPattern AS CBSTR = "[a-z][0-9][0-9][0-9]"
DIM cbsText AS CBSTR = "blah blah a234 blah blah x345 blah blah"
DIM pRegExp AS CRegExp
print pRegExp.Extract(cbsPattern, cbsText)


you can also do


DIM cbsPattern AS CBSTR = "[a-z][0-9][0-9][0-9]"
DIM cbsText AS CBSTR = "blah blah a234 blah blah x345 blah blah"
print CRegExp(cbsPattern).Extract(cbsText)



' // Ignore case
DIM cbsPattern AS CBSTR = "[a-z][0-9][0-9][0-9]"
DIM cbsText AS CBSTR = "blah blah A234 blah blah x345 blah blah"
print CRegExp(cbsPattern).Extract(cbsText, TRUE)



DIM cbsText AS CBSTR = "anything textToExtract anything TextToExtract anything"
DIM cbsPattern AS CBSTR = $"\btextToExtract\b"
DIM pRegExp AS CRegExp = CRegExp(cbsPattern).Execute(cbsText, TRUE)
FOR i AS LONG = 0 TO pRegExp.MatchesCount
   print pRegExp.MatchValue(i)
NEXT


José Roca

#177
Finally I have figured how to set the focus in a document (usually an html page) hosted in an instance of the CWebBrowser class.


' ========================================================================================
' Sets the focus in the hosted document.
' Return value:
' - S_OK if successful, or an error value otherwise.
' ========================================================================================
PRIVATE FUNCTION CWebBrowser.SetFocus () AS HRESULT
   IF m_pWebBrowser = NULL THEN RETURN E_POINTER
   DIM pIHTMLDocument2 AS IHTMLDocument2 PTR
   m_pWebBrowser->get_Document(@cast(ANY PTR, pIHTMLDocument2))
   IF pIHTMLDocument2 = NULL THEN RETURN E_NOINTERFACE
   DIM pWindow2 AS IHTMLWindow2 PTR
   pIHTMLDocument2->lpvtbl->get_parentWindow(pIHTMLDocument2, @pWindow2)
   AfxSafeRelease(pIHTMLDocument2)
   IF pWindow2 = NULL THEN RETURN E_NOINTERFACE
   pWindow2->lpvtbl->focus(pWindow2)
   AfxSafeRelease(pWindow2)
END FUNCTION
' ========================================================================================


This is something that has elluded me during years and today I have had an inspiration :)

Now we can use WebGL in an instance of the WebBrowser control embedded in our application being responsive to the key presses.


   ' // Add a WebBrowser control
   DIM pwb AS CWebBrowser = CWebBrowser(@pWindow, IDC_WEBBROWSER, 0, 0, pWindow.ClientWidth, pWindow.ClientHeight)
   ' // Set the IDocHostUIHandler interface
   pwb.SetUIHandler

   ' // Navigate to the path
   DIM wszPath AS WSTRING * MAX_PATH = ExePath & "\index.html"
   pwb.Navigate(wszPath)
   ' // Processes pending Windows messages to allow the page to load
   ' // Needed if the message pump isn't running
   AfxPumpMessages
   ' // Set the focus in the hosted html page
   pwb.SetFocus

   ' // Note: Instead of pWindow.DoeEvents, we need to use a custom message pump
   ' // to be able to forward the keyboard messages to the WebBrowser control.
   ' // Otherwise, the web page will not respond to them.

   ' // Display the window
   ShowWindow(hWndMain, nCmdShow)
   UpdateWindow(hWndMain)

   ' // Dispatch Windows messages
   DIM uMsg AS MSG
   WHILE (GetMessageW(@uMsg, NULL, 0, 0) <> FALSE)
      IF AfxForwardMessage(GetFocus, @uMsg) = FALSE THEN
         IF IsDialogMessageW(hWndMain, @uMsg) = 0 THEN
            TranslateMessage(@uMsg)
            DispatchMessageW(@uMsg)
         END IF
      END IF
   WEND
   FUNCTION = uMsg.wParam



José Roca

Updated the download file in the first post with all the changes discussed in this thread.

José Roca

#179
Updated again.

CVAR and CDispInvoke classes modified.
- The change to allow to print CVARs directly did break the CDispInvoke class.

AfxStr.inc
- Added functions to encode/decode base 64.

Modified CGdiPlus.inc and CADODB.inc to remove a couple of suffixes (FB 1.06 doesn't allow suffixes anymore).

- Added new methods to CDispInvoke to allow an easier syntax. e.g.


'#CONSOLE ON
#define UNICODE
#include once "Afx/CDispInvoke.inc"
USING Afx

' // Create an instance of the RegExp object
DIM pDisp AS CDispInvoke = "VBScript.RegExp"
' // To check for success, see if the value returned by the DispPtr method is not null
IF pDisp.DispPtr = NULL THEN END

' // Set some properties
' // Use VARIANT_TRUE or CTRUE, not TRUE, because Free Basic TRUE is a BOOLEAN data type, not a LONG
pDisp.Put("Pattern", ".is")
pDisp.Put("IgnoreCase", VARIANT_TRUE)
pDisp.Put("Global", VARIANT_TRUE)

' // Execute a search
DIM pMatches AS CDispInvoke = pDisp.Invoke("Execute", "IS1 is2 IS3 is4")
' // Parse the collection of matches
IF pMatches.DispPtr THEN
   ' // Get the number of matches
   DIM nCount AS LONG = VAL(pMatches.Get("Count"))
   FOR i AS LONG = 0 TO nCount -1
      ' // Get a pointer to the Match object
      DIM pMatch AS CDIspInvoke = pMatches.Get("Item", i)
      IF pMatch.DispPtr THEN
         ' // Get the value of the match
         print pMatch.Get("Value")
      END IF
   NEXT
END IF

PRINT
PRINT "Press any key..."
SLEEP