I'm trying to read fairly large ASCII text files, about 2MB, and 500,000 lines or so, some will be much larger. Each line will be something like 'n677767 a453.0009 b 78.6534 c-98.5555' I need to be able to perform various manipulations on the values following each letter, then save the resultant file. I can load the file into windows notepad, and quickly manually scroll anywhere in the file - (earlier notepad versions had a size limit, afaik).
My initial intention was to read the file, a line at a time, and add into a listbox. This works, albeit quite slow, but once the list box is populated, the scroll arrows are too slow, makes it rather useless for my purpose. Any suggestions as to getting the scrolling to perform? My intention was then to parse each line in the list, and separate into individual lists of values for a, b and c, etc. and perform the modifications on each list, then recombine and save as new file.
I'm guessing, instead of putting values into a list box (delays in showing values on screen) it be quicker to not use a list box, but just a list, and somehow use a slider to present the required screen view of part of the list.
Best wishes,
Ray
See this post: http://www.planetsquires.com/protect/forum/index.php?topic=4072.msg30786#msg30786
You could use a virtual list box or virtual listview. This may be a little too complex for your needs though. You read your data into array and the virtual control only reads and displays the data it needs to display on the screen. They are extremely fast however I don't think I've seen one for freebasic. There are several powerbasic source codes available (eg. Borje's PBVlist) but it would take someone time to convert it to freebasic.
Lol, I stand corrected. :) Jose has pointed to a port done by Jim. Nice.
Thanks, I'll have a look, and most likely will have to rethink my modus operandus. iirc, some time ago, when I was playing with c#, I was able to open a file in notepad by using some sort of system call. I can't remember if I was able to get at the data from the notepad file. But, back then notepad was restricted as to the size of file it could handle.
I'm guessing, handling larger files, etc., I'd be better off using 64 bit free basic/firefly.
I've now had a chance to look at the vlist.inc, etc. seems it will do the job, played with the demo program, generated a list of one and a half million lines similar to my requirements, scrolls fast enough, etc. Think I'll need to set variable speed scroll bars. It will take me a while to get my head around how to use it in my own program, however - a lot of statements I've never used.
I have thoroughly played with Jim's example code, and I think it will be fine for displaying/handling my files. At the moment, I want to use firefly to generate the gui part of things, but I'm completely stumped on how to merge in the virtual list box, and I think I'll need a number of them. I am not into the programmatically arranging of components, much prefer the visual approach of firefly, etc. Can I somehow equate the virtual list box to the firefly listbox, or will it be more expedient to place them by manually coding, (copying the bits of the create code from the example, and the rest of the form layout in firefly?) Any help much appreciated, but I'll need to take it in 'baby steps' I expect
Yes, you will have to do this manually. #Include Jim's inc file and then copy the bits of code from his .bas file that you need in order to make it work. I would put the following listbox creation code in the form's WM_CREATE message handler:
VL_Int
hVList1 = CreateWindow ("FBVLIST",_ 'window class name
BYVAL NULL,_ 'window caption
FBVLISTSTYLES,_ 'window style
4,4,300,300,_ 'initial Position
hWnd,_ 'parent window handle
cast(hMENU,IDC_VLIST),_ 'window menu handle
hInstance,_ 'program instance handle
BYVAL NULL)
I expect that you will encounter High DPI scaling issues if you use the code on systems where the default 96 dpi is not used. It doesn't look like the code scales up the incoming size parameters in the CreateWindowEx call.
Thanks Paul. If it was easy, everybody would be doing it...
I am attaching a sample FireFly FreeBasic project that shows how to use the control within FireFly. (I use two CustomControls and tie the virtual listboxes to them so you can visually move the controls around the screen, etc).
Thanks Paul. that does it fine.
Thanks to your help and suggestions, I'm now able to process the files, longest so far was 3 million lines. It would be nice if I could use a slider or scroll bar to get to the lines I want to inspect. I can use a line number to select a particular line, but then I sometimes want to be able to have a look at the adjacent 50 or so lines, and the scroll bars at the listbox side do not have fine enough resolution for the large arrays. I see there is an up/down control in firefly, but I'm not sure if that will do what I need. I have no real idea on how it works, and my 'suck it and see' test approach crashes the program
Function FRMMAIN_UPDOWN1_CUSTOM ( _
ControlIndex as Long, _ ' index in Control Array
hWndForm as HWnd, _ ' handle of Form
hWndControl as HWnd, _ ' handle of Control
wMsg as UInteger, _ ' type of message
wParam as WPARAM, _ ' first message parameter
lParam as LPARAM _ ' second message parameter
) as Long
Dim pp as Long
pp= FF_UpDown_GetPos( hwnd_frmmain_updown1)
?pp
Function = 0 ' change according to your needs
End Function
I am looking for a control that will step through a list, at a speed/step size based on the cursor position in the control, i.e. the further away from the centre position of control, the faster the window scrolls. Is that how the updown control works?
"have a look at the adjacent 50 or so lines"
If I understand you correctly, all you have to do is to click in the page up/down free space between the thumb and the line up/down button.
Also, for a custom control listbox or a subclassed listbox, you may use the keyboard, like CTRL-Click to do custom scroll.
You could also add more stand alone scroolbar to do medium and fine scrolling...
Hi Pierre, thanks for your suggestion. What I've simply done, until I can figure out if the updown control will be better, is create a column of 8 buttons, the two centre ones moves the selected line ten lines, the next two 100 lines, the next pair 1000, the end ones 10000 lines. Not perfect, but combined with the existing scroll bar it is good enough for me, unless a better idea comes up. I could vary the step size, based on the size of the file, or other parameters- not much point in stepping 10000 lines if the file is only 500 lines long!
I've been honing what I've been trying to do, it is now sort of usable, but a couple of things I'd like to be able to change.
1) Is it possible to show a multi-line tool tip? In some instances, 128 chars does not describe what the control does, and I'm sure that in a few month's time, I'll have forgotten
2) When I select a line in the list box with the cursor, it is highlighted with a blue background and white text, and is quite readable. However, when a line is selected programmatically the line is highlighted with white text on a grey background, and is not as easy to read. It would be nice to be able to change the grey to something more contrasting.
#2 the listbox probably needs keyboard focus. Probably do a SetFocus(hwndListBox) after you load the items. This may or may not work depending on if the focus gets stolen to a different control after you load the items. If it is done on program startup (when your form is loading) you can put it at the end of the the WM_CREATE handler (worst case you may have to do a PostMessage at end of WM_CREATE and set focus in the custom handler based on a user defined message - there are examples in this forum on that approach).
#1 you should look at all of Jose's tooltip functions in his AfxCtl.inc source code file.
#2 does the job, thanks. But, it has got me thinking in other directions... Is it possible to edit the font, say background colour, for an individual item in the listbox? I've yet to look into Jose's functions
Here is a listbox demo that can display 2,147,483,646 items via a subclass to overcome SB_THUMBPOSITION and SB_THUMBTRACK 16bits limits. This LBS_OWNERDRAWFIXED listbox also use different colors and fonts for item display.
#Define JumpCompiler "<D:\Free\64\fbc.exe>"
#Define JumpCompilerCmd "<-s gui -w pedantic "D:\Free\bas\~~Default.rc">"
#Print Coded on FreeBASIC 1.05.0
#Ifdef __FB_64BIT__
#Print 64bit compiler used
#Else
#Print 32bit compiler used
#EndIf
'_____________________________________________________________________________
#define unicode
#Include Once "Windows.bi"
#Include Once "String.bi" 'Format
#Include Once "Win\ShellApi.bi"
#Define ButtonExit 101
#Define Listbox 201
#Define ListboxItemCount MAXDWORD / 2 '2,147,483,646 items
#Define AppName "Big ListBox "
Dim Shared As HINSTANCE hInstance : hInstance = GetModuleHandle(NULL)
'_____________________________________________________________________________
Sub EnumCharSet(ByRef elf As ENUMLOGFONT, ByRef ntm As NEWTEXTMETRIC, _
ByVal FontType As Long, ByVal CharSet As Long)
'Get type of character set - ansi, symbol.. a must for some fonts.
CharSet = elf.elfLogFont.lfCharSet
End Sub
'_____________________________________________________________________________
Function MakeFontEx(ByVal hDC As HDC, ByVal pwsFontName As WString Pointer, ByVal PointSize As Long, _
ByVal Angle As Long, ByVal fBold As Long, ByVal fItalic As DWORD, _
ByVal fUnderline As DWORD, ByVal StrikeThru As DWORD) As HFONT
'Create a font and return its handle. Ex: hFont = MakeFontEx(hDC, "Segoe UI", 9, 0, 0, 0, 0, 0)
Dim CharSet As LParam
Dim CyPixels As Long
If hDC = 0 Then
hDC = GetDC(HWND_DESKTOP)
CyPixels = GetDeviceCaps(hDC, LOGPIXELSY)
EnumFontFamilies(hDC, pwsFontName, Cast(FONTENUMPROCW, ProcPtr(EnumCharSet)), CharSet)
ReleaseDC(HWND_DESKTOP, hDC)
Else
CyPixels = GetDeviceCaps(hDC, LOGPIXELSY)
EnumFontFamilies(hDC, pwsFontName, Cast(FONTENUMPROCW, ProcPtr(EnumCharSet)), CharSet)
End If
PointSize = 0 - (PointSize * CyPixels) \ 72
Function = CreateFont(PointSize, 0, _ 'Height, Width (default=0),
Angle, Angle, _ 'Escapement(angle), Orientation,
fBold, _ 'Weight (%FW_DONTCARE = 0, %FW_NORMAL = 400, %FW_BOLD = 700),
fItalic, fUnderline, StrikeThru, _ 'Italic, Underline, Strike thru,
CharSet, OUT_TT_PRECIS, _ 'Char set, Output precision,
CLIP_DEFAULT_PRECIS, DEFAULT_QUALITY, _ 'ClipPrecision, Quality,
FF_DONTCARE, pwsFontName) 'Pitch and family, Typeface
End Function
'_____________________________________________________________________________
FUNCTION ListboxProc(BYVAL hListbox AS HWND, BYVAL Msg AS UINT, BYVAL wParam AS WPARAM, BYVAL lParam AS LPARAM) AS LONG
'For big Listbox, due to the way where designed, it may be difficult to scroll past the first 65,536 items.
'The 16 bit returned with SB_THUMBTRACK and SB_THUMBPOSITION set this barrier.
'A workaround for this is simply to Subclass the ListBox to intercept those messages
'to set the ScrollBar 32bit position and scroll the ListBox accordingly.
'Mostly exe using Common-Controls 6.0.0.0 seems to need this correction.
'MSDN: WM_VSCROLL message carries only 16 bits of scroll box position data in HiWord of wParam.
'Thus, applications that rely solely on WM_VSCROLL and WM_HSCROLL for
'scroll position data have a practical maximum position value of 65,535.
Dim ScrollInf AS SCROLLINFO
Static pListboxProc AS WndProc
IF Msg = WM_VSCROLL AND (wParam AND &h0000FFFE) = 4 THEN 'wParam AND ... is a quick way to test for SB_THUMBPOSITION(4) or SB_THUMBTRACK(5)
ScrollInf.cbSize = SIZEOF(SCROLLINFO)
ScrollInf.fMask = SIF_ALL OR SIF_DISABLENOSCROLL
GetScrollInfo(hListbox, SB_VERT, @ScrollInf)
ScrollInf.nPos = ScrollInf.nTrackPos
SetScrollInfo(hListbox, SB_VERT, @ScrollInf, TRUE) 'Fix scrollbar thumb position avoiding the Lisbox 16 bit engine.
SendMessage(hListbox, LB_SETTOPINDEX, ScrollInf.nTrackPos, 0) 'Scroll listbox
ELSE
IF hListbox = 0 THEN pListboxProc = Cast(WndProc, wParam) 'Auto asign pListboxProc for later in WM_NCDESTROY
IF Msg = WM_KEYDOWN And (wParam = VK_HOME Or wParam = VK_END) THEN
ScrollInf.cbSize = SIZEOF(SCROLLINFO)
ScrollInf.fMask = SIF_POS 'HOME and END scrool half the way with 2,147,483,6460
IF wParam = VK_HOME THEN ScrollInf.nPos = 0 ELSE ScrollInf.nPos = ListboxItemCount - 1
SetScrollInfo(hListbox, SB_VERT, @ScrollInf, TRUE)
SendMessage(hListbox, LB_SETTOPINDEX, ScrollInf.nPos, 0)
EXIT FUNCTION
END IF
IF Msg = WM_NCDESTROY THEN SetWindowLongPtr(hListbox, GWLP_WNDPROC, Cast(LONG_PTR, pListboxProc)) 'Unsubclass listbox
FUNCTION = CallWindowProc(pListboxProc, hListbox, Msg, wParam, lParam) 'Let the Listbox engine finish the work
END IF
END FUNCTION
'_____________________________________________________________________________
Function WndProc(ByVal hWnd As HWND, ByVal uMsg As UINT, ByVal wParam As WPARAM, ByVal lParam As LPARAM) As Integer
Static hFont As HFONT
Static hFontArialBlack As HFONT
Dim hFontGenuine As HFONT
Static hListbox As HWND
Static hButtonExit As HWND
Dim Even As LONG
Function = 0
Select Case (uMsg)
Case WM_CREATE
'Get Windows default font
Dim NotClientMetrics As NONCLIENTMETRICS
NotClientMetrics.cbSize = SizeOf(NONCLIENTMETRICS)
SystemParametersInfo(SPI_GETNONCLIENTMETRICS, NotClientMetrics.cbSize, @NotClientMetrics, 0)
hFont = CreateFontIndirect(@NotClientMetrics.lfMessageFont)
'Use a second font
hFontArialBlack = MakeFontEx(0, "Arial Black", 9, 0, True, 0, 0, 0)
'Button Exit
hButtonExit = CreateWindowEx(0, "Button", "E&xit", _
WS_CHILD Or WS_VISIBLE Or BS_CENTER Or WS_TABSTOP Or _
BS_NOTIFY Or BS_TEXT Or BS_VCENTER, _
10, 325, 205, 30, _
hWnd, Cast(HMENU, ButtonExit), _
hInstance, NULL)
SendMessage(hButtonExit, WM_SETFONT, Cast(WPARAM, hFont), TRUE)
'Listbox
hListbox = CreateWindowEx(WS_EX_CLIENTEDGE, "Listbox", "", _
WS_CHILD OR WS_VISIBLE OR WS_TABSTOP OR LBS_OWNERDRAWFIXED OR _
LBS_NODATA OR LBS_NOTIFY OR WS_TABSTOP OR WS_VSCROLL, _
10, 10, 205, 310, _
hWnd, Cast(HMENU, Listbox), _
hInstance, NULL)
SendMessage(hListbox, LB_SETCOUNT, Cast(UInteger, MAXLONG), 0) '34,463
SendMessage(hListbox, WM_SETFONT, Cast(UInteger, hFont), TRUE)
'Subclass listbox with a a one liner and no variables needed...
ListboxProc(0, 0, SetWindowLongPtr(hListbox, GWLP_WNDPROC, Cast(LONG_PTR, ProcPtr(ListboxProc))), 0)
Case WM_COMMAND
Select Case LoWord(wParam)
CASE Listbox
IF HiWord(wParam) = LBN_SELCHANGE THEN
SetWindowText(hWnd, "Index is " & STR$(SendMessage(hListbox, LB_GETCURSEL, 0, 0)))
END IF
Case ButtonExit, IDCANCEL
If HiWord(wParam) = BN_CLICKED Then
PostMessage(hWnd, WM_CLOSE, 0, 0)
Exit Function
EndIf
End Select
Case WM_DRAWITEM
If wParam = Listbox Then
Dim pDrawItemStructure AS DRAWITEMSTRUCT Pointer
pDrawItemStructure = Cast(pDRAWITEMSTRUCT, lParam)
If pDrawItemStructure->itemID <> &HFFFFFFFF THEN 'Is list is empty?
Even = (pDrawItemStructure->itemID And 1) 'A simple Odd/Even flip-flop for the demo
If (pDrawItemStructure->itemAction = ODA_DRAWENTIRE) OR (pDrawItemStructure->itemAction = ODA_SELECT) Then
If (pDrawItemStructure->itemState AND ODS_SELECTED) = False Then 'Not selected
If Even Then ' Even colors
FillRect(pDrawItemStructure->hDC, @pDrawItemStructure->rcItem, GetSysColorBrush(COLOR_ACTIVECAPTION)) ' Cls
SetBkColor(pDrawItemStructure->hDC, GetSysColor(COLOR_ACTIVECAPTION)) ' Text background
SetTextColor(pDrawItemStructure->hDC, GetSysColor(COLOR_INFOBK)) ' Text color
Else ' Odd color
FillRect(pDrawItemStructure->hDC, @pDrawItemStructure->rcItem, GetSysColorBrush(COLOR_WINDOW)) ' Cls
SetBkColor(pDrawItemStructure->hDC, GetSysColor(COLOR_WINDOW)) ' Text background
SetTextColor(pDrawItemStructure->hDC, GetSysColor(COLOR_WINDOWTEXT)) ' Text color
EndIf
Else 'Selected
FillRect(pDrawItemStructure->hDC, @pDrawItemStructure->rcItem, GetSysColorBrush(COLOR_HIGHLIGHT)) 'Cls
SetBkColor(pDrawItemStructure->hDC, GetSysColor(COLOR_HIGHLIGHT)) 'Text background
SetTextColor(pDrawItemStructure->hDC, GetSysColor(COLOR_HIGHLIGHTTEXT)) 'Text color
EndIf
'Draw some text
Dim sElement AS WSTRING * 50
sElement = "Element " & FORMAT(pDrawItemStructure->itemID, "#,")
If Even Then hFontGenuine = SelectObject(pDrawItemStructure->hDC, hFontArialBlack)
DrawText(pDrawItemStructure->hDC, Cast(LPCTSTR, STRPTR(sElement)), _
LEN(sElement), @pDrawItemStructure->rcItem, DT_SINGLELINE OR DT_LEFT OR DT_VCENTER)
If Even Then SelectObject(pDrawItemStructure->hDC, hFontGenuine)
'Draw grid lines
Dim hPen As HPEN
hPen = SelectObject(pDrawItemStructure->hDC, CreatePen(PS_SOLID, 1, GetSysColor(COLOR_3DFACE)))
MoveToEx(pDrawItemStructure->hDC, 0, pDrawItemStructure->rcItem.Bottom - 1, BYVAL NULL)
LineTo(pDrawItemStructure->hDC, pDrawItemStructure->rcItem.Right, pDrawItemStructure->rcItem.Bottom - 1)
DeleteObject(SelectObject(pDrawItemStructure->hDC, hPen))
Function = TRUE
END IF
END IF
END IF
Case WM_SIZE
If wParam <> SIZE_MINIMIZED Then
Dim As LONG SizeX = LOWORD(LPARAM)
Dim As LONG SizeY = HIWORD(LPARAM)
MoveWindow(hListbox, 10, 10, SizeX - 20, SizeY - 50, TRUE)
MoveWindow(hButtonExit, 10, SizeY - 35, SizeX - 20, 30, TRUE)
End IF
Case WM_DESTROY
DeleteObject(hFont)
DeleteObject(hFontArialBlack)
PostQuitMessage(0)
Exit Function
End Select
Function = DefWindowProc(hWnd, uMsg, wParam, lParam)
End Function
'_____________________________________________________________________________
Function WinMain(ByVal hInstance As HINSTANCE, ByVal hPrevInst As HINSTANCE, _
ByVal CmdLine As WString Ptr, ByVal CmdShow As Integer) As UINT
Dim wsAppName As WString * 128
Dim WinClass As WNDCLASS
Dim WindowSize As SIZEL
Dim hWnd As HWND
Dim wMsg As MSG
wsAppName = AppName & " - " & SizeOf(UInteger) * 8
WindowSize.cx = 240
WindowSize.cy = 390
WinClass.hIcon = ExtractIcon(hInstance, "%SystemRoot%\system32\shell32.dll", 293)
WinClass.style = CS_HREDRAW Or CS_VREDRAW
WinClass.lpfnWndProc = ProcPtr(WndProc)
WinClass.cbClsExtra = 0
WinClass.cbWndExtra = 0
WinClass.hInstance = hInstance
WinClass.hCursor = LoadCursor(NULL, IDC_ARROW)
WinClass.hbrBackground = Cast(HGDIOBJ, COLOR_BTNFACE + 1) 'Default color
WinClass.lpszMenuName = NULL
WinClass.lpszClassName = @wsAppName
If (RegisterClass(@WinClass)) Then
hWnd = CreateWindowEx(WS_EX_WINDOWEDGE, _
wsAppName, wsAppName, _
WS_OVERLAPPED OR WS_CLIPCHILDREN Or WS_DLGFRAME Or WS_BORDER Or WS_VISIBLE Or WS_CAPTION Or _
WS_MAXIMIZEBOX Or WS_MINIMIZEBOX Or WS_SIZEBOX Or WS_SYSMENU , _
(GetSystemMetrics(SM_CXSCREEN) - WindowSize.cx) / 2, _ 'PosX screen center
(GetSystemMetrics(SM_CYSCREEN) - WindowSize.cy) / 2, _ 'PosY screen center
WindowSize.cx, WindowSize.cy, _ 'SizeX, SizeY
NULL, NULL, hInstance, NULL)
ShowWindow(hWnd, SW_SHOW)
UpdateWindow(hWnd)
While GetMessage(@wMsg, ByVal NULL, 0, 0) > 0
If IsDialogMessage(hWnd, @wMsg) = 0 Then
TranslateMessage(@wMsg)
DispatchMessage(@wMsg)
End If
Wend
End If
DestroyIcon(WinClass.hIcon)
Function = wMsg.message
End Function
'_____________________________________________________________________________
End WinMain(hInstance, NULL, Command(), SW_NORMAL) 'Call main() and return the error code to the OS
'_____________________________________________________________________________
'
Thanks Pierre, Looks interesting and useful. Once I've sorted out my current bugs features I'll maybe look and see if I can incorporate some colour changes to selected text.
I've been trying to speed up the processing on the large files, and in the absence of knowing of any free profiling software, I've been using printing timer values to the console window. Anyway, I've found a major time delay is in code similar to the following
Dim aa as String
Dim lk as Long
Dim num as Double
Dim t1 as Double
Dim t2 as Double
t1=Timer
FF_ListBox_ResetContent( hwnd_frmmain_listg)
FF_ListBox_ResetContent(hwnd_frmmain_listf)
aa = ""
lk = FF_ListBox_GetCount( hwnd_frmmain_listg)
For j as Long = 0 To maxlines
If arrayg(j)="" Then Goto getnextln
aa= arrayg(j)
'check list
If lk =0 Then
FF_ListBox_AddString( hwnd_frmmain_listg,aa )
lk=1
Goto getnextln
End If
'check if there
For k as Long = 0 To lk
If aa = FF_ListBox_GetText( hwnd_frmmain_listg,k ) Then Goto getnextln
Next
FF_ListBox_AddString( hwnd_frmmain_listg,aa )
lk=lk+1
getnextln:
Next
where maxlines is 600,000 it takes about 100 seconds to process. The arrayg is sparsely populated with maybe 3 or 4 different value (but there can be many instances of an individual value), and the code goes through the list and creates another list (listg) consisting of the individual values found. Any ideas of getting a significant speed improvement (other than changing pc) . Would there be significant speed improvement, if instead of using the ff listbox listg, I temporarily stored/checked values in an array, and copied over at the end of the processing? I've already slightly reduced the time by using the lk count instead of calling the ff getcount function.
Ok, changed it around a bit....
Whereas the above code was taking 153 seconds to process for a particular file, it now takes 0.119 seconds. The arrayg should only have values between 0 and 99, so I simply entered a value in another array, dimensioned to 100. Had to initially fill said array with a negative number. fwiw I've copied a chunk of the relevant code here Dim aa as String
Dim lk as Long
Dim num as Double
Dim t1 as Double
Dim t2 as Double
t1=Timer
Dim arra (100) as Integer
Dim vg as Integer
Dim j as Integer
FF_ListBox_ResetContent( hwnd_frmmain_listg)
FF_ListBox_ResetContent(hwnd_frmmain_listf)
For j= 0 To 100
arra(j)=-9
Next j
For j = 0 To maxlines
If arrayg(j)="" Then Goto getout
vg=Val(arrayg(j) )
arra(vg)= vg
getout:
Next j
' copy to listbox
For j = 0 To 100
If arra(j) > -9 Then FF_ListBox_AddString( hwnd_frmmain_listg,Format(j,"00" ) )
Next j
t2=Timer
?t1,t2,t2-t1, "gandf just g"
aa=""
t1=Timer
I would love to help you optimize your code or use different algorithms but the code you posted is meaningless to me without context. I have no idea what is being read, why the different arrays exist, and what kind of data is in each array and their purpose. I am happy that you were able to find a solution that cut down your processing time so significantly :) :) :)
Hi Paul,
would you like a more detailed description of what I was trying to do?
OK, maybe this is more understandable - I've added a few more comments and removed items that are not relevant to the problem. arrayg and listg are defined/dimensioned outside of this snippet. It's creating a list of unique values in listg from whatever values that are unique or duplicated in arrayg
Dim aa as String 'temp string
Dim lk as Long 'size of listg
'listg is a listbox to hold a list of the unique values from arrayg
aa = ""
lk = FF_ListBox_GetCount( hwnd_frmmain_listg)
' there may already be values in listg
'maxlines can be a value 1000000 or more (the size of arrayg )
'arrayg may or may not have a value in each position.
For j as Long = 0 To maxlines
' if the value in arrayg(j) is a blank then get the next one
If arrayg(j)="" Then Goto getnextln
'set aa to the value if not blank (saves time 'cos its used again )
aa= arrayg(j)
'if listg is empty, then add aa to listg and inc list size
If lk =0 Then
FF_ListBox_AddString( hwnd_frmmain_listg,aa )
lk=1
Goto getnextln
End If
'if listg is not empty, then check if a value for aa is already there
' only add aa if value is not already in listg also inc list size
For k as Long = 0 To lk
If aa = FF_ListBox_GetText( hwnd_frmmain_listg,k ) Then Goto getnextln
Next
'not found aa in listg, so add it in now
FF_ListBox_AddString( hwnd_frmmain_listg,aa )
lk=lk+1
getnextln:
Next
If arrayg is only a few thousand items, the code runs fast enough, but when a million or so it takes much longer. The time more or less exponentially increases as more values of listg are processed , since each time a value is added it has to check the existing listg.
Not sure if that makes it any clearer...
here's the sort of thing it does
arrayg
50,10,50, ,60,50,10,10,10,00, ,70,50,50,70, ,70,10,10,10,10
listg
50,10,60,70
Hi Ray,
Here you go... this should be faster than your current code. Give it a shot and let me know if it needs to be tweaked more.
' Per Raymw's description:
' It's creating a list of unique values in listg from whatever values that are unique or duplicated in arrayg
'
' Quick internet search to find array remove duplicate code for FreeBasic
' https://www.rosettacode.org/wiki/Remove_duplicate_elements#FreeBASIC
'
' It appears that your arrays are strings so I modifed the code to work with
' string arrays instead of integer arrays. Also modified to skip array elements
' that are empty or evaluate to zero.
Sub RemoveArrayDuplicates( a() As string, b() As string)
Dim lb As Integer = LBound(a)
Dim ub As Integer = UBound(a)
If ub = -1 Then Return ' empty array
' Initialize the first element of b array to first non-zero element of a.
for i as integer = lb to ub
if val(a(i)) > 0 then
lb = i: exit for
END IF
NEXT
Redim b(lb To ub)
b(lb) = a(lb)
Dim count As Integer = 1
Dim unique As Boolean
For i As Integer = lb + 1 To ub
if val(a(i)) = 0 then continue for
unique = True
For j As Integer = lb to i - 1
If a(i) = a(j) Then
unique = False
Exit For
End If
Next j
If unique Then
b(lbound(b) + count) = a(i)
count += 1
End If
Next i
If count > 0 Then Redim Preserve b(lbound(b) To lbound(b) + count - 1)
End Sub
'arrayg
' 50,10,50, ,60,50,10,10,10,00, ,70,50,50,70, ,70,10,10,10,10
'
'listg
' 50,10,60,70
Dim arrayg(...) As string = {"50","10","50","","60","50","10","10","10","00","","70","50","50","70","","70","10","10","10","10"}
Dim listg() As string
RemoveArrayDuplicates( arrayg(), listg() )
For i As Integer = LBound(listg) To UBound(listg)
Print listg(i); " ";
' You could also load your listg listbox here...
Next
Print
Print "Press any key to quit"
Sleep
I noticed that for very large arrays (eg. 600,000) the function was very slow. I created a new set of code that works based on the assumption that data in the arrayg array can only be empty, 0, or a value between 1 and 100. The following code produces a sorted array of non-duplicate 600,000 items in less than a tenth of a second on my machine.
' Per Raymw's description:
' It's creating a list of unique values in listg from whatever values that are unique or duplicated in arrayg
'
' Because values in the arrayg array can only be empty, 0, or value from 1 to 100, we can craft
' an extremely fast routine that will identify the unique numbers in the array.
#Include "string.bi" ' only used for the Format statement used below in output message
Sub RemoveArrayDuplicates( a() As string, b() As string)
If ubound(a) = -1 Then Return ' empty array
' Because the values in b() array must equate between 1 and 100 then we can
' create an array with O(1) access time similar to a hash structure.
Dim ValidNumbers(1 to 100) as Boolean
dim as long count, nValue, i, nextSlot
' Cycle through the a() array and check to see if it is already hit in our
' valid numbers array. If not, then set that array index to true.
For i = LBound(a) To uBound(a)
' Bypass any empty or value 0 array values
nValue = val(a(i))
if nValue = 0 then continue for
' Value should be between 1 and 100 but if it is not then the assumption
' about the format of the a() array was wrong. Better check to ensure we
' don't get any array out of bounds errors.
if (nValue < 1) or (nValue > 100) then continue for
' Finally, set the hit in our ValidNumbers array
if ValidNumbers(nValue) = false then
ValidNumbers(nValue) = true
count = count + 1
end if
next
If count Then
Redim b(1 to count)
For i = LBound(ValidNumbers) To uBound(ValidNumbers)
if ValidNumbers(i) = true then
nextSlot = nextSlot + 1
b(nextSlot) = str(i)
end if
next
END IF
End Sub
'' Function to a random number in the range [first, last), or {first <= x < last}.
Function rnd_range (first As Double, last As Double) As long
Function = Rnd * (last - first) + first
End Function
Randomize
' Create a random array of 600,000 string elements
? "Creating random data array..."
reDim arrayg(599999) As string
Dim listg() As string
for i as integer = lbound(arrayg) to ubound(arrayg)
arrayg(i) = str(rnd_range(0, 100))
NEXT
?
? "Array created."
? "Calling RemoveArrayDuplicates..."
dim as single t1, t2
t1 = timer
RemoveArrayDuplicates( arrayg(), listg() )
t2 = timer
? "Time to remove duplicates for array of"; ubound(arrayg)-lbound(arrayg)+1; " elements: "; format(t2-t1, "##0.00"); " seconds."
'For i As Integer = LBound(listg) To UBound(listg)
' Print listg(i); " ";
' ' You could also load your listg listbox here...
'Next
Print
Print "Press any key to quit"
Sleep
Thanks Paul.
Your second version is very similar in principle to what I did in my second attempt, except you used a temporary boolean array, and mine was integer, and your code is neater. I was originally thinking of keeping a count of the unique 0 to 99 values in the integer array - the advantage being if I later deleted a value from the large array, then I'd only need to reduce the appropriate integer value in the integer list, but I don't think that would save any noticeable time compared to rebuilding the new integer list. Realising the values lie between 0 and 99 is the clue.
My early programming days in IBM Fortran (50 years ago, G & H iirc) trained me to use i,j,k as integers, (not often used l, m, but n always used as limit in for/next loops. e.g. for j= 0 to n) and also short variable names, less than four characters. I tend not to use the new fangled basic stuff, like while/wend and functions. :-[
I suppose I may as well post here, since it is related to lengthy list boxes, in as much I'm trying to parse a file, separating various values into lists. The original data looks something like the following
(130mm diam z 9mm deep)
N0001 G90
N0002 t32
(Begin Next Pass at -3.000)
N0003 M06 S24000
N0004 M03
N34128 G01 X-17.5362 Y-7.0818 Z-2.3281
N34129 G01 X-17.5362 Y-7.0158 Z-2.3324
N34130 G01 X-17.5362 Y-6.9498 Z-2.3345
N34131 G01 X-17.5362 Y-6.8838 Z-2.3362
N34132 G01 X-17.5362 Y-6.8178 Z-2.3400
N34133 G01 X-17.5362 Y-6.7518 Z-2.3432
N34134 G01 X-17.5362 Y-6.6858 Z-2.3470
(a million or so lines of it.) basically it's a list of x,y,z coordinates (although there are additional axes, such as a,b,c, but I'm only concerned at the moment with x,y,z,a.)
Anyway, I read the lines from the file into an array, array1. I then parse each line in array1 for the various axis values, etc., saving the results into other arrays. This takes about 86 seconds for processing 1113262 records. I'm not sure if the speed can be much improved, but obviously I would like to get it to run faster.
Sub fill() ' fills the columns by parsing array1
Dim ln as String 'line of code
Dim k as Integer 'counter
Dim t as Integer ' total chars in ln
Dim r as String 'result string if found
Dim lns as String ' rhs of line
Dim kk as Integer
Dim t1 as Double 'for timer
Dim t2 as Double 'for timer
t1=Timer
For i as Long = 0 To Maxlines 'for every line
'set lines in arrays to blank
arrayn(i)=""
arrayg(i)=""
arrayx(i)=""
arrayy(i)=""
arrayz(i)=""
arrayf(i)=""
arrayo(i)=""
arraya(i)=""
ln= Array1(i) 'array1 is populated with g-code lines
lns=ln 'lns is for right hand end of line
t = Len (ln) 'ln is the line being considered
For k = 0 To t 'go through line, character by character
lns= Right(ln,t-k)
' none of N, X, Y etc. may not be present, and can be mixed upper/lower case
' N numbers need not be sequential
If UCase(Left (lns,1))= "N" Then
arrayn(i)= Format(Val(Right(lns,(t-(k+1)))),"0000" )
Goto finnum
End If
'values are to be formatted to 4dp, usually
'arrayo holds comments -should be enclosed in ()
'arrayo also holds other codes, except N,G,X,Y,Z,A,F
If Left(lns,1)= "(" Then 'comment
arrayo(i)= lns
'get next line
Goto finline
End If
If UCase(Left (lns,1))= "G" Then
arrayg(i)= Format(Val(Right(lns,(t-(k+1)) )),"00")
Goto finnum
End If
If UCase(Left (lns,1))= "X" Then
arrayx(i)= Format(Val(Right(lns,(t-(k+1)))),"#.0000")
Goto finnum
End If
If UCase(Left (lns,1))= "Y" Then
arrayy(i)= Format(Val(Right(lns,(t-(k+1)))),"#.0000")
Goto finnum
End If
'
If UCase(Left (lns,1))= "Z" Then
arrayz(i)= Format(Val(Right(lns,(t-(k+1)))),"#.0000")
Goto finnum
End If
If UCase(Left (lns,1))= "A" Then
arraya(i)= Format(Val(Right(lns,(t-(k+1)))),"#.0000")
Goto finnum
End If
'
If UCase(Left (lns,1))= "F" Then
arrayf(i)= Format(Val(Right(lns,(t-(k+1)))),"#.0000")
Goto finnum
End If
'
finnum:
Next
'arrayo also holds other codes, except N,G,X,Y,Z,A,F
'check for m, t, etc
kk= InStr (LCase(ln),Any "bcdehijklmopqrstuvw" )
If kk>0 Then arrayo(i)=" "+Right(ln, t-kk+1)
'check for special characters
kk= InStr (ln,Any "!£$%^&*{}:@~#:<>?" )
If kk>0 Then arrayo(i)=" "+Right(ln, t-kk+1)
finline:
Next ' get next line
t2=Timer
?t1,t2,t2-t1 ,"b4 assign" 'takes 86 seconds to get to here - ignore the following wrt timing
' Assign the string arrays to the listboxes.
SendMessage( HWND_FRMMAIN_code, VL_SETARRAY, Cast(WPARAM, VarPtr(Array1(0))) , Cast(LPARAM, UBound(Array1)) )
SendMessage( HWND_FRMMAIN_VLISTn, VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayn(0))) , Cast(LPARAM, UBound(Arrayn)) )
SendMessage( HWND_FRMMAIN_VLISTg,VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayg(0))) , Cast(LPARAM, UBound(Arrayg)) )
SendMessage( HWND_FRMMAIN_VLISTx,VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayx(0))) , Cast(LPARAM, UBound(Arrayx)) )
SendMessage( HWND_FRMMAIN_VLISTy,VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayy(0))) , Cast(LPARAM, UBound(Arrayy)) )
SendMessage( HWND_FRMMAIN_VLISTz,VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayz(0))) , Cast(LPARAM, UBound(Arrayz)) )
SendMessage( HWND_FRMMAIN_VLISTf, VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayf(0))) , Cast(LPARAM, UBound(Arrayf)) )
SendMessage( HWND_FRMMAIN_VLISTo, VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayo(0))) , Cast(LPARAM, UBound(Arrayo)) )
SendMessage( HWND_FRMMAIN_VLISTa, VL_SETARRAY, Cast(WPARAM, VarPtr(Arraya(0))) , Cast(LPARAM, UBound(Arraya)) )
' Refresh the listboxes so that the changes can visibly be seen.
SendMessage( HWND_FRMMAIN_code, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTn, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTg, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTx, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTy, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTz, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTf, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTo, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTa, VL_REFRESH, 0, 0)
gandf()
hidew()
t2=Timer
?t1,t2,t2-t1 ,"ready to go" 'Note, this time is not significantly greater than time to 'b4 assign'
End Sub
I'm not sure if my possibly bad coding habits impinge on performance. Somewhere, in the distant past, I read that to exit a loop you increased the counter, I tend to use a goto.
For code such as arrayn(i)= Format(Val(Right(lns,(t-(k+1)))),"0000" ) is the processing quicker if I separate the calculations, e.g. v=t-(k+1), b= val(right,v), and finally arrayn(i)= format(b,"0000")?
I vaguely remember the myths that said always use a variable instead of a number, and some of the various nonsense/rules wrt sorting, which have since been disproved, but I've no idea as to what is currently supposed to be best practice.
Maybe, instead of the for/next going through each line, a character at a time, checking for x,y,z, etc., it may be quicker to use instr(array1(i),"X"), etc. for each wanted character on the whole line- it would save testing the numbers. Also, whether using Ucase would be quicker than testing for 'X' and 'x' separately. and so on...
edit to add- just thought of
Using Instr would need some thinking as to how to handle comments () which can be at the end of lines, or be the whole line, and can contain words that have x, y,z etc in there. Not insurmountable, but would most likely take a while.
I have no idea if the code below I wrote will even compile.
Obviously I have no way to test it without your data, etc.
If it actually works, it should significantly speed up the parsing of your data.
function GetDataToSpaceChr( DestArray() as string, byval nLineNum as long, byval nStartPos as long) as long
' Get the characters from the current position up to the first blank space or end of line.
' Return the new line parsing offset so the line can continue to be parsed.
dim as string tempString
for i as long = nStartPos to len(Array1(nLineNum)) - 1
if Array1(nLineNum)[i] = 32 ' space character
exit for
else
tempString = tempString & chr(Array1(nLineNum)[i])
end if
NEXT
DestArray(nLineNum) = tempString
function = i
END FUNCTION
Sub fill() ' fills the columns by parsing array1
Dim ln as String ' line of code
Dim k as Integer ' counter
Dim t as Integer ' total chars in ln
Dim r as String ' result string if found
Dim lns as String ' rhs of line
Dim kk as Integer
'(130mm diam z 9mm deep)
'N0001 G90
'N0002 t32
'(Begin Next Pass at -3.000)
'N0003 M06 S24000
'N0004 M03
'N34128 G01 X-17.5362 Y-7.0818 Z-2.3281
'N34129 G01 X-17.5362 Y-7.0158 Z-2.3324
'N34130 G01 X-17.5362 Y-6.9498 Z-2.3345
'N34131 G01 X-17.5362 Y-6.8838 Z-2.3362
'N34132 G01 X-17.5362 Y-6.8178 Z-2.3400
'N34133 G01 X-17.5362 Y-6.7518 Z-2.3432
'N34134 G01 X-17.5362 Y-6.6858 Z-2.3470
dim as long lb = lbound(Array1)
dim as long ub = ubound(Array1)
Dim as double t1, t2 ' for timer
t1 = Timer
' Read an process every line in Array1()
For i as Long = lb To ub
' Setting the lines to blank for each arry is not necessary. FreeBASIC already
' initializes each string element in the arrays to blank.
' arrayn(i)=""
' arrayg(i)=""
' arrayx(i)=""
' arrayy(i)=""
' arrayz(i)=""
' arrayf(i)=""
' arrayo(i)=""
' arraya(i)=""
' Is reassigning the Array(i) string to a temporary string necessary? It
' seems like a redundant step.
' ln= Array1(i) 'array1 is populated with g-code lines
' lns=ln 'lns is for right hand end of line
' t = Len (ln) 'ln is the line being considered
For k = 0 To len(Array1(i)) - 1 ' go through line, character by character
' Let's use string pointer indexing because it is super cool and much easier
' to use that in other BASIC's. It also means we don't have to use expensive
' operations line UCASE, LEFT, MID, blah blah blah
' Format() is a slow FB function as well. Might want to consider not using it
' or replace with a faster locally developed version. It appears that your data
' is already formated to the correct number of digits so the benefit of Format
' seems a little redundant.
' Also, let's use a Select Case As Const so that a super fast jump table is
' created for comparisons.
select case as const Array1(i)[k]
CASE 78, 110 ' N n
' Fill arrayn() with data until we reach the first space of end of line
k = GetDataToSpaceChr(arrayn(), i, k)
CASE 71, 103 ' G g
k = GetDataToSpaceChr(arrayg(), i, k)
CASE 88, 120 ' X x
k = GetDataToSpaceChr(arrayx(), i, k)
CASE 89, 121 ' Y y
k = GetDataToSpaceChr(arrayy(), i, k)
CASE 90, 122 ' Z z
k = GetDataToSpaceChr(arrayy(), i, k)
CASE 65, 97 ' A a
k = GetDataToSpaceChr(arraya(), i, k)
CASE 70, 102 ' F f
k = GetDataToSpaceChr(arraya(), i, k)
CASE 40 ' ( comment to end of line
k = GetDataToSpaceChr(arrayo(), i, k)
exit for ' exit out of processing the current line and start on next line
END SELECT
Next
' I wasn't sure what the following lines meant so I didn't add any
' parsing logic for them?
' 'arrayo also holds other codes, except N,G,X,Y,Z,A,F
' 'check for m, t, etc
' kk = InStr (LCase(ln),Any "bcdehijklmopqrstuvw" )
' If kk>0 Then arrayo(i)=" "+Right(ln, t-kk+1)
' 'check for special characters
' kk = InStr (ln,Any "!£$%^&*{}:@~#:<>?" )
' If kk>0 Then arrayo(i)=" "+Right(ln, t-kk+1)
Next ' get next line
t2 = Timer
? t1,t2,t2-t1 ,"b4 assign" 'takes 86 seconds to get to here - ignore the following wrt timing
' Assign the string arrays to the listboxes.
SendMessage( HWND_FRMMAIN_code, VL_SETARRAY, Cast(WPARAM, VarPtr(Array1(0))) , Cast(LPARAM, UBound(Array1)) )
SendMessage( HWND_FRMMAIN_VLISTn, VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayn(0))) , Cast(LPARAM, UBound(Arrayn)) )
SendMessage( HWND_FRMMAIN_VLISTg,VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayg(0))) , Cast(LPARAM, UBound(Arrayg)) )
SendMessage( HWND_FRMMAIN_VLISTx,VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayx(0))) , Cast(LPARAM, UBound(Arrayx)) )
SendMessage( HWND_FRMMAIN_VLISTy,VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayy(0))) , Cast(LPARAM, UBound(Arrayy)) )
SendMessage( HWND_FRMMAIN_VLISTz,VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayz(0))) , Cast(LPARAM, UBound(Arrayz)) )
SendMessage( HWND_FRMMAIN_VLISTf, VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayf(0))) , Cast(LPARAM, UBound(Arrayf)) )
SendMessage( HWND_FRMMAIN_VLISTo, VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayo(0))) , Cast(LPARAM, UBound(Arrayo)) )
SendMessage( HWND_FRMMAIN_VLISTa, VL_SETARRAY, Cast(WPARAM, VarPtr(Arraya(0))) , Cast(LPARAM, UBound(Arraya)) )
' Refresh the listboxes so that the changes can visibly be seen.
SendMessage( HWND_FRMMAIN_code, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTn, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTg, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTx, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTy, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTz, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTf, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTo, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTa, VL_REFRESH, 0, 0)
gandf()
hidew()
t2 = Timer
? t1,t2,t2-t1 ,"ready to go" 'Note, this time is not significantly greater than time to 'b4 assign'
End Sub
Thanks Paul, I'll have a look later (or earlier, it's 2am here...)
The function getdatatospacechr() would need modifying a tad, (maybe to getdatatonotanumber() since there is not always a space where expected (I did mention the code was something like...) lines could be x-55z900.83y0a4524f500(and a comment at the end) or x -45A 67.8 and I'm not sure at the moment if there would be a speed improvement over checking every character in the line, but the pointer indexing concept looks useful. I'll try it and see.
(Once this is sorted, I'll be attempting to draw pictures, 3d views etc. That'll be fun... :-\
Quote from: raymw on June 24, 2018, 10:43:24 PM
The function getdatatospacechr() would need modifying a tad, (maybe to getdatatonotanumber() since there is not always a space where expected (I did mention the code was something like...) lines could be x-55z900.83y0a4524f500(and a comment at the end) or x -45A 67.8
That should be easy enough... just check for the first non-numeric character or end of line.
Quote...and I'm not sure at the moment if there would be a speed improvement over checking every character in the line, but the pointer indexing concept looks useful.
But your code is already checking every character in the line albeit indirectly by creating temporary strings using RIGHT() in the "Format(Val(Right(lns,(t-(k+1)))),"#.0000") " type of code. The whole idea of the parsing is to eliminate as much as possible the creation of additional strings and utilizing unnecessary PB string commands. String pointer indexing is super fast - much faster than MID(). It is also easier to use than PB's method of creating a byte pointer to the string. :)
looking good, so far. Thanks Paul, I dropped your code into my program, ran it (had to do a few minor corrections, as you said you'd not compiled it) and it ran the routine in 3.64 seconds, instead of the previous 83! Now I need to modify it as mentioned to get it to work with other variations of data and sort out the comments (). I was not aware of the 'exit for', which tidies things up a bit.
Well I've got it working as I want, but probably could be faster. It takes 14 seconds, compared to 83 before. I think that is tolerable for the large files. I had to use Format, to take care of the different number of decimal places, but I'm not sure if I could speed up that process. The biggest improvement was using the string pointer indexing.
The code that puzzled you has to be included, or rather something that does the same job i.e, find odd code in array1, and shove it into arrayo.
kk = InStr (LCase(array1(i)),Any "bcdehijklmopqrstuvw" )
If kk>0 Then arrayo(i)=" "+Right(array1(i),Len (array1(i))-kk+1)
it's to take care of all the other values that may or may not be present in a line of G-code e.g. M, I, J etc, (for example, a line such as g2 x15 y-29 i34 j5 ) but such lines are often infrequent and they do not deserve a separate list. That bit of code takes 4 seconds for the million line file. I altered your function, to allow the passing of a formatting string
I'm not sure if this area can be easily improved wrt speed. Would it be faster in 64bit, or if the function was inlined?
Function GetDataToarrays( DestArray() as String, ByVal nLineNum as Long, ByVal nStartPos as Long, ByVal fs as String) as Long
' Get the characters from the current position up to the first blank space or end of line.
' Return the new line parsing offset so the line can continue to be parsed.
Dim as String tempString
Dim i as Long
For i = nStartPos To Len(Array1(nLineNum)) - 1
If Array1(nLineNum)[i] = 32 Then ' space character
Exit For
Else
tempString = Format(Val(Right(array1(nlinenum),Len (array1(nlinenum))-nstartpos-1)),fs)
Exit For
End If
Next
DestArray(nLineNum) = tempString
Function = i
End Function
' ------------------------------------------------------
Sub fill() ' fills the columns by parsing array1
Dim ln as String ' line of code
Dim k as Integer ' counter
Dim t as Integer ' total chars in ln
Dim r as String ' result string if found
Dim lns as String ' rhs of line
Dim kk as Integer
'(130mm diam z 9mm deep)
'N0001 G90
'N0002 t32
'(Begin Next Pass at -3.000)
'N0003 M06 S24000
'N0004 M03
'N34128 G01 X-17.5362 Y-7.0818 Z-2.3281
'N34129 G01 X-17.5362 Y-7.0158 Z-2.3324
'N34130 G01 X-17.5362 Y-6.9498 Z-2.3345
'N34131 G01 X-17.5362 Y-6.8838 Z-2.3362
'N34132 G01 X-17.5362 Y-6.8178 Z-2.3400
'N34133 G01 X-17.5362 Y-6.7518 Z-2.3432
'N34134 G01 X-17.5362 Y-6.6858 Z-2.3470
Dim as Long lb = LBound(Array1)
Dim as Long ub = UBound(Array1)
Dim fs as String 'format string
fs="#.0000"
Dim as Double t1, t2 ' for timer
t1 = Timer
' Read an process every line in Array1()
For i as Long = lb To ub
For k = 0 To Len(Array1(i)) - 1 ' go through line, character by character
' Let's use string pointer indexing because it is super cool and much easier
' to use that in other BASIC's. It also means we don't have to use expensive
' operations line UCASE, LEFT, MID, blah blah blah
' Format() is a slow FB function as well. Might want to consider not using it
' or replace with a faster locally developed version. It appears that your data
' is already formated to the correct number of digits so the benefit of Format
' seems a little redundant.
' Also, let's use a Select Case As Const so that a super fast jump table is
' created for comparisons.
Select Case as Const Array1(i)[k]
Case 40 ' ( comment - get out of way first in case x,y,z etc inside
arrayo(i)=Right(array1(i),Len (array1(i))-k)
Goto nextline
Case 78, 110 ' N n
' Fill arrayn() with data
k = GetDataToarrays(arrayn(), i, k,"0000")
Case 71, 103 ' G g
k = GetDataToarrays(arrayg(), i, k,"00")
Case 88, 120 ' X x
k = GetDataToarrays(arrayx(), i, k,fs)
Case 89, 121 ' Y y
k = GetDataToarrays(arrayy(), i, k,fs)
Case 90, 122 ' Z z
k = GetDataToarrays(arrayz(), i, k,fs)
Case 65, 97 ' A a
k = GetDataToarrays(arraya(), i, k,fs)
Case 70, 102 ' F f
k = GetDataToarrays(arraya(), i, k,fs)
End Select
'anything else goes in comments
Next
'from here
'arrayo also holds other codes, except N,G,X,Y,Z,A,F
' 'check for m, t, etc
kk = InStr (LCase(array1(i)),Any "bcdehijklmopqrstuvw" )
If kk>0 Then arrayo(i)=" "+Right(array1(i),Len (array1(i))-kk+1)
' 'check for special characters
kk = InStr (array1(i),Any "!£$%^&*{}:@~#:<>?" )
If kk>0 Then arrayo(i)=" "+Right(array1(i),Len (array1(i))-kk+1)
' to here takes about 4 seconds for my million line test file
nextline:
Next ' get next line
t2 = Timer
? t1,t2,t2-t1 ,"b4 assign" 'now 14 seconds....
' Assign the string arrays to the listboxes.
SendMessage( HWND_FRMMAIN_code, VL_SETARRAY, Cast(WPARAM, VarPtr(Array1(0))) , Cast(LPARAM, UBound(Array1)) )
SendMessage( HWND_FRMMAIN_VLISTn, VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayn(0))) , Cast(LPARAM, UBound(Arrayn)) )
SendMessage( HWND_FRMMAIN_VLISTg,VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayg(0))) , Cast(LPARAM, UBound(Arrayg)) )
SendMessage( HWND_FRMMAIN_VLISTx,VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayx(0))) , Cast(LPARAM, UBound(Arrayx)) )
SendMessage( HWND_FRMMAIN_VLISTy,VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayy(0))) , Cast(LPARAM, UBound(Arrayy)) )
SendMessage( HWND_FRMMAIN_VLISTz,VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayz(0))) , Cast(LPARAM, UBound(Arrayz)) )
SendMessage( HWND_FRMMAIN_VLISTf, VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayf(0))) , Cast(LPARAM, UBound(Arrayf)) )
SendMessage( HWND_FRMMAIN_VLISTo, VL_SETARRAY, Cast(WPARAM, VarPtr(Arrayo(0))) , Cast(LPARAM, UBound(Arrayo)) )
SendMessage( HWND_FRMMAIN_VLISTa, VL_SETARRAY, Cast(WPARAM, VarPtr(Arraya(0))) , Cast(LPARAM, UBound(Arraya)) )
' Refresh the listboxes so that the changes can visibly be seen.
SendMessage( HWND_FRMMAIN_code, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTn, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTg, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTx, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTy, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTz, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTf, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTo, VL_REFRESH, 0, 0)
SendMessage( HWND_FRMMAIN_VLISTa, VL_REFRESH, 0, 0)
gandf()
hidew()
t2 = Timer
? t1,t2,t2-t1 ,"ready to go" 'Note, this time is not significantly greater than time to 'b4 assign'
End Sub
Thanks again for your help, this is now becoming useful, at least to me.
Hi Ray,
I am not sure that this line in GetDataToSpaceChr is what you should be using:
tempString = Format(Val(Right(array1(nlinenum),Len (array1(nlinenum))-nstartpos-1)),fs)
Exit For
Basically, you're losing the whole benefit of the byte pointer scanning the line. Maybe try the code below - it scans the line up to the first non-numeric character and then exits the For/Next. Finally, the resulting string is formatted using your specified mask.
function GetDataToSpaceChr( DestArray() as string, _
byval nLineNum as long, _
byval nStartPos as long,
ByVal fs as String _
) as Long
' Get the characters from the current position up to the first non-zero character or end of line.
' Return the new line parsing offset so the line can continue to be parsed. We start scanning
' from the position immediately following the found "N", "X", "Y", "Z", etc.
dim as string tempString
for i as long = nStartPos + 1 to len(Array1(nLineNum)) - 1
select case Array1(nLineNum)[i]
CASE 45, 46, 48 to 57 ' - . 0-9
tempString = tempString & chr(Array1(nLineNum)[i])
case else
exit for
end select
NEXT
DestArray(nLineNum) = format(val(tempString), fs)
function = i
END FUNCTION
I haven't looked at the other changes you made but I see that you are using a combination of INSTR, RIGHT, LCASE which will defeat the byte scanning as well. If I get time later I'll try looking at those areas as well.
Hi Paul, I copied your function above, and replaced mine (the editor comment/uncomment block is very useful), but I found your version was slower :o. I did comparisons on a couple of data files. The million or so lines, my time 14secs, yours 19. For 820000 line file mine 10 yours 14.5sec . I retested a few times, got the same results. Seems like the fb Val, is quite fast.
latest version, in 64 bit, 11.5 seconds for the million or so file. Having taken about 8hours testing, timing, etc, then I'll need to run the program about 30,000 times to recover that time.... :'(
That's awesome that the built-in FB functions are speeding up the process. Cool. :)
As an aside, make sure that you're always aware of the fact that the byte pointer approach is zero based whereas the FB functions are one based when dealing with character positions.
not wanting to drag it out more than necessary, but I wrote a simple ff/fb test (printing times to console) #Include "string.bi"
Dim t4 as Double
Dim t1 as Double
Dim t2 as Double
Dim t3 as Double
Dim vv as Double
Dim tempstring as String
Dim maxlines as Long
Dim arrayg as String
arrayg="2000 anything)" 'random? string starting with number
maxlines =10000000
t1=Timer
For i as Long =0 To maxlines 'time empty loop
Next
t2=Timer
t3=t2-t1
?t1,t2,t3, "empty loop"
t1=Timer
For i as Long =0 To maxlines
vv= Val(arrayg) 'test setting a tempstring to a val of a string
tempstring= Str(vv)
Next
t2=Timer
t3=t2-t1
?t1,t2,t3 , "val"
For i as Long =0 To maxlines
tempstring=""
For j as Long = 0 To Len(Arrayg) - 1 'test using pointers to numbers
Select Case Arrayg[j]
Case 45, 46, 48 To 57 ' - . 0-9
tempString = tempString & Chr(Arrayg [j])
Case Else
Exit For
End Select
Next
Next
t2=Timer
t3=t2-t1
?t1,t2,t3 , "val2"
'--------------------------------------------------------------------------------
The approximate times I got for 32 bit compiler were 7.4secs and 15.3 sec, the fb val function being more or less twice as fast.
For 64bit compiler times were 6.3sec and 12.6 sec - a definite speed improvement.
I suppose I ought to vary the arrayg string length, number of numbers, etc., but maybe a tad ocd.
In my quest for speed, I wondered if it was faster for code to be in-line, instead of calling functons. I wrote another little timing program Function sumup (a as Double, b as Double) as Double
Dim c as Double
c=a+b
Function = c
End Function
'---------------------------------------------
Sub addup (a as Double,b as Double )
d=a+b
End Sub
'---------------------------------------------
Dim t1 as Double
Dim t2 as Double
Dim t3 as Double
Dim a as Double
Dim b as Double
Dim c as Double
Dim Shared d as Double
Dim maxlines as Long
maxlines =100000000
t1=Timer
For i as Long =0 To maxlines 'time empty loop
Next
t2=Timer
t3=t2-t1
?t1,t2,t3, "empty loop"
a=100
b=200
t1=Timer
For i as Long =0 To maxlines
'test calling function
c=sumup(a,b)
c=sumup(a,b)
c=sumup(a,b)
Next
t2=Timer
t3=t2-t1
?t1,t2,t3 , "by function"
?c
a=100
b=200
t1=Timer
For i as Long =0 To maxlines
'test calling subroutine
addup(a,b)
addup(a,b)
addup(a,b)
Next
t2=Timer
t3=t2-t1
?t1,t2,t3 , "by subroutine"
?d
a=100
b=200
For i as Long =0 To maxlines
c = a + b
c = a + b
c = a + b
Next
t2=Timer
t3=t2-t1
?t1,t2,t3 , "by inline"
?c
and from my not extensive testing, it appears that the fastest is to use a subroutine, passing the result back as a share. Of course, I expect the results in the real world will vary depending on the complexity of the calculations, but the overhead in calling a subroutine a number of times appears to be much less than I thought. Maybe I'll play around a bit more with shared values, and passing byref or byval. Mind you, there is always the chance that I've written something daft.
edited to add - I've since added a couple more timing loops, for subroutine and function using shared variables. for 100,000,000 iterations, function with shares 1.5 secs, subroutine with shares 0.13 secs, in line 0.32 secs .I think I've got some code re-writing to do.