Hi all,
Reading data from a text file is easy, but how does one go about extracting usable information from a file that has no structure at all?
There are no delimited lines or anything else one can parse into fields.
I have a report that sometimes has hundreds of serial numbers on it, and I want to see if I can find some miraculous way to read this file and extract only what I need.
I have attached the file in case anyone can think of a way to read it into parsed fields.
I have been trying various delimiters, but without much success.
Well, luckily it turned out that all the different products have 14-digit serial numbers.
But it's still darn hard to detect certain strings, even with INSTR, in order to read the different product codes.
But at least in my trial I have managed to filter out the serial numbers.
You can use REGEXPR. It's pretty powerful and quite fast.
Function PBMain () As Long
    GetSerials("Cons1.TXT")
End Function

Sub GetSerials(FileSpec As String)
    Local sText As String          ' entire file contents
    Local SerNo As String          ' collected serial numbers, one per line
    Local iPos, iLen As Long

    Open FileSpec For Binary As #1 ' open in binary mode
    Get$ #1, Lof(1), sText         ' read it all
    Close #1

    ' Note: your example showed each SerNo in duplicate because the form
    ' had "SerNo TO SerNo", and in each case they were the same. I'm only
    ' grabbing the one before the TO, but still not checking for duplicates.
    iPos = 1
    Do
        ' "+" (one or more) rather than "*", so a bare " TO" cannot match
        RegExpr "[0-9A-F]+ TO" In sText At iPos To iPos, iLen
        If iPos = 0 Then Exit Do   ' no more matches
        SerNo += Mid$(sText, iPos, iLen - 3) & $CrLf   ' drop the trailing " TO"
        iPos += iLen               ' continue after this match
    Loop

    MsgBox SerNo                   ' list them
End Sub
By the way, the regular expression in my example looks for a run of contiguous characters (of any length) made up of digits (0-9) or upper-case letters (A-F), followed by ' TO'. Your sample file works with this regular expression, but you may find situations where it would need to be modified. It would match 'AB TO' in 'TAB TO COL 10', for instance.
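One way to guard against short false matches, such as the one in 'TAB TO COL 10', is to check the length of each match, since the serial numbers in this report all turned out to be 14 characters long. What follows is only a sketch under that assumption and has not been run against the real file. The name UniqueSerials is made up for illustration, and duplicates are skipped with a plain INSTR check against the list built so far.

```powerbasic
' Sketch only: UniqueSerials is a hypothetical helper name.
' Assumes every serial number is exactly 14 characters of 0-9/A-F
' and is followed by " TO", as in the sample report.
Function UniqueSerials(ByVal sText As String) As String
    Local sList, sOne As String
    Local iPos, iLen As Long

    iPos = 1
    Do While iPos <= Len(sText)
        RegExpr "[0-9A-F]+ TO" In sText At iPos To iPos, iLen
        If iPos = 0 Then Exit Do                 ' no further match
        If iLen = 17 Then                        ' 14 characters + " TO"
            sOne = Mid$(sText, iPos, 14) & $CrLf
            ' Prefixing $CrLf brackets every entry with line breaks,
            ' so only whole entries can match.
            If InStr($CrLf & sList, $CrLf & sOne) = 0 Then sList += sOne
        End If
        iPos += iLen                             ' resume after this match
    Loop
    Function = sList
End Function
```

With the file already read into a string, something like MsgBox UniqueSerials(sText) would list each serial number once. If some products turn out to use a different length, the test on iLen is the part to change.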
David
Hi David
Thanks a million, I didn't even know about this command.
I will most definitely go and try this.
-Thanks, Peter
Thanks David, that works perfectly.
Now I will experiment with that and see what else I can retrieve, such as order numbers.
The system can export to Excel and CSV (now they tell me), but I am now curious to what extent I can play with this.
Thanks a million.
It's in the PB help file. There are many other sources on regular expressions out there as well, and each implementation has a slightly different 'dialect'.
There used to be an online REGEXPR tester at PowerBasic.com. It seems to be missing now. Kevin Voell wrote one in PB as well, and it's still available in the forums. You can supply a sample file to search, and it will show you what your current expression will find. Then you just copy the expression into your code.
It gets easier after looking at it for a while and trying things. Very powerful.
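If a tester is not at hand, the same try-it-and-see approach can be roughed out in a few lines. This is only a minimal sketch, not the tester mentioned above: ShowMatches is a hypothetical helper name, and the sample text in PBMain is invented for illustration. It lists every match of a pattern together with its starting position.

```powerbasic
' Sketch only: ShowMatches is a hypothetical helper, not part of PB.
Function ShowMatches(ByVal sMask As String, ByVal sText As String) As String
    Local sOut As String
    Local iPos, iLen As Long

    iPos = 1
    Do While iPos <= Len(sText)
        RegExpr sMask In sText At iPos To iPos, iLen
        If iPos = 0 Then Exit Do                  ' nothing more to find
        sOut += Format$(iPos) & ": " & Mid$(sText, iPos, iLen) & $CrLf
        ' Step past the match; move one character if the match was empty
        If iLen = 0 Then Incr iPos Else iPos += iLen
    Loop
    Function = sOut
End Function

Function PBMain () As Long
    ' Invented sample text, just to see what the pattern picks up
    MsgBox ShowMatches("[0-9A-F]+ TO", "1234 TO 1234" & $CrLf & "TAB TO COL 10")
End Function
```

Changing the pattern string and rerunning shows quickly whether an expression picks up more or less than intended.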