PlanetSquires Forums

Author Topic: Reading data from a text file  (Read 137 times)

Petrus Vorster

  • FireFly3 Registered User
  • Senior Member
  • *
  • Posts: 392
Reading data from a text file
« on: July 25, 2018, 04:25:49 PM »

Hi all,

Reading data from a text file is easy, but how does one go about getting workable information from a file that has NO structure at all?
There are no delimited lines or anything else one can parse.

I have a report that sometimes has hundreds of serial numbers in it, and I want to see if I can find some way to read this file and extract only what I need.
I have attached the file in case anyone can think of a way to read it into parsed fields.

I have been trying various delimiters, but without much success.
Logged

Petrus Vorster

  • FireFly3 Registered User
  • Senior Member
  • *
  • Posts: 392
Re: Reading data from a text file
« Reply #1 on: July 25, 2018, 05:54:56 PM »

Well, luckily it turned out that all the different products have 14-digit serial numbers.
But it's still darn hard to detect certain strings, even with INSTR, when reading the different product codes.

But at least I have managed to filter out the serial numbers.
Logged

David Kenny

  • FireFly3 Registered User
  • Senior Member
  • *
  • Posts: 444
  • Windows 7
Re: Reading data from a text file
« Reply #2 on: July 26, 2018, 06:20:35 AM »

You can use REGEXPR. It's pretty powerful and quite fast.

Code: [Select]
FUNCTION PBMAIN () AS LONG
    GetSerials("Cons1.TXT")
End Function                           

Sub Getserials(FileSpec as string)
    Local Str           As String
    Local SerNo         As String           
    Local iPos, iLen    As Long
    Open FileSpec For Binary As 1               'Open in binary mode
    Get$ #1, Lof(1),Str                         'Read it all
    Close 1
    Do                                                                   
        RegExpr "[0-9A-F]* TO" In Str At iPos + iLen To iPos , iLen      'Note: Your example showed each SerNo in duplicate because the
        SerNo += Mid$(Str, iPos, iLen - 3) & $CrLf                       '      form had SerNo TO SerNo and in each case they were the same.
    Loop Until iLen = 0                                                  '      I'm only grabbing the one before the TO but still not checking
                                                                         '      for duplicates
    MsgBox SerNo                                'List them
End Sub
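
The comments above mention that duplicates are not checked. One possible way to skip them is sketched below. AddUnique is a hypothetical helper, not part of the original example, and it assumes the list is kept as $CrLf-terminated entries, as the SerNo string above is.

```powerbasic
' Hypothetical helper (not in the original post): append sItem to sList
' only if it is not already there. Every entry in sList ends with $CrLf.
Sub AddUnique(ByRef sList As String, ByVal sItem As String)
    If Len(sItem) = 0 Then Exit Sub                 ' ignore empty items
    ' Wrap the item in $CrLf on both sides so that a short item cannot
    ' match in the middle of a longer entry.
    If InStr($CrLf & sList, $CrLf & sItem & $CrLf) = 0 Then
        sList += sItem & $CrLf
    End If
End Sub
```

Inside the loop, the line that appends to SerNo would then become something like `AddUnique SerNo, Mid$(Str, iPos, iLen - 3)`. INSTR rescans the whole list on every call, which should be fine for a few hundred serial numbers but would be slow for very large reports.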

By the way, the regular expression in my example looks for any run of contiguous characters (of any length) made up of digits (0-9) or upper-case letters (A-F), followed by ' TO'. Your sample file works with this regular expression, but you may find situations where it would need to be modified. It would match 'B TO' in 'TAB TO COL 10', for instance.
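
Since Reply #1 says every serial number is 14 characters long, one way to avoid false hits like 'B TO' is to ask for exactly 14 matching characters before ' TO'. The following is only a sketch under that assumption: GetSerials14 is a made-up name, and the mask is built with REPEAT$ rather than relying on a repeat-count operator that the PB regular-expression dialect may not have.

```powerbasic
' Sketch only: assumes every serial number is exactly 14 characters
' from the set 0-9 / A-F and is followed by " TO" (see Reply #1).
Sub GetSerials14(FileSpec As String)
    Local sText, sOut, sMask As String
    Local iPos, iLen, iStart As Long

    Open FileSpec For Binary As #1                  ' Open in binary mode
    Get$ #1, Lof(1), sText                          ' Read it all
    Close #1

    sMask  = Repeat$(14, "[0-9A-F]") & " TO"        ' 14 character classes, then " TO"
    iStart = 1
    Do
        RegExpr sMask In sText At iStart To iPos, iLen
        If iPos = 0 Then Exit Do                    ' No further match
        sOut += Mid$(sText, iPos, 14) & $CrLf       ' Keep only the serial number
        iStart = iPos + iLen                        ' Continue after this match
    Loop
    MsgBox sOut                                     ' List them
End Sub
```

Testing iPos right after REGEXPR also avoids appending anything once no more matches are found. A longer run of such characters would still match on its last 14 characters, so this narrows the pattern rather than making it bulletproof.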

David
« Last Edit: July 26, 2018, 06:40:50 AM by David Kenny »
Logged

Petrus Vorster

  • FireFly3 Registered User
  • Senior Member
  • *
  • Posts: 392
Re: Reading data from a text file
« Reply #3 on: July 26, 2018, 01:33:04 PM »

Hi David

Thanks a million, I didn't even know about this command.
I will most definitely go and try this.

-Thanks, Peter
Logged

Petrus Vorster

  • FireFly3 Registered User
  • Senior Member
  • *
  • Posts: 392
Re: Reading data from a text file
« Reply #4 on: July 26, 2018, 01:43:07 PM »

Thanks David, that works perfectly.
Now I will experiment with that and see what else I can retrieve, like order numbers etc.

The system can export to Excel and CSV (now they tell me), but I am now curious to what extent I can play with this.

Thanks a million.
Logged

David Kenny

  • FireFly3 Registered User
  • Senior Member
  • *
  • Posts: 444
  • Windows 7
Re: Reading data from a text file
« Reply #5 on: July 26, 2018, 02:01:59 PM »

It's in the PB help file. There are many other sources on regular expressions out there as well, and each implementation has a slight 'dialect'.
There used to be an online regexpr tester at PowerBasic.com. It seems to be missing now. Kevin Voell wrote one in PB as well; it's still available in the forums. You can supply a sample file to search, and it will show you what your current expression will find. Then you just copy the expression into your code.

It gets easier after looking at it for a while and trying things.  Very powerful.
Logged