
Extracting info from a large string and fields

Started by Petrus Vorster, September 18, 2019, 05:58:15 PM

Petrus Vorster

Hi All

I am trying to update one of my PowerBASIC projects.

It retrieves information from a website using a TCP OPEN command.
What I get back is a pile of text with several kinds of delimiters.
I assume the code block start and end markers are what the site uses to display data on the whereabouts of an item in a list format.

{ is code block start and } is code block end.
The information in between is in quoted pairs, such as "Tracking Status":"-1" and so forth.

All they do is give you a history breakdown, top to bottom on the site so you can read the progress till the end / delivered / returned etc.

I would like to extract information from that big block of information, only to retrieve whether it has been returned or delivered locally.

I found the code for "Delivered" and "Returned" and can extract that with REGEXPR successfully.
What I want to do next is retrieve the dates and names in the code block in which I found that Delivered marker.
In the entire big chunk of info there are many dates (transport dates, posting dates etc.).

So what I know is that if I find the Returned marker or the Delivered marker, there will be a "{" marker somewhere to the left and a "}" marker somewhere to the right (variable-length strings).
How could one search for certain keywords only between those two specific markers?

In other words, how do I search for the first instance of "{" to the left of my current position in a text block, and the first instance of "}" to my right?
The in-between part I should be able to extract with REGEXPR.

Hope this is understandable.

regards, Peter
-Regards
Peter

raymw

I know nothing about regex, other than having heard the term years ago, but if you could do what you want in PB, then you could most likely do it in FB. Without knowing the full details, I would read in a chunk of the data, look for the keywords you know, then use Mid/Left/Right or whatever to get the { and }. Then pull out what you need. It is probably quickest to search for the rarest occurrences first. If the block terminator } is always followed by { to start the next block, then you only need to search for { when going from left to right. I think {+ in regex will get the previous instance of {, from my quick scan of the wiki.
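
The keyword-then-braces approach can be sketched in FreeBASIC with plain string searches. This is only a minimal sketch, assuming flat blocks (no nested braces); the function name EnclosingBlock, the "Event" field, and the sample reply are made up for illustration, and only "Tracking Status" comes from the thread. Note that in most regex dialects {+ matches a run of one or more { characters rather than searching backwards, so the sketch uses InStrRev for the leftward search instead.

```freebasic
' Return the {...} block that encloses the first occurrence of keyword,
' or "" if the keyword or either brace is missing.
' Assumption: blocks are flat, i.e. braces are never nested.
Function EnclosingBlock( ByVal txt As String, ByVal keyword As String ) As String
    Dim As Integer hit = InStr( txt, keyword )
    If hit = 0 Then Return ""

    ' Nearest "{" to the left: search backwards, starting at the keyword.
    Dim As Integer openPos = InStrRev( txt, "{", hit )
    ' Nearest "}" to the right: search forwards, starting at the keyword.
    Dim As Integer closePos = InStr( hit, txt, "}" )
    If openPos = 0 OrElse closePos = 0 Then Return ""

    Return Mid( txt, openPos, closePos - openPos + 1 )
End Function

' Hypothetical sample data: two history entries in one reply.
Dim As String reply = _
    "{""Tracking Status"":""1"",""Event"":""Posted""}" & _
    "{""Tracking Status"":""-1"",""Event"":""Delivered""}"

Print EnclosingBlock( reply, "Delivered" )
```

With this sample the second block is printed, braces included. Any further search for dates or names can then be limited to that returned string.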

Petrus Vorster

Thanks

I think I will first break it down, starting from the beginning of the big block of text and finding all the smaller code blocks that start and end with { }.
Then remove those I don't need.
Once I have a string containing only the one code block needed, it should be easy.
Will let you know what happens.

-Peter

raymw

Hi Peter,
It's maybe just semantics, but if you're reading a chunk of data into memory, instead of removing blocks you don't need, save the blocks you do need to another string/file. Inevitably you will need to find more information about them. What will happen is, you'll go to your boss with the results of the first inquiry, he/she/it will be impressed and then ask something like 'how many of these deliveries were on a Friday?' Don't immediately go back with the answer to this second question, 'cos then you'll get asked a third, fourth and so on.
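
Saving the wanted blocks rather than deleting the unwanted ones might look like the sketch below. It assumes flat {...} blocks; KeepBlocks, the "Event" field and the sample reply are hypothetical names used only for illustration.

```freebasic
' Copy every {...} block that contains keyword into kept().
' The source text is left untouched. Returns the number of blocks kept.
' Assumption: blocks are flat, i.e. braces are never nested.
Function KeepBlocks( ByVal txt As String, ByVal keyword As String, kept() As String ) As Integer
    Dim As Integer n = 0
    Dim As Integer openPos = InStr( txt, "{" )

    Do While openPos > 0
        Dim As Integer closePos = InStr( openPos, txt, "}" )
        If closePos = 0 Then Exit Do            ' unterminated block: stop

        Dim As String blk = Mid( txt, openPos, closePos - openPos + 1 )
        If InStr( blk, keyword ) > 0 Then
            ReDim Preserve kept( n )
            kept( n ) = blk
            n += 1
        End If

        ' Continue with the next "{" after this block.
        openPos = InStr( closePos + 1, txt, "{" )
    Loop

    Return n
End Function

' Hypothetical sample data: three history entries.
Dim As String reply = _
    "{""Event"":""Posted""}{""Event"":""Delivered""}{""Event"":""In transit""}"
Dim As String kept()
Dim As Integer found = KeepBlocks( reply, "Delivered", kept() )

For i As Integer = 0 To found - 1
    Print kept( i )
Next i
```

Because the original reply is never modified, a later question can be answered by running the same routine again with a different keyword.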

Petrus Vorster

Hi

Was away for a while.
No, none of this is "Official" work, just to make my life easier at work.
I am a Manager by trade and just a hobbyist programmer.
In our head office they have a number of developers, but they follow "red tape guidelines" and not really what we need.

So, no showing it to any bosses, regrettably.
The national system corresponds to an internationally agreed system between like 52 countries, so nobody will touch it.
(It updates as far as Switzerland for the global community.)
Any query gives you the full history in a list format, and that is pretty much the same across the globe.

What I am trying to do is send a number of tracking numbers to that server after hours, obtain the whole list as it supplies it, filter out what I do or don't want, and update my own database of items for which my app had sent SMS notices. (The SMS part works like a charm.)

From what I understand, no country "really wants" people to do that in volume, for fear of overloading servers, so internationally it's currently type a number and receive a response.
I don't think the developer who gave me the API command actually thought I would manage to obtain the data like that....

But I would love to send him a working example though....

-Peter

SeaVipe

Hi Peter,


Here is a parsing routine I've used in various configurations for ages; you might find something interesting in it...




#Lang "FB"

? "Compiled c:/winfbe_suite/freebasic-1.06.0/fbc64.exe"
? "See also AfxStrParse and FireFly FF_Parse()"
? "--------------------------------------------------"
? "Build a big block of text."
? "Hit a key"
Sleep

' Collect every {...} block in big_string into accumulated(), braces included.
Sub parser( Byval big_string As String, accumulated() As String )

    Dim As String sString = ""
    Dim As String l_brace = "{"
    Dim As String r_brace = "}"
    Dim As Integer iBegin = 0, iCount = 0

    For i As Integer = 1 To Len( big_string )

        sString = Mid( big_string, i, 1 )

        ' Remember where the most recent opening brace was seen.
        If sString = l_brace Then iBegin = i

        ' A closing brace completes a block only if an opening brace precedes it.
        If sString = r_brace AndAlso iBegin > 0 Then
            Redim Preserve accumulated( iCount )
            accumulated( iCount ) = Mid( big_string, iBegin, ( i - iBegin ) + 1 )
            iCount += 1
            iBegin = 0
        End If

    Next i

End Sub

' Create a large block of text
Dim As String your_big_text_block '' some text with braces {}
Dim As Integer x = 0 '' Loop counter

Do While x < 1000
    your_big_text_block += " -a large block of text )(*&^%$#@!{your keeper text here " + Str( x ) + "}qwerty!@#$%^&*()_+"
    x += 1
Loop

? your_big_text_block
? "-----------------------------------------------------------------------------------------"
?
? "Parse the above block..."
Sleep

Dim As String accumulated() '' Variable to hold parsed text

parser( your_big_text_block, accumulated( ) )
? "-----------------------------------------------------------------------------------------"
? "Result:"

For i As Integer = Lbound( accumulated ) To Ubound( accumulated )
    ? Str( i ) + "-" + accumulated( i )
Next i

?
? "--------------------------------------------------"
?
? "Parse the accumulated() array to extract the desired information."
? "Hit a key..."
Sleep

Clive Richey
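
For the last step, pulling the desired information out of one accumulated() entry, a small helper along these lines may be enough. It is a rough sketch that assumes every field looks like "name":"value" with both parts quoted and no quote characters inside the value; FieldValue, the "Date" field and the sample block are hypothetical, and only "Tracking Status" appears in the thread.

```freebasic
' Return the value of "fieldName":"value" inside one block, or "" if absent.
' Assumption: values are quoted and contain no embedded quote characters.
Function FieldValue( ByVal blk As String, ByVal fieldName As String ) As String
    ' Build the text that precedes the value:  "fieldName":"
    Dim As String marker = Chr( 34 ) & fieldName & Chr( 34 ) & ":" & Chr( 34 )

    Dim As Integer p = InStr( blk, marker )
    If p = 0 Then Return ""

    Dim As Integer valueStart = p + Len( marker )
    ' The value runs up to the next quote character.
    Dim As Integer valueEnd = InStr( valueStart, blk, Chr( 34 ) )
    If valueEnd = 0 Then Return ""

    Return Mid( blk, valueStart, valueEnd - valueStart )
End Function

' Hypothetical block, shaped like one entry of accumulated().
Dim As String blk = "{""Tracking Status"":""-1"",""Date"":""2019-09-18""}"

Print FieldValue( blk, "Tracking Status" )
Print FieldValue( blk, "Date" )
```

A missing field returns an empty string, so the caller can tell "not present" apart from a real value as long as the site never sends empty values.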

Petrus Vorster

Thanks a million, I will give it a try!

Regards, Peter

raymw

Hi Peter,

Most of my parsing, recently, has been on lines of g-code, similar to below

(size of top is 80 by 73)
g00 x-1.3750 y74.3750 z2.0000
g01 x-1.3750 y74.3750 z-1.6250 f66.6667
g01 x-1.3750 y74.3750 z-1.6250 f200.0000
g01 x81.3750 y74.3750 z-1.6250
g01 x81.3750 y-1.3750 z-1.6250
g01 x-1.3750 y-1.3750 z-1.6250
g01 x-1.3750 y74.3750 z-1.6250 f200.0000
g00 x-1.3750 y74.3750 z2.0000
(tabs on edges)
G00  x-3.0000 y76.0000 z2.0000
G01  x-3.0000 y76.0000 z-2.9500 f66.6667
G01  x34.0000 y76.0000 z-2.9500 f200.0000
G00  x34.0000 y76.0000 z-2.2500
G01  x46.0000 y76.0000 z-2.2500 f200.0000
G01  x46.0000 y76.0000 z-2.9500 f66.6667


I need to extract the values of x, y, z etc.
I have found that, since case does not matter (in this case, e.g. it can be xyz or XYZ), I can convert what I'm checking to upper case, find the X, then simply Val the rest of the string to its right. There is no need to read in each digit. The following picks out the x and y values, if any.

Function xylen( ln As String ) As Double   ' get distance from origin to the x,y in the line
    Dim As Double x = 0, y = 0
    Dim As String c                        ' current character, upper-cased

    For k As Integer = 1 To Len( ln )
        c = UCase( Mid( ln, k, 1 ) )
        If c = "(" Then Return 0           ' comment found: report no distance
        ' Val reads the number that follows and stops at the first non-numeric character.
        If c = "X" Then x = Val( Mid( ln, k + 1 ) )
        If c = "Y" Then y = Val( Mid( ln, k + 1 ) )
    Next k

    Return Sqr( ( x * x ) + ( y * y ) )
End Function
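
The same Val trick extends to any single axis letter (z, f and so on). The sketch below is one possible generalisation, not the routine used above: AxisValue and its found flag are hypothetical, it takes the first occurrence of the letter, and it ignores everything from a "(" onward instead of returning 0 for the whole line.

```freebasic
' Return the number that follows the given axis letter in a g-code line.
' found is set to True only if the letter occurs before any "(" comment.
Function AxisValue( ByVal ln As String, ByVal axis As String, ByRef found As Boolean ) As Double
    found = False

    Dim As String u = UCase( ln )
    Dim As Integer commentAt = InStr( u, "(" )
    If commentAt > 0 Then u = Left( u, commentAt - 1 )   ' drop the comment part

    Dim As Integer p = InStr( u, UCase( axis ) )
    If p = 0 Then Return 0

    found = True
    ' Val stops at the first character that cannot be part of a number.
    Return Val( Mid( u, p + 1 ) )
End Function

Dim As Boolean ok
Dim As Double z = AxisValue( "g01 x-1.3750 y74.3750 z-1.6250 f66.6667", "z", ok )
Print z
```

For the sample line z comes back as -1.625. The found flag lets the caller tell a genuine 0 apart from a missing axis.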


I'm having to do this because my original code was generated by software I wrote in C#, and I don't want to go back to that bloatware, so I'm adjusting the code it produces using FB. If I can find my original source code, and if it's lengthy, then I'll parse that too and convert much of it to FB. Hopefully the FB compiler will pick up the rest.

Petrus Vorster

Thanks a million for the ideas.
I am tinkering with all the concepts. Will let you know how this went.

-Peter