coolboy7163

Custom function to extract text from web content

coolboy7163

Hello to all!

 

This is a quick custom function for extracting a piece of text from HTML source, for example after you have pulled the source of a web page out of a web viewer with the FileMaker function:

 

GetLayoutObjectAttribute ( objectName ; attributeName {; repetitionNumber ; portalRowNumber} )
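
For a web viewer, the attribute that returns the page's HTML is "content". A minimal sketch of the call, assuming the web viewer has been given the object name "wv" in the Inspector (that name is only an illustration, not something from this post):

```
// "wv" is a hypothetical object name assigned to the web viewer on the current layout
GetLayoutObjectAttribute ( "wv" ; "content" )
```

The result is the text to pass in as the source parameter of the custom function. The page has to finish loading in the web viewer first, otherwise the result may be empty or incomplete.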

 

Here is my custom function:

 

 

Let ( [
  // first character after the "start" marker
  textStart = Position ( source ; start ; 1 ; 1 ) + Length ( start ) ;
  // first "end" marker found after that point
  textEnd = Position ( source ; end ; textStart ; 1 )
] ;
  Middle ( source ; textStart ; textEnd - textStart )
)

 

 

You need to create 3 parameters: source, start, end

 

Here "source" is the HTML text, "start" is a unique text string in the HTML that comes right before the text you want to extract (for example "ADDRESS_ESC_="), and "end" is the character or text string that comes immediately after the text you want to extract.
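
A quick usage sketch, assuming the custom function was saved under the name ExtractText (a hypothetical name) and using a made-up source string rather than real page content:

```
// ExtractText is a hypothetical name for the custom function above;
// the source string is invented sample data
ExtractText (
  "NAME_ESC_=John Doe&ADDRESS_ESC_=12 Main St&CITY_ESC_=Anytown" ;
  "ADDRESS_ESC_=" ;
  "&"
)
// returns: 12 Main St
```

The "&" that comes before ADDRESS_ESC_= is ignored because the search for "end" only begins at the "start" marker.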

 

I tested this function with whitepages.com, extracting a person's information into separate fields after searching by phone number, and it worked. There may be bugs, but so far I haven't found any.

 

Good Luck!

Palmtops

It works well Coolboy, thanks for sharing!

lostatsea

This sounds interesting. Could you go into more details as to how you set this up in FileMaker and how FileMaker knows where to pull the data from a particular site? What if the site has a sessionID and it expires?

 

 

(I'm new)

eristoddle

Has anyone discovered whether there is a way to fill in forms through a web viewer and post them?

benlev

Very clean, and it works great. Thank you. (Where were you yesterday, when I spent the day trying to pull together a similar script?)

 

Second question for anyone. Does anyone have a script that will strip ALL the HTML from a page and leave it as only text?

 

I'm using a very complex script now that relies on extensive Substitute functions, but the end result is not consistent.

 

For example, I'll replace all <p> tags with "that paragraph symbol" (¶), as well as all <br> tags. Then I have a custom function that removes all text that is between tag brackets (< and >).

 

Then I try to get rid of all blank lines, but some odd HTML code must still be in there, and it screws things up.
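
The "remove everything between tag brackets" step described above could be sketched as a recursive custom function. This is only an illustration under stated assumptions, not the function this post refers to: StripTags is a hypothetical name, it takes a single parameter called text, and it treats every "<" ... ">" pair as a tag.

```
// StripTags ( text ) : hypothetical recursive custom function
Let ( [
  tagStart = Position ( text ; "<" ; 1 ; 1 ) ;
  tagEnd = Position ( text ; ">" ; tagStart ; 1 )
] ;
  If ( tagStart = 0 or tagEnd = 0 ;
    // no complete tag left, so return what remains
    text ;
    // cut out the first tag and run again on the rest
    StripTags ( Left ( text ; tagStart - 1 ) & Middle ( text ; tagEnd + 1 ; Length ( text ) ) )
  )
)
```

A sketch like this leaves character entities such as &amp;nbsp; and the contents of script and style blocks in place, which may be the kind of leftover code that upsets the blank-line cleanup. Very tag-heavy pages could also run into FileMaker's recursion limit.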

 

Anyone? I use FileMaker to pick up my POP and IMAP email with a plug-in, and then I want to file each message under the person's name. If the email comes in as HTML, I don't want to store the markup, just the text of the message.

 

Thanks.

 

Ben
