GarryBargsley Posted February 3, 2008 Share Posted February 3, 2008 Parsing data out of one filed to multiple fields and each field is a record in my FM database. Example data: User | First Name | Last Name | Code | Address1 | City | State | ZipCode | E-Mail jerryjones Jerry Jones 1111 1234 Cowboy Lane Dallas TX 12345 jerry.jones@cowboys.com brucewayne BRUCE WAYNE 1111 999 GOTHAM ST GOTHAM TX 76992 bat_man_99@batcave.com gbargsley Garry Bargsley 1111 8510 Hwy 1 Dr. 8510 Hwy 1 Dr. Dallas TX 75111 gbargsley@sbcglobal.net billcosby Bill Cosby 1111 4406 debbie Drive #443 Dallas TX 75237 billcosby@jello.com I have been able to get user, first and last name. I am having real trouble with address since there are multiple spaces and unknown number of spaces and length. City is also a challenge and state is not that important. I am having trouble with the e-mails that have an "_" in them as FileMaker thinks this is a space and breaks. Any help or ideas would be great. I have over 6000 records to split out, but they are all in the format above. Thanks Garry Quote Link to comment Share on other sites More sharing options...
Jack Rodgers Posted February 3, 2008 Share Posted February 3, 2008 If the data was exported from another db then it should be either tab delimited, comma delimited or fixed length. I cannot tell which it is since you may have changed the structure of the data to display it. However, your mention of multiple spaces seems to indicate that it is fixed length. Open the original un-modified file in a text editing application and apply a fixed width font to it. The data should line up in visible columns something like: jerryjones Jerry Jones 1111 1234 brucewayne BRUCE WAYNE 1111 999 gbargsley Garry Bargsley 1111 8510 billcosby Bill Cosby 1111 4406 If this is so you can use length to parse out the fields. If not, they you have tabs to deal with. Quote Link to comment Share on other sites More sharing options...
AHunter3 Posted February 3, 2008 Share Posted February 3, 2008 a) Bone up on Middle and Position and PatternCount. Middle and Position, and to a slightly lesser extent PatternCount, are totally your friend here. The combo of those constitutes the great socket-wrench of text-parsing tools. Tools like RightWords or MiddleValues are easier to learn and the code is a lot cleaner-looking, but they're far more limited in scope. Middle is versatile. You can go at it with Middle even if Jack Rodgers is wrong in his guess, and your data has neither tabs to delimit fields NOR has each field starting x number of spaces deep into each line. b) Except City. City is going to be the killer. Even if you were instructing an air-breathing liveware data processor from another country, you'd have a damn hard time explaining how to know where ADDRESS leaves off and CITY starts: does "999 GOTHAM ST GOTHAM TX 76992" mean "999 Gotham" in the city of "St. Gotham TX"? Well, probably not, since "ST" probably means "Street". But will all addresses in your db end in "ST"? Or in "ST" or "RD" or "AVE"? How about "1417 BROADWAY ST LOUIS MO 63131"? Person lives on good old Broadway Street down in Louis MO? Depending on how many records you have, you may wish to eyeball them and manually enter a 1 or a 2 for "number of words in city", using forms of human intuition that are going to be godawful difficult to distill down into FileMaker If or Case statements. c) email, like everything else except the above CITY versus STREET ADDRESS problem, will succumb very nicely to Middle + Position: Middle (YourTable::UglyTextString, Position (YourTable::UglyTextString, " ", 1, PatternCount(YourTable::UglyTextString, " "))+1, Length (YourTable::UglyTextString)) in other words "start as many characters deep as the position of a specific space character — which one? the one that's as many as there ARE, in other words the LAST one — plus one character so we don't include the space itself at the front — and from there go to length of the whole thing, in other words to the end". There's your email address, underscores or not. See? Quote Link to comment Share on other sites More sharing options...
GarryBargsley Posted February 3, 2008 Author Share Posted February 3, 2008 Thanks for the suggestion Jack, but if the data was in a standard format I would be able to fly through this project. AHunter3, thanks for the suggestions. I will work with the position and patterncount functions to see if I can get at least some of this data to work. Let me ask you if I told you that "TX" as the state is the same for every record do you think the city could be captured? Or, do you think I could capture Address through ZipCode into a field and then break that field in to individual components? Thanks again. Garry Quote Link to comment Share on other sites More sharing options...
GarryBargsley Posted February 3, 2008 Author Share Posted February 3, 2008 FYI, in the data the "1111" is always the value before the start of the address and the zipcode is a five digit number. Maybe those can be the key to extracting the address info. Garry Quote Link to comment Share on other sites More sharing options...
AHunter3 Posted February 3, 2008 Share Posted February 3, 2008 Again, no, the problem is figuring out where street address stops and city starts. Knowing where STATE starts is easy & automatic, and likewise for ZIPCODE. Even knowing where address starts is neither a help nor something that was problematic to begin with. It's strictly street address versus city. If it were me I'd run a routine that assumed all cities have one-word names. Then I'd peer at them and manually mark a field "CityWords" with the number of words in City for records where City is a multi-word field. (In Texas you could speed it up substantially with a find for "SAN", I suspect ). With that additional info, I could middle out your street address separate from your city pretty dependably. Given that for one-word citynames, the city starts at one character beyond exactly the FOURTH " " (space character) from the end, in other words THREE less than PatternCount (UglyField, " ") —— Let ([ citystartspacect = 3 + Case(isempty (YourTable::CityWords), 0, YourTable::CityWords); citystartpos = Position (YourTable::UglyString, " ", 1, PatternCount(YourTable::UglyString, " ")-citystartspacect; cityendspacect = citystartspacect+ Case(isempty (YourTable::CityWords), 1, YourTable::CityWords)]; ]; Middle(YourTable::UglyString, citystartpos, Position(YourTable::UglyString, " ", 1, cityendspacect) - citystartpos) - 1) ) (or something to that effect. usual disclaimers about this being off the top of my head & untested) Quote Link to comment Share on other sites More sharing options...
Jay Oliver Posted February 22, 2008 Share Posted February 22, 2008 I have been reviewing your use of the Middle Function. For some time now I have unsuccessfully attempted to parse out the First Name, Middle Name, and Last Name from a Full name field. I can get my Excel spreadsheet to perform the correct function but being new at FileMaker Pro, I am hurting. Here is an example of my fields and what I want them to do. I am trying to use the "," as my focus point. Full Name Pearson, Mary Ann Last Name Pearson, First Name Mary Middle Name Ann Quote Link to comment Share on other sites More sharing options...
AHunter3 Posted February 22, 2008 Share Posted February 22, 2008 You want to include the comma in last name? I assume not? Middle (full name, 1, position(full name, ",", 1, 1)-1) if you actually want it in there, ditch that final minus-1. First Name = Middle (full Name, Position(full name, " ", 1, 1)+1, position(full name, " ", 1, 2)-position(full name, " ", 1, 1)-1) Middle Name = Middle (full name, Position(full name, " ", 1, 2)+1,length (full name) ) Quote Link to comment Share on other sites More sharing options...
David Head Posted February 22, 2008 Share Posted February 22, 2008 If we make the following assumptions for a name like Smith, Mary-Jo Jane: 1. Last name is everything before the comma 2. First name is the string after the comma and before the next space 3. Middle name is the last string after the last space 4. There is always a space after the comma If those assumptions are true (and there would be cases in which they are not), the following are methods of extracting each: LastName = Left ( FullName; Position ( FullName; ","; 1; 1 ) - 1 ) FirstName = Left ( Right ( FullName ; Length ( FullName ) - Position ( FullName ; "," ; 1 ; 1 ) - 1 ) ; Position ( Right ( FullName ; Length ( FullName ) - Position ( FullName ; "," ; 1 ; 1 ) - 1 ) ; " " ; 1 ; 1 ) - 1 ) MiddleName = Right ( FullName ; Length ( FullName ) - Position ( FullName ; " " ; Length ( FullName ) ; -1 )) Now the FirstName calculation is a bit of a nightmare to troubleshoot so it is best to write it as a Let statement: Let ([ name = FullName; lenName = Length ( FullName ); posFirstComma = Position ( FullName ; "," ; 1 ; 1 ) ; first_and_middle = Trim ( Right ( name ; lenName - posFirstComma ) ) ]; Left ( first_and_middle ; Position ( first_and_middle ; " " ; 1 ; 1 ) - 1 ) ) Note that the addition of the Trim function in the calculation of first_and_middle means that there could be no space or more than one space after the comma. There are still things that break the parsing e.g. a trailing space in the full name. So more work is required to get solid results.smiley-wink Quote Link to comment Share on other sites More sharing options...
Jack Rodgers Posted February 22, 2008 Share Posted February 22, 2008 Advice: don't kill yourself trying to do a perfect parse... Let's assume you use a script for your calc. Manual Loop, not a scripted Loop: run the script. Then you scan the records and omit the ones that did not work correctly. Show omitted. Adjust some of them manually and omit them. Revise your script as needed. Rerun script Exit Loop When all the records are finished. End Manual Loop This is a time tested method that preserves insanity and accommodates the unknown and accepts that sometimes it is easier to manually enter the data than spend a half an hour crunch a calc. Quote Link to comment Share on other sites More sharing options...
David Head Posted February 22, 2008 Share Posted February 22, 2008 ... method that preserves insanity ... Reasonable advice Jack. But I would like a method that preserves my sanity rather than my insanity. smiley-wink Quote Link to comment Share on other sites More sharing options...
Jack Rodgers Posted February 23, 2008 Share Posted February 23, 2008 Sanity is for weaklings... Quote Link to comment Share on other sites More sharing options...
Jay Oliver Posted February 25, 2008 Share Posted February 25, 2008 Thank you for your time and interest in my query. I have test driven each of your solutions and find them very helpful. One problem I encountered was that if a person did not have a middle name, the first name did not populate in the first name field but showed up in the middle name field. If I put a space after the first name, the middle name field remains empty, as it should, however, the first name field remains empty. Please advise. Smith, Betty= Smith (Last Name); empty (First Name); Betty (Middle Name) Smith, Betty(space)= Smith (Last Name); empty (First Name); empty (Middle Name) Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.