Jump to content
Salesforce and other SMB Solutions are coming soon. ×

One table or many for "Categorization"


dmontano

Recommended Posts

I thought 1 table - Categorization - with these four fields was appropriate:

 

Category

Subcategory

Type

Subtype

 

However, recently, someone with more experience using FileMaker thought it would have been better if I had created the tables as follows:

 

CATEGORIES table

with field for category;

 

SUBCATEGORIES table

with fields for category AND subcategory;

 

TYPES table

with fields for subcategory AND type;

 

SUBTYPES table

with fields for type AND subtypes.

 

Maybe I misunderstood him, but the above is the impression I was left with. Does this make sense? Good sense? Can anyone shed any light on what I may be missing or just not getting?

 

Because I am working on a "solution" that incorporates my current "categorization" table and have many TOs that will spawn from it, I would hate to realize or see later that I would have been better off splitting it as he mentioned.

 

Thanks

Link to comment
Share on other sites

Is there (or will there conceivably be in the future,) more than one "subcategory" per "category"? Will there EVER be more or less than now?

 

Ditto for "types" per "subcategory"?

 

Ditto for "subtypes" per "type"?

 

(BTW, usually the answer to one or more of these questions is "Uh, yeah.")

 

If so, you want to use separate tables for each entity. But in each, you need an id field for that entity, and an id field for its parent:

 

Category::Category_ID

Category::CategoryData

 

Subcategory::Subcategory_ID

Subcategory::Category_sub_ID

Subcategory::SubcategoryData

...etc.

 

Then you will relate them by Category::Category_ID=Subcategory::Category_sub_ID

...etc.

 

Then you won't have to rebuild your solution when subcategory 2 comes down the pike, just add another subcategory record in the subcategory table.

 

--Doug

Link to comment
Share on other sites

Thanks Doug,

 

Help me out here for a moment. I am starting to think that a lot of "entities" can be broken down if one looks at it in a particular manner.

 

If "Subtype" must always have a "Type" associated with it, and if "Type" must always have a "Subcategory" associated with it, and lastly, if "Subcategory" must always be associated with a "Category"... is this an instance where every field below "Category" in the heirarchy is an "attribute" of "Category"?

 

or

 

Is this blatantly clear to everyone, except me, that it is "only" or "always" a Great-grandparent, grandparent, parent, child relationship?

 

My one table "seems" to work just fine, but my ignorance is abundant so I am all "ears". I am asking because it seems the wheels are always falling off this jalopy.

 

Is this a classic case of redundant data in a table and breaking some rule of table normalization? (Thought I would throw out a term I read - do not know what it means - but sounds good).

 

This is not an attempt to refute what you have pointed out, just my attempt at getting some additional help seeing what you are "seeing". I keep having to rebuild at almost every step of the way that I am going and I can only read so many books. If I was in college again - I would have bombed this class. Be easy on the simple minded....

Link to comment
Share on other sites

Well, it would be easier if we had some more specific information about your entities.

 

But in general, if your dataset has, always has had, and always will have, an attribute that is in a one-to-one relationship with its parent, then that data should be in a field in that entity's table. For example, a person has one and only one date of birth. It should go in the People table.

 

OTOH, if the entity has more than one instance of a given attribute, then that attribute should have multiple entries in a related table. In the real world, if there are two of something related to your subject, someday there are going to be three. The person mentioned above has a home and work phone number, so you plan your solution to have a home phone field and a work phone field. Next month, he buys a cell phone. A year after that, he buys a fax machine. If that data needs to get in your database, you have to change the structure, sometimes years after you first wrote it (and have forgotten everything about it, no matter how hard you worked to document and comment it!).

 

Instead, if you had put phone numbers in a related Phones table (because you knew from your life experience with the data you are working with, that there would someday be more phones associated with that person), all it would take is another record in the phones table for each new phone number. Your data would be complete and you wouldn't have to rip your whole solution to shreds and put in kludge after kludge.

 

Now, let's suppose this person (along with many others) works for a company. The People table we started with then relates to the Company table (you certainly wouldn't put a couple hundred employee fields in the Company table!). The phone data we have accumulated may or may not have any relationship to the Company. If the person is Joe Schmoe all the company really needs is a primary contact phone. But if the person is Harvey Keyman and the company really needs to contact him Right Now then all the phone numbers would be necessary data related to the company. So you may or may not have or need a grandparent relationship, but it is there with no muss, no fuss.

 

--Doug

Link to comment
Share on other sites

Thanks Doug,

 

You actually created an example of what I was thinking - that you can in many cases segragate the data into many tables. Your example uses "phones" and I am glad that you can extrapolate from what I was trying to say.

 

My phone / fax etc. are not in a separate "phones" table, but encased within the "employees" table. But I have been reevaluating things as I have been reading more and more and I can see that there are no hard and fast rules. There may be instances where one versus the other is "better" or "really-should-be-done-this-way".

 

Your "phone" example makes sense - however, I am now trying to see where is it a "better" choice to do the "seperate" table.

 

I thought that by putting Categories, Subcategories, Type, and Subtype in one separate table and not in a value list was a structurally good idea - and a little more work. Then I was exposed to the thought of breaking that table down into 3 or 4 tables. More work, and reworking scripts, layouts, blah, blah, blah.

 

Anyways, I do want to do it the structurally best way - even if it may be "overkill". I am trying to learn so I would rather error on the hyper-conservative side until I understand when it is appropriate to do things with less work involved up front (i.e quick value list versus 1 table, versus multi-table).

 

I went ahead last night and built the tables as you recommended and am working on the relationships to tie them together. My difficulaties with relationships causes me to shy away from them when possible.

 

Thanks again!

Link to comment
Share on other sites

I am now trying to see where is it a "better" choice to do the "seperate" table.
Like I said
In the real world, if there are two of something related to your subject, someday there are going to be three.
If you are absolutely convinced that the number of items will never change over the lifetime of the solution, leave that attribute in the original table. If you have the slightest possibility that the number of items WILL change (or is variable from record to record now), save yourself molto work down the road by planning for it now with a related table. As our esteemed colleague Mr. Head is fond of paraphrasng, "Anytime you see fields somedatum1, somedatum2, somedatum3... in a table, there is a problem with your solution's structure".
My difficulaties with relationships causes me to shy away from them when possible.
Don't! They are far too powerful a tool not to use them. In my phone example, the People table has an ID field, autoentered serial number (plus all the other info about the person). In the Phones table, you will have ID (autoentered serial number) and PersonID (plus phone info). The relationship will be People::ID = Phone:: PersonID. In the course of running your program, perhaps through a script, you will create a new Phone record you will Set Field(Phones:: PersonID; People:ID). That phone record is now indelibly linked to that person. How difficult is that? Once that's done, you can put a field Phones::AnyFieldYouWant on any layout based on the People table and see the related phone info for the person. Of course, more complex and really geeky relationships can be found, but you don't drive in NASCAR the day after you get your learner's permit, either.

 

--Doug

Link to comment
Share on other sites

Hi Doug,

Hope you still with me one this one...

 

I tossed out my single table of "categorization" and took the plunge. I can not get it to work though.

 

Here is what I have: 4 tables, each with their own primary key as auto-enter serial.

 

Table 1: categories

category_keyid_pk

category_name

 

Table 2: subcategories

subcategory_keyid_pk

category_keyid_fk

subcategory_name

 

Table 3: types

type_keyid_pk

category_keyid_fk

subcategory_keyid_fk

type_name

 

Table 4: subtypes

subtype_keyid_pk

category_keyid_fk

subcategory_keyid_fk

type_keyid_fk

subtype_name

 

I have related as such:

categories:category_keyid = subcategories:category_keyid_fk

 

subcategories:subcategory_keyid = types:subcategory_keyid_fk

subcategories:category_keyid_fk = types:category_keyid_fk

 

types:type_keyid = subtypes:type_keyid_fk

types:subcategory_keyid_fk = subtypes:subcategory_keyid_fk

types:category_keyid_fk = subtypes:category_keyid_fk

 

Have I done something wrong here?

 

Thanks again,

David

Link to comment
Share on other sites

It is very hard for me to follow the logic of this discussion (particularly the first-time poster about WoW), but this is what I think your relationships should look like:

 

Company -> Category (GM -> Autos ; IBM -> Computers)

Company -> Sub-Category (GM-> Domestic ; IBM -> Large/mainframe)

Company -> Employees (GM-> GM Employees ; IBM -> IBM Emploees)

 

The company should have the primary keys for Category, Sub-Category, etc. (unless one of your tables is a join table).

Link to comment
Share on other sites

Can you supply some sample data or at least some information about a member of "Category", etc., for your different tables? It would make it far easier to understand (and therefore to advise). Both Techphan and I are in the same boat, having to conjecture what your dataset is all about.

 

Big broad generalizations in the question necessarily generate big broad generalizations in the replies. More specific questions invite more lucid answers. Extrapolation to general cases will then follow.

 

--Doug

Link to comment
Share on other sites

  • 2 weeks later...

Thanks Doug and Techpan for your replies,

 

I have been WRESTLING big time with this. As I mentioned earlier, I have rebuilt my "categorization" table which had the four fields: Category, Subcategory, Type, and Subtype.

 

Now I have this new structure:

Table 1: categories

category_pk

category_name

 

Table 2: subcategories

subcategory_pk

category_fk

subcategory_name

 

Table 3: types

type_pk

subcategory_fk

type_name

 

Table 4: subtypes

subtype_pk

type_fk

subtype_name

 

I have related as such:

categories:category_pk = subcategories:category_fk

subcategories:subcategory_pk = types:subcategory_fk

types:type_pk = subtypes:type_fk

 

For the life of me, I can not create a conditional value list to facilitate assigning the proper values needed when creating some of these records:

1. I can not create a conditional value list in my "Subcategory" table to provide only valid "Subcategories" after I select a "Category".

 

2. I can not create a conditional value list in my "Types" table to provide only valid "Types" after I select a "Subcategory".

 

3. I can not create a conditional value list in my "Subtypes" table to provide only valid "Subtypes" after I select a "Type".

 

I am struggling, but believe I am just going about this wrong. I was able to create a conditional value list using my "one" table approach, but can not get this to work as a multi-table set-up.

 

More info: I "think" what I am basically trying to create is a "validation" table for "Equipment" categorization. I will not be altering these values often, but will need to add, delete as Doug pointed out. Therefore, each entity in a separate table makes sense, however, I have my problems getting this built.

 

I need a way to create records in each of these four tables that is conditional so it is easier to make the correct selection in order to populate each table. I also (and ultimately) want to use these combined four related tables to act as a validation table for other areas of my solution - for example, when creating a new "Equipment" record, I want to be able to have that "Equipment" item categorized at thie time the record is being created via conditional drop-downs using thie categorization scheme.

 

Unfortunately - FileMakers' intuitiveness in this area is not agreeing with my brain - I guess I am not bright enough to see the ways of doing this.

 

Any help is appreciated.

Link to comment
Share on other sites



×
×
  • Create New...

Important Information

Terms of Use