Thursday, January 5, 2012

On Roles, Attributes, and Definitions

Dave Hay commented on my post How Many Attributes Do I Have?  Dave notes that there is a difference between me and the roles that I play.  This is an important point that I struggled with previously.  Dave states "most of the examples are attributes of my role as a customer", meaning the examples I provided in my post.

"Role" is a term that gets bandied around a lot in data modeling.  In my previous post on Role vs. Relationship I argued that roles really refer to certain kinds of relationships.  However, Dave's point is one that I have heard on a lot of occasions and has to be taken seriously.

Let's state the question this way: is the attribute Customer Lifetime Value to Hardbitten Liquors an attribute of me, or an attribute of my role as a customer of Hardbitten Liquors?  And if the latter, just what do we mean by "role"?

There is no doubt that I am an instance of a concept.  The concept is human being.  Further, Customer Lifetime Value to Hardbitten Liquors can be predicated of me, strongly suggesting it is an attribute I possess.  

But now let us think of the role that is being suggested in this discussion.  What is it?  Is this role "Customer of Hardbitten Liquors"?  If so, I would argue that this is a relationship between me and Hardbitten Liquors.  And if an entity type has attributes, and relationships do not, then we cannot say that a role has any attributes.
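The two readings of "role" in play here can be put side by side in a small sketch.  This is only an illustration, not a recommended design: the class names, the field names, and the value 1250.0 are all hypothetical.  Option A treats Customer Lifetime Value as an attribute of the person.  Option B is the reading in which the role is turned into an entity type of its own, so that it can carry the attribute.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Company:
    name: str


@dataclass
class Person:
    name: str
    # Option A: the attribute is predicated directly of the person,
    # one value per company the person buys from.
    lifetime_value_by_company: Dict[str, float] = field(default_factory=dict)


@dataclass
class CustomerRole:
    # Option B: the "role" is modeled as its own entity type that links
    # a person to a company and carries the attribute itself.
    person: Person
    company: Company
    lifetime_value: float


me = Person("Author")
hardbitten = Company("Hardbitten Liquors")

# Option A: the value lives on the person (hypothetical figure).
me.lifetime_value_by_company[hardbitten.name] = 1250.0

# Option B: the same value lives on the role entity.
role = CustomerRole(person=me, company=hardbitten, lifetime_value=1250.0)
```

Both options record the same fact.  They differ only in what the fact is said to be an attribute of, which is exactly the question at issue.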

But suppose Dave is right and the role does have attributes.  It will have to be an entity of some kind.  What other thing could the role be, apart from "human being"?  There is a possibility.  Suppose I only ever bought one bottle of Grandpa's Tipple from Hardbitten Liquors.  Then, my entire relationship with Hardbitten Liquors could be encompassed by this one event - the purchase of this one bottle.  Now, Purchase is an entity type, albeit non-material, so it can at least be a candidate for the role.

But can Purchase really be the same as role?  I do not think that an event can have an attribute such as Customer Lifetime Value to Hardbitten Liquors, which really refers to the individual customer.  And I do not think this can be true of any aggregate of instances of Purchase events either, supposing, for instance, that I buy one bottle of Grandpa's Tipple every week.  

So if role is not to be identified either with me or with my purchases, what other entity types can it be identified with?  I need to do some more research to be able to answer that.  However, for now I am still going to treat attributes like Customer Lifetime Value to Hardbitten Liquors as attributes of me.  So my original point provisionally remains: a concept can have a vast number of attributes and some methodology is needed to decide which ones to include in a definition.

Wednesday, January 4, 2012

Legislating vs. Discovering Definitions: Radical Differences

Most of my experience in doing definition work has been from the perspective of an analyst involved in systems development.  These days it is usually for data-centric applications, such as building Master Data Management (MDM) or business intelligence (BI) environments.  The method of the analyst begins with understanding scope and requirements, and then finding the business concepts and data objects that need to be defined.

However, there are other perspectives.  Terminologists, often oriented to the language translation industry, do definition work.  So, I suspect, do brand managers, who want to control messaging to customers.  I work a good deal in financial services, and I am aware that there is another group that gets involved in definitions.  These are business people who create completely new products.  For instance, Asset Backed Securities (ABS) and Collateralized Debt Obligations - both now notorious as weapons of mass financial destruction - were created by investment bankers in the past few years.

My experience with ABS begins with the legal documents of a bond issuance deal.  These documents contain the full definition of everything in the deal, the rules for how the bond is supposed to work over time, and the contractual obligations of the parties involved.  One of the tasks I was involved in was to take these documents, reverse engineer them, and create cashflow models under various scenarios to see how the bonds would perform.  A global meltdown caused by changing the credit rating of these bonds from AAA to DDD overnight was not part of these scenarios (in case you were wondering).

The definition work that goes into an ABS structure has to be precise.  It is essentially part of building a conceptual system - a new piece of reality - that will be set in motion.  A major problem in finance is that the products are all non-material.  It is not like manufacturing new designs of plastic gnomes to decorate gardens, or baking a different kind of doughnut.  The laws of metaphysics, mathematics, and nature do not supervene to automatically take care of things.  The new plastic gnome will not suddenly melt down overnight for no apparent reason.  A doughnut I place in the fridge will not evaporate for no apparent reason.  But equally strange things can and do happen in financial systems.  Contradictions, I would maintain, do not exist in material reality - but they can be present (albeit unrecognized) in financial reality.  An ABS issuance can both be AAA-rated and have significant defaulted underlying collateral at the same time.

Legislative definitions are those which are created as part of creating the concept being defined.  I agree that creating a concept is different from creating an instance of a new type of already existing concept.  Each ABS deal includes a lot of concepts that were defined previously - in other ABS deals.  However, differences can still arise.  My point is that the degree of care involved in such definitions must be much greater than that of the analyst.  An investment banker can create a Doomsday Machine if he or she puts together a flawed ABS deal.  An analyst can mess up an integration point for a data object, but can usually remedy it after discovering the problem.

So I think we can conclude that the consequences of bad definition work vary depending on what the work is being done for.  In some situations the definitions have to be rock solid from the start.  Other situations may be more forgiving.  Recognizing the risks involved is an important part of definition work.

One final point.  I have illustrated legislative definitions using examples from financial services.  However, I would maintain that the same problems apply to all sectors, e.g. telecommunications, pharmaceuticals, and government.  The problems will arise in any situation where non-material reality is constructed.

Tuesday, January 3, 2012

How Many Attributes Do I Have?

Characteristics of a concept - its attributes - are central to definitions.  But to what extent should the characteristics of a concept be listed in its definition?  Should it be few, or some, or as many as possible?  As a step in beginning to answer that question, I think that we need to ask if we can reliably determine all the characteristics that a concept possesses.  And I now intend to see if I can answer that question by finding out if I can list all the attributes that I possess.

I am aware that I am an instance and not a concept (at least a general concept).  However, I would submit that there is a prima facie case that I should be able to provide a list of my attributes.  If such a list could be produced, then we could see if the attributes apply to the concept I am an instance of (humans).  We could then move on to figuring out what attributes should or should not be included in a definition.  But, if I cannot even reliably figure out what attributes I have, then I may have difficulties I have not yet recognized in the method I have chosen to get answers to the questions I am posing.

It is easy to start listing out all the physical characteristics I possess: height, weight, eye color, and so on.  I could add some non-material ones too, such as age and IQ score.  However, from my experience as a data modeler and developer, these seem rather trivial.  I have come across many examples of database tables such as Customer, where I could conceivably be represented by a record.  These tables have columns (representing attributes) such as Customer Lifetime Value, Customer Sales Year to Date, Customer Average Order Size, and so on.  I would guess that every company of which I am a customer maintains such attributes to describe me.

Actually, I am guessing.  I know I possess a height, weight, eye color, etc., because I know what these attributes are and I know I possess them.  However, when it comes to a company of which I am a customer, say, Big Box Super Store, it is not so clear.  Specifically: (a) I do not really know what attributes Big Box Super Store considers I have; and (b) I do not know how Big Box Super Store defines each of these attributes.

Many of the Customer tables I have seen have had hundreds of columns (attributes).  Some have had thousands.  At this scale, even when working with these tables it is difficult to keep track of all the attributes they are representing.  Admittedly, the tables were not always designed well, and included columns that represented attributes that were not truly part of Customer.  But even allowing for this, the scale is still great.  Furthermore, Big Box Super Store is not the only company I buy from.  I probably have a similar relationship with about 50 other companies.  So the total number of attributes I have as a result of these relationships is certainly in the thousands, maybe in the tens of thousands.

It could be argued that many of these attributes are really the same.  Suppose Big Box Super Store calculates Customer Lifetime Value the same way as Hardbitten Liquors (of which I am also a customer).  Then, are we talking about one attribute or two?  As a practical problem, however, I cannot give an answer to this because I do not know how each company is calculating the attribute each calls "Customer Lifetime Value", or how each defines this attribute.
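To make the "one attribute or two" problem concrete, here is a minimal sketch in which two companies attach the same name to different calculations.  Everything in it is assumed for illustration: I do not know how either company actually computes the value, so both formulas, the order amounts, and the figure of ten expected future orders are invented.  The same purchase history is fed to both definitions purely to isolate the effect of the definition.

```python
from typing import Callable, Dict, List

# Hypothetical purchase history: the amount spent on each order.
orders: List[float] = [20.0, 30.0, 40.0]


def clv_total_spend(amounts: List[float]) -> float:
    """Assumed definition 1: lifetime value is simply total spend to date."""
    return sum(amounts)


def clv_projected(amounts: List[float], expected_future_orders: int = 10) -> float:
    """Assumed definition 2: spend to date plus projected future spend,
    taken as the average order size times a guessed number of future orders."""
    average_order = sum(amounts) / len(amounts)
    return sum(amounts) + average_order * expected_future_orders


# Each company calls its own calculation "Customer Lifetime Value".
definitions: Dict[str, Callable[[List[float]], float]] = {
    "Big Box Super Store": clv_total_spend,
    "Hardbitten Liquors": clv_projected,
}

# Same customer, same history, same attribute name, different values.
customer_lifetime_value = {
    company: calculate(orders) for company, calculate in definitions.items()
}
```

If the two definitions happened to coincide, the values would coincide too, and the case for treating them as one attribute would be stronger.  From the outside, with only the shared name to go on, I cannot tell which situation I am in.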

What I strongly suspect is that I carry around with me a vast burden of attributes that companies, government agencies, educational institutions, and other organizations have heaped on me, mainly without my consent, and certainly without my knowing what they are.  Not as many as the grains of sand on the seashore, or stars in the night sky, but enough to wonder at.

So the answer to the question posed in the title is that I cannot reliably say how many attributes I have, but it must be a vast number, and some of them are likely to be outside my range of understanding.  Does this present an issue for definition work?  I think it does.  It suggests some kind of need for scoping.  It also suggests that I appear in different ontologies, and that my definition in each may vary.  But the Muse of Blogging now decrees an end to the current post, so these topics will have to be taken up when Her inspiration returns.