2014-03-14

A Little DB2 SQL Adventure

I don't usually blog about this sort of technical stuff, though I deal with it daily. That probably explains right there why I avoid it as if it is tainted by the plague. But this particular problem was interesting enough that I thought it might be worth conveying it.

The challenge for my associate came in the form of a bit of SQL that needed to pull apart a string (unknown until runtime) before it was inserted into a data store. A stored procedure wasn't possible, and the parsing requirement was unavoidable. It also had to be done with inline SQL.

It was known that this particular string could have several forms:
  1. A string containing alphanumeric elements separated by a dash;
  2. A string containing alphanumeric elements without a dash; and
  3. A zero-length string containing nothing (that could not be converted to a NULL on the incoming side for reasons not relevant here, but such that COALESCE was suddenly not a functional solution).
The SQL causing them grief had two variations in the INSERT, but to isolate the data elements I'll stay with simple SELECT statements from the system dummy table. And we'll call the input containing the string {Landmark} for the sake of abstraction.

The first SELECT would look thus:

SELECT SUBSTR('{Landmark}',1,LOCATE('-','{Landmark}')-1) AS DATA_ELEMENT FROM SYSIBM.SYSDUMMY1;
In essence the data element expected for return would see a dash in the Landmark string and return everything left of that dash. (For those wondering why not use LEFT in this context, it ultimately creates the same basic problem in as much as it will fail to return correctly if the Landmark value has no dash. So, it would have been a different approach with the same sticking point.)

The second SELECT would look thus:

SELECT SUBSTR('{Landmark}',1,LOCATE('-','{Landmark}')) FROM SYSIBM.SYSDUMMY1;
A simpler view of the same issue, which is easily illustrated by plugging in a few actual values that landmark might pass in.

The First Form

To focus now we can strip the SELECT and FROM SYSIBM.SYSDUMMY1 parts of the exemplar. It leaves us with the inline SQL reading:

SUBSTR('{Landmark}',1,LOCATE('-','{Landmark}')-1)

And the SQL works perfectly if the data coming in contains a dash and is not zero-length. In other words, '12345-12345-666' parses perfectly:

SUBSTR('12345-12345-666',1,LOCATE('-','12345-12345-666' )-1)

...returns a LOCATE value of 6, and when you subtract 1 it works out to return the substring starting at the first character and returning the first 5 characters.

The problem is that if the incoming string contains no dashes the LOCATE returns 0, and when you subtract the 1 from 0 the SUBSTR function rightfully chokes on the parameter. For example:

SUBSTR('1234512345666',1,LOCATE('-','1234512345666' )-1)

...is really:

SUBSTR('1234512345666',1,-1)

And SUBSTR doesn't quite know hat to do with that.

An even more complex problem arises when the string coming in is zero-length, because the resultant SUBSTR not only faults on -1 for is third parameter, it chokes on the first one having no length.

Using a stored procedure here would be glorious because it could check the length and do all that work in an orderly way, but my associate was stumped by how to get it to work inline and handle the three possibilities of proper dashes, no dashes and a zero-length condition.

What does work though is the inline use of a dual CASE that can handle the three possibilities (yes, it is not the only possibility, but it turned out to be the easiest to convey, even if it is ugly as sin). The statement then would have looked thus:

SELECT (
CASE WHEN LENGTH('{Landmark}')=0 THEN ''
ELSE SUBSTR('{Landmark}',1,LOCATE('-','{Landmark}')
- (CASE WHEN LOCATE('-','{Landmark}') > 0 THEN 1 ELSE 0 END))
END
) AS DATA_ELEMENT
FROM SYSIBM.SYSDUMMY1;

It parses by checking Landmark length and returning a zero-length safely if it is thus; and otherwise it runs the SUBSTR with an inner CASE block that tests the LOCATE return value and returns 1 if it is >0 and 0 otherwise. (This return is subtracted from the LOCATE value, so returning 1 subtracts 1, else we subtract 0.) With an example in place:

SELECT (
CASE WHEN LENGTH('ABC-DEF-GHI')=0 THEN ''
ELSE SUBSTR('ABC-DEF-GHI',1,LOCATE('-','ABC-DEF-GHI')
- (CASE WHEN LOCATE('-','ABC-DEF-GHI') > 0 THEN 1 ELSE 0 END))
END
) AS DATA_ELEMENT
FROM SYSIBM.SYSDUMMY1;

The benefit is that this protects you from the three possible data conditions.

The Second Form

For the stripped form of the second:

SUBSTR('{Landmark}',1,LOCATE('-','{Landmark}'))
...we can modify it with a simpler inline CASE conditional. We end up with:

SUBSTR((CASE WHEN LENGTH('{Landmark}')= 0 THEN ' ' ELSE '{Landmark}' END),1,LOCATE('-','{Landmark}'))

Or in the actual SELECT form:

SELECT SUBSTR((CASE WHEN LENGTH('{Landmark}')= 0 THEN ' ' ELSE '{Landmark}' END),1,LOCATE('-','{Landmark}')) FROM SYSIBM.SYSDUMMY1;

This works because SUBSTR accepts a 0 as its third parameter, and all we need to do is ensure that Landmark (first parameter) is never a zero-length string. The specific CASE part is:

CASE WHEN LENGTH('{Landmark}')= 0 THEN ' '
ELSE '{Landmark}'
END

In other words, return a single space if the LENGTH of the actual field data is zero.

Final Thoughts

No one would ever claim that SQL was pretty to look at, but it illustrates how powerful SQL actually can be, and how it is possible to make the SQL layer of a solution (even without aid of a stored procedure) impose some order on incoming data elements.







2014-03-10

Getting Paid

This is one of those posts of frustration, so take it for what it is worth.

I have been doing development work since the very early eighties, and a trend has been becoming more notable in those many, many years. Part of it is probably that the industry has devalued itself with the pretence that software is an entertainment-focused industry. You flop Facebook into place, for example, and it becomes the superficial representation of the industry. That it is a poorly designed consumers-as-products farce is irrelevant. Businesses see it, become distracted by it, and suddenly everything must connect, or be like it, or, worse, become some extension of it. This changes expectations in a way that makes it harder to sell the idea of functional software as a business helper, or at the very least changes how people value what is built to support their operational requirements.

The actual trend though is purely financial. A vast number of businesses seem to consider good, focused development of software as a nickel-and-dime extra. Getting a fair contract price for required software development is tough, and even when possible getting paid is sometimes nearly impossible. It means that custom (and consequently effective) solutions are fast becoming less common. It isn't lack of need , or even lack of recognized need, but that businesses just don't feel like paying for a proper solution when they can get something "close enough" without paying. And I'm not talking about open-source here, either, but the far more problematic application of poorly-fit tools to clearly defined problems. 

A recent client (who shall remain nameless, but is an enormous distributor of entertainment wares) proves the nature of the end-result of devaluing the idea of proper-fit software solutions. They essentially managed a multimillion dollar warehouse system via CSV files, and spreadsheets. Now, I have nothing against either, but the idea that this is a suitable platform for big enterprise management and operations is nonsensical. It creates latency, reporting issues, and data quality concerns when you have to manage across such transport forms. Of course, now the problem is that to change is costly, and scary, so more putty gets lodged into more cracks -- and there is no apparent cohesive approach to solving the real problem, which is the lack of responsive integrative systems on the operational front. 

But again the real issue isn't the poor thinking so much as that even with it there is an apparent sense that there's no need to actually that the provision of software services with any respect. The same client base who will carp about solutions being late (not a problem I've made any of them face) are not so conscientious as to make their payments to their service providers on time. It is as if there is some new level of self-centred entering the world of business, and it seems irrelevant whether the provider is good or indifferent. It seems like there is a new rule that reads, "pay as late as possible, and well past due if possible; screw whether we need this service again." And it is that last part that confuses me.

My clients have universally praised me for my focus, my solutions, and my concerns about their businesses. And yet, of late, I spend an inordinate amount of time chasing them for payments. That providing quality service costs something seems, to many, a foreign concept. Maybe they believe I have a money tree, and it will support the family so well that I will always be present when needed?

I know I'm not the only person in this industry who is experiencing this trend toward indifference about quality of service, quality of solution, and general quality. I talk to many each week who say the same thing, which is that many of their clients are well behind the curve in paying for what they received. And while some lag is always forgiven, it reaches a point where all of them have said the words, "I can't continue in an industry where the quality is irrelevant, and the pay is unreliable." For the world this represents a serious problem, because while the young dogs may provide the next new iPhone app, it is the seasoned developers who provide the applications no one sees, but everyone uses. This squeezing of that talent pool is a critical mass issue the industry will eventually face in a bad way.

Now, I will return to fishing for cash, since my money tree has not bloomed this year...yet.