Reality check - when to stop adding features

Bit off-topic here, but I'm having a crisis of faith.

I'm working on a PHP project that of course has a back-end database in MySQL. The web monkey developer in me starts to get all twinkly-eyed about abstraction layers and database agnosticism and structured query builders and all that. The code hacker in me thinks about the hundreds of lines of code doing what can essentially be done in far less time with hard-coded text.

Not to mention that SQL is a pretty rich language, and to create a query builder that is not more limiting than a two-sizes-too-small pair of undies, it would have to support a lot of syntax that, frankly, I don't use very often. Can't remember the last time I used "INNER JOIN" or "UNION" or even "GROUP BY". The bulk of what I need to do is mostly SELECT .. FROM .. LEFT JOIN .. ON .. WHERE .. ORDER BY .. LIMIT ... and that's about it. Except for those cases where I use INSERT INTO .. VALUES .. ON DUPLICATE KEY UPDATE .. which I imagine would be a nightmare to build from an OOP perspective.

OK, so let's weigh the merits:

$cols = ('sessions.id', 'sessions.user_id', 'users.user','sessions.key');
$qb = new MyBuilder();
$sql = $qb->select($cols, $table)
          ->leftjoin('users')->on('sessions.user_id', '=', 'users.id')
          ->where('users.admin', '!=', 'TRUE')
          ->limit(1);

...compared to:

$sql = "SELECT `sessions`.`id`, `sessions`.`user_id`, `sessions`.`key`, `users`.`user` "
     . "FROM `sessions` LEFT JOIN `users` ON (`sessions`.`user_id` = `users`.`id`) "
     . "WHERE (`users`.`admin` != TRUE ) "
     . "LIMIT 1;";

OK, so the query builder version looks pretty good.... at least, more code-like, rather than random blobs of quoted text. However, I'm not exactly saving keystrokes here. Most of the parameters are still hard-coded literals. At best, I can avoid some typos because my editor can parse and highlight the PHP syntax, and through the method names, I get a helping hand with the exact SQL syntax (which I don't use regularly enough to rattle off the top of my head without errors).

But, it starts to get messy when you need to, for e.g., use sub-queries. Or what if you want to join on multiple columns? Then there has to be extra logic in the builder code, and mixed input parameters (arrays and such), or a bunch more method calls (like "orwhere", "andwhere", "wherenot", "wherebetween", etc.) There are so many options, it almost gets to the point where every keyword is a method, in which case I'm not really gaining much of anything, and potentially costing a lot of CPU time for text processing.

Finally, I thought... well, at least I get quoting for free. I can programmatically add the tick marks, and even recognize table.column pairs and quote them separately (i.e., "table.column" becomes "table.column"). But, that's a weak argument in the cost-benefit analysis, and how do you even know what's a column name and what's a literal? Or a function? You pretty much have to write a basic SQL parser that is context-aware ... and even then, you can't reliably tell a text literal from a column name without knowledge of the schema. (The SQL parser can at least resolve potential ambiguities, or it enforces rigid qualification and/or quoting, or all of the above.)

Solutions include fixed formats ("left, operator, right" tuples might be forced to column on the left, literal on the right) but what about "where 1=1" type stuff? How do I prevent my code from creating "where 1='1'"? (Ignoring the obviousness of integer parameters in this case.)

So I'm thinking this is a fundamentally flawed approach, but generating dynamic queries always takes a certain amount of code that I find myself duplicating, so I have a strong urge to centralize that stuff somewhere. It's really tough finding the balance, though, between code re-use, and wheel re-inventing. And while it has been an interesting day of problem-solving, I'm ready to scrape this sheet of burnt, yet somehow still half-baked, cookies into the bin and go back to canned queries, maybe with a couple of DB util functions to quote things, or arrays of things.

Sigh... well thanks for listening. :wink:

I didn't read all of that, must confess, way too long, so what I say below may be redundant.

Reality check - when to stop adding features

The answer must surely be, when the URS has been met. The URS dictates the project scope, and you get paid to deliver that, nothing more, nothing less.

You don't just "add" features- they're all designed in and constructed accordingly. This is true of any project, regardless of the discipline. You don't just, halfway through building a house, say "hmmmmm let's stick a swimming pool over there".

Change during the project must be managed: so if for example a new product / technique etc comes along, or the client has a whim to add something, or you as developer have such a whim, simply a matter of change request which gets rejected or approved. If approved, scope is amended, project continues.

Features do not just get added.

For a commercial project, I wouldn't disagree with you. For a pet project, though, I follow my whim. I've abandoned some, and I'm still working on some years later (more off than on, but still). That's the prerogative of a hobbyist.

A sketch is like a Wikipedia page: you will never finish to edit it :wink:

as long as requirements are not frozen, features will come (and go)

Creeping featurism is a term synonymous with bad management.

GoForSmoke:
Creeping featurism is a term synonymous with bad management.

Exactly..... bad project management, especially; the dreaded scope creep.

JimboZA:

GoForSmoke:
Creeping featurism is a term synonymous with bad management.

Exactly..... bad project management, especially; the dreaded scope creep.

Last time it got pulled on me by a no-nothing in charge, it was to be like Microsoft.
I just about screamed. That is Wrong in so many ways!

GoForSmoke:
Creeping featurism is a term synonymous with bad management.

or no management at all

robtillaart:

GoForSmoke:
Creeping featurism is a term synonymous with bad management.

or no management at all

The cynic in me, thinks that no management is better than bad management....

SirNickity:
Finally, I thought... well, at least I get quoting for free. I can programmatically add the tick marks, and even recognize table.column pairs and quote them separately (i.e., "table.column" becomes "table.column"). But, that's a weak argument in the cost-benefit analysis, and how do you even know what's a column name and what's a literal? Or a function? You pretty much have to write a basic SQL parser that is context-aware ... and even then, you can't reliably tell a text literal from a column name without knowledge of the schema.

One of the reasons you use the tick marks with MySQL queries is to protect yourself when you encounter reserved words. Assuming you've debugged your code and/or created a parser "today" to make sure you're protected, you still have no way of predicting what reserved words or new syntax will appear in future versions of MySQL. The only valid tick-mark-parser would need to be a facility included in the MySQL driver itself.

Yep, that's exactly what I (eventually) decided. I use tick marks in my code because I know precisely what's a keyword and what's a proper name. My code would never know that for sure. So, I saved what I had written to a scrap file, in case any of it ever came in handy elsewhere, and removed it from the application.

Yet another flawed attempt at idealizing an abstraction layer. Oh well. :slight_smile: Just thought I might be in good company here.

When I was writing my previous response I also tried finding a way to use a MySQL facility to do that type of quoting. I have plenty of experience with MySQL so I'm quite familiar with the documentation, but I can't find anything that would help you with your problem. I'm kinda surprised that there's nothing in the MySQL drivers/libraries for that.

I personally don't use those tickmarks at all. You definitely gave me some food for thought tho :slight_smile:

"Though a program be but three lines long, someday it will have to be maintained.''

I think I started doing it because I had a column named "key" that sometimes caused ambiguous syntax. If not "key" then something similar.. I don't remember for sure. I've been throwing gratuitous tick-marks in my queries ever since. May as well make the parser's life easy.