A little less than a week ago, I opened a zoomerang survey about database stored procedures. In this post, I’m presenting the results.
In just a few days, the survey was visited 232 times, and no less than 155 people completed the survey! (There was a very small (2) number of people that answered some, but not all the items, and these are excluded from the results proper.)
I didn’t really know what to expect, but I did not expect as much as this! So, I’m quite pleased, and I want to thank everybody that took the time to complete the survey. Thank You very, very much, I appreciate your efforts a lot!
Before we Begin
Before I present the results, I want to explain a few things about the survey.
First of all, I want to stress that the results I am about to present are not to be generalized. There is no way to discover to what extent the responses are representative for a larger group.
To put things in perspective, one should realize that I actively solicited response by posting announcements on my own blog, the O’Reilly Database blog and the MySQL stored procedure forum. In addition, I posted to the PlanetMySQL Google Group.
If you want to use the results somewhere, feel free but be sure to link back here so people can judge for themselves what the status is of the results. Thank you.
Secondly, why do I need this survey? Isn’t it clear when and where to use stored procedures?
Well, I have the impression that there are quite a few myths concerning stored procedures in general. Stored procedures seem to be the kind of topic that deeply divides database professionals in two camps. The discussion whether stored procedures are “Good” or “Bad” seems to have revived ever since MySQL started supporting them.
I happen to believe that stored procedures are Good in some cases, but Bad in others. I also think that in most cases, it is possible to deduce whether one should use a stored procedure - and when not. I’ve been wanting to write an article about this for some time now. However, there are already a lot of articles about it, and I decided it would be fun to link it to an inquiry so it would be easier for people to relate to it.
(The article is going to have to wait until next time - I’m just presenting the results of the survey -your opinion- today)
Now - without further ado - the results!
Performance
The first item in the survey was:
Certain application tasks can be implemented either as a server-side stored procedure or using client-side application code and ‘pure’ SQL statements. What is your opinion on the difference in performance between these two approaches?
Three alternative answers were offered:
- Server side stored procedures are faster than client side application code and pure SQL statements (72%; 112)
- Server side stored procedures are slower than client side application code and pure SQL statements (8%; 12)
- I have a different opinion (20%; 31)
Here’s a pie chart of the results:
I must say, this one really surprised me. There seems to be quite a large group of people that have the opinion that procedures outperform normal, plain SQL statements. The people that think the opposite are in the smallest group. There is quite a substantial group that did not want to choose a particular stand and chose to formulate a different opinion.
In the group that chose “I have a different opinion”, the vast majority (25 respondents, 16% of the total) provided an answer to the effect that it all depends on the specific situation. In some cases, a reason was given, usually to indicate that stored procedures could help to reduce network roundtrips.
One respondent had a very original and interesting answer:
Maybe faster but a severe CPU scalability risk.
If I understand correctly, what is meant is that although executing an individual stored procedure maybe faster than executing the equivalent application code, stored procedures can have an adverse effect on scaling up the solution.
Because stored procedures reside in the database, and are executed on the database server, the application will be limited by the processing power of the database server. This can be remedied by physically separating the processing tasks from the database tasks. This allows one to grow the application layer more independently from the data layer - something which will become more and more important as the number of application users grows.
Two persons gave a response that was very MySQL specific. Here are the responses:
Neither. Depends with MySQL on num of things…
and
*MySQL* SP are slower than normal SQL
I find the latter comment interesting because MySQL stored procedures are indeed different from, say, Oracle or MS SQL procedures in this respect. MySQL compiles stored procedures very differently than Oracle or MS SQL does. I’m planning to illustrate this in more detail next time.
A minority (2 respondents) just didn’t have a specific opinion, and 1 person argued that it didn’t matter at all, and that stored procedure performance would be about the same as that of application code running ordinary SQL statements.
Portabilty
The second item in the survey:
What is your opinion on the effect that stored procedures have for the portability of database applications?
Again, three alternative answers were offered:
- Using stored procedures increases the portability of database applications (21%; 33)
- Using stored procedures decreases the portability of database applications (68%; 105)
- I have a different opinion (11%; 17)
And the pie chart of the results:
There’s a large majority that thinks stored procedures make applications less portable. The second largest group thinks the opposite. The group of people that have another opinion is now much smaller as compared to the previous item, but still substantial.
The largest (7) group of people with another opinion say something to the effect of “It depends”. I was a bit surprised that only two people answered:
Portability of what - application or database?
Three individuals answered that portability is indeed hampered, but they argued that portability just isn’t an issue at all, because they don’t port. (Presumably, they mean they don’t ever port the database)
Appropriateness
The third item tried to probe how people felt about database stored procedures in general. Given that databases are supposed to offer an implementation of the relational model, some people argue that stored procedures are actually completey alien and should not be a part of them. So my third item was:
Which of the following statements best matches your own opinion regarding stored procedure support in relational database products?
Again, three alternatives:
- A stored procedure is a completely non-relational thing. Therefore, stored procedures do not belong inside a relational database management system. (12%; 19)
- Stored procedures offer functionality that is complementary to the relational features of a relational database management system. Therefore, stored procedures are a useful feature for building database applications. (83%; 127)
- I have a different opinion (5%; 7)
Here’s a piechart of the results:
So, this certainly is the least controversial item when looking at the numbers: a clear majority seems convinced that stored procedures are a useful tool for building applications. Although it is still considerable, it is the minority that feels stored prodedures are inappropriate (the survey does not grant extra points for passionate defense of the opinons ;).
In this case the group of people with different opinions was small, at least as compared to the other items. It was also not really possible to categorize these responses any further: everyone had their own reason for not choosing one of the other alternatives.
Programming Languages
The survey also inquired after the programming languages people use for their stored procedures and application. These item are a bit different as they ask about a fact rather than an opinion. Also, people could select multiple answers in addition to adding extra options.
First of all, I wanted to know which stored procedure languages are used. The item:
Which stored procedure language do you use regularly to build databases/database applications? Please check only those languages from which you use procedural constructs (rather than pure SQL constructs)
People could choose from these languages:
- DB2 Stored Procedures - IBM DB2 (5%; 8)
- java stored procedures - Oracle (5%; 8)
- MySQL Stored Procedures - MySQL (45%; 65)
- PL/pgSQL - Postgres (18%; 26)
- PL/SQL - Oracle (36%; 52)
- T-SQL - Sybase (7%; 10)
- T-SQL - MS SQL Server (33%; 48)
- .NET stored procedures - MS SQL Server (6%; 9)
- Other, specify database vendor and language name (10%; 15)
Here’s a bar chart, sorted by frequency:
So, most people that did the survey use MySQL. This is not very surprising as it is my usual topic (and I expect most respondents to be reader of my blog). Still, a considerable number of respondents use other languages (too); Oracle PL/SQL, and T-SQL (MS SQL) are really not that far behind. Beyond that, the numbers become considerably smaller, although PL/pgSQL (Postgres) and “Other” are still quite significant.
The “Other” category indicated two important omissions on my part: 6 respondents (4%) named “Firebird”, “Interbase” or some combination of both. I feel I should’ve included that in the survey and I’m ashamed to say it did not cross my mind when I created the survey. Second to that, 5 respondents (3%) specified that they used no stored procedures at all, or at least not regularly. I should have made a separate category for that too I guess.
What is maybe one of the most interesting obervations is the relatively small but still substantial amount of general purpose languages in the stored procedure arena. Oracle java stored procedures (8 respondents; 5%) and MS SQL .NET stored procedures (9 respondents; 6%) show that these things are actually used. It would be very interesting to find out if these respondents used that in addition to or instead of traditional PL/SQL and T-SQL.
The last item in the survey was about application programming languages. The item:
Which computer languages do you use regularly for programming database applications?
The options were:
- C (17%; 25)
- C++ (15%; 23)
- C# .NET (19%; 28)
- COBOL (1%; 1)
- Delphi (5%; 8)
- Java (38%; 57)
- Javascript (14%; 21)
- LISP (0%; 0)
- Perl (33%; 50)
- Python (15%; 23)
- PHP (53%; 80)
- Smalltalk (1%; 2)
- VB (9%; 13)
- VB .NET (8%; 12)
- VBScript (5%; 8)
- .NET - other (1%; 1)
- Other (15%; 23)
And the bar chart:
Clearly, PHP is the most popular language among the respondents, but Java and Perl are also quite strong. After that, there are quite some languages that certainly are used, but not by far as much as PHP, Java and Perl.
The “other” category showed quite a few recurring responses, again indicating I forgot to mention a few important ones. Ruby, Rails, and combinations of both were mentioned 13 times (8%) which is too substantial to ignore. PL/SQL was mentioned 4 times (3%) and it would be interesting to see if these respondents mean they use PL/SQL to create full blown end-user (web)applications (this is a feature I heavily used when I was working as application developer) or ‘just’ utilities, ETL and the like.
Summary
- Most repondents think stored procedures are faster than pure SQL and application code
- Most repondents think stored procedures make the application less portable
- Most repondents think stored procedures are a useful addition to the database
- Most repondents use MySQL, PL/SQL and/or MS SQL stored procedures
- Most repondents use PHP, Java and/or Perl
Stuff that remains to be done
These are just the results. I’m planning to do a real analysis of the results to see if I can identify groups of respondents. For example, it seems obvious that respondents that use PHP and MySQL stored procedures are also the ones that think stored procedures are a good tool for application development, and that they are faster but less portable (it’s a safe assumption, as these sum up all majorities ;). But how well do these groups really hang together? This an other things shortly on this blog.







Slower? Making the database parse a SQL string *in addition* to executing it cannot ever take less time. And dragging huge datasets through the wire so the app can filter/join/transform doesn't seem like it could ever improve performance, either. Maybe the fact that most of the respondents use MySQL belays a shortcoming in the maturity of sproc support in that particular engine, more than indicting the performance of sproc performance as a whole.
Less portable? Does MySQL not support the SQL-99 spec, either? Or is this a reference to the fact that MySQL only recently supported sprocs at all? Is there a reason why sprocs cannot use standard SQL-99 SELECT, INSERT, UPDATE, and DELETE statements?
I'm pretty disappointed that the survey asked nothing about security, which is the best reason to use them in the first place. If you don't want to open up access to the entire schema to an application (putting you at serious risk if someone uses the same credentials in an ad-hoc environment), sprocs allow you to gate access to just the *records* appropriate to that user, or enforce logging, or implement business rule conformance. Sprocs (when used with command objects) also give you immunity to SQL injection attacks.
Also, if the application has to perform all of your business logic, then you have to re-implement that code if you have more than a single app using that database. Each app would include its own copy of the business rules, and that is certainly not relational. With sprocs-only access to the database, you always know that all current and future apps will behave consistently. This also allows you to implement schema changes from one version to the next without changing the sprocs (usually), requiring *no code changes*. This is why this approach is a best practice in OOP.
These are the reasons to use sprocs whenever possible: security, encapsulation/reusability, and performance.
I look forward to hearing about the "other" cases when sprocs are "Bad".
Hi Brianary,
Please realize that the survey itself is not in anyway focused on MySQL - I merely pointed out that I expect many of the respondents to be MySQL users.
"Slower? Making the database parse a SQL string *in addition* to executing it cannot ever take less time. And dragging huge datasets through the wire so the app can filter/join/transform doesn't seem like it could ever improve performance, either. Maybe the fact that most of the respondents use MySQL belays a shortcoming in the maturity of sproc support in that particular engine, more than indicting the performance of sproc performance as a whole."
Maybe you should re-read that section. I'm never saying anything is slower; Not so far at least ;) I'm merely reporting that the majority of the respondents are convinced stored procedures are *faster*.
"Less portable? Does MySQL not support the SQL-99 spec, either? Or is this a reference to the fact that MySQL only recently supported sprocs at all? Is there a reason why sprocs cannot use standard SQL-99 SELECT, INSERT, UPDATE, and DELETE statements?"
MySQL supports a subset of the SQL-2003. I don't really understand the comment about the use of INSERT, UPDATE etc in stored procedures. At least, MySQL supports all these constructs.
"I'm pretty disappointed that the survey asked nothing about security, which is the best reason to use them in the first place."
This is *your opinion* ;) It is indeed a pity I did not include an item for security, we could've tested to what extent your opinion is shared by others ;)
"If you don't want to open up access to the entire schema to an application (putting you at serious risk if someone uses the same credentials in an ad-hoc environment), sprocs allow you to gate access to just the *records* appropriate to that user, or enforce logging, or implement business rule conformance."
I would probably prefer views to restrict access to particular rows. However, it's a mystery to me how you want to safely do this if you cannot distinguish the users from their account data (user name, possibly host from which they are connecting).
"Sprocs (when used with command objects) also give you immunity to SQL injection attacks."
Well, so do prepared statements. So, that's not really a defining property of stored procedures as far as I can see :)
"Also, if the application has to perform all of your business logic, then you have to re-implement that code if you have more than a single app using that database. Each app would include its own copy of the business rules, and that is certainly not relational."
Well, exactly for this reason, I highlighted a few individual responses. The item about "Portability" is intended to be imprecise, and it leaves the respondent to decide: "portability of what, database or application".
Clearly, your focus seems to be on porting application code more than on porting to another database platform. If you are in the business of selling a particular application (SAP comes to mind) you are probably better off building as much of the logic as you can in the application, or inside some middleware between the application and the database.
I don't really see what putting business logic in the application has to do with not being relational though...Or do you mean with "business logic" plain database constraints? I mean, usually, there's a long way between business logic and database constraints ;)
"This also allows you to implement schema changes from one version to the next without changing the sprocs (usually), requiring *no code changes*."
Mmm, I'm having some trouble understanding exactly how the stored procedures can benefit from the changed schema structure if you won't touch the code. I mean, I can see a scenario where the stored procedures implement (steps of) the business processes and the application does not have to change, or is at least somewhat shielded. However, usually adding functionality goes hand in had with changing table structures to convey more or other data. The application still needs a way to communicate with the database in order to benefit from that. So, your application (and procedures) need to be changed anyway.
"These are the reasons to use sprocs whenever possible: security, encapsulation/reusability, and performance."
Well, at least, it is *your opinion* that this are the reasons ;)
"I look forward to hearing about the "other" cases when sprocs are "Bad"."
I never said they were Bad. Maybe you should reread the introductory section. I very clearly stated there that I think they are Good in some cases and Bad in others and that I think it is usually pretty easy to figure out which one. I will be doing a write up about that Real Soon Now. As ever, I'll be glad to receive your comments on that too :)
Hi Roland, the survey is great despite the statistical limitations you mention.
Your results seem to demolish the popular stereotypes that "mid-tier" developers (using PHP, Perl, Java etc) are averse to leveraging programmatic features of the database tier.
Regarding "portability", I think the answer to the question "portability of what, database or application?" is usually: neither! In all the cases I've seen of teams striving for portability, it seems the only actual value captured is in terms of "developer skills portability". Not an insignificant concern, but one often obscured by proponents of high-faluting architectural aesthetics. Database portability was a real concern 5-10 years ago when vendor stability was by no means assured and standards were still evolving, nowdays I think the pragmatic architect realises that a well-factored data access layer means proprietary database features can be exploited not avoided. But it is the hidden desire to maximise (perhaps my own) "developer skills portability" that keeps the database independence imperitive alive.
I enjoyed the survey. I was suprised at the majority who still love stored procedures. Since 2005, every org I've been at as an application arch, we've moved every stored procedure off of the databases, letting DBs do what they do best. Stored Procs coded into enterprise objects that can be called by clustered portals with HA, and all fiber intrastructure. The business case has to exist in order to move from stored procs to alternatives.
survey interesting.
sp's have their place. I've seen bad ones, no ones and good ones. All depends on the 'human' involved.
they work well within a good architected system. they work very poorly when created by non-dba's (i.e. those that don't know a 'sarg' from a clustered index)
Lal
i had a project some months age, it worked with lots of tables, and i had added lot's of queries inside my code, when we started the beta test with a mass of data, the performance decreased a lot(e.g. 10sec wait for a form to load!!!) it was disappointing. after hanging around the web for some days i decided to move all queries to seperate sql stored procedures. after it, the result was wonderful, we speed up the project up to 10 times, that's why i always prefer SPs
Hi Alireza!
Interesting result! What database and programming language did you use in your project?
@Roland:
This normally means that some of the her queries are:
1) full table scans (select * from table) without a (or with a poor or badly index covered) where clause used for processing large resultsets (probably reports)
2) very large and complex queries, which sometimes need to complemented by minor queries
3) complex updates
These cases can be really covered with advantages by stored procedures, because the procedural approach of sproc can simplify the code and perform better.