November 10, 2012

Emulate long running SQL queries in CUBRID

Hi.

If you are here and reading this blog post, it’s most probably because you had (or have?) the need to get your hands on a long running query… (…and by “long-running query” I mean a SQL query which takes more than just a few seconds to execute – maybe minutes or more…)


Actually, any database programmer must deal with this, sooner or later, simply because you need to test how you application behaves when a database query execution takes like forever (== a long time)!
How will your application handle it…? It will gracefully wait for the query to complete…? It will crash your app…? It will hog your CPU…?


What’s the problem, after all?


So all it takes is to have a long running query and use it - that’s it, right…?
Well, this is much-much easier said than done! ;)  …Where is that “magical” long running universal soldier query…? How do you control, if possible, how much time its execution will take…? What does it take to make it work in CUBRID…?

OK, let’s start with a good old  Google search… …and another Google search
Hmm, not very promising… :(
Summarizing, it looks like we would have to look into:
-          Getting big data (millions of rows) in place
-          Doing heavy JOINs
-          Implement WHERE conditions which avoid indexes (SUBSTR etc.)
-          Use sub-queries
-          Use “slow” functions, if any available; see, for example, http://stackoverflow.com/questions/3892374/how-to-intentionally-create-a-long-running-mysql-query

Bottom line - not that plain simple at all…!

Searching for more, I found on SourceForge.net an interesting project named CUBRID Performance project, which actually does have some long running CUBRID queries:


 But it seems that I would have to go through some databases setups… …still not that as simple as I would like to…

OK, time to step back and clearly define what I want:
-          I want an easy way to get a long running query
-          I want an easy way to have a SQL (query) execution time controllable
-          I want NO CPU hogging
-          I want NO memory or any other resources hogging (so forget about big tables!)

How do we solve the problem?


As said before, it seems there are 2 main possible approaches:
-          Use query which runs on “big” data, and do “heavy” stuff in the database
-          Use procedures/functions which takes time to execute (some complex math stuff, for example)

Unfortunately, both of these approaches “proudly” fail my goals… :(
So I needed another way… …another solution…

…And suddenly, while thinking about it, the solution was right there waiting for me! :)
…Doesn’t CUBRID have support for Java stored procedures? Yes, it does!!
…Doesn’t CUBRID let you write Java code which you can execute from anywhere, via stored procedure/functions calls? Yes, it does!!
…Doesn’t this solve all my concerns…? Most probably, YES!!

Wow – That’s it! - Let’s do it!

Solution


I will not bother you with all the details of doing stored procedures in CUBRID – it’s all online
…And I will definitely not bother you with all my trying’s and failures – I’ll just show you my solution.

A few highlights:
-          I wanted to be able to run SQL queries on “top” of the timeouts, so I decided to go on with a function – timeout_str - which returns an empty string – and can be used in STRING/VARCHAR functions, like CONCAT etc. This way, I could do things like:

SELECT CONCAT(timeout_str(3000), code.s_name) FROM code;

-          For those times where an INT would be useful, I decided to implement a variant – timeout_int – which returns an Integer ( == timeout input parameter)

Here is the Java code – plain and simple:

/**
 * CUBRID utility stored procedures
 */
public class utils {
  /**
   * Timeout and return a string
   *
   * @param timeout Timeout values in ms.
   * @param ret     Returned string value
   * @return String
   */
  public static String timeout_str(Integer timeout, String ret) {
    try {
      Thread.sleep(timeout);
    } catch (Exception ex) {
      // do nothing
    }
    return ret;
  }
 
  /**
   * Timeout and return an integer value
   *
   * @param timeout Timeout values in ms.
   * @param ret     Returned integer value
   * @return Integer
   */
  public static Integer timeout_int(Integer timeout, Integer ret) {
    try {
      Thread.sleep(timeout);
    } catch (Exception ex) {
      // do nothing
    }
    return ret;
  }
}

Next, after compiling the Java code, I loaded the compiled class in CUBRID, using the loadjava tool:


Then, I created the 2 stored functions, using the CUBRID Manager GUI:
-          timeout_str
-          timeout_int



And finally, let’s have some fun – everything works just great! – see below:




Moreover, timeouts do add (as they are supposed to!):


All in all, pretty cool, right…?!

Of course, let me know if there’s anything else I could help with or anything unclear.
I hope you enjoyed the post – see you next time!

Bye-bye,
/Dee



P.S. You can download the latest CUBRID 9.0 beta release from here:

They say it’s 3x times faster…! …More reason to go on with my timeout solution – ha-ha-ha! :)

P.S.S. Let me know if you would be interested in a MySQL (similar) solution… …or maybe you already have one and you would like to share it…? :)

P.S.S. Let's register this on Technorati - 7CZ7ETN4TB89

1 comment: