November 12, 2012

Playing with CLOB data in CUBRID

Hi.

One day, a colleague asked me how to update some CLOB data from the console (that being the CUBRID SQL command prompt)?
My answer was quick - "Why don't you use the CUBRID Manager client...?"
To my surprise, the next day my colleague came back: "Hey, I just can't figure this out - could you please gimme a hand?"...
OK - all said and done - time to work! :)

For the sake of simplicity, let's suppose we have a table (named "my_files"), with 2 columns:
- A "filename" (VARCHAR) column, which holds a file name
- A "content" column, which (obviously) stores the file content

Here is the SQL create table script:

CREATE TABLE "my_files"(
    "filename" character varying(256) NOT NULL,
    "content" clob,
    CONSTRAINT pk_my_files_filename PRIMARY KEY("filename")
); 


The problem is how to put the data (== the content of the file) into the "content" column?

OK, a quick search on www.cubrid.org will tell you that there is a very simple way - use the CLOB_FROM_FILE CUBRID function!
...and this is exactly what my colleague tried, but, here's the catch, it doesn't seem to work... :(





No matter what we tried, it just did not worked!! :(
(actually it looks to be a known bug - see http://www.cubrid.org/questions/359623)

Hmm, so what are the alternatives, if any...?

Well, the immediate solution is to use one of the many drivers CUBRID has (PHP, .NET, JDBC, Python etc.)!
...But what if you are not a programmer...?
...What if you don't wanna get caught up in compiling, debugging etc...?
...Is there no other way, really...?!

OK, so I become determined to find a solution using only plain SQL, no matter how complicated (and insane) it would be! :)
..at least, for the sake of the game! :)

The first thing to do was to figure out how exactly does CUBRID handle CLOB data?
The answer was simple - see http://www.cubrid.org/manual/90/en/Storing%20and%20Updating%20Columns.

Bottom line, LOB data (both BLOB and CLOB) is stored "outside" of the database in CUBRID, in external file(s).

So after all, it's just about some files and nothing else, right...?
Cool - now we have a plan! That is:
  1. Create some "fake"/start-up data in the table
  2. Find the "external" CUBRID file and update it manually with the new data
Let's do it!

...Let's put some "temp" data in the row:



...Let's get the file location used by CUBRID to store the data:



...Let's update manually the data in the file:



...Finally, let's confirm what we did:



Wow - that did not worked!! :(
...The new data is in the file, but CUBRID does not "see" it.... ...somehow, CUBRID seems to store the initial "length" and only shows you that much data in SELECTs...

BTW, I found a CUBRID bug - the CLOB_LENGTH function does now work ok from the CUBRID perspective, as the length reported is the "real" updated one, and not the length stored internally by CUBRID! ...So you end up with very confusing information:



I guess I will fill a Jira bug entry...

Ok, so what do we do now...? ...It seems pretty clear that we need to somehow handle the CLOB length in a way that CUBRID can "understand" it...
New plan:
  1. Create some "fake" data in the table (if not already in place)
  2. Get the length of the new data we want to put in the CLOB column! (take the length of the file containing the new data in bytes)
  3. Update the data in the CLOB column to some "fake" data which has exactly the desired length!
  4. Find the "external" CUBRID file and update it manually with the new data
Out of the the listed steps, I had only one concern - step 3.... ...how do you update the data in the column in an easy way, with the data having a desired length...?
In the end, it turned out to be quite simple - CUBRID has the REPEAT function! :) ...Step 3 become an UPDATE, using the REPLACE function to setup the desired new data length.

OK - Let's test all these:



And now everything works just great! :) ...We found a way to update CLOB data using just plain simple SQL commands! (and OS file editing)

Be aware! This is neither recommended neither supported! Don't do it at home work!!
What I showed you in this blog is only for fun!

As a side note, there are other things you might need to consider (unless you are a "mighty admin"):
  • Files editing permissions
  • Tables ownership/user rights

As always, let me know if there’s anything I could help with or anything unclear.
I hope you enjoyed the post – see you next time!

Bye-bye!
/Dee

P.S. I wonder what happens if the new data is shorter than the existing data... How does CUBRID handle that scenario...?
...What do you think...? Wanna try it...? ;)

No comments:

Post a Comment