JSON UDFs first feedback

Guest Author
Yesterday Ulf Wendel created great blog post with deep analysis of JSON UDF functions which I presented at MySQL Connect at September, 21.

Ulf found few bugs in these functions, I reported them all at bugs.mysql.com. You can find numbers in my comment to his blog post. But he also raised concerns, which can not be considered pure bugs, rather feature requests, or even design especiallities.

* First concern, of course, is the documentation. Ulf writes: "Here’s what the README, the only documentation available apart from the *.c[omment]/*.h[elp] files".

I agree single README file is not enough, but this is still Labs project for which I can not abuse MySQL documentation team for making proper documentation for me. But you still can find more information, than single README file. And these are slides from MySQL Connect, which are available on the conference website (they published them today!) at https://oracleus.activeevents.com/2013/connect/fileDownload/session/470E8AA1C46CAA3A9ACFBDB7653FC313/CON3130_Smirnova.pdf as well as on my own website at http://www.microbecal.com/ftp_load/JSON_UDFs.odp or http://www.microbecal.com/ftp_load/JSON_UDFs.pdf I strongly recommend you to download these slides, because this is the best user manual for JSON functions which exists so far.

But I also want to hear your suggestions about where I can put the complete user manual. I was going to post updates to my blog and update README file. Is this sufficient? Or do you think would be better if I put the documentation at my own website? Or pack it as a separate *-doc package at labs.mysql.com? Any other ideas? If you also miss the documentation, please comment.

* Another request which I have for you is about parameter syntax. Ulf writes: "A function that works “by key” on a subset of a JSON document, usually has the prototype: JSON_SOMETHING(string json_document, string key [, string key...], other_parameter). The first argument passed to the functions is always the JSON document to work on. It follows a variable number of arguments that describe on which subpart of the entire document the function shall work. Having this string key [, string key...] arguments in the middle of the function signature is a bit odd. Some programming languages may forbid this for style considerations. However, depending on the function, further arguments may follow, such as the value to search for or to add."

We discussed this syntax internally before I started implementing the functions and get to this style. But I understand what some people could prefer different syntax. If you such a person, please, file a feature request. At this stage it is easy to rewrite functions, so they support better style of user arguments. I will think some time about it and will post a poll to my blog with proposals which I think can be better than current syntax. Also it is possible to have different set of functions with the same functionality, but which support different style of arguments. Only requirement for this feature is users suggestions and votes. Therefore, please suggest and vote!

* Another feature for which I need your suggestions is "speed vs validity checks". I again will cite Ulf: "Taken from the README. For most functions manipulating or generating JSON documents is true: Warning! This version does not check whole document for validity. Hey, it is a labs.mysql.com pre-production release :-)."

But this design was done not because these functions are at the Labs and not because this is alpha version. We discussed this functionality internally, but decided to don't implement validity checks to speed up functions. If you think we did wrong, please, open a feature request at bugs.mysql.com about *safe* version of the functions. I also think that having two sets of "safe" and "fast" functions can also work. But I need to hear your opinions before implementing this feature.

* Another feature for which I want to count user votes is search abilities. Currently, function JSON_SEARCH uses only exact match. There is no support for wildcards, case insensitive search or full-text search. I got complains about such a limitation not only from Ulf, but from users at MySQL Connect too. Therefore I created two feature requests: http://bugs.mysql.com/bug.php?id=70571 about LIKE and http://bugs.mysql.com/bug.php?id=70572  about FULLTEXT. If you need these features, please go and vote for them using "Affects Me!" Button.

Current behavior of JSON_SEARCH is tricky and Ulf, expectedly, did mistake when tried to use it. Correct syntax should be select json_search('{"key":"value"}', '"value"' ); and select json_search('{"a":{"b":"c"}}', '"c"' ) instead of select json_search('{"key":"value"}', "value" ); and select json_search('{"a":{"b":"c"}}', "c" ) which he used:

mysql> select json_search('{"key":"value"}', '"value"' )\G
*************************** 1. row ***************************
json_search('{"key":"value"}', '"value"' ): key::
1 row in set (0.00 sec)

mysql> select json_search('{"a":{"b":"c"}}', '"c"' );
| json_search('{"a":{"b":"c"}}', '"c"' ) |
| b:a::                                  |
1 row in set (0.00 sec)

This syntax is used to let you search for objects:

mysql> select json_search('{"a":{"b":"c"}}', '{"b":"c"}' );
| json_search('{"a":{"b":"c"}}', '{"b":"c"}' ) |
| a::                                          |
1 row in set (0.00 sec)

You can also notice "::" in the end of the key path. This is the root element and done by design.

And last Ulf's concern is about syntactic sugar:"With a bit of syntactic sugar, say “SELECT document.member[offset]” this could look much more appealing. On the contrary, I am talking syntaxtic sugar only! Syntactic sugar is really hard to add with todays MySQL."

I so agree here! I wish I could implement the functions in such a way! But it is not possible to change MySQL syntax with UDFs!

And while we can not do much for syntactic sugar, you still can help to make better other features. Please comment and vote for the user manual format and location; parameter syntax; safe versions of the functions and extended search capabilites.

Thank you! :)

Join the discussion

Comments ( 6 )
  • Ulf Wendel Thursday, October 10, 2013

    .oO( I wonder if I pushed you into the right direction. Discussing features through blog posts is interesting, hope its not misread as discussing 'bugs'. We both fish for input. )

    Documentation -

    I vote for including anything you have in the download. Download and official website are the places I check, and I expect people to check.

    Validity check of JSON returned by functions -

    The big risk for the user is on creating garbage without realizing it in statements such as UPDATE doc = JSON_SOMETHING() [WHERE ...]. Any SQL warning about problems would come after data integrity has been destroyed. If there was an option to restrict return values to "valid JSON or NULL", then one should be able to do something along the lines of UPDATE doc = IFNULL(JSON_SOMETHING(), doc) [WHERE ...].

    Search syntax -

    You explain my mistake was on not using double quotes:

    JSON_SEARCH('{"key":"value"}', '"value"' )

    Correct: '"value"' (note the double quotes)

    Incorrect: 'value'

    The function prototype is:

    JSON_SEARCH(string document, string value)

    Seemingly the function wants to give the ability to do more than string match. It does search for type and value. If your document is {"a":[1]} and you search for the number, your search expression must be 1, the resulting string function argument for JSON_SEARCH must be '1'. But, if your document is {"a":["1"]} and you search for the number that is stored as a string, then your string function argument for JSON_SEARCH must be '"1"'.

    Now, that's a big pitfall for anybody coming from a dynamically typed environment such as PHP. In PHP the JSON_SEARCH() function signature would be JSON_SEARCH(string doc, mixed value). One could call the function with 1 to search for a number and "1" to search for a string.

    ... would be nice to have typed and untyped search. In PHP world this is "==" (identical value) vs. "===" (identical value and type).

    Even with this finally understood, search remains weak:

    mysql> select json_search('{"a":[true], "b":[true]}', 'true')\G

    *************************** 1. row ***************************

    json_search('{"a":[true], "b":[true]}', 'true'): 0:a::

    Which SQL expression would give me all the elements that have a value of true? And, what is this '::' bloat good for. It really serves no purpose, it adds no information.

    Also, if you wanted to use the JSON_SEARCH return value as an input to another JSON function - how, without a round trip to the client?

  • guest Thursday, October 10, 2013

    Post a Comment:

    Your comment was marked as spam and will not be displayed.

    The TypePad AntiSpam service (http://antispam.typepad.com) says this comment is spam.

  • Sveta Smirnova Thursday, October 10, 2013

    Thank you very much for the feedback, Ulf! I will consider to make better JSON_SEARCH function.

    > Also, if you wanted to use the JSON_SEARCH return value as an input to another JSON function - how, without a round trip to the client?

    In which format do you want to receive return value? This function searches for the path and no JSON function works with only path now. Please send me use cases.

  • Sveta Smirnova Thursday, October 10, 2013

    Please also ignore spam reports. This blog engine marks as spam even my own comments which I have to manually approve =) I always check replies to my entries and approve them unless they are really spam.

  • Sveta Smirnova Friday, October 11, 2013

    I opened two more feature requests.

    Regarding documentation - http://bugs.mysql.com/bug.php?id=70605

    Validity checks for all functions - http://bugs.mysql.com/bug.php?id=70606

    JSON_SEARCH for many occurrences - http://bugs.mysql.com/bug.php?id=70607

    Regarding SEARCH output format I still need to think on design.

  • urHelper Thursday, October 24, 2013

    try this http://codebeautify.org/ json editor with beautify,minify and also convert to xml,csv format and vice-versa.

Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.