Big Data Querying

Ghislain Fourny
Big Data
10. Querying
(Image: pinkyone / 123RF Stock Photo)

Declarative Languages
What vs. How

Functional Languages
Keywords and operators shown around the word "Expression": while, =, let, any, return, if, for, then, +, where, else, order by, every, exit with

Ever played Lego?
Expressions snap together like bricks. The slide assembles these pieces one at a time:

    if ( my:func( 2 ) )
    then for $x in ... let $y := ... return ...
    else ...

Language ecosystem

                        XML                        JSON
    Navigation          XPath                      JSONPath, JSONSelect
    Transform           XSLT                       JSONT
    Query               XQuery 1.0/3.0             XQuery 3.1, JSON Query, JSONiq
    Update, Scripting   XQuery Update,             JSONiq
                        Scripting Facility

Try it out

XML Navigation (XPath, XQuery)

The slash operator

    <?xml version="1.0" encoding="UTF-8"?>
    <countries>
      <country code="CH">
        <name>Switzerland</name>
      </country>
      <country code="F">
        <name>France</name>
      </country>
      <country code="D">
        <name>Germany</name>
      </country>
      <country code="I">
        <name>Italy</name>
      </country>
      <country code="A">
        <name>Austria</name>
      </country>
    </countries>

    doc("myfile.xml")/countries/country/name

The slides that follow show this same document next to each path expression.

Axis

    doc("myfile.xml")/child::countries/child::country/child::name
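The slash operator can be tried out without an XQuery engine. The following is a minimal Python sketch, assuming the countries document from the slide is inlined as a string (the variable names are illustrative, not part of the lecture). It uses the limited XPath subset of the standard library's `xml.etree.ElementTree`, where each `/` is a child step just like `child::` above.

```python
import xml.etree.ElementTree as ET

# The countries document from the slide, inlined so the sketch is self-contained.
COUNTRIES_XML = """<countries>
  <country code="CH"><name>Switzerland</name></country>
  <country code="F"><name>France</name></country>
  <country code="D"><name>Germany</name></country>
  <country code="I"><name>Italy</name></country>
  <country code="A"><name>Austria</name></country>
</countries>"""

# fromstring() returns the root element, here <countries>.
root = ET.fromstring(COUNTRIES_XML)

# Rough equivalent of /countries/country/name: starting at <countries>,
# take the child step "country", then the child step "name".
names = root.findall("country/name")

# The result is a list of <name> elements in document order.
name_texts = [n.text for n in names]
print(name_texts)
```

Note that the path is relative to the root element, so the leading `countries` step of the XPath expression is implied by the starting point.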
Axis

    doc("myfile.xml")/descendant::country/child::name

All Axes

    Forward axes            Reverse axes
    self::
    attribute::
    child::                 parent::
    descendant::            ancestor::
    descendant-or-self::    ancestor-or-self::
    following-sibling::     preceding-sibling::
    following::             preceding::

Axis

    doc("myfile.xml")/descendant::country/attribute::code/data()

Attribute Abbreviation

    doc("myfile.xml")/descendant::country/@code/data()

Descendant-or-self Abbreviation

    doc("myfile.xml")//country/@code/data()

Filters

    doc("myfile.xml")/descendant::country[@code/./data() eq "CH"]
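The descendant step, the attribute step and the filter can be mirrored in Python as well. This is a sketch under the assumption that the countries document is inlined as a string; `ElementTree` only supports a small XPath subset, so `data()` is replaced by reading the attribute value directly, and the predicate uses the `[@attr='value']` form that the library accepts.

```python
import xml.etree.ElementTree as ET

# The countries document from the slides, inlined so the sketch is self-contained.
COUNTRIES_XML = """<countries>
  <country code="CH"><name>Switzerland</name></country>
  <country code="F"><name>France</name></country>
  <country code="D"><name>Germany</name></country>
  <country code="I"><name>Italy</name></country>
  <country code="A"><name>Austria</name></country>
</countries>"""

root = ET.fromstring(COUNTRIES_XML)

# Rough equivalent of //country: search all descendants of the root.
countries = root.findall(".//country")

# Rough equivalent of //country/@code/data(): the attribute values as strings.
codes = [c.get("code") for c in countries]

# Rough equivalent of the filter [@code eq "CH"]: keep only matching countries.
swiss = root.findall(".//country[@code='CH']")

print(codes)
print(swiss[0].find("name").text)
```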
Implicit Context Item

    doc("myfile.xml")/descendant::country[@code/data() eq "CH"]

Joker

    doc("myfile.xml")/descendant::country[@code/data() eq "CH"]/parent::*

Parent Abbreviation

    doc("myfile.xml")/descendant::country[@code/data() eq "CH"]/..
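The parent step can be sketched in Python too. `ElementTree` elements do not store a pointer to their parent, so this sketch builds an explicit child-to-parent dictionary first; that dictionary and the variable names are illustrative assumptions, not something from the lecture.

```python
import xml.etree.ElementTree as ET

# The countries document from the slides, inlined so the sketch is self-contained.
COUNTRIES_XML = """<countries>
  <country code="CH"><name>Switzerland</name></country>
  <country code="F"><name>France</name></country>
  <country code="D"><name>Germany</name></country>
  <country code="I"><name>Italy</name></country>
  <country code="A"><name>Austria</name></country>
</countries>"""

root = ET.fromstring(COUNTRIES_XML)

# Map every element to its parent by walking the whole tree once.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Rough equivalent of /descendant::country[@code eq "CH"].
swiss = root.find(".//country[@code='CH']")

# Rough equivalent of the trailing /parent::* (or /..) step.
swiss_parent = parent_of[swiss]

# Going up once more from <name> lands on the <country> element again.
name_parent = parent_of[swiss.find("name")]

print(swiss_parent.tag, name_parent.get("code"))
```

The root element has no entry in `parent_of`, which loosely corresponds to the parent axis returning nothing above the top of the tree.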
Alternative

    doc("myfile.xml")/descendant::country[./@code/data() eq "CH"]

Atomization

    doc("myfile.xml")/descendant::country[@code eq "CH"]

Atomization

    <name>Switzerland</name>, <name>France</name>, <name>Germany</name>, <name>Italy</name>, <name>Austria</name>

        data(…)

    "Switzerland", "France", "Germany", "Italy", "Austria"

Kind test

    doc("myfile.xml")/descendant::country[@code eq "CH"]/child::element()

All Kinds

    Simplest                    More precise
    document-node()             document-node(element(countries))
    element()                   element(countries)
                                element(name, xs:string)
                                schema-element(country)
    attribute()                 attribute(*, xs:integer)
                                schema-attribute(code)
    text()
    comment()
    processing-instruction()    processing-instruction(excel)
    namespace-node()

Kind test

    doc("myfile.xml")/descendant::country[@code eq "CH"]/element()
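Atomization and the `element()` kind test can be imitated in Python. In this sketch the helper `data` is a hypothetical stand-in for `data(…)` that only handles untyped elements by concatenating their text; real atomization also takes schema types into account, which is ignored here. The `*` step plays the role of `child::element()`.

```python
import xml.etree.ElementTree as ET

# The countries document from the slides, inlined so the sketch is self-contained.
COUNTRIES_XML = """<countries>
  <country code="CH"><name>Switzerland</name></country>
  <country code="F"><name>France</name></country>
  <country code="D"><name>Germany</name></country>
  <country code="I"><name>Italy</name></country>
  <country code="A"><name>Austria</name></country>
</countries>"""

root = ET.fromstring(COUNTRIES_XML)


def data(elements):
    """Hypothetical stand-in for data(): turn each element into its string value."""
    return ["".join(e.itertext()) for e in elements]


# A sequence of <name> nodes becomes a sequence of atomic values.
name_nodes = root.findall(".//name")
atomized = data(name_nodes)

# Rough equivalent of country[@code eq "CH"]/child::element():
# "*" selects every child element, whatever its name.
swiss = root.find(".//country[@code='CH']")
child_tags = [child.tag for child in swiss.findall("*")]

print(atomized)
print(child_tags)
```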
Simpler

    doc("myfile.xml")/descendant::country[@code eq "CH"]/name

Continued

    doc("myfile.xml")/descendant::country[@code eq "CH"]/name/data()

Construction (XPath, XQuery)

Construction: Strings

    "foo"

Construction: String Escaping

    "This is a line&#x000a;and this is a new line"
    "This is a &quot;quote&quot;"

Construction: Numbers

    42
    3.141592653589793238462650283279
    6.022E23

Construction: Booleans

    true()
    false()

Construction: Other Simple Types

    date("2013-05-01")
    dateTime("2013-06-21T05:00:00Z")
    long("1234567890123")

Construction: XML

    <foo attr="value">
      Text
      <bar/>
      <!-- comment -->
      <?target pi?>
    </foo>

XQuery: Basic Operations

Basic Operations: Sequences

    1, true(), "foo", <bar/>    concatenation
    1 to 100                    range

Basic Operations: Arithmetics

    1 + 1           addition
    42 - 10         subtraction
    <a>6</a> * 7    multiplication
    42.3 div 7.2    division
    42 idiv 9       integer division
    42 mod 9        modulo

Basic Operations: Strings

    "foo" || "bar"                               concatenation
    concat("foo", "bar")
    string-join(("foo", "bar", "foobar"), "")
    substring("foobar", 4, 3)                    substring
    string-length("foobar")                      length

Basic Operations: Value Comparison

    1 + 1 eq 2         equality
    6 * 7 ne 21 * 2    inequality
    234 gt 123         greater than
    234 ge 123         greater than or equal
    42.3 lt 7.2        less than
    42.3 le 7.2        less than or equal

Basic Operations: Value Comparison

    1. Zero or one item for each operand
           (1, 2) eq 2    error
    2. Types must be compatible
           1 eq "foo"     error

Basic Operations: General Comparison

    $doc//country/name = "Switzerland"

    True if at least one pair of items satisfies the comparison.
    Operators: =  !=  <  <=  >  >=

Basic Operations: Logics

    1 + 1 eq 2 and 2 + 2 eq 4                  conjunction
    1 + 1 eq 2 or 2 + 2 eq 4                   disjunction
    not(100 mod 5 eq 0)                        not
    every $i in (1 to 10) satisfies $i gt 0    universal quantifier
    some $i in (1 to 10) satisfies $i eq 33    existential quantifier

Basic Operations: Logics

    1. The logic is two-valued
    2. Non-booleans get converted
           not("foo")

Basic Operations: Effective Boolean Value

    Sequence                      Effective Boolean Value (EBV)
    ""                            false
    "foo"                         true
    0                             false
    42                            true
    (<foo/>, <bar/>, 1, "foo")    true
    ()                            false

Composability

Rule of Thumb
Any expression can be the operand of any other expression.

Precedence (low first)

    Comma
    Data flow (FLWOR, if-then-else, switch...)
    Logic
    Comparison
    String concatenation
    Range
    Arithmetic
    Path expressions
    Filter predicates, dynamic function calls
    Literals, constructors and variables
    Function calls, named function references, inline functions

Use parentheses to override, or when in doubt.

XML Construction reloaded

    <foo attr="{string-length("foobar")}">{
      string-join(
        for $i in 1 to 10
        return string($i),
        ""
      )
    }</foo>

evaluates to

    <foo attr="6">12345678910</foo>

Data Flow (XQuery)

Conditional Expressions

    if (count(doc("file.xml")//country) gt 1000)
    then "Large file"
    else "Small file."
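The difference between value comparison, general comparison and the Effective Boolean Value from the Basic Operations slides can be made concrete with a small Python model. This is only a sketch under simplifying assumptions: sequences are Python lists, nodes are `ElementTree` elements, and "compatible types" is reduced to "both strings or both non-strings". The function names `value_eq`, `general_eq` and `ebv` are made up for the illustration.

```python
import math
import xml.etree.ElementTree as ET
from itertools import product


def value_eq(left, right):
    """Model of 'eq': each operand may hold at most one item."""
    if len(left) > 1 or len(right) > 1:
        raise TypeError("value comparison needs zero or one item per operand")
    if not left or not right:
        return None  # an empty operand yields the empty sequence
    a, b = left[0], right[0]
    if isinstance(a, str) != isinstance(b, str):
        raise TypeError("types must be compatible")
    return a == b


def general_eq(left, right):
    """Model of '=': true if at least one pair of items is equal."""
    return any(a == b for a, b in product(left, right))


def ebv(seq):
    """Model of the Effective Boolean Value of a sequence."""
    if len(seq) == 0:
        return False                      # () is false
    if isinstance(seq[0], ET.Element):
        return True                       # first item is a node
    if len(seq) == 1:
        item = seq[0]
        if isinstance(item, bool):
            return item
        if isinstance(item, str):
            return len(item) > 0          # "" is false, "foo" is true
        if isinstance(item, (int, float)):
            return item != 0 and not math.isnan(item)
    raise TypeError("no effective boolean value for this sequence")


print(general_eq(["Switzerland", "France"], ["Switzerland"]))
print(ebv([ET.Element("foo"), ET.Element("bar"), 1, "foo"]))
```

Calling `value_eq([1, 2], [2])` or `value_eq([1], ["foo"])` raises `TypeError`, matching the two error cases on the Value Comparison slide.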
Switch Expressions

    switch ($country/@code)
      case "CH" return ("gsw", "de", "fr", "it", "rm")
      case "F" return "fr"
      case "D" return "de"
      case "I" return "it"
      default return "en"

Try-Catch Expressions

    try { xs:integer($country/@code) }
    catch * { "A country code is not an integer" }

FLWOR Expressions (XQuery)
(Image: Alexander Raths / 123RF Stock Photo)

Let clauses

    let $x := 2
    return $x * $x

Let clauses

    let $x := 2
    let $y := $x + $x
    let $x := $y + 3
    return $x * $x

For clauses

    for $x in (1, 2, 4)
    return $x * $x

Where clauses

    for $x in 1 to 10
    where $x * 2 gt 5
    return $x * $x

For clauses

    for $x in 1 to 10
    return <square number="{$x}">{ $x * $x }</square>

Order by clauses

    for $x in doc("countries.xml")//country
    order by $x/population
    return $x/name/data()

Group by clauses

    for $x in doc("countries.xml")//country
    group by $continent := $x/continent
    return <continent code="{$continent}">{
      for $country in $x
      return $country/name
    }</continent>

Types

Types

    - Simple types: by name                        xs:integer
    - Complex types: by kind, similar to XPath     element(foo)

Cardinality

    xs:integer    xs:boolean    element(foo)+    document-node()

    Occurrence indicators: ? (zero or one), * (zero or more), + (one or more); no indicator means exactly one.

Type usage

    let $x as xs:integer := 2
    return $x + $x

Type usage

    let $x as node() := <foo>2</foo>
    return $x + $x

Type usage

    for $x as element(foo) in (<foo>2</foo>, <foo>3</foo>)
    return $x + $x

Type usage

    for $x as element(foo) in (<foo>2</foo>, <foo>3</foo>)
    return $x + $x treat as xs:integer

Type usage

    for $x as element(foo) in (<foo>2</foo>, <foo>3</foo>)
    return $x + $x cast as xs:double

Type usage

    for $x as element(foo) in (<foo>2</foo>, <foo>3</foo>)
    return xs:double($x + $x)

Type usage

    3.14 instance of xs:integer

Type usage

    3.14 castable as xs:double

Type usage

    declare function local:isbigdata(
      $threshold as xs:integer,
      $doc as document-node()
    ) as xs:boolean
    {
      count($doc//*) gt $threshold
    };

    local:isbigdata(1000, doc("bigfile.xml"))

Validation

    import schema namespace m = "http://www.w3.org/1998/Math/MathML"
      at "http://www.w3.org/Math/XMLSchema/mathml3/mathml3.xsd";

    validate {
      <m:math xmlns:m="http://www.w3.org/1998/Math/MathML">
        <m:apply>
          <m:eq/>
          <m:ci>x</m:ci>
          <m:apply>
            <m:root/>
            <m:cn>2</m:cn>
          </m:apply>
        </m:apply>
      </m:math>
    }

Querying JSON (XQuery 3.1 / JSONiq)

JSON Navigation: Objects

    JSONiq                      XQuery 3.1

    let $object := {            let $object := map {
      "foo": 31,                  "foo": 31,
      "bar": "ETH"                "bar": "ETH"
    }                           }
    return $object.foo          return $object?foo

JSON Navigation: Arrays

    JSONiq                      XQuery 3.1

    let $object := {            let $object := map {
      "foo": 31,                  "foo": 31,
      "bar": [                    "bar": array {
        "ETH", "EPF"                "ETH", "EPF"
      ]                           }
    }                           }
    return                      return
      $object.bar[[1]]            $object?bar?1

JSON Navigation: Arrays

    JSONiq                      XQuery 3.1

    (same $object as above)     (same $object as above)
    return                      return
      $object.bar[]               $object?bar?*

JSON Construction: Composability

    map {
      "Swiss people":
        doc("switzerland.xml")//country/population/data()
    }

JSON Construction: XQuery 3.1 and arrays

    array { 1 to 3, (), 4 to 6 }    gives    [ 1, 2, 3, 4, 5, 6 ]
    [ 1 to 3, (), 4 to 6 ]          gives    [ (1, 2, 3), (), (4, 5, 6) ]

XQuery 3.1 vs. JSONiq

                           XQuery 3.1                        JSONiq
    Primary use case       data structures (maps, arrays)    document stores (JSON)
    Keys                   Any atomic types                  Strings only
    Values                 Any sequences of items            Single items
    Constructor content    Preserving identity               Copied

Final examples

Example

Titles sorted by price
(the slides add one clause at a time)

    for $b in doc("bib.xml")//book
    order by xs:float($b/price) descending,
             $b/title ascending
    return $b/title

Exercise 1: How many books written by Stevens

    doc("bib.xml")//book

How many books written by Stevens

    doc("bib.xml")//book[author = "Stevens"]

How many books written by Stevens

    count(doc("bib.xml")//book[author = "Stevens"])

    Result: 2

Number of books by author
(built up step by step)

    doc("bib.xml")

    doc("bib.xml")//author

    for $a in doc("bib.xml")//author/data()

    for $a in distinct-values(doc("bib.xml")//author)
Number of books by author

    for $a in distinct-values(doc("bib.xml")//author)
    return <res>
      <name>{ $a }</name>
      <count>{
        count(doc("bib.xml")//book[author = $a])
      }</count>
    </res>

Result:

    <?xml version="1.0" encoding="UTF-8"?>
    <res>
      <name>Stevens</name>
      <count>2</count>
    </res>
    <res>
      <name>Abiteboul</name>
      <count>2</count>
    </res>
    <res>
      <name>Buneman</name>
      <count>1</count>
    </res>
    <res>
      <name>Suciu</name>
      <count>1</count>
    </res>

Not covered

    - Higher-order functions
    - Computed constructors
    - Count clause
    - Window clause
    - Node comparison
    - Errors
    - Typeswitch

There's more

    XQuery Updates
        insert node <foo/> into $doc/bar

    XQuery Full-Text
        /books/book[@number="1"]/title contains text "improve" using stemming

    XQuery Scripting
        while ($x gt 0) { $x := $x - 1; }
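The two final examples (titles sorted by price, and number of books by author) can be replayed in plain Python to see what the FLWOR clauses do. This is a sketch, not the lecture's solution: the lecture's `bib.xml` is not reproduced in the slides, so the sample document below, with its titles, authors and prices, is invented for illustration and its counts are not meant to match the result shown above.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml; titles, authors and prices are invented.
BIB_XML = """<bib>
  <book><title>Gamma</title><author>Abiteboul</author>
        <author>Buneman</author><author>Suciu</author><price>39.95</price></book>
  <book><title>Beta</title><author>Stevens</author><price>65.95</price></book>
  <book><title>Alpha</title><author>Stevens</author><price>65.95</price></book>
</bib>"""

root = ET.fromstring(BIB_XML)
books = root.findall(".//book")

# for $b in //book order by xs:float($b/price) descending, $b/title ascending
# return $b/title
sorted_books = sorted(
    books,
    key=lambda b: (-float(b.findtext("price")), b.findtext("title")),
)
titles = [b.findtext("title") for b in sorted_books]

# distinct-values(//author): keep the first occurrence of each author name.
authors = []
for a in root.iter("author"):
    if a.text not in authors:
        authors.append(a.text)

# count(//book[author = $a]): the general comparison '=' is true as soon as
# at least one <author> child of the book equals $a.
counts = {
    a: sum(1 for b in books if any(x.text == a for x in b.findall("author")))
    for a in authors
}

print(titles)
print(counts)
```

The dictionary comprehension scans all books once per author, which mirrors the nested query on the slide; a single pass with a counter would also work but would hide the correspondence with `distinct-values` and `count`.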