



Natural language processing applications ppt

Dr. Benjamin Clark, United States, Teacher
Published Date: 21-07-2017
NATURAL LANGUAGE PROCESSING FOR COMMUNICATION
Sections 23.1 – 23.3 (not covering 23.2.1-2)
Please set your mobile devices to silent.
CS 3243 - NLP for Communication

Last Time
Introduction to Learning
• Supervised Learning - Induction from observations
• Trading model fit for simplicity
• Algorithms
    • KNN
    • Naïve Bayes
    • Decision Trees / Information Gain

Outline
• Formal Grammar
• Parsing: Syntactic Analysis
• Augmented Grammars
• The larger context
    • Communication as Action
    • Semantic Interpretation
    • Ambiguity and Disambiguation
    • Discourse Understanding

Language
• Formal language: A (possibly infinite) set of strings
• Grammar: A finite set of rules that specifies a language
• Rewrite rules
    • Non-terminal symbols: not observed (S, NP, etc.)
    • Terminal symbols: observed (“he”)
    • S → NP VP
    • NP → Pronoun
    • Pronoun → “he”
Convention: Uppercase for non-terminals, lowercase for terminals.

Generative Capacity
Noam Chomsky described four grammatical formalisms:
• Recursively enumerable grammars
    • Unrestricted rules: both sides of the rewrite rules can have any number of terminal and non-terminal symbols; full Turing machines
    • Example: ABd → CaE
• Context-sensitive grammars
    • The RHS must contain at least as many symbols as the LHS
    • Example: ASB → AXB
• Context-free grammars (CFG)
    • LHS is a single non-terminal symbol
    • Example: S → XYa
• Regular grammars
    • LHS is a single non-terminal; RHS is a terminal plus an optional non-terminal
    • Examples: X → a, X → aY

Formal Grammar
• The lexicon for ε₀:
    Noun → stench | breeze | glitter | wumpus | pit | pits | gold | …
    Verb → is | see | smell | shoot | stinks | go | grab | turn | …
    Adjective → right | left | east | dead | back | smelly | …
    Adverb → here | there | nearby | ahead | right | left | east | …
    Pronoun → me | you | I | it | …
    Name → John | Mary | Boston | Aristotle | …
    Article → the | a | an | …
    Preposition → to | in | on | near | …
    Conjunction → and | or | but | …
    Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
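The four rule shapes listed under Generative Capacity can be checked mechanically. The sketch below is only an illustration, assuming each symbol is a single character (uppercase for non-terminals, lowercase for terminals, as in the slide's convention). The function name `classify_rule` and the rule AB → C are hypothetical and not from the slides. Because the classes are nested, the function reports the most restrictive class whose stated criterion the rule meets.

```python
def classify_rule(lhs: str, rhs: str) -> str:
    """Return the most restrictive class whose rule shape (lhs -> rhs) matches.

    Assumption for this sketch: every symbol is one character,
    uppercase = non-terminal, lowercase = terminal.
    """
    single_nonterminal_lhs = len(lhs) == 1 and lhs.isupper()
    if single_nonterminal_lhs:
        # Regular: RHS is a terminal, optionally followed by one non-terminal.
        is_terminal = len(rhs) == 1 and rhs.islower()
        is_terminal_then_nonterminal = (
            len(rhs) == 2 and rhs[0].islower() and rhs[1].isupper()
        )
        if is_terminal or is_terminal_then_nonterminal:
            return "regular"
        # Context-free: any RHS is allowed once the LHS is one non-terminal.
        return "context-free"
    # Context-sensitive: RHS has at least as many symbols as the LHS.
    if len(rhs) >= len(lhs):
        return "context-sensitive"
    # Anything else is an unrestricted rule.
    return "recursively enumerable"


if __name__ == "__main__":
    # The first four rules come from the slide; AB -> C is a made-up example.
    for lhs, rhs in [("X", "a"), ("X", "aY"), ("S", "XYa"), ("ASB", "AXB"), ("AB", "C")]:
        print(f"{lhs} -> {rhs}: {classify_rule(lhs, rhs)}")
```

Note that the slide's unrestricted example ABd → CaE also satisfies the length criterion, so this sketch reports it as context-sensitive; every context-sensitive rule is also a valid unrestricted rule.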
Formal Grammar
• The grammar for ε₀ (rule, then an example):
    S → NP VP                 I + feel a breeze
      | S Conjunction S       I feel a breeze + and + I smell a wumpus
    NP → Pronoun              I
      | Name                  John
      | Noun                  pits
      | Article Noun          the + wumpus
      | Digit Digit           3 4
      | NP PP                 the wumpus + to the east
      | NP RelClause          the wumpus + that is smelly
Note: We’ll deal with probabilistic rules later.

Formal Grammar
• Parts of speech
    • Open class: noun, verb, adjective, adverb
    • Closed class: pronoun, article, preposition, conjunction, …
• Shortcomings of our grammar
    • Overgenerate: “Me go Boston”
    • Undergenerate: “I think the wumpus is smelly”

Parse Tree
    S
      NP
        Article: the
        Noun: wumpus
      VP
        VP
          Verb: is
        Adjective: dead

Syntactic Analysis (Parsing)
• Parsing: The process of finding a parse tree for a given input string
• Top-down parsing
    • Start with the S symbol and search for a tree that has the words as its leaves
• Bottom-up parsing
    • Start with the words and search for a tree with root S

Trace of Bottom-up Parsing
    List of nodes              Subsequence      Rule
    the wumpus is dead         the              Article → the
    Article wumpus is dead     wumpus           Noun → wumpus
    Article Noun is dead       Article Noun     NP → Article Noun
    NP is dead                 is               Verb → is
    NP Verb dead               dead             Adjective → dead
    NP Verb Adjective          Verb             VP → Verb
    NP VP Adjective            VP Adjective     VP → VP Adjective
    NP VP                      NP VP            S → NP VP
    S
• Left-to-right processing, one token at a time

Intrasentence ambiguity
(Blank spaces to fill in on this slide)
    “Have the students of CS 3243 Artificial Intelligence take the exam”
    “… taken the exam”
(“Have” is a command in the first sentence and an auxiliary in the second; a left-to-right parser cannot tell which until it reaches “take” or “taken”.)

Probabilistic Grammars
• Probabilistic lexicon
    Noun → stench .05 | breeze .1 | wumpus .15 | pits .05 | …
    Verb → is .1 | feel .1 | stinks .05 | …
    …
    Digit → 0 .1 | 1 .1 | 2 .1 | …
• Probabilistic grammar (rule, probability, example)
    VP → Verb            .4     stinks
      | VP NP            .35    feel + a breeze
      | VP Adjective     .05    is + smelly
      | VP PP            .1     turn + to the east
      | VP Adverb        .1     go + ahead
The probabilities of all rules for a given category (e.g., VP) sum to one.

CYK Chart Parsing
• Uses dynamic programming to memoize intermediate results, saving them to a chart.
• Bottom-up, iterative processing
• Converts the context-free grammar into a special form: Chomsky Normal Form
    • X → “a”
    • X → Y Z
• Uses space O(n²m) ≅ O(n³), despite O(2ⁿ) possible parses (n = number of words, m = number of non-terminal symbols).
• Suitable for probabilistic CFGs (PCFGs).

CYK Parsing
(figure not preserved)

An Ambiguous Example
(Blank spaces to fill in on this slide)
Grammar:
    S → NP VP
    S → NP VP PP
    PP → P NP
    NP → A NP
    NP → NP PP
    NP → N
    VP → V PP
    VP → V NP
    VP → V
Chart, from the full sentence (top row) down to the single words (bottom rows):
    S,S,S
    S,VP
    NP,VP
    S        PP
    S,NP     S,VP     NP
    N,A      N,V      N,V      P      A      N
    British  left     waffles  on     Falkland  Islands
See whether you understand what each of the “S”s stands for.

Dealing with Probabilities
(Blank spaces to fill in on this slide)
Grammar:
    S → NP VP       .3
    S → NP VP PP    .7
    PP → P NP       1
    NP → A NP       .4
    NP → NP PP      .2
    NP → N          .4
    VP → V PP       .4
    VP → V NP       .3
    VP → V          .4
Chart, from the full sentence (top row) down to the single words (bottom rows); “?” marks a blank left to fill in:
    S 3.6864E-4, S 4.608E-4, S 5.37E-3
    S .00576, VP .00384
    VP .048, NP .0128     ?
    S .0144      PP .16
    S .048, NP .16     S .048, VP .12     NP .16     ?
    N,A      N,V      N,V      P      A      N
    British  left     waffles  on     Falkland  Islands
We ignored the probabilistic lexicon for this example; it should also be used.

Subjective & Objective Cases
• Overgeneration:
    • S → NP VP → NP VP NP → NP Verb NP
    • Pronoun Verb NP → Pronoun Verb Pronoun
• Examples (* marks an ungrammatical sentence that the grammar still generates):
    She loves him            * her loves he
    She ran towards him      * She ran towards he

Handling Subjective & Objective Cases
    S → NP_s VP | …
    NP_s → Pronoun_s | Name | Noun | …
    NP_o → Pronoun_o | Name | Noun | …
    VP → VP NP_o | …
    PP → Preposition NP_o
    Pronoun_s → I | you | he | she | it | …
    Pronoun_o → me | you | him | her | it | …
• Disadvantage: Grammar size grows exponentially with the number of distinctions (case, agreement, etc.) that must be encoded this way

Augmented Grammars
• Handling case, agreement, etc.
• Augment grammar rules to allow parameters on non-terminal categories
    • NP(Subjective)
    • NP(Objective)
    • NP(case)
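One way to picture the NP(case) parameter is as an argument passed to a single NP check, instead of two separate NP categories. The following is a minimal sketch under stated assumptions, not the course's implementation: the pronoun lists come from the slides, while the verb and preposition sets, the simplified VP rule, and all function names are hypothetical and chosen only to cover the four example sentences.

```python
# Pronoun(case) entries, taken from the slide's subjective/objective lists.
PRONOUNS = {
    "subjective": {"i", "you", "he", "she", "it"},
    "objective": {"me", "you", "him", "her", "it"},
}
VERBS = {"loves", "ran"}        # assumed, from the example sentences
PREPOSITIONS = {"towards"}      # assumed, from the example sentences


def is_np(word: str, case: str) -> bool:
    """NP(case) -> Pronoun(case): the case parameter selects the pronoun set."""
    return word in PRONOUNS[case]


def is_pp(words: list[str]) -> bool:
    """PP -> Preposition NP(Objective)."""
    return (
        len(words) == 2
        and words[0] in PREPOSITIONS
        and is_np(words[1], "objective")
    )


def is_vp(words: list[str]) -> bool:
    """VP -> Verb NP(Objective) | Verb PP (a simplification for this sketch)."""
    if not words or words[0] not in VERBS:
        return False
    rest = words[1:]
    return (len(rest) == 1 and is_np(rest[0], "objective")) or is_pp(rest)


def is_sentence(text: str) -> bool:
    """S -> NP(Subjective) VP."""
    words = text.lower().split()
    return len(words) >= 2 and is_np(words[0], "subjective") and is_vp(words[1:])


if __name__ == "__main__":
    for s in ["She loves him", "her loves he",
              "She ran towards him", "She ran towards he"]:
        print(s, "->", is_sentence(s))
```

With the case passed as a parameter, the two ungrammatical examples are rejected without duplicating the NP rules for each case.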
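Returning to the CYK Chart Parsing slide, the chart-filling idea can be sketched as a small recognizer. This is an illustrative sketch only, using the rules from the bottom-up trace of “the wumpus is dead”. One assumption is made to reach Chomsky Normal Form: the unit rule VP → Verb is folded into the lexicon, so “is” is listed as both Verb and VP. The names `LEXICAL`, `BINARY`, `cyk_chart` and `recognizes` are hypothetical.

```python
# Lexical rules X -> "a". Assumption: "is" is also tagged VP, which replaces
# the unit rule VP -> Verb so that the grammar is in Chomsky Normal Form.
LEXICAL = {
    "the": {"Article"},
    "wumpus": {"Noun"},
    "is": {"Verb", "VP"},
    "dead": {"Adjective"},
}
# Binary rules X -> Y Z, stored as (X, Y, Z).
BINARY = [
    ("S", "NP", "VP"),
    ("NP", "Article", "Noun"),
    ("VP", "VP", "Adjective"),
]


def cyk_chart(words: list[str]) -> dict[tuple[int, int], set[str]]:
    """Fill the chart bottom-up; chart[(start, length)] holds the categories
    that can derive words[start:start + length]."""
    n = len(words)
    chart: dict[tuple[int, int], set[str]] = {}
    for i, word in enumerate(words):
        chart[(i, 1)] = set(LEXICAL.get(word, ()))
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            cell: set[str] = set()
            # Try every way to split the span into a left and a right part.
            for split in range(1, length):
                left = chart[(start, split)]
                right = chart[(start + split, length - split)]
                for head, y, z in BINARY:
                    if y in left and z in right:
                        cell.add(head)
            chart[(start, length)] = cell
    return chart


def recognizes(sentence: str) -> bool:
    """True if the whole sentence can be derived from S."""
    words = sentence.split()
    if not words:
        return False
    return "S" in cyk_chart(words)[(0, len(words))]


if __name__ == "__main__":
    print(recognizes("the wumpus is dead"))
    print(recognizes("wumpus the is dead"))
```

Each chart cell is computed once and reused by every longer span that contains it, which is the memoization the slide refers to. A probabilistic version would store the best probability per category in each cell instead of a plain set.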