2.1 Noun Phrase Chunking
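We begin with the task of noun phrase chunking, or NP-chunking, in which we search for chunks that correspond to individual noun phrases. The example below applies a one-rule chunk grammar to a part-of-speech-tagged sentence; in the printed result, each NP chunk appears as a parenthesized `(NP ...)` subtree: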

    >>> import nltk
    >>> sentence = [("the", "DT"), ("little", "JJ"), ("yellow", "JJ"),  # [1] tagged example sentence
    ... ("dog", "NN"), ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN")]
    >>> grammar = "NP: {<DT>?<JJ>*<NN>}"  # [2] optional determiner, any adjectives, then a noun
    >>> cp = nltk.RegexpParser(grammar)   # [3] build the chunk parser
    >>> result = cp.parse(sentence)       # [4] chunk the sentence
    >>> print(result)                     # [5] show the resulting tree
    (S
      (NP the/DT little/JJ yellow/JJ dog/NN)
      barked/VBD
      at/IN
      (NP the/DT cat/NN))
    >>> result.draw()                     # [6] open a window with the tree diagram
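To make the tag pattern `<DT>?<JJ>*<NN>` more concrete, here is a minimal sketch that reproduces the same grouping using only Python's standard `re` module. It is not NLTK's implementation: the helper `np_chunk` and its flat output format are assumptions made for illustration. The idea is to write the tag sequence as one string, match the pattern against it, and map each match back to token positions.

```python
import re

def np_chunk(tagged, pattern=r"(?:<DT>)?(?:<JJ>)*<NN>"):
    """Hypothetical helper: group tokens whose tags match `pattern` into NP chunks."""
    # Encode the tag sequence as a single string, e.g. "<DT><JJ><JJ><NN>..."
    tags = "".join(f"<{tag}>" for _, tag in tagged)

    # Map the character offset at which each tag starts to its token index.
    starts, offset = {}, 0
    for i, (_, tag) in enumerate(tagged):
        starts[offset] = i
        offset += len(tag) + 2          # +2 for the surrounding "<" and ">"
    starts[offset] = len(tagged)        # sentinel for a match ending at the last tag

    result, pos = [], 0
    for m in re.finditer(pattern, tags):
        begin, end = starts[m.start()], starts[m.end()]
        result.extend(tagged[pos:begin])            # tokens outside any chunk
        result.append(("NP", tagged[begin:end]))    # one chunk, stored as (label, tokens)
        pos = end
    result.extend(tagged[pos:])                     # trailing tokens after the last chunk
    return result

sentence = [("the", "DT"), ("little", "JJ"), ("yellow", "JJ"), ("dog", "NN"),
            ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN")]
chunks = np_chunk(sentence)
for item in chunks:
    print(item)
```

Because the regular expression is greedy and scans left to right, the optional determiner and all adjectives preceding a noun end up in the same chunk, which is why "the little yellow dog" forms one NP rather than several.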

