Linguistic Reference

All transcript data in TCSE is annotated using spaCy 3.8 (en_core_web_lg). This page documents the linguistic annotation scheme used throughout TCSE.

1. Universal POS Tags

Coarse-grained part-of-speech categories based on Universal Dependencies. Used in Advanced Search with {pos} notation (e.g., help{verb}, [be]{aux}).

| Tag | Part of Speech | Search | Type | Examples |
| --- | --- | --- | --- | --- |
| ADJ | Adjective | {adj} or {a} | Open | big, old, green, first, incomprehensible |
| ADV | Adverb | {adv} or {r} | Open | very, well, exactly, tomorrow, here |
| INTJ | Interjection | {intj} | Open | hello, ouch, bravo, psst |
| NOUN | Noun | {noun} or {n} | Open | people, time, world, way, thing |
| PROPN | Proper noun | {propn} | Open | Mary, John, London, NATO |
| VERB | Verb | {verb} or {v} | Open | run, eat, go, think, help |
| ADP | Adposition | {adp} | Closed | in, to, during, of, with |
| AUX | Auxiliary | {aux} | Closed | has, is, will, should, must |
| CCONJ | Coordinating conjunction | {cconj} | Closed | and, or, but |
| DET | Determiner | {det} | Closed | the, a, this, which, all, no |
| NUM | Numeral | {num} | Closed | one, two, three, seventy-seven |
| PART | Particle | {part} | Closed | 's, not, to |
| PRON | Pronoun | {pron} | Closed | I, it, you, we, they |
| SCONJ | Subordinating conjunction | {sconj} | Closed | that, if, while, because |
| PUNCT | Punctuation | {punct} | Other | . , ; : ! ? ( ) |
| SYM | Symbol | {sym} | Other | $, %, §, © |
| X | Other | {x} | Other | foreign words, typos, fragments |
| SPACE | Space | {space} | Other | whitespace tokens |
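
The short aliases in the Search column ({a}, {r}, {n}, {v}) can be read as a lookup onto the full tag names. The following is a minimal sketch of that lookup, not TCSE's actual implementation: the function name `resolve_pos`, the case-insensitive handling, and the error behaviour are all assumptions made for illustration.

```python
# Hypothetical alias table, copied from the Search column of the table above.
UPOS_ALIASES = {"a": "ADJ", "r": "ADV", "n": "NOUN", "v": "VERB"}

# The 18 coarse-grained tags listed in the table above.
UPOS_TAGS = frozenset({
    "ADJ", "ADV", "INTJ", "NOUN", "PROPN", "VERB",
    "ADP", "AUX", "CCONJ", "DET", "NUM", "PART", "PRON", "SCONJ",
    "PUNCT", "SYM", "X", "SPACE",
})


def resolve_pos(spec: str) -> str:
    """Map a {pos} filter such as "{v}" or "{verb}" to its Universal POS tag."""
    # Assumption: filters are matched case-insensitively.
    name = spec.strip().strip("{}").lower()
    if name in UPOS_ALIASES:
        return UPOS_ALIASES[name]
    tag = name.upper()
    if tag not in UPOS_TAGS:
        raise ValueError(f"unknown POS filter: {spec!r}")
    return tag


if __name__ == "__main__":
    for spec in ("{v}", "{verb}", "{propn}"):
        print(spec, "->", resolve_pos(spec))
```

Under these assumptions, help{v} and help{verb} would select the same tokens.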

2. Fine-Grained Tags (Penn Treebank)

Detailed POS tags following the Penn Treebank tagset. Searchable with {@tag} notation (e.g., {@vbg} for gerunds/present participles).

Nouns

| Tag | Description | Examples |
| --- | --- | --- |
| NN | Noun, singular or mass | dog, music, education |
| NNS | Noun, plural | dogs, children, ideas |
| NNP | Proper noun, singular | London, Obama, Google |
| NNPS | Proper noun, plural | Americans, Alps |

Verbs

| Tag | Description | Examples |
| --- | --- | --- |
| VB | Verb, base form | go, eat, run |
| VBD | Verb, past tense | went, ate, ran |
| VBG | Verb, gerund / present participle | going, eating, running |
| VBN | Verb, past participle | gone, eaten, taken |
| VBP | Verb, non-3rd person singular present | go, eat, run |
| VBZ | Verb, 3rd person singular present | goes, eats, runs |
| MD | Modal | can, could, may, might, will, would, shall, should, must |

Adjectives

| Tag | Description | Examples |
| --- | --- | --- |
| JJ | Adjective | big, green, incomprehensible |
| JJR | Adjective, comparative | bigger, better, faster |
| JJS | Adjective, superlative | biggest, best, fastest |
| AFX | Affix (hyphenated adjective part) | "anti" in "anti-war" |

Adverbs

| Tag | Description | Examples |
| --- | --- | --- |
| RB | Adverb | very, well, quickly, never |
| RBR | Adverb, comparative | faster, better, more |
| RBS | Adverb, superlative | fastest, best, most |
| WRB | Wh-adverb | where, when, why, how |

Pronouns & Determiners

| Tag | Description | Examples |
| --- | --- | --- |
| PRP | Personal pronoun | I, you, he, she, it, we, they |
| PRP$ | Possessive pronoun | my, your, his, her, its, our, their |
| WP | Wh-pronoun | who, what, whom |
| WP$ | Possessive wh-pronoun | whose |
| WDT | Wh-determiner | which, that, what |
| DT | Determiner | the, a, an, this, that, these |
| PDT | Predeterminer | all, both, half |
| EX | Existential there | there (is/are) |

Other

| Tag | Description | Examples |
| --- | --- | --- |
| IN | Preposition or subordinating conjunction | in, of, by, that, if, because |
| CC | Coordinating conjunction | and, or, but, nor, yet |
| CD | Cardinal number | one, 2, 1,000 |
| TO | Infinitival to | to (go) |
| POS | Possessive ending | 's, ' |
| RP | Particle | up, off, out, in (phrasal verbs) |
| UH | Interjection | oh, well, um, uh |
| FW | Foreign word | de, la, von |
| LS | List item marker | 1., 2., a., b. |
| SYM | Symbol | $, %, +, = |
| HYPH | Hyphen | - |
| NFP | Superfluous punctuation | ..., -- |
| XX | Unknown | unanalyzable tokens |
| ADD | Email or URL | user@example.com, http://... |
| _SP | Space | whitespace |

3. Dependency Labels

Syntactic dependency relations showing how words relate to each other in a sentence. Searchable with {@label} notation (e.g., {@nsubj} for nominal subjects).

Note: The {@...} notation searches both fine-grained tags and dependency labels simultaneously.

Core Arguments

| Label | Description | Example |
| --- | --- | --- |
| nsubj | Nominal subject | She runs. |
| nsubjpass | Nominal subject (passive) | It was built. |
| dobj | Direct object | I see you. |
| dative | Dative (indirect object) | Give me a book. |
| attr | Attribute | She is a teacher. |
| agent | Agent (passive by-phrase) | Built by engineers. |
| expl | Expletive | There is a problem. |

Clausal Arguments

| Label | Description | Example |
| --- | --- | --- |
| csubj | Clausal subject | What she said is true. |
| csubjpass | Clausal subject (passive) | That he lied was suspected by everyone. |
| ccomp | Clausal complement | I think he left. |
| xcomp | Open clausal complement | I want to go. |
| acomp | Adjectival complement | She looks happy. |
| oprd | Object predicate | I consider him smart. |

Modifiers

| Label | Description | Example |
| --- | --- | --- |
| amod | Adjectival modifier | a big house |
| advmod | Adverbial modifier | run quickly |
| nummod | Numeric modifier | three cats |
| nmod | Nominal modifier | a cup of coffee |
| npadvmod | Noun phrase as adverbial modifier | yesterday, this way |
| quantmod | Quantifier modifier | about 200 |
| appos | Appositional modifier | Sam, my brother |
| acl | Adjectival clause (non-relative) | the man sitting there |
| relcl | Relative clause modifier | the book I read |
| advcl | Adverbial clause modifier | If it rains, I stay. |
| neg | Negation modifier | I do not agree. |
| det | Determiner | the book |
| predet | Predeterminer | all the people |
| poss | Possession modifier | my book |

Prepositional & Case

| Label | Description | Example |
| --- | --- | --- |
| prep | Prepositional modifier | go to school |
| pobj | Object of preposition | in the house |
| pcomp | Complement of preposition | instead of going |
| case | Case marking | John 's book |

Coordination & Connectors

| Label | Description | Example |
| --- | --- | --- |
| conj | Conjunct | cats and dogs |
| cc | Coordinating conjunction | cats and dogs |
| preconj | Pre-correlative conjunction | either A or B |
| mark | Marker (subordinating conjunction) | because it rained |

Verbal

| Label | Description | Example |
| --- | --- | --- |
| aux | Auxiliary | I have eaten. |
| auxpass | Passive auxiliary | It was built. |
| compound | Compound | New York, ice cream |
| prt | Particle (phrasal verb) | give up, turn off |

Other

| Label | Description | Example |
| --- | --- | --- |
| ROOT | Root of the sentence | She runs fast. |
| punct | Punctuation | Hello, world. |
| parataxis | Parataxis (loosely joined clause) | He said: I agree. |
| intj | Interjection | Well, I think... |
| dep | Unclassified dependent | (catch-all) |
| meta | Meta modifier | structural markup |
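
The note at the top of this section says that {@...} checks fine-grained tags and dependency labels at the same time. One way to picture that behaviour is the sketch below. It is an illustration only: the `Token` class, the `matches_at` function, and the case-insensitive comparison are hypothetical and do not describe how TCSE stores or queries its data.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical token record holding the two annotation layers."""
    text: str
    tag: str  # fine-grained tag, e.g. "VBZ"
    dep: str  # dependency label, e.g. "nsubj"


def matches_at(token: Token, label: str) -> bool:
    """Return True if the label equals the token's tag or its dependency label."""
    # Assumption: comparison ignores case, so {@vbz} matches the tag "VBZ".
    wanted = label.strip().strip("{}").lstrip("@").lower()
    return wanted in (token.tag.lower(), token.dep.lower())


if __name__ == "__main__":
    # "She runs." annotated by hand for illustration.
    she = Token("She", tag="PRP", dep="nsubj")
    runs = Token("runs", tag="VBZ", dep="ROOT")
    print(matches_at(she, "{@nsubj}"))   # matched through the dependency label
    print(matches_at(runs, "{@vbz}"))    # matched through the fine-grained tag
    print(matches_at(runs, "{@nsubj}"))  # neither layer matches
```

A practical consequence of a combined lookup is that a name present in both layers, such as CC (tag) and cc (dependency), would match through either one.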

4. Morphological Features

Grammatical properties of individual tokens. Searchable with {#feature} notation using partial matching (e.g., {#past} matches any token containing "Past" in its morphological annotation).

| Feature | Values | Search example | Description |
| --- | --- | --- | --- |
| VerbForm | Fin, Ger, Inf, Part | {#ger}, {#inf} | Finite, gerund, infinitive, participle |
| Tense | Past, Pres | {#past}, {#pres} | Past or present tense |
| Aspect | Perf, Prog | {#perf}, {#prog} | Perfect or progressive aspect |
| Mood | Ind | {#ind} | Indicative mood |
| VerbType | Mod | {#mod} | Modal verb |
| Voice | Pass | (none) | Not annotated by en_core_web_lg; use dep labels {@auxpass} / {@nsubjpass} instead |
| Number | Sing, Plur | {#sing}, {#plur} | Singular or plural |
| Person | 1, 2, 3 | {#person: 3} | 1st, 2nd, or 3rd person |
| Case | Acc, Nom | {#nom}, {#acc} | Nominative or accusative case |
| Gender | Fem, Masc, Neut | {#fem}, {#masc} | Grammatical gender (pronouns) |
| Degree | Pos, Cmp, Sup | {#cmp}, {#sup} | Positive, comparative, superlative |
| Definite | Def, Ind | {#def}, {#ind} | Definite or indefinite article |
| PronType | Art, Dem, Ind, Prs, Rel | {#dem}, {#rel} | Article, demonstrative, indefinite, personal, relative |
| Poss | Yes | {#poss} | Possessive |
| Reflex | Yes | {#reflex} | Reflexive pronoun |
| Polarity | Neg | {#neg} | Negative polarity (not) |
| NumType | Card, Mult, Ord | {#ord} | Cardinal, multiplicative, ordinal |
| Foreign | Yes | {#foreign} | Foreign word |
| ConjType | Cmp | {#conjtype} | Comparative conjunction (than) |
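
The partial matching described at the start of this section can be approximated as a case-insensitive substring test against the token's morphological string. The sketch below assumes that string uses spaCy's `Feature=Value|Feature=Value` layout; the function `matches_morph` and the sample strings are illustrative, and TCSE's real matcher may be stricter (for example, matching values only).

```python
def matches_morph(morph: str, query: str) -> bool:
    """Case-insensitive substring match of a {#feature} query against a morph string."""
    needle = query.strip().strip("{}").lstrip("#").strip().lower()
    if not needle:
        raise ValueError(f"empty morphological query: {query!r}")
    return needle in morph.lower()


if __name__ == "__main__":
    # Hand-written morph strings in spaCy's Feature=Value|... layout.
    went = "Tense=Past|VerbForm=Fin"
    goes = "Number=Sing|Person=3|Tense=Pres|VerbForm=Fin"
    a = "Definite=Ind|PronType=Art"
    is_ = "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin"

    print(matches_morph(went, "{#past}"))  # "past" occurs in Tense=Past
    print(matches_morph(goes, "{#past}"))  # no "past" anywhere
    # Under plain substring matching, {#ind} cannot tell Mood=Ind from Definite=Ind.
    print(matches_morph(a, "{#ind}"), matches_morph(is_, "{#ind}"))
```

If matching really is a plain substring test, short values shared by several features (Ind appears under Mood, Definite, and PronType) will return tokens from all of them, so combining {#ind} with a POS filter is one way to narrow the results.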

5. Named Entity Types

Named entities recognized by spaCy's NER model. Searchable with %TYPE notation (e.g., %PERSON, say %ORG).

| Type | Description | Examples |
| --- | --- | --- |
| PERSON | People, including fictional | Obama, Einstein, Hamlet |
| ORG | Organizations, companies, agencies | Google, the UN, NASA |
| GPE | Countries, cities, states | France, New York, California |
| LOC | Non-GPE locations: mountains, bodies of water | the Alps, the Pacific, Antarctica |
| FAC | Facilities: buildings, airports, highways | the Eiffel Tower, JFK Airport |
| NORP | Nationalities, religious/political groups | American, Buddhist, Republican |
| DATE | Absolute or relative dates/periods | tomorrow, last year, 2024 |
| TIME | Times shorter than a day | 3 o'clock, this morning |
| MONEY | Monetary values | $5, 2 million euros |
| PERCENT | Percentages | fifty percent, 10% |
| QUANTITY | Measurements: weight, distance, etc. | 5 kg, 2 miles, 100 degrees |
| CARDINAL | Numerals not covered by other types | five, hundred, 42 |
| ORDINAL | Ordinal numbers | first, 42nd, third |
| EVENT | Named events | World War II, the Olympics |
| WORK_OF_ART | Titles of works | Hamlet, Mona Lisa |
| PRODUCT | Products, vehicles, foods | iPhone, Boeing 747 |
| LAW | Named documents of law | the Magna Carta, GDPR |
| LANGUAGE | Named languages | English, Mandarin, Arabic |

6. Advanced Search Syntax Summary

Quick reference for all search notation available in Advanced Search mode.

| Syntax | Description | Example |
| --- | --- | --- |
| word | Surface form search | help |
| [lemma] | Lemma (base form) search | [be] → am, is, are, was, were, been, being |
| ['s] | Literal surface match | ['s] → 's (possessive/contraction) |
| word{pos} | Surface + POS filter | help{verb}, help{noun} |
| [lemma]{pos} | Lemma + POS filter | [help]{noun} |
| {pos} | POS-only search | {propn} (any proper noun) |
| {@tag_or_dep} | Fine-grained tag or dependency | {@vbg} (gerund), {@nsubj} (subject) |
| {#morph} | Morphological feature | {#past} (past tense), {#plur} (plural) |
| {-pos} | Negative POS filter | {-verb} (not a verb) |
| -word | Negative word match | -the |
| a\|b | OR alternatives | help\|assist |
| +prefix | Prefix match | +un → un..., under..., until... |
| {} or * | Wildcard (one word) | [make] {} {noun} |
| _ | Noun chunk | [give] _ _ |
| ^ | Start of segment | ^ however |
| %TYPE | Named entity search | %PERSON, say %ORG |
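
To make the notation above concrete, the sketch below classifies a single query item into its parts. It is a reading aid under stated assumptions, not the TCSE parser: the `classify` function, the `ITEM` pattern, and the output keys are hypothetical, every bracketed head is treated as a lemma (so the literal ['s] case is not modelled), and items are assumed to be separated by whitespace.

```python
import re

# A head is "[lemma]" or a bare word (possibly empty), optionally followed by "{filter}".
ITEM = re.compile(r"^(?P<head>\[[^\]]+\]|[^{}\[\]]*)(?:\{(?P<filter>[^{}]+)\})?$")

# Items that stand alone and carry no word or filter.
SPECIAL = {"{}": "wildcard", "*": "wildcard", "_": "noun_chunk", "^": "segment_start"}


def classify(item: str) -> dict:
    """Split one whitespace-delimited query item into labelled parts."""
    if item in SPECIAL:
        return {"kind": SPECIAL[item]}
    if item.startswith("%"):
        return {"kind": "entity", "type": item[1:]}
    m = ITEM.match(item)
    if m is None or item == "":
        raise ValueError(f"unrecognised query item: {item!r}")
    head, flt = m.group("head"), m.group("filter")
    out = {"kind": "token"}
    if head.startswith("["):
        out["lemma"] = head[1:-1]
    elif head.startswith("+"):
        out["prefix"] = head[1:]
    elif head.startswith("-"):
        out["not_word"] = head[1:]
    elif "|" in head:
        out["any_of"] = head.split("|")
    elif head:
        out["surface"] = head
    if flt is not None:
        if flt.startswith("@"):
            out["tag_or_dep"] = flt[1:]
        elif flt.startswith("#"):
            out["morph"] = flt[1:]
        elif flt.startswith("-"):
            out["not_pos"] = flt[1:]
        else:
            out["pos"] = flt
    return out


if __name__ == "__main__":
    for item in "[make] {} {noun}".split():
        print(item, "->", classify(item))
```

Splitting on whitespace keeps the sketch short, but it would break a filter that contains a space, such as {#person: 3}; a fuller parser would need to tokenise braces first.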