All transcript data in TCSE is annotated using spaCy 3.8 (en_core_web_lg). This page documents the linguistic annotation scheme used throughout TCSE.
Coarse-grained part-of-speech categories based on Universal Dependencies. Used in Advanced Search with {pos} notation (e.g., help{verb}, [be]{aux}).
| Tag | Part of Speech | Search | Type | Examples |
|---|---|---|---|---|
ADJ | Adjective | {adj} or {a} | Open | big, old, green, first, incomprehensible |
ADV | Adverb | {adv} or {r} | Open | very, well, exactly, tomorrow, here |
INTJ | Interjection | {intj} | Open | hello, ouch, bravo, psst |
NOUN | Noun | {noun} or {n} | Open | people, time, world, way, thing |
PROPN | Proper noun | {propn} | Open | Mary, John, London, NATO |
VERB | Verb | {verb} or {v} | Open | run, eat, go, think, help |
ADP | Adposition | {adp} | Closed | in, to, during, of, with |
AUX | Auxiliary | {aux} | Closed | has, is, will, should, must |
CCONJ | Coordinating conjunction | {cconj} | Closed | and, or, but |
DET | Determiner | {det} | Closed | the, a, this, which, all, no |
NUM | Numeral | {num} | Closed | one, two, three, seventy-seven |
PART | Particle | {part} | Closed | 's, not, to |
PRON | Pronoun | {pron} | Closed | I, it, you, we, they |
SCONJ | Subordinating conjunction | {sconj} | Closed | that, if, while, because |
PUNCT | Punctuation | {punct} | Other | . , ; : ! ? ( ) |
SYM | Symbol | {sym} | Other | $, %, §, © |
X | Other | {x} | Other | foreign words, typos, fragments |
SPACE | Space | {space} | Other | whitespace tokens |
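
The Search column above can be read as a lookup from a lowercase key, plus four one-letter aliases, to a UD tag. A minimal sketch of that lookup follows. The names `UD_TAGS`, `SHORT_ALIASES`, `POS_ALIASES`, and `resolve_pos` are hypothetical, and case-insensitive keys are an assumption; this is not TCSE's implementation.

```python
# Hypothetical lookup: maps the {pos} search keys from the table above to UD tags.
UD_TAGS = [
    "ADJ", "ADV", "INTJ", "NOUN", "PROPN", "VERB",                     # open class
    "ADP", "AUX", "CCONJ", "DET", "NUM", "PART", "PRON", "SCONJ",      # closed class
    "PUNCT", "SYM", "X", "SPACE",                                      # other
]

# One-letter aliases listed in the Search column.
SHORT_ALIASES = {"a": "ADJ", "r": "ADV", "n": "NOUN", "v": "VERB"}

# Full keys are the lowercase tag names; aliases are added on top.
POS_ALIASES = {tag.lower(): tag for tag in UD_TAGS}
POS_ALIASES.update(SHORT_ALIASES)


def resolve_pos(key):
    """Return the UD tag for a search key such as 'verb' or 'v'.

    Assumption: keys are compared case-insensitively.
    """
    try:
        return POS_ALIASES[key.lower()]
    except KeyError:
        raise ValueError(f"unknown POS key: {key!r}") from None
```

With this table, `resolve_pos("v")` and `resolve_pos("verb")` both resolve to `VERB`, mirroring the "{verb} or {v}" entry.
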
Detailed POS tags following the Penn Treebank tagset, plus the additional tags output by en_core_web_lg (e.g., AFX, HYPH, NFP, ADD, XX, _SP). Searchable with {@tag} notation (e.g., {@vbg} for gerunds/present participles).
| Tag | Description | Examples |
|---|---|---|
| Nouns | ||
NN | Noun, singular or mass | dog, music, education |
NNS | Noun, plural | dogs, children, ideas |
NNP | Proper noun, singular | London, Obama, Google |
NNPS | Proper noun, plural | Americans, Alps |
| Verbs | ||
VB | Verb, base form | go, eat, run |
VBD | Verb, past tense | went, ate, ran |
VBG | Verb, gerund / present participle | going, eating, running |
VBN | Verb, past participle | gone, eaten, taken |
VBP | Verb, non-3rd person singular present | go, eat, run |
VBZ | Verb, 3rd person singular present | goes, eats, runs |
MD | Modal | can, could, may, might, will, would, shall, should, must |
| Adjectives | ||
JJ | Adjective | big, green, incomprehensible |
JJR | Adjective, comparative | bigger, better, faster |
JJS | Adjective, superlative | biggest, best, fastest |
AFX | Affix (hyphenated adjective part) | e.g., "anti" in "anti-war" |
| Adverbs | ||
RB | Adverb | very, well, quickly, never |
RBR | Adverb, comparative | faster, better, more |
RBS | Adverb, superlative | fastest, best, most |
WRB | Wh-adverb | where, when, why, how |
| Pronouns & Determiners | ||
PRP | Personal pronoun | I, you, he, she, it, we, they |
PRP$ | Possessive pronoun | my, your, his, her, its, our, their |
WP | Wh-pronoun | who, what, whom |
WP$ | Possessive wh-pronoun | whose |
WDT | Wh-determiner | which, that, what |
DT | Determiner | the, a, an, this, that, these |
PDT | Predeterminer | all, both, half |
EX | Existential there | there (is/are) |
| Other | ||
IN | Preposition or subordinating conjunction | in, of, by, that, if, because |
CC | Coordinating conjunction | and, or, but, nor, yet |
CD | Cardinal number | one, 2, 1,000 |
TO | Infinitival to | to (go) |
POS | Possessive ending | 's, ' |
RP | Particle | up, off, out, in (phrasal verbs) |
UH | Interjection | oh, well, um, uh |
FW | Foreign word | de, la, von |
LS | List item marker | 1., 2., a., b. |
SYM | Symbol | $, %, +, = |
HYPH | Hyphen | - |
NFP | Superfluous punctuation | ..., -- |
XX | Unknown | unanalyzable tokens |
ADD | Email or URL | user@example.com, http://... |
_SP | Space | whitespace |
Syntactic dependency relations showing how words relate to each other in a sentence. Searchable with {@label} notation (e.g., {@nsubj} for nominal subjects).
Note: The {@...} notation searches both fine-grained tags and dependency labels simultaneously.
| Label | Description | Example |
|---|---|---|
| Core Arguments | ||
nsubj | Nominal subject | She runs. |
nsubjpass | Nominal subject (passive) | It was built. |
dobj | Direct object | I see you. |
dative | Dative (indirect object) | Give me a book. |
attr | Attribute | She is a teacher. |
agent | Agent (passive by-phrase) | Built by engineers. |
expl | Expletive | There is a problem. |
| Clausal Arguments | ||
csubj | Clausal subject | What she said is true. |
csubjpass | Clausal subject (passive) | That she lied was suspected by everyone. |
ccomp | Clausal complement | I think he left. |
xcomp | Open clausal complement | I want to go. |
acomp | Adjectival complement | She looks happy. |
oprd | Object predicate | I consider him smart. |
| Modifiers | ||
amod | Adjectival modifier | a big house |
advmod | Adverbial modifier | run quickly |
nummod | Numeric modifier | three cats |
nmod | Nominal modifier | a cup of coffee |
npadvmod | Noun phrase as adverbial modifier | yesterday, this way |
quantmod | Quantifier modifier | about 200 |
appos | Appositional modifier | Sam, my brother |
acl | Clausal modifier of noun (adjectival clause) | the man sitting there |
relcl | Relative clause modifier | the book I read |
advcl | Adverbial clause modifier | If it rains, I stay. |
neg | Negation modifier | I do not agree. |
det | Determiner | the book |
predet | Predeterminer | all the people |
poss | Possession modifier | my book |
| Prepositional & Case | ||
prep | Prepositional modifier | go to school |
pobj | Object of preposition | in the house |
pcomp | Complement of preposition | instead of going |
case | Case marking | John 's book |
| Coordination & Connectors | ||
conj | Conjunct | cats and dogs |
cc | Coordinating conjunction | cats and dogs |
preconj | Pre-correlative conjunction | either A or B |
mark | Marker (subordinating conjunction) | because it rained |
| Verbal | ||
aux | Auxiliary | I have eaten. |
auxpass | Passive auxiliary | It was built. |
compound | Compound | New York, ice cream |
prt | Particle (phrasal verb) | give up, turn off |
| Other | ||
ROOT | Root of the sentence | She runs fast. |
punct | Punctuation | Hello, world. |
parataxis | Parataxis (loosely joined clause) | He said — I agree. |
intj | Interjection | Well, I think... |
dep | Unclassified dependent | (catch-all) |
meta | Meta modifier | structural markup |
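
Because {@...} is checked against both the fine-grained tag list and the dependency label list, one term can in principle hit either layer. The sketch below illustrates that with abbreviated copies of the two inventories. `FINE_TAGS`, `DEP_LABELS`, and `classify_at_query` are hypothetical names, and case-insensitive comparison is an assumption suggested by the lowercase examples ({@vbg}, {@nsubj}) rather than something this page states.

```python
# Abbreviated label inventories taken from the two tables above (not the full sets).
FINE_TAGS = {"NN", "NNS", "VB", "VBG", "CC", "DT", "IN", "MD"}
DEP_LABELS = {"nsubj", "dobj", "cc", "det", "prep", "aux", "ROOT"}


def classify_at_query(label):
    """Report which annotation layers a {@label} term could match.

    Assumption: labels are compared case-insensitively.
    """
    key = label.lower()
    layers = []
    if key in {tag.lower() for tag in FINE_TAGS}:
        layers.append("tag")
    if key in {dep.lower() for dep in DEP_LABELS}:
        layers.append("dep")
    return layers
```

Under these assumptions `classify_at_query("vbg")` reports only the tag layer and `classify_at_query("nsubj")` only the dependency layer, while `cc` appears in both tables (tag CC, dependency cc) and would be reported for both.
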
Grammatical properties of individual tokens. Searchable with {#feature} notation using partial matching (e.g., {#past} matches any token containing "Past" in its morphological annotation).
| Feature | Values | Search example | Description |
|---|---|---|---|
VerbForm | Fin, Ger, Inf, Part | {#ger}, {#inf} | Finite, gerund, infinitive, participle |
Tense | Past, Pres | {#past}, {#pres} | Past or present tense |
Aspect | Perf, Prog | {#perf}, {#prog} | Perfect or progressive aspect |
Mood | Ind | {#ind} | Indicative mood |
VerbType | Mod | {#mod} | Modal verb |
Voice | (none) | {@auxpass}, {@nsubjpass} | Not annotated by en_core_web_lg; use the dependency labels auxpass and nsubjpass instead |
Number | Sing, Plur | {#sing}, {#plur} | Singular or plural |
Person | 1, 2, 3 | {#person: 3} | 1st, 2nd, or 3rd person |
Case | Acc, Nom | {#nom}, {#acc} | Nominative or accusative case |
Gender | Fem, Masc, Neut | {#fem}, {#masc} | Grammatical gender (pronouns) |
Degree | Pos, Cmp, Sup | {#cmp}, {#sup} | Positive, comparative, superlative |
Definite | Def, Ind | {#def}, {#ind} | Definite or indefinite article |
PronType | Art, Dem, Ind, Prs, Rel | {#dem}, {#rel} | Article, demonstrative, indefinite, personal, relative |
Poss | Yes | {#poss} | Possessive |
Reflex | Yes | {#reflex} | Reflexive pronoun |
Polarity | Neg | {#neg} | Negative polarity (not) |
NumType | Card, Mult, Ord | {#ord} | Cardinal, multiplicative, ordinal |
Foreign | Yes | {#foreign} | Foreign word |
ConjType | Cmp | {#conjtype} | Comparative conjunction (than) |
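
Partial matching can be pictured as a substring test against a token's morphological string. The sketch below assumes a case-insensitive test over the whole "Feature=Value|Feature=Value" string; how TCSE actually implements it is not documented here. The sample strings and the names `SAMPLE_MORPH`, `morph_matches`, and `matching_tokens` are illustrative assumptions.

```python
# Illustrative morph strings in "Feature=Value|Feature=Value" format.
# The exact feature sets are assumptions, not output copied from TCSE.
SAMPLE_MORPH = {
    "walked": "Tense=Past|VerbForm=Fin",
    "walks": "Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
    "is": "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
    "a": "Definite=Ind|PronType=Art",
}


def morph_matches(query, morph):
    """Return True if the {#query} text occurs anywhere in the morph string.

    Assumption: matching is a case-insensitive substring test.
    """
    return query.lower() in morph.lower()


def matching_tokens(query):
    """List the sample tokens whose morph string contains the query."""
    return [tok for tok, morph in SAMPLE_MORPH.items() if morph_matches(query, morph)]
```

In this sketch `matching_tokens("past")` selects only "walked", whereas `matching_tokens("ind")` selects both "is" (Mood=Ind) and "a" (Definite=Ind). That is worth keeping in mind when a short value such as Ind appears under more than one feature (Mood, Definite, PronType).
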
Named entities recognized by spaCy's NER model. Searchable with %TYPE notation (e.g., %PERSON for any person entity, or say %ORG for the word "say" followed by an organization).
| Type | Description | Examples |
|---|---|---|
PERSON | People, including fictional | Obama, Einstein, Hamlet |
ORG | Organizations, companies, agencies | Google, the UN, NASA |
GPE | Countries, cities, states | France, New York, California |
LOC | Non-GPE locations: mountains, bodies of water | the Alps, the Pacific, Antarctica |
FAC | Facilities: buildings, airports, highways | the Eiffel Tower, JFK Airport |
NORP | Nationalities, religious/political groups | American, Buddhist, Republican |
DATE | Absolute or relative dates/periods | tomorrow, last year, 2024 |
TIME | Times shorter than a day | 3 o'clock, this morning |
MONEY | Monetary values | $5, 2 million euros |
PERCENT | Percentages | fifty percent, 10% |
QUANTITY | Measurements: weight, distance, etc. | 5 kg, 2 miles, 100 degrees |
CARDINAL | Numerals not covered by other types | five, hundred, 42 |
ORDINAL | Ordinal numbers | first, 42nd, third |
EVENT | Named events | World War II, the Olympics |
WORK_OF_ART | Titles of works | Hamlet, Mona Lisa |
PRODUCT | Products, vehicles, foods | iPhone, Boeing 747 |
LAW | Named documents of law | the Magna Carta, GDPR |
LANGUAGE | Named languages | English, Mandarin, Arabic |
Quick reference for all search notation available in Advanced Search mode.
| Syntax | Description | Example |
|---|---|---|
word | Surface form search | help |
[lemma] | Lemma (base form) search | [be] → am, is, are, was, were, been, being |
['s] | Literal surface match | ['s] → 's (possessive/contraction) |
word{pos} | Surface + POS filter | help{verb}, help{noun} |
[lemma]{pos} | Lemma + POS filter | [help]{noun} |
{pos} | POS-only search | {propn} (any proper noun) |
{@tag_or_dep} | Fine-grained tag or dependency | {@vbg} (gerund), {@nsubj} (subject) |
{#morph} | Morphological feature | {#past} (past tense), {#plur} (plural) |
{-pos} | Negative POS filter | {-verb} (not a verb) |
-word | Negative word match | -the |
a\|b | OR alternatives | help\|assist |
+prefix | Prefix match | +un → un..., under..., until... |
{} or * | Wildcard (one word) | [make] {} {noun} |
_ | Noun chunk | [give] _ _ |
^ | Start of segment | ^ however |
%TYPE | Named entity search | %PERSON, say %ORG |
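
To make the bracket and brace forms in the table concrete, here is a minimal sketch that splits a query into terms and labels each part. It is an illustration only, not TCSE's parser: `TERM`, `FILTER_KINDS`, `parse_term`, and `parse_query` are hypothetical names. The sketch covers only `[lemma]`, `word`, and `{...}` filters. Operators such as `-word`, `+prefix`, `a|b`, `*`, `_`, `^`, and `%TYPE` are passed through as plain words, and `['s]` is reported under the lemma key even though the table describes it as a literal surface match.

```python
import re

# One search term: an optional [lemma] or surface word, then an optional {...} filter.
TERM = re.compile(
    r"^(?:\[(?P<lemma>[^\]]+)\]|(?P<word>[^\s{}\[\]]+))?(?:\{(?P<filter>[^}]*)\})?$"
)

# Leading character inside {...} selects the filter kind; no prefix means coarse POS.
FILTER_KINDS = {"@": "tag_or_dep", "#": "morph", "-": "not_pos"}


def parse_term(term):
    """Split one term such as '[help]{noun}' into its labelled parts."""
    match = TERM.match(term)
    if match is None or term == "":
        raise ValueError(f"cannot parse term: {term!r}")
    parts = {k: v for k, v in match.groupdict().items() if v is not None}
    body = parts.pop("filter", None)
    if body is not None:
        if body == "":
            parts["wildcard"] = True              # {} matches any one word
        elif body[0] in FILTER_KINDS:
            parts[FILTER_KINDS[body[0]]] = body[1:]
        else:
            parts["pos"] = body
    return parts


def parse_query(query):
    """Parse a whitespace-separated query into a list of term dictionaries."""
    return [parse_term(term) for term in query.split()]
```

For instance, `parse_query("[make] {} {noun}")` yields three term dictionaries: a lemma term, a wildcard, and a POS-only term, matching the wildcard example in the table.
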