TED Stage Talk: 3,375
TED-Ed Original: 1,323
TEDx Talk: 958
TED Institute Talk: 278
Original Content: 270
TED Salon Talk (partner): 104
Podcast (audio only): 63
Best of Web: 46
Custom sponsored content: 2
Total: 6,419
 
  

About TCSE

TCSE is a search engine specialized in exploring transcripts of TED Talks. It was created for educational and scientific purposes. TCSE uses data provided by TED under the Creative Commons BY-NC-ND license, but it is not an official TED service.




Using and Citing TCSE

TCSE was created by Yoichiro Hasebe at Doshisha University, Kyoto, Japan, and is freely available for non-commercial educational and scientific use. Please cite the following when you publish work that uses TCSE.

Hasebe, Yoichiro (2015) Design and Implementation of an Online Corpus of Presentation Transcripts of TED Talks. Procedia: Social and Behavioral Sciences 198(24), 174–182.

Acknowledgment

  • Special thanks to TED for being a great resource for language learners, educators, and researchers.
  • Thanks to Mura Nava (EFL Notes) for continuous feedback.
  • Thanks to Daisuke Nonaka for suggesting practical use cases of TCSE in his paper on TCSE.
  • Development of this system was partially supported by JSPS KAKENHI Grant-in-Aid for Young Scientists (B) Number 25870898 and Grant-in-Aid for Scientific Research (C) Number 18K00670.

TCSE Specifications

TCSE Version: 12.1.0
Date of talk data compilation: February 28, 2026
English POS-Tagger / Syntactic Parser: spaCy 3.8 (en_core_web_lg)

Statistics of English Transcripts

Number of talks: 6,419
Number of segments: 1,419,926
Number of expanded segments: 677,487
Number of elements: 13,017,589
Number of lexical items: 106,707

Number of Translated Talks

Arabic: 6,290 talks
Bulgarian: 2,344 talks
Burmese: 2,102 talks
Chinese, Simplified: 6,033 talks
Chinese, Traditional: 5,701 talks
Croatian: 2,062 talks
Czech: 1,792 talks
Dutch: 3,263 talks
French: 5,894 talks
German: 3,722 talks
Greek: 3,407 talks
Hebrew: 4,869 talks
Hindi: 1,202 talks
Hungarian: 3,932 talks
Indonesian: 3,651 talks
Italian: 5,559 talks
Japanese: 4,688 talks
Korean: 5,600 talks
Kurdish, Central: 1,429 talks
Kurdish, Northern: 1,144 talks
Persian: 4,183 talks
Polish: 3,823 talks
Portuguese: 5,055 talks
Portuguese, Brazilian: 5,400 talks
Romanian: 3,989 talks
Russian: 5,223 talks
Serbian: 3,076 talks
Slovak: 1,128 talks
Spanish: 6,291 talks
Swedish: 1,390 talks
Thai: 2,764 talks
Turkish: 5,395 talks
Ukrainian: 2,356 talks
Vietnamese: 5,679 talks

Video Control Tips

How to skip to a specific segment

  1. Pause the video. (Otherwise only segments close to the current one are visible.)
  2. Locate the segment that you would like to skip to and click on it.

How to adjust sync between video and transcript

The video and the transcript are sometimes out of sync. When that happens, you can realign them as follows:

  1. Pause the video. (Otherwise only segments close to the current one are visible.)
  2. Locate the segment that should play when the video resumes.
  3. Click the "adjust" button of that segment.

About Text Highlight

In the video playback view, the following text highlights are available:

Keywords of the Talk: Words with a TF-IDF score above 3.0 for the talk are underlined. TF-IDF (Term Frequency–Inverse Document Frequency) measures how important a word is to a particular talk relative to the entire corpus. Higher values indicate words that are characteristic of that specific talk.

Discourse Markers: Common discourse markers (e.g. however, in other words, you know, I mean) are highlighted with a colored underline. These are words and phrases that organize speech, signal transitions, or manage the flow of conversation.
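
The keyword highlight described above can be illustrated with a small calculation. The sketch below assumes one common TF-IDF variant (raw term count in the talk multiplied by the natural log of the number of talks divided by the number of talks containing the word). The source does not say which weighting TCSE actually uses, and the function name `tf_idf_keywords` and the toy corpus are hypothetical. Only the 3.0 threshold comes from the description above.

```python
import math
from collections import Counter


def tf_idf_keywords(talks, talk_id, threshold=3.0):
    """Return words of one talk whose TF-IDF score exceeds `threshold`.

    Assumed weighting (not confirmed for TCSE):
        score = count_in_talk * ln(number_of_talks / talks_containing_word)
    """
    n_talks = len(talks)

    # Document frequency: in how many talks does each word occur?
    df = Counter()
    for words in talks.values():
        df.update(set(words))

    # Term frequency: raw counts within the selected talk.
    tf = Counter(talks[talk_id])

    scores = {w: count * math.log(n_talks / df[w]) for w, count in tf.items()}
    return {w: round(s, 2) for w, s in scores.items() if s > threshold}


# Toy corpus (hypothetical data, already tokenized and lowercased).
talks = {
    "a": "the octopus hides and the octopus hunts and the octopus learns".split(),
    "b": "the city grows and the city changes".split(),
    "c": "the brain learns and the brain adapts".split(),
}

keywords = tf_idf_keywords(talks, "a")
print(keywords)
```

In this toy corpus only "octopus" passes the threshold for talk "a": it occurs three times there and in no other talk, while words shared by every talk (such as "the") score zero.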

About Advanced Search

Advanced search is available only in English.

Linguistic Reference (POS, Tags, Dependencies, Morphology)

POS keys use spaCy Universal POS names (e.g. {verb}, {noun}). Short aliases are also accepted: {v}=verb, {n}=noun, {a}/{j}=adj, {r}=adv, {pr}=pron.
An advanced search query string cannot consist only of POS keys.
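
The alias rule and the restriction on POS-only queries can be sketched as a small pre-processing step. This is an illustration only: the alias table is copied from the text above, but the function names `expand_pos_aliases` and `is_pos_only` are hypothetical, and the sketch assumes query elements are separated by whitespace. It is not a description of how TCSE validates queries internally.

```python
import re

# Short aliases listed above, mapped to the full POS key names.
POS_ALIASES = {
    "v": "verb",
    "n": "noun",
    "a": "adj",
    "j": "adj",
    "r": "adv",
    "pr": "pron",
}

# A POS key is a name wrapped in curly braces, e.g. {verb} or {v}.
POS_KEY = re.compile(r"\{([A-Za-z]+)\}")


def expand_pos_aliases(query):
    """Replace short POS aliases such as {v} with their full names."""
    def replace(match):
        name = match.group(1)
        return "{" + POS_ALIASES.get(name, name) + "}"
    return POS_KEY.sub(replace, query)


def is_pos_only(query):
    """Return True if every whitespace-separated element is a bare POS key."""
    elements = query.split()
    return bool(elements) and all(POS_KEY.fullmatch(e) for e in elements)


print(expand_pos_aliases("[read] {n}"))   # alias expanded to {noun}
print(is_pos_only("{v} {n}"))             # POS keys only: not a valid query
print(is_pos_only("[help]{noun}"))        # lemma + POS: acceptable
```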


Advanced Search Syntax

Lemma: [LEMMA]
Part of Speech: {POS}
Surface + Part of Speech: SURFACE{POS} (with no spaces in between)
Lemma + Part of Speech: [LEMMA]{POS} (with no spaces in between)
Logical Disjunction (OR): A|B
Segment Onset (Beginning): ^
Noun Chunk: _
Negative Match: -X
Wild Card (matching exactly one element/word): -_
Wild Card (matching variable length of strings): *
Named Entity (NER): %PERSON, %ORG, %GPE, %DATE, etc.

Advanced Search Examples

[excite]
  excite, excites, excited, exciting
{noun}
  Noun, any kind
{verb}
  Verb, any kind
to * surprise
  to our surprise
  to his surprise, etc.
[read] {det} [news|paper|article]
  they read these articles
  reading the paper or something
  I'm reading the news at six, etc.
^ having {verb}
  Having started the process,
  Having said that, etc.
[help]{noun}
  an aunt offered financial help,
  we called people for help, etc.
[get] -rid of
  get outside of
  get ahead of
  got tired of, etc.
[make] _ -_
  made a bad design good
  make this happen
  make your life miserable, etc.
[give] _ _
  give you an example
  gave her a gift
  give the government any further excuse, etc.
%PERSON said
  Obama said
  Einstein said, etc.
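
To make the syntax above more concrete, the following sketch splits a query string into one description per element. It is a hypothetical reading of the notation, not TCSE's parser: the names `parse_element` and `parse_query`, the dictionary layout, and the assumption that elements are separated by whitespace are all introduced here for illustration.

```python
import re

# Optional [LEMMA] or SURFACE part, followed by an optional {POS} part.
ELEMENT = re.compile(
    r"^(?:\[(?P<lemma>[^\]]+)\]|(?P<surface>[^\[\]{}]+))?"
    r"(?:\{(?P<pos>[^}]+)\})?$"
)

# Elements that stand alone, following the syntax table above.
SPECIAL = {
    "^": "onset",          # segment onset
    "*": "wildcard_any",   # variable-length wild card
    "_": "noun_chunk",     # noun chunk
    "-_": "wildcard_one",  # exactly one element/word
}


def parse_element(text):
    """Describe a single query element as a dictionary."""
    if text in SPECIAL:
        return {"type": SPECIAL[text]}
    if text.startswith("%"):
        return {"type": "entity", "label": text[1:]}

    negated = text.startswith("-")      # -X is a negative match
    body = text[1:] if negated else text
    match = ELEMENT.match(body)
    if not body or match is None:
        raise ValueError(f"cannot parse element: {text!r}")

    spec = {"type": "token", "negated": negated}
    for key in ("lemma", "surface"):
        if match.group(key):
            spec[key] = match.group(key).split("|")   # A|B alternatives
    if match.group("pos"):
        spec["pos"] = match.group("pos")
    return spec


def parse_query(query):
    """Parse a whitespace-separated query into element descriptions."""
    return [parse_element(e) for e in query.split()]


for element in parse_query("[read] {det} [news|paper|article]"):
    print(element)
```

Each dictionary could then be compared against one annotated token (or noun chunk, or entity) of a transcript segment; that matching step is left out of the sketch.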

About Named Entity Recognition

In Advanced Search mode (check the "Advanced Search" checkbox), you can use %ENTITY notation to search for named entities recognized by spaCy NLP. Multi-token entities (e.g. "New York", "United Nations") are matched as a single unit. You can also search for NER patterns in the N-gram mode (e.g. %PERSON). The following entity types are available:

%CARDINAL: Numerals not covered by other types: 73,912
%DATE: Absolute or relative dates or periods: 72,487
%PERSON: People, including fictional: 59,525
%GPE: Countries, cities, states: 48,806
%ORG: Companies, agencies, institutions: 47,748
%ORDINAL: "first", "second", etc.: 21,850
%NORP: Nationalities, religious or political groups: 21,830
%LOC: Non-GPE locations (mountain ranges, bodies of water): 14,512
%TIME: Times smaller than a day: 9,389
%PERCENT: Percentage (including "%"): 8,184
%QUANTITY: Measurements (weight, distance): 6,854
%WORK_OF_ART: Titles of books, songs, etc.: 6,046
%MONEY: Monetary values: 5,108
%PRODUCT: Objects, vehicles, foods (not services): 3,470
%FAC: Buildings, airports, highways, bridges: 2,649
%EVENT: Named hurricanes, battles, wars, sports events: 2,165
%LANGUAGE: Any named language: 1,557
%LAW: Named documents made into laws: 758
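
The statement that multi-token entities are matched as a single unit can be pictured with the sketch below. It works on hand-written annotations rather than real spaCy output, and the helper names `merge_entities` and `find_entity_then_word` are hypothetical. It is meant only to show why a query such as "%ORG said" can match "United Nations said" even though the entity spans two words.

```python
def merge_entities(tokens, entities):
    """Collapse each entity span into a single (text, label) unit.

    tokens:   list of word strings
    entities: list of (start, end, label) token offsets, end exclusive
              (hand-annotated here; an assumption for illustration)
    """
    spans = {start: (end, label) for start, end, label in entities}
    units = []
    i = 0
    while i < len(tokens):
        if i in spans:
            end, label = spans[i]
            units.append((" ".join(tokens[i:end]), label))
            i = end                      # skip past the whole entity
        else:
            units.append((tokens[i], None))
            i += 1
    return units


def find_entity_then_word(units, label, word):
    """Return entities of type `label` directly followed by `word`."""
    return [
        units[k][0]
        for k in range(len(units) - 1)
        if units[k][1] == label and units[k + 1][0].lower() == word
    ]


# Hypothetical, hand-annotated segment.
tokens = ["The", "United", "Nations", "said", "New", "York", "was", "ready"]
entities = [(1, 3, "ORG"), (4, 6, "GPE")]

units = merge_entities(tokens, entities)
print(units)
print(find_entity_then_word(units, "ORG", "said"))
```

After merging, the eight words become six units, so a pattern with one entity slot followed by one word lines up with "United Nations" followed by "said".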