TCSE is a search engine specializing in exploring transcripts of TED Talks. It was created for educational and scientific purposes. TCSE uses data provided by TED under the Creative Commons BY-NC-ND license, but it is not an official service of TED.
TCSE was created by Yoichiro Hasebe at Doshisha University, Kyoto, Japan, and is made available free of charge for non-commercial educational and scientific use. Please cite one of the following when you publish work that uses TCSE.
Hasebe, Yoichiro. (2015) Design and Implementation of an Online Corpus of Presentation Transcripts of TED Talks. Procedia: Social and Behavioral Sciences 198(24), 174–182.
Development of this system was partially supported by JSPS KAKENHI Grant-in-Aid for Young Scientists (B) Number 25870898.
Special thanks to Mura Nava (EFL Notes) for continuous feedback.
| Number of talks | 4,227 |
| Number of segments | 974,360 |
| Number of expanded segments | 416,382 |
| Number of elements | 8,712,020 |
| Number of lexical items | 103,735 |
| Arabic | 4,143 talks |
| Bulgarian | 2,207 talks |
| Chinese, Simplified | 4,142 talks |
| Chinese, Traditional | 4,137 talks |
| Croatian | 1,983 talks |
| Czech | 1,578 talks |
| Dutch | 2,949 talks |
| French | 4,124 talks |
| German | 2,863 talks |
| Greek | 2,709 talks |
| Hebrew | 3,454 talks |
| Hungarian | 3,184 talks |
| Indonesian | 2,116 talks |
| Italian | 3,714 talks |
| Japanese | 3,847 talks |
| Korean | 4,006 talks |
| Persian | 3,282 talks |
| Polish | 3,052 talks |
| Portuguese | 3,657 talks |
| Portuguese, Brazilian | 4,168 talks |
| Romanian | 3,328 talks |
| Russian | 3,938 talks |
| Serbian | 2,673 talks |
| Spanish | 4,165 talks |
| Swedish | 1,164 talks |
| Thai | 1,858 talks |
| Turkish | 3,986 talks |
| Ukrainian | 2,174 talks |
| Vietnamese | 3,236 talks |
The video and transcript are sometimes out of sync. For such cases, the following solution is available on TCSE:
Advanced search is available only in English.
List of English POS tags
POS keys are specified either fully ({vb}) or partially ({v}).
An advanced search query string cannot consist only of POS keys.
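The restriction above can be checked before a query is submitted. The following is a minimal sketch, not TCSE's own validation: the function name `is_pos_only` is hypothetical, and it assumes that query elements are separated by whitespace and that a bare POS key is anything of the form `{...}`.

```python
import re

# Assumption: a bare POS key is a single element wrapped in curly braces.
POS_KEY = re.compile(r"^\{[^{}]+\}$")


def is_pos_only(query: str) -> bool:
    """Return True when every whitespace-separated element is a bare POS key."""
    elements = query.split()
    return bool(elements) and all(POS_KEY.match(e) for e in elements)


if __name__ == "__main__":
    for q in ["{n} {v}", "to {v}", "[help]{n}"]:
        print(q, "->", "rejected" if is_pos_only(q) else "accepted")
```

Under these assumptions, `{n} {v}` would be rejected, while `to {v}` and `[help]{n}` would pass because each contains a surface form or a lemma.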
| Lemma | [LEMMA] |
| Part of Speech | {POS} |
| Surface + Part of Speech | SURFACE{POS} (with no spaces in between) |
| Lemma + Part of Speech | [LEMMA]{POS} (with no spaces in between) |
| Logical Disjunction (OR) | A|B |
| Segment Onset (Beginning) | ^ |
| Negative Match | - |
| Wild Card (matching exactly one element/word) | -_ |
| Wild Card (matching variable length of strings) | * |
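For readers who generate queries programmatically, the notation in the table can be assembled with a few string helpers. This is only an illustrative sketch: the helper names (`lemma`, `surface`, `any_of`, `negate`, `query`) are hypothetical and are not part of TCSE, which simply accepts the finished query string.

```python
from typing import Optional


def _pos_suffix(pos: Optional[str]) -> str:
    """Format an optional POS key, e.g. 'n' -> '{n}'."""
    return f"{{{pos}}}" if pos else ""


def lemma(word: str, pos: Optional[str] = None) -> str:
    """[LEMMA] or [LEMMA]{POS}, with no space in between."""
    return f"[{word}]" + _pos_suffix(pos)


def surface(word: str, pos: Optional[str] = None) -> str:
    """SURFACE or SURFACE{POS}, with no space in between."""
    return word + _pos_suffix(pos)


def any_of(*alternatives: str) -> str:
    """Logical disjunction: A|B."""
    return "|".join(alternatives)


def negate(word: str) -> str:
    """Negative match: -WORD."""
    return f"-{word}"


def query(*elements: str, onset: bool = False) -> str:
    """Join elements with spaces; prefix ^ to anchor at the segment onset."""
    return " ".join((["^"] if onset else []) + list(elements))


if __name__ == "__main__":
    print(query(lemma("read"), "{DT}", lemma(any_of("news", "paper", "article"))))
    print(query("having", "{v}", onset=True))
    print(query(lemma("get"), negate("rid"), "of"))
```

Keeping each operator in its own helper makes it harder to introduce a stray space between a lemma and its POS key, which the table notes is not allowed.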
| [excite] | excite, excites, excited, exciting |
| {n} | Noun, any kind |
| {v} | Verb, any kind |
| to * surprise | to our surprise, to his surprise, etc. |
| [read] {DT} [news|paper|article] | they read these articles, reading the paper or something, I'm reading the news at six, etc. |
| ^ having {v} | Having started the process, Having said that, etc. |
| [help]{n} | an aunt offered financial help, we called people for help, etc. |
| [get] -rid of | get outside of, get ahead of, got tired of, etc. |
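To make the behavior of the example queries more concrete, here is a toy matcher over a hand-tagged segment. It is a sketch under stated assumptions, not a description of how TCSE works internally: the `Token` type, the tags in `SEGMENT`, and the functions `element_matches` and `search` are all hypothetical, a partial POS key is assumed to match any tag that starts with it (case-insensitively), and wildcards are left out to keep the sketch short.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    surface: str
    lemma: str
    pos: str  # hand-assigned tag, for illustration only


# One query element: optional "-", then [LEMMA] or SURFACE, then optional {POS}.
ELEMENT = re.compile(
    r"^(?P<neg>-)?"
    r"(?:\[(?P<lemma>[^\]]+)\]|(?P<surface>[^\[\]{}]+))?"
    r"(?:\{(?P<pos>[^}]+)\})?$"
)


def element_matches(element: str, token: Token) -> bool:
    """Check one query element against one token."""
    m = ELEMENT.match(element)
    if m is None or not any(m.group("lemma", "surface", "pos")):
        raise ValueError(f"unsupported query element: {element!r}")
    ok = True
    if m.group("lemma"):  # alternatives are separated by "|"
        ok = ok and token.lemma.lower() in m.group("lemma").lower().split("|")
    if m.group("surface"):
        ok = ok and token.surface.lower() in m.group("surface").lower().split("|")
    if m.group("pos"):  # assumed: partial key = prefix of the tag
        ok = ok and token.pos.lower().startswith(m.group("pos").lower())
    return not ok if m.group("neg") else ok


def search(query: str, tokens: List[Token]) -> List[str]:
    """Return the surface text of every token window that matches the query."""
    elements = query.split()
    anchored = bool(elements) and elements[0] == "^"
    if anchored:
        elements = elements[1:]
    if not elements:
        raise ValueError("empty query")
    starts = [0] if anchored else range(len(tokens) - len(elements) + 1)
    hits = []
    for start in starts:
        window = tokens[start:start + len(elements)]
        if len(window) == len(elements) and all(
            element_matches(e, t) for e, t in zip(elements, window)
        ):
            hits.append(" ".join(t.surface for t in window))
    return hits


# "Having said that we got tired of waiting", tagged by hand.
SEGMENT = [
    Token("Having", "have", "VBG"), Token("said", "say", "VBN"),
    Token("that", "that", "DT"), Token("we", "we", "PRP"),
    Token("got", "get", "VBD"), Token("tired", "tired", "JJ"),
    Token("of", "of", "IN"), Token("waiting", "wait", "VBG"),
]

if __name__ == "__main__":
    print(search("^ having {v}", SEGMENT))
    print(search("[get] -rid of", SEGMENT))
```

In this sketch the negative element `-rid` consumes one token that is anything other than "rid", which is consistent with the listed results for `[get] -rid of`.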
| TCSE Version | 8.0.2 |
| Date of talk data compilation | February 5, 2021 |
| English POS Tagger | Enju 2.4.4 |