Recent Projects

See also: https://github.com/yohasebe

TCSE: TED Corpus Search Engine

A versatile corpus system for retrieving video and text segments from over 2,300 TED Talks.

jReadability

jReadability is a Japanese text readability measurement system being developed in collaboration with Jaeho Lee at Waseda University.

RSyntaxTree

RSyntaxTree is a graphical syntax tree generator written in the Ruby programming language.

WP2TXT

WP2TXT extracts plain text from Wikipedia dump files, stripping all MediaWiki markup and other metadata.

EngTagger

EngTagger is a probability-based, corpus-trained tagger that assigns POS tags to English text based on a lookup dictionary and a set of probability values.

Paradocs

Paradocs is a paragraph-oriented document presentation system built with Reveal.js.

Intro to BYU Corpora [in Japanese]

A tutorial on how to use COCA and other BYU corpora [in Japanese]

Using TCSE [in Japanese]

A tutorial on how to use the TED Corpus Search Engine (TCSE) [in Japanese]

Teaching

at Doshisha University

Kyotanabe Campus, Doshisha University  

No classes assigned for the period from Fall 2018 through Spring 2019.

Contact

  • yohasebe@gmail.com
  • Faculty of Global Communications
    Doshisha University
    1-3 Tatara Miyakodani, Kyotanabe-shi, Kyoto-fu, 610-0394, JAPAN