The CRITT TPR-DB 3.0¶
This is the documentation site for the CRITT TPR-DB version 3.0.
The Center for Research and Innovation in Translation and Translation Technology (CRITT) creates and maintains the Translation Process Research Database (TPR-DB), which is now in its third major version (3.0).
What is the TPR-DB?¶
The Translation Process Research Database (TPR-DB) is a tool for organizing and processing key-logging and gaze-fixation data so that it can be easily analyzed in powerful and flexible ways.
Yeah, but what is it?
The TPR-DB has three concrete resources that help you analyze your translation process data:
- The TPR-DB web app to upload and process your data 🤖
- Utility libraries (Python and R) to help you analyze the data 🤓
- Documentation (you’re looking at it 😉) to help you along the way
TPR-DB Workflow¶
The Goal
starting point = gather data
end goal = analyze data
- Gather key-logging data with Translog-II or Trados Qualitivity and/or gaze data with an eye tracker
- Upload the raw data to the TPR-DB
- The TPR-DB will automatically process the raw data:
  - Segment the texts
  - Tokenize the texts
  - PoS tag the tokens and parse syntactic dependencies
  - Align source and target segments
  - (optional) Automatically align source and target words
  - Map keystrokes to words
  - Map fixations to words
  - The end result of this process is a set of SessionProps XML files
- Manually align words and annotate alignment groups
- The TPR-DB will automatically extract features to create TPR-DB data tables
- Use the features in the tables to visualize and analyze your data 🤓
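As a rough illustration of that last step, here is a minimal sketch of loading a feature table and averaging one feature per group. It uses only the Python standard library rather than the TPR-DB utility libraries, and everything specific in it is an assumption made for the example: the tab-separated layout, the column names `Session`, `Token` and `Dur`, the helper names `load_table` and `mean_by`, and the toy values themselves.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Toy stand-in for a TPR-DB data table. The tab-separated layout and the
# column names (Session, Token, Dur) are assumptions for illustration only;
# the values are invented and do not come from any real study.
TOY_TABLE = (
    "Session\tToken\tDur\n"
    "P01_T1\tthe\t120\n"
    "P01_T1\thouse\t480\n"
    "P02_T1\tthe\t200\n"
    "P02_T1\thouse\t600\n"
)


def load_table(handle):
    """Read a tab-separated table into a list of row dictionaries."""
    return list(csv.DictReader(handle, delimiter="\t"))


def mean_by(rows, group_col, value_col):
    """Average a numeric column within each group of another column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(float(row[value_col]))
    return {key: mean(values) for key, values in groups.items()}


rows = load_table(io.StringIO(TOY_TABLE))
per_session = mean_by(rows, "Session", "Dur")  # mean duration per session
per_token = mean_by(rows, "Token", "Dur")      # mean duration per token
print(per_session)
print(per_token)
```

To try the same idea on a real table, replace the `io.StringIO(TOY_TABLE)` argument with an open file handle and substitute the column names that your table actually contains.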
History¶
The CRITT TPR-DB project started around 2010 with the data from a number of translation process studies that were recorded in Translog[1] and integrated into a database under a common format. The TPR-DB was first a part of the Danish Dependency Treebanks Project at the Copenhagen Business School[2] and then became a project of its own. Right from the beginning, the idea was not only to provide a data repository for translation process research, but also to provide a toolkit for analyzing and visualizing the data. The TPR-DB toolkit started out as a number of R functions for accessing and analyzing the TPR-DB data, and subsequently developed into a browser-based toolkit with a Jupyter interface and a Python library.
The first public documentation of the CRITT TPR-DB 1.0 appeared in 2012 (Carl 2012), when Translog-II was amended to record non-European languages (Chinese, Hindi, Japanese, among others). Adding a web interface and further toolkit functionality resulted in the TPR-DB 2.0, parts of which are documented in the Springer volume New Directions in Empirical Translation Process Research (Carl et al. 2015). A second Springer volume, Explorations in Empirical Translation Process Research (Carl 2021), appeared in 2021, when the TPR-DB was based at Kent State University.
In 2026, the TPR-DB processing chain was migrated from Perl to Python and was given a new user interface.