Aine Ito

Databases

SUBTLEX-UK: Frequency for British English based on subtitles of British television programmes

SUBTLEX-CH: A database of Chinese word and character frequencies

MRC Psycholinguistic Database: for familiarity, imageability and concreteness ratings, among others

MCWord: An Orthographic Wordform Database: for obtaining orthographic neighbourhood frequency and generating lists of nonwords

Sentence norms: Completion norms for 3085 English sentence contexts, for obtaining sentences with a target word with a certain cloze value

Cloze/Sentence norms: Cloze probability and completion norms for 498 English sentences (Block & Baldwin, 2010)

Cloze/Sentence norms: Cloze probability, predictability ratings, and computational estimates for 205 English sentences (de Varda, Marelli & Amenta, 2023)

Kanji database: Kanji frequency, On- and Kun-reading frequencies, On-reading ratio, kanji productivity of two-kanji compounds, symmetry of kanji productivity, entropy, number of meanings, etc.

Online Japanese Accent Dictionary: for obtaining Tokyo accent and generating lists of words with a certain accent type

University of South Florida Free Association Norms: Semantic Association database

Google Ngram Viewer: Ngrams in Google's text corpora

CLEARPOND: The Cross-Linguistic Easy-Access Resource for Phonological and Orthographic Neighborhood Densities

The Auditory English Lexicon Project (AELP): A multi-talker, multi-region psycholinguistic database of 10,170 spoken words and 10,170 spoken nonwords

LexiCAL: A calculator for lexical variables

Corpora

British National Corpus: This corpus contains 100 million words of text from a wide range of genres

Corpus of Contemporary American English: This corpus contains more than 520 million words of text. It is equally divided among spoken, fiction, popular magazines, newspapers, and academic texts.

Chunagon (中納言): Corpus of written & spoken Japanese from NINJAL

Auditory English Lexicon Project (AELP): A multi-talker, multi-region psycholinguistic database of 10,170 spoken words and nonwords. (English)

Eye-tracking corpora

The Provo Corpus: A Large Eye-Tracking Corpus with Predictability Norms. Luke, S.G. & Christianson, K. (2018). The Provo Corpus: A Large Eye-Tracking Corpus with Predictability Norms. Behavior Research Methods, 50, 826-833.

GECO: An eye-tracking corpus of monolingual and bilingual sentence reading. Cop, U., Dirix, N., Drieghe, D., & Duyck, W. (2017). Presenting GECO: An eyetracking corpus of monolingual and bilingual sentence reading. Behavior Research Methods, 49(2), 602-615.

MECO: An eye-tracking corpus of multilingual L2 (English) sentence reading. Siegelman et al. (2022). Expanding horizons of cross-linguistic research on reading: The Multilingual Eye-movement Corpus (MECO). Behavior Research Methods.

Chinese eye-tracking reading corpus: An eye-tracking corpus of Chinese sentence reading. Zhang et al. (2022). The database of eye-movement measures on words in Chinese reading. Scientific Data.

E-books

Linear Mixed Models in Linguistics and Psychology: A Comprehensive Introduction by Shravan Vasishth, Daniel Schad, Audrey Bürki, and Reinhold Kliegl

R for Data Science by Hadley Wickham and Garrett Grolemund

Learning Statistical Models Through Simulation in R by Dale Barr

R for Psychological Research (Course materials) by Glenn Williams

One Way ANOVA with R Completely Randomized Design - Between Groups by Bruce Dudek

Doing Meta-Analysis with R: A Hands-On Guide by Mathias Harrer, Pim Cuijpers, Toshi A. Furukawa & David D. Ebert

Power Analysis with Superpower by Aaron R. Caldwell, Daniël Lakens, Chelsea M. Parlett-Pelleriti, Guy Prochilo & Frederik Aust

Picture databases

Black & White

260 pictures standardised in English: Snodgrass & Vanderwart (1980)

360 pictures standardised in Japanese: Nishimoto, Ueda, Miyawaki, Une, & Takahashi (2012)

International Picture Naming Project

The Noun Project

Colour

Bank of Standardized Stimuli (BOSS): Brodeur, Dionne-Dostie, Montreuil, & Lepage (2010)

Colour version of Snodgrass & Vanderwart (1980): Moreno-Martínez & Montoro (2012)

MultiPic: A standardized set of 750 drawings with multilingual norms

LinguaPix: 1,620 colour photographs normed in Dutch, English, Polish, and Cantonese

Tools

Mix: stimuli (pseudo-)randomisation tool

Research Randomizer: simple randomisation tool

Ralpha: software for resizing images (Windows only)

LexTALE: Lexical Test for Advanced Learners of English. Lemhöfer, K., & Broersma, M. (2012) (LexTALE on GitHub)

PCIbex Farm: a platform for running web-based experiments

Cognition: a platform for running web-based experiments

Working memory span tests: available in Czech, English, German, Japanese, Russian, and Spanish (credit: Titus von der Malsburg)

PsychoPy experiment templates: reaction time experiments, digit span, counterbalancing, mouse tracking, self-paced reading, etc.

Linger: software for self-paced reading experiments

Whisper Large V3: Transcribe Audio: transcribe long-form microphone or audio inputs

Restream: transcribe audio to text

WebPower: statistical power analysis online

RoBERTa: cloze norming with an LLM

char-similar: a Python tool for quantifying Chinese character shape similarity

Tutorials

PsyTeachR: Many useful R resources from the University of Glasgow

Introduction to mixed-effects models: by Ian Cunnings & George Pontikas (YouTube)

Tutorials for visual world eye-tracking data analysis in R: R tutorials I made for a workshop in 2023 based on Ito & Knoeferle (2022, Behavior Research Methods). You can find tutorial videos in the Media tab.

Simulation-based power analysis: from Kumle, Võ and Draschkow (2021, Behavior Research Methods)

ERP training: ERPLAB tutorials by Jen Lewendon