PyCREA: A Python Package for Neurobiologically Grounded Semantic Representations
Poster Session D, Saturday, September 13, 5:00 - 6:30 pm, Field House
Alex Skitowski¹, William Gross¹; ¹Medical College of Wisconsin
Semantic embedding spaces have been used to represent and quantify the meanings of words since the introduction of distributional linguistic models in the 1950s (Harris, 1954; Firth, 1962). Historically, these models have defined the semantic content of a word from its co-occurrences with other words and the contexts in which it appears. Modern NLP semantic embedding spaces (GloVe, Pennington et al., 2014; Word2Vec, Mikolov et al., 2013) have introduced more complicated analysis methods, but the vectors they produce remain fundamentally limited to latent semantic analysis (LSA). These embedding spaces have been successful at representing the semantic content of words, but they have drawbacks in terms of explicit feature selection and linkage to neurobiological mechanisms. In addition, although they efficiently represent linguistic semantics, they do not necessarily represent the same semantic space as the brain (Tong et al., 2022). Due to these limitations, our group previously developed the CREA (Concept Representation as Experiential Attributes) framework (Binder et al., 2016). CREA vectors are derived from human judgements of words, each rated from 0 to 6 on 65 neurobiologically motivated attributes spanning sensory, motor, spatial, temporal, affective, social, and cognitive experience. This framework addresses potential limitations of LSA-based embeddings, particularly their lack of interpretability and weak alignment with large-scale brain networks. In this work, we present a new tool for efficiently accessing and using these ratings: PyCREA, a Python package. The package currently contains a database of 960 word embeddings, along with the experimental scripts needed to collect ratings for new words. The library also contains functions to retrieve and compare vectors for selected words, either across all 65 dimensions or across a chosen subset of ratings. The package is currently available via GitHub (https://github.com/WiredBrains-Lab/CREA-Vectors) and installable via pip (https://pypi.org/project/PyCREA/).
With the release of the PyCREA package, we hope to provide researchers with an additional resource to efficiently conduct studies using this neurobiologically grounded semantic model.
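As a rough illustration of the kind of comparison described above, the following minimal sketch computes the cosine similarity between two attribute-rating vectors, either over all attributes or over a chosen subset. It does not use the PyCREA API: the attribute names, the rating values, and the helper functions select, cosine_similarity, and compare are hypothetical placeholders, and only five attributes are shown instead of the full 65. Cosine similarity is one common choice of comparison measure and is assumed here for illustration only.

```python
import math

# Hypothetical attribute names and 0-6 ratings, invented for illustration.
# These are NOT values from the CREA database and NOT the PyCREA API.
ATTRIBUTES = ["vision", "sound", "motion", "social", "emotion"]
VECTORS = {
    "dog": [5.0, 4.0, 4.5, 3.0, 3.5],
    "cat": [5.0, 3.0, 4.0, 2.5, 3.0],
    "idea": [0.5, 0.0, 0.5, 2.0, 2.0],
}


def select(word, attributes=None):
    """Return the rating vector for a word, optionally restricted to named attributes."""
    vector = VECTORS[word]
    if attributes is None:
        return list(vector)
    return [vector[ATTRIBUTES.index(name)] for name in attributes]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for an all-zero vector")
    return dot / (norm_u * norm_v)


def compare(word_a, word_b, attributes=None):
    """Compare two words over all attributes or over a chosen subset."""
    return cosine_similarity(select(word_a, attributes), select(word_b, attributes))


if __name__ == "__main__":
    print("all attributes:", compare("dog", "cat"))
    print("sensory subset:", compare("dog", "cat", attributes=["vision", "sound"]))
```

Restricting the comparison to a subset of attributes makes it possible to ask whether two words are similar along a particular experiential dimension, even when their full vectors differ.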
Topic Areas: Development of Resources, Software, Educational Materials, etc., Computational Approaches