Roberto Zamparelli

Corso Bettini, 31 - 38068 Rovereto
tel. 0464 808613
roberto.zamparelli[at]unitn [dot] it
Monday, 1 April 2019

Possible Thesis Topics

Theoretical Linguistics

I am interested in supervising theses on topics in theoretical linguistics, especially those concerned with the syntax/semantics interface.

Requirements: students interested in these topics should have followed several linguistics courses, including my Advanced Methods course, or have otherwise acquired expertise in both generative syntax and formal semantics.

  • Nominalization: e.g. the difference between various ways to derive nouns from verbs (e.g. "destruction" vs. "destroying" vs. Italian "Il distruggere" vs. "distruggere (è uno spasso)"). Connections to: Psycholinguistics, Computational Analyses.
  • Kinds: how to derive the different ways in which languages refer to "kinds of things" and express generalizations about them (e.g. "Tigers eat meat", "The tiger eats meat", "A tiger eats meat"); anaphoric reference to kinds.
  • Definiteness and demonstrativity: How do languages which do not have definite articles express definiteness? How are demonstratives used across languages? How definite are possessives?
  • Coordination: Is there a common meaning for the word "and" in "He was tall and fat / He left and she returned / The guests were in the kitchen and on the balcony"? How is "and" different from "or", and why is "or" harder to process than "and"? How does "but" work?

Linguistic Methodologies

These are topics that explore novel methods to address classic linguistic issues. Students willing to pursue these issues should have at least some background in linguistics (syntax or semantics) and a good background in computational linguistics (including, in some cases, artificial neural networks, or ANNs, and distributional semantics).

  • Using ANNs to simulate human linguistic judgments (part of the TREIL linguistic project) or ERP signals. Which network architecture should be used? To what extent can ANNs approximate human performance? What is the best way to measure this? Are there classes of phenomena that ANNs can or cannot capture?
  • Feeding ANNs a range of languages/language structures to converge on language universals (structures which are shared by all languages). How should the input be coded? What kinds of tasks can be asked of a "polyglot" network?
  • Impossible languages. See the previous task. Would an ANN learn languages that no human could naturally learn? Is it possible to constrain it so that it would not learn them? This task involves generating corpora, e.g. by artificially injecting odd structures into existing corpora, or by generating them with a grammar.
  • Structure folding. Current theoretical syntax often envisions universal complex sequences of functional projections, parts of which move to generate the output we see in actual languages, in different ways depending on the feature structure present in each language. The task would be to create a model which explores the feature space by artificially creating all the ways in which a functional sequence can be folded, then comparing the output to known language structures.
  • The distribution of constructions. This is really a set of tasks which look at finding a distributional semantics correlate of various constructions, at the morphological (nominalizations, gerunds, compounding, pluralization, diminutives, etc.) or syntactic (conjunction, passivization, definiteness) level. The general question is: what kind of insight into the meaning of constructions (as opposed to words) can distributional semantics give us?
  • Multilingual distributional representations. As our ability to create comparable semantic spaces for different languages increases, we start to be able to raise the question above at a multilingual level: e.g. "Are passives in English more similar to passives in German than in Italian?". There are a host of questions that can be asked if the semantics.
  • Multilingual distributions and lexical ambiguity. With multilingual semantic representations we could also start to address issues of homonymy and polysemy in a more principled way. "Chair" in English can be an object or a person ("He was chair at that conference"), but Italian "sedia" can only be an object. Can we use distributional semantic mappings to separate different meanings?
  • Alternative ways to collect linguistic data, especially via GWAPs (games with a purpose). Examples include mappings between intonations and semantic features, online grammaticality judgments, and reading times.
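
The corpus-generation step mentioned under "Impossible languages" could be prototyped along the following lines. This is only a minimal sketch under stated assumptions: the two perturbations (full word-order reversal and inserting a marker after a fixed number of words), the function names, the "NEG" marker and the two-sentence corpus are all hypothetical illustrations, not part of any existing project code.

```python
import random


def reverse_tokens(tokens):
    """Return the sentence with its word order fully reversed."""
    return tokens[::-1]


def insert_at_fixed_position(tokens, marker="NEG", position=3):
    """Insert a marker after a fixed number of words, ignoring syntactic structure.

    The marker and the position are hypothetical choices for illustration.
    """
    cut = min(position, len(tokens))
    return tokens[:cut] + [marker] + tokens[cut:]


def perturb_corpus(sentences, perturbation, rate=1.0, seed=0):
    """Apply `perturbation` to a proportion `rate` of whitespace-tokenized sentences."""
    rng = random.Random(seed)  # seeded so the generated corpus is reproducible
    perturbed = []
    for sentence in sentences:
        tokens = sentence.split()
        if rng.random() < rate:
            tokens = perturbation(tokens)
        perturbed.append(" ".join(tokens))
    return perturbed


# Toy corpus, invented for this example.
corpus = ["the tiger eats meat", "he left and she returned"]
reversed_corpus = perturb_corpus(corpus, reverse_tokens)
counted_corpus = perturb_corpus(corpus, insert_at_fixed_position)
```

Lowering `rate` below 1.0 would mix natural and perturbed sentences in the same corpus, which may be one way to test how much "odd" input a network tolerates.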


While I am not a psycholinguist, I am willing to co-supervise students interested in specific topics in sentence comprehension. In particular, I am interested in follow-ups to the experiment THINK (extracting ERP correlates of the process of mentally repeating or translating sentences), which was carried out by a CIMEC master's student. I am also interested, in principle, in studies that try to model EEG signals using ANNs (again, on semantic/syntactic processes).

Linguistic education

Education for the general public and for younger students is part of a university's "third mission". I am interested in developing new ways to teach language structures (which might or might not be a part of actually teaching "languages"). A couple of topics:

  • Testing and expanding Puzz-Ling, the physical language puzzle game. Puzz-Ling is a puzzle game to teach basic sentence structures across English, Italian and German. Work on this topic could mean (in order of growing complexity):
    • Designing and optimizing game rules.
    • Testing Puzz-Ling and checking which aspects are not yet covered or overgenerate.
    • Expanding it to other languages (e.g. Spanish, French) 
    • Designing alternative implementations which solve current problems (e.g. selection to heads, rather than categories; island sensitivity)
    • Creating a digital version.
  • Designing a physical model of syntax for case-rich, free word-order languages, like Latin and Greek. This could be based on a dependency grammar, rather than on a constituent-based grammar like that of Puzz-Ling.
  • Better ways to convey distributional meanings. Distributional semantics can give us quantitative, vector-based similarity measures for words, collocations and constructions (see above), but it remains difficult to convey this information to the general public, beyond saying "A is more similar to B than to C". This would most likely be a co-supervision with someone knowledgeable about graphics and interfaces.
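
To make the "A is more similar to B than to C" statement concrete, the sketch below computes cosine similarity, a common vector-based similarity measure in distributional semantics. The three-dimensional vectors are invented toy values, not taken from any real semantic space, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def more_similar(a, b, c, vectors):
    """Return True if word `a` is closer to word `b` than to word `c`."""
    sim_ab = cosine_similarity(vectors[a], vectors[b])
    sim_ac = cosine_similarity(vectors[a], vectors[c])
    return sim_ab > sim_ac


# Toy vectors, invented for illustration only.
toy_vectors = {
    "tiger": [0.9, 0.8, 0.1],
    "lion": [0.8, 0.9, 0.2],
    "chair": [0.1, 0.2, 0.9],
}

tiger_closer_to_lion = more_similar("tiger", "lion", "chair", toy_vectors)
```

With these toy values, `tiger_closer_to_lion` is `True`. The open question raised by this topic is how to present such numbers to a non-specialist audience in a more informative way than a single comparison.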

Practical computational tasks

These are simply computational topics that could have a useful output, and present interesting challenges along the way. They all require good computational skills. Do you want to become a millionaire? Do one of these:

  • An app that proposes correct punctuation and formatting for an unformatted text (esp. the output of dictation software).
  • An app that offers intelligent editing help: changing a passive into an active at the click of a button, or adjusting the gender/number of all adjectives/verbs when you change the noun.