Rio de Janeiro


The intent of this project is to look at linguistic characteristics that may become evident through texting conventions. Some of the characteristics that this project may look into include the use of abbreviations and expanded forms, emojis, laughter notations, and speech errors and corrections.


The intention of this project was to analyze texting discourse between Brazilian L1 Portuguese speakers and American L2 Portuguese speakers. We wanted to look at the linguistic differences in texting conventions between the two groups to determine whether any patterns had developed or were present in the American speakers as they learned Portuguese. The linguistic elements that the project looked into were the use of abbreviations and of expanded forms, the presence and distribution of different emoji characters, the use of different textual laughter notations, and the methods for addressing and correcting 'speech errors' as they occurred in texting.

Markup and Analysis

The markup was done in compliance with the TEI. The corpus came to the project as screenshots from the phones of individuals who were using WhatsApp to communicate with the Brazilian speakers as part of a class assignment. The data was then transcribed and coded in the following manner. The entire corpus is contained within a single TEI document, and each of the 10 conversations has been given its own number so that it can be uniquely identified. Each speaker, both Brazilian and American, has also been given a unique ID in place of their name, so that speakers can be identified individually while remaining anonymous. Finally, we searched the corpus for instances that bore on our research question and coded them.

Abbreviations were coded as <abbr> with possible @type attribute values of 'textese', 'informal', or 'contraction'. Expanded forms were coded as <expan> with possible @type attribute values of 'formal', 'full', or 'uncontracted'. Emojis were coded as <g> with a @ref attribute that identified their representation. Laughter notations were coded as <hi> with possible @type attribute values of 'ha', 'k', or 'rs'. Corrections were coded as <corr> with possible @type attribute values of 'implicit', 'explicit', or 'self'.
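
To make the coding scheme above concrete, here is a minimal Python sketch of how coded features might be tallied per speaker. The sample fragment is invented for illustration and is not taken from our corpus: the wrapper elements (<div>, <u>), the @who and @n attributes, the speaker IDs, and the @ref value are all assumptions, and the fragment has not been validated against a TEI schema. Only the five coded elements and their @type values come from the description above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment: structure, speaker IDs, and messages are invented.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="conversation" n="1">
        <u who="#BR01">oi, <abbr type="textese">vc</abbr> vai hoje? <g ref="#smile"/></u>
        <u who="#US01">vou sim <hi type="k">kkkk</hi></u>
        <u who="#US01"><expan type="full">você</expan> também?</u>
        <u who="#BR01"><hi type="rs">rsrs</hi> claro</u>
      </div>
    </body>
  </text>
</TEI>"""

# The five elements named in the coding scheme.
CODED = ("abbr", "expan", "g", "hi", "corr")


def count_features(xml_text):
    """Count coded elements, keyed by (speaker, element, @type or @ref)."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for u in root.iter(f"{{{TEI_NS}}}u"):
        speaker = u.get("who", "").lstrip("#")
        for el in u.iter():
            # Strip the namespace prefix to get the bare element name.
            name = el.tag.split("}", 1)[-1]
            if name in CODED:
                # <g> carries @ref; the other coded elements carry @type.
                label = el.get("type") or el.get("ref", "").lstrip("#")
                counts[(speaker, name, label)] += 1
    return counts


if __name__ == "__main__":
    for (speaker, name, label), n in sorted(count_features(SAMPLE).items()):
        print(speaker, name, label, n)
```

Keying the counts by speaker ID rather than by name keeps the tally consistent with the anonymization described above, and comparing the totals for the Brazilian and American IDs is one simple way to look for differences between the groups.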

List of the technologies we used

link to our coding

For an in-depth look at how our repo is organized and to view our actual coding, you can visit our GitHub repo, where we have stored all the information that is in any way relevant to this project.

Created by: Brandon Rodgers, Patrick Brooks, and Tyler Bokan under the supervision of Zac Enick and Gregory Bondar

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.