Source: http://www.maketecheasier.com
Archive for the ‘corpus linguistics’ Category
How to Batch Convert Text Files to Other Formats in Mac via the Terminal
Posted: January 8, 2015 in applied linguistics, corpora, corpus, corpus linguistics, data, language analysis, MAC, Manipulating text, resources, software, text analysis, text tools
Tags: Mac, terminal
#corpusMOOC Corpus Linguistics: Method, Analysis, Interpretation starts Sept 29
Posted: September 11, 2014 in corpus, corpus linguistics, English, English Language, learner corpus, learner language, learning, Manipulating text, MOOC, Recursos, research, researching corpus use, resources, Subjects at UMU, text analysis, text tools, text-analytics, universidad, www resources
Linguistic Inquiry and Word Count (LIWC): Our Use Of Little Words Can, Uh, Reveal Hidden Interests @NPR
Posted: September 1, 2014 in corpus linguistics, investigación, language analysis, sentiment, vocabulary
An NPR feature which discusses W. Pennebaker's contribution to the study of how people use words.
This was also the subject of my own contribution to the 2010 World Congress of Behavioral and Cognitive Therapies within the session “Interdisciplinary research between Corpus Linguistics and Clinical Psychology” at Boston University, MA.
Post at Cambridge Extra at the Linguist List: Researching New Uses of Corpora for Language Teaching and Learning
Posted: April 29, 2014 in blogs, Cambridge University Press, corpus linguistics, ReCALL
ReCALL Special Issue on Researching New Uses of Corpora for Language Teaching and Learning
Full text: Researching uses of corpora for language teaching and learning ReCALL, 26, 2, 121-127.
Posted: April 15, 2014 in applied linguistics, corpus linguistics, DDL, journals, ReCALL
ReCALL special issue: Researching uses of corpora for language teaching and learning
Editorial: Researching uses of corpora for language teaching and learning
ALEX BOULTON
University of Lorraine and CNRS, France
(email: alex.boulton@univ-lorraine.fr)
PASCUAL PÉREZ-PAREDES
Universidad de Murcia, Spain
(email: pascualf@um.es)
Boulton, A., & Pérez-Paredes, P. (2014). Editorial: Researching uses of corpora for language teaching and learning. ReCALL, 26(2), 121-127.
A review of Fluency in Native and Nonnative English Speech
Posted: April 9, 2014 in corpus linguistics, fluency, ICAME Journal, review
Pérez-Paredes, P. (2014). A review of Fluency in Native and Nonnative English Speech. Studies in Corpus Linguistics, 53. Amsterdam: John Benjamins, 2013. 238 pp. ISBN 978-9-027-203588. ICAME Journal.
Read the review.
Learners’ search patterns during corpus-based focus-on-form activities
Posted: April 3, 2014 in corpus linguistics, International Journal of Corpus Linguistics, My research
This research explores the search behaviour of EFL learners (n=24) by tracking their interaction with corpus-based materials during focus-on-form activities (Observe, Search the corpus, Rewriting). One set of learners made no use of web services other than the BNC during the central Search the corpus activity, while the other set resorted to other web services and/or consultation guidelines. The performance of the second group was higher. The learners’ formulation of corpus queries on the BNC was unsophisticated, and the students tended to use the BNC search interface in much the same way as they used Google or similar services. Our findings suggest that careful consideration should be given to the cognitive aspects concerning the initiation of corpus searches, the role of computer search interfaces, as well as the implementation of corpus-based language learning. Our study offers a taxonomy of learner searches that may be of interest in future research.
Pérez-Paredes, P., Sánchez-Tornel, M., & Alcaraz Calero, J. M. (2012). Learners’ search patterns during corpus-based focus-on-form activities. International Journal of Corpus Linguistics, 17(4), 483-516.
Full text here.
Enhancing and extending corpora and corpora tools for learning and teaching
Posted: March 30, 2014 in conferences, corpus linguistics
Valoriser et développer les outils autour des corpus dans une perspective didactique / Enhancing and extending corpora and corpora tools for learning and teaching
Mardi/Tuesday, mai/May 27th
Salle/Room 205 Site Rabelais, UJF Valence, France
9h30 – Speed-dating : Présentations/Presentations
10h – Présentation et discussion autour du livre/presentation and discussion about the book « Des documents authentiques aux corpus. Démarches pour l’apprentissage des langues ». Boulton et Tyne (2014). Discussion autour de l’abondance de matières exploitables dans les corpus et la sous-exploitation dans l’enseignement des langues/Including the abundance of exploitable corpus materials and the general lack of their use in language teaching.
Conférencier: Alex Boulton
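11h – Présentation de la Plate-forme Chamilo : comment l’utiliser pour les corpus ? Suivi d’une discussion en français/anglais. / Presentation of the Chamilo platform: how can it be used for corpora? Followed by a discussion in French/English.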
Jérémie Grépiloux et Hubert Borderiou (SIMSU)
13h30 – Pedagogical uses of corpora: theories and practices / Utilisations pédagogiques des corpus : théories et pratiques, 20-minute presentation followed by a group discussion
Conférencier: Pascual Pérez-Paredes
14h30 – Speed-dating : Consultation en ligne des corpus/Consulting on-line corpora: Montrer et voir des corpus en salle informatique
16h – Bilan de la journée et projets/Summary of the day and projects
Cristelle Cavalla and Laura Hartwell
Inscriptions (Gratuit et obligatoire)/Mandatory free registration:
https://docs.google.com/forms/d/118xpaiTACRMW5KA5ja92oEGJqZ5Q6BUmqfVmSPq41U0/viewform
Logistics: Sylvain Perraud, Sylvain.Perraud@gmail.com (Compte rendu/minutes)
Contacts: Cristelle.Cavalla@univ-paris3.fr, Laura.Hartwell@ujf-grenoble.fr
References
SACODEYL : http://www.um.es/sacodeyl/
Chamilo : http://www.chamilo.org/fr
Scientext : http://scientext.msh-alpes.fr/scientext-site-en/spip.php?article9
EmoBase/EmoProf : http://emolex.u-grenoble3.fr/emoBase/
Full-text data for the two largest BYU corpora
Posted: March 11, 2014 in COCA, corpora, corpus linguistics
I have received this through the CORPORA List:
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
At http://corpus.byu.edu/full-text/ you can now download full-text data for the two largest BYU corpora:
Corpus of Contemporary American English (COCA). 440 million words of downloadable text; the largest, most up-to-date, publicly-available corpus of English that is balanced for genre (spoken, fiction, magazine, newspaper, and academic).
The corpus of Global Web-Based English (GloWbE). 1.8 billion words of downloadable text; divided into groups from twenty different English-speaking countries (US, UK, Canada, Australia, India, etc). About 60% from blogs, for very informal language.
With this full-text data, you will have the actual corpora on your computer, and you can search the data in any way that you’d like. You can generate your own frequency data, collocates, n-grams, or concordance lines; you can search by word, lemma, and part of speech; and you can carry out complex syntactic and semantic searches offline. You can even modify the lexicon and sources tables to search the corpora in ways that are not possible via the standard web interfaces.
The data comes in three different formats (see samples): data for relational databases (info), word/lemma/PoS (vertical), and linear text (horizontal). When you purchase the data, you purchase the rights to any and all of these formats.
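As a rough illustration of the kind of offline processing the announcement describes (frequency data, n-grams and concordance lines), here is a minimal Python sketch. It makes no assumptions about the actual BYU file layouts: it works on a short made-up sample string, and the helper names (tokenize, ngrams, concordance) and the deliberately naive tokenizer are hypothetical, chosen only for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def concordance(tokens, node, width=3):
    """Return (left context, node, right context) for each occurrence of node."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Made-up sample text standing in for a plain-text ("horizontal") corpus file.
sample = ("The corpus is balanced for genre. The corpus is large, "
          "and the data is yours to search.")

tokens = tokenize(sample)
freq = Counter(tokens)               # word frequency list
bigrams = Counter(ngrams(tokens, 2)) # 2-gram frequency list
kwic = concordance(tokens, "corpus", width=2)

print(freq.most_common(3))
print(bigrams.most_common(2))
for left, node, right in kwic:
    print(f"{left:>20}  {node}  {right}")
```

With a real corpus you would read the text from a file instead of a string, and lemma or part-of-speech searches would need the annotated (vertical) format rather than raw text.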
Reading concordances is not a trivial task
Posted: March 6, 2014 in applied linguistics, concordances, corpus linguistics
The methodological transfer from the CL research area to the applied ring of language learning and teaching underwent no adaptation, and thus learners were presented with the same tools, corpora and analytical tasks as well-trained and professional linguists.
[…]
Reading concordances is by no means a trivial task. Sinclair (1991) recommends a complex procedure which involves five distinct stages. Let us review very briefly what they entail. The first stage is that of initiation. Learners here will look to the left and to the right of the nodes and determine the dominant pattern. Then, learners are prompted to interpret and hypothesize about what it is that these words have in common. Thirdly, in the consolidation stage, students are to corroborate their hypotheses by looking more closely at variations of them. After this, these findings have to be reported and, finally, a new round of observations starts. Although typically reduced in language classrooms, this procedure is common in the possibilities scenario and certainly characterises the so-called bottom-up approach (Mishan, 2004: 223). A recent analysis (Kreyer, 2008) deconstructs the idea of corpus competence into different skills, namely, interpreting corpus data, knowledge about corpus design, knowledge about resources on the Internet, some linguistic background, knowledge about how to use concordances and, finally, some corpus linguistics background. This is a positive effort in the right direction as the author admits the need to create the conditions for the use of corpora in the language classroom or, in other words, Kreyer recognizes that pedagogic mediation is necessary if we want to turn the corpus into a learning tool. Notwithstanding, the challenges are significant.
Pérez-Paredes, P. (2010). Corpus Linguistics and Language Education in Perspective: Appropriation and the Possibilities Scenario. In T. Harris & M. Moreno Jaén (Eds.), Corpus Linguistics in Language Teaching (pp. 53-73). Peter Lang.